
Huffman Coding | GeeksforGeeks
Find Complete Code at GeeksforGeeks Article: http://www.geeksforgeeks.org/greedy-a…
This video is contributed by Illuminati
Content
0.06 -> Hello friends, and welcome to GeeksforGeeks. In this video we will be learning the Huffman coding technique using the greedy-choice property. Huffman coding is a data compression algorithm: if we have some data, it compresses it in such a way that no loss of information occurs while decoding it. In this technique we assign a variable-length code to the input characters. This variable-length code depends on the occurrence, or frequency, of the character in our data: the most frequent character gets the smallest code and the least frequent character gets the largest code. See here how the code length varies depending upon the frequency of the characters.

48.719 -> Now, whatever code is assigned to our characters possesses the prefix property, which requires that no whole code word in the system is a prefix of any other code word in that system. For example, a code with code words 0, 10 and 11 has the prefix property, whereas a code consisting of 0, 1 and 11 does not, because here 1 is a prefix of 11. This property makes sure that our data is uniquely decodable.
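As a small illustration of the prefix property (this check is not part of the video's code; it only tests whether any code word is a prefix of another):

```c
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

/* Returns true if no code word is a prefix of another (prefix property). */
bool is_prefix_free(const char *codes[], int n) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (i != j && strncmp(codes[i], codes[j], strlen(codes[i])) == 0)
                return false;
    return true;
}

int main(void) {
    const char *good[] = { "0", "10", "11" };  /* prefix-free              */
    const char *bad[]  = { "0", "1",  "11" };  /* "1" is a prefix of "11"  */
    printf("%d %d\n", is_prefix_free(good, 3), is_prefix_free(bad, 3));  /* prints: 1 0 */
    return 0;
}
```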
89.43 -> Now let's see how this compression is carried out using the Huffman coding technique. This is the sample data: our data consists of 6 unique characters, and these are their frequencies. So let's look at the Huffman algorithm. These are the 4 steps that will help us build a Huffman tree, which in turn will provide us the Huffman codes for all the characters. We have the data, we have the algorithm, so let's do a quick trial run and build a Huffman tree.
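(The on-screen character/frequency table is not captured in this transcript. The merge order in the walkthrough below is consistent with the frequency set used in the corresponding GeeksforGeeks article, a:5, b:9, c:12, d:13, e:16, f:45, which is assumed wherever concrete numbers appear from here on.)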
123.39 -> The first step says to build a min-heap with a node for each unique character. If you don't know about binary heaps, you may refer to our quick video tutorial that we have covered earlier. This heap is based on the frequencies of the input characters, such that each node's data is less than or equal to the data in the node's children. Once we have them in a heap, we need to extract the two nodes with minimum frequencies; this step is greedy in nature. Here we have characters a and b with the least frequencies. In the next step we form a new internal node with frequency equal to the sum of the two nodes' frequencies; the first extracted node forms its left child and the second extracted node becomes the right child. After node formation we add this new internal node back to our min-heap. In the last step we repeat steps 2 and 3 until only a single node remains in our heap.
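As a concrete sketch of steps 2 and 3 (not the video's exact code: the node type and helper names here are illustrative, and the frequencies 5 and 9 for a and b are the assumed example values):

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative tree node; field names are assumptions, not the video's listing. */
struct Node {
    char ch;                      /* '$' marks an internal node                */
    unsigned freq;                /* frequency (or sum of children's)          */
    struct Node *left, *right;
};

struct Node *new_node(char ch, unsigned freq) {
    struct Node *n = malloc(sizeof *n);
    n->ch = ch; n->freq = freq; n->left = n->right = NULL;
    return n;
}

int main(void) {
    /* Steps 2-3: the two minimum-frequency nodes (a:5, b:9 assumed) ...        */
    struct Node *a = new_node('a', 5);
    struct Node *b = new_node('b', 9);

    /* ... are merged into a new internal node whose frequency is the sum;      */
    /* the first extracted node becomes the left child, the second the right.   */
    struct Node *n1 = new_node('$', a->freq + b->freq);
    n1->left  = a;
    n1->right = b;

    printf("n1 frequency = %u\n", n1->freq);   /* 14 */
    return 0;
}
```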
190.3 -> So we again extract the two nodes with minimum frequencies. Here we have c and d with the least frequencies. We form a new internal node, add node c to its left and node d to its right, set its frequency to the sum of its children's, and finally add this new internal node n2 to our min-heap.

219.52 -> Again we extract the two nodes with minimum frequencies; here we have n1 and e with the least frequencies. We form a new internal node, add node n1 to its left and node e to its right, set its frequency to the sum of its children's, and finally add this new internal node n3 to our min-heap.

246.94 -> Our heap still has more than one node, so again we extract the two nodes with minimum frequencies; here we have n2 and n3 with the least frequencies. We form a new internal node, add n2 to its left and node n3 to its right, set its frequency to the sum of its children's, and finally add this new internal node n4 to our min-heap.

278.77 -> We now have two nodes left; merging them would give a single node, and thus our algorithm may stop. So we form a new internal node n5, add f, with the lesser frequency, to its left and n4, with the higher frequency, to its right, and set its frequency to the sum of its two children's. This would be the final output of the algorithm, and we'll call it a Huffman tree.
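(With the frequencies assumed above, the internal nodes come out as n1 = 5 + 9 = 14, n2 = 12 + 13 = 25, n3 = 14 + 16 = 30, n4 = 25 + 30 = 55, and the root n5 = 45 + 55 = 100.)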
309.4 -> Now, to generate codes from this tree, we traverse it from the root and assign 0 to each left child and 1 to each right child. Once we encounter a leaf node, we print its code. These would be the codes for each character. Let's see how we obtain the code for node c: here is c; we start from the root, go right, go left, then left again, so we have 1, 0 and 0. Let's generate the code for a: here is a; we start from the root, go right, right again, then left, then left, so we have 1, 1, then 0 and 0. Similarly we do this for all the characters.
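(Reading every leaf off the tree built above, with 0 for a left edge and 1 for a right edge, gives: f → 0, c → 100, d → 101, a → 1100, b → 1101, e → 111.)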
367.419 -> Now let's see how to code this algorithm. This is the implementation, which has been taken from GeeksforGeeks. This function will help us build a Huffman tree and is equivalent to the four-step algorithm that we covered earlier. The inputs are a character array, their corresponding frequencies, and the number of characters. We build our min-heap out of these inputs; you may refer to any min-heap creation function, as we are skipping it here. We iterate until the size of our min-heap becomes one: we extract the top node and store it in a variable left; we again extract a node and store it in right; we then form a new internal node, assign its frequency as the sum of the frequencies of its two children, attach left as its left child and right as its right child, and finally add this new internal node to our min-heap. We keep doing this until only a single node remains in our min-heap, and we finally return this node.
442.75 -> The next function will help us generate codes using the Huffman tree. Its inputs are a pointer to the root node, an auxiliary array to store the code formed so far, and its current size. We assign 0 if a left edge exists and recur; similarly, if a right edge exists we assign 1 and recur. If we reach a leaf node, we print the code values stored in our array.
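Below is a rough, self-contained sketch of the two functions just described. It is not the exact GeeksforGeeks listing: the names are illustrative, a simple linear scan over an array stands in for the min-heap's extract-min and insert, and the data in main uses the assumed example frequencies.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_TREE_HT 100

/* Illustrative node type; the real implementation keeps these in a min-heap. */
struct Node {
    char ch;                      /* '$' for internal nodes */
    unsigned freq;
    struct Node *left, *right;
};

static struct Node *new_node(char ch, unsigned freq) {
    struct Node *n = malloc(sizeof *n);
    n->ch = ch; n->freq = freq; n->left = n->right = NULL;
    return n;
}

/* Stand-in for extractMin: remove and return the lowest-frequency node
 * from nodes[0..*size-1] by linear scan (the video uses a real min-heap). */
static struct Node *extract_min(struct Node *nodes[], int *size) {
    int min = 0;
    for (int i = 1; i < *size; i++)
        if (nodes[i]->freq < nodes[min]->freq)
            min = i;
    struct Node *out = nodes[min];
    nodes[min] = nodes[--(*size)];      /* fill the gap with the last element */
    return out;
}

/* Equivalent of the tree-building walkthrough: repeatedly take the two
 * minimum-frequency nodes, merge them into a parent whose frequency is the
 * sum, put the parent back, and stop when a single node (the root) remains. */
static struct Node *build_huffman_tree(const char ch[], const unsigned freq[], int n) {
    struct Node *nodes[64];
    int size = n;
    for (int i = 0; i < n; i++)
        nodes[i] = new_node(ch[i], freq[i]);

    while (size > 1) {
        struct Node *left  = extract_min(nodes, &size);
        struct Node *right = extract_min(nodes, &size);
        struct Node *top   = new_node('$', left->freq + right->freq);
        top->left  = left;
        top->right = right;
        nodes[size++] = top;            /* stand-in for insertMinHeap */
    }
    return extract_min(nodes, &size);   /* root of the Huffman tree */
}

/* Equivalent of the code-printing function: 0 on a left edge, 1 on a right
 * edge, and print the accumulated code when a leaf is reached. */
static void print_codes(const struct Node *root, int code[], int top) {
    if (root->left) {
        code[top] = 0;
        print_codes(root->left, code, top + 1);
    }
    if (root->right) {
        code[top] = 1;
        print_codes(root->right, code, top + 1);
    }
    if (!root->left && !root->right) {  /* leaf node */
        printf("%c: ", root->ch);
        for (int i = 0; i < top; i++)
            printf("%d", code[i]);
        printf("\n");
    }
}

int main(void) {
    /* Assumed example data (the on-screen values are not in the transcript). */
    char ch[]       = { 'a', 'b', 'c', 'd', 'e', 'f' };
    unsigned freq[] = { 5, 9, 12, 13, 16, 45 };
    int n = sizeof ch / sizeof ch[0];

    int code[MAX_TREE_HT];
    print_codes(build_huffman_tree(ch, freq, n), code, 0);
    return 0;
}
```

The only behavioral difference from a real min-heap version is how ties in frequency are broken, which can change the exact shape of the tree but not the optimality of the resulting code lengths.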
474.55 -> Now let's see the complexity of this code. It runs in O(n log n) time, where n is the number of unique characters in our input. The n log n comes from calling our extractMin function 2(n - 1) times, each time extracting the two nodes with minimum value, and each call takes O(log n) time to heapify.
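(Equivalently: the merge loop runs n - 1 times, and each iteration performs two extract-min operations and one insert, each O(log n) on a heap of at most n nodes, so the total work is about 3(n - 1) · O(log n) = O(n log n).)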
503.469 -> With this, we come to the end of this tutorial. If you have any doubts or suggestions, please leave them in the comment section below. Thanks for watching.
Source: https://www.youtube.com/watch?v=0kNXhFIEd_w