
Huffman Coding | GeeksforGeeks
Find Complete Code at GeeksforGeeks Article: http://www.geeksforgeeks.org/greedy-a…
This video is contributed by Illuminati
Content
0.06 -> Hello friends, and welcome to GeeksforGeeks. In this video we will be learning the Huffman coding technique using the greedy-choice property. Huffman coding is a data compression algorithm: if we have some data, it compresses it in such a way that no loss of information occurs while decoding it. In this technique we assign a variable-length code to the input characters. This variable-length code depends on the occurrence, or frequency, of the character in our data: the most frequent character gets the smallest code and the least frequent character gets the largest code. See here how the code length varies depending upon the frequency of the characters.

48.719 -> Now, whatever code is assigned to our characters possesses the prefix property, which requires that no whole code word in the system is a prefix of any other code word in that system. For example, a code with code words 0, 10 and 11 has the prefix property, whereas a code consisting of 0, 1 and 11 does not, because here 1 is a prefix of 11. This property makes sure that our data is uniquely decodable.
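As a small illustration of the prefix property (this check is not part of the video's code; it only tests whether any code word is a prefix of another):

```c
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

/* Returns true if no code word is a prefix of another (prefix property). */
bool is_prefix_free(const char *codes[], int n) {
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            if (i != j && strncmp(codes[i], codes[j], strlen(codes[i])) == 0)
                return false;
    return true;
}

int main(void) {
    const char *good[] = { "0", "10", "11" };  /* prefix-free              */
    const char *bad[]  = { "0", "1",  "11" };  /* "1" is a prefix of "11"  */
    printf("%d %d\n", is_prefix_free(good, 3), is_prefix_free(bad, 3));  /* prints: 1 0 */
    return 0;
}
```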
89.43 -> Now let's see how this compression is carried out using the Huffman coding technique. This is the sample data: our data consists of 6 unique characters, and these are their frequencies. So let's look at the Huffman algorithm. These are the 4 steps that will help us build a Huffman tree, which in turn will provide us the Huffman codes for all the characters. We have the data, we have the algorithm, so let's do a quick trial run and build a Huffman tree.
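(The on-screen character/frequency table is not captured in this transcript. The merge order in the walkthrough below is consistent with the frequency set used in the corresponding GeeksforGeeks article, a:5, b:9, c:12, d:13, e:16, f:45, which is assumed wherever concrete numbers appear from here on.)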
123.39 -> The first step says to build a min-heap with a node for each unique character. If you don't know about binary heaps, you may refer to our quick video tutorial that we have covered earlier. This heap is based on the frequencies of the input characters, such that each node's data is less than or equal to the data in the node's children. Once we have them in a heap, we need to extract the two nodes with minimum frequencies; this step is greedy in nature. Here we have characters a and b with the least frequencies. In the next step we form a new internal node with frequency equal to the sum of the two nodes' frequencies; the first extracted node forms its left child and the second extracted node becomes the right child. After node formation we add this new internal node back to our min-heap. In the last step we repeat steps 2 and 3 until only a single node remains in our heap.
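As a concrete sketch of steps 2 and 3 (not the video's exact code: the node type and helper names here are illustrative, and the frequencies 5 and 9 for a and b are the assumed example values):

```c
#include <stdio.h>
#include <stdlib.h>

/* Illustrative tree node; field names are assumptions, not the video's listing. */
struct Node {
    char ch;                      /* '$' marks an internal node                */
    unsigned freq;                /* frequency (or sum of children's)          */
    struct Node *left, *right;
};

struct Node *new_node(char ch, unsigned freq) {
    struct Node *n = malloc(sizeof *n);
    n->ch = ch; n->freq = freq; n->left = n->right = NULL;
    return n;
}

int main(void) {
    /* Steps 2-3: the two minimum-frequency nodes (a:5, b:9 assumed) ...        */
    struct Node *a = new_node('a', 5);
    struct Node *b = new_node('b', 9);

    /* ... are merged into a new internal node whose frequency is the sum;      */
    /* the first extracted node becomes the left child, the second the right.   */
    struct Node *n1 = new_node('$', a->freq + b->freq);
    n1->left  = a;
    n1->right = b;

    printf("n1 frequency = %u\n", n1->freq);   /* 14 */
    return 0;
}
```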
190.3 -> So we again extract the two nodes with minimum frequencies. Here we have c and d with the least frequencies. We form a new internal node, add node c to its left and node d to its right, set its frequency to the sum of its children's, and finally add this new internal node n2 to our min-heap.

219.52 -> Again we extract the two nodes with minimum frequencies; here we have n1 and e with the least frequencies. We form a new internal node, add node n1 to its left and node e to its right, set its frequency to the sum of its children's, and finally add this new internal node n3 to our min-heap.

246.94 -> Our heap still has more than one node, so again we extract the two nodes with minimum frequencies; here we have n2 and n3 with the least frequencies. We form a new internal node, add n2 to its left and node n3 to its right, set its frequency to the sum of its children's, and finally add this new internal node n4 to our min-heap.

278.77 -> We now have two nodes left; merging them would give a single node, and thus our algorithm may stop. So we form a new internal node n5, add f, with the lesser frequency, to its left and n4, with the higher frequency, to its right, and set its frequency to the sum of its two children's. This would be the final output of the algorithm, and we'll call it a Huffman tree.
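(With the frequencies assumed above, the internal nodes come out as n1 = 5 + 9 = 14, n2 = 12 + 13 = 25, n3 = 14 + 16 = 30, n4 = 25 + 30 = 55, and the root n5 = 45 + 55 = 100.)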
309.4 -> Now, to generate codes from this tree, we traverse it from the root and assign 0 to each left child and 1 to each right child. Once we encounter a leaf node, we print its code. These would be the codes for each character. Let's see how we obtain the code for node c: here is c; we start from the root, go right, go left, then left again, so we have 1, 0 and 0. Let's generate the code for a: here is a; we start from the root, go right, right again, then left, then left, so we have 1, 1, then 0 and 0. Similarly we do this for all the characters.
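(Reading every leaf off the tree built above, with 0 for a left edge and 1 for a right edge, gives: f → 0, c → 100, d → 101, a → 1100, b → 1101, e → 111.)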
367.419 -> Now let's see how to code this algorithm. This is the implementation, which has been taken from GeeksforGeeks. This function will help us build a Huffman tree and is equivalent to the four-step algorithm that we covered earlier. The inputs are a character array, their corresponding frequencies, and the number of characters. We build our min-heap out of these inputs; you may refer to any min-heap creation function, as we are skipping it here. We iterate until the size of our min-heap becomes one: we extract the top node and store it in a variable left; we again extract a node and store it in right; we then form a new internal node, assign its frequency as the sum of the frequencies of its two children, attach left as its left child and right as its right child, and finally add this new internal node to our min-heap. We keep doing this until only a single node remains in our min-heap, and we finally return this node.
442.75 -> The next function will help us generate codes using the Huffman tree. Its inputs are a pointer to the root node, an auxiliary array to store the code formed so far, and its current size. We assign 0 if a left edge exists and recur; similarly, if a right edge exists we assign 1 and recur. If we reach a leaf node, we print the code values stored in our array.
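Below is a rough, self-contained sketch of the two functions just described. It is not the exact GeeksforGeeks listing: the names are illustrative, a simple linear scan over an array stands in for the min-heap's extract-min and insert, and the data in main uses the assumed example frequencies.

```c
#include <stdio.h>
#include <stdlib.h>

#define MAX_TREE_HT 100

/* Illustrative node type; the real implementation keeps these in a min-heap. */
struct Node {
    char ch;                      /* '$' for internal nodes */
    unsigned freq;
    struct Node *left, *right;
};

static struct Node *new_node(char ch, unsigned freq) {
    struct Node *n = malloc(sizeof *n);
    n->ch = ch; n->freq = freq; n->left = n->right = NULL;
    return n;
}

/* Stand-in for extractMin: remove and return the lowest-frequency node
 * from nodes[0..*size-1] by linear scan (the video uses a real min-heap). */
static struct Node *extract_min(struct Node *nodes[], int *size) {
    int min = 0;
    for (int i = 1; i < *size; i++)
        if (nodes[i]->freq < nodes[min]->freq)
            min = i;
    struct Node *out = nodes[min];
    nodes[min] = nodes[--(*size)];      /* fill the gap with the last element */
    return out;
}

/* Equivalent of the tree-building walkthrough: repeatedly take the two
 * minimum-frequency nodes, merge them into a parent whose frequency is the
 * sum, put the parent back, and stop when a single node (the root) remains. */
static struct Node *build_huffman_tree(const char ch[], const unsigned freq[], int n) {
    struct Node *nodes[64];
    int size = n;
    for (int i = 0; i < n; i++)
        nodes[i] = new_node(ch[i], freq[i]);

    while (size > 1) {
        struct Node *left  = extract_min(nodes, &size);
        struct Node *right = extract_min(nodes, &size);
        struct Node *top   = new_node('$', left->freq + right->freq);
        top->left  = left;
        top->right = right;
        nodes[size++] = top;            /* stand-in for insertMinHeap */
    }
    return extract_min(nodes, &size);   /* root of the Huffman tree */
}

/* Equivalent of the code-printing function: 0 on a left edge, 1 on a right
 * edge, and print the accumulated code when a leaf is reached. */
static void print_codes(const struct Node *root, int code[], int top) {
    if (root->left) {
        code[top] = 0;
        print_codes(root->left, code, top + 1);
    }
    if (root->right) {
        code[top] = 1;
        print_codes(root->right, code, top + 1);
    }
    if (!root->left && !root->right) {  /* leaf node */
        printf("%c: ", root->ch);
        for (int i = 0; i < top; i++)
            printf("%d", code[i]);
        printf("\n");
    }
}

int main(void) {
    /* Assumed example data (the on-screen values are not in the transcript). */
    char ch[]       = { 'a', 'b', 'c', 'd', 'e', 'f' };
    unsigned freq[] = { 5, 9, 12, 13, 16, 45 };
    int n = sizeof ch / sizeof ch[0];

    int code[MAX_TREE_HT];
    print_codes(build_huffman_tree(ch, freq, n), code, 0);
    return 0;
}
```

The only behavioral difference from a real min-heap version is how ties in frequency are broken, which can change the exact shape of the tree but not the optimality of the resulting code lengths.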
474.55 -> Now let's see the complexity of this code. It runs in O(n log n) time, where n is the number of unique characters in our input. The n log n comes from calling our extractMin function 2(n - 1) times, each time extracting the two nodes with minimum value, and each call takes O(log n) time to heapify.
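(Equivalently: the merge loop runs n - 1 times, and each iteration performs two extract-min operations and one insert, each O(log n) on a heap of at most n nodes, so the total work is about 3(n - 1) · O(log n) = O(n log n).)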
503.469 -> With this, we come to the end of this tutorial. If you have any doubts or suggestions, please leave them in the comment section below. Thanks for watching.
Source: https://www.youtube.com/watch?v=0kNXhFIEd_w