|
| 1 | +# Huffman Coding using Greedy Approach |
| 2 | +🔴 Language used : **Python 3** |
| 3 | + |
| 4 | +## 🎯 Aim |
| 5 | +The aim of this script is to find the Huffman code of each character present in the list, in ascending order. |
| 6 | + |
| 7 | +## 👉 Purpose |
| 8 | +The main purpose of this script is to show how a greedy approach can be used to find the Huffman code of each character present in the list, in ascending order. |
| 9 | + |
| 10 | +## 📄 Description |
| 11 | +Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input characters, where the length of each assigned code depends on the frequency of the corresponding character. The most frequent character gets the shortest code and the least frequent character gets the longest code. |
| 12 | + |
| 13 | +The variable-length codes assigned to input characters are prefix codes, meaning the codes (bit sequences) are assigned in such a way that the code assigned to one character is never a prefix of the code assigned to any other character. This is how Huffman coding ensures that there is no ambiguity when decoding the generated bitstream. |
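
To make the prefix property concrete, here is a minimal sketch that checks it and decodes a bit string using the code table from the example in this README. The helper names `is_prefix_free` and `decode` are hypothetical and are not taken from the contributed script.

```python
# Code table copied from the example in this README.
CODES = {"f": "0", "c": "100", "d": "101", "a": "1100", "b": "1101", "e": "111"}


def is_prefix_free(codes):
    """Return True if no code word is a prefix of another (hypothetical helper)."""
    words = sorted(codes.values())
    # After sorting, a word and anything it prefixes end up next to each other.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def decode(bits, codes):
    """Decode a bit string by matching code words from left to right."""
    lookup = {word: ch for ch, word in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:  # unambiguous because the code is prefix-free
            out.append(lookup[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code word")
    return "".join(out)


encoded = "".join(CODES[ch] for ch in "face")
print(is_prefix_free(CODES))                   # True
print(encoded, "->", decode(encoded, CODES))   # 01100100111 -> face
```

Because no code word is a prefix of another, the decoder can emit a character as soon as the bits read so far match a code word, with no need for separators between code words.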
| 14 | + |
| 15 | +🔴 Examples: |
| 16 | + |
| 17 | +``` |
| 18 | +Constraints: |
| 19 | +chars[] -> an array of characters. |
| 20 | +freq[] -> an array of frequencies of the respective characters. |
| 21 | +
|
| 22 | +Input: |
| 23 | +character Frequency |
| 24 | + a 5 |
| 25 | + b 9 |
| 26 | + c 12 |
| 27 | + d 13 |
| 28 | + e 16 |
| 29 | + f 45 |
| 30 | + |
| 31 | +After processing through the algorithm, it will generate the Huffman codes |
| 32 | +for each of the characters presented in an ascending order. |
| 33 | +
|
| 34 | +The output will be like this, |
| 35 | +character Huffman Code |
| 36 | + f 0 |
| 37 | + c 100 |
| 38 | + d 101 |
| 39 | + a 1100 |
| 40 | + b 1101 |
| 41 | + e 111 |
| 42 | +``` |
| 43 | + |
| 44 | +## 🧮 Workflow & Algorithm |
| 45 | +Let's discuss the workflow and the algorithm with the above mentioned example. |
| 46 | +- Build a min heap that contains 6 nodes, where each node represents the root of a tree with a single node. |
| 47 | +- Extract the two minimum-frequency nodes from the min heap. Add a new internal node with frequency `5 + 9 = 14`. |
| 48 | +``` |
| 49 | +        14 |
| 50 | +       /  \ |
| 51 | +      /    \ |
| 52 | +  a -> 5    b -> 9 |
| 53 | +
|
| 54 | +Now the min heap contains 5 nodes: 4 nodes are roots of trees with a single element each, |
| 55 | +and one heap node is the root of a tree with 3 elements. |
| 56 | +
|
| 57 | +The heap now contains: |
| 58 | + character Frequency |
| 59 | + c 12 |
| 60 | + d 13 |
| 61 | + Internal Node 14 |
| 62 | + e 16 |
| 63 | + f 45 |
| 64 | +``` |
| 65 | +- Extract the two minimum-frequency nodes from the heap. Add a new internal node with frequency `12 + 13 = 25`. |
| 66 | +``` |
| 67 | +        25 |
| 68 | +       /  \ |
| 69 | +      /    \ |
| 70 | + c -> 12    d -> 13 |
| 71 | + |
| 72 | +Now the min heap contains 4 nodes: 2 nodes are roots of trees with a single element each, |
| 73 | +and 2 heap nodes are roots of trees with more than one node. |
| 74 | +
|
| 75 | + character Frequency |
| 76 | +Internal Node 14 |
| 77 | + e 16 |
| 78 | +Internal Node 25 |
| 79 | + f 45 |
| 80 | +``` |
| 81 | +- Extract two minimum frequency nodes. Add a new internal node with frequency `14 + 16 = 30` |
| 82 | +``` |
| 83 | +             30 |
| 84 | +            /  \ |
| 85 | +          14    e -> 16 |
| 86 | +         /  \ |
| 87 | +        /    \ |
| 88 | +   a -> 5    b -> 9 |
| 89 | +
|
| 90 | +Now min heap contains 3 nodes. |
| 91 | +
|
| 92 | + character Frequency |
| 93 | +Internal Node 25 |
| 94 | +Internal Node 30 |
| 95 | + f 45 |
| 96 | +``` |
| 97 | +- Extract two minimum frequency nodes. Add a new internal node with frequency `25 + 30 = 55` |
| 98 | +``` |
| 99 | +              55 |
| 100 | +            /    \ |
| 101 | +           /      30 |
| 102 | +          /      /  \ |
| 103 | +        25     14    e -> 16 |
| 104 | +       /  \   /  \ |
| 105 | +      c    d a    b |
| 106 | +     12   13 5    9 |
| 107 | +
|
| 108 | +Now min heap contains 2 nodes. |
| 109 | +
|
| 110 | +character Frequency |
| 111 | + f 45 |
| 112 | +Internal Node 55 |
| 113 | +``` |
| 114 | +- Extract two minimum frequency nodes. Add a new internal node with frequency `45 + 55 = 100` |
| 115 | +``` |
| 116 | +         100 |
| 117 | +        /   \ |
| 118 | +   f->45     \ |
| 119 | +              55 |
| 120 | +            /    \ |
| 121 | +           /      30 |
| 122 | +          /      /  \ |
| 123 | +        25     14    e -> 16 |
| 124 | +       /  \   /  \ |
| 125 | +      c    d a    b |
| 126 | +     12   13 5    9 |
| 127 | + |
| 128 | +Now min heap contains only one node. |
| 129 | +
|
| 130 | +character Frequency |
| 131 | +Internal Node 100 |
| 132 | +``` |
| 133 | +- Since the heap contains only one node, the algorithm stops here. |
| 134 | +- **Steps to print codes from Huffman Tree:** Traverse the tree formed starting from the root. Maintain an auxiliary array. While moving to the left child, write 0 to the array. While moving to the right child, write 1 to the array. Print the array when a leaf node is encountered. |
| 135 | +``` |
| 136 | +     (0) 100 (1) |
| 137 | +        /   \ |
| 138 | +   f->45     \ |
| 139 | +              55 |
| 140 | +            /    \ (1) |
| 141 | +       (0) /      30 |
| 142 | +          /  (0) /  \ (1) |
| 143 | +        25     14    e -> 16 |
| 144 | +       /  \   /  \ |
| 145 | +      c    d a    b |
| 146 | +     12   13 5    9 |
| 147 | +    (0) (1) (0) (1) |
| 148 | +``` |
| 149 | +- The codes are as follows: |
| 150 | +``` |
| 151 | +character code-word |
| 152 | + f 0 |
| 153 | + c 100 |
| 154 | + d 101 |
| 155 | + a 1100 |
| 156 | + b 1101 |
| 157 | + e 111 |
| 158 | +``` |
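
The steps above can be sketched with Python's standard `heapq` module. This is a minimal illustration under stated assumptions, not the contributed script itself: the function name `huffman_codes`, the tuple-based tree representation, and the tie-breaking counter are all choices made for this example.

```python
import heapq
from itertools import count


def huffman_codes(chars, freq):
    """Build a Huffman tree with a min heap and return {char: code}.

    Hypothetical helper for illustration only.
    """
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    # Each heap entry is (frequency, tiebreak, tree). A tree is either a
    # character (leaf) or a (left, right) tuple (internal node).
    heap = [(f, next(tiebreak), ch) for ch, f in zip(chars, freq)]
    heapq.heapify(heap)

    # Greedy step: repeatedly merge the two lowest-frequency trees.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):      # internal node
            walk(tree[0], prefix + "0")  # left edge writes 0
            walk(tree[1], prefix + "1")  # right edge writes 1
        else:                            # leaf: record the finished code
            codes[tree] = prefix or "0"  # a lone symbol still needs one bit

    if heap:
        walk(heap[0][2], "")
    return codes


chars = ['a', 'b', 'c', 'd', 'e', 'f']
freq = [5, 9, 12, 13, 16, 45]
for ch, code in huffman_codes(chars, freq).items():
    print(ch, code)
```

For the example input this walk visits the leaves in the order `f, c, d, a, b, e` and reproduces the code words listed above. With tied frequencies, a different tie-breaking rule can yield different but equally short codes.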
| 159 | + |
| 160 | +## 💻 Input and Output |
| 161 | +- **Test Case 1 :** |
| 162 | +```python |
| 163 | +# Input given: |
| 164 | +chars = ['a', 'b', 'c', 'd', 'e', 'f'] |
| 165 | +freq = [ 5, 9, 12, 13, 16, 45] |
| 166 | +``` |
| 167 | + |
| 168 | + |
| 169 | + |
| 170 | +- **Test Case 2 :** |
| 171 | +```python |
| 172 | +# Input given: |
| 173 | +chars = ['a', 'b', 'c', 'd'] |
| 174 | +freq = [ 5, 1, 6, 3] |
| 175 | +``` |
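
One way to sanity-check any implementation on the two inputs above is to compare the total encoded length, which is the same for every optimal prefix code regardless of how ties are broken. The sketch below assumes a hypothetical helper, `huffman_total_bits`, that sums the weights of the merged nodes; it is not part of the contributed script.

```python
import heapq


def huffman_total_bits(freq):
    """Total encoded length in bits (hypothetical helper).

    Each merge adds one bit to every symbol beneath the new node, so the
    sum of all merged weights equals sum(frequency * code length).
    """
    heap = list(freq)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total


print(huffman_total_bits([5, 9, 12, 13, 16, 45]))  # 224 (Test Case 1)
print(huffman_total_bits([5, 1, 6, 3]))            # 28  (Test Case 2)
```

For Test Case 1 this agrees with the code table in the workflow section: `45*1 + 12*3 + 13*3 + 5*4 + 9*4 + 16*3 = 224`.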
| 176 | + |
| 177 | + |
| 178 | +## ⏰ Time and Space complexity |
| 179 | +- **Time Complexity :** `O(n*log n)`. |
| 180 | +- **Space Complexity :** `O(n)`, since the heap holds at most `n` nodes and the tree has `2n - 1` nodes. |
| 181 | + |
| 182 | +--------------------------------------------------------------- |
| 183 | +## 🖋️ Author |
| 184 | +**Code contributed by _Abhishek Sharma_, 2022 [@abhisheks008](https://github.com/abhisheks008)** |
| 185 | + |
| 186 | +[](https://www.python.org/) |