Recall in lectures that we presented a Huffman tree that was constructed by examining the frequency of characters on Wikipedia:
{{/Labs/01/Images/fig3.svg}}
The corresponding Huffman coding for this tree is:
{{/Labs/01/Images/HuffmanBySymbol.svg}}
Using this coding, we can decode the sequence
1100010100010100110100001100111001111011100111
as:
11000 = C (remaining bits: 10100010100110100001100111001111011100111)
1010 = O (remaining bits: 0010100110100001100111001111011100111)
001010 = M (remaining bits: 0110100001100111001111011100111)
011010 = P (remaining bits: 0001100111001111011100111)
000 = [space] (remaining bits: 1100111001111011100111)
1100111 = 1 (remaining bits: 001111011100111)
00111101 = 0 (remaining bits: 1100111)
1100111 = 1 (no remaining bits, end of sequence)
Resulting in the string 'COMP 101'.
Use this Huffman coding to decode the following bits:
11010011001001001100110101001
Encoding information using a Huffman coding follows the opposite direction (look up the symbol and emit the corresponding code). This is straightforward when the table describing the Huffman coding is sorted by symbol:
{{/Labs/01/Images/HuffmanByCharacter.svg}}
Now, we can quickly confirm that the sequence ‘QWERTY’ would be encoded as:
1000100001(Q) 100011(W) 111(E) 1011(R) 0101(T) 0010000(Y)
1000100001100011111101101010010000 (34 bits).