Newer
Older
labs / tiddlers / content / labs / lab01 / _Labs_01_Decoding and Encoding Data Sequences.md

Recall in lectures that we presented a Huffman tree that was constructed by examining the frequency of characters on Wikipedia: {{/Labs/01/Images/fig3.svg}} The corresponding Huffman coding for this tree is:
{{/Labs/01/Images/HuffmanTreeCode}}

Using this coding, we can decode the sequence

1100010100010100110100001100111001111011100111

as: 11000 = C (remaining bits: 10100010100110100001100111001111011100111)

1010 = O (remaining bits: 0010100110100001100111001111011100111)

001010 = M (remaining bits: 0110100001100111001111011100111)

011010 = P (remaining bits: 0001100111001111011100111)

000 = [space] (remaining bits: 1100111001111011100111)

1100111 = 1 (remaining bits: 001111011100111)

00111101 = 0 (remaining bits: 1100111)

1100111 = 1 (no remaining bits, end of sequence)

Resulting in the string 'COMP 101'.

Exercise

Use this Huffman coding to decode the following bits: 11010011001001001100110101001

Answer?


Encoding information using a Huffman coding follows the opposite direction (look up the symbol and emit the corresponding code). This is straightforward when the table describing the Huffman coding is sorted by symbol:

{{/Labs/01/Images/HuffmanBySymbol.png}} Now, we can quickly confirm that the sequence ‘QWERTY’ would be encoded as 1000100001(Q) 100011(W) 111(E) 1011(R​) 0101(T) 0010000(Y) 1000100001100011111101101010010000 (34 bits).