I'm still working on Part 1, the Huffman algorithm, and I think I'm close to having something that works. When I compress and then decompress a file, the extracted file does not match the original. However, when I compare the code tables and the vectors of HuffmanCharFreqs (which list every character and its frequency) from the compression step against those from the decompression step, they do match. So I suspect I still have the wrong idea about how characters are represented as bits and bytes.
So is it correct to assume that a single character is always represented by a single byte? I think I came across something saying that Java uses Unicode, where a character takes 16 bits (2 bytes), and I already know that an ASCII character fits in 8 bits (1 byte). I'm still a bit unclear on this.
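
To make the question concrete, here is a minimal Java sketch of the distinction I am asking about. It only uses standard library calls. The class name `CharVsByte` and its helper methods are made up for illustration and are not part of the assignment code. It shows that a Java `char` is 16 bits wide, that the number of bytes a character takes in a file depends on the encoding, and that a Java `byte` is signed, so it needs masking before it can serve as a 0..255 table index.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class CharVsByte {

    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes when encoded as UTF-16 (big-endian, no byte-order mark).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // A Java byte is signed (-128..127); mask it to get an index in 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(Character.SIZE);           // bits in a Java char: 16
        System.out.println(utf8Length("A"));          // 1 byte in UTF-8
        System.out.println(utf8Length("\u00e9"));     // 2 bytes in UTF-8 (e with acute accent)
        System.out.println(utf16Length("A"));         // 2 bytes in UTF-16
        System.out.println(toUnsigned((byte) 0xE9));  // 233
    }
}
```

If this is right, one way to sidestep the question would be to treat the input as raw bytes rather than as `char` values, for example with `InputStream.read()`, which returns each byte as an `int` from 0 to 255 (or -1 at end of stream). Whether that is the cause of my mismatch is only a guess at this point.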