Current Forum: Homework 4 - Huffman Trees (Part 1)
Date: Sun Oct 7 2001 10:12 pm
Author: Bortz, Andrew S. <abortz@andrew.cmu.edu>
Subject: Re: remaining bits
There are two easy ways to deal with the leftover bits at the end of the compressed file. One is to store the length of the original, uncompressed file in the header of the compressed one. That way, during decompression you know when to stop reading bits from the compressed file: you stop as soon as you've written that many bytes to the output.
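Here's a rough sketch of the length-in-header idea, in Python just to keep it short. The code table `CODES`, the 4-byte big-endian length field, and the function names are all made up for illustration; your real codes come from your Huffman tree and your header layout is up to you. The sketch counts characters, which is the same as bytes for plain ASCII input.

```python
import struct

# Hypothetical prefix-free code table, for illustration only.
# A real one would be derived from the Huffman tree.
CODES = {"a": "0", "b": "10", "c": "11"}

def compress_with_length(text):
    """Return a 4-byte length header followed by the zero-padded code bits."""
    bits = "".join(CODES[ch] for ch in text)
    bits += "0" * (-len(bits) % 8)  # pad the last byte with zero bits
    body = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return struct.pack(">I", len(text)) + body

def decompress_with_length(data):
    """Decode until the number of symbols named in the header has been written."""
    (length,) = struct.unpack(">I", data[:4])
    decode = {code: ch for ch, code in CODES.items()}
    out, current = [], ""
    for byte in data[4:]:
        for bit in format(byte, "08b"):
            if len(out) == length:
                return "".join(out)  # everything left is padding
            current += bit
            if current in decode:
                out.append(decode[current])
                current = ""
    return "".join(out)
```

With this table, "abca" encodes to the bits 010110 plus two padding zeros. Without the stored length, those two zeros would decode as two extra a's.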
The second is to use a special character to mark the end of the file. This character has a frequency of 1 in the original (duh) and gets its own code in the Huffman tree. During decompression, when you read the codeword that corresponds to the EOF character, you stop.
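A minimal sketch of the EOF-character approach, again in Python and again with invented names (`EOF`, `build_codes`, `encode`, `decode`). It keeps the bits as a string of '0' and '1' for readability and passes the code table straight to the decoder, whereas a real implementation would pack bytes and rebuild the tree from the header.

```python
import heapq
from collections import Counter
from itertools import count

EOF = "<EOF>"  # hypothetical sentinel; must be a value that cannot occur in the input

def build_codes(text):
    """Build Huffman codes for the characters of text plus one EOF symbol."""
    freq = Counter(text)
    freq[EOF] = 1  # the EOF symbol occurs exactly once
    tiebreak = count()  # unique counter so the heap never compares symbols or subtrees
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:  # leaf: a character or EOF
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes

def encode(text, codes):
    """Emit the codes for text, then the EOF code, then zero padding to a byte boundary."""
    bits = "".join(codes[ch] for ch in text) + codes[EOF]
    return bits + "0" * (-len(bits) % 8)

def decode(bits, codes):
    """Decode bit by bit and stop as soon as the EOF codeword is read."""
    lookup = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            if lookup[current] == EOF:
                break  # any bits after this point are padding
            out.append(lookup[current])
            current = ""
    return "".join(out)
```

Because the decoder stops at the EOF codeword, it never has to know how many padding bits follow.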
In terms of complexity, storing the length of the file in the header is easiest. It also tends to use less space, since the EOF character generally takes quite a few bits to encode in the compressed file. The only advantage of an EOF character is that it still works when you don't know the length of the file ahead of time (but we do).