Final Project: Huffman Encoding
Assigned: April 27, 2018
Due: May 15, 2018 at 5pm
Summary: In this assignment, you will apply several of the data structures we have used
in class to create a file compression/decompression system using the Huffman
Encoding algorithm. You will have some time in class to work on the
assignment with your partner, but plan to spend time outside of class on the
assignment as well.
Learning To experience full-system object oriented design and appropriately use
Goals: provided data structures.
Collaboration: For this assignment, you will submit work in partners assigned in class. While
you can work with your partner any way you wish, I encourage you to use a
pair-programming approach while working on Class 1 in particular, whereas
Classes 2 and 3 can be worked on independently.
Submission : Turnin Form
Assignment
In this assignment, you will be working on information compression using a Huffman Tree. It is a
large assignment, so start with your partner early. You can choose to either work on it in a pair-
programming approach or by dividing some of the interfaces described below, implementing them
independently, and integrating them. While test-driven development for individual classes is
recommended, submission of test cases is not required. Bare-bones starter code is provided here.
At this point, you should be familiar with the core standard library data structures such as strings,
vectors, lists, stacks, queues, sets, and maps. Use them judiciously in your assignment!
Additionally, you are provided with utility classes for bitstreams, objects that work like regular
streams but read and write individual bits instead of multi-bit or multi-character data elements.
Note that while the interfaces below are required, you may also write additional internal methods if
it helps you organize your code. Do make sure that different classes and declared in different
header files, implemented in different implementation files, and added to the Makefile for
compilation.
Class 1: The decision tree (40%)
In a prior assignment, you implemented a special-purpose decision tree where internal and
external nodes each held different kinds of information. In this assignment, you will create a new
decision tree class that stores different information. The Huffman decision tree is a binary tree
where every node has a “weight” property describing the cumulative frequencies of everything in
that subtree, and leaves additionally have a “value” property that describes the character
represented by that leaf. A description of the Huffman tree construction algorithm is in section 12.4
of our textbook. You are strongly encouraged to review the previous decision tree assignment to
make this component easier.
The Huffman tree class should implement the following interfaces:
http://cs.whitman.edu/~strattja/cs/270/2018S/hw/fp.huffman.shtml
1/35/5/2018
CS 270, 2018S: Final Project
HuffmanTree(const map<char, float>&): construct a Huffman Tree from a set of characters
with frequencies. Your constructor should use the algorithms from the textbook and in class
to build a tree by merging subtrees together. Remember, a priority queue of Nodes is going
to be very helpful here!
void writeTree(ostream&): write the Huffman tree as text to a stream. The encoding format
for the Huffman decision tree is similar to the question-based decision tree. Print one node
per line: if the node is an internal node, begin the line with “NODE:”, followed by the
frequency of the characters in that subtree. If a node is a leaf, begin the line with “LEAF:”,
followed by the character, followed by the frequency of that character. A simple tree might
look like:
NODE: 1.0
LEAF: a 0.5
LEAF: b 0.5
HuffmanTree(istream&): construct a Huffman Tree from an already-open stream, according to
the same tree format described above. When done, the constructor should use the seekg
method to reset the istream to the beginning of its content (like closing and reopening the file,
without actually closing and reopening it.)
char decode(ibitstream&): beginning at the top of Huffman tree, decode a single character
from a stream of bits. Use the ibitstream member function “readBit” to read individual bits.
Traverse the tree, and return the decoded character. You may want to revisit the traversal
code of the previous Decision Tree data structure as a refresher.
map<char, std::vector
a map from each character represented in the Huffman Tree to the bit coding that would be
required to reach that character in the tree. (Hint: think about using a vector as a stack.
Keep track of what choices you have made in your traversal by pushing items on to the back
of the vector, and when you reach a leaf, the vector will contain the sequence of choices
made to reach that leaf. Add that vector to the map, and when you return from a recursion
step in the traversal, pop the last decision-item off the end of the stack. All of the entries in
the vector should be either 0 or 1.)
To test your HuffmanTree separately from an encoder, decoder, you’ll have to manually construct
the map and stream arguments in your test cases. Recall that cout can be used as an ostream,
cin as an istream. Bitstreams are a little different: take a look at the ibitstreamtest.cpp file to see
an example of how to create and use stringbitstream objects that might help you in your test cases.
Class 2: The Huffman Encoder (25%)
This class will have a HuffmanTree and an istream& as fields, and will use it to manage the
encoding and decoding of files through the following interfaces
HuffmanEncoder(istream&): create an encoder and construct the HuffmanTree for the given
istream. When the construction is completed, reset the istream to the beginning of its input
using the seekg function with the std::ios::beg position marker, e.g.
“mystream.seekg(0,std::ios::beg);”
int writeEncodedText(obitstream&): write the original text in encoded format to the given,
open obitstream, and return the number of characters encoded. Remember to reset the
istream position at the end of the function.
void writeTree(ostream&): invokes the writeTree function of the HuffmanTree to write the tree
used to encode this istream to the given output stream.
Class 3: The Huffman Decoder (25%)
http://cs.whitman.edu/~strattja/cs/270/2018S/hw/fp.huffman.shtml
2/35/5/2018
CS 270, 2018S: Final Project
This class will have a HuffmanTree as a field, and will use it to manage the decoding of a
compressed stream as produced by the encoder.
HuffmanDecoder(istream&): create a decoder and construct the HuffmanTree from the given
istream, which stores the tree in text format (as produced by writeTree).
void decodeText(int, ibitstream&, ostream&): decode the message from the given input bit
stream into the output character stream. The first argument indicates the expected number
of characters to be decoded.
void writeTree(ostream&): invokes the writeTree function of the HuffmanTree to write the
tree. (This is primarily for debugging purposes, most applications will not need it.)
Above and Beyond: User application (10%)
To complete the assignment, you must design and write some command-line user interface to
enable a user to encode and decode text with the classes you have built. As some suggestions to
get you started thinking, you could asking the user to provide names for files to read and write
through prompts or the command-line, or asking the user to type strings and print the huffman-
encoded results as ‘1’ and ‘0’ characters to the terminal, or write multiple different applications with
different main functions for handling encoding and decoding using the same libraries. Whichever
route you choose, start by looking up the documentation for the kind of streams you want to deal
with - files are handled through ofstreams and ifstreams (or ofbitstreams and ifbitstreams) and
strings can be processed using stringstream objects. Good links for the Stanford bitstream
libraries here, and the C++ standard i/o library here. Talk with your partner to decide how you want
your application to be informative, artistic, utilitarian, fun, or any or all of the above. Whatever
interface you design and choose, include a README.txt file that describes what you did and how
to use your application.
Submitting your work
Make sure your files are named correctly as specified above. In your comments at the top, make
sure you:
acknowledge any help or outside sources, in accordance with the academic honesty policy;
describe any known bugs in the program.
Rezip your materials using the command “zip -r mywhitmanID_Huffman.zip FP_huffman” in the
folder containing the final project unzipped package. Submit your completed package using the
online turnin form. Be sure to select the Final Project.