Text compression through variable-length coding
Technologies: Java, Maven, JUnit, javadoc
HuffmanRevisited is a command-line app that compresses text files into a proprietary
.huf format and decompresses them again.
It uses Huffman coding, a well-known technique invented by David A. Huffman. The algorithm starts from a list of single-leaf subtrees, one per unique character of the string being compressed, sorted by frequency of occurrence. It repeatedly removes the two lowest-frequency subtrees, joins them under a new parent whose frequency is the sum of theirs, and returns the result to the sorted list, until a single tree remains. Characters that occur often end up close to the root and therefore receive shorter codes.
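The tree-building step described above can be sketched in a few lines of Java. This is a minimal illustration, not the code used in HuffmanRevisited: the class name HuffmanSketch, the Node type, and the methods buildTree and buildCodes are hypothetical names chosen for this example, and a java.util.PriorityQueue stands in for the sorted list of subtrees. The .huf file format and the bit-level output are not covered here.

```java
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Hypothetical sketch of Huffman tree construction; names are illustrative only.
class HuffmanSketch {

    // A subtree: a leaf carries one character, an internal node carries two children.
    static final class Node {
        final char symbol;   // meaningful only for leaves
        final long weight;   // accumulated frequency of all leaves below this node
        final Node left;
        final Node right;

        Node(char symbol, long weight) {
            this.symbol = symbol;
            this.weight = weight;
            this.left = null;
            this.right = null;
        }

        Node(Node left, Node right) {
            this.symbol = '\0';
            this.weight = left.weight + right.weight;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    // Builds the tree by repeatedly merging the two lowest-weight subtrees.
    // Returns null for empty input.
    static Node buildTree(String text) {
        Map<Character, Long> frequencies = new TreeMap<>();
        for (char c : text.toCharArray()) {
            frequencies.merge(c, 1L, Long::sum);
        }

        // The priority queue plays the role of the sorted list of subtrees.
        PriorityQueue<Node> queue =
                new PriorityQueue<Node>((a, b) -> Long.compare(a.weight, b.weight));
        for (Map.Entry<Character, Long> e : frequencies.entrySet()) {
            queue.add(new Node(e.getKey(), e.getValue()));
        }

        while (queue.size() > 1) {
            Node lowest = queue.poll();
            Node secondLowest = queue.poll();
            queue.add(new Node(lowest, secondLowest));
        }
        return queue.poll();
    }

    // Maps each character to its bit string ("0" = left branch, "1" = right branch).
    static Map<Character, String> buildCodes(String text) {
        Map<Character, String> codes = new TreeMap<>();
        Node root = buildTree(text);
        if (root == null) {
            return codes;
        }
        if (root.isLeaf()) {
            // Only one distinct character: give it a one-bit code.
            codes.put(root.symbol, "0");
            return codes;
        }
        assign(root, "", codes);
        return codes;
    }

    private static void assign(Node node, String prefix, Map<Character, String> codes) {
        if (node.isLeaf()) {
            codes.put(node.symbol, prefix);
            return;
        }
        assign(node.left, prefix + "0", codes);
        assign(node.right, prefix + "1", codes);
    }

    public static void main(String[] args) {
        // Prints one "character -> code" line per distinct character.
        buildCodes("aaaabbc").forEach((c, code) -> System.out.println(c + " -> " + code));
    }
}
```

For the input "aaaabbc", the most frequent character 'a' receives a one-bit code while 'b' and 'c' receive two-bit codes, which is the variable-length property the title refers to. When several subtrees share the same frequency, the exact bit patterns depend on how ties are broken, so a real encoder and decoder must agree on that rule or store the tree alongside the data.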
As I discuss in the blog post linked below, I enjoyed working on the bitwise transformations, the overall design and, in particular, setting up tests with JUnit.
I plan to use javadoc and write more tests to turn this into a fully documented and specced piece of software.