Module constriction.symbol.huffman

Codebooks for Huffman coding [1].

The Huffman algorithm constructs an optimal code book for a given categorical probability distribution. Note however, that even an optimal code book is limited to assigning an integer number of bits to each symbol, which results in an overhead of up to 1 bit per symbol in the regime of low entropy per symbol. Stream codes, as provided by the module constriction.stream remove most of this overhead by amortizing compressed bits over several symbols.

Our implementation of Huffman trees uses the order in of provided symbols to break ties if two subtrees have the same weight. Thus, reordering probabilities can affect the shape of a Huffman tree in a nontrivial way in edge cases.

References

[1] Huffman, David A. "A method for the construction of minimum-redundancy codes." Proceedings of the IRE 40.9 (1952): 1098-1101.

Classes

class DecoderHuffmanTree (probabilities)

A Huffman tree that can be used for decoding data.

Expects a single argument probabilities, which is a rank-1 numpy array with float dtype that specifies the probabilities of each one of the symbols in the range {0, 1, ..., len(probabilities)-1}. All probabilities must be nonnegative and finite, but probabilities do not need to add up to one since only the ratios of probabilities will affect the shape of the constructed Huffman tree (note, however, that rescaling probabilities can, in edge cases, affect the shape of the Huffman tree due to rounding errors, so be consistent with how you scale probabilities).

Examples

See examples in parent module.

class EncoderHuffmanTree (probabilities)

A Huffman tree that can be used for encoding data.

Expects a single argument probabilities, which is a rank-1 numpy array with float dtype that specifies the probabilities of each one of the symbols in the range {0, 1, ..., len(probabilities)-1}. All probabilities must be nonnegative and finite, but probabilities do not need to add up to one since only the ratios of probabilities will affect the shape of the constructed Huffman tree (note, however, that rescaling probabilities can, in edge cases, affect the shape of the Huffman tree due to rounding errors, so be consistent with how you scale probabilities).

Examples

See examples in parent module.