Module constriction.symbol.huffman
Codebooks for Huffman coding [1].
The Huffman algorithm constructs an optimal code book for a given categorical
probability distribution. Note however, that even an optimal code book is limited to
assigning an integer number of bits to each symbol, which results in an overhead of up
to 1 bit per symbol in the regime of low entropy per symbol. Stream codes, as provided
by the module constriction.stream
remove most of this overhead by
amortizing compressed bits over several symbols.
Our implementation of Huffman trees uses the order in of provided symbols to break ties if two subtrees have the same weight. Thus, reordering probabilities can affect the shape of a Huffman tree in a nontrivial way in edge cases.
References
[1] Huffman, David A. "A method for the construction of minimum-redundancy codes." Proceedings of the IRE 40.9 (1952): 1098-1101.
Classes
class DecoderHuffmanTree (probabilities)
-
A Huffman tree that can be used for decoding data.
Expects a single argument
probabilities
, which is a rank-1 numpy array with floatdtype
that specifies the probabilities of each one of the symbols in the range{0, 1, ..., len(probabilities)-1}
. All probabilities must be nonnegative and finite, but probabilities do not need to add up to one since only the ratios of probabilities will affect the shape of the constructed Huffman tree (note, however, that rescaling probabilities can, in edge cases, affect the shape of the Huffman tree due to rounding errors, so be consistent with how you scale probabilities).Examples
See examples in parent module.
class EncoderHuffmanTree (probabilities)
-
A Huffman tree that can be used for encoding data.
Expects a single argument
probabilities
, which is a rank-1 numpy array with floatdtype
that specifies the probabilities of each one of the symbols in the range{0, 1, ..., len(probabilities)-1}
. All probabilities must be nonnegative and finite, but probabilities do not need to add up to one since only the ratios of probabilities will affect the shape of the constructed Huffman tree (note, however, that rescaling probabilities can, in edge cases, affect the shape of the Huffman tree due to rounding errors, so be consistent with how you scale probabilities).Examples
See examples in parent module.