Wikipedia

Entropy encoding

In information theory an entropy encoding is a lossless data compression scheme that is independent of the specific characteristics of the medium.

One of the main types of entropy coding creates and assigns a unique prefix-free code to each unique symbol that occurs in the input. [1] These entropy encoders then compress data by replacing each fixed-length input symbol with the corresponding variable-length prefix-free output codeword. The length of each codeword is approximately proportional to the negative logarithm of the probability of occurrence of that codeword. Therefore, the most common symbols use the shortest codes.[2]

According to Shannon's source coding theorem, the optimal code length for a symbol is , where is the number of symbols used to make output codes and is the probability of the input symbol.

Two of the most common entropy encoding techniques are Huffman coding and arithmetic coding.[3] If the approximate entropy characteristics of a data stream are known in advance (especially for signal compression), a simpler static code may be useful. These static codes include universal codes (such as Elias gamma coding or Fibonacci coding) and Golomb codes (such as unary coding or Rice coding).

Since 2014, data compressors have started using the Asymmetric Numeral Systems family of entropy coding techniques, which allows combination of the compression ratio of arithmetic coding with a processing cost similar to Huffman coding.

Entropy as a measure of similarity

Besides using entropy encoding as a way to compress digital data, an entropy encoder can also be used to measure the amount of similarity between streams of data and already existing classes of data. This is done by generating an entropy coder/compressor for each class of data; unknown data is then classified by feeding the uncompressed data to each compressor and seeing which compressor yields the highest compression. The coder with the best compression is probably the coder trained on the data that was most similar to the unknown data.

See also

References

  1. ^ "Education - Entropy Encoding". www.pcs-ip.eu. Retrieved 2020-10-13.
  2. ^ "What is Entropy Coding | IGI Global". www.igi-global.com. Retrieved 2020-10-13.
  3. ^ Huffman, David (1952). "A Method for the Construction of Minimum-Redundancy Codes". Proceedings of the IRE. Institute of Electrical and Electronics Engineers (IEEE). 40 (9): 1098–1101. doi:10.1109/jrproc.1952.273898. ISSN 0096-8390.

External links

This article is copied from an article on Wikipedia® - the free encyclopedia created and edited by its online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of Wikipedia® encyclopedia articles provide accurate and timely information, please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.

Copyright © 2003-2025 Farlex, Inc Disclaimer
All content on this website, including dictionary, thesaurus, literature, geography, and other reference data is for informational purposes only. This information should not be considered complete, up to date, and is not intended to be used in place of a visit, consultation, or advice of a legal, medical, or any other professional.