information theory: Measurement of Information Content

Measurement of Information Content

Numerically, information is measured in bits (short for binary digit; see binary system). One bit is equivalent to the choice between two equally likely choices. For example, if we know that a coin is to be tossed but are unable to see it as it falls, a message telling whether the coin came up heads or tails gives us one bit of information. When there are several equally likely choices, the number of bits is equal to the logarithm of the number of choices taken to the base two. For example, if a message specifies one of sixteen equally likely choices, it is said to contain four bits of information. When the various choices are not equally probable, the situation is more complex.

Interestingly, the mathematical expression for information content closely resembles the expression for entropy in thermodynamics. The greater the information in a message, the lower its randomness, or “noisiness,” and hence the smaller its entropy. Since the information content is, in general, associated with a source that generates messages, it is often called the entropy of the source. Often, because of constraints such as grammar, a source does not use its full range of choice. A source that uses just 70% of its freedom of choice would be said to have a relative entropy of 0.7. The redundancy of such a source is defined as 100% minus the relative entropy, or, in this case, 30%. The redundancy of English is estimated to be about 50%; i.e., about half of the elements used in writing or speaking are freely chosen, and the rest are required by the structure of the language.

Sections in this article:

The Columbia Electronic Encyclopedia, 6th ed. Copyright © 2023, Columbia University Press. All rights reserved.

See more Encyclopedia articles on: Computers and Computing