Information Theory

The version of information theory formulated by the mathematician and engineer Claude Shannon (1916–2001) addresses the processes involved in transmitting digitized data down a communication channel. Once a set of data has been encoded into binary strings, these strings are converted into electronic pulses, each of equal length, typically with 0 represented by zero volts and 1 by +5 volts. Thus, a string such as 0100110 would be transmitted as seven pulses of 0, +5, 0, 0, +5, +5, and 0 volts.
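
The mapping from binary digits to pulses can be sketched in a few lines of Python. This is only an illustration: the voltage levels are those given in the text, while the function names and the text rendering of the pulse train are assumptions made for the example.

```python
# Voltage levels taken from the text: 0 -> 0 V, 1 -> +5 V.
LOW_VOLTS = 0.0
HIGH_VOLTS = 5.0


def bits_to_pulses(bits: str) -> list[float]:
    """Map each binary digit to the voltage of one fixed-length pulse."""
    levels = {"0": LOW_VOLTS, "1": HIGH_VOLTS}
    return [levels[b] for b in bits]


def draw(pulses: list[float]) -> str:
    """Render the pulse train as text: '_' for a low pulse, '#' for a high one."""
    return "".join("#" if v > 0 else "_" for v in pulses)


pulses = bits_to_pulses("0100110")
print(pulses)        # [0.0, 5.0, 0.0, 0.0, 5.0, 5.0, 0.0]
print(draw(pulses))  # _#__##_
```

Each character in the rendered string stands for one pulse of the same duration, which is why the two adjacent 1s appear as two separate marks rather than one longer mark.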

It is clear from the example that the lengths of pulses must be fixed in order to distinguish between 1 and 11. In practice, this picture is idealized. Electronic pulses are not perfectly discrete, nor are their lengths absolutely precise. The electronic circuits that generate these signals are based upon analogue processes that do not operate perfectly, and each pulse consists of millions of electrons emitted and controlled by transistors and other components that operate only within certain tolerances. As a result, in addition to the information sent intentionally down a channel, it is necessary to allow for the presence of error in the signal; such error is called noise.
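
The effect of noise on such a signal can be sketched as follows. The received voltages are invented readings chosen for illustration, and the 2.5-volt decision threshold is an assumption (the midpoint between the two nominal levels), not a figure from any particular device.

```python
# Assumed decision level, halfway between the nominal 0 V and +5 V.
THRESHOLD_VOLTS = 2.5


def decode(received: list[float]) -> str:
    """Read each pulse as 1 if it is at or above the threshold, else 0."""
    return "".join("1" if v >= THRESHOLD_VOLTS else "0" for v in received)


sent_bits = "0100110"

# Hypothetical noisy readings of the seven pulses (volts). The sixth pulse
# was sent at +5 V but has been dragged below the threshold by noise.
received = [0.4, 4.1, -0.3, 1.2, 5.6, 2.2, 0.9]

decoded = decode(received)
error_positions = [i for i, (s, r) in enumerate(zip(sent_bits, decoded)) if s != r]

print(decoded)          # 0100100
print(error_positions)  # [5]
```

Most of the distorted pulses are still read correctly, because they stay on the right side of the threshold; only the pulse that crosses it produces an error in the received string.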

This example illustrates the danger inherent in the difference between the way a process is represented in a conceptual system and the underlying physical processes that deliver it. To conceive of computers as if they operate with perfectly clear 0 and 1 circuits is to overlook the elaborate and extensive error-checking needed to ensure that data are transmitted correctly, checking that is expensive in both time and money.
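
One of the simplest forms of error-checking is a parity bit, sketched below. It is offered only as an example of the general idea; the text does not say which checks real systems use, and practical schemes are considerably more elaborate. The function names are assumptions made for the illustration.

```python
def add_parity(bits: str) -> str:
    """Append one bit so that the total number of 1s is even (even parity)."""
    parity_bit = str(bits.count("1") % 2)
    return bits + parity_bit


def parity_ok(word: str) -> bool:
    """Return True if the word still contains an even number of 1s."""
    return word.count("1") % 2 == 0


word = add_parity("0100110")  # three 1s, so the parity bit is 1
print(word)                   # 01001101
print(parity_ok(word))        # True

# Simulate noise flipping the sixth bit from 1 to 0.
corrupted = word[:5] + "0" + word[6:]
print(corrupted)              # 01001001
print(parity_ok(corrupted))   # False
```

The check costs one extra bit per word and some extra computation at each end, and it detects only an odd number of flipped bits, which hints at why more thorough checking is expensive.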

In 1948, Shannon published what came to be the defining paper of communication theory. In this paper he investigated how noise imposes a fundamental limit on the rate at which data can be transmitted down a channel. Early in his paper he wrote:

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. (p.379)

The irrelevance of meaning to communication makes precisely the point that the encoding of information and its transmission are not intrinsically connected. Shannon realized that if one wishes to transmit the binary sequence 0100110 down a channel, it is irrelevant what it means, not least because different encodings can make it mean almost anything. What matters is that what one intends to transmit, as a binary string, should arrive "exactly or approximately" at the other end as that same binary string. The assumption is that the encoding process that produces the binary string and the decoding process that regenerates the original message are known to both the transmitter and the receiver. Communication theory addresses the problems of ensuring that what is received is what was transmitted, to a good approximation.
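
The separation between a binary string and its meaning can be illustrated with a short sketch. The three interpretations below (an unsigned integer, a 7-bit ASCII character, and a row of on/off flags) are chosen for the example and are not taken from Shannon's paper.

```python
bits = "0100110"

# The same seven bits read under three different, assumed encodings.
as_integer = int(bits, 2)             # unsigned binary number
as_character = chr(int(bits, 2))      # 7-bit ASCII code
as_flags = [b == "1" for b in bits]   # seven on/off switches

print(as_integer)    # 38
print(as_character)  # &
print(as_flags)      # [False, True, False, False, True, True, False]


def encode_char(ch: str) -> str:
    """Sender's side: turn one ASCII character into a 7-bit string."""
    return format(ord(ch), "07b")


def decode_char(word: str) -> str:
    """Receiver's side: turn a 7-bit string back into a character."""
    return chr(int(word, 2))


# The message survives only because both ends share the same convention.
print(decode_char(encode_char("&")))  # &
```

The channel carries the string 0100110 identically in all three cases; the meaning is recovered only because transmitter and receiver have agreed on the encoding beforehand.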

See also INFORMATION; INFORMATION TECHNOLOGY

Bibliography

Shannon, Claude E. "A Mathematical Theory of Communication." The Bell System Technical Journal 27 (1948): 379–423, 623–656.

JOHN C. PUDDEFOOT