A Bit About Bits
Shannon's Measure of Information
20 Questions
-
How many possible animals can be described using 20 questions?
-
How many guesses do you need to guess the price of something that costs up to a million bucks?
-
Represent game as a tree
-
The depth of tree is logarithm
1
____/\____
/ \
/ \
/ \
2 3 log2(8) = 3
/ \ / \
/ \ / \
/ \ / \
4 5 6 7
/ \ / \ / \ / \
a b c d e f g h
-log2 P is the number of bits required to specify one of N possible symbols where P = 1/N
information - from a sender's point of view: a measure of the freedom of choice in composing a message. If there are two choices, say, "yes" or "no", then there is one bit of choice. From a receiver's point of view: receiving one message of two reduces the uncertainty by one bit. (After [1], page 266)
entropy = uncertainty = H = Sum -P(x) log P(x)
x
information gain = H(after) - H(before)
examples:
A = 50%, C = 25%, G = 12.5%, T = 12.5%
Sending one symbol transmits 1.75 bits of information:
H = -1/2 log2 (1/2) + -1/4 log2 (1/4) + -1/8 log2 (1/8) + -1/8 log2 (1/8)
= 1.75
However, if A, C, G, T are all equally likely, then H = 2.0 bits
A message of 1000 symbols then transmits 1,750 bits of information (case 1) or 2,000 bits (case 2)
Noise-free channel:
H before transmission: -1/2 log2 (1/2) + -1/2 log2 (1/2) = 1.0 bits H after transmission: 0 log2 0 + -1 log2 1 = 0.0 bits total information gained: 1.0 bit
Noisy channel example (99% reliable, 1% noisy):
H before transmission: -1/2 log2 (1/2) + -1/2 log2 (1/2) = 1.0 bits H after transmission: -0.99 log2 (0.99) + -0.01 log2 (0.01) = 0.081 bits total information gained: 0.919 bits
Noisy channel example (50% reliable, 50% noisy):
H before transmission: -1/2 log2 (1/2) + -1/2 log2 (1/2) = 1.0 bits H after transmission: -0.50 log2 0.50 + -0.50 log2 (0.50) = 1.0 bits total information gained: 0.0 bits
Problems with Estimating Probabilities
If you don't know the probabilities, then you can't measure entropy directly.
But, the probabilities can be learned.
Connections to simulated evolution? Holland's Schema Theorem [5].
The Laws of Thermodynamics
entropy - the measure of disorder in a physical system.
The First Law of Thermodynamics (The Law of Conservation)
The total amount of energy in a closed system remains constant.
The Second Law of Thermodynamics (The Second Law)
The total amount of disorder in a closed system increases.
The Third Law of Thermodynamics (The Big Conjecture ???)
The information in a closed system tends to coagulate.
Information is Physical
Landauer: Erasing information necessarily causes energy loss. Key to understanding Maxwell's Demon.
Bennett: "Reversible computing". Computing does not necessarily require energy.
Maybe reversibility gives meaning to information?
Images, Telephone calls, and Music
Here is an image that is 600 width x 125 height x 3 colors x 8 bits = 1,800,000 bits:
But the actual file size is only 50,992 bits. How can that be? Where are the rest of the bits?
Consider: 01010101010101010101010101010101010101010101010101010101010101010101010101010101
There are 80 bits there. But you could describe the pattern by saying "01,40 times" which is much shorter. Most data (images, speech, music, etc.) is made up of patterns. Compression takes advantage of the patterns, thereby turning the patterns into symbols.
If you squeeze data into as few number of bits as possible (get rid of all the patterns), then the compressed representation is indistinguishable from noise (i.e., complete randomness).
What happens when you get noise in such representation? How would you know?
Redundency checks, or even better, error correcting codes!
References
[1] The Bit and The Pendulum, Tom Siegfried. 2000, John Wiley & Sons.
[2] The Mathematical Theory of Communication, Claude Shannon and Warren Weaver. 1949, University of Illinois Press.
[3] Landauer
[4] Bennet
[5] The Genetic Algorithm, John Holland.
Links
