Sunday, April 09, 2006

Humpback whales: not just whistlin' Dixie

Scientists have found evidence that humpback whale songs contain syntax, like a primitive language.
Fresh mathematical analysis of the songs shows they follow complex grammatical rules. Using syntax, the whales combine sounds into phrases, which they further weave into hours-long melodies packed with information.

Although the researchers say these songs don't meet the linguistic rigor necessary for a true language, this is the first evidence that animals other than humans use hierarchical structure in their communication. Whales have also been found to sing in dialects.

Of course, it has long been assumed that humpback songs carry meaning, but until we find the humpback Rosetta Stone, we won't know what that meaning is.

So how can scientists, without knowing what the whales are saying (singing), know whether they are communicating information or just showing off their pipes? By analyzing the songs to see whether their patterns of repetition differ from both the random and the rote:

The researchers used information theory—the mathematical study of data encoding and transmission—to pick apart the whales' songs. It turns out all those moans, cries, and chirps convey significant amounts of information.
...
The amount of information expressed, however, can't compare to human speech. Whale songs generate less than one bit of information per second, while people convey about 10 bits of information per word spoken.
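To get a feel for where a bits-per-unit figure like that comes from, here is a minimal Python sketch that estimates the empirical Shannon entropy of a toy symbol sequence. The "song" and the two-second unit duration below are invented for illustration, and this zero-order estimate ignores the sequential structure a real analysis would also model; it is not the researchers' method or data.

    import math
    from collections import Counter

    def entropy_bits_per_symbol(sequence):
        # Empirical Shannon entropy in bits per symbol: H = -sum p * log2(p)
        counts = Counter(sequence)
        total = len(sequence)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    # Hypothetical toy "song": highly repetitive, so entropy per unit is low.
    song = list("ABABABCDABABABCDABABABCD")
    h = entropy_bits_per_symbol(song)
    print(round(h, 2), "bits per unit")
    # A unit lasting around two seconds would give h / 2 bits per second.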
The Wikipedia entry on information theory explains that "father of information theory" Claude Shannon defined the "self-information" of a message m as:
I(m) = −log p(m)

where p(m) = Pr(M = m) is the probability that message m is chosen from all possible choices in the message space M.

This equation causes messages with lower probabilities to contribute more to the overall value of I(m). In other words, infrequently occurring messages are more valuable. (This is a consequence of the property of logarithms that −log p(m) is very large when p(m) is near 0, for unlikely messages, and very small when p(m) is near 1, for almost certain messages.)

For example, if John says "See you later, honey" to his wife every morning before leaving for the office, that message holds little "content" or "value". But if he shouts "Get lost" at his wife one morning, that message holds much more value or content (because, presumably, the probability of him choosing it is very low).
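Plugging invented probabilities for John's two messages into Shannon's formula makes the asymmetry concrete; a quick Python sketch:

    import math

    def self_information(p):
        # Shannon self-information in bits: I(m) = -log2 p(m)
        return -math.log2(p)

    # The probabilities are made up purely for illustration.
    for message, p in {"See you later, honey": 0.99, "Get lost": 0.01}.items():
        print(f"{message!r}: {self_information(p):.2f} bits")
    # 'See you later, honey': 0.01 bits
    # 'Get lost': 6.64 bits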

This observation has some interesting effects on real-life encoding. For instance, if an army uses certain automated cryptographic methods to scramble instructions sent to the front, then it may be possible to distinguish between "hold your position" and "attack" based solely on the length of the message--because an efficient encoding assigns short codewords to common messages like "hold your position" and longer ones to rare messages like "attack".
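A sketch of why length can leak identity: a Huffman code, a standard minimum-length encoding (not necessarily what any army actually uses), gives shorter codewords to more frequent messages. The traffic mix below is invented:

    import heapq, itertools

    def huffman_lengths(freqs):
        # Codeword length per message in a Huffman code built from frequencies.
        tiebreak = itertools.count()  # keeps the heap from comparing dicts
        heap = [(f, next(tiebreak), {m: 0}) for m, f in freqs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            f1, _, d1 = heapq.heappop(heap)
            f2, _, d2 = heapq.heappop(heap)
            # Merging two subtrees pushes every message in them one level deeper.
            merged = {m: depth + 1 for m, depth in {**d1, **d2}.items()}
            heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
        return heap[0][2]

    # Hypothetical traffic: "hold your position" dominates.
    print(huffman_lengths({"hold your position": 90, "attack": 3,
                           "retreat": 3, "send supplies": 4}))
    # "hold your position" gets a 1-bit codeword; "attack" needs 3 bits.

Padding every message to a fixed length is the usual countermeasure to this kind of traffic analysis.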