Book opinion

L.L. Gatlin
Information theory and the living system

The book looks at living organisms through the lens of information theory. From the side of living organisms, DNA (RNA) and proteins are considered; from the side of information theory, entropy. The main outcome of the analysis presented in the book is the second theorem of evolution. The living organism is modeled as an information-processing channel whose input is DNA and whose output is proteins. Such a system should transmit information efficiently and without errors; hence, evolution must have ‘optimized’ this transmission. Since DNA is a sequence over a four-letter alphabet, entropy calculations can be applied to it. The divergence D1 measures the departure from the equiprobable state, and D2 the departure from independence between neighboring letters. Their sum, D1 + D2, is the total divergence from the maximum-entropy state, where the maximum entropy is log a and a is the number of letters in the alphabet. Given these quantities, one may compute the redundancy R = (D1 + D2)/log a. After analyzing the DNA of various living organisms in this way, the author concludes that vertebrates evolved toward higher R values, achieved by holding D1 relatively constant and increasing D2. This tendency holds only for higher organisms; lower organisms with high R values reached that state primarily by increasing D1. Therefore, throughout evolution, DNA sequences with higher information density were selected, which also happened to minimize the error in the information-processing channel. The author argues that evolution worked toward minimizing entropy at the DNA level. This does not violate the second law of thermodynamics, however, because the entropy of the protein chains was maximized. The author introduces game theory as a possible way this interplay of entropies could be achieved.
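To make the quantities concrete, here is a minimal sketch of how D1, D2, and R could be computed for a nucleotide sequence. It follows the definitions as summarized above (D1 as divergence from equiprobability, D2 as divergence from independence, estimated from single- and adjacent-pair frequencies); the function name and the use of base-2 logarithms are my own choices, not the book's.

```python
from collections import Counter
from math import log2

def redundancy(seq, alphabet="ACGT"):
    """Estimate Gatlin-style D1, D2, and R for a sequence over `alphabet`."""
    h_max = log2(len(alphabet))          # maximum entropy: log a
    n = len(seq)

    # H1: entropy of single-letter frequencies
    counts = Counter(seq)
    h1 = -sum((c / n) * log2(c / n) for c in counts.values())
    d1 = h_max - h1                      # divergence from equiprobability

    # H(pair): joint entropy of adjacent letter pairs
    pairs = Counter(zip(seq, seq[1:]))
    m = n - 1
    h_pair = -sum((c / m) * log2(c / m) for c in pairs.values())
    # Conditional entropy of a letter given its predecessor:
    h_cond = h_pair - h1
    d2 = h1 - h_cond                     # divergence from independence

    r = (d1 + d2) / h_max                # redundancy R = (D1 + D2) / log a
    return d1, d2, r
```

For a fully periodic sequence such as "ACGT" repeated, D1 is zero (all letters equiprobable) while D2 is maximal, so R approaches 1; a long random sequence would instead give R near 0.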

For me, the book was somewhat complicated and difficult to process, although I am almost sure that I got the main message right. I should point out that I am not an evolutionary biologist, or even a biologist, but in my field of neuroscience I see many similar instances of mathematical concepts being applied to biological data (EEG, fMRI, etc.). It is truly fascinating; what I find misleading, however, is that such analyses often make claims while losing sight of the substrate. For instance, big claims that the brain is critical often omit a description of what is actually critical. Spikes? The LFP signal? Or the alpha-rhythm envelope? I felt the same way about this book. Efficient information processing and storage, in the end, boil down to the entropies of the DNA strand. And I get it; it is possible to draw such a conclusion. However, I feel it would be wise to make smaller, more humble claims.

Overall, the book was worth reading. The math is not complicated, and the argumentation is clearly laid out. Some parts may not be up to date (the book is from 1972), but even a manuscript on bioRxiv can be outdated. Moreover, now I know why the answer to every question is 42!

Memorable quote: “If improving the hardware of the computer were the only method of improving the efficiency of information processing in the living system, evolution would have reached its apex in E. coli.”

July, 2023