An elusive and fundamental question in science is: what makes a physical system alive? Certain physical systems like mountains, rivers, and rocks are generally considered nonliving, while systems like bees, trees, and grass are considered living. What exactly distinguishes a living physical system from a nonliving one? While science typically employs very general definitions (for example, that living systems are open to energy, reproduce, and evolve over time), highly specific definitions are rarely provided. This article attempts to provide a definition of life based on an information-theoretic approach. This definition, which is sketched out below, is that living systems occur when self-making information is self-embedded.
The basics of this definition are as follows. A certain subset of physical systems can be modeled as informational systems, where physical markers represent bits of data in a communication, computation, or cybernetic system. Every biological system must contain the information to self-make (e.g. DNA), but not every information system is a biological system (e.g. silicon computers are nonliving), so only a particular subset of informational systems would be considered living. Under the proposed criterion, only information systems in which the information to self-make is marked in the self being made would be considered living. Information and computation are generally substrate independent and can be equally represented on silicon chips, CDs, or even knots on a rope. However, the information in biological systems is substrate dependent: the markers of the information to self-make must also be embedded in, or marked by, the self that is being made. For example, the information of DNA stored on a hard drive is not living. Only when the information of DNA is marked by DNA in a living cell (in the appropriate environment) is it considered living. Essentially, when the information to self-make a system X is itself marked by system X, that system would be considered living.
To fully appreciate this seemingly simple definition of life, it is helpful to take a step back and provide some background on different approaches to defining life within physical theories. Some important concepts that will be lightly covered include:
- The relation of higher-level emergence of life to lower-level physical theories.
- The role of thermodynamic entropy in life.
- The role of Shannon informational entropy in life.
- Nature’s embedded metainformation (nature marking information about nature).
- The role of autopoiesis and self-making in life.
- Implications of life as self-making information self-marked.
An important disclaimer on the topic of the emergence of life is that most scientific approaches do not assume that living systems possess something fundamentally new or incompatible with lower-level physical systems. Following the unity of science, the notion that science attempts to provide a consistent description of one reality, the rules of biology (rules of certain chemical systems) should never violate the rules of physics (rules of all matter and energy in spacetime), even if the rules of biology are not efficiently predictable from physics. This means that life emerges in a certain subset of physical systems and does not introduce anything that violates physical laws. A similar situation occurs in computers. A certain arrangement of circuit hardware produces the emergent effect of computational software, but the software is not something new in nature; it emerges from hardware under a particular configuration. Likewise, the emergent effects of life and consciousness arise from a particular configuration of physical parts. A primary task of science is to identify specifically which arrangements of matter produce the emergent effect of life. Denying the requirement that higher-level rules be compatible with lower-level rules essentially breaks the attempt of science to understand how nature operates as a unified whole, and results in modeling nature as disjointed realities with different rules (which can lead to other logical inconsistencies).
An early pioneer in defining life via a physical theory was Erwin Schrödinger, who wrote on the topic in the 1944 book What is Life? Schrödinger described life not only in terms of a self-reproducing aperiodic crystal that carries hereditary information (this was before the discovery of the DNA double helix), but also in terms of an open thermodynamic system that maintains order in an out-of-equilibrium state. Following the second law of thermodynamics, the entropy, or disorder, of an isolated system tends to increase over time. This creates the effect that ordered systems tend to become more random, and to disperse energy, over time. Life, on the other hand, seems to defy the increase of entropy by creating highly ordered structures over time. This, however, is still compatible with thermodynamics when considering an open system. A system open to the flow of matter and energy can decrease in entropy (thereby increasing order) if the entropy of the environment increases enough that the entropy of the universe as a whole stays the same or increases. Essentially, life can use energy from the environment to locally increase order and decrease entropy. Following this definition, living systems must be open systems to counter the effects of entropy. However, Schrödinger did not provide more specific details of what exactly life is ordering, beyond self-reproducing. The definition of life used in this article provides further clarity on exactly what is being ordered in open living systems, namely information to self-make that is self-embedded.
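The open-system argument can be written as a simple entropy balance. The symbols are labels introduced here for illustration and are not Schrödinger's own notation: \Delta S_{\text{sys}} is the entropy change of the living system and \Delta S_{\text{env}} is the entropy change of its environment.

```latex
\Delta S_{\text{total}} = \Delta S_{\text{sys}} + \Delta S_{\text{env}} \ge 0
\qquad\Longrightarrow\qquad
\Delta S_{\text{env}} \ge -\Delta S_{\text{sys}}
```

So a local decrease in entropy, \Delta S_{\text{sys}} < 0, is allowed as long as the environment gains at least as much entropy as the system loses.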
Information theory, introduced by Claude Shannon in 1948, explains how communication and computation occur when physical systems mark and manipulate data. An essential concept in Shannon's theory is mutual information, which is the information that a sender and receiver share. When there is a high amount of mutual information between a sender and receiver (e.g. a strong radio signal), informational signals can be communicated with a high degree of accuracy. However, as a system is exposed to more and more noise (e.g. a fuzzy radio signal), the signal and communication can get distorted and the mutual information decreases.
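To make the effect of noise on mutual information concrete, here is a minimal sketch in Python. It assumes the simplest textbook model, a binary symmetric channel with equally likely 0s and 1s at the sender, which is an illustrative choice and not something specific to biology. The function names and the `flip_prob` parameter are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a binary variable with P(1) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome carries no uncertainty
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def mutual_information_bsc(flip_prob: float) -> float:
    """Mutual information, in bits per symbol, between sender and receiver.

    Assumes a binary symmetric channel with equally likely 0s and 1s at
    the sender; noise flips each bit with probability flip_prob.
    """
    return 1.0 - binary_entropy(flip_prob)


if __name__ == "__main__":
    for flip_prob in (0.0, 0.01, 0.1, 0.5):
        bits = mutual_information_bsc(flip_prob)
        print(f"flip probability {flip_prob:.2f}: {bits:.3f} bits shared per symbol")
```

With no noise, each transmitted bit is fully shared. When bits are flipped half the time, the receiver learns nothing about what was sent, so the mutual information drops to zero.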
Shannon information has an important connection to energy and thermodynamic entropy. Following Landauer’s principle, irreversibly erasing or overwriting information dissipates a minimum amount of energy as heat, so storing and processing information in practice carries an energy cost. Additionally, following the tendency of thermodynamic entropy to increase, information tends to degrade over time with noise. Storing, maintaining, processing, sending, and receiving information requires the controlled use of energy, and open systems that work to counter the effects of thermodynamic entropy and the increase of noise. For example, computers and the internet require energy inputs. Following Landauer’s principle, the information systems in living processes also require the controlled use of energy, which aligns with and adds detail to Schrödinger’s perspective that life must be an open system to maintain order.
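For a sense of scale, the Landauer bound on erasing one bit is k_B T ln 2, where k_B is the Boltzmann constant and T is the temperature. The sketch below evaluates that bound in Python. The function name is hypothetical, and 310 K is used only as an assumed stand-in for body temperature.

```python
import math

# Boltzmann constant in joules per kelvin (exact value in the SI).
BOLTZMANN_K = 1.380649e-23


def landauer_limit_joules(temperature_kelvin: float, bits: float = 1.0) -> float:
    """Minimum heat dissipated when irreversibly erasing `bits` of information."""
    return bits * BOLTZMANN_K * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    body_temperature = 310.0  # kelvin; an assumed illustrative value
    per_bit = landauer_limit_joules(body_temperature)
    print(f"Landauer bound at {body_temperature} K: {per_bit:.3e} J per bit")
```

The result is on the order of 10^-21 joules per bit. This is a theoretical lower bound, and real information-processing systems generally dissipate far more than this.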
There is another important distinction to make about information: some information relates to properties of nature (e.g. physics equations), while other information does not provide any additional insight into nature beyond the physical markers themselves (e.g. scrambled letters). An interesting connection here is the idea of embedded metainformation, which can be expressed via a metalanguage. A metalanguage is a language about a language. For example, “noun” and “verb” are words about words in the English language. Because these words are themselves part of the English language, rather than another language, the metalanguage is called embedded. A similar effect occurs in a physics textbook. The information marked by the atoms in the physics textbook is about the properties of atoms in nature. Scientific knowledge can be thought of as embedded information, where nature is marking information about nature. The information in living systems is about nature, and particularly about self-making the system in the environment, rather than something arbitrary or about some other property of nature. DNA molecules are structures in nature, but they also contain information about how to manipulate other structures in nature to self-make the cell. Further, nature’s metainformation to self-make a living system is not embedded just anywhere in nature, but in the living system itself (so it is called “self-embedded”).
The distinction of life as self-making is an important point made by Francisco Varela and Humberto Maturana, who spearheaded the idea of autopoiesis in 1972. Following Varela’s and Maturana’s theory of life, systems capable of autopoiesis, or self-making in the environment, are living systems. Self-making is more comprehensive than just self-production, as self-making entails making the very parts of which a system is composed. A system that can self-make gathers input materials from the environment and processes these as needed in order to self-make. This is analogous to having a full supply chain in a single production system. The living system can take base materials from the environment, including materials produced by other nonliving and living systems, and process these materials to fully self-make the system with no external agents. Self-making also entails self-replication, self-assembly, self-diagnosis, and self-repair, as well as the ability to adapt the self-making information as the environmental conditions change.
Living systems contain a specific type of information to self-make the system in the environment. It is important to stress the “in the environment”. The ability to interpret an information system is always in relation to the environment (or between multiple systems), which can be thought about in terms of mutual information. While information is often described in terms of bits of information present in a single storage device, these bits only make sense to another system once interpreted via the mutual information of a sender and receiver. So, it is important to note that the information of life is not an isolated kind of information that only provides self-making instructions regardless of the environment. The information of life contains the information to self-make the system within the environment.
An information-theoretic approach has interesting connections to evolution. If life has the information to self-make in the environment, but the environment is constantly changing, then the information in life must also be able to change. If organisms are unable to adapt their self-making information in response to the environment, they will fail to persist over multiple generations. This adaptation can be accomplished through Darwinian evolution and other adaptive mechanisms in biology.
Another implication of living systems as self-making, self-marked information systems is that the increase of thermodynamic entropy exposes the living system to information degradation. For example, instructions for how to construct something, stored in a book, CD, or flash drive, degrade over time. To counter this, living systems need error-correction mechanisms and the ability to assess and fix errors when self-making and self-replicating. There are many biological mechanisms and duplicate copies of information (e.g. multiple chromosome copies) that increase redundancy, improve error correction, and ensure that the self-making information is not degraded and lost over time.
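The link between redundancy and error correction can be illustrated with a simple repetition code, sketched below in Python. This is an information-theoretic analogy only and not a model of how cells actually repair DNA. All function names, the error rate, and the example sequence are hypothetical choices for illustration.

```python
import random
from collections import Counter


def add_noise(sequence: str, error_rate: float, rng: random.Random,
              alphabet: str = "ATGC") -> str:
    """Return a copy of `sequence` where each symbol is independently
    replaced by a different symbol with probability `error_rate`."""
    noisy = []
    for symbol in sequence:
        if rng.random() < error_rate:
            noisy.append(rng.choice([s for s in alphabet if s != symbol]))
        else:
            noisy.append(symbol)
    return "".join(noisy)


def majority_vote(copies: list[str]) -> str:
    """Recover a sequence by taking the most common symbol at each position
    across several equal-length copies."""
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*copies)
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    original = "ATGCGTACGTTAGC"
    # Three redundant copies, each degraded independently by noise.
    copies = [add_noise(original, error_rate=0.1, rng=rng) for _ in range(3)]
    recovered = majority_vote(copies)
    print("original :", original)
    print("copies   :", copies)
    print("recovered:", recovered)
```

As long as errors strike different positions in different copies, the majority vote restores the original sequence. The cost is that the same information has to be stored more than once.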
Another interesting implication of this definition of life is that the information to self-make the living system must be the same size as, or smaller than, the system itself. If the information were stored in something bigger than the system that it is meant to self-make (e.g. DNA code on a hard drive much larger than a cell), it could never self-make, because the markers would be larger than the object being made. So, the number of atoms used to mark the information must be the same as, or fewer than, the number of atoms in the object being self-made. For example, DNA uses sequences of four nucleotide bases (A, T, G, and C) to mark the information that encodes larger molecules such as proteins. When the information storage is smaller than the living object being self-made, some amount of data compression is needed, and emergent models that reduce the degrees of freedom considered must be used. Now, it should be noted that certain information markers may serve equally as the structure and as the information, so that no compression is needed. For example, there are theories that RNA served as an early information system and was also itself the structure in chemical reactions, so the information was the same size as the structure being self-made. Over time, DNA was able to further compress information and to rely on other auxiliary structures that are encompassed in the object being self-made. The most important point here is that the information markers must be equal in size to or smaller than, but never larger than, the living object being self-made. Additionally, when they are smaller, data compression and emergent models must be used by the living system.
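As a rough illustration of how compact such markers are, the sketch below counts the raw storage capacity of a DNA sequence in bits. It assumes each of the four bases is equally likely and independent, which gives an upper bound and not a measured biological quantity. The function names and the example sequence are hypothetical.

```python
import math

DNA_ALPHABET = "ATGC"


def bits_per_symbol(alphabet_size: int) -> float:
    """Maximum information per symbol, in bits, for a given alphabet size."""
    return math.log2(alphabet_size)


def sequence_capacity_bits(sequence: str) -> float:
    """Upper bound on the information, in bits, a DNA sequence can mark."""
    return len(sequence) * bits_per_symbol(len(DNA_ALPHABET))


if __name__ == "__main__":
    codon = "ATG"  # three bases; in the standard genetic code a codon specifies one amino acid
    print("bits per base :", bits_per_symbol(len(DNA_ALPHABET)))
    print("bits per codon:", sequence_capacity_bits(codon))
    print("possible codons:", len(DNA_ALPHABET) ** len(codon))
```

Each base can mark at most two bits, so a three-base codon marks at most six bits, which is enough to distinguish the 64 possible codons. Those few bits select an amino acid whose full atomic structure is not spelled out in the DNA itself, which is one way to read the point about compression and emergent models.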
I believe it is very helpful to use information-theoretic ideas to further advance biology and the question, what is life? This is by no means an exhaustive proof, but I thought it would be helpful to sketch out some ideas for defining life as self-making information that is self-embedded. This definition clarifies what the information in life is about (self-making) and where this information is marked (self-embedded). It also clarifies what exactly life is doing as an open system that decreases thermodynamic entropy (using energy to gather and process self-making, self-embedded information) and provides some insights into informational error correction and evolution (to persist, the self-making instructions in life must adapt to environmental changes and to the noise introduced by increasing thermodynamic entropy). Please share any thoughts below and feel free to build further on this definition of life.
Want to learn more about systems theory, natural science, and sustainability? Check out my book! https://davidshugar.com/book/