The boundary between natural processes and computation

I have been sitting with a troubling thought for quite a while, especially with all the hype around AI. The rough question on my mind was: "How does nature, or evolution, pass down enough genetic information to recreate humans?" More specifically, how does nature compress enough information into the genes of the gametes to recreate the neural connections in a brain without specifying all of the parameters?


For reference, a gamete carries roughly 3 billion base pairs. Each base pair is one of four bases, so it encodes 2 bits of information. The total information would therefore be about 3 billion * 2 = 6 billion bits, roughly 0.75 GB. A fusion of male and female gametes should contain twice that, about 1.5 GB. Meanwhile, a human brain is estimated to have around 86 billion neurons and 100 trillion synapses. How is this possible? What kind of compression is nature really doing here? I asked this question a lot, and the most frequent answer I got was "evolution". That always sounded to me like an escape, along the lines of "I don't know, but nature somehow does it".
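Here is that back-of-the-envelope arithmetic as a small sketch, assuming 2 bits per base pair and, purely hypothetically, one byte of "parameter" storage per synapse, just to show the scale of the gap:

```python
# Back-of-the-envelope estimate. Assumes 2 bits per base pair and ignores
# epigenetics and any non-sequence information carried by the cell.
base_pairs_per_gamete = 3e9        # ~3 billion base pairs
bits_per_base_pair = 2             # 4 possible bases -> 2 bits
genome_gb = base_pairs_per_gamete * bits_per_base_pair / 8 / 1e9
zygote_gb = 2 * genome_gb          # two gametes fuse

neurons = 86e9                     # ~86 billion neurons
synapses = 100e12                  # ~100 trillion synapses
synapse_gb = synapses * 1 / 1e9    # hypothetical: 1 byte per synapse

print(f"genome per gamete ~ {genome_gb:.2f} GB, zygote ~ {zygote_gb:.1f} GB")
print(f"naive synapse storage ~ {synapse_gb:,.0f} GB")
```

Even under that generous one-byte-per-synapse assumption, the brain's "parameter count" comes out tens of thousands of times larger than the genome.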

After thinking for some time with a few "assumptions", or "axioms" if you will, I stumbled upon a disturbing realization. The core assumption I made is that maybe nature doesn't try to control dynamics at all, the way we do with machines. What happens in chips is that we use electrons to flip switches. Evolution does not do this; instead it tunes dynamics by controlling structure. What I mean is that the information encoded in the gametes mostly contains structural information, plus some instructions on which genes to activate. A lot of what I draw on comes from the protein folding problem, where we go from a sequence of amino acids to the 3D structure of a protein, a structure that plays a vital role in how the protein functions.


The rough idea is this: messenger RNA copies the DNA sequence, a chain of base triplets (codons) like "ATG-GCT-GAA", via a process called transcription. Ribosomes then translate that sequence into a chain of amino acids, and the protein folds into its 3D structure. Until recently we had no idea what 3D structure a protein would take from a given amino acid sequence; researchers and PhD students took years of effort to figure out the structure of a single protein. That changed when Google DeepMind made a breakthrough with their model AlphaFold, which could accurately predict the 3D structure of nearly all proteins known to science. It was a giant leap, one that won Demis Hassabis and John Jumper a Nobel Prize in Chemistry. (Very inspiring work.)
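To make the sequence-to-amino-acid step concrete, here is a minimal sketch assuming the standard genetic code and only the three codons from the example above (real translation works on mRNA and involves start/stop codons and much more):

```python
# Tiny subset of the standard codon table, just enough for the example.
CODON_TABLE = {"ATG": "Met", "GCT": "Ala", "GAA": "Glu"}

def translate(coding_sequence: str) -> list[str]:
    """Split a DNA coding sequence into codons and map each to an amino acid."""
    codons = [coding_sequence[i:i + 3] for i in range(0, len(coding_sequence), 3)]
    return [CODON_TABLE[codon] for codon in codons]

print(translate("ATGGCTGAA"))  # ['Met', 'Ala', 'Glu']
```

The hard part, and the part AlphaFold addressed, is the step after this one: predicting the folded 3D structure from the amino acid chain.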

Then one would naturally ask: what comes after the structure? The main idea is that what comes after is natural evolution of the system according to the laws of physics and chemistry. The structure responds chemically and physically to its surrounding environment. What's important to understand is that, unlike machines, where control is explicit, biology lets nature play out. This is a subtle but massive difference that people miss. There is no penalty when things evolve physically according to existing laws. So a rough blueprint would be: initial activation + structural information + physical evolution.

Naturally one would ask: what's the difference between this and the machines we make, or the computation we do? I like to think that what machines do today is simulation. We try to simulate everything. While doing a mathematical calculation, a machine simulates a particular algorithm. What a machine can and cannot simulate depends on whether an efficient algorithm, or a mathematical description at all, exists. I could roughly categorize problems into the following domains (based on efficiency of simulation):

1) Mathematical problems (Efficient algorithms)

2) Mathematical problems (Inefficient algorithms)

3) Physical processes (Efficient mathematical descriptions)

4) Physical processes (Unsolvable)

The first two categories are entirely man-made. I like to think mathematics was invented rather than discovered (mostly), because not all mathematics conforms to physical reality ("mathematical fiction", as some people call it; I don't hold such a harsh view, math is a very creative field). There seem to be plenty of efficient algorithms for these man-made constructions, with exceptions in the second category such as factorization, the discrete log problem, etc.

Our concern of interest is the last two categories. There exist efficient mathematical descriptions for various classical physical processes, e.g. finding the path of a projectile or the motion of planets around the sun. But it turns out a lot of physical processes, like the many-body problem, are unsolvable in the sense that there is no general closed-form solution: you cannot accurately predict the positions of the bodies at time Tn from the initial state at T0 with one general equation. The reason is that a many-body system is extremely sensitive to even a minute change in any variable affecting it, which is why such systems are called chaotic. In theory you could solve it by keeping track of the infinitely many variables that could affect the state of the system; in practice you approximate, by tracking the full state of the system and stepping it forward at every small timestep (the brute-force approach; see the sketch below). Many-body systems turn out to be hugely important, because we run into them whenever we go smaller, into the molecular and atomic world, or larger, into dynamic systems like planetary motion.
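Here is a toy version of that brute-force approach: a planar three-body system stepped forward with a naive Euler integrator. All masses, initial conditions, and the size of the perturbation are made up; the point is only that nudging one coordinate by one part in a billion is enough to send the two runs apart.

```python
# Toy 3-body gravity in 2D with naive Euler steps, to illustrate the
# brute-force timestep approach and sensitivity to initial conditions.
# Every constant here is invented for illustration.
import math

def step(pos, vel, dt=1e-3, G=1.0, masses=(1.0, 1.0, 1.0)):
    """Advance all bodies by one Euler timestep under pairwise gravity."""
    acc = [[0.0, 0.0] for _ in pos]
    for i in range(len(pos)):
        for j in range(len(pos)):
            if i == j:
                continue
            dx, dy = pos[j][0] - pos[i][0], pos[j][1] - pos[i][1]
            r3 = (dx * dx + dy * dy) ** 1.5 + 1e-12  # softened to avoid /0
            acc[i][0] += G * masses[j] * dx / r3
            acc[i][1] += G * masses[j] * dy / r3
    vel = [[v[0] + a[0] * dt, v[1] + a[1] * dt] for v, a in zip(vel, acc)]
    pos = [[p[0] + v[0] * dt, p[1] + v[1] * dt] for p, v in zip(pos, vel)]
    return pos, vel

def simulate(pos, vel, steps=20000):
    for _ in range(steps):
        pos, vel = step(pos, vel)
    return pos

positions = [[0.0, 0.0], [1.0, 0.0], [0.5, 0.8]]
velocities = [[0.0, 0.3], [0.0, -0.3], [0.2, 0.0]]
nudged = [[0.0, 0.0], [1.0 + 1e-9, 0.0], [0.5, 0.8]]  # 1e-9 nudge on one body

run_a = simulate([p[:] for p in positions], [v[:] for v in velocities])
run_b = simulate([p[:] for p in nudged], [v[:] for v in velocities])
drift = max(math.dist(a, b) for a, b in zip(run_a, run_b))
print(f"max divergence after 20,000 steps from a 1e-9 nudge: {drift:.3e}")
```

There is no shortcut hiding in this loop: to know where the bodies end up, you grind through every timestep.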

Fig: 3-body system

Why did we go from protein folding to the many-body problem?

There's an important observation that we can make about both of these systems. 

First, in protein folding, the number of potential 3D structures that could form from an amino acid sequence is vastly large. But thankfully, nature has a preference for a particular 3D structure per sequence, and that preference is what the AlphaFold model figured out. It is remarkable that nature's choice turned out to be computationally reducible. I like to refer to these problems as "static problems with a preferred direction from nature". Quite a mouthful, I know. Here, "static" refers to a fixed input and output with some dynamic process in between, and the "preferred direction" refers to the solution that evolution optimizes for (which may not be optimal). In summary, the protein folding problem can be generalized as follows (a toy sketch of this framing comes after the list):


1) Static/Fixed input

2) Static/Fixed Output 

3) A nature-preferred solution in a large solution space (which can be non-optimal)

4) The mapping is injective: no two or more 3D structures arise from the same protein sequence in nature's preferred solution space
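A caricature of this framing, with an entirely made-up energy function standing in for nature's preference: a fixed input, a large space of candidate structures, and a single preferred output picked out of that space.

```python
# Toy "static problem with a preferred direction": fixed input, a large space
# of candidate conformations, and one preferred structure, here chosen as the
# minimum of a fake energy function. Nothing below is a real physical model.
import itertools
import random

def fake_energy(sequence: str, conformation: tuple) -> float:
    """Stand-in for a free-energy calculation; deterministic but made up."""
    rng = random.Random(repr((sequence, conformation)))
    return rng.random()

def preferred_structure(sequence: str, n_angles: int = 6) -> tuple:
    # Candidate "structures": every combination of a few discrete torsion angles.
    candidates = itertools.product((-60, 60, 180), repeat=n_angles)
    return min(candidates, key=lambda c: fake_energy(sequence, c))

print(preferred_structure("MAE"))  # fixed input -> one "preferred" output
```

The real solution space is astronomically larger and the real preference is set by physics, but the shape of the problem is the same: one input, and one output that nature keeps choosing.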

Meanwhile, the many-body problem is not like this. First of all, there is no general solution; the behavior of such a system is chaotic and unpredictable. Sure, the input and output states are fixed, but there is no preferred direction from nature. The system is so sensitive to initial conditions that even a slight change in an atom's position, caused by some random unaccounted-for variable (especially a quantum one), could produce a different solution every time. So the main differences we see here are:

1) No preferred solution; everything is purely physical laws playing out

2) Injectivity is not guaranteed, due to sensitivity to initial conditions

Injectivity usually arises from nature's preferred or optimized solution over a larger solution space, via evolution. So a good place to start might be to ask whether a problem is injective or not.

So, where am I going with this? 

Where this all leads is a simple question.

Will AI be able to solve the second type of problem, where there is neither a general solution nor an evolutionary preference?

This is very important, because if we want to cure disease, we'd need to simulate all of the atoms inside a cell, or at least predict the outcome. That system is a many-body system (on the order of 100 trillion atoms per cell). The same goes for better weather prediction, fluid dynamics like the Navier-Stokes equations, and so on. For humanity to reach abundance, solving such problems is of utmost importance.

But wait, there's more. Remember above where I talked about how every computation we do comes down to restricting electrons to flip switches, and how nature simply evolves without being bound by such constraints. A key observational axiom we can claim from this is that "computation is less powerful than physical state evolution", and is therefore a subset of the latter.


                                                        
Fig: Natural process vs Computation

Now we can see it clearly: AI will be able to solve the many-body problem if and only if the claimed axiom is false, i.e. "computation is as powerful as physical evolution". If that is the case, it opens the door for our world to possibly be a simulation.

Surprisingly, by observing just these two problems we can formulate a weaker Deutsch-Church-Turing thesis for a non-quantum substrate with a probabilistic Turing machine (which may or may not be true), and use it to probe the boundary between natural processes and computation.
