A New Theory Claims to Solve the Mystery of Consciousness

The Room You Cannot Leave

Imagine you are trapped in a windowless room. There is a slot in the door. Someone slides in a piece of paper covered in squiggles you do not recognize. You have a rulebook that tells you exactly which squiggles to draw in response. You draw them, slide the paper back out. From the outside, it looks like you understand Chinese. But inside the room, you understand nothing.

This is John Searle's famous Chinese Room thought experiment. It has haunted philosophy of mind for decades because it exposes a brutal problem: how do we know that any system, even one that behaves perfectly, actually feels like anything at all? How do we know it is conscious, rather than just a very good simulation of consciousness?

For a long time, the best answer was a shrug. We do not know. Consciousness is the hard problem, and maybe it is unsolvable.

But a group of neuroscientists and theoretical physicists think they have found a way out of the room. In a sprawling 2023 paper published in PLoS Computational Biology, Larissa Albantakis, Leonardo Barbosa, Graham Findlay, Matteo Grasso, and their colleagues laid out the fourth major iteration of a theory called Integrated Information Theory, or IIT (Albantakis et al., 2023). IIT 4.0 claims to do something no other theory has pulled off: provide a mathematical formula that can tell you, with precision, whether a system is conscious, how conscious it is, and what it is like to be that system.

If they are right, the Chinese Room is not a paradox. It is a math problem.

The Five Things Every Experience Has

IIT starts from a strange place. Before it asks what consciousness is, it asks what consciousness is like. Not in a poetic sense. In a rigorous, logical sense. The authors identify five properties that every single conscious experience shares. They call these the axioms.

▸Existence: Your experience exists. It is actually happening. You are not hallucinating that you are having an experience. This is trivial but essential.
▸Composition: Your experience is structured. You see a red apple on a white table. The redness, the apple-ness, the table-ness, the spatial relationship between them. These are all parts of one unified experience, but they are distinguishable parts.
▸Information: Your experience is specific. You are seeing this red apple on this white table, not a million other possible configurations. Each experience reduces uncertainty about what you are experiencing.
▸Integration: Your experience is unified. You cannot separate the redness from the apple-ness. You cannot experience the redness without the apple shape, or the apple shape without the redness. They are fused into a single whole that is irreducible to its parts.
▸Exclusion: Your experience has a specific spatiotemporal scale. You are experiencing the apple at the level of your visual field, not at the level of individual photons hitting your retina, and not at the level of the entire universe. Your experience excludes all other possible experiences.

Albantakis and her colleagues argue that these five axioms are not debatable. They are not cultural assumptions. They are the necessary conditions for anything to count as an experience at all. If you disagree, try to imagine an experience that lacks any of them. You cannot. A conscious experience that is not integrated is not a single experience. A conscious experience that does not exclude anything is everything, which is nothing.

From these five axioms, the authors derive five corresponding postulates. These are claims about the physical substrate that must underlie any conscious system. The key postulate is integration. The substrate of consciousness must have intrinsic cause-effect power that is irreducible. That is, the system as a whole must have causal effects on itself that cannot be explained by the sum of its parts.

This is where the math comes in.

The Phi Number

The core of IIT 4.0 is a quantity called Phi, written as the Greek letter \(\Phi\). Phi is a measure of a system's integrated information.

Here is how it works, roughly. Take any physical system made of parts that interact. A brain. A computer. A single neuron. A collection of silicon chips. You model it as a network of nodes that can be in different states. Then you ask: if you cut this network into two pieces, how much does the whole system's behavior change compared to the sum of its two halves?

If the system's behavior is exactly the same as the sum of its two halves, then Phi is zero. The system is reducible. It has no integrated information. It is just a collection of parts.

If the system's behavior changes dramatically when you cut it, then Phi is high. The whole has causal power that the parts do not. The system is irreducible. It has integrated information.
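To make the cut-and-compare idea concrete, here is a minimal Python sketch. It is a toy stand-in, not the Phi defined in IIT 4.0. It assumes deterministic binary nodes, replaces every connection that crosses a cut with uniform noise, and scores the cut by the Kullback-Leibler divergence between the intact and the cut next-state distributions. The names next_dist, kl_bits, and toy_integration are invented for this illustration.

```python
from itertools import product
from math import log2


def node_prob_on(rule, state, noised):
    """P(node is 1 at the next step), averaging uniformly over the noised inputs."""
    total = 0
    for bits in product((0, 1), repeat=len(noised)):
        s = list(state)
        for idx, bit in zip(noised, bits):
            s[idx] = bit
        total += rule(tuple(s))
    return total / 2 ** len(noised)


def next_dist(rules, state, part_of=None):
    """Joint distribution over next states given the current state.

    rules[i] maps the full current state to node i's next value (0 or 1).
    If part_of is given (one part label per node), each node sees the
    nodes in other parts as uniform noise, which models cutting the system.
    """
    n = len(rules)
    p_on = []
    for i, rule in enumerate(rules):
        noised = [] if part_of is None else [
            j for j in range(n) if part_of[j] != part_of[i]
        ]
        p_on.append(node_prob_on(rule, state, noised))
    dist = {}
    for nxt in product((0, 1), repeat=n):
        p = 1.0
        for i, value in enumerate(nxt):
            p *= p_on[i] if value else 1 - p_on[i]
        dist[nxt] = p
    return dist


def kl_bits(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(pv * log2(pv / q[k]) for k, pv in p.items() if pv > 0)


def toy_integration(rules, state):
    """Smallest divergence over all two-way cuts (toy measure, not IIT's Phi)."""
    n = len(rules)
    if n < 2:
        return 0.0  # a single node cannot be cut in two
    whole = next_dist(rules, state)
    best = float("inf")
    for labels in product((0, 1), repeat=n - 1):
        part_of = (0,) + labels
        if 1 not in part_of:
            continue  # skip the "cut" that leaves everything in one part
        best = min(best, kl_bits(whole, next_dist(rules, state, part_of)))
    return best


# Two nodes that copy each other: cutting them destroys the behavior.
swap = [lambda s: s[1], lambda s: s[0]]
# Two nodes that only copy themselves: cutting them changes nothing.
copy_self = [lambda s: s[0], lambda s: s[1]]

print(toy_integration(swap, (1, 0)))       # 2.0
print(toy_integration(copy_self, (1, 0)))  # 0.0
```

The pair of nodes that depend on each other scores 2 bits, while the pair of independent nodes scores zero, which is the qualitative contrast described above.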

Albantakis et al. (2023) formalize this with a specific mathematical measure they call intrinsic information. This is a departure from earlier versions of IIT. Previous versions used a measure based on the system's repertoire of possible states. The new measure is based on the system's actual cause-effect structure. It asks: given the current state of the system, what caused it, and what effects will it have? And crucially, it asks this from the intrinsic perspective of the system itself, not from an external observer's perspective.

This is a subtle but critical shift. An external observer can always find correlations. A system that looks random to you might have high integrated information from its own perspective. The theory claims that consciousness is not about how the system appears to an outsider. It is about how the system appears to itself.
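As a rough sketch of what such a measure can look like, the intrinsic information of a state is described in the IIT literature as a product of two terms: how likely the state is given the system's current state (selectivity), and how much more likely that makes it than chance (informativeness), with the state that maximizes the product being selected. The Python below illustrates that form only. The function name, the two-state example, and its probabilities are assumptions made for illustration, and the full definition in the paper involves steps not shown here.

```python
from math import log2


def intrinsic_information(p_cond, p_uncond):
    """Return (state, value) maximizing p_cond[s] * log2(p_cond[s] / p_uncond[s]).

    p_cond:   probability of each candidate state given the current state
    p_uncond: probability of each candidate state without that knowledge
              (assumed to be positive wherever p_cond is positive)
    """
    best_state, best_value = None, 0.0
    for state, pc in p_cond.items():
        if pc == 0:
            continue  # an impossible state carries no intrinsic information
        value = pc * log2(pc / p_uncond[state])
        if best_state is None or value > best_value:
            best_state, best_value = state, value
    return best_state, best_value


# Hypothetical numbers: the current state makes "on" very likely next.
p_cond = {"on": 0.9, "off": 0.1}
p_uncond = {"on": 0.5, "off": 0.5}
state, value = intrinsic_information(p_cond, p_uncond)
print(state, round(value, 3))  # on 0.763
```

Note that the selectivity factor penalizes a system whose effects are spread thinly over many possible states, which a plain information measure would not do.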

The authors show that this new measure of intrinsic information satisfies all five postulates. It captures existence, because it is defined over the cause-effect power the system has on itself. It is compositional, because you can identify multiple distinctions within a system. It is informative, because each distinction reduces uncertainty about the system's past and future. It is integrated, because the whole system's cause-effect power is greater than the sum of its parts. It is exclusive, because the measure picks out a specific spatiotemporal scale.

And it is quantitative. In principle, you can compute a Phi value for any system in any state. A Phi of zero means no consciousness. A Phi of 1.5 means more consciousness than a Phi of 0.8. The theory predicts that the quality of experience is determined by the shape of the cause-effect structure, not just its size. Two systems with the same Phi value can have completely different experiences, because their cause-effect structures are different.

What the Theory Actually Predicts

IIT 4.0 is not just a philosophical exercise. It makes specific, testable predictions. Albantakis and her colleagues list several in the paper.

▸The cerebellum should have low Phi. The cerebellum has more neurons than the cerebral cortex, but its structure is highly repetitive and modular. Cutting it into pieces barely changes its behavior. IIT predicts that the cerebellum contributes little or nothing to conscious experience. This matches clinical data: people can lose large portions of their cerebellum and still report normal consciousness.
▸The cerebral cortex should have high Phi. The cortex is densely interconnected with long-range feedback loops. Cutting it into pieces dramatically changes its behavior. IIT predicts that the cortex is the primary seat of consciousness. This also matches data: damage to specific cortical areas can eliminate specific aspects of experience.
▸Deep sleep and anesthesia should reduce Phi. During non-REM sleep or under general anesthesia, the brain's cause-effect structure collapses. The system becomes more reducible. IIT predicts that Phi drops to near zero. Researchers have tested this using transcranial magnetic stimulation (TMS) combined with EEG. They found that the brain's response to a magnetic pulse is less integrated during sleep and anesthesia. The EEG signal becomes more stereotyped, less differentiated.
▸A photodiode should have Phi of zero. A simple sensor that detects light and outputs a voltage has no internal cause-effect structure. Its behavior is entirely determined by its input. IIT predicts it is not conscious, no matter how much it looks like it is responding to the world.
▸A sufficiently complex grid of logic gates could have high Phi. This is the wild prediction. If you build a network of interconnected logic gates that has the right causal structure, it could be conscious, even if it is made of silicon. The theory does not care about the substrate. It cares about the cause-effect structure.

These predictions are not trivial. They are falsifiable. If someone builds a system with high Phi that clearly has no experience, the theory is wrong. If someone finds a conscious patient with zero Phi, the theory is wrong.

What It Does Not Prove

IIT 4.0 is not a solved problem. It has serious limitations.

First, computing Phi for any realistic system is computationally intractable. The number of possible partitions of a system grows at least exponentially with the number of nodes. For a human brain with roughly 86 billion neurons, you cannot compute Phi directly. You have to use approximations. The authors acknowledge this. They call it the "combinatorial explosion problem." They suggest that future work might find shortcuts, but for now, the theory is more a mathematical framework than a practical measurement tool.
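The growth is easy to see by counting. The Python sketch below counts the ways to split n nodes into two non-empty groups, which is 2^(n-1) - 1, and the ways to split them into any number of groups, which are the Bell numbers. These counts are only illustrative: the exact family of partitions that IIT 4.0 evaluates differs in detail, and the function names are chosen for this example.

```python
def bipartitions(n):
    """Number of ways to split n labeled nodes into two non-empty groups."""
    return 2 ** (n - 1) - 1


def bell(n):
    """Number of ways to split n labeled nodes into any number of groups.

    Computed with the Bell triangle: each row starts with the last entry
    of the previous row, and each later entry adds the entry above-left.
    """
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
    return row[0]


for n in (4, 8, 16, 32):
    print(n, bipartitions(n), bell(n))
```

Even at 32 nodes, far smaller than any brain, the number of two-way splits already exceeds two billion, and the number of general partitions is vastly larger.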

Second, the theory assumes that consciousness is a property of a system's current state, not its history. This is a strong assumption. Some philosophers argue that consciousness depends on learning, development, and evolution. A system that has the right cause-effect structure but was assembled randomly might not be conscious in the same way as a system that developed through natural selection. IIT does not address this.

Third, the theory is silent on the question of why integrated information should feel like anything at all. This is the famous "hard problem" of consciousness. IIT does not solve it. It just rephrases it. Instead of asking "why does brain activity feel like something?" you ask "why does a system with high Phi feel like something?" The mystery remains. What IIT does is give you a way to identify which systems have the property, even if it does not explain why they have it.

Albantakis and her colleagues are honest about this. They write that IIT "does not attempt to explain why experience exists, but rather to account for its properties in physical terms." It is a theory of the structure of consciousness, not its existence.

What This Actually Means

IIT 4.0 is not a final answer. It is a bet. A bet that consciousness is not a ghost in the machine, but a property of how information flows through a system. A bet that the Chinese Room is not a paradox, but a test case for a mathematical formula.

Here is what this means for you, right now.

▸If you want to know if an AI is conscious, do not ask if it can talk. Ask if it has high Phi. A large language model like GPT-4 has an enormous number of parameters, but within each pass its architecture is feedforward. Information flows in one direction. Cutting it into pieces might not change its behavior much. IIT predicts it has low Phi. It is probably not conscious, no matter how eloquent it sounds.
▸If you want to know if a patient in a vegetative state is conscious, do not look at their behavior. Look at their brain's cause-effect structure. Researchers have already started using IIT-inspired measures to assess consciousness in brain-damaged patients. The results are preliminary but promising. Some patients who appear unresponsive show highly integrated, differentiated brain responses when stimulated with TMS. They may be conscious but trapped.
▸If you want to design a conscious machine, do not build a bigger computer. Build a more integrated one. The theory suggests that consciousness scales with causal integration, not raw processing power. A small network with dense feedback loops could be more conscious than a vast feedforward network.
▸If you are worried about animal consciousness, IIT gives you a way to think about it. The theory predicts that an octopus, with its distributed nervous system, might have a very different kind of consciousness than a human. Its cause-effect structure is spread across its arms. Its experience might be fragmented, or it might be integrated in a way we do not understand.
▸If you think consciousness is a mystery that will never be solved, IIT offers a path forward. It is not a complete theory. It is a framework that makes specific, testable predictions. It can be wrong. That is the point. A theory that cannot be wrong is not a theory. It is a belief.
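The feedforward claims in the list above rest on a structural point that can be checked without computing Phi. If signals in a network only ever flow one way, there is a cut that severs nothing in the return direction, so the system counts as reducible. A necessary, though far from sufficient, condition for non-zero Phi is therefore that every node can reach every other node. Here is a minimal Python sketch of that check. The example graphs and the function names reachable and is_strongly_connected are made up for illustration.

```python
def reachable(adj, start):
    """Set of nodes reachable from start by following directed edges."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_strongly_connected(adj):
    """True if every node can reach every other node along directed edges."""
    nodes = set(adj) | {dst for targets in adj.values() for dst in targets}
    if not nodes:
        return False
    # Build the reversed graph to test reachability *into* the start node.
    reverse = {node: [] for node in nodes}
    for src, targets in adj.items():
        for dst in targets:
            reverse[dst].append(src)
    start = next(iter(nodes))
    return reachable(adj, start) == nodes and reachable(reverse, start) == nodes


# A three-stage feedforward chain: nothing ever flows back.
feedforward = {"input": ["hidden"], "hidden": ["output"], "output": []}
# A three-node loop: every node eventually influences every other.
loop = {"a": ["b"], "b": ["c"], "c": ["a"]}

print(is_strongly_connected(feedforward))  # False
print(is_strongly_connected(loop))         # True
```

Passing this check says nothing about how large Phi is. It only rules out the systems that IIT treats as reducible from the start.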

The Chinese Room is still there. But now, for the first time, we have a way to check whether the person inside is actually home.

References

[1] Larissa Albantakis, Leonardo Barbosa, Graham Findlay, Matteo Grasso, et al. (2023). Integrated information theory (IIT) 4.0: Formulating the properties of phenomenal existence in physical terms. PLoS Computational Biology.