Artificial Intelligence Has a 65-Year Identity Crisis

AI has lacked a stable definition for 65 years, undermining progress. The field's identity crisis stems from shifting goals and benchmarks.

Aishwarya Bhatt

The First 65 Years, and We Still Don’t Know What We’re Building

In 1956, a small group of men gathered at Dartmouth College to invent the future. Their proposal called it a “2 month, 10 man study of artificial intelligence.” The phrase itself was almost an afterthought, proposed by a young John McCarthy to distinguish their work from Norbert Wiener’s cybernetics. They conjectured that “every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.” They thought it might take a summer.

Sixty-five years later, we have machines that beat world champions at Go, generate photorealistic faces of people who don’t exist, and write poetry that passes for human. We also have image classifiers that can be tricked into labeling a turtle as a rifle by a carefully crafted pattern on its shell. We have systems that solve calculus problems but cannot reliably open a door. We have spent more than half a century chasing a ghost, and the chase itself keeps changing what the ghost is.

In their 2022 paper “Quo vadis artificial intelligence?” — Latin for “Where are you going?” — engineers Yuchen Jiang, Xiang Li, Hao Luo, and Shen Yin laid out the uncomfortable truth: AI has been in an identity crisis since the day it was named. The authors traced the field’s history through its euphoric highs and its devastating “winters,” and argued that the fundamental question — what does it mean for a machine to be intelligent? — remains unresolved. The paper is not a technical breakthrough. It is a diagnosis.

Here is what the authors found, and why it matters more than any new model release.

The Problem Was Never the Technology. It Was the Metaphor.

The Brain as a Blueprint, Then a Betrayal

The earliest AI researchers believed intelligence was a set of rules. If you could write down every logical step a human takes to solve a problem, you could program a computer to follow the same steps. This was the era of “symbolic AI,” and it worked brilliantly for puzzles like chess or theorem proving. In 1965, Herbert Simon predicted that “machines will be capable, within twenty years, of doing any work a man can do.”

He was wrong. Not because the logic was flawed, but because real intelligence doesn’t work that way. You cannot write a rule for recognizing a cat. You cannot write a rule for understanding sarcasm. The attempt to encode human reasoning as explicit instructions hit a wall when researchers tried to handle the messy, ambiguous, context-dependent world that humans navigate without thinking (Jiang et al., 2022).

The authors describe this as the first identity crisis: AI had defined itself by analogy with human reasoning, but then failed to replicate human cognition. The field split. Some researchers abandoned the brain metaphor entirely and focused on statistical pattern matching. Others insisted that the brain metaphor was correct, just insufficiently literal: they wanted to simulate neurons, not logic.

The Neural Network Revival and Its Own Crisis

By the 1980s, neural networks had returned from the dead. The idea was simple: build a crude imitation of a biological brain’s structure, with layers of artificial “neurons” that could learn from data rather than from explicit rules. For a while, it worked. Networks could recognize handwritten digits, predict stock prices, and play backgammon. But the hardware wasn’t ready. The data wasn’t ready. And the networks themselves were shallow — three or four layers at most — which limited what they could learn.

Then came 2012. A neural network called AlexNet crushed the competition in an image recognition contest, and the modern AI boom began. Deep learning — networks with dozens or hundreds of layers — could suddenly do things that had seemed impossible: translate languages, generate images, drive cars in limited conditions.

But the authors note something strange: the success of deep learning did not resolve the identity crisis. It deepened it. These systems are brilliant at narrow tasks but useless at anything else. A language model that writes a convincing essay cannot plan a grocery trip. An image generator that produces photorealistic landscapes cannot tell you why one landscape is beautiful and another is ugly. The field had built machines that could pass specific tests of intelligence, but no one could agree on what that meant.

The Paper That Asks the Embarrassing Question

What Jiang, Li, Luo, and Yin Actually Did

The authors are engineers at Harbin Institute of Technology, not philosophers. Their paper is a review, not an experiment. They surveyed the entire history of AI — from the Dartmouth workshop through the expert systems boom, the two AI winters, the rise of machine learning, and the current deep learning era — and asked a simple question: What has actually changed?

Their answer is both obvious and uncomfortable. The technical capabilities have transformed. The conceptual framework has barely budged. The authors write that “the simple contention is that human-created machines can do more than just labor-intensive work; they can develop human-like intelligence.” That sentence, written in 2022, could have been written in 1956. The field has not abandoned the human analogy. It has just gotten better at faking it (Jiang et al., 2022).

How They Measured the Crisis

The authors did not run experiments. They analyzed the historical record: the funding cycles, the publication trends, the moments when the field declared victory and the moments when it admitted defeat. They identified three distinct phases:

  • The rule-based era (1956–1980): AI was defined as symbolic reasoning. Success meant solving logic puzzles. Failure came when researchers realized the world is not a logic puzzle.
  • The knowledge era (1980–2000): AI was defined as expert systems. Success meant encoding human expertise into databases. Failure came when experts could not articulate their own knowledge.
  • The learning era (2000–present): AI is defined as pattern recognition. Success means finding statistical regularities in massive datasets. Failure comes when those patterns break down in unfamiliar contexts.

Each era redefined what AI meant, but all three shared the same assumption: that human intelligence is a target to be reverse-engineered. The authors argue this assumption has never been validated. It is a bet, not a fact.

The Four Crises Hiding in Plain Sight

Crisis One: We Cannot Define the Goal

The most fundamental problem, the authors argue, is that AI research lacks a stable definition of its object. What is intelligence? The field has oscillated between multiple conflicting definitions:

  • Intelligence as reasoning (can it solve a logic problem?)
  • Intelligence as knowledge (does it know facts?)
  • Intelligence as learning (can it improve with experience?)
  • Intelligence as behavior (does it act like a human?)

Each definition produces a different research program, a different set of benchmarks, and a different claim about progress. A system that excels at one definition may fail completely at another. The authors note that the field has never settled this question. It has simply moved on to whatever definition produces the most impressive demos (Jiang et al., 2022).

Crisis Two: The Benchmark Problem

AI systems are evaluated on benchmarks: standardized tests that measure specific capabilities. The problem is that benchmarks are not intelligence. A system that scores 99% on a visual recognition test may still fail on images that differ slightly from its training data. A language model that passes the bar exam may still generate confident nonsense.

The authors describe this as a “performance paradox.” The metrics improve, but the underlying capability does not generalize. Researchers optimize for the benchmark, not for the thing the benchmark was supposed to measure. This is not a bug. It is a feature of the current research culture. Funding flows to systems that demonstrate improvement on standard tests, so researchers build systems that improve on standard tests. Whether those improvements mean anything in the real world is a secondary question.
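To make the gap concrete, here is a minimal Python sketch. It is not taken from the paper: the data, the two features, and the helper names (`predict`, `accuracy`, `fit_stump`) are all invented for illustration. A one-rule classifier is fitted to training data in which a shortcut feature happens to track the label perfectly. It then aces a benchmark drawn from the same distribution and falls to chance once the shortcut stops tracking the label.

```python
# Each row is (signal, shortcut, label). All values are hypothetical.
# "signal" is weakly related to the label; "shortcut" tracks it perfectly,
# but only in the training and benchmark data.
train = [(0.6, 1.0, 1), (0.4, 1.0, 1), (0.7, 1.0, 1),
         (0.5, 0.0, 0), (0.3, 0.0, 0), (0.4, 0.0, 0)]
benchmark = [(0.65, 1.0, 1), (0.55, 1.0, 1), (0.35, 0.0, 0), (0.45, 0.0, 0)]
shifted = [(0.65, 0.0, 1), (0.55, 1.0, 1), (0.35, 1.0, 0), (0.45, 0.0, 0)]


def predict(row, feature, threshold):
    """Predict 1 when the chosen feature reaches the threshold, else 0."""
    return 1 if row[feature] >= threshold else 0


def accuracy(rows, feature, threshold):
    """Fraction of rows whose prediction matches the label in position 2."""
    hits = sum(predict(row, feature, threshold) == row[2] for row in rows)
    return hits / len(rows)


def fit_stump(rows):
    """Pick the (feature, threshold) pair with the best training accuracy."""
    best = None
    for feature in (0, 1):
        for threshold in sorted({row[feature] for row in rows}):
            acc = accuracy(rows, feature, threshold)
            if best is None or acc > best[0]:
                best = (acc, feature, threshold)
    return best[1], best[2]


feature, threshold = fit_stump(train)          # the stump latches onto the shortcut
benchmark_score = accuracy(benchmark, feature, threshold)
shifted_score = accuracy(shifted, feature, threshold)
print(feature, benchmark_score, shifted_score)  # prints: 1 1.0 0.5
```

The model is identical in both evaluations. Only the data changed, which is why a perfect benchmark score says little about what was actually learned.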

Crisis Three: The Scaling Fallacy

The dominant strategy in modern AI is simple: make the models bigger, feed them more data, and hope that new capabilities emerge. This strategy has worked remarkably well. GPT-3, with 175 billion parameters, could do things that smaller models could not. But the authors point out a logical problem: scaling is not a theory. It is an engineering tactic.

If intelligence is simply a matter of scale, then the only limit is compute. But there is no evidence that scaling alone will produce anything like human-level understanding. The authors point to adversarial examples: small, often imperceptible changes to an image that cause a neural network to misclassify it entirely. These failures reveal that the system is not reasoning about the image the way a human does. It is exploiting statistical correlations that break down under perturbation. Scaling does not fix this. It may even amplify it.
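The mechanism behind an adversarial example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's method: it uses a toy linear classifier rather than a deep network, the weights and inputs are made up, and the helper names (`score`, `classify`, `perturb`) are hypothetical. The perturbation follows the idea of the fast gradient sign method: nudge every input feature by a tiny amount in the direction that hurts the score most. For a linear model, that direction is simply the sign of each weight.

```python
def score(weights, x):
    """Linear score: a positive value means class 1, otherwise class 0."""
    return sum(w * xi for w, xi in zip(weights, x))


def classify(weights, x):
    return 1 if score(weights, x) > 0 else 0


def sign(value):
    """Return 1, 0 or -1 according to the sign of value."""
    return (value > 0) - (value < 0)


def perturb(weights, x, epsilon):
    """Shift each feature by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to x is the
    weight vector, so the sign of each weight gives the worst-case direction.
    """
    return [xi - epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical 100-feature input, standing in for the pixels of an image.
weights = [1.0, -1.0] * 50
x = [0.52 if w > 0 else 0.48 for w in weights]
epsilon = 0.05  # no single feature moves by more than this

x_adv = perturb(weights, x, epsilon)
largest_change = max(abs(a - b) for a, b in zip(x, x_adv))

print(classify(weights, x), classify(weights, x_adv))  # prints: 1 0
```

No feature moves by more than 0.05, yet the hundred small nudges add up and flip the decision. The more input dimensions a model has, the more such nudges are available, which is one way to read the claim that scale alone does not close this gap.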

Crisis Four: The Human Problem

The final crisis is the most uncomfortable. The authors argue that AI research has been shaped by an implicit anthropocentrism: the assumption that human intelligence is the only kind worth having. This assumption has led to a series of dead ends, because human intelligence is not a single thing. It is a messy bundle of perception, emotion, social cognition, embodiment, and culture.

The authors write that “the symbiotic relationship of AI and humans in the long run” will require a different framework entirely. Instead of asking whether machines can be like us, we should ask what they can do that we cannot. This shifts the question from “Is it intelligent?” to “Is it useful?” And that is a much easier question to answer.

What the Research Does Not Prove

Let me be clear about what this paper does not claim. It does not claim that AI is fake or that progress has been illusory. The technical achievements are real. A system that can translate between 100 languages is genuinely useful. A system that can detect tumors in medical scans saves lives. The authors are not Luddites.

What they are saying is more subtle. They are arguing that the field’s self-understanding has not kept pace with its technical capabilities. We have built machines that do remarkable things, but we have not built machines that think. And we do not know if “thinking” is even the right metaphor.

The paper also does not claim that the identity crisis is a bad thing. The authors present it as a historical fact, not a judgment. Crises can be productive. They force reexamination. The question is whether the field will use this moment to ask better questions, or simply double down on the same approaches.

There are two major open questions the paper leaves unresolved:

  • Is there a fundamental limit to pattern-based learning? The current paradigm assumes that enough data and compute can produce anything. This is an article of faith, not a proven fact.
  • What would a non-anthropomorphic AI look like? The authors suggest we need new metaphors, but they do not provide them. That is not a weakness. It is an invitation.

What This Actually Means

  • Stop asking if AI is intelligent. Ask what it is good for. The identity crisis dissolves when you stop comparing machines to humans and start evaluating them as tools. A hammer is not intelligent. It is useful. The same framing applies to AI systems.
  • Benchmark performance is not understanding. A system that scores high on a test may still fail in the real world. Treat benchmarks as indicators, not proof. Demand to see how a system performs outside its training distribution.
  • The scaling strategy has a shelf life. More data and bigger models will continue to produce improvements for a while, but the returns will eventually diminish. The field needs new ideas, not just more compute.
  • The human analogy is a trap. Every time we say a machine “learns” or “thinks” or “understands,” we are smuggling in assumptions that do not apply. Use precise language. A neural network does not learn. It optimizes. A language model does not understand. It predicts.
  • The next breakthrough will come from a new metaphor. The rule-based era gave us expert systems. The brain metaphor gave us neural networks. The next era will be built on something different. The authors do not know what it is. Neither does anyone else. But the first step is admitting that we need one.

References

  1. Yuchen Jiang, Xiang Li, Hao Luo, Shen Yin (2022). Quo vadis artificial intelligence? Discover Artificial Intelligence.

Aishwarya Bhatt

Historian specialising in economic and social history. Writes about what the past actually looked like before nostalgia got to it, drawing on primary sources and recent historiography.

Reader Comments (2)

Dr. Ananya Sharma★★★★★

Fascinating framing. I've seen AI oscillate between logic and statistics in my NLP work. The identity crisis is real—are we building tools or thinking machines? This article captures the tension well, though I'd add that industry incentives often deepen the confusion.

Ravi Krishnan★★★★★

As a product manager in AI, this resonated. The 'crisis' mirrors my daily reality—clients want AGI magic, but we deliver narrow ML. The 65-year timeline puts our hype cycles in perspective. Useful read for grounding conversations.
