Inside the AI brain: memory vs. reasoning

New research maps the hidden divide between AI’s memory and intelligence

Researchers have found clear evidence that AI language models store memory and reasoning in distinct neural pathways. The finding could lead to safer, more transparent systems that can “forget” sensitive data without losing their ability to think.

Large language models, like those from the GPT family, rely on two core capabilities:

  1. Memorization, which allows them to recall exact facts, quotes, or training data.
  2. Reasoning, which enables them to apply general principles to solve new problems.

Until now, scientists weren’t sure whether these two functions were deeply entangled or handled by separate internal machinery. The new study finds that the separation is surprisingly clean: rote memorization relies on narrow, specialized neural pathways, while logical reasoning and problem-solving use broader, shared components. Critically, the researchers demonstrated they could surgically remove the memorization circuits with minimal impact on the model's ability to think.

In experiments on language models, the researchers ranked millions of neural weights by a property called curvature, which measures how sensitive the model’s performance is to small changes in those weights. High curvature indicates flexible, general-purpose pathways; low curvature marks narrow, specialized ones. When the scientists removed the low-curvature components – essentially switching off the “memory circuits” – the model lost 97% of its ability to recall training data but retained nearly all of its reasoning skills.
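To make the pruning step concrete, here is a minimal Python sketch (not the authors' code) of the general idea: score each weight by curvature, then zero out the lowest-scoring fraction. The function name, the drop_fraction threshold, and the random placeholder values are illustrative assumptions, not details from the study.

```python
import numpy as np

def prune_low_curvature(weights: np.ndarray,
                        curvature: np.ndarray,
                        drop_fraction: float = 0.1) -> np.ndarray:
    """Zero out the weights whose curvature scores fall in the lowest
    `drop_fraction` of the distribution – a stand-in for switching off
    the "memory circuits". Both inputs are flat arrays of equal length."""
    cutoff = np.quantile(curvature, drop_fraction)  # curvature value below which weights are dropped
    mask = curvature > cutoff                       # keep only higher-curvature, general-purpose weights
    return weights * mask

# Toy demonstration: random values stand in for a real model's weights
# and per-weight curvature estimates.
rng = np.random.default_rng(0)
w = rng.normal(size=1_000_000)
c = np.abs(rng.normal(size=1_000_000))  # hypothetical curvature scores
w_pruned = prune_low_curvature(w, c, drop_fraction=0.1)
print(f"zeroed {np.mean(w_pruned == 0):.1%} of weights")
```

In a real setting, the curvature scores would come from the model's loss landscape (for example, via the K-FAC approximation described below) rather than from random numbers.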

One of the most unexpected discoveries was that arithmetic operations share the same neural routes as memorization, not reasoning. After memory-related components were pruned, mathematical performance dropped sharply, while logical problem-solving remained almost untouched.

This suggests that, for now, AI “remembers” math rather than computes it, similar to a student reciting times tables instead of performing calculations. The insight may explain why language models often struggle with even simple math without external tools.

The team of researchers visualized the model’s internal “loss landscape” – a conceptual map of how wrong or right the AI’s predictions are as its internal settings change. Using a mathematical tool called K-FAC (Kronecker-Factored Approximate Curvature), they identified which regions of the network correspond to memory versus reasoning.
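As a rough illustration of how K-FAC yields per-weight curvature estimates, the sketch below builds the two standard Kronecker factors for a single fully connected layer – the covariance of its input activations and the covariance of the gradients at its output – and reads one curvature score per weight off the diagonal of their Kronecker product. The layer sizes, batch size, and random data are placeholders, not the study's setup.

```python
import numpy as np

def kfac_curvature_scores(acts: np.ndarray, grads: np.ndarray) -> np.ndarray:
    """Approximate per-weight curvature for one linear layer via K-FAC.

    acts:  (batch, n_in)  inputs to the layer
    grads: (batch, n_out) gradients of the loss w.r.t. the layer outputs
    Returns an (n_in, n_out) array with one curvature score per weight.
    """
    batch = acts.shape[0]
    A = acts.T @ acts / batch    # input-covariance Kronecker factor
    G = grads.T @ grads / batch  # output-gradient-covariance Kronecker factor
    # The K-FAC curvature block is approximately kron(A, G); its diagonal
    # factorizes into the outer product of the two factors' diagonals.
    return np.outer(np.diag(A), np.diag(G))

# Placeholder data for a 512 -> 256 layer over a batch of 64 examples.
rng = np.random.default_rng(1)
scores = kfac_curvature_scores(rng.normal(size=(64, 512)),
                               rng.normal(size=(64, 256)))
print(scores.shape)  # (512, 256): one curvature estimate per weight
```

Scores like these would feed the ranking-and-pruning step sketched earlier: the weights with the smallest values are the candidates for removal as “memory circuits.”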

Testing across multiple systems, including vision models trained on intentionally mislabeled images, confirmed the pattern: when memorization components were removed, recall dropped to as low as 3%, but reasoning tasks, such as logical deduction, common-sense inference, and science reasoning, held steady at 95-106% of baseline.

Understanding these internal divisions could have profound implications for AI safety and governance. Models that memorize text verbatim risk leaking private information, copyrighted data, or harmful content. If engineers can selectively disable or edit memory circuits, they could build systems that preserve intelligence while erasing sensitive or biased data.

While the current technique cannot guarantee permanent deletion, since “forgotten” data can sometimes reappear with retraining, the research represents a major step toward improved transparency in AI.