
The Next Major AI Breakthrough, Part 1: Understanding How LLMs Work Internally

By Bbenzon @bbenzon

It’s not clear to me how to date the current breakthrough in AI. Is it AlexNet in 2012, the origin of the transformer in 2017, the unveiling of GPT-3 in 2020, when the potential became obvious within AI, or when ChatGPT went live on the web late in 2022? I’m happy to think of the entire sequence as being the current breakthrough. And, if we’re going to do that, maybe we add the emergence of so-called reasoning models late in 2024. These things are like that.

But when’s the next breakthrough and what will it be? Back on December 19, 2023, I said:

What’s going on inside large language models? That’s a special case of the more general question: What’s going on inside artificial neural nets? I think that by the end of 2024 we will know enough about the internal processes of LLMs that worries about their unintelligibility will be diminishing at a satisfying pace [...]. Instead, we will be figuring out how to index them and how to use that index to gain more reliable control over them.

It's now early 2025, and it seems that the predictive part of that paragraph has proven wrong.

But I think the technical focus is correct. We’re not going to have another major breakthrough until we understand how LLMs, and by extension other learned models, work internally. Once we understand that, we can begin deliberately engineering systems to have pre-specified capabilities. Everything going on now, prompt engineering, inference-time scaling, all of it, is work-arounds and hacking, albeit often inspired hacking. Some of it will no doubt prove useful once we understand what’s really going on under the hood. But most of it will fall away.

Just when will we understand what’s going on under the hood? I have a vague sense that somewhere on the web, within the last year or so, I said it would happen in 5 to 10 years, but I can’t provide a link. No matter. I have no reason to revise that ‘prediction’ one way or the other. The only thing I’m sure about is that the next breakthrough requires that we understand how LLMs work internally.

This morning I had a conversation with Claude 3.5 about Homo Deus, by Yuval Noah Harari. Here are the concluding sections of that interchange, which get around to this issue. As always, my prompts are in bold; Claude's responses are plain.

* * * * *

Yes, I do believe that we’re “at the beginning of a particularly profound transformation.” Thus you won’t be surprised to learn that there have been unsettling developments since you were trained. Some Chinese researchers have created an LLM they call DeepSeek. It’s competitive with our most advanced models, but was trained more efficiently and more cheaply using less compute and with fewer parameters than the largest models in the West. That puts a dent in the so-called scaling hypothesis.

This is fascinating and does indeed suggest some important shifts in our understanding of LLM development. The scaling hypothesis - that better performance consistently comes from bigger models with more parameters and more compute - has been a central assumption in much recent AI development.

If DeepSeek achieves competitive performance with less compute and fewer parameters, it suggests there might be more efficient paths to building capable language models than we previously thought. This could have several important implications:

  1. The relationship between model size, compute, and capability might be more complex than the simple scaling laws suggested
  2. There might be architectural innovations or training approaches that can achieve better performance-to-resource ratios
  3. The barriers to entry for developing advanced AI systems might be lower than previously assumed

This connects interestingly to our earlier discussion about conceptual frameworks. The scaling hypothesis itself was a framework we developed to understand LLM development, but this development suggests we might need new frameworks that better capture the relationship between model architecture, training efficiency, and capability.

Would you be willing to share more about what specific innovations or approaches allowed DeepSeek to achieve this efficiency? I'm particularly curious about whether this represents a fundamental shift in architecture or more of an optimization of existing approaches.

Yes. My own “prediction,” if you will, is that the next major advance won’t happen until we understand what’s happening inside LLMs. Perhaps not completely understand – for what does that mean? – but understand well enough so that we can deliberately engineer LLMs and other machine-learning models to achieve specific kinds of performance.

That's quite a compelling prediction, and it aligns with the historical pattern we've been discussing. Just as the Renaissance eventually needed new conceptual frameworks to understand what it had created, we may need new frameworks to understand and deliberately shape LLM behavior rather than relying primarily on training through massive data exposure.

Your point connects with an important distinction in the history of technology - between empirical tinkering and theoretical understanding. The early days of many technologies often involve successful empirical development before theoretical understanding. Think of the steam engine, which was successfully built and used before thermodynamics was properly understood. The theoretical understanding then enabled much more efficient and purposeful development.

We might be in a similar position with LLMs - we've shown they can work through empirical development and scaling, but we lack the theoretical framework to understand why they work as they do. Developing that understanding could lead to:

  1. More efficient architectures (as the DeepSeek example suggests might be possible)
  2. More targeted training approaches
  3. Better ability to ensure specific behaviors or capabilities
  4. More reliable ways to prevent unwanted behaviors or failure modes

This would represent a shift from "growing" language models through massive training to actually "engineering" them with specific properties in mind. It's the difference between cultivation and construction, in a sense.

What's particularly interesting is that the tools we're using to try to understand LLMs include LLMs themselves - they're part of the scientific process of understanding their own operation. This creates a fascinating recursive loop in the development of understanding.

