Can We Make an AI Scientist?

Sam Rodriques, What does it take to build an AI Scientist? August 15, 2024:

What will it take to build an AI Scientist?

I run FutureHouse, a non-profit AI-for-Science lab where we are automating research in biology and other complex sciences. Several people have asked me to respond to Sakana's recent AI Scientist paper. However, judging from comments on Hacker News, Reddit and elsewhere, I think people already get it: Sakana’s AI Scientist is just ChatGPT (or Claude) writing short scripts, making plots, and grading its own work. It's a nice demo, but there's no major technical breakthrough. It's also not the first time someone has claimed to make an AI Scientist, and there will be many more such claims before we actually get there.

So, putting Sakana aside: what are the problems we have to solve to build something like a real AI scientist? Here’s some food for thought, based on what we have learned so far:

It will take fundamental improvements in our ability to navigate open-ended spaces, beyond the capabilities of current LLMs

Scientific reasoning consists of essentially three steps: coming up with hypotheses, conducting experiments, and using the results to update one’s hypotheses. Science is the ultimate open-ended problem, in that we always have an infinite space of possible hypotheses to choose from, and an infinite space of possible observations. For hypothesis generation: How do we navigate this space effectively? How do we generate diverse, relevant, and explanatory hypotheses? It is one thing to have ChatGPT generate incremental ideas. It is another thing to come up with truly novel, paradigm-shifting concepts.

I note that this is quite different from playing games like chess or Go. Those games have huge search spaces, much larger than we can explicitly construct. But they are well-structured spaces. The space of scientific hypotheses is not at all well-structured. I discuss this problem in various posts, including this one: Stagnation, Redux: It’s the way of the world [good ideas are not evenly distributed, no more so than diamonds] (August 13, 2024).

It will take tight integration with experiments

Once we have a hypothesis, we then need to decide which experiment to conduct. This is an iterative process. How can we identify experiments that will maximize our information gain? How do we build affordance models that tell us which experiments are possible and which are impossible? Affordance models are critical, because discovery is about doing things that have never been done before.
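To make "maximize our information gain" concrete, here is a toy sketch of one standard way to score experiments: the expected reduction in entropy over a small, fixed set of hypotheses. This is not a method from Rodriques's post. The hypothesis set, the likelihood tables, the experiment names, and the `feasible` flag (a crude stand-in for an affordance model) are all illustrative assumptions. Real science has no such tidy, enumerable space, which is exactly his point.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def expected_information_gain(prior, likelihoods):
    """Expected entropy reduction over hypotheses from running one experiment.

    prior[h]          = P(hypothesis h)
    likelihoods[h][o] = P(outcome o | hypothesis h)
    """
    n_hyp = len(prior)
    n_out = len(likelihoods[0])
    gain = entropy(prior)
    for o in range(n_out):
        # Marginal probability of seeing outcome o.
        p_o = sum(prior[h] * likelihoods[h][o] for h in range(n_hyp))
        if p_o == 0:
            continue
        # Bayes update: posterior over hypotheses given outcome o.
        posterior = [prior[h] * likelihoods[h][o] / p_o for h in range(n_hyp)]
        gain -= p_o * entropy(posterior)
    return gain

def pick_experiment(prior, experiments):
    """Return the feasible experiment with the highest expected gain."""
    feasible = [e for e in experiments if e["feasible"]]
    return max(
        feasible,
        key=lambda e: expected_information_gain(prior, e["likelihoods"]),
    )

# Two rival hypotheses, equally plausible a priori (hypothetical numbers).
prior = [0.5, 0.5]

experiments = [
    # Would settle the question outright, but assumed impossible to run.
    {"name": "decisive_assay", "feasible": False,
     "likelihoods": [[1.0, 0.0], [0.0, 1.0]]},
    # Runnable, and the outcome depends somewhat on which hypothesis holds.
    {"name": "partial_assay", "feasible": True,
     "likelihoods": [[0.8, 0.2], [0.3, 0.7]]},
    # Runnable, but the outcome is the same under both hypotheses.
    {"name": "uninformative_assay", "feasible": True,
     "likelihoods": [[0.5, 0.5], [0.5, 0.5]]},
]

best = pick_experiment(prior, experiments)
print(best["name"])
```

In this sketch the most informative experiment is ruled out by the feasibility flag, so the selection falls to the best experiment that can actually be run. The hard parts the post points to are hidden in the inputs: someone has to supply the hypotheses, the likelihoods, and the judgment about what is feasible.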

There's much more at the link.