
Lee Sedol met AlphaGo in 2016
Photo by AP/Ahn Young-joon/Alamy
The first time AlphaGo revealed its full power, it prompted a visceral reaction. Lee Sedol, one of the world’s greatest players of the ancient Chinese board game Go, was visibly shaken by the AI’s prowess. The stunned crowd in downtown Seoul, South Korea, could barely contain their gasps. It quickly dawned on Lee, and the tens of millions watching at home, that this AI was different from those that had come before.
Not only did it beat Lee, but it did so with an almost human-like ability. “AlphaGo actually has an intuition,” Google co-founder Sergey Brin told New Scientist in 2016, shortly after AlphaGo went up 3-0. “It makes beautiful movements. It makes even more beautiful movements than most of us could think of.”
The series ended with Google DeepMind’s AlphaGo system winning 4-1. Lee said he was “in shock”.
It is now a decade since this defining moment for AlphaGo, and for AI in general. Awe at AI has become a common experience thanks to the success of large language models like ChatGPT, and AlphaGo was in many ways our first glimpse of what was to come. Ten years on, what is the legacy of AlphaGo, and has the technology lived up to its potential?
“Large language models are now quite different in some ways from AlphaGo, but there’s actually an underlying technological thread that hasn’t really changed,” says Chris Maddison of the University of Toronto, who was part of the original AlphaGo team.
The underlying technology is neural networks – mathematical structures inspired by the brain and written into code. Historically, building a game-playing machine meant a human writing down the rules it should follow in various situations. With a neural network, the machine learns for itself.
But even with a neural network, cracking Go was a huge task. The ancient Chinese game, in which two players place black and white stones to claim territory on a 19 x 19 board, allows around 10¹⁷¹ possible positions. By comparison, there are only about 10⁸⁰ atoms in the entire observable universe.
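A quick back-of-the-envelope calculation shows why brute force was hopeless. The atoms-as-computers scenario below is purely an illustrative assumption, and both figures are rough orders of magnitude rather than exact counts:

```python
# Rough orders of magnitude from the article, not exact counts
go_positions = 10**171
atoms_in_universe = 10**80

# Illustrative assumption: every atom in the universe evaluates a
# billion positions per second for the ~4.4e17 seconds since the
# Big Bang. The fraction of Go positions covered is still vanishing.
seconds_since_big_bang = 44 * 10**16
checks = atoms_in_universe * seconds_since_big_bang * 10**9
fraction = checks / go_positions
print(f"fraction of Go positions explored: {fraction:.0e}")
```

Even under these absurdly generous assumptions, less than one part in 10⁶⁰ of the game could ever be examined.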
The breakthrough came when Maddison and his colleagues tried to replicate the intuition of a human player, training a neural network to predict the strongest next move based on millions of moves from real games. Human players don’t need millions of games to build up their intuition – but nor could they ever play that many, a clear advantage for the AI.
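The supervised step described above can be sketched in miniature. The real system used a deep convolutional network on 19 x 19 boards and millions of expert moves; this toy version uses a linear softmax model, nine board cells and synthetic data, purely to show the mechanics of learning to predict a “human” move from a position:

```python
import numpy as np

# Toy policy "network": predict the move a human would play from a
# position. Everything here (sizes, data, model) is illustrative.
rng = np.random.default_rng(0)
N_CELLS, N_SAMPLES = 9, 500
X = rng.normal(size=(N_SAMPLES, N_CELLS))   # fake position features
y = X.argmax(axis=1)                        # fake "human" moves

W = np.zeros((N_CELLS, N_CELLS))            # move logits = X @ W

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

lr = 0.5
for _ in range(300):                        # cross-entropy gradient descent
    grad_logits = softmax(X @ W)
    grad_logits[np.arange(N_SAMPLES), y] -= 1.0
    W -= lr * (X.T @ grad_logits) / N_SAMPLES

accuracy = (softmax(X @ W).argmax(axis=1) == y).mean()
```

After a few hundred gradient steps the model predicts the synthetic “human” moves far better than chance – the same imitation-learning principle, at vastly smaller scale.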
AlphaGo was also not limited to learning from human players; it could play millions of games against itself to hone its skills. “By learning through these games, it can discover new knowledge and be able to go beyond human-level players,” says Pushmeet Kohli at Google DeepMind.
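Self-play can be illustrated with a much simpler game than Go. In the sketch below – a toy of my own, not DeepMind’s method – two copies of the same learner play a subtraction game (take 1 to 3 stones from a pile; whoever takes the last stone wins) and improve from outcomes alone, with a tabular value lookup standing in for AlphaGo’s neural networks:

```python
import random

# Self-play on a toy game: no human games are ever seen.
random.seed(1)
PILE, MOVES = 12, (1, 2, 3)
Q = {(n, m): 0.0 for n in range(1, PILE + 1) for m in MOVES if m <= n}

def choose(n, eps):
    """Pick a move: mostly the best known, sometimes explore."""
    moves = [m for m in MOVES if m <= n]
    if random.random() < eps:
        return random.choice(moves)
    return max(moves, key=lambda m: Q[(n, m)])

alpha = 0.2
for _ in range(20000):                  # one self-play game per loop
    n, history = PILE, []
    while n > 0:
        m = choose(n, eps=0.2)
        history.append((n, m))
        n -= m
    reward = 1.0                        # the last mover won
    for state, move in reversed(history):
        Q[(state, move)] += alpha * (reward - Q[(state, move)])
        reward = -reward                # the other player saw the opposite

# The learned policy: best move for each pile size
best = {n: max([m for m in MOVES if m <= n], key=lambda m: Q[(n, m)])
        for n in range(1, PILE + 1)}
```

Optimal play in this game is known: always leave your opponent a multiple of four stones. For pile sizes near the end of the game, the self-play learner rediscovers that rule without ever being told it – a small-scale echo of discovering “new knowledge” through self-play.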
The final system that beat Lee was more complex than Maddison’s early models, but the overall message was simple: neural networks worked. “AlphaGo definitely showed that neural networks can do pattern recognition better than humans. They can essentially have intuition that surpasses humans,” says Noam Brown at OpenAI.
Other alphas
So what happened next? After AlphaGo, Google DeepMind and other AI researchers began applying the fundamental lesson to real-world applications, such as in mathematics and biology. One of the most striking examples was AlphaFold, an artificial intelligence that could predict how proteins would look in three-dimensional space from their chemical composition far better than any human-designed program, and which won the team behind it the Nobel Prize in Chemistry in 2024.
Recently, another neural network-based AI, AlphaProof, performed at the gold medal level in the International Mathematical Olympiad, a prestigious competition for students – a result that astonished mathematicians. “Not only can you get this superhuman intelligence in a game, but you can get that experience in important scientific applications,” says Kohli.
The logic behind both the AlphaGo style of AI and that used for large language models (LLMs) such as ChatGPT is similar. The first step, called pretraining, involves feeding a neural network a large amount of human data – complete Go games, say, or huge swathes of the internet in the case of an LLM. The second step, called post-training, sees the network improve through a technique called reinforcement learning, which shows an AI what success looks like and lets it figure out how to achieve it.
For AlphaGo, this meant letting it play against itself millions of times until it figured out the best winning strategies. For AlphaFold, it was about telling the AI what a successfully folded protein looked like and letting it figure out the rules. For ChatGPT, it’s telling the model which answer people like better, a process called reinforcement learning from human feedback, or giving it a solution to a defined problem, say in math or coding, and letting it figure out how best to “reason” toward a solution by feeding the output back to itself, similar to how humans think out loud.
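The reward-driven loop common to all three cases can be sketched in a few lines. This is a minimal policy-gradient (REINFORCE) toy of my own devising, not any lab’s actual setup: the “model” is just a preference over five candidate answers, and the only feedback is a verifiable reward of 1 for the right answer and 0 otherwise:

```python
import math, random

# Try an answer, check it, reinforce whatever earned reward.
random.seed(0)
N_ANSWERS, CORRECT = 5, 3               # toy task: one answer is right
prefs = [0.0] * N_ANSWERS               # learnable answer preferences

def sample_answer():
    """Sample an answer from the softmax of current preferences."""
    exps = [math.exp(p) for p in prefs]
    total, r, acc = sum(exps), random.random(), 0.0
    for i, e in enumerate(exps):
        acc += e / total
        if r <= acc:
            return i
    return N_ANSWERS - 1

lr = 0.5
for _ in range(2000):                   # try, verify, reinforce
    a = sample_answer()
    reward = 1.0 if a == CORRECT else 0.0
    exps = [math.exp(p) for p in prefs]
    total = sum(exps)
    for i in range(N_ANSWERS):          # policy-gradient update
        grad = (1.0 if i == a else 0.0) - exps[i] / total
        prefs[i] += lr * reward * grad

best = max(range(N_ANSWERS), key=lambda i: prefs[i])
```

The model is never told which answer is correct – only whether the one it tried earned reward – yet its preferences converge on the right one. Swap the binary check for a human preference signal and you have the skeleton of reinforcement learning from human feedback.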
But this also has disadvantages. Neural networks are in many ways a black box. Despite attempts to figure out how they work, many of them are too large and complex to understand at a basic level.
When AlphaGo made its now famous move 37, onlookers initially thought the AI had blundered; only as the game progressed did it become clear that the move was a strategic masterstroke. Yet Google DeepMind’s engineers could not ask AlphaGo why it had made that move – it could just as easily have been a mistake, and we would have been none the wiser.
“These models will come up with answers, and we won’t know if they are brilliant insights or hallucinations,” says Kohli. “We are still actively working to try to resolve those kinds of questions.”
A big part of AlphaGo’s achievement was that there was plenty of data to feed the model in the first place, as well as a clear definition of success. It therefore makes sense that the areas in which AI is most successful today are those where both of these conditions also hold, says Maddison – fields such as mathematics and programming, where it is easy to define and verify what is right or wrong. “The similarities between these approaches tell us something, and that tells us what are the raw necessary ingredients for progress.”