What the history of AI tells us about its future

But what computers were bad at, traditionally, was strategy – the ability to think about the shape of a game many, many moves into the future. That was where humans still had the advantage.

Or so Kasparov thought, until Deep Blue’s move in Game 2 shook him. It seemed so sophisticated that Kasparov began to worry: maybe the machine was far better than he had thought! Convinced he had no way to win, he resigned the second game.

But he shouldn’t have. Deep Blue’s move, it turns out, wasn’t really that good: Kasparov had failed to spot a line that would have let the game end in a draw. He was unnerving himself: fearing that the machine was much more powerful than it really was, he had begun to see human-like reasoning where none existed.

Knocked out of his rhythm, Kasparov played worse and worse. He psyched himself out again and again. At the start of the winner-takes-all sixth game, he made a move so poor that chess watchers cried out in shock. “I was in no mood to play at all,” he later said at a press conference.

IBM’s moonshot paid off. In the press frenzy that followed Deep Blue’s success, the company’s market capitalization grew by $11.4 billion in a single week. Even more significant, though, was that IBM’s triumph felt like a thaw in AI’s long winter. If chess could be conquered, what would be next? The public began to wonder.

“That’s it,” Campbell tells me, “that’s what got people’s attention.”


The truth is, it was no surprise that a computer beat Kasparov. Most people who had paid attention to AI – and chess – expected this to happen eventually.

Chess may seem like the pinnacle of human thought, but it is not. In fact, it’s a mental task that lends itself remarkably well to brute-force computation: the rules are clear, there’s no hidden information, and a computer doesn’t even need to keep track of what happened on previous moves. It simply assesses the position of the pieces at any given moment.
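To get a feel for why that structure suits computers, here is a minimal, hypothetical sketch of the brute-force idea behind chess engines – exhaustive game-tree search, often called minimax. It uses the take-1-2-or-3 version of Nim as a stand-in for chess purely to keep the example tiny; the game, the scoring, and the code are illustrative assumptions, not Deep Blue’s actual program.

```python
# A toy sketch of brute-force game-tree search (minimax). Nim (take 1, 2, or 3
# stones; whoever takes the last stone wins) stands in for chess here purely to
# keep things small -- this is not Deep Blue's code.

def minimax(stones, maximizing):
    """Return 1 if the side to move can force a win, -1 if it cannot.

    Only the current position matters -- no history, no hidden information --
    which is exactly the property that makes brute force work on games like chess.
    """
    if stones == 0:
        # The previous player just took the last stone and won.
        return -1 if maximizing else 1
    moves = range(1, min(3, stones) + 1)            # every legal option is knowable
    scores = [minimax(stones - take, not maximizing) for take in moves]
    return max(scores) if maximizing else min(scores)

print(minimax(12, True))   # -1: with 12 stones, the side to move always loses
print(minimax(10, True))   # 1: with 10 stones, the side to move can force a win
```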

“There are very few problems where, like in chess, you have all the information you might need to make the right decision.”

Everyone knew that once computers got fast enough, they would overwhelm a human. It was just a matter of when. By the mid-1990s, “the writing was already on the wall, in a sense,” says Demis Hassabis, head of artificial intelligence company DeepMind, which is part of Alphabet.

Deep Blue’s victory was the moment that showed just how limited hand-coded systems could be. IBM had spent years and millions of dollars developing a computer to play chess. But it couldn’t do anything else.

“It didn’t lead to the breakthroughs that allowed the [Deep Blue] AI to have a huge impact on the world,” says Campbell. They didn’t really discover any principles of intelligence, because the real world doesn’t resemble chess. “There are very few problems where, like in chess, you have all the information you might need to make the right decision,” Campbell adds. “Most of the time there are unknowns. There is chance.”

But even as Deep Blue was wiping the floor with Kasparov, a handful of scrappy upstarts were tinkering with a radically more promising form of AI: the neural network.

With neural networks, the idea was not, as with expert systems, to patiently write rules for every decision an AI would make. Instead, training and reinforcement strengthen internal connections, in rough emulation (so the theory goes) of how the human brain learns.
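As a rough, hypothetical illustration of that difference (not anyone’s production system): rather than hand-writing the rule “answer 1 only when both inputs are 1,” the toy below lets a single artificial neuron learn that behavior by repeatedly strengthening or weakening its connection weights – the classic perceptron learning rule.

```python
# Toy contrast with an expert system: instead of hand-coding the rule
# "answer 1 only if both inputs are 1", repeated examples nudge a single neuron's
# connection weights until it behaves that way on its own (perceptron learning rule).

examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]  # logical AND
weights, bias, lr = [0.0, 0.0], 0.0, 0.1

for _ in range(20):                          # many passes over the training data
    for (x1, x2), target in examples:
        output = 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0
        error = target - output              # feedback signal
        weights[0] += lr * error * x1        # strengthen or weaken each connection
        weights[1] += lr * error * x2
        bias += lr * error

predictions = [1 if weights[0] * a + weights[1] * b + bias > 0 else 0
               for (a, b), _ in examples]
print(weights, bias, predictions)            # learned weights; predictions: [0, 0, 0, 1]
```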

1997: After Garry Kasparov beat Deep Blue in 1996, IBM asked the world chess champion for a rematch, which took place in New York with an upgraded machine.

AP PHOTO / ADAM NADEL

The idea had been around since the 1950s. But training a usefully large neural network required blazing-fast computers, tons of memory, and lots of data. None of that was readily available at the time. Even into the ’90s, neural networks were considered a waste of time.

“At the time, most people in the AI field thought neural networks were just garbage,” says Geoff Hinton, professor emeritus of computer science at the University of Toronto and a pioneer of the technique. “I’ve been called a ‘true believer’” – not a compliment.

But in the 2000s, the computing industry was evolving in ways that would make neural networks viable. Video gamers’ thirst for ever-better graphics created a huge industry in ultra-fast graphics processing units, which turned out to be perfectly suited to the math of neural networks. Meanwhile, the internet was exploding, producing a torrent of images and text that could be used to train the systems.

In the early 2010s, these technical leaps allowed Hinton and his team of true believers to take neural networks to new heights. They could now create networks with many layers of neurons (which is what the “deep” in “deep learning” means). In 2012, his team handily won the annual ImageNet competition, in which AIs compete to recognize objects in images. It stunned the world of computer science: self-learning machines were finally viable.

Ten years after the start of the deep learning revolution, neural networks and their pattern-recognition abilities have colonized every corner of everyday life. They help Gmail autocomplete your sentences, help banks detect fraud, let photo apps automatically recognize faces, and, in the case of OpenAI’s GPT-3 and DeepMind’s Gopher, write long, human-sounding essays and summarize texts. They are even changing how science is done; in 2020, DeepMind debuted AlphaFold 2, an AI that can predict how proteins will fold, a superhuman skill that can help researchers develop new drugs and treatments.

Deep Blue, meanwhile, has vanished, leaving no useful inventions in its wake. Playing chess, it turns out, wasn’t a computer skill that was needed in everyday life. “What Deep Blue finally showed was the shortcomings of trying to make everything by hand,” says DeepMind founder Hassabis.

IBM tried to remedy the situation with Watson, another specialized system, this one designed to tackle a more practical problem: getting a machine to answer questions. It used statistical analysis of massive amounts of text to arrive at an understanding of language that was, for its time, state-of-the-art. It was more than a simple if-then system. But Watson faced unlucky timing: it was eclipsed only a few years later by the deep learning revolution, which brought in a generation of language models far more nuanced than Watson’s statistical techniques.

Deep learning has swept aside old-school AI precisely because “pattern recognition is incredibly powerful,” says Daphne Koller, a former Stanford professor who founded and runs Insitro, which uses neural networks and other forms of machine learning to study new drug treatments. The flexibility of neural networks – the wide variety of ways pattern recognition can be used – is why there hasn’t been another AI winter yet. “Machine learning has actually delivered value,” she says – something that “previous waves of exuberance” in AI never did.

The reversed fortunes of Deep Blue and neural networks show how bad we have been, for so long, at judging what is hard and what is valuable in AI.

For decades people assumed that mastering chess would be important because, well, chess is hard for humans to play at a high level. But chess has proven to be quite easy for computers to master, because it’s so logical.

What was much harder for computers to learn was the casual, unconscious mental work that humans do, such as carrying on a lively conversation, driving a car through traffic, or reading a friend’s emotional state. We do these things so easily that we rarely realize how tricky they are and how much fuzzy, grayscale judgment they require. The great utility of deep learning comes from its ability to capture small pieces of this subtle and unrecognized human intelligence.


Yet there is no ultimate victory in artificial intelligence. Deep learning may be booming now, but it’s also racking up heavy criticism.

“For a very long time, there’s been this techno-chauvinistic enthusiasm that says okay, AI will solve all the problems!” says Meredith Broussard, a programmer turned New York University journalism professor and author of Artificial Unintelligence. But as she and other critics have pointed out, deep learning systems are often trained on biased data and absorb those biases. Computer scientists Joy Buolamwini and Timnit Gebru found that three commercially available visual AI systems were terrible at analyzing the faces of darker-skinned women. Amazon trained an AI to vet résumés, only to find it was downgrading women.

Although computer scientists and many AI engineers are now aware of these bias issues, they don’t always know how to fix them. On top of that, neural networks are also “massive black boxes,” says Daniela Rus, an AI veteran who currently heads MIT’s Computer Science and Artificial Intelligence Lab. Once a neural network is trained, its mechanics are not easily understood, even by its creator. It is not clear how it arrives at its conclusions, or how it will fail.

“For a very long time, there’s been this techno-chauvinistic enthusiasm that says okay, AI will solve all the problems!”

It may not be a problem, Rus believes, to rely on a black box for a task that is not “safety critical”. But what about higher-stakes work, like self-driving? “It’s actually quite remarkable that we’ve been able to place so much trust in them,” she says.

This is where Deep Blue had an edge. The old-school style of handcrafted rules may have been brittle, but it was comprehensible. The machine was complex, but it was no mystery.


Ironically, this old style of programming may be making a comeback as engineers and computer scientists grapple with the limits of pattern matching.

Language generators like OpenAI’s GPT-3 or DeepMind’s Gopher can take a few sentences you’ve written and keep going, producing pages and pages of plausible-sounding prose. But despite some impressive mimicry, Gopher “still doesn’t really understand what it’s saying,” Hassabis says. “Not in a real sense.”

Likewise, visual AI can make terrible mistakes when it encounters an edge case. Self-driving cars have slammed into fire trucks parked on highways, because in the millions of hours of video they’d been trained on, they had never encountered that situation. Neural networks have, in their own way, a version of the “brittleness” problem.
