Spinning the Battle to control AI

Hal writes:

Hassabis often cites Breakout, a videogame for the Atari console. A Breakout player controls a bat that she can move horizontally across the bottom of the screen, using it to bounce a ball against blocks that hover above it, destroying them on impact. The player wins when all blocks are obliterated. She loses if she misses the ball with the bat. Without human instruction, DeepMind’s program not only learned to play the game but also worked out how to cannon the ball into the space behind the blocks, taking advantage of rebounds to break more blocks. This, Hassabis says, demonstrates the power of reinforcement learning and the preternatural ability of DeepMind’s computer programs.

It’s an impressive demo. But Hassabis leaves a few things out. If the virtual paddle were moved even fractionally higher, the program would fail. The skill learned by DeepMind’s program is so restricted that it cannot react even to tiny changes to the environment that a person would take in their stride – at least not without thousands more rounds of reinforcement learning. But the world has jitter like this built into it.

Hassabis leaves more than a few things out, as Dr Beth Singler also points out.

Anyone who played Breakout enough in the early 90s found that the trick that supposedly demonstrates ‘the preternatural ability of DeepMind’ was possible and effective at winning. That may not be immediately obvious to those who haven’t played the game and who don’t understand that it is a natural result of playing games with scores. Whether it was new to engineers in 2015 is unclear, but it was new enough to the DeepMind PR people that they persuaded journalists it was actually novel – did anyone who said so ever play the original game?

This novelty bias also applies to the AlphaGo move that DeepMind PR pushed in 2016 – that ‘unique move’ appears a bunch of times in the KGS archive.

As Whitehall Departments think they should sprinkle AI in their spending round submissions, there is another lesson in Hal’s words.

When we talk about expensive DeepMind generalists over cheap NHS specialists, this is what we mean. The civil service also believes expensive generalists over its own specialists. In both cases, the expensive generalists assume that what is new to them is new to everyone. In medicine, the specialists know better, but it’s harder to write fawning articles about that.

posted: 14 Mar 2019