
    This Computer Program Can Teach Itself to Beat Super Mario Bros.

    Written by

    Derek Mead


It's fairly well known that the first part of the first level of Super Mario Bros. was designed to teach new players how to jump on bad guys, eat mushrooms, and so forth. But what about players who aren't human? Could a computer be programmed to learn how to play classic games? Work done by computer scientist Dr. Tom Murphy suggests that yes, AI can learn to game.

The video above features Murphy explaining the results of a paper he recently published and presented at the 2013 SIGBOVIK conference. (SIGBOVIK is held annually on April 1 and prominently features spoof research, although Murphy clearly states that his research isn't fake. For posterity's sake, I've yet to find any sources claiming this is a prank.) In short, his goal was to develop a program that could learn from a user what it takes to beat a game, and then apply that to its own methods.

From Murphy's paper, which is a truly enjoyable read:

    "The basic idea is to deduce an objective function from a short recording of a player's inputs to the game," he writes. "The objective function is then used to guide search over possible inputs, using an emulator. This allows the player's notion of progress to be generalized in order to produce novel gameplay."

As you can see by the early gameplay testing (starting around the 6:00 mark in the above video), the first iteration of Murphy's program (a pair of them, actually) basically mashed buttons and got nowhere. But as Murphy refined its goals and methods (scoring a lot of points was one static goal, aside from simply "winning"), it slowly got better.

As he notes in the paper's title, Murphy's system is based on lexicographic ordering, the technique of comparing sequences position by position, the way words are alphabetized in a dictionary. The first of his programs, called LearnFun, recorded the game's memory throughout his gameplay (everything from the number of coins he had to how far right he scrolled) to learn which values he could manipulate and which ones increased as he made progress.
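To make the LearnFun idea concrete, here's a minimal Python sketch: scan a recording of memory snapshots for locations whose values only ever go up, then treat the tuple of those values as a lexicographic progress score. The snapshot format and function names are illustrative assumptions; Murphy's actual system works over full NES RAM and weights many candidate orderings.

```python
def find_increasing_locations(snapshots):
    """Return memory locations whose value never decreases across the
    recording and ends higher than it started; these plausibly encode
    progress (score, level number, rightward scroll)."""
    locs = []
    for i in range(len(snapshots[0])):
        vals = [snap[i] for snap in snapshots]
        never_falls = all(b >= a for a, b in zip(vals, vals[1:]))
        if never_falls and vals[-1] > vals[0]:
            locs.append(i)
    return locs

def progress_score(locs, memory):
    """Score a memory state as the tuple of its learned progress
    values; tuples compare lexicographically in Python."""
    return tuple(memory[i] for i in locs)

# Three fake 3-byte snapshots: byte 0 climbs, byte 1 sits still,
# byte 2 falls -- only byte 0 looks like progress.
recording = [[0, 5, 9], [1, 5, 8], [2, 5, 7]]
locs = find_increasing_locations(recording)   # [0]
```

Everything else in the system hangs off that learned score: a state is "better" if its tuple compares greater.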

He then fed that data into his second program, PlayFun, which searched for the combinations of controller inputs that produced the most desirable outputs; mainly, scoring lots of points and scrolling as far right as possible.
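PlayFun's real search is far more elaborate (it plans over many possible futures at once), but its core loop can be sketched as: try each candidate input on a saved emulator state, score where it leads, rewind, and commit whichever input scored best. The ToyEmulator below is a hypothetical stand-in for a real NES emulator interface, not Murphy's code.

```python
class ToyEmulator:
    """Hypothetical stand-in for an NES emulator: one memory value,
    Mario's x position, which 'right' increases and 'left' decreases."""
    def __init__(self):
        self.x = 0
    def savestate(self):
        return self.x
    def loadstate(self, state):
        self.x = state
    def step(self, button):
        self.x += {"right": 1, "left": -1, "none": 0}[button]
    def memory(self):
        return (self.x,)

def playfun_step(emu, objective, buttons=("right", "left", "none"), horizon=5):
    """Hold each button for `horizon` frames on a saved state, score
    the outcome, rewind, then commit the best-scoring button."""
    best_button, best_score = None, None
    for button in buttons:
        save = emu.savestate()
        for _ in range(horizon):
            emu.step(button)
        score = objective(emu.memory())
        emu.loadstate(save)
        if best_score is None or score > best_score:
            best_button, best_score = button, score
    for _ in range(horizon):
        emu.step(best_button)

emu = ToyEmulator()
playfun_step(emu, objective=lambda mem: mem)  # picks "right"; emu.x is now 5
```

The emulator's savestates are what make this tractable: the program can peek at thousands of futures and only commit to the one it likes.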

    Pac-Man's ballsy move. Via Murphy's paper

    For the first level, the idea that Mario needs to get as far right as possible worked, after a bit of finagling. But in level 1-2, there's a key point where Mario must move backwards after grabbing those all-important coins before he can move forward again. For Murphy's computer, that was a watershed moment, as defeating it meant coding the program to learn more than just "move right and jump when you need to." Instead, it meant teaching the program how to simulate basic critical thinking: "Okay, I'm stuck. Now what?"

PlayFun ends up being a pretty damn skilled AI, able to defeat the problems of World 1-2 and also exploit a glitch that only a computer could pull off regularly: as long as Mario is moving downward, however slightly, he can stomp Goombas, even if they're above him (10:50 in the video).

But, alas, Murphy's model isn't perfect. In World 1-3, there's a long jump that Mario has to back up and take a running start to clear, and the program simply can't figure it out. Still, it's impressive stuff, and as Ian Steadman notes at Wired UK, it should make for interesting discussion at the annual Mario AI Championship, where algorithms typically revolve around pathfinding rather than the kind of search Murphy's system does.

    If you make it to the end of the video or Murphy's very accessible, entertaining paper, he also discusses how he's trying to apply his programs to other games. When it plays The Karate Kid, the computer uses all of its power kicks against early characters in a tournament before getting its ass whooped by harder characters later on. The result makes sense in terms of the program using its best possible methods to get to the next round, but also shows the shortsighted nature of the program itself.

It does work quite well, however, with Hudson's Adventure Island, another side-scrolling platformer. It also displays the precision only possible for a computer in Pac-Man, where it needlessly takes the risk of diving between ghosts as they pass by. Finally, it fails miserably at Tetris, because the computer is designed to maximize points now, without planning for the future. Nevertheless, the work is impressive as all get-out. I just need it to get to the point that I can finally beat The Lost Levels.