A new basketball-playing robot was able to teach itself how to shoot hoops in only two hours.
It seems like with each passing day, another 'humans-only' domain is taken over by robots: they manufacture all our stuff, drive our Ubers, and fight our wars. Increasingly, robots are also beginning to challenge humans in a variety of sports. There are engineers working on a team of soccer-playing robots that will be ready for the 2050 World Cup, a robot that can swim, and even a robot that'll kick your ass at foosball.
The latest feat of robo-athletics is coming out of Arizona State University, where a team of researchers has created a robot that was able to teach itself to shoot hoops in a matter of hours.
Unwilling to let Team Human slip into obsolescence on the court, I decided to pay a visit to the robo baller and challenge it to a shootout. If I won, the robots would have to agree to stay off our courts, but if I lost…
Given the high stakes, I paid a visit to the office of Heni Ben Amor, the head of Arizona State's Interactive Robotics Lab, to get a little insight into my opponent's strategy before the pickup game. Ben Amor and his team of graduate students have been working on their basketball-playing robot (known as SunDevil RX) for nearly a year now, although the underlying algorithm that allows the robot to teach itself how to ball has been in development for much longer.
"My lab researches methods that allows robots to learn skills on their own rather than have a human program them," Ben Amor said. "We changed the traditional programming paradigm into a different paradigm where instead you're the teacher."
What Ben Amor is describing is a branch of computer science called machine learning. Advanced machine learning systems, such as those built by Google's DeepMind, can draw on massive amounts of data, training the machine over the course of millions of trials until it has learned the best way to achieve a desired result. One such approach is reinforcement learning, where a machine learns how to do something through repeated trial and error without being explicitly told how to achieve that goal. While reinforcement learning works great for virtual tasks, having to run through millions of iterations is not optimal for a robot trying to execute a task IRL, like learning how to put the rock in the hole.
To get around this difficulty, Ben Amor and his team developed a special type of reinforcement learning algorithm, called sparse latent space policy search, for their basketball bot. This type of algorithm provides the robot with a hierarchy of its motor functions so that it can replicate and correct dynamic movements, like shooting a basketball into a hoop. Rather than allowing the robot to manipulate each individual motor as it learns, the algorithm has it manipulate them in groups, according to a hierarchy established by Ben Amor and his team and coded into the algorithm.
By limiting the degrees of freedom of the robot with their latent space policy search algorithm, the researchers were actually able to make it far outperform other machines which are teaching themselves how to perform different tasks. Previously, if the researchers were to readjust the hoop by moving its height or distance from the robot, it might've taken the robot weeks of learning to get the ball in the hoop again because the robot would be tweaking small, individual parameters rather than operating at a meta-level that changes parameters in pre-established groups. But with Ben Amor's algorithm, the robot is able to learn how to shoot from scratch within two hours.
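To make the idea of searching over groups of motors concrete, here is a minimal sketch in Python. It is not the lab's sparse latent space policy search: a plain random-search hill climber stands in for it, and the joint names, the grouping weights in GROUPS, the toy landing_distance "physics," and the squared-miss reward are all invented for illustration. What it does show is the core trick described above: the learner only ever adjusts two numbers, and each of those numbers moves a whole group of joints at once.

```python
import random

# Hypothetical grouping: each latent value drives a whole group of joints.
# Rows are joints, columns are latent dimensions (all weights are invented).
GROUPS = [
    [1.0, 0.0],   # shoulder
    [0.5, 0.0],   # elbow
    [0.2, 0.0],   # wrist
    [0.0, 1.0],   # hip
    [0.0, 0.6],   # knee
    [0.0, 0.3],   # ankle
]


def joints_from_latent(z):
    """Expand a short latent vector into one command per joint."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in GROUPS]


def landing_distance(joints):
    """Toy stand-in for physics: how far the ball travels, in meters."""
    arm = joints[0] + joints[1] + joints[2]
    legs = joints[3] + joints[4] + joints[5]
    return 2.0 * arm + 1.0 * legs


def reward(z, hoop_distance):
    """Higher is better; zero means the ball lands exactly at the hoop."""
    miss = landing_distance(joints_from_latent(z)) - hoop_distance
    return -miss * miss


def search(hoop_distance, trials=200, step=0.2, seed=0):
    """Trial and error over the two latent values, never the six joints."""
    rng = random.Random(seed)
    best = [0.0, 0.0]
    best_reward = reward(best, hoop_distance)
    for _ in range(trials):
        # Nudge the current best guess at random and try the shot again.
        candidate = [zi + rng.gauss(0.0, step) for zi in best]
        r = reward(candidate, hoop_distance)
        if r > best_reward:  # keep a change only if the shot improved
            best, best_reward = candidate, r
    return best, best_reward


if __name__ == "__main__":
    z, r = search(hoop_distance=3.0)
    print("latent values:", z)
    print("joint commands:", joints_from_latent(z))
    print("reward:", r)
```

Moving the hoop in this toy setup just means calling search again with a different hoop_distance. The search space stays two-dimensional no matter how many joints the grouping table lists, which is the intuition behind relearning in hours rather than weeks.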
As you can see in the above video, the shooting technique the robot originally settled on looks kind of like someone pushing someone else away from their chest. This technique discovered by the robot worked fine, allowing it to sink between 60 and 70 percent of its shots by Ben Amor's estimation. But when the researchers wiped its memory and it had to learn how to shoot all over again, the technique it developed was different: this time it looked more like an underhand toss.
In other words, the robot had taught itself different ways of arriving at the same goal.
Now that I had a little insight into my opponent's strategy, it was time for the shootout. The court was a dim room just off the university's main campus, where a small team of graduate students sat coding behind computer monitors. The blinds were drawn so the light wouldn't interfere with the robot's optical system (an Xbox Kinect mounted to its head), which clearly gave the robot a home-court advantage.
As the visiting team, I was given the first attempt to get the ball into the makeshift hoop. Figuring that the robot might be on to something with its underhand shooting technique, I decided to give it a try. Against all odds, I was successful, but the lead didn't last long: SunDevil RX sank its first shot.
The suspense became unbearable as SunDevil RX calibrated its shot after I missed my second, and the robot takeover began with a swish. When I missed my third shot, the air in the lab became electric as a handful of graduate students briefly stopped coding to watch their creation dunk on me.
And sure enough, SunDevil RX coolly sank the shot like some sort of robotic Steve Nash. A robot which had taught itself to shoot hoops in two hours had beaten a human 3-1. As I hung my head in shame, SunDevil RX raised its lifeless arms in a gesture of victory.
My one bit of solace was that all the engineers responsible for designing the robot actually fared worse against it on the court than I did. Luckily for all of us, the main purpose of the algorithm underlying this robot isn't actually shooting hoops—in fact, Ben Amor isn't entirely sure what it will be used for.
The obvious application is putting the algorithm to use in manufacturing robots. This would allow for quick tweaks to the production process without having to waste huge amounts of time recalibrating the robot to the new task. Ben Amor could also see it being put to use helping people with disabilities or as an element in an amusement park (think Disneyland, but where the robots in 'It's a Small World' could actually interact with you).
Ben Amor chose shooting hoops as a way of developing this algorithm because the system of rewards that the robot uses to teach itself is very clear: the ball either goes into the hoop or it doesn't. But now that the technology is there, it's up to others to figure out how to put it to use in a practical way.
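For a sense of just how simple an all-or-nothing reward like that is, here is a rough sketch in Python. The function names, the one-dimensional landing position, and the 0.2-meter hoop radius are assumptions made up for the example, not details from Ben Amor's lab.

```python
def shot_reward(landing_x, hoop_x, hoop_radius=0.2):
    """Return 1.0 if the ball lands inside the hoop, otherwise 0.0."""
    # hoop_radius is an invented value, in meters.
    return 1.0 if abs(landing_x - hoop_x) <= hoop_radius else 0.0


def success_rate(landings, hoop_x):
    """Fraction of shots that scored (expects at least one shot)."""
    rewards = [shot_reward(x, hoop_x) for x in landings]
    return sum(rewards) / len(rewards)


if __name__ == "__main__":
    # Four hypothetical shots at a hoop three meters away.
    print(success_rate([2.9, 3.1, 3.5, 3.0], hoop_x=3.0))
```

There is no partial credit for a near miss here, which is what makes the signal so unambiguous for a learner.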