Building a Better CAPTCHA by Faking Out Robots

A pair of computer scientists seeks to avoid CAPTCHA doomsday.

Fundamentally, CAPTCHA is a stopgap. In the canonical sense of presenting images containing garbled and/or noisy text and asking users to verify their humanness by deciphering it, CAPTCHA's utility depends on two things, neither guaranteed. The first is that humans will be able to keep up with challenges of escalating difficulty; the second is that computers won't get better at text recognition. A collision is all but foretold, in which computers become better at solving CAPTCHA challenges than humans are. RIP CAPTCHA.

We may have already reached that collision. Computers started beating humans at some character-recognition tasks circa 2005, and Gmail's CAPTCHA was cracked in 2008. Presumably, CAPTCHAs still manage to filter out enough robots to be worth keeping around, but they will do so less and less. The question, then, is how to make a better CAPTCHA: one that is more fundamentally resistant to machine vision.

A pair of computer scientists from Korea University, Shinil Kwon and Sungdeok Cha, has developed a new image-based CAPTCHA system that achieves this fundamental resistance by injecting temporary randomness into image sets, yielding challenges that may have different solutions at different points in time. As a result, robots can't glean new information by making random guesses. This is key: without trial and error, machines become much less intelligent.

First off, we're not talking about the classic text-recognition CAPTCHA. We can just assume that's dead and buried. Cha and Kwon's work concerns the next iteration of CAPTCHA, which involves divining information from sets of images.

Image: Cha et al

"Although computer vision algorithms are powerful, they're still weak in answering semantic questions," the duo write in the current issue of IEEE Software. An example of such a challenge might be presenting a series of images and asking the user to select every image where Bill Gates appears. This is effective for a while, but we have to consider the scale at which bots are attempting to create accounts and gain access to systems: millions per day. Every attempt represents a chance to learn new information about a challenge, and, thus, better odds of solving it the next time.

"Should robots, through luck, pass the challenge, they can record all the relevant information for use in future attacks," Cha and Kwon write. "Furthermore, robots could use commercial search engines to retrieve image tags or similar images."

This brings us to Cha and Kwon's solution. Their challenge starts with a series of images as described above: some are to be selected by the user and some left alone, with the correct answer maintained internally by the CAPTCHA. Classically, we'd imagine each image in the set labeled (internally) as either included or excluded, but the new system adds a third, neutral possibility. Now, an image can be correctly included, correctly excluded, or irrelevant. A user or robot can pick or skip any of these neutral images with no impact on whether the CAPTCHA is passed. Crucially, the set of neutral images is unknown to the user and changes randomly from challenge to challenge, so a CAPTCHA may look the same to the user while, internally, it's different.
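To make the idea concrete, here's a minimal sketch of how such three-way labeling could work. The function names, parameters, and labels are assumptions for illustration, not the authors' actual implementation:

```python
import random

def make_round(relevant, irrelevant, n_neutral, rng=random):
    """Build one round's internal answer key. A random subset of the
    images is demoted to 'neutral', so the key varies between rounds
    even when the visible challenge looks identical."""
    pool = relevant + irrelevant
    neutral = set(rng.sample(pool, n_neutral))
    return {
        img: "neutral" if img in neutral
        else "include" if img in relevant
        else "exclude"
        for img in pool
    }

def grade(key, picks):
    """Pass iff every 'include' image was picked and no 'exclude' image
    was; 'neutral' images are ignored either way."""
    picks = set(picks)
    return all(
        (label != "include" or img in picks) and
        (label != "exclude" or img not in picks)
        for img, label in key.items()
    )
```

A human who genuinely recognizes the relevant images passes regardless of which ones happened to be neutral that round, while a guessing robot can't tell which of its picks actually mattered.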

What this means is that a robot attempting to learn a solution via random guessing won't really learn anything, because it will never know why it was right or wrong. If it picks five images from a pool of 10 and passes the CAPTCHA, it will have no idea why it passed. The attempt will have been for naught, but the robot has no awareness of this as it updates its own "pirate" database with selections it believes to be correct but that were actually neutral.
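Here's a sketch of how a robot's "pirate" database could end up poisoned this way. The names and structure are hypothetical, not taken from the paper:

```python
def update_pirate_db(pirate_db, key, picks, passed):
    """What a scraping robot might do after an attempt: if it passed,
    record every picked image as 'correct' and every unpicked image as
    'incorrect'. Images that were merely neutral that round get stored
    with labels that may be wrong in future rounds."""
    if not passed:
        return  # a failed attempt reveals nothing specific to record
    picks = set(picks)
    for img in key:
        pirate_db[img] = "correct" if img in picks else "incorrect"
```

Because the robot can't distinguish images that counted from images that were neutral, every lucky pass writes some wrong labels into its database.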

The system can be further improved with a "trap" database, which associates particular IP addresses with neutral images that would otherwise have caused failures. Because the robot at that IP address never learned about the failure (the image had been randomly placed in the neutral pool in a prior test), the image can be redeployed as a trap: the robot has incorrectly learned that it's a correct answer because, in one instance, it didn't cause the test to fail. In a later challenge, that same wrong answer might not be labeled neutral, and it will really count.
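The trap mechanic might look something like the following. This is a hedged sketch with assumed names; the paper doesn't publish its implementation:

```python
from collections import defaultdict

trap_db = defaultdict(set)  # IP address -> images that robot mislearned

def record_traps(ip, key, picks, truly_irrelevant):
    """After a robot at `ip` passes, remember every picked image that was
    neutral this round but is actually irrelevant: the robot now wrongly
    believes it's a correct answer."""
    for img in set(picks):
        if key.get(img) == "neutral" and img in truly_irrelevant:
            trap_db[ip].add(img)

def apply_traps(ip, key):
    """In a later round served to the same IP, force trapped images to
    carry their real 'exclude' label so the mislearned answer fails."""
    return {img: ("exclude" if img in trap_db[ip] else label)
            for img, label in key.items()}
```

The design choice here is that the trap is per-IP: a mistake only the robot at that address could have memorized becomes a tripwire only that robot will hit.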

In testing the new system, Cha and Kwon found that their robots achieved a success ratio of only 0.023 across 2,250,000 attempts. That drops to approximately zero when the trap database is used. "Owing to random and temporal aspects of the neutral images, the pirate databases couldn't maintain accuracy, and the robots never had opportunities to correct their misunderstanding," they write. "We discovered that 2,465 images (approximately 19.9 percent) were incorrectly labeled in the pirate databases." For comparison, human users maintained success ratios of 0.793 without a trap database in play and 0.645 with one.

Cha and Kwon are currently working with Microsoft Research Asia to address issues of resiliency and scalability for future real-world deployment.