This Camera Can Read a Book Without Opening It

A new imaging technique can read what is printed on the pages of a book without ever opening it.

On Friday, a team of researchers at MIT and Georgia Tech unveiled a new machine imaging technique that allows a computer to determine what is printed on individual sheets of stacked paper without having to flip through them. It's kind of like if Superman used his X-ray vision for doing nerdy stuff like reading books, except that Superman is a letter-interpretation algorithm and his X-ray vision is terahertz radiation.

As detailed in a paper published in Nature Communications, the system uses terahertz radiation (the band of electromagnetic radiation between microwaves and infrared light), which has a number of advantages over other surface-penetrating waves like X-rays or ultrasound.

For starters, terahertz radiation is absorbed by different chemicals in different ways, which means that it can be used to distinguish paper and ink in a book. The terahertz camera used by the team can also emit the radiation in ultrashort bursts, which lets it measure the depth of a page in a book by timing how long it takes for the radiation to be reflected from the book back to the camera.

These ultrashort bursts of radiation allow for a depth resolution so fine that the researchers were able to measure the distance from the radiation source to individual pages in a book, which are separated by pockets of air only about 20 micrometers deep.
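To get a feel for why the bursts need to be so short, here is a rough back-of-the-envelope sketch of the timing involved. It is not the researchers' code: the function names are made up for illustration, and it assumes the air gap behaves like a vacuum (refractive index of 1) and that the pulse travels straight in and straight back.

```python
# Speed of light in vacuum, in meters per second.
C = 299_792_458.0


def depth_from_round_trip(t_seconds: float, refractive_index: float = 1.0) -> float:
    """One-way distance to a reflecting surface, given the round-trip time.

    The pulse travels to the surface and back, hence the division by 2.
    refractive_index=1.0 is an assumption (air treated as vacuum).
    """
    return C * t_seconds / (2.0 * refractive_index)


def round_trip_delay(gap_meters: float, refractive_index: float = 1.0) -> float:
    """Extra round-trip time added by a gap of the given thickness."""
    return 2.0 * refractive_index * gap_meters / C


if __name__ == "__main__":
    gap = 20e-6  # the roughly 20-micrometer air pocket between pages
    delay = round_trip_delay(gap)
    # Prints a delay of roughly 133 femtoseconds.
    print(f"Extra round-trip delay per air gap: {delay * 1e15:.0f} fs")
    # Converting that delay back recovers the 20-micrometer gap.
    print(f"Recovered gap: {depth_from_round_trip(delay) * 1e6:.1f} micrometers")
```

Under those assumptions, reflections from neighboring pages arrive only about 133 femtoseconds apart, which is why the timing has to be so precise to tell one page from the next.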

The way the ink reflects the terahertz radiation back to the camera is then analyzed by an algorithm developed by the MIT researchers, which renders the reflection-time data as an image. The resulting image of what is printed on a given page is often heavily distorted, so it is then processed by another algorithm, developed by the Georgia Tech team, which is able to interpret individual letters in the distorted image.

"It's actually kind of scary," said MIT Media Lab researcher Barmak Heshmat, referring to the letter-interpretation algorithm. "A lot of websites have these letter certifications [captchas] to make sure you're not a robot, and this algorithm can get through a lot of them."

Terahertz imaging is a technology that is still in its infancy. Although researchers at MIT realized over a decade ago that terahertz imaging could be used to see through envelopes, the team's algorithms aren't quite ready to read War and Peace just yet.

As a proof of principle, the researchers printed a single letter on each sheet in a stack of paper and found that their technique could correctly identify the letters up to nine pages deep. Beyond that, the reflected signal became too noisy to extract the information on a page, but as the radiation sensors are refined the technique should one day be able to read entire books without ever opening them.

Indeed, this was part of the impetus behind the project. Heshmat said that the Metropolitan Museum in New York was incredibly interested in the project, since it would allow the museum to peer into books so old that touching them would cause irreparable damage. Moreover, because terahertz imaging is able to differentiate between chemicals on an object, the technique won't just be used for reading books: it can be used to "read" anything that is organized in thin layers, such as the coating on a pill.

"So much work has gone into terahertz technology to get the sources and detectors working," said Laura Waller, an associate professor of electrical engineering and computer science at UC Berkeley. "This work is one of the first to use these new tools along with advances in computational imaging to get at pictures of things we could never see with optical technologies. Now we can judge a book through its cover."