This Tiny Picture on Twitter Contains the Complete Works of Shakespeare

“I was just testing to see how much raw data I could cram into a tweet,” the computer science undergraduate behind the demonstration said.

Oct 31 2018, 1:46pm

Image: Creative Commons, Wikipedia.

What if you could fit entire literary collections on Twitter? Scrap the 280-character limit: this week, a researcher demonstrated how it’s possible to squeeze the complete works of Shakespeare into a single, tiny image included with one tweet.

Steganography, the practice of hiding data inside other files such as images, is not new to Twitter. But this tweet still provides a stark example of how information, malicious or otherwise, can sometimes hide in plain sight.

The Shakespeare data is hosted “all on Twitter,” David Buchanan, the computer science undergraduate behind the demonstration, told Motherboard in an online chat.

The trick works by exploiting how Twitter handles image metadata. Buchanan explained that Twitter strips most metadata from images, but the service leaves one particular type, the ICC color profile, untouched. This is where Buchanan stored his data of choice, including ZIP and RAR archives.

“So basically, I wrote a script which parses a JPG file and inserts a big blob of ICC metadata,” he said. “The metadata is carefully crafted so that all the required ZIP headers are in the right place.” This process was quite fiddly, he added, saying it took a few hours to complete, although he wrote the script itself over a span of a couple of months.

“I was just testing to see how much raw data I could cram into a tweet and then a while later I had the idea to embed a ZIP file,” Buchanan added.
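For readers curious about what inserting “a big blob of ICC metadata” can look like in practice, here is a minimal Python sketch. It is not Buchanan's script, which the article does not reproduce, and it makes no attempt at the fiddly part he describes, namely arranging the bytes so that ZIP headers land in the right place. It only shows the standard way ICC data is split across JPEG APP2 segments. The function names `build_icc_segments` and `insert_icc_blob` are hypothetical, chosen for illustration.

```python
import struct

# Assumption: standard ICC-in-JPEG embedding. Each APP2 segment carries a
# 12-byte tag, a 1-based chunk index, the total chunk count, then the data.
ICC_ID = b"ICC_PROFILE\x00"
# A segment's 16-bit length field counts itself (2 bytes) plus the body.
MAX_CHUNK = 0xFFFF - 2 - len(ICC_ID) - 2  # 65519 data bytes per segment


def build_icc_segments(payload: bytes) -> bytes:
    """Split payload into APP2 segments tagged as ICC profile chunks."""
    chunks = [payload[i:i + MAX_CHUNK] for i in range(0, len(payload), MAX_CHUNK)]
    if len(chunks) > 255:  # the chunk count is stored in a single byte
        raise ValueError("payload does not fit in 255 ICC chunks")
    out = bytearray()
    for index, chunk in enumerate(chunks, start=1):
        body = ICC_ID + bytes([index, len(chunks)]) + chunk
        out += b"\xff\xe2" + struct.pack(">H", len(body) + 2) + body
    return bytes(out)


def insert_icc_blob(jpeg: bytes, payload: bytes) -> bytes:
    """Return a copy of the JPEG with payload stored as ICC metadata."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    # Keep a leading JFIF (APP0) header in place, if the file has one.
    if jpeg[2:4] == b"\xff\xe0":
        (length,) = struct.unpack(">H", jpeg[4:6])
        pos = 4 + length
    return jpeg[:pos] + build_icc_segments(payload) + jpeg[pos:]
```

Calling `insert_icc_blob(open("cover.jpg", "rb").read(), data)` would return new JPEG bytes with `data` stored in the ICC slot (`cover.jpg` is a placeholder file name). Because each chunk carries its own 18 bytes of marker and header, the payload is not contiguous in the output file, which is presumably part of why lining up a valid ZIP structure takes careful crafting.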

Got a tip? You can contact Joseph Cox securely on Signal on +44 20 8133 5190, OTR chat on jfcox@jabber.ccc.de, or email joseph.cox@vice.com.

In a follow-up tweet, Buchanan provided instructions on how anyone can pull data from the image. Motherboard verified that the image does indeed contain a bevy of files with works by Shakespeare nestled within them.
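The article does not reproduce Buchanan's instructions, so the following is only a generic sketch of how data stored in a JPEG's ICC segments could be read back out, assuming the standard APP2 chunk layout. The function name `extract_icc_payload` is hypothetical.

```python
import struct

ICC_ID = b"ICC_PROFILE\x00"  # 12-byte tag that opens every ICC APP2 segment


def extract_icc_payload(jpeg: bytes) -> bytes:
    """Collect the data carried in a JPEG's ICC (APP2) segments, in order."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    chunks = {}
    pos = 2
    # Walk the marker segments that precede the compressed image data.
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # start of scan: no more metadata segments
            break
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        body = jpeg[pos + 4:pos + 2 + length]
        if marker == 0xE2 and body.startswith(ICC_ID):
            sequence_number = body[12]  # 1-based chunk index
            chunks[sequence_number] = body[14:]  # skip index and chunk count
        pos += 2 + length
    # Reassemble by chunk index, whatever order the segments appeared in.
    return b"".join(chunks[n] for n in sorted(chunks))
```

The returned bytes are whatever was stored in the ICC slot. Whether they form a usable archive on their own depends on how the original file was crafted.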

There could be more nefarious applications for this sort of technique beyond sharing literature. Files hidden this way could potentially act as part of the infrastructure for controlling malware while sitting out in the open: traffic from an infected computer calling out to Twitter is probably going to look less suspicious than traffic phoning home to an unknown server elsewhere.

Buchanan agreed that malware distribution would be a potential use case, and added that “it already has been possible via more ‘traditional’ steganography techniques, but this method allows you to pack in way more data.”

Buchanan said he reported the technique to Twitter via bug bounty platform HackerOne, but Twitter did not seem very interested. In a tweet, Buchanan said Twitter did not think the issue was a bug.

Twitter did not immediately respond to a request for comment.