The Infinite Loop Designed to Break the Data Mining Market

An endless loop of fake data.

Jul 29 2015, 11:00am

The internet is a big, dark monetization machine. People log in, share some memes, and money shoots out of a wormhole-like vortex on the other end. In the middle are data brokers, businesses that collect and buy your data in order to resell it to other companies.

It's a neat little system, and now a group of three New York-based developers says it has created a program called Data Arbitrage to destroy it.

You probably don't know who the data brokers buying and selling your information are, and penetrating the industry's veil of secrecy has been the focus of many a research project and government report. Some startups, like Datacoup, have tried to make data brokerage more transparent by letting customers sell their own data and receive a cut of the revenue.

But what if the very idea of willingly turning the details of your shopping or entertainment habits into cash is untenable even if you get a cut?

Enter Data Arbitrage. The program automatically sells users' social media data to brokers and then uses the resulting funds to buy fake accounts, which in turn provide more data to sell. A bot called "ArbiBot" lives on the user's computer and generates posts from the fake accounts. The goal is to eventually generate so much noisy, fake data that the whole enterprise of data brokerage is rendered completely useless.

"So far we have it running on Twitter, Instagram, and Facebook," a Data Arbitrage spokesperson, who asked to remain anonymous because of legal concerns (at least until the program is released to the public on August 4th), told me in an email. "We plan to branch out to other social media sites such as Google+ and LinkedIn. We are also interested in diluting credit card information, which we are on track to accomplish in the coming months."

Until the launch date, it's unclear how well Data Arbitrage works; the developers would not give me a copy of the program, and making an appreciable impact will likely depend on actually building up a user base of people willing to give up their data to the scheme. But similar projects already do exist. Google Will Eat Itself, for example, seeks to buy Google out slowly by serving Google ads on a network of secret sites and using the resulting cash to buy Google shares.

The core concept in both cases is "arbitrage," a financial term for profiting from a price imbalance between two markets. Here, the imbalance is between what brokers pay for social media data and what fake accounts cost: the money earned from selling data is used to buy tons of fake accounts, which are easy to make and cheap as dirt. It's a proven concept in the world of finance, but for data, the question remains: will it work?
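To get a feel for how quickly such a loop could compound, here is a rough back-of-the-envelope sketch in Python. It is not Data Arbitrage's code, which the developers have not released. The function name, the prices, and the assumption that brokers pay the same for fake data as for real data are all hypothetical, chosen only to illustrate the sell-then-buy cycle described above.

```python
def simulate_loop(real_accounts, price_per_account, cost_per_fake, rounds):
    """Return the fake share of all accounts after each sell-and-buy round.

    All inputs are hypothetical. Prices are in whole cents, and the model
    assumes brokers pay the same for fake accounts' data as for real data.
    """
    fake_accounts = 0
    shares = []
    for _ in range(rounds):
        # Sell one round of data from every account, real and fake.
        revenue = (real_accounts + fake_accounts) * price_per_account
        # Spend the revenue on as many whole fake accounts as it will buy.
        fake_accounts += revenue // cost_per_fake
        shares.append(fake_accounts / (real_accounts + fake_accounts))
    return shares


# Made-up numbers: 100 real users, 50 cents per account's data,
# 10 cents per fake account, three rounds.
for round_number, share in enumerate(simulate_loop(100, 50, 10, 3), start=1):
    print(f"round {round_number}: {share:.1%} of accounts are fake")
```

With these invented figures, the fake share climbs from roughly 83 percent after one round to more than 99 percent after three. The sketch leaves out anything that would slow the loop down, such as brokers detecting and rejecting fake accounts or paying less for low-quality data.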

"I think it's safe to assume that social media bots denigrate the overall quality of mined and brokered data," said Matt Hogan, a spokesperson for Datacoup. However, Hogan said, social media profiles are just one avenue for data creation and collection by brokers; simply browsing the web is another.

"If Data Arbitrage is just focused on social data, it may not have the disruptive effect that they hope, but who knows?" Hogan said. "I think they are definitely tapping into a visceral feeling that a lot of people have about the data economy."

When I asked Data Arbitrage's anonymous spokesperson where the data generated through the program will be sold, the spokesperson refused to answer on the grounds that the group fears being shut down. This concern may be justified, since Hogan told me that Datacoup seeks out fake and duplicate accounts and removes them from its system.

Data Arbitrage appears ready for a war of attrition with data brokers, and the spokesperson told me they will keep working on making ArbiBot more advanced in order to slip past brokers. Of data brokers' reaction to the system, the spokesperson said, "We expect them to fight back."