Wikimedia and Twitter Bots Are Breaking the News

Robo-reporters are starting to get scoops.

Meghan Neal

Image: Mediagalleries/Twitter

We already knew that bots were writing news content, automating narrative stories from data-rich topics like sports scores and financial markets. Now, robo-reporters are starting to get scoops. They're not just writing stories; they're breaking them.

Thomas Steiner, a Google engineer in Germany, designed an algorithm that covers the news as it's breaking by monitoring activity on Wikipedia (old-school journalists everywhere are wincing) and watching for spikes in editing activity.

The idea is that if something big is happening—especially if it’s a global event—multiple editors around the world will be updating Wikipedia and Wikidata pages at once, in different languages. That spike in activity tips off the bot to the story. According to Steiner, his news bot spotted major stories like the Boston Marathon bombing and the disappearance of Malaysia Airlines MH370.
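To make the idea concrete, here is a minimal sketch in Python of spotting a spike in concurrent edits. It is not Steiner's code: the `Edit` record, the `SpikeDetector` class, the five-minute window, and the thresholds of three editors and two languages are all assumptions made for illustration, and the sketch assumes edits arrive in time order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    """One edit event (hypothetical record, not a real Wikipedia API type)."""
    timestamp: float  # seconds since some reference point
    language: str     # language edition, e.g. "en" or "de"
    article: str      # language-independent key for the topic (assumed)
    editor: str       # user name or IP of the editor


class SpikeDetector:
    """Flags a topic when enough distinct editors, working in enough
    different languages, edit it within a sliding time window.

    All default thresholds are illustrative assumptions.
    """

    def __init__(self, window_seconds=300.0, min_editors=3, min_languages=2):
        self.window_seconds = window_seconds
        self.min_editors = min_editors
        self.min_languages = min_languages
        self._recent = defaultdict(deque)  # article -> recent edits

    def observe(self, edit):
        """Record an edit; return True if its topic now looks like breaking news."""
        window = self._recent[edit.article]
        window.append(edit)

        # Drop edits that have fallen out of the sliding window.
        cutoff = edit.timestamp - self.window_seconds
        while window and window[0].timestamp < cutoff:
            window.popleft()

        editors = {e.editor for e in window}
        languages = {e.language for e in window}
        return (len(editors) >= self.min_editors
                and len(languages) >= self.min_languages)


if __name__ == "__main__":
    detector = SpikeDetector()
    stream = [
        Edit(0, "en", "Q1", "alice"),
        Edit(30, "de", "Q1", "bernd"),
        Edit(45, "fr", "Q1", "chloe"),
    ]
    for edit in stream:
        if detector.observe(edit):
            print("possible breaking news:", edit.article)
```

Requiring several editors and more than one language is what separates a global event from a single busy editor tidying up one page, which is the intuition described above.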

The bare-bones site tracking real-time editing is called Wikipedia Live Monitor. It was first created last year, and now Steiner has extended his robo-news operation to Twitter. The bot mines the social media site for a particular search term triggered by the Wikipedia activity and pulls out all relevant photos to illustrate the story.

"We have connected the world of breaking news events based on detected concurrent Wikipedia and Wikidata edits with the world of social network sites," said Steiner in a paper recently published on Cornell's arXiv preprint server and picked up by MIT Technology Review yesterday.

You can check out the visual news events on the Twitter bot account @mediagalleries. The earliest are from a case study Steiner did to test out the program during the Olympics in Sochi. More recently, there are galleries illustrating major sports events, and the latest updates to flight MH370 and the conflict in Crimea.

As you can see, it's still a rudimentary process, hardly about to put the staff of the New York Times out of business. But it says a lot about the direction automated news is heading in.

Finding stories by scanning Twitter is nothing new; journalists do it all the time, like a modern-day wire service. But spotting them through algorithms and reporting them via bots, with no human middle man anywhere in the process, is a different ballgame.

It's sensible enough: When people around the globe are connected and online, there's little reason not to know everything that happens the instant it does, and Steiner's work is just the latest small attempt at harnessing that huge reporting power. "The Olympics being an event of common interest, an even bigger majority of people share the event in a multitude of languages on global social network sites, which makes the event an ideal subject of study," he wrote in the paper.

Still, the Fourth Estate is one of the more disconcerting industries being taken over by robots, and not just because it’s my own livelihood. And it’s more common than you think; Kristian Hammond, cofounder of Narrative Science, a company that's been automating content for several years now, predicted that 90 percent of the news could be written by computers by 2030.

Last week, a robot wrote a breaking report of an earthquake in Los Angeles, published by the LA Times. That's a top story in a major publication. And we wrote about how the recently announced BuzzFeed-Whisper partnership means viral stories will now be automated, too, creating original content from crowdsourced data. It's just like Steiner’s Wikipedia bot, except thousands of people will be reading them.