Yes, Please: An Algorithm for Fact Checking the Internet

Researchers use graph theory to sniff out junk information.

Despite the claims of print journalism's anxiety-stricken old guard, fact checking hasn't vaporized under the bright lights of high-BPM internet writing. If anything, that pace has forced editors and writers to crank up the obsessiveness accordingly, because the bullshit onslaught is nowadays just staggering and ceaseless.

So, yes, there is still the bullshit, but it's hard to really complain about its lack of fact checking because peddlers of internet bullshit tend to know they're pitching bullshit. And they know that internet consumers, or some portion of them, will latch onto bullshit because it's their kind of bullshit, serving some or another popular bullshit outlook.

Snopes truly does the lord's work in debunking bullshit, but that's just a single ray of light in a deep, dark sea. What if the internet could bullshit-check itself? Maybe we could someday just push a little poo icon in Chrome and the bullshit would get flagged. It's a possibility, according to a recent paper published by a team of computer scientists based at Indiana University and Portugal's Instituto Gulbenkian de Ciencia.

The paper, "Computational fact checking from knowledge networks," outlines an approach to BS detection using a shortest-path problem in graph theory. First, a questionable statement is broken apart into three pieces: a subject, a predicate, and an object, which might look like this: "Socrates," "is a," "person."

Next, we take the subject and object of that statement (Socrates, person) and assign them to nodes. Nodes in graph theory are connected by "edges," which are just lines. These are the predicates. This is how you build a knowledge graph, a collection of things (subjects and objects) connected by various sorts of predicates (relations).
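To make that concrete, here is a minimal Python sketch of turning subject-predicate-object triples into a tiny knowledge graph. It is not the researchers' code: the `Triple` type, the `build_knowledge_graph` helper, the made-up facts, and the choice to store every edge in both directions are all assumptions for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement split into its three pieces (hypothetical type)."""
    subject: str
    predicate: str
    obj: str


def build_knowledge_graph(triples):
    """Map each node to {neighbor: predicate}.

    Assumption: edges are stored in both directions, so the graph
    can be walked from either end of a statement.
    """
    graph = defaultdict(dict)
    for t in triples:
        graph[t.subject][t.obj] = t.predicate
        graph[t.obj][t.subject] = t.predicate
    return graph


# Toy facts for illustration only; a real graph would be built from Wikipedia.
triples = [
    Triple("Socrates", "is a", "person"),
    Triple("Socrates", "born in", "Athens"),
    Triple("Athens", "located in", "Greece"),
]

kg = build_knowledge_graph(triples)
print(kg["Socrates"])
```

Subjects and objects become dictionary keys (the nodes), and each predicate rides along as the label on the edge between them.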

This is what a knowledge graph might look like (albeit for a different purpose):

Things and relationships. If you were to take, say, Wikipedia topics and do the same thing, you would get a very, very big graph. But this is how the researchers behind the current paper approached the problem: building a knowledge graph of Wikipedia that can be used as a reference for their fact-checking system.

Basically, the scheme would take a statement to be checked, dig out its subject and object, and then find where they're located in the Wikipedia knowledge graph. If they happen to be connected by just a single edge, then, hey, gold star. Otherwise, the accuracy is determined by the distance between the two nodes across the graph.

"Given a new statement, we expect it to be true if it exists as an edge of the [knowledge graph], or if there is a short path linking its subject to its object within the KG," the paper explains. "If, however, the statement is untrue, there should be neither edges nor short paths that connect subject and object." Seems reasonable.
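The "short path" idea can be sketched with a plain breadth-first search. This is a simplified stand-in, not the paper's actual metric: the toy graph, the function names, and the `1 / distance` scoring rule are assumptions chosen only to show how a direct edge, a longer path, and no path at all would be told apart.

```python
from collections import deque


def shortest_path_length(graph, start, goal):
    """Breadth-first search; returns the number of edges, or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def truth_score(graph, subject, obj):
    """Hypothetical score: 1.0 for a direct edge, smaller for longer paths, 0.0 for none."""
    dist = shortest_path_length(graph, subject, obj)
    if dist is None:
        return 0.0
    return 1.0 / max(dist, 1)


# Toy undirected graph (made-up facts): node -> set of neighbors.
graph = {
    "Socrates": {"person", "Athens"},
    "person": {"Socrates"},
    "Athens": {"Socrates", "Greece"},
    "Greece": {"Athens"},
    "Atlantis": set(),
}

print(truth_score(graph, "Socrates", "person"))    # direct edge
print(truth_score(graph, "Socrates", "Greece"))    # two hops away
print(truth_score(graph, "Socrates", "Atlantis"))  # no path
```

A statement whose subject and object share an edge scores highest, one linked only through intermediate nodes scores lower, and one with no connecting path scores zero.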

And it worked ... sometimes. You can see the results graphically below, where the diagonal represents the true statement and the rest are experimental results: how close to true the system was able to auto-classify true statements.

Ideally, the internet would just be filled with a lot less bullshit, and perhaps some Chrome bullshit blocker plug-in will bring us closer to that day. One can hope.