This story is over 5 years old.

Meet the Anti-Turing Test

Did a human produce this nonsense, or was it a computer?

Can a computer produce nonsense of sufficient nonsensical quality to convince a reader that they're actually reading James Joyce? Or, rather, that they're reading James Joyce's unreadable opus Finnegans Wake, literature's towering force of impenetrability?

Or, rather, can a James Joyce convince a reader that they're not reading nonsense produced by a computer program and are indeed consuming high literature?

This is the question posed by data scientist Will Kurt on his probability blog Count Bayesie. Kurt spent the holiday weekend tinkering with some Lua code released by Andrej Karpathy's Hacker's Guide to Neural Networks that implements a very strange and very powerful algorithmic scheme known as a recurrent neural network (RNN).

In the simplest of terms, an RNN is a neural network that features a sort of feedback memory. While a usual NN might be viewed as a flow of computation along some set of connections between nodes, like an information river, an RNN adds information whirlpools. That is, in an RNN it's possible to go not only from a node to the next node, but also from a node back to itself in a self-loop, or back to previous nodes.

It's a way of giving the network a sort of memory: the RNN doesn't strictly need new input to keep computing, because traces of previous inputs keep circulating between nodes and shaping whatever comes next. The nodes are always being activated.
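
For the curious, here is a rough sketch of that feedback loop in plain Python. It is not Karpathy's Lua code, and the names (rnn_step, W_xh, W_hh) and the tiny layer sizes are assumptions made up for illustration. The weights are random and untrained, so the point is only to show the internal state continuing to change after the input has gone quiet.

```python
import math
import random


def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One recurrent step: the new state depends on the input AND the previous state."""
    h_new = []
    for i in range(len(h_prev)):
        total = b[i]
        # Ordinary feed-forward contribution from the current input.
        total += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        # The "whirlpool": contribution fed back from the previous state.
        total += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new


def make_matrix(rows, cols, rng):
    """Random, untrained weights (illustrative values only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
input_size, hidden_size = 3, 4  # hypothetical sizes
W_xh = make_matrix(hidden_size, input_size, rng)
W_hh = make_matrix(hidden_size, hidden_size, rng)
b = [0.0] * hidden_size

# Feed one real input, then nothing but zeros.
h = rnn_step([1.0, 0.0, 0.0], [0.0] * hidden_size, W_xh, W_hh, b)
state_after_input = h

zero_input = [0.0] * input_size
for _ in range(3):
    h = rnn_step(zero_input, h, W_xh, W_hh, b)
    print([round(v, 4) for v in h])
```

Each printed line is the state after another step of silence: the earlier input is still echoing through the feedback weights. Actual learning, meaning the adjustment of those weights, happens separately during training.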

A neural network like this is very handy for a few applications in particular: speech recognition, music composition, and learning "formal grammars." It's this last one that Kurt was after in his Finnegans Wake experiment.

"RNNs are particularly well suited for learning a strange text like Finnegans Wake because they learn one character at a time, rather than whole words or 'tokens,'" he writes. "Most generative language models (such as Markov Chains) assume a consistent vocabulary in the language, a rule which Finnegans Wake refuses to play by."

"It is at least some credit to the book that the majority of Natural Language Processing techniques would be as baffled by the text as many readers," Kurt continues.
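
Kurt's point about vocabulary can be made concrete with a small sketch. The word list below is a hypothetical stand-in for a model's fixed vocabulary, and the passage is a lowercased fragment of the book's opening. A word-level model trips over Joyce's coinages, while a character-level model sees nothing outside its alphabet.

```python
# Hypothetical, deliberately tiny "known English" vocabulary (an assumption
# for illustration, not a real model's word list).
english_words = {
    "the", "great", "fall", "of", "entailed", "at",
    "such", "short", "notice",
}

# A character-level model only needs the alphabet plus a space.
alphabet = set("abcdefghijklmnopqrstuvwxyz ")

passage = (
    "the great fall of the offwall entailed at such short notice "
    "the pftjschute of finnegan"
)

# Word level: anything not in the vocabulary is out of reach.
unknown_words = [w for w in passage.split() if w not in english_words]

# Character level: every symbol is already known.
unknown_chars = [c for c in passage if c not in alphabet]

print("words the word-level model cannot represent:", unknown_words)
print("characters the character-level model cannot represent:", unknown_chars)
```

The word-level list fills up with coinages such as "offwall" and "pftjschute", while the character-level list stays empty, which is the sense in which the book refuses to play by the rule of a consistent vocabulary.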

From Finnegans Wake:

The great fall of the offwall entailed at such short notice the pftjschute of Finnegan, erse solid man, that the humptyhillhead of humself prumptly sends an unquiring one well to the west in quest of his tumptytumtoes: and their upturnpikepointandplace is at the knock out in the park where oranges have been laid to rust upon the green since devlinsfirst loved livvy.
What clashes here of wills gen wonts, oystrygods gaggin fishygods! Brékkek…

If that passage didn't send you immediately racing off to the library, here is the text that Karpathy's RNN produced after learning the Joyce text character by character:

Oracia not only soen for em. Praye in a bitary, like Seemeryth! My minnalnon, in his profish on. Soon from Nila! Well, Anna Mae, make, me hoven, I feel ox in Ir such of an Dona, the Oxentricies seen enough fatury bonefit!, Tooking, the worth and the cabbound marry, his fattime you boot, in jarque to the tare, that, this a man and in cottestian, which she starse to liet, a power or forey foot lips and reprobed you upon the lesh field thinaindus, place, lie and leave…

Non-English professors might be suitably convinced that the RNN-produced text is indeed the product of Joyce or at least a close nonsense-aficionado follower.

Here is Kurt's point: "The question that RNNs allow us to start answering about Finnegans Wake is in the difference between the original and the simulation. If we cannot truly distinguish between Finnegans Wake and the output of an RNN, then, despite Joyce's best efforts, his work truly must be viewed as a literary curiosity with no more meaning than a Rorschach Test. However, if studying the two leads us to see a clear difference, then that difference itself is precisely what makes Joyce's text something other than nonsense."

And that's the anti-Turing Test.

The Karpathy code, by the by, doesn't look too hard to get up and running and might even be pretty fun to experiment with. I'll have to get back to you on that.