Forensic Linguist Says 'Lodestar' Can't Tell Us Who Burned Trump in New York Times Op-Ed

Guessing the anonymous author of the Trump op-ed in The New York Times based on the use of the word ‘lodestar’ is “entirely useless,” according to a forensic linguist.

Sep 6 2018, 7:23pm


Earlier this week, an anonymous White House “senior official” declared themselves a member of “the resistance” against Donald Trump in a scathing op-ed where they called out Trump’s “amorality,” “repetitive rants,” and “impulsiveness,” which lead to “half-baked, ill-informed and occasionally reckless decisions.”

“We fully recognize what is happening. And we are trying to do what’s right even when Donald Trump won’t,” the senior official wrote in the essay published by The New York Times on Wednesday.

While the contents of the op-ed were explosive, the biggest question everyone seemed to wonder and speculate about was, well, who the hell authored it?

One popular theory is that the piece was written by Vice President Mike Pence. The theory, laid out by audio producer Dan Bloom on Twitter, relies on the op-ed author's use of the word “lodestar,” which you don't hear often in conversational English, but which Pence often uses in speeches. Other observers have pointed to other phrases used in the op-ed like “malign behavior,” apparently a favorite expression of Secretary of State Mike Pompeo, as potential telltale signs.


It's a compelling bit of detective work that anyone with access to the viral Times essay and Google can partake in: take seemingly uncommon phrasings from the op-ed, cross-reference them with what senior officials in the White House have written or said, and you'll start making interesting connections with red twine on your corkboard.

But Shlomo Argamon, a professor at the Illinois Institute of Technology who has studied forensic linguistics, told Motherboard that trying to identify the author based on a single word or two, such as “lodestar” or “malign behavior,” is just amateur forensic linguistics and “entirely useless.”

Forensic linguistics and stylometry are the sciences of analyzing texts and attributing them to authors based on their style. The FBI, for example, used to have a forensic linguist on staff.

“It’s almost the same as somebody writing a letter and signing it by somebody else's name, and then you think that that other person wrote it,” Argamon told me in a phone call.

That’s because these are words that can be used “consciously” by the author as a way to throw would-be unmaskers off the right track, as Argamon explained. “If ‘lodestar’ is so obviously a signature of Mike Pence,” Argamon said, “any anonymous person worth their salt, especially given the tenor and the content of the op-ed, would be trying to use that term.”

This is something that White House leakers allegedly already do. A White House official who gives anonymous quotes to the press recently told Axios that “to cover my tracks, I usually pay attention to other staffers’ idioms and use that in my background quotes.” There are even automated tools that attempt to anonymize written text.


So individual words are not the best indicators. Better clues, rather, include how frequently or infrequently the author uses function words like “and” and “or,” prepositions, relative clauses, conjunctions, or certain syntactic constructions. These are all elements of style that are “unconscious” and harder to control and imitate, according to Argamon.
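To make the idea concrete, here is a minimal Python sketch of what counting function words might look like. It is an illustration only, not Argamon's method: the word list is a small made-up sample, and the name `function_word_profile` is hypothetical. The sample text is a quote from the op-ed itself.

```python
import re
from collections import Counter

# Illustrative function-word list. This is an assumption for the sketch,
# not the feature set any forensic linguist quoted here actually uses.
FUNCTION_WORDS = ["and", "or", "the", "of", "to", "in", "that", "but", "which", "with"]

def function_word_profile(text, words=FUNCTION_WORDS):
    """Return each function word's rate per 1,000 tokens in the text."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {w: 0.0 for w in words}
    counts = Counter(tokens)
    return {w: 1000.0 * counts[w] / total for w in words}

sample = (
    "We fully recognize what is happening. And we are trying to do "
    "what's right even when Donald Trump won't."
)
profile = function_word_profile(sample)
for word, rate in profile.items():
    print(f"{word}: {rate:.1f} per 1,000 words")
```

A single sentence like this one yields rates that are far too noisy to mean anything. Measurements of this kind would only start to be informative over much longer texts.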

"None of us is directly aware of how frequently somebody uses the word 'and' versus the word 'or,'” Argamon told me. “We don't even really notice those words at a conscious level."

The other thing you need is a lot of data—many words. To successfully identify the author of the piece, one would have to compare this op-ed and whatever stylistic clues it reveals to several documents written by all the potential authors. And, ideally, you’d also need similar documents like other op-eds and essays, not transcripts of speeches or other texts that have different styles like chat conversations and emails.
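The comparison described above could be sketched roughly as follows. This is a toy, not a real attribution system: it swaps in a plain sum of absolute differences between function-word rates in place of the measures professionals use, and the author labels and texts are made-up placeholders, as are the names `rates`, `distance`, and `rank_candidates`.

```python
import re
from collections import Counter

# Illustrative function-word list (an assumption for this sketch).
FUNCTION_WORDS = ["and", "or", "the", "of", "to", "in", "that", "but"]

def rates(text):
    """Relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = max(len(tokens), 1)  # avoid dividing by zero on empty text
    return [counts[w] / total for w in FUNCTION_WORDS]

def distance(a, b):
    """Sum of absolute differences between two frequency profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))

def rank_candidates(unknown_text, candidate_texts):
    """Rank authors from most to least similar to the unknown text.

    candidate_texts maps an author label to a list of that author's documents.
    """
    target = rates(unknown_text)
    scores = {
        author: distance(target, rates(" ".join(docs)))
        for author, docs in candidate_texts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])

# Made-up placeholder texts, far too short for any real conclusion.
candidates = {
    "author_a": ["the cat and the dog and the bird", "the sun and the moon"],
    "author_b": ["to be or not to be", "to go or to stay"],
}
ranking = rank_candidates("the river and the hill and the road", candidates)
for author, score in ranking:
    print(author, round(score, 3))
```

Even this toy shows why the amount of data matters: with only a handful of words per author, one extra “and” shifts the whole ranking.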

It’s entirely possible to analyze text and identify who wrote it. In fact, researchers have shown that this can be done even with computer code. But it’s unlikely that the clues contained in this 900-something-word op-ed are enough.

“All attempts at IDing the author include the guesser’s knowledge—and biases—of the people at play, their motives, their personalities, their roles and circumstances,” Ariel Robinson, a linguist and policy strategist at the consulting firm Smooth Sailing Solutions, told me in an online chat. “Unless the guesser has equal information on all potential authors and approaches the task scientifically and analytically, I don’t think they can ID the author with 100 percent certainty.”
