The Mathematicians Trying to Prove that Every Language Is Inherently Positive

Language is “our great social technology,” and it’s an inherently positive one, according to new research.

Language is either a tool, a structure, or a meaningless soup of signs, depending on who you ask. But regardless of what social function it ultimately performs, all languages have an inherent positive bias, according to new research led by a team of mathematicians at the University of Vermont.

The researchers built a database of billions of individual words from 10 different languages using a variety of sources: Google Books, Twitter, movie subtitles, and song lyrics, to name just a few. They then compiled a list of the most commonly used words in the languages they studied, and crowdsourced labels that described whether they held positive or negative connotations. After analyzing the occurrence of positive or negative-leaning words across all their sources, the team found that every language they analyzed skewed positive; words that conveyed sentiments of happiness or love outweighed the negative.
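The counting step described above can be sketched in a few lines of Python. This is only an illustration, not the team's actual code: the `RATINGS` table, its 1-to-9 scale values, and the function names `tokenize` and `weighted_mean_rating` are all assumptions made up for the example, and a real analysis would draw on thousands of crowdsourced ratings per language.

```python
from collections import Counter

# Hypothetical crowdsourced ratings: 1 is most negative, 9 is most positive,
# 5 is neutral. These numbers are invented for illustration only.
RATINGS = {"love": 8.0, "happy": 8.0, "the": 5.0, "sad": 2.0, "hate": 2.0}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [w.strip('.,;:!?"\'').lower() for w in text.split()]


def weighted_mean_rating(tokens, ratings):
    """Average the ratings of all rated words, weighted by how often each occurs.

    Returns None when no token has a rating.
    """
    counts = Counter(t for t in tokens if t in ratings)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(ratings[word] * n for word, n in counts.items()) / total


if __name__ == "__main__":
    corpus = "Love the happy days, hate the sad ones. Love the rest."
    score = weighted_mean_rating(tokenize(corpus), RATINGS)
    # A score above the neutral midpoint of 5 means the text skews positive.
    print(score)
```

Because frequent words count more than rare ones, a language can contain plenty of negative vocabulary and still score above neutral overall if people reach for the positive words more often.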

"One could think of language as our great social technology," said Peter Dodds, the study's lead author. "We tend to, overall, veer towards the positive. Language is gluing us together, and we're glued together for positive reasons. That being said, it's still a surprise."

Semantic analysis of large word databases isn't new, and Dodds and his colleagues performed a similar experiment in 2012. That time, they applied their analysis only to the English language. Dodds told me this latest experiment, outlined in a paper published today in the Proceedings of the National Academy of Sciences, represents a leap forward in terms of the scope of data-based approaches to language analysis.

By looking only at individual words, however, it's possible that Dodds and his colleagues are missing out on the context that the words are placed in. For example, the sentence "Happiness is an old friend, long lost," contains two potentially positive words ("happiness" and "friend") and only one negative one ("lost"), yet there's no doubt that it is a negative sentence on the whole.

"You have to start with words, because that's how languages have evolved," said Dodds. "We break it down by words, and it turns out that the tools we can use in these studies are very powerful and output sentiment scores that match up with Gallup polls."

To test the validity of their findings, the team applied their analysis to books and mapped the peaks and valleys of their emotional content. Fyodor Dostoyevsky's Crime and Punishment ends on a low note, for example, and the researchers' graph illustrates this with a dip in mood near the finale. "When we apply them to books, dips in the sentiment graphs match up with dips in the book's narrative—the proof is in the pudding," Dodds told me.
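One plausible way to draw that kind of graph is to slide a fixed-size window across the text and average the word ratings inside it. The sketch below assumes exactly that; the function name `sentiment_arc`, the window and step sizes, and the toy ratings are hypothetical and are not taken from the paper.

```python
def sentiment_arc(tokens, ratings, window=4, step=2):
    """Return the mean rating of each sliding window of tokens.

    Windows that contain no rated words produce None.
    """
    arc = []
    last_start = max(len(tokens) - window, 0)
    for start in range(0, last_start + 1, step):
        scores = [ratings[t] for t in tokens[start:start + window] if t in ratings]
        arc.append(sum(scores) / len(scores) if scores else None)
    return arc


if __name__ == "__main__":
    # Invented ratings on a 1 (negative) to 9 (positive) scale.
    toy_ratings = {"joy": 8.0, "love": 8.0, "grief": 2.0, "death": 2.0}
    # A toy "book" that opens happily and ends on a low note.
    toy_book = ["joy", "love", "joy", "love", "grief", "death", "grief", "death"]
    print(sentiment_arc(toy_book, toy_ratings))
```

On a real novel the window would span thousands of words rather than four, which smooths out individual sentences and leaves only the broad rises and dips of the narrative.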

There are philosophical questions to grapple with when it comes to how language reflects reality, as well. Is language a driver of action, forming the substrate of existence by constituting a frame through which we interpret events, or is it something to be wielded freely in a Derridean carnival of gleeful deconstruction? In other words, is language inherently positive, or are we just happy people in general?

Dodds told me that these are contentious questions among linguists, and that his team's work is meant to start experts on the path to answering them empirically. "We have to get the atoms (words) sorted first, and then we can tackle the molecules," said Dodds. That is, if the internet doesn't wipe out most of the world's languages first and we're all left speaking emoji.