Buzzfeed Wants to Use Facebook for Political Polling: What Could Go Wrong?
In which I attempt to poll my Facebook friends with sentiment analysis software.
Image: Robert Scoble/Flickr
Back in November, Buzzfeed editor-in-chief Ben Smith announced that the listicle-turned-news site was partnering with ABC News and Facebook to start applying sentiment analysis to Facebook posts, which "may be the most important new source of political data in the 2016 elections."
Sentiment analysis is a type of text mining focused on extracting opinions. The field is in its infancy, and even Smith allows that it is "as famous for its pitfalls as for any successes," but he says he's confident that if anyone can pull it off, "it will be Facebook."
I decided to try sentiment analysis on my own Facebook feed, using a Python NLTK text classification demo made at Cornell's natural language processing lab. Since the classifier was trained on tweets and movie reviews, I had high hopes.
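The demo itself is a black box, but the basic idea behind this kind of classifier is straightforward. Here's a minimal sketch of a bag-of-words Naive Bayes sentiment classifier in plain Python — purely illustrative, not the Cornell demo's actual code, and the tiny hand-written training set below is a stand-in for the tweets and movie reviews it was really trained on:

```python
import math
from collections import Counter

# Toy training data standing in for tweets and movie reviews --
# invented for illustration, not the demo's real corpus.
train = [
    ("what a wonderful, uplifting film", "pos"),
    ("i loved every minute, great acting", "pos"),
    ("absolutely fantastic and heartwarming", "pos"),
    ("a dull boring mess of a movie", "neg"),
    ("i hated the terrible plot and awful dialogue", "neg"),
    ("complete waste of time horrible", "neg"),
]

def tokenize(text):
    return text.lower().replace(",", " ").split()

# Count word frequencies per class (the bag-of-words model).
word_counts = {"pos": Counter(), "neg": Counter()}
class_counts = Counter()
for text, label in train:
    class_counts[label] += 1
    word_counts[label].update(tokenize(text))

vocab = set(word_counts["pos"]) | set(word_counts["neg"])

def classify(text):
    """Naive Bayes with add-one smoothing: pick the class that
    maximizes P(class) * product over words of P(word | class)."""
    scores = {}
    for label in ("pos", "neg"):
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / len(train))
        for word in tokenize(text):
            score += math.log(
                (word_counts[label][word] + 1) / (total + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("a wonderful and heartwarming film"))  # pos
print(classify("what a boring waste of time"))        # neg
# Sarcasm defeats it: every word here reads as sincere praise.
print(classify("i just loved getting sick, fantastic"))  # pos
```

Note what the last line shows: the model only sees word frequencies, so a sarcastic post built from positive words comes back positive — exactly the failure mode I was about to run into.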
My attempt at sentiment analysis didn't go well, however.
A post from one Jewish friend about Heidegger's anti-Semitism came out as positive, as did a sarcastic post about buying Simply Orange juice right before the juice was recalled for making people sick.
I doubt this is the most sophisticated tool on the internet, and Facebook will surely be using something more advanced (the company did not respond to a request for an interview).
These posts would've been tough for any sentiment analysis software, however. The Heidegger post was not so much positive or negative as a reflection on thinking and morality. The Simply Orange post was straight sarcasm.
Smith admitted that sentiment analysis has had trouble with sarcasm, but judging from my own wall, there are layers upon layers of irony going on, to the point where even I only understand it thanks to long-term relationships with these people.
Granted, my feed is full of irony-drenched, sardonic Millennials, but why are you looking at Facebook data unless it's to take the temperature of that very demographic?
I emailed Annie Swafford, an assistant professor of Interdisciplinary and Digital Teaching at SUNY-New Paltz. Swafford wrote a series of blog posts critiquing the use of sentiment analysis to chart the plot of novels, saying "all approaches—from the lexicon-based approaches to the more advanced Stanford parser—have difficulty with anything that doesn't sound like a tweet or product review."
While Buzzfeed is collecting something close to the latter, she still thought that the news site was going to have some trouble.
"Are political posts more sarcastic than other types of Facebook posts?" was the first thing she asked. "If so (and it definitely seems possible), that would make it harder for the sentiment analysis algorithm to reliably detect the sentiment of the post (unless the user selected 'sarcastic' from the possible list of moods Facebook provides)."
Anecdotally, I haven't seen anyone use those moods that Facebook provides in quite a while. It never really seemed popular.
Swafford also pointed to the issue of linking up what we post on Facebook with how we actually feel about something.
"How does peer pressure affect what people post on social media?" she asked. "Traditional phone polling had a version of this issue (although the conversations were one-on-one, other family members may have been in the room), but with Facebook, this is a problem on a much larger scale: people may feel less able to say what they think if they're trying to fit in with friends or maintain good relationships with family members."
"Fuck this corporate hypocrite fake ass leader" came back "neutral"
"I imagine everyone can think of examples of occasions where they decided not to post political articles on Facebook to avoid protracted arguments on their Facebook walls. Some people might likewise feel compelled to 'like' or share a link they don't wholeheartedly support to keep up appearances. It would be tricky to find an algorithm that would take this dynamic into consideration."
Out of curiosity, I decided to see how a really histrionic post was read by my amateur sentiment analysis tool. I knew there would be a reliable supply of comments beneath any article posted by Vice (yeah, we see you). I was not disappointed.
"Fuck this corporate hypocrite fake ass leader. He is nothing but a bitch ass puppet. He should do his fuckin job and leader a country for a better future so our future generations do have to worry about war and poverty. Presidents aren't elected they are selected," posted someone whose profile picture was a sloth in a sweater, below a post of Shane Smith's interview with Obama.
This came back "neutral." Not to mock this guy's, uh, passion, but it's possible that sentiment analysis had trouble with the phrase "leader a country" so that "future generations do have to worry about war and poverty." Maybe it was just the use of the word "better."
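That "better" theory isn't far-fetched. The simplest sentiment tools — the lexicon-based approaches Swafford mentioned — just sum up per-word polarity scores, so one positive word can drag a furious post back toward zero. A toy sketch of the general technique (the word scores here are invented for illustration, not any real tool's lexicon):

```python
# A toy lexicon-based scorer -- a sketch of the general technique,
# not the actual tool used above. Word scores are invented.
LEXICON = {"better": 2, "great": 3, "fuck": -2, "hypocrite": -1, "fake": -1}

def score(text):
    """Sum the polarity scores of all known words; unknown words count 0."""
    words = text.lower().replace(",", " ").split()
    return sum(LEXICON.get(w, 0) for w in words)

def label(text, threshold=3):
    """Map the raw score to a coarse sentiment label."""
    s = score(text)
    if s > threshold:
        return "positive"
    if s < -threshold:
        return "negative"
    return "neutral"

print(label("fuck this fake hypocrite"))  # negative
# One positive word ("better") pulls the mixed rant back toward zero:
print(label("fuck this hypocrite fake leader, he should build a better future"))  # neutral
```

Under a scheme like this, a rant that mixes profanity with "better future" can land squarely on "neutral" — which would explain the sloth-in-a-sweater result.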
This reminded me of when Brandwatch used sentiment analysis software to analyze Gamergate tweets to see whether they blamed game developers or journalists: it found that 90 percent of the tweets read as "undetermined." Do you remember any tweets about Gamergate where the sentiment was unclear to you?
BuzzFeed and Facebook aren't claiming to replace polling. But if you want to believe—as Smith's post argues—that Facebook is a political arena that's worth attention, the sentiment analysis of social media seems fraught at best, and potentially politically dangerous at worst. When USA Today used ham-handed sentiment analysis to determine the "sentiment score" for each presidential candidate in 2012, a sociologist told Fast Company it was "irresponsible" and "borderline criminal."
Even under the best of circumstances, though, it was always going to be annoying.