Data Visualization Experts Say Search Trend Maps Are Mostly Bunk

Google’s new map shows Canada’s “most misspelled words.”

Jun 8 2017, 5:16pm

Image: Google Canada

Google Trends has put out a map showing what it says are the "most misspelled words" in Canada so far this year. According to this visualization, people in Quebec have trouble spelling "blueberry," while in the Northwest Territories, residents struggle with "facetious" (that's a tough one for sure), and Nova Scotians aren't sure how to spell "yacht."

Canadians have understandably gobbled this up—we love nothing more than comparing ourselves to others, and silly stuff like this is catnip to the media. Maps like this pop up all the time: Google did something similar in the US in May, and in Canada, they've previously tracked people's favourite donuts, province by province. Even Pornhub recently did a map of most commonly misspelled searches, state by state.

Maps like these are fun, but they leave some experts rolling their eyes.

"This combines two of my pet peeves," Robert Kosara, a research scientist with Tableau Software in Seattle, who does data visualization research, told me over the phone. "Maps being used in weird ways, and rankings."

He recently tweeted a link to an xkcd comic poking fun at this.

"Many web companies use maps like this in viral marketing, but the methodology behind them is pretty weak," a wiki dedicated to explaining the webcomic says.

Random noise in the data can create what look like significant differences between regions where there really aren't any—doesn't it seem weird that this whole map of Canada doesn't just say "accommodate" and "accommodation," which are actually very hard to spell?

And why are Labradorians flummoxed by "precious," and Yukon residents by "altar"?

It also isn't really fair to compare Nunavut (population 37,000) to Ontario (13.9 million), for example: the number of searches in each makes for vastly different denominators.
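The effect of small denominators can be sketched with a short simulation. This is only an illustration under made-up assumptions, not Google's methodology: the word list, the near-equal weights, and the per-region search counts are all hypothetical. Every simulated region draws from the same underlying distribution, so any disagreement about the "top" word is pure sampling noise.

```python
import random
from collections import Counter

# Hypothetical words and weights (not Google data): one word is only
# slightly more common than the rest, and this is true in every region.
WORDS = ["accommodate", "pneumonia", "yacht", "facetious", "blueberry"]
WEIGHTS = [1.05, 1.0, 1.0, 1.0, 1.0]


def top_word(n_searches, rng):
    """Draw n_searches from the shared distribution; return the most frequent word."""
    sample = rng.choices(WORDS, weights=WEIGHTS, k=n_searches)
    return Counter(sample).most_common(1)[0][0]


def distinct_winners(n_searches, n_regions=13, seed=0):
    """Set of 'top words' across regions that all share one distribution.

    n_regions defaults to 13 (Canada's ten provinces and three territories).
    """
    rng = random.Random(seed)
    return {top_word(n_searches, rng) for _ in range(n_regions)}


if __name__ == "__main__":
    for n in (50, 200_000):
        print(n, "searches per region:", sorted(distinct_winners(n)))
```

With only a few dozen searches per region, the simulated regions tend to crown several different winners; with hundreds of thousands, they should all agree on the slightly more common word.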

Donut searches by province (2016). Image: Google Canada

"If you look at a word or a search term, it's going to be some tiny fraction of a percentage," Kosara said. That can make the top word in the ranking seem like a big deal, but what was the second-most searched? The third? If we saw a top-ten list in each region, there might be more similarities across provinces, instead of such weirdly huge differences.

I talked to Alexandra Hunnings at Google Canada, who noted that the visualization is based on searches relating to how to spell various words. "It isn't a poll," she said. "It boils down to what people are typing in the search bar."

In a follow-up email, she explained that Google Trends data is an "unbiased sample of Google search data." It's anonymized, categorized (determining a topic for a search query) and aggregated, allowing Google to measure interest in a particular topic across search.

In Canadian media, a lot was made of the fact that British Columbia's "most misspelled word" was "pneumonia," which had some people wondering if the rain had more people concerned about being sick. Nunavut's word, meanwhile, was "anxiety."

To Andrew Piper, a professor and director of the .txtLAB at McGill University in Montreal, that's another potential pitfall of data visualizations like these: people read a lot into them. "Just picking the top [word] isn't a good way of representing what comes across as really important semantic implications," he said. "Are people googling 'pneumonia' because it's raining in BC? We have no idea."

One final, and possibly quibbling, point. If people are googling how to spell something, Piper added, it doesn't mean they never knew how to spell it to begin with. The real problem, of course, is the spelling mistakes we don't even realize we're making.

Correction: An earlier version of this piece said New Brunswick residents were looking up how to spell "yacht." Actually it was people in Nova Scotia. The piece has been updated.
