Researchers Studied 160 Million Memes and Found Most of Them Come From Two Websites

A study of hundreds of millions of images from Reddit, 4chan, Gab, and Twitter reveals how memes spread.

Jun 12 2018, 5:27pm

Researchers at University College London developed a new way to measure how memes are made and spread. What they found won’t surprise anyone who’s peered into the darker parts of the internet in the last few years: The most toxic, yet most effectively spread, memes are first shared in two places, the subreddit r/the_donald and 4chan’s “politically incorrect” forum, called /pol/.

The researchers said they studied multiplatform meme ecosystems, with a focus on “fringe and potentially dangerous communities.”

“Considering the increasing relevance of digital information on world events, our study provides a building block for future cultural anthropology work, as well as for building systems to protect against the dissemination of harmful ideologies,” they added.

They’re not the first to think deeply and academically about the meme ecosystem, but the patterns they found bolster what we already knew about memes: given the sheer size and reach of these communities, you’re probably sharing images that were made to be distributed in toxic communities.

The researchers gathered a database of more than 100 million images from online communities known to generate lots of memes, including Reddit, Twitter, 4chan, and Gab. They also downloaded more than 700,000 images from […]. Using perceptual hashing, or pHashing, they ran an algorithm over these image databases to detect visually similar memes. pHashing extracts a compact “thumbprint” from each image, with visually similar images producing similar thumbprints, which makes it easier to find near-duplicates across a large database. They then clustered similar memes by community, and studied how patterns evolved over time.
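To make the idea of a perceptual “thumbprint” concrete, here is a minimal sketch in Python. It is not the researchers’ code: it uses an average hash, a simpler relative of the DCT-based pHash, and it works on plain lists of grayscale values rather than real image files. The function names, the 8x8 hash size, and the similarity threshold of 8 bits are all assumptions chosen for illustration.

```python
def average_hash(pixels, hash_size=8):
    """Return a 64-bit perceptual fingerprint of a grayscale image.

    `pixels` is a 2D list of brightness values (0-255) whose height and
    width are multiples of `hash_size`. This is an average hash, a simpler
    stand-in for pHash, used here only to illustrate the idea.
    """
    rows, cols = len(pixels), len(pixels[0])
    block_h, block_w = rows // hash_size, cols // hash_size

    # Shrink the image to hash_size x hash_size by averaging each block.
    small = []
    for r in range(hash_size):
        for c in range(hash_size):
            block = [
                pixels[y][x]
                for y in range(r * block_h, (r + 1) * block_h)
                for x in range(c * block_w, (c + 1) * block_w)
            ]
            small.append(sum(block) / len(block))

    # One bit per cell: 1 if the cell is brighter than the overall mean.
    mean = sum(small) / len(small)
    bits = 0
    for value in small:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(hash_a, hash_b):
    """Count the bits that differ between two fingerprints."""
    return bin(hash_a ^ hash_b).count("1")


def looks_similar(hash_a, hash_b, threshold=8):
    """Treat two images as near-duplicates if few bits differ.

    The threshold is an illustrative assumption, not a value from the study.
    """
    return hamming(hash_a, hash_b) <= threshold


# Three synthetic 16x16 "images": a left-to-right gradient, a slightly
# brighter copy of it, and a top-to-bottom gradient.
horizontal = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[value + 10 for value in row] for row in horizontal]
vertical = [[y * 16 for _ in range(16)] for y in range(16)]

h_original = average_hash(horizontal)
h_brighter = average_hash(brighter)
h_vertical = average_hash(vertical)

print("original vs brighter copy:", hamming(h_original, h_brighter))
print("original vs different image:", hamming(h_original, h_vertical))
```

The brighter copy hashes to the same fingerprint as the original because every cell keeps its position relative to the mean, while the unrelated image differs in many bits. Grouping images whose fingerprints sit within a small Hamming distance of each other is the basic idea behind clustering visually similar memes.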

From this information, they could compare clusters to entries in […] to determine the context of the memes, and whether they were racist, hateful, funny, or something else.

The researchers did not determine who actually made the memes, just where they were first shared on the public internet—as Motherboard has previously reported, many memes originally circulate in small Discord channels or on group chats before being posted to bigger platforms like Reddit or 4chan.

/pol/ had the highest volume of memes, while the_donald was the best at getting memes spread outside of its own community. Reddit and Twitter users shared more “fun” memes, they concluded, while /pol/ and Gab saw more racist or politically motivated images.

But within Reddit, subreddits like the_donald were still considered fringe by the researchers. “Reddit users are more interested in politics-related memes than other type of memes,” the researchers write in the study. “That said, when looking at individual subreddits, we find that The Donald is the most active one when it comes to posting memes in general. It is also the subreddit where most racism and politics related memes are posted.”

The researchers posit that this kind of algorithm—which they’ve made openly available—could be useful for social media platforms trying to automatically detect hateful content.

[h/t MIT Technology Review]