Harvard researchers found fewer instances of Wikipedia censorship after the site started encrypting all of its traffic.
Image: Wikimedia Commons/Shutterstock - Remix by Jason Koebler
"Knowledge is power," as the old saying goes, so it's no surprise that Wikipedia—one of the largest repositories of general knowledge ever created—is a frequent target of government censorship around the world. In Turkey, the entire site has been blocked in all languages since April 29; Russia has censored articles about weed; in the UK, articles about German metal bands have been blocked; in China, the entire site has been banned on multiple occasions.
Determining how to prevent these acts of censorship has long been a priority for the non-profit Wikimedia Foundation, and new research from Harvard's Berkman Klein Center for Internet & Society suggests the foundation may have found a solution: encryption.
In 2011, Wikipedia added support for Hypertext Transfer Protocol Secure (HTTPS), the encrypted version of its predecessor, HTTP. Both protocols transfer data from a website's server to the browser on your computer, but when you connect to a website using HTTPS, your browser first asks the web server to identify itself. The server then sends its public key, which the browser uses to encrypt a session key it has created. The browser sends this encrypted session key back to the server, which decrypts it with its private key. From then on, all data sent between the browser and server is encrypted for the remainder of the session.
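The exchange described above can be sketched in a few lines of Python. This is a toy illustration only, not how a real browser or Wikipedia's servers implement HTTPS: the key pair uses tiny textbook numbers, the function names are hypothetical, and the "cipher" is a simple hash-based XOR stream chosen for brevity. Real connections today also generally agree on the session key with a Diffie-Hellman-style exchange and not by sending an encrypted key.

```python
import hashlib
import secrets

# Toy RSA key pair (textbook values, far too small for real use).
P, Q = 61, 53
N = P * Q      # public modulus, 3233
E = 17         # public exponent
D = 2753       # private exponent: (E * D) % ((P - 1) * (Q - 1)) == 1


def keystream_xor(session_key: int, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing with a SHA-256-derived keystream.

    Illustrative stand-in for a real symmetric cipher; applying it twice
    with the same key returns the original bytes.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(f"{session_key}:{counter}".encode()).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def browser_make_session_key(public_n: int, public_e: int) -> tuple[int, int]:
    """Browser side: pick a random session key and encrypt it with the public key."""
    session_key = secrets.randbelow(public_n - 2) + 2  # value in [2, n - 1]
    wrapped = pow(session_key, public_e, public_n)
    return session_key, wrapped


def server_unwrap(wrapped: int) -> int:
    """Server side: recover the session key with the private exponent."""
    return pow(wrapped, D, N)


# The browser creates and wraps a session key; the server unwraps it.
browser_key, wrapped = browser_make_session_key(N, E)
server_key = server_unwrap(wrapped)

# Both sides now share a key and can encrypt the rest of the session.
request = b"GET /wiki/Tiananmen_Square"
ciphertext = keystream_xor(browser_key, request)
recovered = keystream_xor(server_key, ciphertext)
print(recovered == request)
```

Anyone watching the wire sees only the wrapped key and the ciphertext, neither of which reveals the request without the server's private key.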
"The decision to shift to HTTPS has been a good one in terms of ensuring accessibility to knowledge."
In short, HTTPS prevents governments and others from seeing the specific page users are visiting. For example, a government could tell that a user is browsing Wikipedia, but couldn't tell that the user is specifically reading the page about Tiananmen Square.
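To make the distinction concrete, here is a minimal sketch of what an on-path observer could learn from a single request. It assumes the observer can still see the site's hostname (for example through DNS lookups or the unencrypted server name sent when a connection is set up) but not anything inside the encrypted channel. The function name `observer_view` is made up for this example.

```python
from urllib.parse import urlsplit


def observer_view(url: str) -> dict:
    """Return the parts of a request a network observer could plausibly read.

    Assumption: the hostname leaks either way, while the path is hidden
    whenever the scheme is https.
    """
    parts = urlsplit(url)
    encrypted = parts.scheme == "https"
    return {
        "host": parts.hostname,
        "path": None if encrypted else parts.path,
    }


print(observer_view("http://en.wikipedia.org/wiki/Tiananmen_Square"))
print(observer_view("https://en.wikipedia.org/wiki/Tiananmen_Square"))
```

With HTTP, both the host and the article path are exposed, so a censor can filter on the path. With HTTPS, only the host is left to act on.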
Up until 2015, Wikipedia offered its service over both HTTP and HTTPS, which meant that when countries like Pakistan or Iran blocked certain articles on the HTTP version of Wikipedia, the full version was still available over HTTPS. But in June 2015, Wikipedia decided to axe HTTP access and offer its site only over HTTPS. The thinking was that this would force the hand of restrictive governments when it came to censorship: because of how the protocol works, governments could no longer block individual Wikipedia entries. It was an all-or-nothing deal.
Critics of this plan argued that this move would just result in more total censorship of Wikipedia and that access to some information was better than no information at all. But Wikipedia stayed the course, at least partly because its co-founder Jimmy Wales is a strong advocate for encryption. Now, new research from Harvard shows that Wales' intuition was correct—full encryption did actually result in a decrease in censorship incidents around the world.
The Harvard researchers began by deploying an algorithm that detected unusual changes in Wikipedia's global server traffic for a year beginning in May 2015. This data was then combined with a historical analysis of the daily request histories for some 1.7 million articles in 286 different languages from 2011 to 2016 in order to identify possible censorship events. At the end of their year-long data collection, the Harvard researchers also did a client-side analysis, in which they tried to access various Wikipedia articles in a variety of languages as a resident of a particular country would see them.
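The article does not describe the researchers' algorithm in detail, so the following is not their method. It is a minimal sketch of one common way to flag "unusual changes" in traffic: compare each day's request count with the mean and standard deviation of the preceding week and flag sharp drops. The function name, window size, threshold, and the request counts are all invented for illustration.

```python
from statistics import mean, stdev


def flag_drops(daily_requests, window=7, threshold=3.0):
    """Return indexes of days whose traffic falls far below the recent norm.

    A day is flagged when its count is more than `threshold` standard
    deviations below the mean of the previous `window` days.
    """
    flagged = []
    for i in range(window, len(daily_requests)):
        history = daily_requests[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # perfectly flat history: no scale to measure against
        z = (daily_requests[i] - mu) / sigma
        if z < -threshold:
            flagged.append(i)
    return flagged


# Made-up daily request counts for one article from one country.
counts = [1000, 1020, 980, 1010, 990, 1005, 995, 1000, 40, 35, 1010]
print(flag_drops(counts))
```

A simple detector like this tends to catch the onset of a block and not its duration, because the low days quickly pull down the baseline they are compared against. That is one reason flagged events would still need the kind of manual review the researchers carried out.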
After a painstaking manual analysis of potential censorship events, the researchers compared server traffic from before and after the June 2015 switch and found that, globally, Wikipedia's move to HTTPS was followed by fewer censorship events.
Although countries like China, Thailand and Uzbekistan were still censoring part or all of Wikipedia by the time the researchers wrapped up their study, they remained optimistic: "this initial data suggests the decision to shift to HTTPS has been a good one in terms of ensuring accessibility to knowledge."
Correction: a previous version of this article understated the extent of Wikipedia censorship in Turkey, saying that only some pages were censored. In fact, in the year since the study was finished, Turkey has increased Wikipedia censorship. The entire site has been blocked in all languages in Turkey since April 29. Motherboard regrets the error.