    The Secret Search Engine Tearing Wikipedia Apart

    Written by Jason Koebler

    Image: Nicciolo Caranti/Flickr

    Over the weekend you may have heard news that the Wikimedia Foundation, the nonprofit that finances and founded Wikipedia, is interested in creating a search engine that appears squarely aimed at competing with Google. What you may not have heard is that this nascent project is tearing the Wikimedia Foundation and the Wikipedia community apart.

    In September, the Wikimedia Foundation won a $250,000 grant from the Knight Foundation to start building the “Wikimedia Knowledge Engine,” a “system for discovering reliable and trustworthy public information on the internet,” according to grant documents, which were released late last week. That the Knowledge Engine, now known as “Wikimedia Discovery,” even existed was news to the Wikipedia editor community, who say the project’s secretive nature and very existence are fundamentally at odds with Wikimedia’s transparent ethos. Critics say the project showcases a disconnect and lack of understanding between a foundation that’s increasingly run by those with Silicon Valley connections and the volunteer community that keeps the foundation’s flagship product running.

    “There’s been increasing alienation of the community from the foundation,” William Beutler, a longtime Wikipedia editor, journalist, and author of The Wikipedian blog, told me. “The community is this volunteer group that is made up of people who largely buy into Wikipedia for ideological reasons. Then you have the foundation, which has increasingly fewer people from the community and a larger Silicon Valley contingent that comes from a tech background.”

    “It seems like there’s been a culture clash,” he added. “And this is the most destructive manifestation of that culture clash.”

    How is it possible that a mere search engine could be causing this much consternation? Well, Wikimedia has always thrived on its radical transparency: Its annual strategy plans and fiscal information are public, and Wikipedia has always maintained a comprehensive history of every edit on the website.

    But the Knowledge Engine has been shrouded in secrecy from the get-go. In fact, there’s widespread disagreement about what the Knowledge Engine even is, because Wikimedia’s public statements are at odds with leaked internal documents discussing the project and with its grant application to the Knight Foundation.

    The concern from the Wikipedia community is that a Google-like search engine could represent a shift in the organization’s focus from human-led curation and editing of articles to one driven by automated data lookup.

    An excerpt from the Knight Foundation grant. Image: Wikimedia

    Wikimedia founder Jimmy Wales has said that suggestions that Wikimedia is building a Google competitor are “trolling,” “completely and utterly false,” and “a total lie.”

    Wales and the Wikimedia Foundation have said externally that the Knowledge Engine will primarily improve search within Wikipedia and other Wikimedia projects, but leaked documents and the grant application itself paint a different picture.

    The grant application notes that the project will “create a model for surfacing high quality, public information on the internet.” It also frames the stakes by stating that “commercial search engines dominate search-engine use of the internet” and notes that “Google, Yahoo, or another big commercial search engine could suddenly devote resources to a similar project, which could reduce the success of the project.”

    In addition, internal Wikimedia Foundation documents that were leaked this weekend describe the Knowledge Engine like this:

    “Knowledge Engine By Wikipedia will democratize the discovery of media, news and information—it will make the Internet’s most relevant information more accessible and openly curated, and it will create an open data engine that’s completely free of commercial interests. Our new site will be the Internet’s first transparent search engine, and the first one that carries the reputation of Wikipedia and the Wikimedia Foundation.”

    The specifics of the Knowledge Engine’s design don’t necessarily matter; the secrecy and changing story surrounding it do, according to Beutler and Liam Wyatt, a Wikipedia community manager in Europe who calls himself a “Wikiwatcher.”

    More important is that Wikimedia never made any of these proposals public to the Wikipedia community at large and instead moved forward with a project that Wikimedia says will cost at least $2.5 million over the first couple of years and will take, at minimum, six years to complete. The search engine project is also not mentioned in any of Wikimedia’s annual planning documents, which are available to the public.

    Mockups of the search engine.

    “A search engine is a potentially valid way of expressing our mission statement,” Wyatt told me. “The mission statement is about transparency and collaboration, and a secret search engine is not that.”

    There are indications that disagreement about the Knowledge Engine and the community’s role (or lack thereof) in designing it is causing shakeups within the Wikimedia Foundation itself. Late last year, Wikimedia Foundation Board of Trustees member James Heilman was dismissed from the board. Heilman has hinted that his internal quest to make the Knight Foundation document public led to his firing.

    “Grant applications should be published at the same time as they are submitted to potential funders,” Heilman wrote in an op-ed published on The Signpost, Wikipedia’s internal newspaper. “This would keep those in a position of management accountable. It would reduce the risk of unpleasant surprises down the road.”

    A mockup of what the Wikimedia Knowledge Engine might look like, from a leaked presentation.

    On a Wikipedia talk discussion page, Wales called Heilman’s claims “utter fucking bullshit” and said that Heilman had “a pattern of behavior and actions that I viewed as violating the trust and values of the community.”

    Shortly after Heilman’s firing, Siko Bouterse, Wikimedia’s director of community resources, said she’d be leaving the organization in part because of transparency issues.

    These departures and firings represent a fundamental shift within Wikimedia, and the search engine drama is a proxy battle over that shift, Beutler and Wyatt said. At its core, it represents the Wikimedia Foundation’s slow but noticeable evolution from a group that primarily runs an online encyclopedia to one that operates like a Silicon Valley tech company, despite the fact that it’s a nonprofit organization.

    Beutler notes that five of the Wikimedia Foundation’s board members have ties to Google, and that Lila Tretikov, who took over as executive director in 2014, made her career as a software engineer and executive in Silicon Valley. The Wikimedia Foundation did not respond to a Motherboard request for comment.

    “I wonder if the foundation, a lot of whom have experience in Silicon Valley, is getting a little bored with just running an encyclopedia,” Beutler said.

    Meanwhile, traffic to Wikipedia coming from Google has fallen as Google has begun incorporating fast facts from Wikipedia articles onto the front page of Google (search for a celebrity, for instance, and you’ll find out how old they are without having to click through to Wikipedia). As long as that information is coming from Wikipedia, it can be looked at as a win for the encyclopedia’s overall effort to expand public knowledge, but it does represent something of an existential threat to a website that is almost fully dependent on donations from users who actually visit it.

    “To speculate, the thinking may be ‘What if Google doesn’t need us?,’” Beutler said. “If you start from Google, I suppose it would be arguably logical for us to create our own search engine.”

    Wyatt says the search engine project suggests Wikimedia is shifting its focus toward automatically generated content that lacks the human touch of the volunteer editors who have built Wikipedia.

    “It comes down to whether they see this as a potential replacement for the editorial community,” Wyatt said. “It’s a conspiracy theory to say that they’re replacing the editing community with an algorithm, but it is the scale and trajectory.”

    So, what of the people who think that a Wikipedia-run search engine that’s not beholden to advertisers or tracking could be good for internet users as a whole? Beutler points to the last time Wales attempted to build a Google competitor, called Wikia Search. It failed, quickly.

    “In a vacuum, competition is good,” Beutler said. “For the Wikimedia Foundation to think it can build a Google competitor is crazy. I hate to tell people their dreams are lost, but it’s really not going to happen on this timeframe, on this budget.”