Twitter's algorithm turns your timeline into an experiment and the test subject is you.
Jack Dorsey. Photo: Yana Paskova/Bloomberg via Getty Images
There's a vicious arms race progressing outside the walled gardens of the major internet services, and it's all about ways to get into your head. Technology companies supported primarily through advertising can only come through on their stratospheric valuations by developing increasingly advanced targeting based on sophisticated user profiling. The user profiles are themselves built on technology which is often disconnected from the core product, but eventually those priorities may bleed through. It's starting to look like the Twitter timeline may be next in line.
Since its earliest days as a simple microblogging service, Twitter has been fundamentally rooted in chronology, so much so that it mostly killed off open data feed specifications such as RSS and Atom. Nonetheless, rumors of a new "algorithmic timeline" long in development were finally validated on February 10, with the release of a new feature which prioritizes tweets and presents them in an alternate order.
The prospect of this change was upsetting to many active users who didn't want Twitter to rewire the guts of a tool they use daily, but sequential presentation isn't inherently superior. The real problem with ranking algorithms is not that they'd be nonlinear, but rather that legal and regulatory entities will allow them to be treated as intellectual property secret sauce. By extension, then, they are not auditable to the same extent as a simple chronology. An opaque, ranked presentation allows Twitter to manipulate what its users see in order to harvest valuable data on them.
A fully algorithmic Twitter would be an experimental Twitter, where content can be constantly tweaked and manipulated in order to see how users will react.
Algorithms are often smart, but if they are also opaque, then their insights and conclusions ultimately belong to the company, not the user. It's all but certain that Twitter will never fully explain to users how it determines the content rankings for tweets, nor allow any meaningful controls or inputs. In fact, product branding aside, an "algorithmic timeline" is actually a logical impossibility, because those two words have opposite meanings. The latter is a clear order according to our shared understandings of space, time, and math. The former comes out of a black box designed by Twitter. Simple sequences may be overwhelming, but at least they can be trusted.
Actually, Twitter does not yet have a magic algorithm that knows what you are interested in, and it likely never will. Rather, there's a rough approximation of that ideal tool, which will always fail in some capacity. It will need lots of additional fine tuning. What Twitter does have, however, is exhaustive tracking of your user actions on its platform, and it can use that to power the fine tuning. Nearly every action taken through a Twitter client or via the API requires authentication, which means the company always has a record of who you are and what you are doing.
Algorithms are imperfect attempts by computers to explain the world, or at least some small corner of it. They will rarely improve in a spontaneous burst of computer science; those events, when they do happen, are the exception, discussed in academia for years to come. A much more sensible way to improve an algorithm is iteratively, by adjusting the inputs: improve the incoming data first, and then change the algorithm's analysis to leverage the new information. In this case, your user behavior is the input.
The move toward algorithmic presentation is essentially the conversion of a structured broadcasting platform into an ecosystem within which Twitter decides which way is up. The effectiveness of the content ranking algorithms will be testable using the measurable inputs you provide simply by using the service. If your usage patterns indicate that the content rankings are inaccurate, then it's time to iterate! Twitter will try to create a more accurate algorithm based on your behavioral data inputs, and will create a more sophisticated data profile of you in the process. It is a complete loop, a data sandbox in which the users are not sculptors but sand.
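To make that loop concrete, here is a minimal sketch in Python of a rank, observe, update cycle. It is an illustration only and not a description of Twitter's actual system: the topic weights, the learning rate, and the function names `rank` and `update` are all hypothetical.

```python
# Hypothetical sketch of a ranking feedback loop. Nothing here reflects
# Twitter's real ranking code; every name and number is an assumption.

DEFAULT_WEIGHT = 0.5  # assumed starting interest for a topic


def rank(tweets, weights):
    """Order tweets by the current per-topic interest weight, highest first."""
    return sorted(
        tweets,
        key=lambda t: weights.get(t["topic"], DEFAULT_WEIGHT),
        reverse=True,
    )


def update(weights, shown, engaged_ids, rate=0.1):
    """Return new weights, nudged toward 1 for engaged topics and toward 0 for ignored ones."""
    new = dict(weights)
    for t in shown:
        target = 1.0 if t["id"] in engaged_ids else 0.0
        old = new.get(t["topic"], DEFAULT_WEIGHT)
        new[t["topic"]] = old + rate * (target - old)
    return new


tweets = [
    {"id": 1, "topic": "comics"},
    {"id": 2, "topic": "hockey"},
]
weights = {}

# Three rounds in which the user only ever engages with the hockey tweet.
for _ in range(3):
    shown = rank(tweets, weights)
    weights = update(weights, shown, engaged_ids={2})

print([t["topic"] for t in rank(tweets, weights)])  # hockey now ranks first
```

Each pass both reorders the feed and sharpens the stored profile, which is the point: the ranking and the profile are refined by the same data.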
In 2012, Facebook performed a psychological experiment, adjusting the content of the News Feed post stream for about 700,000 people and attempting to see whether that affected the tone of their subsequent posts. After it was made public in 2014, the backlash was swift and thorough, but that probably had more to do with the emotional nature of the experiment, which is outside the bounds of Facebook's usual business. The same actions, performed as a way to test users' behavioral reactions and improve the accuracy of News Feed targeting, are obviously well within the bounds of the News Feed's usual business. Similar logic may now be coming to Twitter as well.
In a sense, the embedded tweet poll feature introduced in late 2015 is a small scale but logically complementary equivalent. Poll results are vastly cleaner statements of user preference than anything that could be derived from a textual analysis of tweet content. Adding them to the service cuts right to the chase: Why spend millions on complex algorithms and processes which imperfectly divine what people like when you can just ask them instead?
Twitter has stated that it won't share poll results with advertisers, but poll results can also be used for statistical inference (you're young, so you're more likely to have liberal political views!), making them rich data profiling tools. They can be used to help build marketable aggregate data products that are far more sophisticated and valuable than simply selling your answer in a Coke-or-Pepsi poll to Coke or Pepsi, and Twitter has thus far made no such guarantees about what might happen with downstream conclusions. When converted into structured data, polls are more clearly actionable than any other data asset Twitter collects. They are effectively Twitter's brilliant way of turning users into agents, creating millions of new data filtration mechanisms which can then be turned on the rest of the social circle.
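The kind of inference described above can be sketched with Bayes' rule. The example below is a toy: the probabilities are invented for illustration, the `posterior` function is hypothetical, and it assumes each poll answer is independent of the others given the trait, which real data would rarely satisfy.

```python
# Toy Bayesian update from poll answers. All probabilities are made up
# for illustration; this is not any platform's actual profiling method.

def posterior(prior, p_answer_if_trait, p_answer_if_not):
    """Probability the user has a trait, after seeing one poll answer."""
    evidence = prior * p_answer_if_trait + (1 - prior) * p_answer_if_not
    return prior * p_answer_if_trait / evidence


belief = 0.5  # assumed prior that the user holds some trait

# First answer: assumed twice as likely from users who have the trait.
belief = posterior(belief, 0.8, 0.4)
print(round(belief, 3))  # 0.667

# Second answer: also assumed twice as likely from users with the trait.
belief = posterior(belief, 0.6, 0.3)
print(round(belief, 3))  # 0.8
```

Neither answer needs to mention the trait directly. Two innocuous polls are enough to move the estimate from a coin flip to four in five, and that downstream conclusion is exactly what the no-sharing promise leaves uncovered.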
As polls spread highly structured tweet content, algorithmic presentation meanwhile creates a testable structure around the more chaotic tweets. Since the implicit contract with users would no longer be based on publicly verifiable values like timestamps, a fully algorithmic Twitter would actually just be an experimental Twitter, a service in which the content can be constantly tweaked and manipulated in order to see how users will react. Under the guise of algorithmic presentation, and ostensibly in pursuit of a more accurate content ranking and better user experience, hypotheses about your preferences can be quietly tested to measure how you respond to sports, hockey, the Detroit Red Wings. Comic books, Marvel, Iron Man. Hypothesis, test, refinement. It's a lab! Welcome to the age of Big Data. Step right this way, your test tube awaits.