The 2014 Algorithms Solving 2015's Problems
To talk about algorithms, we need to talk about problems.
For as vast a number of things the word "algorithm" might apply to, it's still very often used as either a stand-in for "computer program" or for "computer program replacing human brain-tasks." What an algorithm really consists of is much simpler: steps to solve a problem, or to solve a problem within a given timeframe. Programs implement algorithms, but so do humans and other things in nature. Where there is a problem, there is an algorithmic solution.
This list focuses (mainly) on what we usually think of as algorithms, which are computing/mathematical schemes. There were many more developed last year, of course, full-on libraries of code being generated one after the other, many destined to wither and fade within the Cornell University servers supporting the arXiv open-access research database. But some stood out. Exclusion from the list shouldn't be taken as a slight because, really, tracking algorithm research completely would be like several full-time jobs just in itself.
To talk about algorithms, we need to talk about problems.
Problem: Bitcoin Is Volatile
The global currency market is already an unsettled locale—a 24-hour trading world with relatively few rules, yet governed by big banks with big sums of money. The factors that influence currency exchange rates are likewise big: geopolitics, fiscal policy, inflation.
Bitcoin is like a parody of the currency market, where rates are driven mostly by speculation rather than much of anything in the real-world. There is no stabilizing force, no commodity or government to keep it anchored. In 2011, a single bitcoin went for about $0.30, while in 2013 a bitcoin could be had for over $1,200. Now, it's hanging out around $350. According to a Boston University report, bitcoin carries a year-to-year price volatility of over 100 percent.`
What a group of MIT researchers demonstrated last October, however, is that bitcoin is actually predictable. With help from 200 million data points, they were able to train an algorithm to hunt out patterns in the bitcoin market using Bayesian analysis. The basic idea is that a possible event's history/background can be used to predict the probability of that event. Here, the event is some not-quite-defined trading pattern.
"We developed this method of latent-source modeling, which hinges on the notion that things only happen in a few different ways," said Devavrat Shah, the study's principal investigator. "Instead of making subjective assumptions about the shape of patterns, we simply take the historical data and plug it into our predictive model to see what emerges."
Every two seconds, the MIT researchers made a prediction for the following 10 seconds. Above a certain market-movement threshold, they bought a bitcoin. if the prediction was below that threshold, they sold it. Using this algorithm, Shah and his team made 2,872 individual trades, earning an investment return of 89 percent.
Problem: Robots Move Like Robots
Everyone knows what "moving like a robot" means: stilted, jerky, discontinuous. Unsmooth. Humans and other animals move in analog, with inner purpose. We move not just in reaction to stimuli or according to some instruction, but because our very physical being is movement. We wandered and ran and jumped before we were able to compute wandering and running and jumping. A robot doesn't have this deeper naturalness.
MIT's cheetah, the newest iteration of which was demonstrated last summer, comes awfully close, mimicking animal motion thanks to a new electric engine and a new algorithm. As Motherboard's Jordan Pearson reported in September, "basically, the algorithm allows the quadrupedal bot to carefully control the power of its footfalls, striking the ground with more or less force, depending on what's needed."
MIT mechanical engineering professor Sangbae Kim explained that the new movement algorithm swaps fast cyclical motion for harder, more calibrated individual strides of the sort used by world-class runners like Usain Bolt.
"They actually increase their stride length by pushing downward harder and increasing their ground force, so they can fly more while keeping the same frequency," he said. "Our robot can be silent and as efficient as animals. The only things you hear are the feet hitting the ground."
Problem: Unsolvable Problems
Problems become unsolvable, or practically unsolvable (e.g. by a computer in a finite period of time), because of complexity. It sounds like a trivial statement, but a problem can be very, very big—guiding a spacecraft to a distant comet, for example—without being too complex to solve. But something that sounds simple, like airline scheduling, quickly becomes an uncomputable mess when all of the pertinent factors are accounted for: fuel efficiency, minimizing passenger delays, managing pilot and flight crew sleep schedules, airport traffic, and so on. Exploring every combination is impossible.
The general category of problem here is optimization, and the usual solution is to approximate. But if you've flown on an airplane recently, the limits of fudging are clear enough. Cornell computer scientist David Steurer, the recipient of a $200,000 Microsoft fellowship, wants either to do better or to prove that doing better is impossible.
This challenge has a name: the Unique Games Conjecture, put forth in 2002 by the complexity theorist Subhash Khot. A crude sketch of the thing might be imagining some problem as a map of places that need to be visited, like a mail route. Getting from place to place takes some amount of time that's the same for each trip. Imagine the route is the product of some optimization problem, with the distances and number of stops increasing enormously for even the tiniest bits of added complexity.
Khot's conjecture (conjecture being a good but not-quite-there or insufficiently supported idea) says that some mail routes are too big. Some problems are not solvable.
They're too big to even approximate mail delivery. We can't even visit enough houses and mailboxes to make it look like a mail carrier passed through. Steurer (and other theorists) are interested in showing that any route for any problem can in fact be delivered to sufficiently, and he's already had some success with the "sum of squares" method.
"The UGC predicts that for a wide range of problems, either a concrete meta-algorithm can solve it or no efficient algorithm at all can solve it," Steurer explained last June. "That prediction is remarkable because there are usually lots of different methods to try to solve a problem. So the UGC theory says that there is no point in trying all of these different methods, because either the UGC method works or none at all."
The stakes involving in disproving or otherwise evading UGC is, according to Seurer, nothing less than a "unified theory of efficient algorithms for difficult computational problems." One algorithm to rule all other algorithms.
Problem: Math Is Too Precise
In modeling hyper-dynamic systems, like atmospheric conditions or other things involving interacting fluids, it's impossible to account for every influence. Which is fine, because we can usually get good-enough simplifications for our purposes.
In modeling turbulence or shock waves within fluid dynamical systems, scientists and mathematicians often turn to Burger's equations, which, as I wrote in September, describe something very messy in conspicuously unmessy terms. There is more to airflow in the atmosphere than they represent, but the equations ignore it out of necessity.
Not knowing something shouldn't mean pretending it doesn't exist, argues Brown University's George Karniadakis. His idea is to bring together Burger's equations and an emerging idea called uncertainty quantification. Basically, instead of imagining some influence doesn't exist, we might apply some reasonable randomness in its place. This is something already done within computer graphics (using noise as texture).
"The general idea in UQ," Karniadakis explained last fall, "is that when we model a system, we have to simplify it. When we simplify it, we throw out important degrees of freedom. So in UQ, we account for the fact that we committed a crime with our simplification and we try to reintroduce some of those degrees of freedom as a random forcing. It allows us to get more realism from our simulations and our predictions."
Problem: Too Much Data/Not Enough Data
An algorithm should only be as good as the data it's offered, right? What sort of Netflix recommendations might I get if I've never watched or rated anything, or if I've rated things dishonestly? They won't be very accurate anyway.
But that's not the end of it. A pair of MIT researchers, Richard Cockburn and Dan Levine, have created an algorithm that evaluates a given application/program based on whether it requires data that is either too difficult to collect or too time-consuming to process and then tells it the best possible subset of data it might focus on to get good results. They presented their work at the Uncertainty in Artificial Intelligence conference last July.
So, if a climate scientist was trying to make some prediction given some enormous pool of possible data, the algorithm would offer instead the best possible smaller, more management subset of that data. Instead of attempting to gather every possible observational scrap, they can just grab a few high-quality data nuggets.
The algorithm has to do with mathematical models known as graphs. A graph might represent a chessboard, a network of computer servers, a family tree, or anything else that might possibly be represented by an arrangement of nodes (represented in the graph by a circle) connected by lines or "edges," which represent the strengths of the relationships between different nodes. Cockburn and Levine's method looks at a given graph and determines how much information a given node might give about another node or nodes. This is property is known as "mutual information."
As Levine explains, determining mutual information between nodes can be imagined as the injection of some blue dye into one node to see how blue any of the other nodes turn. "It's typically going to fall off as we go further out in the graph," he says. "If there's a unique path between them, then we can compute it pretty easily, because we know what path the blue dye will take. But if there are loops in the graph, then it's harder for us to compute how blue other nodes are because there are many different paths."
Loops are the first problem in this sort of computation. Levine and Cockburn solve it by rearranging graphs as trees, which are just graphs without the loop-forming edges. Within a tree, the new algorithm is able to determine which nodes are junk, or junk relative to other higher-value nodes. These junk nodes are used to navigate through the tree/graph, but the algorithm prevents their short-range influence from impacting the larger calculations of mutual information. Junk nodes are used to connect, not for their data-value. It's sort of like stealing Starbucks' wifi without buying anything.
So, the graph becomes a simplified structure of high-value nodes. No loops, no junk. It's this newly trimmed structure that reveals the highest-value data, or reveals the data or potential data that isn't going to matter very much in the grand computation scheme.
Problem: Light Is Complex
Think about what light does: shadows, reflection, refraction. Light can travel through some things, but not others. Sometimes light, as it passes across some medium, becomes a different sort of light with a different character and quality. Light changes constantly; even in the most sterile, fluorescent light-filled office-space, photons will find new pathways and will be absorbed into new materials. You won't even think about it.
Artists and programmers think about it all the time, often with frustration. The math is daunting, the results never quite good enough.
Ray tracing is a method of simplifying the light-modeling problem while, at the same time, taking the problem deeper. When we look at some landscape, whether it's an IRL scene or through the glass eye of a computer screen, we don't see everything. Some light reaches our eye directly; some bounces or reflects before reaching our eye; some never makes it at all. Ray tracing offers a method of ignoring those wasted photons.
"It seems at first that it is impossible to know beforehand which rays reach the eye," explains Paul Rademacher, a former Dreamworks animator. "After all, any given ray can bounce around the room many times before reaching the eye. However, if we look at the problem backwards, we see that it has a very simple solution. Instead of tracing the rays starting at the light source, we trace them backwards, starting at the eye."
"Consider any point on the view window whose color we're trying to determine," he says. "Its color is given by the color of the light ray that passes through that point on the view window and reaches the eye. We can just as well follow the ray backwards by starting at the eye and passing through the point on its way out into the scene."
The ray tracing idea is hardly new. Rademacher's explanation is circa 1997. The computational cost of actually doing it has left the method mostly out of reach. A 2013 headlining presentation at the Siggraph conference in Anaheim was given the dubious (or at least salty) title, "Ray tracing is the future and ever will be."
Last spring saw the unveiling of a technology/product called PowerVR Ray Tracing, which came with the promise of ray tracing for the masses or at least ray tracing for the masses of video game designers. It's not quite a new algorithm, but a new architecture to support an algorithm, a graphics-processing unit dubbed "Wizard."
Wizard was unveiled at 2014's Game Developers Conference in San Francisco by the British R&D firm Imagination Technologies (maker of GPUs for smartphones, otherwise).
Wizard's centerpiece is an entirely new sort of processor just called a ray tracing unit (an RTU within a GPU). It's a nonprogrammable, single-function device that both crunches and organizes/stores the interactions of light rays within a given scene. This has the advantage of not stealing processing power from the more usual graphics processing functions. The PowerVR architecture has been in-development since 2010.
"The point then of putting ray tracing hardware in a PowerVR GPU is to offer dedicated, fixed function hardware to accelerate these processes," AnandTech's Ryan Smith wrote prior to the unit's formal unveiling. "By pulling them out of the shaders and implementing superior ray tracing based algorithms, the quality of the effect should go up and at the same time the power/resource cost of the effect may very well go down.'
So, maybe natural light and computer graphics aren't mutually exclusive.
There is evidence that starving yourself can fend off the aging process. This has been shown in animals, including some primates, at least. It isn't, however, something that should be pursued in a fountain of youth kind of way. Calorie restriction comes with its own, new set of risks: anemia, neurological deficits, heart failure, hormonal imbalances.
And yet the allure remains.
Last January, a group of researchers at Tel Aviv University's Blavatnik School of Computer Science came up with an algorithm that seeks out the specific genes that can be "turned off" to create the same anti-aging effect as calorie restriction. It is, ahem, a way of having your anti-aging cake and eating it too (and then actually eating it).
The Tel Aviv scientists call their method a "metabolic transformation algorithm." Given two metabolic states (of cells or whatever), the tool can identify the environmental or genetic changes needed to go from one of the states to the other. Using the algorithm, the team was able to make the gene expression of old yeast look like new yeast. (Yeast is a common model because much of its DNA is preserved in humans.)
"Most algorithms try to find drug targets that kill cells to treat cancer or bacterial infections," said Karen Yizhak, the principal investigator. "Our algorithm is the first in our field to look for drug targets not to kill cells, but to transform them from a diseased state into a healthy one." Yizhak's work was published in the journal Nature Communications.
At the FUN with Algorithms 2014 conference last summer, computer scientist Valerio Volpi and a small team from the University of Pisa presented this paper: "Zombie Swarms: An Investigation on the Behaviour of Your Undead Relatives." It was a reminder (one of many at the conference) that computer science and engineering isn't some airless realm of binary trees and correctness proofs (still plenty of proofs and trees, just more air) and that problem solving is a creative art, and vice versa.
The problem here is not just surviving a zombie outbreak, but doing so as efficiently as possible. First, we can look at the algorithm for being a zombie:
This is something that can be played out using numerical simulations, as the Pisa team did using a tool called Sycamore. The basic solution, as demonstrated via simulation, is for the humans to act as zombie herders, using sound (singing, moaning, whatever) to attract and gather the zombies in one central location. There are a few ways of doing this and, as the researchers note, there are also several different variants of zombies themselves: fast, slow, stupid, less stupid, infectious, noninfectious. Etc.
"We dedicate this work to the many lab assistants that were harmed in the making of this paper," Volpi and his team write in a corresponding paper. "Running experiments with Zombies can be a tricky business, and while we acknowledge that numerical simulations may never be an adequate substitute for field experiments, yet after a number of such failed experiments we came to the conclusion that we prefer the safety of tenure-track, to the risks of a zombie startupper life."