
A New Object-Recognition Algorithm Could Change the Face of Machine Learning

So far the model just applies to alphabets, but it's a 'human-capable' start.
Image: Danqing Wang

The basic principle of machine learning is training. As humans, we can learn very profound things from single examples—spoiled milk tastes bad, fire is hot—but machines need more because they learn statistically. Machines depend upon data.

Or that's the current state of things, anyhow. The dependence on big data may prove to be less fundamental than is usually assumed, according to a study published this week in Science. The report, which comes courtesy of researchers at NYU and MIT, introduces the Bayesian program learning (BPL) framework, a new machine learning model capable of mimicking the human mind's capacity for generalizing from single examples. It's a model that "learns to learn."

"People learn richer representations than machines do," the paper notes, "even for simple concepts, using them for a wider range of functions, including creating new exemplars, parsing objects into parts and relations, and creating new abstract categories of objects based on existing categories. The best machine classifiers do not perform these additional functions, which are rarely studied and usually require specialized algorithms."

"A central challenge is to explain these two aspects of human-level concept learning," the authors continue. "How do people learn new concepts from just one or a few examples? And how do people learn such abstract, rich, and flexible representations?"

Machine learning models improve with more data, not less, yet humans seem able to sidestep this seemingly fundamental requirement. That's a real talent.

BPL is, according to the paper, capable of making generalizations in ways that are mostly indistinguishable from a person's. It does this by taking broad concepts and reducing them to probabilistic programs constructed from separate, discrete procedures. These can be viewed as raw materials or "primitives" that come together to form deep concepts, just as regular old bricks may be assembled into the most intensely detailed palace.
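To make the bricks-into-palaces idea concrete, here is a minimal toy sketch in Python. Everything in it (the stroke representation, the parameter ranges, the function names) is an assumption invented for illustration, not the authors' actual model; it only shows how simple primitives might compose into a concept "program."

```python
import random

# Toy sketch: a concept is a small probabilistic program composed
# from reusable "primitives". The representation below is made up
# for illustration and is not the paper's actual model.

def sample_primitive():
    # A primitive sub-stroke: here, just a direction and a length.
    return {"angle": random.uniform(0.0, 360.0),
            "length": random.uniform(0.1, 1.0)}

def sample_concept():
    # A concept (a "letter") is an ordered list of strokes, each
    # stroke composed of one or more primitives.
    num_strokes = random.randint(1, 3)
    return [[sample_primitive() for _ in range(random.randint(1, 4))]
            for _ in range(num_strokes)]

concept = sample_concept()
print(f"Sampled a concept built from {len(concept)} stroke(s)")
```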

The learning method devised by the researchers is so far specific to character-recognition tasks, e.g. handwritten characters from the world's alphabets. It operates by generating a programmatic representation of a given character: algorithmic instructions for how one might reproduce it. The result is a certain sort of generalization: by following the same instructions for producing a letter, many different writers will produce many different variations on it, yet it remains the same fundamental symbol.
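That same-instructions-different-hands point also lends itself to a short sketch, reusing the invented stroke representation from above: executing one concept program with a little motor noise yields distinct handwritten variants of a single underlying symbol. The noise model here is a stand-in, not the paper's.

```python
import random

# Toy sketch: one fixed concept program, many noisy renderings.
# The concept below (a vertical stroke then a horizontal one) and
# the Gaussian jitter are assumptions made for illustration.

concept = [[{"angle": 90.0, "length": 0.5},
            {"angle": 0.0, "length": 0.5}]]

def render_token(concept, noise=0.05):
    # Jitter each primitive's parameters; the instructions themselves
    # are unchanged, so every rendering is a variant of one character.
    return [[{"angle": p["angle"] + random.gauss(0.0, noise * 360.0),
              "length": max(0.01, p["length"] + random.gauss(0.0, noise))}
             for p in stroke]
            for stroke in concept]

variants = [render_token(concept) for _ in range(5)]  # five "handwritings"
```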

As a result, the model "naturally captures the abstract 'causal' structure of the real-world processes that produce examples of a category," the paper explains. The model is able to use primitives from previously generated concept-programs not just to identify new examples of letters, but also to create new concept-programs, e.g. new letters.

Here's how it might work. The algorithm is presented with a character that it's never seen before and makes five attempts at parsing it, each of which is a new program (so, again, the new programs are themselves algorithmically produced). These programs are then tasked with creating entirely new letters that vary according to a probabilistic spread. In the researchers' side-by-side comparisons, the results are about on par with what actual human people come up with, hence the study's advertised "human-level concept learning."
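Here is what that propose-score-generate loop might look like as a toy sketch. The proposal and scoring functions below are crude placeholders (the paper uses proper Bayesian inference over stroke programs); only the shape of the procedure, five retained parses that each spawn new exemplars, follows the description above.

```python
import random

# Toy sketch of the loop: propose candidate parses (programs) for an
# unseen character, keep the five best under a placeholder score, and
# let each retained program generate fresh exemplars.

def propose_parse():
    # A candidate parse: a made-up program of 1-4 strokes.
    return [{"angle": random.uniform(0.0, 360.0),
             "length": random.uniform(0.1, 1.0)}
            for _ in range(random.randint(1, 4))]

def fit_score(parse, observed):
    # Placeholder likelihood: reward matching the observed stroke count.
    # The real model scores how well the program explains the image.
    return -abs(len(parse) - len(observed))

observed_character = [{"angle": 90.0, "length": 0.5},
                      {"angle": 0.0, "length": 0.5}]

candidates = [propose_parse() for _ in range(200)]
top_five = sorted(candidates,
                  key=lambda p: fit_score(p, observed_character),
                  reverse=True)[:5]

def generate_exemplar(parse, noise=0.05):
    # Each retained program can produce new variants of the character.
    return [{"angle": s["angle"] + random.gauss(0.0, noise * 360.0),
             "length": s["length"] + random.gauss(0.0, noise)}
            for s in parse]

new_letters = [generate_exemplar(p) for p in top_five]
```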

So, how might this apply to machine learning generally?

"Right now the algorithm only works for handwritten characters, but we identified three core ingredients that are important for model performance and may help guide progress in other domains," Brenden Lake, an NYU data scientist and study co-author, told Motherboard.

The first of these ingredients is "compositionality," an old idea holding that, as above, representations of concepts should be built from simpler primitives. The second is "causality," the idea that we can model the abstract structure of how objects in the world are generated. And, finally, there is "learning-to-learn": knowledge from previous concepts can be used to help learn new concepts.
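A final toy sketch ties the three ingredients together: a library of primitives distilled from earlier alphabets stands in for learning-to-learn, composing them into new concepts stands in for compositionality, and the generative sampling stands in for causality. The library's contents are invented for the example.

```python
import random

# Toy sketch of "learning to learn": primitives inferred from
# previously seen alphabets form a shared library that shapes how
# brand-new concepts are composed. All values are illustrative.

primitive_library = [
    {"angle": 0.0, "length": 0.5},   # horizontal stroke, seen often
    {"angle": 90.0, "length": 0.5},  # vertical stroke, seen often
    {"angle": 45.0, "length": 0.7},  # diagonal, seen occasionally
]

def sample_new_concept(library):
    # A new concept reuses old parts rather than starting from scratch,
    # which is what lets a single example go such a long way.
    return [random.choice(library) for _ in range(random.randint(1, 4))]

novel_letter = sample_new_concept(primitive_library)
```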

"These principles may help explain how we learn and use other types of concepts so quickly," Lake said. "We're especially interested in applications such as learning new spoken words and learning new gestures. You may only need to hear one example of someone saying the name 'Ban Ki-Moon' to basically get it, to be able to recognize another person saying the same name, and generate (speak) an approximate version of the name yourself. Same applies with a new gesture—[consider the] first time you saw a 'high five.' For these domains, concepts are compositional, and the causal process is well defined and open to study."