
Princeton Uses GPT to Make a Robot That Cleans on Command

This robot is able to sort laundry into lights and darks, recycle drink cans, and put away scattered objects where they belong.
Image: Princeton Researchers

A group of researchers from Princeton, Stanford, and Google has created a large language model-powered robot called TidyBot, which can perform housekeeping tasks such as sorting laundry into lights and darks and picking up recycling off the floor when given instructions in plain English.

Many researchers have tried to merge LLMs with physical robots to complete tasks autonomously. Google and Microsoft have already released versions of robots that combine visual and language capabilities to do things like fetch a bag of chips from a kitchen. The researchers behind TidyBot took these capabilities a step further by asking the LLM (specifically, OpenAI's GPT-3 Davinci model) to take in user preferences and apply them to future interactions.


The researchers wrote in the paper that they first asked a person to provide a few example object placements, such as "yellow shirts go in the drawer, dark purple shirts go in the closet, white socks go in the drawer," and then asked the LLM to summarize those examples into generalized preferences for that person.
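Based on that description, the summarization step amounts to assembling the example placements into a prompt and asking the model for general rules. The sketch below shows one plausible way to build such a prompt; the exact wording used by the TidyBot authors is not reproduced here, so the prompt text and the `build_summarization_prompt` helper are illustrative assumptions.

```python
# Illustrative sketch of a TidyBot-style preference-summarization prompt.
# The prompt wording is an assumption, not the authors' exact prompt.

EXAMPLES = [
    ("yellow shirts", "drawer"),
    ("dark purple shirts", "closet"),
    ("white socks", "drawer"),
]

def build_summarization_prompt(examples):
    """Turn concrete object placements into a prompt asking the LLM
    for generalized putting-away rules."""
    lines = [f"{obj} go in the {place}" for obj, place in examples]
    return (
        "Objects and where they go:\n"
        + "\n".join(lines)
        + "\nSummarize these placements as general rules:"
    )

prompt = build_summarization_prompt(EXAMPLES)
print(prompt)
```

In practice a prompt like this would be sent to GPT-3 Davinci, and the summary it returns (for example, "light-colored clothes go in the drawer, dark-colored clothes go in the closet") would be reused when deciding where new objects belong.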

“The underlying insight is that the summarization capabilities of LLMs are a good match for the generalization requirements of personalized robotics,” the authors wrote. “LLMs demonstrate astonishing abilities to perform generalization through summarization, drawing upon complex object properties and relationships learned from massive text datasets.” 

“Unlike classical approaches that require costly data collection and model training, we show that LLMs can be directly used off-the-shelf to achieve generalization in robotics, leveraging the powerful summarization capabilities they have learned from vast amounts of text data,” they added. 

The website for the researchers’ paper displays a robot that is able to sort laundry into lights and darks, recycle drink cans, throw away trash, put away bags and utensils, put away scattered objects where they belong, and put toys into a drawer. 

The researchers first tested the approach on a text-based benchmark dataset: they input user preferences, asked the LLM to summarize them into personalized rules, and then used that summary to decide where new objects belong. The benchmark scenarios were defined across four rooms, with 24 scenarios per room; each scenario contains two to five places to put objects and equal numbers of seen and unseen objects for the model to sort. On unseen objects, they wrote, this test achieved 91.2 percent accuracy.
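The second step, routing a new (possibly unseen) object using the summarized rules, is itself another LLM query. The sketch below makes that flow runnable by swapping the model for a keyword-matching stub; the rule text, the `fake_llm` stand-in, and the prompt format are all assumptions for illustration, not the paper's implementation.

```python
# Sketch of the placement step: the LLM's summarized rules decide
# where each new object belongs. `fake_llm` is a stand-in for a real
# GPT-3 completion call and matches keywords only.

SUMMARY = ("light-colored clothes go in the drawer, "
           "dark-colored clothes go in the closet")

def fake_llm(prompt: str) -> str:
    # Stand-in for the real model: route on a color keyword
    # in the final line of the prompt (the object name).
    obj = prompt.splitlines()[-1]
    return "closet" if "dark" in obj else "drawer"

def place_object(summary: str, obj: str) -> str:
    prompt = f"Rules: {summary}\nWhere does this object go?\n{obj}"
    return fake_llm(prompt)

unseen = ["light gray hoodie", "dark green sweater"]
placements = {obj: place_object(SUMMARY, obj) for obj in unseen}
print(placements)
```

With a real LLM in place of the stub, this is where the generalization measured by the benchmark happens: neither "hoodie" nor "sweater" appeared in the examples, yet the summarized rules still yield a placement.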

When they applied this approach to the real-world robot, TidyBot, they found that it successfully put away 85 percent of objects. TidyBot was tested on eight real-world scenarios, each with its own set of ten objects, with three runs per scenario. In addition to the LLM, TidyBot uses the image classifier CLIP and the object detector OWL-ViT.
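Putting the pieces together, the pipeline the article describes runs detection, then classification, then the rule query. The sketch below wires those stages up with stubs standing in for OWL-ViT, CLIP, and the LLM; the function names and the hard-coded outputs are illustrative assumptions, and real use would load the actual models.

```python
# Sketch of the perception-to-decision pipeline described in the
# article: OWL-ViT detects objects, CLIP labels each detection, and
# the LLM's summarized rules choose a receptacle. All three model
# calls below are stubs for illustration.

def detect_objects(image):
    # Stand-in for OWL-ViT: return cropped object detections.
    return ["white sock", "soda can"]

def classify(crop):
    # Stand-in for CLIP: map a crop to a category label.
    return crop

def llm_rule(category):
    # Stand-in for GPT-3 applied to the summarized preferences.
    rules = {"white sock": "drawer", "soda can": "recycling bin"}
    return rules[category]

def tidy(image):
    """Return a (category, receptacle) plan for every detected object."""
    plan = []
    for crop in detect_objects(image):
        category = classify(crop)
        plan.append((category, llm_rule(category)))
    return plan

print(tidy(None))
```

Each planned pair would then be handed to the robot's grasping and navigation stack to actually move the object.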

Danfei Xu, an assistant professor in the School of Interactive Computing at Georgia Tech, told Motherboard that LLMs give robots more problem-solving capability. "Previous task planning systems most rely on some forms of search or optimization algorithms, which are not very flexible and hard to construct. LLMs and multimodal LLM allows these systems to reap the benefit of the Internet-scale data and easily generalize to new problems," he said when asked about Google's PaLM-E.