Teaching robots to handle, manipulate, and transport objects is key to advancing automation, and one way to teach them to move objects better is to build a large dataset that captures how different objects react when moved. Factors like an object’s shape, weight, mass distribution, and friction all influence how it moves when pushed.
Researchers at the Massachusetts Institute of Technology (MIT) have been developing a dataset that will allow future robots to better understand how to manipulate objects. The data was gathered with a robotic arm outfitted with cameras and sensors. The “pushy robot” setup included a precision industrial robotic arm, a 3D motion-tracking system, cameras, and custom software that ties all the data together. The researchers used the arm to push small, modular objects around a table’s surface. Each object was fully adjustable, letting the team change how it behaved with small alterations to its weight, mass distribution, and shape.
A key to compiling the novel Omnipush dataset was building modular objects (pictured) that enabled the robotic system to capture a vast diversity of pushing behavior. The central pieces contain markers on their centers and points so a motion-detection system can detect their position within a millimeter. (Image Credit: MIT)
Named Omnipush, the dataset contains 250 pushes for each of 250 different objects, 62,500 pushes in total, making it the largest dataset of its type ever compiled. Previous MIT studies produced far less diverse datasets, with the next largest containing only ten objects. Those earlier studies also lacked any form of computer vision, which left considerable room for improvement.
“We need a lot of rich data to make sure our robots can learn,” says Maria Bauza, a graduate student in the Department of Mechanical Engineering (MechE) and first author of a paper describing Omnipush that’s being presented at the upcoming International Conference on Intelligent Robots and Systems. “Here, we’re collecting data from a real robotic system, [and] the objects are varied enough to capture the richness of the pushing phenomena. This is important to help robots understand how pushing works, and to translate that information to other similar objects in the real world.”
Being able to quickly identify the safest and most efficient way to move an object is critical in high-level robotic tasks. Imagine a robot tasked with moving a sensitive radioactive sample. Using this dataset, programmers could “teach” that robot to determine the best way to move the sample without damaging its housing, while still keeping the whole process as efficient as possible.
The modular objects used in this study are what really set it apart from previous datasets. Weighing just 100 grams each, the aluminum objects can be configured in a number of ways: adding differently shaped components, attaching extra weights to the chassis, and even shifting the weight distribution. This allowed the researchers to build 250 different combinations from just a few base modules. Each module features a unique marker on its surface so the vision system can track it.
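The article doesn’t spell out exactly how the modules combine, but the combinatorics are easy to sketch. In the hypothetical scheme below (the shape names, side count, and weight option are illustrative assumptions, not the real Omnipush design), a handful of snap-on parts multiplies into hundreds of configurations, from which a set of 250 could be chosen:

```python
from itertools import product

# Hypothetical attachment scheme, purely for illustration: assume each
# of an object's 4 sides takes one of 4 snap-on shapes, with or
# without an extra weight on the chassis.
SHAPES = ("flat", "concave", "triangular", "circular")

def enumerate_configs(n_sides=4, weight_options=(False, True)):
    """Yield every (side_shapes, extra_weight) combination."""
    for sides in product(SHAPES, repeat=n_sides):
        for weighted in weight_options:
            yield sides, weighted

configs = list(enumerate_configs())
print(len(configs))  # 4**4 * 2 = 512 combinations in this toy scheme
```

Even this toy scheme yields more configurations than the 250 objects actually used, which shows how quickly a few base modules cover a diverse space of shapes and mass distributions.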
Each object was placed on a surface, and the robot moved to a random position several centimeters away, selected a random direction, and pushed the object for one second. From the resulting position, the robot selected a new direction and pushed again for one second, repeating the process until the object had been pushed 250 times, at which point a new object was placed on the surface and the pushing cycle began again. In total, the robot ran for 12 hours per day over a period of two weeks, accumulating more than 150 hours of data.
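The per-object collection loop described above can be sketched in a few lines. Note that `push_object` and `toy_push` are hypothetical stand-ins for the real arm, motion-capture rig, and physics, not the actual Omnipush software:

```python
import math
import random

def collect_pushes(push_object, n_pushes=250, push_duration=1.0, seed=0):
    """Sketch of the per-object collection loop: pick a random start
    point and direction, push for one second, and log the object's
    pose before and after.  `push_object` is a hypothetical callable
    standing in for the arm + motion-capture rig: it takes
    (pose, start, direction, duration) and returns the new pose.
    """
    rng = random.Random(seed)
    pose = (0.0, 0.0, 0.0)  # x (cm), y (cm), heading (radians)
    log = []
    for _ in range(n_pushes):
        # Random contact point several centimeters from the object ...
        angle = rng.uniform(0, 2 * math.pi)
        start = (pose[0] + 5.0 * math.cos(angle),
                 pose[1] + 5.0 * math.sin(angle))
        # ... and a random push direction.
        direction = rng.uniform(0, 2 * math.pi)

        new_pose = push_object(pose, start, direction, push_duration)
        log.append({"before": pose, "start": start,
                    "direction": direction, "after": new_pose})
        pose = new_pose  # the next push starts where this one ended
    return log

def toy_push(pose, start, direction, duration, speed=2.0):
    """Toy stand-in for the physics: slide the object along the push
    direction at `speed` cm/s, with no rotation."""
    x, y, heading = pose
    return (x + speed * duration * math.cos(direction),
            y + speed * duration * math.sin(direction),
            heading)
```

Running `collect_pushes(toy_push)` produces 250 before/after pose pairs for one simulated object; the real dataset records the same kind of trajectory data for each of the 250 physical objects.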
“Imagine pushing a table with four legs, where most weight is over one of the legs. When you push the table, you see that it rotates on the heavy leg and have to readjust. Understanding that mass distribution, and its effect on the outcome of a push, is something robots can learn with this set of objects,” says MIT mechanical engineering professor Alberto Rodriguez.
To test the dataset, the researchers used 150 Omnipush objects to train a model that was asked to predict the final pose of a pushed object given only its initial pose and a description of the push. The team says that they “trained the model on 150 Omnipush objects, and tested it on a held-out portion of objects. Results showed that the Omnipush-trained model was twice as accurate as models trained on a few similar datasets.”
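The researchers’ model is a learned predictor; as a minimal pure-Python stand-in (the record format and the nearest-neighbour approach are illustrative assumptions, not the paper’s method), the same input/output contract can be shown with a lookup over previously recorded pushes:

```python
import math

def predict_final_pose(records, initial_pose, push):
    """Toy stand-in for the trained model: find the most similar
    recorded push and apply its observed displacement to the new
    initial pose.  Each record is a hypothetical dict of the form
    {"push": ((x, y), direction), "delta": (dx, dy, dtheta)}.
    """
    def push_distance(p, q):
        # Compare pushes by contact point and direction.
        (px, py), pdir = p
        (qx, qy), qdir = q
        return math.hypot(px - qx, py - qy) + abs(pdir - qdir)

    nearest = min(records, key=lambda rec: push_distance(rec["push"], push))
    dx, dy, dtheta = nearest["delta"]
    x, y, theta = initial_pose
    return (x + dx, y + dy, theta + dtheta)
```

The point of the sketch is the contract, not the method: the model maps (initial pose, push description) to a predicted final pose, and accuracy is measured by how close that prediction lands to the pose the motion-capture system actually observed.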
“The robot is asking, ‘If I do this action, where will the object be in this frame?’ Then, it selects the action that maximizes the likelihood of getting the object in the position it wants,” Bauza says. “It decides how to move objects by first imagining how the pixels in the image will change after a push.”
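The selection step Bauza describes, “imagining” each candidate action with the forward model and picking the best one, can be sketched as a greedy search. Here `predict` is a hypothetical forward model of the kind the dataset is used to train:

```python
import math

def choose_push(predict, pose, goal_xy, candidate_pushes):
    """Greedy action selection: simulate each candidate push with the
    forward model, then pick the one whose predicted final pose lands
    closest to the goal.  `predict(pose, push)` is a hypothetical
    forward model returning an (x, y, theta) final pose.
    """
    def distance_to_goal(push):
        x, y, _ = predict(pose, push)
        return math.hypot(x - goal_xy[0], y - goal_xy[1])

    return min(candidate_pushes, key=distance_to_goal)
```

In practice the robot would repeat this select-push-observe loop, re-planning after each push, which is why an accurate forward model trained on diverse data matters so much.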
As datasets grow larger, autonomous robots will become more precise, more efficient, and generally better at the tasks they are instructed to perform. This research shows exactly why generating large datasets is important: with just two weeks of testing, this team more than doubled the accuracy previously achieved. Imagine what will be possible when datasets containing millions of variable combinations are created. Plus, with “pushy” datasets like this, the robotic cats of the future will be just as annoying as the real cats of today.