I’m willing to bet that many of you reading this article have at least attempted to solve a Rubik’s Cube, and that many of you have succeeded. But have you tried it with only one hand, and did you do it in 20 minutes or less? A team of researchers at OpenAI, a San Francisco-based research lab, has developed a one-handed robot that can do just that. I know what you are thinking right now: robots have been able to solve Rubik’s cubes for years. This particular robot, however, was not built with solving those colorful puzzles in mind, but instead was built to further research into how robots manipulate objects.
“Solving a Rubik’s Cube requires unprecedented dexterity and the ability to execute flawlessly or recover from mistakes successfully for a long period of time,” the team wrote. “Even for humans, solving a Rubik’s Cube one-handed is no simple task — there are 43,252,003,274,489,856,000 ways to scramble a Rubik’s Cube.”
In a recent announcement, the team details how they used a pair of neural networks to train a human-like robotic hand to solve a Rubik’s Cube without any human input or special devices designed to grip the cube. The robot uses nothing but four fingers and a thumb to flip and twist the cube until each side consists of a single color. If you think this is an easy thing to do, I implore you to grab a Rubik’s Cube and just try twisting one layer using only the fingers and thumb that are attached to the palm the cube is resting on. It’s no easy task.
“We’ve trained a pair of neural networks to solve the Rubik’s Cube with a human-like robot hand. The neural networks are trained entirely in simulation, using the same reinforcement learning code as OpenAI Five paired with a new technique called Automatic Domain Randomization (ADR),” the team wrote. “The system can handle situations it never saw during training, such as being prodded by a stuffed giraffe. This shows that reinforcement learning isn’t just a tool for virtual tasks, but can solve physical-world problems requiring unprecedented dexterity.”
The researchers first trained the neural networks to solve the Rubik’s Cube in simulation, using reinforcement learning for control and Kociemba’s algorithm for picking the solution steps. Using domain randomization, they were able to transfer those networks directly to the robot, but that was not enough. Variables like friction, elasticity, and dynamics are very hard to measure and recreate in a simulation, which forced the team to create a new method called Automatic Domain Randomization (ADR). This method allows the team to endlessly generate increasingly difficult simulated environments to train in.
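To get a feel for what plain domain randomization looks like before we get to ADR, here is a minimal Python sketch: each training episode samples physics parameters from fixed, hand-chosen ranges so the policy never overfits to a single simulated world. The parameter names and ranges below are my own illustrative assumptions, not OpenAI’s actual values.

```python
import random

# Fixed, hand-chosen randomization ranges for hard-to-measure physics
# parameters (illustrative values only).
FIXED_RANGES = {
    "friction":   (0.5, 1.5),
    "elasticity": (0.0, 0.3),
    "cube_mass":  (0.07, 0.11),  # kg
}

def randomized_episode_params():
    """Sample one simulated environment's physics for a training episode."""
    return {name: random.uniform(lo, hi)
            for name, (lo, hi) in FIXED_RANGES.items()}

# Every episode sees a slightly different world, so the trained policy
# must work across the whole range rather than one exact simulation.
params = randomized_episode_params()
print(params)
```

The catch, as the team found, is that these ranges must be chosen by hand up front, and picking them well for something as delicate as cube manipulation is hard — which is what motivates ADR.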
“ADR starts with a single, nonrandomized environment, wherein a neural network learns to solve Rubik’s Cube. As the neural network gets better at the task and reaches a performance threshold, the amount of domain randomization is increased automatically,” the team wrote. “This makes the task harder since the neural network must now learn to generalize to more randomized environments. The network keeps learning until it again exceeds the performance threshold when more randomization kicks in, and the process is repeated.”
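The loop the team describes can be sketched in a few lines of Python. Everything here — the performance threshold, the expansion step, and the stand-in evaluation function — is an illustrative assumption rather than OpenAI’s actual implementation; the point is only to show how the randomization ranges grow automatically from a single fixed environment.

```python
import random

random.seed(0)

PERFORMANCE_THRESHOLD = 0.8  # expand randomization once policy beats this
EXPANSION_STEP = 0.05        # how much to widen each range per expansion

# ADR starts with zero randomization: every range is a single point.
ranges = {
    "friction":  [1.0, 1.0],
    "cube_mass": [0.09, 0.09],  # kg
}

def sample_environment():
    """Draw one simulated environment from the current randomization ranges."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in ranges.items()}

def expand_ranges():
    """Widen every parameter range, making the task distribution harder."""
    for bounds in ranges.values():
        bounds[0] = max(0.0, bounds[0] - EXPANSION_STEP)
        bounds[1] += EXPANSION_STEP

def evaluate_policy(env):
    """Stand-in for a real policy rollout; returns a success score in [0, 1]."""
    return random.random()  # a real system would measure solve success here

# ADR loop: train/evaluate on sampled environments, widen ranges on success.
for step in range(1000):
    env = sample_environment()
    if evaluate_policy(env) > PERFORMANCE_THRESHOLD:
        expand_ranges()

print(ranges)  # the ranges have grown well beyond their starting points
```

The design choice worth noticing is that difficulty is driven by the network’s own performance: the curriculum widens only when the policy has earned it, so training never stalls on environments it cannot yet handle.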
Using ADR, the team was able to train the neural networks in simulation, then transfer that training to the real robotic hand. This works because ADR exposes the neural networks to an endless, ever more difficult variety of simulations, and the experience gathered helps the robot quickly identify and adjust to whatever pattern is currently displayed on the Rubik’s Cube. The team took things a step further and physically interrupted the robotic hand while it was attempting to solve the puzzle. In one experiment, a blanket was tossed over the hand as it manipulated the cube. Another experiment used a plush giraffe to try to push the cube out of the hand. The team also did things like strapping two fingers together to simulate tired fingers, and they even placed a rubber glove over the hand. The results were quite favorable too, with the ADR-trained networks able to adapt to the perturbations almost instantly. The system is not perfect just yet, though, with the team admitting that the hand successfully solves the cube just one-fifth of the time when it is scrambled to the maximum difficulty.
“Solving the Rubik’s Cube with a robot hand is still not easy. Our method currently solves the Rubik’s Cube 20% of the time when applying a maximally difficult scramble that requires 26 face rotations. For simpler scrambles that require 15 rotations to undo, the success rate is 60%,” the team wrote. “When the Rubik’s Cube is dropped or a timeout is reached, we consider the attempt failed. However, our network is capable of solving the Rubik’s Cube from any initial condition. So if the cube is dropped, it is possible to put it back into the hand and continue solving.”
Ultimately, the ability to solve a fun puzzle with such dexterity represents a major breakthrough in AI, machine learning, and robotics. As I talked about in a previous article, the ability to gather massive datasets and use that data to manipulate the physical world through robots is becoming increasingly important. With this new ADR method of ever-increasing difficulty, it is hard to imagine a limit to what robots and other technology could be trained to do. Who knows, one day we might have fully autonomous robot doctors who perform highly complex brain surgery, and thanks to their precision and almost instantaneous ability to adapt and overcome, brain surgery could become an outpatient procedure. The future is bright, and I for one am excited to see what comes next!