Visual 'servoing' provides intuitive control of robots
26 September 2012
Using a novel method of integrating video technology and familiar control devices, a research team has developed a technique to simplify remote control of robotic devices.
The researchers – from Georgia Tech and the Georgia Tech Research Institute (GTRI) – aimed to enhance a human operator's ability to perform precise tasks using a multi-jointed robotic device such as an articulated mechanical arm. The new approach has been shown to be easier and faster than older methods, especially when the operator controls the robot while watching it on a video monitor.
Dubbed 'Uncalibrated Visual Servoing for Intuitive Human Guidance of Robots', the new method uses a special implementation of an existing vision-guided control method called visual 'servoing' (VS).
"Our approach exploits 3D video technology to let an operator guide a robotic device in ways that are more natural and time-saving, yet are still very precise," said Ai-Ping Hu, a GTRI senior research engineer who is leading the effort. "This capability could have numerous applications – especially in situations where directly observing the robot's operation is hazardous or not possible, including bomb disposal, handling of hazardous materials and search-and-rescue missions."
For decades articulated robots have been used by industry to perform precision tasks such as welding vehicle seams or assembling electronics, Hu explained. The user develops a software program that enables the device to cycle through the required series of motions, using feedback from sensors built into the robot.
But such programming can be complex and time-consuming. The robot must typically be manoeuvred joint by joint through the numerous actions required to complete a task. Moreover, such technology works only in a structured and unchanging environment, such as a factory assembly line, where spatial relationships are constant.
The human operator
In recent years, new techniques have enabled human operators to freely guide remote robots through unstructured and unfamiliar environments, to perform such challenging tasks as bomb disposal, Hu said. Operators have controlled the device in one of two ways: by "line of sight" – direct user observation – or by means of a conventional, two-dimensional camera that is mounted on the robot to send back an image of both the robot and its target.
But humans guiding robots using either method face the same complexities that challenge those who programme industrial robots, he added. Manipulating a remote robot into place is generally slow and laborious.
That's especially true when the operator must depend on the imprecise images provided by 2D video feedback. Manipulating separate controls for each of the robot's multiple joint axes, users have only limited visual information to help them and must manoeuvre to the target by trial and error.
"Essentially, the user is trying to visualise and reconstruct a 3D scenario from flat 2D camera images," Hu said. "The process can become particularly confusing when operators are facing in a different direction from the robot and must mentally re-orientate themselves to try to distinguish right from left. It's somewhat similar to backing up a vehicle with an attached trailer – you have to turn the steering wheel to the left to get the trailer to move right, which is decidedly non-intuitive."
The VS advantage
VS has been studied for years as a way to use video cameras to help robots re-orient themselves within a structured environment such as an assembly line.
Traditional VS is calibrated, meaning that position information generated by a video camera can be transformed into data meaningful to the robot. Using these data, the robot can adjust itself to stay in a correct spatial relationship with target objects.
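As a rough illustration of what "calibrated" means here, the sketch below maps a point seen by a camera into the robot's own coordinate frame and works out how far the robot would need to shift to stay aligned with a target. It is a minimal example under stated assumptions, not the researchers' implementation: the function names, the single-axis rotation and all of the numbers are hypothetical.

```python
import math

def make_transform(yaw_rad, tx, ty, tz):
    """Build a 4x4 homogeneous transform: rotation about z, then translation.
    (Hypothetical calibration result: camera pose expressed in the robot frame.)"""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]

def camera_to_robot(T_robot_camera, p_camera):
    """Map a 3D point from the camera frame into the robot base frame."""
    x, y, z = p_camera
    h = (x, y, z, 1.0)  # homogeneous coordinates
    return tuple(
        sum(T_robot_camera[r][c] * h[c] for c in range(4)) for r in range(3)
    )

def correction(T_robot_camera, expected_robot, observed_camera):
    """Offset (robot frame) between where the target is and where it should be."""
    observed_robot = camera_to_robot(T_robot_camera, observed_camera)
    return tuple(o - e for o, e in zip(observed_robot, expected_robot))

# Assumed calibration: camera rotated 90 degrees about z, offset from the base.
T = make_transform(math.pi / 2, 1.0, 0.0, 0.5)
# The target was expected at (1.0, 0.2, 0.5) m but the camera now sees it
# 3 mm further along its own x-axis.
offset = correction(T, (1.0, 0.2, 0.5), (0.203, 0.0, 0.0))
print(offset)
```

Without the calibration transform `T`, the camera's measurements could not be converted into robot coordinates in this way, which is the gap that uncalibrated methods have to close by other means.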
"Say a conveyor line is accidently moved a few millimeters," Hu said. "A robot with a calibrated VS capability can automatically detect the movement using the video image and a fixed reference point, and then readjust to compensate."
But VS offers additional possibilities. The research team has adapted the technology in ways that facilitate human control of remote robots. The new technique combines calibrated and uncalibrated approaches. A calibrated 3D "time of flight" camera is mounted on the robot – typically at the end of a robotic arm, in an end-effector (an eye-in-hand system).
The camera utilises an active sensor that detects depth data, allowing it to send back 3D coordinates that pinpoint the end-effector's spatial location. At the same time, the eye-in-hand camera also supplies a standard, uncalibrated 2D greyscale video image to the operator's monitor.
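To show how a depth reading can become a 3D coordinate, here is a minimal sketch based on the standard pinhole camera model. It assumes the depth value is measured along the camera's optical axis and that the camera intrinsics (`fx`, `fy`, `cx`, `cy`) are known from calibration; the function name and all values are hypothetical, and a real time-of-flight sensor may report radial distance instead, which would need an extra conversion.

```python
def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Convert pixel (u, v) with depth in metres to a 3D point in the camera frame.

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    Assumes depth_m is the distance along the optical (z) axis.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

# Hypothetical intrinsics for a 640x480 sensor.
point = backproject(420, 240, 2.0, fx=200.0, fy=200.0, cx=320.0, cy=240.0)
print(point)  # (1.0, 0.0, 2.0)
```

A pixel at the principal point maps to a point straight ahead of the camera, and pixels further from the centre map to points proportionally further off-axis at the same depth.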
The result is that the operator, without seeing the robot, now has a robot's-eye view of the target. Watching this image on a monitor, the operator can visually guide the robot using a gamepad.
In addition, VS technology now automatically actuates all the joints needed to complete whatever action the user indicates on the gamepad – rather than the user having to manipulate those joints one by one. In the background, the Georgia Tech system performs the complex computation needed to coordinate the monitor image, the 3D camera information, the robot's spatial position and the user's gamepad commands.
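One common way to turn a single direction command into coordinated motion of several joints is resolved-rate control, sketched below for a planar two-link arm. This is an illustration only and not the Georgia Tech system: it assumes a known kinematic model with unit link lengths, whereas an uncalibrated approach would have to estimate the equivalent mapping from image data. The function names and the damping value are hypothetical.

```python
import math

def jacobian(q1, q2, l1=1.0, l2=1.0):
    """Jacobian of a planar two-link arm's tip position w.r.t. its joint angles."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]

def joint_velocities(q1, q2, vx, vy, damping=1e-3):
    """Damped least squares: qdot = J^T (J J^T + damping^2 I)^-1 v.

    (vx, vy) is the tip velocity requested by the operator, e.g. from a
    gamepad stick. The damping term keeps the solve well-conditioned
    near singular poses such as a fully stretched arm.
    """
    (a, b), (c, d) = jacobian(q1, q2)
    m00 = a * a + b * b + damping ** 2
    m01 = a * c + b * d
    m11 = c * c + d * d + damping ** 2
    det = m00 * m11 - m01 * m01
    w0 = (m11 * vx - m01 * vy) / det
    w1 = (-m01 * vx + m00 * vy) / det
    return (a * w0 + c * w1, b * w0 + d * w1)

# Elbow bent 90 degrees; operator asks the tip to move at 0.1 m/s along +y.
qdot = joint_velocities(0.0, math.pi / 2, 0.0, 0.1)
print(qdot)  # approximately (0.1, -0.1) rad/s
```

The operator supplies only the desired tip motion; both joint rates come out of one solve, which is the kind of coordination that spares the user from driving each joint separately.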
The research team's plans include testing a mobile platform with a VS-guided robotic arm mounted on it. Also underway is a proof-of-concept effort that incorporates VS control into a low-cost, consumer-level robot.
The team's ultimate goal is to develop a generic, uncalibrated control framework that is able to use image data to guide many different kinds of robots.