Perception of a cooperative humanoid service robot when executing skilled tasks

Research plan

Sami Terho
Automation Technology Laboratory
Helsinki University of Technology

1. Background of research

Service robots of the future are expected to cope with skilled tasks. In contrast to traditional industrial robots, they need to execute tasks that are not predefined series of movements; instead, the execution is adapted to the situation at hand. Typical skilled tasks include manipulating objects that are not known in advance and moving in an unknown environment. A common factor in these tasks is perception of the environment with the robot's sensors. Perception can be divided into two more or less independent areas: perception for manipulation or work execution, and perception for navigation. Manipulating objects requires accurately classified data on objects at close range. Navigation also requires information on objects farther away, but the data does not need to be as accurate as in manipulation.

2. Goal of research

The research aims to find out what kinds of sensors and sensor data processing a service robot needs in order to plan task execution, manipulate objects, and plan its movements effectively. The sensor data is processed so that higher-level algorithms can use the information as effectively as possible. The hardware and algorithms resulting from the research are intended to be suitable for as many robotic applications as possible.

3. Areas of research

Robots often use laser scanners and cameras to perceive their environment. These two modalities are used most effectively when the information from both is combined.

3.1 Perception for manipulation

Manipulation traditionally requires accurate information about the pose and geometry of objects. This information is used for planning and executing the manipulation actions. In practice, however, the need for accurate information is a considerable limitation: accurate sensors can be very expensive and difficult to integrate into the system. The need for accurate information can be reduced by smarter perception planning and data fusion. Perception should not be passive; the robot should actively use its mobility to find the best ways to perceive the environment and objects.

Based on the acquired data, objects can be classified at a higher level, for example into larger and smaller boxes. Based on this information the robot plans its actions: it chooses the objects to be manipulated and executes preparatory actions such as approaching the object. The actual manipulation might require more information about the object. For example, carrying an object requires identifying suitable spots for gripping.

3.2 Perception for navigation

Navigation requires information about obstacles and about the shape and type of the terrain. This information can be used to localize the robot and to plan collision-free trajectories. Knowledge of the terrain shape can be used when choosing between different locomotion methods: when climbing a steep slope or moving over uneven terrain, the robot might need a different locomotion method than when moving on a flat surface.
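The choice of moving method from terrain shape can be illustrated with a minimal Python sketch. It assumes a one-dimensional height profile sampled at equal spacing along the planned path. The mode names ("drive", "walk", "climb") and the threshold values are hypothetical and are not taken from WorkPartner.

```python
import math

def choose_moving_method(heights, cell_size,
                         max_slope_deg=15.0, max_unevenness=0.02):
    """Pick a moving method from a height profile (metres) along the path.

    The mode names and default thresholds are illustrative assumptions.
    """
    # Height change between consecutive samples of the profile.
    steps = [b - a for a, b in zip(heights, heights[1:])]
    if not steps:
        return "drive"  # a single sample gives no evidence of slope

    # Steepest local slope, in degrees.
    steepest = max(math.degrees(math.atan2(abs(s), cell_size)) for s in steps)

    # Unevenness: spread of the height steps. A constant ramp has zero spread.
    mean_step = sum(steps) / len(steps)
    unevenness = math.sqrt(sum((s - mean_step) ** 2 for s in steps) / len(steps))

    if steepest > max_slope_deg:
        return "climb"
    if unevenness > max_unevenness:
        return "walk"
    return "drive"
```

A real system would work on a two-dimensional terrain map and would also take the terrain type into account; the sketch only shows how slope and unevenness can lead to different decisions.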

3.3 Perception planning

It is often infeasible, or even impossible, to acquire all the required information with static sensors. A mobile robot can observe a target from different angles by moving its body and its individual degrees of freedom. This kind of information acquisition requires planning of perception activities, which in turn needs a model of the perceived environment. This model includes all the data already acquired and a map of the areas that have not yet been perceived or whose information is obsolete.
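One possible form for such a model is a grid that stores the time of the last observation of each cell. The Python sketch below is only an illustration under that assumption; the class name, the grid representation and the age limit are hypothetical.

```python
class PerceptionMap:
    """Grid of last-observation times; None marks a cell never perceived."""

    def __init__(self, rows, cols, max_age):
        self.max_age = max_age  # seconds after which data counts as obsolete
        self.last_seen = [[None] * cols for _ in range(rows)]

    def observe(self, cells, now):
        """Record that the given (row, col) cells were perceived at time now."""
        for row, col in cells:
            self.last_seen[row][col] = now

    def needs_perception(self, now):
        """Return cells that are unperceived or obsolete, in row-major order."""
        targets = []
        for row, times in enumerate(self.last_seen):
            for col, seen in enumerate(times):
                if seen is None or now - seen > self.max_age:
                    targets.append((row, col))
        return targets
```

A perception planner could use the list returned by needs_perception to select the next viewpoint, for example the one that covers the most listed cells.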

3.4 Image processing

Image processing can be used to determine the color and approximate geometry of an object. If the geometry is known, a model of the object can be fitted to the image, which also yields the pose of the object. The camera is used primarily to identify objects, but with some limitations it can also be used to acquire parameters required for manipulation.

3.5 Signs and marks

The execution of many tasks becomes easier if human sensing and reasoning can be used as part of the system. In cooperative robotics this can be done, for example, by using signs and marks to guide the robot. If a marker is attached to an object, the robot does not necessarily need to recognize the object from its features. Unlike in human-to-human communication, barcodes and similar symbols are easier for a robot to recognize than text composed of letters. Signs can also be used at a higher level, in the manner of traffic signs, to guide the robot along the right path. Additionally, commands can be given using signs.
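Once a marker has been decoded to an identifier, interpreting it can be as simple as a table lookup. The Python sketch below assumes that decoding has already been done by some image processing step; the identifiers and their meanings are invented for illustration.

```python
# Hypothetical marker table: the IDs and meanings are illustrative only.
MARKERS = {
    17: ("object", "small box"),   # marker attached to an object
    42: ("path", "turn left"),     # traffic-sign-like guidance
    99: ("command", "stop"),       # command given with a sign
}

def interpret_marker(marker_id):
    """Return (kind, meaning) for a decoded marker ID, or None if unknown."""
    return MARKERS.get(marker_id)
```

The three kinds of entry correspond to the three uses described above: identifying objects, guiding the robot along a path, and giving commands.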

3.6 Processing three-dimensional data

The data acquired with three-dimensional sensors contains much information about the shapes and positions of objects. Targets of interest can be extracted from this data with model-fitting algorithms and by separating objects at different distances. Three-dimensional data often contains a significant amount of noise, because the sensors often include mechanically moving parts, and the inaccuracies in these degrees of freedom accumulate in the data. For this reason the algorithms have to be very robust.
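Separating objects at different distances can be sketched for a single scan line as follows. The Python example splits the line wherever the measured range jumps by more than a threshold and discards very short segments as likely noise. The function name and the threshold values are assumptions chosen for illustration, not parameters of the actual system.

```python
def segment_by_range(ranges, max_jump=0.3, min_points=3):
    """Split one scan line into segments of neighbouring, similar ranges.

    ranges     -- measured distances in scan order (metres)
    max_jump   -- range difference that starts a new segment (assumed value)
    min_points -- shorter segments are dropped as noise (assumed value)
    Returns a list of segments, each a list of indices into ranges.
    """
    if not ranges:
        return []

    segments = []
    current = [0]
    for i in range(1, len(ranges)):
        if abs(ranges[i] - ranges[i - 1]) > max_jump:
            # Range discontinuity: close the current segment.
            if len(current) >= min_points:
                segments.append(current)
            current = [i]
        else:
            current.append(i)

    if len(current) >= min_points:
        segments.append(current)
    return segments
```

For example, a scan line containing two surfaces at about 2 m and 1 m, separated by a single spurious reading, yields two segments, and the isolated reading is dropped. Model fitting could then be applied to each segment separately.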

3.7 Combining different sources of information

The greatest advantage of two- and three-dimensional data comes when they are used to support each other. Acquisition of three-dimensional data can be focused on the right area once the direction of the target has been determined with a camera. Conversely, camera-based recognition is easier if the position of the target can be determined from the three-dimensional data. Signs can be recognized with a camera, while three-dimensional data can be used to locate them. For example, the robot can be guided to a certain area based on this information.
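Focusing the three-dimensional acquisition with a camera can be sketched with the standard pinhole camera model. The Python example below converts the pixel position of a detected target into angles off the optical axis and derives a scan window around that direction. It assumes an undistorted image, known intrinsic parameters (fx, fy, cx, cy), and a scanner that shares its origin and axes with the camera; in a real system a calibrated transformation between the two sensors would be needed. The function names are hypothetical.

```python
import math

def pixel_to_bearing(u, v, fx, fy, cx, cy):
    """Angles (radians) of pixel (u, v) off the optical axis, pinhole model.

    Returns (azimuth, elevation): positive azimuth is to the right,
    positive elevation is up (image v grows downward).
    """
    azimuth = math.atan2(u - cx, fx)
    elevation = math.atan2(cy - v, fy)
    return azimuth, elevation

def scan_window(angle, half_width):
    """Angular interval (min, max) centred on the target direction."""
    return angle - half_width, angle + half_width
```

For instance, a target detected at the image center gives zero azimuth and elevation, so the scanner would sweep only a narrow window around its forward direction instead of the whole field of view.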

4. Implementation of the research

The primary research platform is the WorkPartner service robot developed at the Automation Technology Laboratory of Helsinki University of Technology. The methods developed will later be applied to other robots as well. Three-dimensional data is also acquired with a Riegl-type laser scanner located at the laboratory. This data is used to validate the data acquired and processed with other methods. The methods are developed mostly in Matlab. The real-time implementation on WorkPartner's perception computer is written in C++.

The case examples used in the tests are picking up small unknown objects, such as garbage, from the ground and moving a pile of boxes from one place to another. These tasks will use all the perception methods treated in this research.