Model-based motor control learning in robotics

Robot motor control learning is currently a very active research area in robotics. The challenge consists in designing an efficient heuristic to guide a search process that attempts to find an appropriate motor control policy capable of accomplishing a given task. Implicitly or explicitly, the search is an optimization process over some specific objective function. In practice, this means that the robot has to evaluate candidate solutions in the real environment. Each evaluation is a trial-and-error execution that yields information to feed the heuristic, which then decides which candidate solution to evaluate in the next execution. Many learning techniques have been developed over the past decades. A very general classification divides these methods into two groups: global learning and local learning methods. The state of the art in local learning techniques is well developed: there are hundreds of examples where complex learning tasks are achieved using these methods. Nevertheless, they all constrain the search space to make the learning process feasible, so the result is always a local optimum of the objective function. Global learning techniques, on the other hand, are oriented towards finding globally optimal solutions. This is generally a much harder problem, as the search space is larger. Research on global learning methods is also extensive. However, there is a lack of methods directly applicable to complex robotic systems without assuming prior knowledge about the task. This is mainly due to two facts: first, these methods scale badly to problems with continuous, high-dimensional state-action spaces; second, they need too many real robot-environment interactions. Whatever the performance of these methods, it seems clear that global methods based on model-based learning have clear advantages over model-free methods, especially regarding the minimization of the number of trial-and-error executions.
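The trial-and-error search described above can be sketched as a simple optimization loop. The following is an illustrative minimal random-search baseline, not the method developed in this thesis; the `simulate` interface, the Gaussian perturbation scale, and the greedy acceptance rule are all assumptions made for the sake of the example:

```python
import random

def evaluate(policy, simulate):
    """One trial-and-error execution: run the policy and score it."""
    return simulate(policy)

def random_search(simulate, dim, n_trials=50, seed=0):
    """Greedy random search over policy parameters: each iteration
    proposes a perturbed candidate, spends one execution evaluating
    it, and keeps it only if it improves on the best found so far."""
    rng = random.Random(seed)
    best = [0.0] * dim
    best_score = evaluate(best, simulate)
    for _ in range(n_trials):
        # heuristic: perturb the current best candidate
        candidate = [p + rng.gauss(0.0, 0.1) for p in best]
        score = evaluate(candidate, simulate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

Each call to `evaluate` stands in for one real robot execution, which is precisely the resource that model-based methods try to spend sparingly.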
A complete study of the most relevant work on motor control learning is presented in Chapter 2, where the advantages and disadvantages of the different existing methods are discussed. Nevertheless, this thesis focuses on overcoming the limitations of current global learning methods, concretely those following a model-based learning philosophy. In our opinion, they have more advantages; this claim is justified throughout that chapter. As already mentioned, every learning problem is in fact an optimization process. Learning methods based on an explicit optimization process have several advantages when the complete search space can be represented by control policy functions depending on a reduced set of parameters. This is always the case when taking advantage of existing controllers from the literature. For these cases, a new learning method consisting of an optimization process using surrogate models is proposed in Chapter 3. Some learning problems are very complex because no controller can a priori solve the target task. In these cases it is very hard to parametrize any control policy, and it therefore becomes necessary to search directly for a sequence of low-level motor control commands. Accomplishing this is very difficult when seeking optimal solutions. If optimality is not crucial, it is possible to take advantage of kinodynamic planning algorithms to solve the problem. However, planners need models, and the models are not known; otherwise it would not be a learning problem. Nevertheless, this research has found a way to combine kinodynamic planning with model learning into an efficient new solution for learning problems from scratch, that is, without assuming prior knowledge. This finding is described in Chapter 4 and the first part of Chapter 5. Despite the effectiveness of this solution, it still lacks the ideal behaviour for application in real environments, as the provided control policies are open-loop.
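The surrogate-model idea mentioned above can be sketched in a few lines: fit a cheap model to the expensive evaluations gathered so far, optimize the model instead of the real system, and spend the next real trial on the model's minimizer. This is only an illustrative one-dimensional sketch with an assumed quadratic surrogate and grid search, not the actual method of Chapter 3:

```python
import random

def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal
    equations, solved with Cramer's rule (only 3 unknowns)."""
    n = len(xs)
    s1 = sum(xs); s2 = sum(x * x for x in xs)
    s3 = sum(x ** 3 for x in xs); s4 = sum(x ** 4 for x in xs)
    t0 = sum(ys)
    t1 = sum(x * y for x, y in zip(xs, ys))
    t2 = sum(x * x * y for x, y in zip(xs, ys))
    M = [[s4, s3, s2], [s3, s2, s1], [s2, s1, n]]
    v = [t2, t1, t0]
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
              - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
              + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(M)
    coeffs = []
    for j in range(3):
        Mj = [row[:] for row in M]
        for i in range(3):
            Mj[i][j] = v[i]
        coeffs.append(det(Mj) / d)
    return coeffs  # [a, b, c]

def surrogate_optimize(real_eval, bounds, n_init=5, n_iter=10, seed=0):
    """Each iteration refits the cheap surrogate, minimizes it on a
    dense grid, and spends exactly one expensive real evaluation on
    the surrogate's minimizer."""
    rng = random.Random(seed)
    lo, hi = bounds
    X = [rng.uniform(lo, hi) for _ in range(n_init)]
    Y = [real_eval(x) for x in X]                 # expensive robot trials
    grid = [lo + (hi - lo) * i / 200 for i in range(201)]
    for _ in range(n_iter):
        a, b, c = fit_quadratic(X, Y)             # cheap surrogate model
        x_next = min(grid, key=lambda x: a * x * x + b * x + c)
        X.append(x_next)
        Y.append(real_eval(x_next))               # one more real execution
    i = min(range(len(Y)), key=Y.__getitem__)
    return X[i], Y[i]
```

The key property is that arbitrarily many surrogate queries cost nothing, while the counter of real executions grows by exactly one per iteration.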
Nonetheless, it is possible to combine the output of this algorithm with the potential of model predictive control, which is done in the second part of that chapter. The result is a complete architecture capable of providing robust closed-loop policies that operate efficiently in real environments.
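The closed-loop principle can be illustrated with a minimal random-shooting model-predictive-control step over a learned forward model. The one-dimensional model, the horizon, and the sampling scheme below are illustrative assumptions, not the architecture developed in Chapter 5:

```python
import random

def mpc_step(model, state, goal, horizon=10, n_samples=100, rng=None):
    """One closed-loop MPC step: roll candidate action sequences
    through the learned forward model, score each by final distance
    to the goal, and return only the FIRST action of the best
    sequence. Re-planning from the measured state happens every
    step, which is what makes the policy closed-loop."""
    rng = rng or random.Random(0)
    best_cost, best_first = float("inf"), 0.0
    for _ in range(n_samples):
        seq = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s = state
        for a in seq:
            s = model(s, a)            # learned forward model, not the real robot
        cost = abs(s - goal)
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first
```

An open-loop policy would execute a whole precomputed sequence blindly; here, only the first action is applied before the optimization is repeated from the newly observed state, so model errors and disturbances are corrected at every step.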
Robotics, Robot motor control learning