LOCEN Research Focus: Robotic models of multiple skill learning driven by intrinsic motivations

Synopsis

 

Authors: Vieri Santucci, Marco Mirolli, Gianluca Baldassarre

Topic and its relevance. Developing robots able to autonomously discover, select, and solve multiple new tasks in a cumulative, open-ended fashion is an important issue for autonomous robotics. It becomes even more crucial if we want to build robots capable of solving multiple problems in real environments that pose challenges unknown at design time. Two key `ingredients' are necessary to build this kind of robot. The first is intrinsic motivations (IMs): these can drive the autonomous learning of robots in an open-ended fashion in the absence of tasks assigned by users. The second is hierarchical architectures: these are needed to store multiple skills, drive their acquisition with IMs, learn the goals related to skills, and form complex skills from simpler ones.

Questions and goals. What are the hierarchical robot architectures that can support an open-ended acquisition of multiple skills at multiple levels of granularity? What are the IM systems that can guide the acquisition of multiple skills? In particular, what roles can novelty-based, prediction-based, and competence-based IM mechanisms play in such acquisition? How can IMs support the self-generation of goals and manage the focussing of learning resources on skills depending on their rate of acquisition?

Methods. We build robotic architectures encompassing IMs and a hierarchical organisation of the acquired skills. The models usually employ novelty-based IMs to focus learning, competence-based IMs to decide on which skill to focus learning resources, and prediction-based IMs to self-generate goals (the latter is under exploration). In terms of learning, the systems are mainly based on reinforcement learning (e.g., TD-learning, Q-learning), implemented in particular with modular architectures, and sometimes on unsupervised learning (e.g., self-organising maps).
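As a rough illustration of how a competence-based IM might decide which skill receives learning resources, the following Python sketch keeps a running competence estimate per skill and prefers the skills whose competence is currently improving. It is a minimal sketch under stated assumptions, not the architecture used in the models: the class name, the leaky-average update, the softmax selection rule, and all parameter values are hypothetical choices made for the example.

```python
import math
import random


class CompetenceProgressSelector:
    """Hypothetical selector: practise the skills whose competence is improving."""

    def __init__(self, n_skills, rate=0.1, temperature=0.05):
        self.rate = rate                    # step size of the running averages (assumed)
        self.temperature = temperature      # softmax temperature (assumed)
        self.competence = [0.0] * n_skills  # estimated probability of success per skill
        self.progress = [0.0] * n_skills    # leaky average of competence change per skill

    def update(self, skill, success):
        """Record one attempt at a skill and return the intrinsic signal."""
        old = self.competence[skill]
        self.competence[skill] = old + self.rate * (float(success) - old)
        gain = self.competence[skill] - old  # competence improvement on this attempt
        self.progress[skill] += self.rate * (gain - self.progress[skill])
        return gain

    def probabilities(self):
        """Softmax over progress, shifted by the maximum for numerical stability."""
        top = max(self.progress)
        weights = [math.exp((p - top) / self.temperature) for p in self.progress]
        total = sum(weights)
        return [w / total for w in weights]

    def choose(self, rng=random):
        """Sample the index of the skill to practise next."""
        return rng.choices(range(len(self.progress)), weights=self.probabilities())[0]


if __name__ == "__main__":
    # Toy run: fixed success rates stand in for the outcome of real skill learning.
    rng = random.Random(0)
    success_rate = [0.9, 0.5, 0.0]
    selector = CompetenceProgressSelector(n_skills=3)
    for _ in range(200):
        skill = selector.choose(rng)
        selector.update(skill, rng.random() < success_rate[skill])
    print(selector.competence)
```

In a fuller system each skill would be trained by its own reinforcement-learning module, and the outcome passed to `update` would come from that module's attempt to reach its goal. A skill that is already mastered, or that cannot be learned, yields a progress value near zero and so stops being favoured over skills that are still improving.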

Results. We are proposing increasingly complete architectures to support open-ended autonomous learning of multiple skills. We have compared different types of IM learning signals with the aim of identifying the best ones for deciding on which skill to focus learning. We are starting to tackle the problem of how to use IMs to self-generate goals. We have proposed algorithms that learn to select the best data structure (`expert') to accomplish a given goal.

Conclusions. Learning multiple skills cannot be accomplished with simple algorithms, so choosing which ones to use and how to integrate them in whole functioning architectures is not trivial. Our approach is leading to the development of architectures, encompassing IMs and a hierarchical organisation, that contribute to achieving truly open-ended learning robots, a fundamental milestone for artificial intelligence.

Key references.