LOCEN Research Focus: Ecological Active Vision and Intrinsic Motivations



Authors: Dimitri Ognibene, Valerio Sperati, Rodolfo Marraffa, Gianluca Baldassarre

Topic and its relevance. This project (initially funded by the EU project MindRACES and later by the EU project IM-CLeVeR) concerns LOCEN's approach to vision, called `Ecological Active Vision - EAV'. EAV is grounded in the `active vision' approach, based on an actively moved small fovea plus a low-resolution periphery, augmented with four principles: (a) a strong coupling of bottom-up and top-down attention processes; (b) the use of reinforcement learning to acquire top-down attention skills; (c) the use of attention and vision to support pragmatic action (e.g., reaching and grasping) rather than vision per se, in particular a strong spatial coupling between attention and manipulation actions; (d) the use of a novel Potential Action Memory component to collect information on the best places to visit with the fovea. Lately we have linked EAV with intrinsic motivations (IMs), in particular IMs related to the perception of movement in the world and to agency (i.e., the agent's perception of its capacity to cause movement in the world with its own actions).

Questions and goals. What are the key principles that guide vision in primates, given that they actively look around and increase their own survival chances with pragmatic actions (e.g., reaching, grasping, and eating objects) rather than just orienting (looking around) actions? What is the relation, during development, between the bottom-up `objective' peripheral vision and the top-down goal-directed/learning-based foveal vision? How does reward, following the attainment of resources scattered in the environment, affect learning to move the fovea around? How are these processes affected by intrinsic motivations?

Methods. We build various active-vision system architectures endowed with a bottom-up peripheral component, which drives the fovea to regions of space with high contrast and movement, and a fovea-based reinforcement-learning top-down component, which learns to move the fovea to `interesting' places in the world depending on the agent's task (reward function). Lately we have been building versions of the architecture endowed with intrinsic-motivation mechanisms that reward the agent for causing movement in the environment. The architectures are tested both in simulation and with real robots.
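
As an illustration only, the interplay between the two components might be sketched as follows. This is not the LOCEN architecture: the small grid world, all function names, the running-average value update (a simple stand-in for the reinforcement-learning component), and the movement-based intrinsic reward are assumptions made for the example.

```python
import random


def bottom_up_saliency(prev_frame, frame):
    """Peripheral map: local contrast (distance from mean) plus movement."""
    h, w = len(frame), len(frame[0])
    mean = sum(map(sum, frame)) / (h * w)
    return [[abs(frame[y][x] - mean) + abs(frame[y][x] - prev_frame[y][x])
             for x in range(w)] for y in range(h)]


def choose_fixation(saliency, top_down, weight=1.0, epsilon=0.0, rng=random):
    """Pick the next foveal target from the combined bottom-up/top-down map.

    With probability epsilon a random cell is explored instead.
    """
    h, w = len(saliency), len(saliency[0])
    if rng.random() < epsilon:
        return rng.randrange(h), rng.randrange(w)
    cells = [(y, x) for y in range(h) for x in range(w)]
    return max(cells,
               key=lambda c: saliency[c[0]][c[1]] + weight * top_down[c[0]][c[1]])


def update_top_down(top_down, cell, reward, lr=0.5):
    """Move the learned value of a fixated cell towards the reward obtained."""
    y, x = cell
    top_down[y][x] += lr * (reward - top_down[y][x])


def intrinsic_reward(before, after):
    """Hypothetical IM signal: total change in the scene after an action."""
    return float(sum(abs(a - b)
                     for row_a, row_b in zip(after, before)
                     for a, b in zip(row_a, row_b)))


if __name__ == "__main__":
    # A static scene with one high-contrast distractor in the top-left corner;
    # the rewarding location (assumed for the demo) is the bottom-right corner.
    scene = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    values = [[0.0] * 3 for _ in range(3)]
    saliency = bottom_up_saliency(scene, scene)
    print("before learning:", choose_fixation(saliency, values))
    for _ in range(5):
        update_top_down(values, (2, 2), reward=1.0)
    print("after learning:", choose_fixation(saliency, values))
```

In this toy setting the untrained agent is captured by the salient distractor, while repeated reward at another location gradually lets the learned top-down values outweigh bottom-up saliency. An intrinsically motivated variant could pass the output of `intrinsic_reward` to `update_top_down` in place of an external reward.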

Results. The architectures explain the interplay of bottom-up and top-down attention processes. The architectures also reproduce and explain some empirical results, e.g., some looking behaviours of infants. One architecture is also capable of learning multiple visual skills in a cumulative fashion on the basis of intrinsic motivations.

Conclusions. The results show that the EAV principles are highly innovative and lead to a number of new problems, but also opportunities, for vision research.