
dc.contributor.supervisor  Hemion, Nikolas
dc.contributor.author  Loviken, Pontus
dc.contributor.other  Faculty of Science and Engineering  en_US

How can real robots with many degrees of freedom, without prior knowledge of themselves or their environment, act and use the resulting observations to efficiently develop the ability to generate a wide range of useful behaviours?

This thesis presents a novel framework that enables physical robots with many degrees of freedom to rapidly learn models for control from scratch. This can be done in previously inaccessible problem domains characterised by a lack of direct mappings from motor actions to outcomes, as well as by state and action spaces too large for the full forward dynamics to be learned and used explicitly. The proposed framework copes with these issues by using a set of local Goal Babbling models, each of which maps every outcome in a low-dimensional task space to a specific action, together with a sparse higher-level Reinforcement Learning model that learns to navigate between the contexts in which each Goal Babbling model can be used. The two types of models can then be learned online and in parallel, using only the data a robot can collect by interacting with its environment.
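As an illustration, the core of a single local Goal Babbling model can be sketched as a loop that samples goals in task space, reaches for each goal via the nearest known action plus exploration noise, and stores the resulting outcome-action pair. The following is a minimal sketch in Python with NumPy; the simulated planar arm, the nearest-neighbour inverse model, and all parameter values are illustrative assumptions, not the thesis implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n_joints = 10  # illustrative; the thesis scales to 1,000 joints


def forward(angles):
    # Forward kinematics of a planar arm with unit-length links:
    # end-effector position from the cumulative joint angles.
    cum = np.cumsum(angles)
    return np.array([np.cos(cum).sum(), np.sin(cum).sum()])


# Inverse model: stored (outcome, action) pairs mapping task-space
# points to the joint configuration that produced them.
memory = []
action = rng.uniform(-0.1, 0.1, n_joints)
memory.append((forward(action), action))

for _ in range(2000):
    # Sample a random goal in the low-dimensional task space.
    goal = rng.uniform(-n_joints, n_joints, 2)
    # The nearest stored outcome provides the base action ...
    outcomes = np.array([o for o, _ in memory])
    nearest = np.argmin(np.linalg.norm(outcomes - goal, axis=1))
    base_action = memory[nearest][1]
    # ... which is perturbed with exploration noise ("babbling").
    action = base_action + rng.normal(0.0, 0.05, n_joints)
    memory.append((forward(action), action))


def reach(goal):
    # After training, the memory acts as a local inverse model:
    # return the stored action whose outcome is closest to the goal.
    outcomes = np.array([o for o, _ in memory])
    nearest = np.argmin(np.linalg.norm(outcomes - goal, axis=1))
    return memory[nearest][1]


target = np.array([3.0, 2.0])
err = np.linalg.norm(forward(reach(target)) - target)
```

In the full framework, several such local models would cover different contexts, with the sparse higher-level Reinforcement Learning model learning to transition between them; both levels are trained online from the same stream of interaction data.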

To show the potential of the approach we present two possible implementations of the framework, on two separate robot platforms: a simulated planar arm with up to 1,000 degrees of freedom, and a real humanoid robot with 25 degrees of freedom. The results show that learning is rapid and essentially unaffected by the number of degrees of freedom of the robot, allowing for the generation of complex behaviours and skills after a relatively short training time. The planar arm is able to strategically plan series of motions in order to move its end-effector between any two parts of a crowded environment within 10,000 iterations. The humanoid robot is able to freely transition between states such as lying on the back, belly, and sides, and occasionally also sitting up, within only 1,000 iterations. This corresponds to 30 to 60 minutes of real-world interaction.

The main contribution of this thesis is a framework for solving a control learning problem that was previously largely unexplored and had no obvious solution, but that has strong analogies to, for example, infants' early learning of body-orientation control. The thesis examined two quite different implementations of the proposed framework and demonstrated success in both cases, on two different control learning problems.

dc.publisher  University of Plymouth
dc.rights  CC0 1.0 Universal
dc.subject  model learning  en_US
dc.subject  Reinforcement learning  en_US
dc.subject  Online learning  en_US
dc.subject  Goal babbling  en_US
dc.subject  inverse models  en_US
dc.subject  Micro data learning  en_US
dc.subject  Developmental robotics  en_US
dc.subject  real-world robots  en_US
dc.subject  sensorimotor control  en_US
dc.title  Fast Online Model Learning for Controlling Complex Real-World Robots  en_US
dc.rights.embargoperiod  No embargo  en_US
rioxxterms.funder  Horizon 2020  en_US

