Representations for Policy Learning of Embodied Agents
Abstract
This thesis investigates how embodied autonomous agents can obtain useful representations of their environment and then apply those representations to learn goal-directed behaviours. It explores the potential of using sensory-motor data obtained directly from the agent’s interactions with its environment, without any additional inputs or task-dependent cues, and with a minimal amount of heuristics. The methodology focuses on unsupervised and self-supervised learning techniques for extracting meaningful patterns from sensory data. The thesis proposes several representation learning architectures, the final versions of which combine sensory encoding, sequence modelling and a predictive training objective, and tests them in a simulated mobile platform robot scenario. First, training data is collected as a stream of sensory-motor information while the agent interacts randomly with the environment. Then, this data is used to train a representation learning model. Finally, policy learning performance in spatial learning tasks is used to estimate how informative the representations are. The results demonstrate that it is possible to obtain unsupervised, task-agnostic representations that can be used for policy learning in embodied agents. The resulting policies perform on par with or better than benchmarks in some of the test environments, improving both the performance and the robustness of learning. In particular, the findings highlight the importance of incorporating memory (via sequence modelling) and action information into the representation learning process, as doing so improves task performance compared to using sensory information alone. More generally, the findings show that even models that rely on minimal heuristics and task-specific cues in representation learning can yield meaningful and useful representations.
The results contribute to the field of developmental robotics by providing evidence that an embodied agent can obtain useful representations of its environment through autonomous interaction with it.
Awarding Institution(s)
University of Plymouth
Supervisor
Pablo Borja, Swen Gaudl, Tony Belpaeme
Keywords
Embodied agents, Machine learning, Autonomous agents, Policy learning, Developmental robotics, Goal-directed control
Document Type
Thesis
Publication Date
2025
Embargo Period
2027-01-12
Deposit Date
January 2026
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License
Recommended Citation
Hagen, O. (2025) Representations for Policy Learning of Embodied Agents. Thesis. University of Plymouth. Available at: https://doi.org/10.24382/qkwt-2203
This item is under embargo until 12 January 2027