Abstract
How do we create machines that can capture, record and recall memories of past experience? How should these machines choose the best action based on those stored memories? These seem like crucial questions for creating intelligent machines capable of learning from experience. The field of Artificial Intelligence (AI) is trying to reproduce such capabilities, with increasing success. Currently, a large portion of AI algorithms make decisions based on large sets of past examples, learned as instantaneous input-output mappings. They operate as discrete models in which time is collapsed into independent signal samples. Yet the dimension of time is the most fundamental source of perception: the presence or absence of a signal is visible only through its changes over time. This work combines features of established neural network algorithms to create a new approach to the processing of temporal signals. In both the agent-environment division of reinforcement learning (RL) and the controller-process division of control theory, there is always an intelligent part (the agent or controller) that tries to control a passive, mostly deterministic part (the environment or process). As the complexity of the problem grows, the coupling between the controller and the controlled part becomes more physically limited. With limited perception, the controller has to resort to building an abstract model of the controlled process or environment in order to take fully informed actions. This thesis presents a new artificial neural network capable of creating an unsupervised temporal model of a signal, which can then be used as an abstract environment model for the controller. The network is structured as a multilayer hierarchical composition of self-organising maps, augmented by short-term memory in the form of wave-delay lines.
Each layer performs temporal signal decomposition with a progressively larger time spectrum. The research analyses the network's performance in creating abstract signal models on a range of synthetic and real-world signals. It then introduces simple reinforcement learning additions that allow the network to solve simple toy RL benchmarks.
Keywords
Unsupervised learning, Sequence encoding, Self-organizing maps, Multilayer hierarchical network
Document Type
Thesis
Publication Date
2023
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
Bogdan, P. (2023) Learning and planning for autonomous systems with emergent hierarchical representations and decaying short-term memory. Thesis. University of Plymouth. Retrieved from https://pearl.plymouth.ac.uk/secam-theses/91