Abstract

Robots are an increasing part of our daily lives. As robots become more pervasive, it is important that we are able to interact with them naturally. The field of Human-Robot Interaction (HRI) seeks to improve interactions between humans and robots. People spend many years in their childhood learning to communicate the locations of objects naturally to other people. When communicating the location of an object, people produce under-specified statements and then, if necessary, offer further repair to guide the listener as part of an interactive dialogue. Until now, the focus in HRI has been on generating non-ambiguous statements to refer to objects or locations. Here I create a dynamic method of generating spatial referring expressions, based on under-specified statements followed if necessary by repair, as a step towards more interactive dialogue. I present the following thesis: a robot that uses dynamic description methods to generate spatial referring expressions, that is, vague initial language with the ability to repair further, will be a more effective tool for collaborating with people than one that uses static non-ambiguous descriptions, and will also reduce the problem of combinatorial explosion. This dynamic form of description is new to the field of HRI. In sociolinguistics, this form of communication is thought to lead to the least collaborative effort, with both partners in a conversation contributing to a description.
To ensure the validity of my work, I ground it in potential real-world use cases for a social robot across a number of studies. I start with a Robot-Assisted Language Learning scenario, in which the robot attempts to encourage the use of spatial language in a quiz-based game. As a second use case, I look at a nuclear waste disposal task. I also present the initial study in which I noticed the discrepancy between how we have been attempting to generate referring expressions and how people actually communicate.
I describe how I created the dynamic systems based on human-human interactions. By observing two people solving the task, I gather data on position, which represents the state of the interaction, and on what participants are saying in that state. I use this data to build a classifier that determines what the robot should say at a given state of the interaction. This system allows a robot to successfully guide a person to the correct object or location. In my studies I find that, for tasks that require spatial referring expressions, this dynamic form of communication is more efficient than static non-ambiguous descriptions in terms of both time and distance travelled. I also find that people may prefer this form of communication in a complex real-world task.
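As a rough illustration of the idea of a classifier that maps an interaction state to what the robot should say, the following is a minimal sketch only. It assumes a k-nearest-neighbour vote over two hypothetical position features (distance to the target and bearing error); the feature names, utterance labels, and training values are invented for illustration and are not the thesis's data, features, or model.

```python
import math
from collections import Counter

# Hypothetical examples, standing in for data from human-human sessions:
# (distance_to_target_m, bearing_error_deg) -> type of utterance spoken.
TRAINING = [
    ((4.0, 80.0), "vague_initial"),
    ((3.5, 60.0), "vague_initial"),
    ((2.0, 45.0), "repair_direction"),
    ((1.8, 50.0), "repair_direction"),
    ((1.5, 5.0), "repair_distance"),
    ((1.2, 10.0), "repair_distance"),
    ((0.3, 5.0), "confirm"),
    ((0.2, 8.0), "confirm"),
]


def _scaled(state):
    """Scale both features to roughly [0, 1] so neither dominates the distance."""
    distance_m, bearing_deg = state
    return (distance_m / 5.0, bearing_deg / 180.0)


def choose_utterance(state, k=3):
    """Return the majority utterance type among the k nearest training states."""
    query = _scaled(state)
    ranked = sorted(TRAINING, key=lambda ex: math.dist(query, _scaled(ex[0])))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    print(choose_utterance((3.8, 70.0)))   # far from the target, facing away
    print(choose_utterance((0.25, 6.0)))   # very close, facing the target
```

In this sketch the robot would re-evaluate `choose_utterance` as the person moves, so an under-specified opening statement is followed by repair only when the state calls for it.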

Document Type

Thesis

Publication Date

2021-01-01

DOI

10.24382/1232

Creative Commons License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
