School of Engineering, Computing and Mathematics

Modeling speech imitation and ecological learning of auditory-motor maps

Abstract

Classical models of speech consider an antero-posterior distinction between perceptive and productive functions. However, the selective alteration of neural activity in speech motor centers, via transcranial magnetic stimulation, has been shown to affect speech discrimination. On the automatic speech recognition (ASR) side, recognition systems have classically relied solely on acoustic data, achieving rather good performance in optimal listening conditions. The limitations of current ASR become evident mainly in the realistic use of such systems. These limitations can be partly reduced by using normalization strategies that minimize inter-speaker variability by either explicitly removing speakers' peculiarities or adapting different speakers to a reference model. In this paper we aim to model a motor-based imitation learning mechanism in ASR. We tested the utility of a speaker normalization strategy that uses motor representations of speech and compared it with strategies that ignore the motor domain. Specifically, we first trained a regressor through state-of-the-art machine learning techniques to build an auditory-motor mapping, in a sense mimicking a human learner who tries to reproduce utterances produced by other speakers. This auditory-motor mapping maps the speech acoustics of a speaker into the motor plans of a reference speaker. Since only speech acoustics are available during recognition, the mapping is necessary to "recover" motor information. Subsequently, in a phone classification task, we tested the system on either one of the speakers used during training or a new speaker. Results show that in both cases the motor-based speaker normalization strategy slightly but significantly outperforms all other strategies that take only acoustics into account.
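The data flow described in the abstract (learn an acoustic-to-motor mapping, then classify phones on the recovered motor features) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the regressor, the features, or the classifier, so ridge regression and a nearest-centroid classifier are used here purely as stand-ins, the data are synthetic, and every name (`fit_ridge`, `apply_map`, the feature dimensions, the noise levels) is a hypothetical choice made for illustration.

```python
import numpy as np

def fit_ridge(X, Y, lam=1e-3):
    """Closed-form ridge regression with a bias column (stand-in regressor)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)

def apply_map(W, X):
    """Apply the learned acoustic-to-motor mapping to acoustic features."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W

def fit_centroids(F, labels):
    """Nearest-centroid phone classifier (stand-in): one mean vector per phone."""
    classes = np.unique(labels)
    centroids = np.stack([F[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_centroids(classes, centroids, F):
    """Assign each frame to the phone whose centroid is closest."""
    d = ((F[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]

# Synthetic data (assumption): 3 phones, 4 motor dims, 6 acoustic dims.
rng = np.random.default_rng(0)
n_phones, n_per_phone, d_motor, d_acoustic = 3, 50, 4, 6
labels = np.repeat(np.arange(n_phones), n_per_phone)
phone_means = 3.0 * rng.normal(size=(n_phones, d_motor))

# "Motor plans" of the reference speaker, clustered by phone.
motor = phone_means[labels] + 0.3 * rng.normal(size=(labels.size, d_motor))

# Acoustics of another speaker, modeled here as a noisy linear
# distortion of the reference motor plans (a simplifying assumption).
A = rng.normal(size=(d_motor, d_acoustic))
b = rng.normal(size=d_acoustic)
acoustic = motor @ A + b + 0.05 * rng.normal(size=(labels.size, d_acoustic))

# Step 1: learn the acoustic-to-motor mapping from paired data.
W = fit_ridge(acoustic, motor)

# Step 2: at recognition time only acoustics are available, so the
# mapping is used to recover motor features before classification.
recovered = apply_map(W, acoustic)
classes, centroids = fit_centroids(motor, labels)
pred = predict_centroids(classes, centroids, recovered)
accuracy = float((pred == labels).mean())
print(f"phone accuracy on recovered motor features: {accuracy:.2f}")
```

The sketch evaluates on the same frames it was fitted on, so the printed accuracy says nothing about the generalization results reported in the paper; it only shows where the mapping sits relative to the classifier.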

DOI Link

10.3389/fpsyg.2013.00364

Publication Date

2013-01-01

Publication Title

Frontiers in Psychology

Volume

4

Publisher

Frontiers Media SA

ISSN

1664-1078

Embargo Period

2024-11-22

Recommended Citation

Canevari, C., Badino, L., D'Ausilio, A., Fadiga, L., & Metta, G. (2013) 'Modeling speech imitation and ecological learning of auditory-motor maps', Frontiers in Psychology, 4. Frontiers Media SA. Available at: 10.3389/fpsyg.2013.00364

Additional Files

UoP_Deposit_Agreement v1.1 20160217.pdf (123 kB)
