Modeling speech imitation and ecological learning of auditory-motor maps.

Canevari, C; Badino, L; D'Ausilio, A; Fadiga, L; Metta, G

dc.contributor.author	Canevari, C	en
dc.contributor.author	Badino, L	en
dc.contributor.author	D'Ausilio, A	en
dc.contributor.author	Fadiga, L	en
dc.contributor.author	Metta, G	en
dc.date.accessioned	2017-06-23T15:47:49Z
dc.date.available	2017-06-23T15:47:49Z
dc.date.issued	2013	en
dc.identifier.issn	1664-1078	en
dc.identifier.uri	http://hdl.handle.net/10026.1/9542
dc.description.abstract	Classical models of speech consider an antero-posterior distinction between perceptive and productive functions. However, the selective alteration of neural activity in speech motor centers, via transcranial magnetic stimulation, was shown to affect speech discrimination. On the automatic speech recognition (ASR) side, recognition systems have classically relied solely on acoustic data, achieving rather good performance in optimal listening conditions. The limitations of current ASR become evident mainly in the realistic use of such systems. These limitations can be partly reduced by using normalization strategies that minimize inter-speaker variability by either explicitly removing speakers' peculiarities or adapting different speakers to a reference model. In this paper we aim at modeling a motor-based imitation learning mechanism in ASR. We tested the utility of a speaker normalization strategy that uses motor representations of speech and compared it with strategies that ignore the motor domain. Specifically, we first trained a regressor through state-of-the-art machine learning techniques to build an auditory-motor mapping, in a sense mimicking a human learner who tries to reproduce utterances produced by other speakers. This auditory-motor mapping maps the speech acoustics of a speaker into the motor plans of a reference speaker. Since only speech acoustics are available during recognition, the mapping is necessary to "recover" motor information. Subsequently, in a phone classification task, we tested the system on either one of the speakers used during training or a new one. Results show that in both cases the motor-based speaker normalization strategy slightly but significantly outperforms all other strategies in which only acoustics are taken into account.	en
dc.format.extent	364 - ?	en
dc.language	eng	en
dc.language.iso	eng	en
dc.subject	acoustic-to-articulatory mapping	en
dc.subject	automatic speech classification	en
dc.subject	deep neural networks	en
dc.subject	mirror neurons	en
dc.subject	phone classification	en
dc.subject	speaker normalization	en
dc.subject	speech imitation	en
dc.title	Modeling speech imitation and ecological learning of auditory-motor maps.	en
dc.type	Journal Article
plymouth.author-url	https://www.ncbi.nlm.nih.gov/pubmed/23818883	en
plymouth.volume	4	en
plymouth.publication-status	Published online	en
plymouth.journal	Front Psychol	en
dc.identifier.doi	10.3389/fpsyg.2013.00364	en
plymouth.organisational-group	/Plymouth
plymouth.organisational-group	/Plymouth/Faculty of Science and Engineering
plymouth.organisational-group	/Plymouth/REF 2021 Researchers by UoA
plymouth.organisational-group	/Plymouth/REF 2021 Researchers by UoA/UoA11 Computer Science and Informatics
dc.publisher.place	Switzerland	en
dcterms.dateAccepted	2013-06-04	en
dc.rights.embargoperiod	Not known	en
rioxxterms.versionofrecord	10.3389/fpsyg.2013.00364	en
rioxxterms.licenseref.uri	http://www.rioxx.net/licenses/all-rights-reserved	en
rioxxterms.licenseref.startdate	2013	en
rioxxterms.type	Journal Article/Review	en

Files in this item

Name: Modeling speech imitation and ...
Size: 1.556Mb
Format: PDF

Name: UoP_Deposit_Agreement v1.1 ...
Size: 125.4Kb
Format: PDF

This item appears in the following Collection(s)

School of Engineering, Computing and Mathematics