
dc.contributor.author: Canevari, C
dc.contributor.author: Badino, L
dc.contributor.author: D'Ausilio, A
dc.contributor.author: Fadiga, L
dc.contributor.author: Metta, G
dc.date.accessioned: 2017-06-23T15:47:49Z
dc.date.available: 2017-06-23T15:47:49Z
dc.date.issued: 2013
dc.identifier.issn: 1664-1078
dc.identifier.uri: http://hdl.handle.net/10026.1/9542
dc.description.abstract:

Classical models of speech posit an antero-posterior distinction between perceptive and productive functions. However, selectively altering neural activity in speech motor centers via transcranial magnetic stimulation has been shown to affect speech discrimination. On the automatic speech recognition (ASR) side, recognition systems have classically relied solely on acoustic data, achieving rather good performance in optimal listening conditions. The limitations of current ASR become most evident in realistic use of such systems. These limitations can be partly reduced by normalization strategies that minimize inter-speaker variability, either by explicitly removing speakers' peculiarities or by adapting different speakers to a reference model. In this paper we aim at modeling a motor-based imitation learning mechanism in ASR. We tested the utility of a speaker normalization strategy that uses motor representations of speech and compared it with strategies that ignore the motor domain. Specifically, we first trained a regressor, using state-of-the-art machine learning techniques, to build an auditory-motor mapping, in a sense mimicking a human learner who tries to reproduce utterances produced by other speakers. This auditory-motor mapping projects the speech acoustics of a speaker onto the motor plans of a reference speaker. Since only speech acoustics are available during recognition, the mapping is needed to "recover" motor information. Subsequently, in a phone classification task, we tested the system on either a speaker used during training or a new one. Results show that in both cases the motor-based speaker normalization strategy slightly but significantly outperforms all strategies that take only acoustics into account.
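The pipeline described in the abstract can be sketched in miniature: a regressor learns an auditory-to-motor mapping (a speaker's acoustics to a reference speaker's motor plans), and a phone classifier is then trained on the recovered motor features. This is a hypothetical illustration on synthetic data, not the paper's implementation — the authors use deep neural networks on real acoustic and articulatory recordings, and all dimensions, feature names, and models below are assumptions for the sketch.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins: 13-dim "acoustic" frames, 6-dim "articulatory" targets,
# 4 phone classes (all sizes are illustrative assumptions).
n_frames, n_ac, n_art, n_phones = 600, 13, 6, 4
phones = rng.integers(0, n_phones, n_frames)
motor_ref = rng.normal(size=(n_phones, n_art))[phones]   # reference motor plans per phone
acoustics = (motor_ref @ rng.normal(size=(n_art, n_ac))  # acoustics generated from motor state
             + 0.1 * rng.normal(size=(n_frames, n_ac)))  # plus observation noise

# 1) Auditory-motor mapping: speaker acoustics -> reference-speaker motor plans.
mapper = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
mapper.fit(acoustics[:400], motor_ref[:400])

# 2) Phone classification on recovered motor features: at recognition time only
#    acoustics are available, so motor information is "recovered" via the mapper.
recovered = mapper.predict(acoustics)
clf = LogisticRegression(max_iter=1000).fit(recovered[:400], phones[:400])
acc = clf.score(recovered[400:], phones[400:])
print(f"phone accuracy on recovered motor features: {acc:.2f}")
```

The held-out frames (400:) play the role of the paper's test condition in which recognition must proceed from acoustics alone.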
dc.format.extent: 364 - ?
dc.language: eng
dc.language.iso: eng
dc.subject: acoustic-to-articulatory mapping
dc.subject: automatic speech classification
dc.subject: deep neural networks
dc.subject: mirror neurons
dc.subject: phone classification
dc.subject: speaker normalization
dc.subject: speech imitation
dc.title: Modeling speech imitation and ecological learning of auditory-motor maps.
dc.type: Journal Article
plymouth.author-url: https://www.ncbi.nlm.nih.gov/pubmed/23818883
plymouth.volume: 4
plymouth.publication-status: Published online
plymouth.journal: Front Psychol
dc.identifier.doi: 10.3389/fpsyg.2013.00364
plymouth.organisational-group: /Plymouth
plymouth.organisational-group: /Plymouth/Faculty of Science and Engineering
plymouth.organisational-group: /Plymouth/REF 2021 Researchers by UoA
plymouth.organisational-group: /Plymouth/REF 2021 Researchers by UoA/UoA11 Computer Science and Informatics
dc.publisher.place: Switzerland
dcterms.dateAccepted: 2013-06-04
dc.rights.embargoperiod: Not known
rioxxterms.versionofrecord: 10.3389/fpsyg.2013.00364
rioxxterms.licenseref.uri: http://www.rioxx.net/licenses/all-rights-reserved
rioxxterms.licenseref.startdate: 2013
rioxxterms.type: Journal Article/Review




All items in PEARL are protected by copyright law.
Author manuscripts deposited to comply with open access mandates are made available in accordance with publisher policies. Please cite only the published version using the details provided on the item record or document. In the absence of an open licence (e.g. Creative Commons), permissions for further reuse of content should be sought from the publisher or author.