Abstract

Articulatory speech synthesis provides an alternative to the state-of-the-art concatenative and formant systems, holding potential for more versatile and expressive artificial speech due to its physical modelling basis. However, a major limitation of practical articulatory synthesis is gaining adequate control of the complex underlying physical models, a difficulty that stems from a lack of articulatory data. In an effort to procure more data, a Genetic Algorithm approach to Acoustic-Articulatory Parameter Inversion is taken. This paper presents the initial results from testing a number of fitness functions for the Acoustic-Articulatory Parameter Inversion of three vowels: /a/, /o/, and /e/. Three feature vector representations of the vowels were tested (Hertz, Mel-scale, and Cents) in conjunction with three distance metrics. The distance metrics defined the fitness score by calculating the similarity between a candidate's and a target's feature vectors. A Voiced/Unvoiced constraint was also added as a penalty function, and an indicator of loudness was implemented using a Root Mean Square based coefficient. The results indicated that certain combinations of the above could lead to convergence towards all three vowels. However, the quality of convergence was not uniform.
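
The fitness function outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not name the three distance metrics, the contents of the feature vectors, or how the penalty and loudness terms are combined, so the following are all assumptions made for illustration: Euclidean distance as the metric, feature vectors made of a few frequency values in Hertz, a fixed additive penalty for unvoiced candidates, an RMS ratio as the loudness coefficient, a 55 Hz reference for the Cents scale, and the common 2595 log10(1 + f/700) Mel formula. Every function and parameter name is hypothetical. Lower scores mean a closer match.

```python
import math

def hz_to_mel(f_hz):
    # Common Mel-scale formula (assumed; the abstract does not say which variant was used).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def hz_to_cents(f_hz, ref_hz=55.0):
    # Cents relative to a reference frequency; the 55 Hz reference is an assumption.
    return 1200.0 * math.log2(f_hz / ref_hz)

def rms(samples):
    # Root Mean Square amplitude of a block of audio samples.
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def euclidean(a, b):
    # One possible distance metric (assumed, not taken from the paper).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fitness(candidate_hz, target_hz, candidate_voiced,
            candidate_rms, target_rms,
            scale=hz_to_mel, unvoiced_penalty=1000.0):
    # Map both feature vectors onto the chosen scale before comparing them.
    cand = [scale(f) for f in candidate_hz]
    targ = [scale(f) for f in target_hz]
    score = euclidean(cand, targ)

    # Hypothetical loudness coefficient: a ratio >= 1 that grows as the
    # candidate's RMS level departs from the target's.
    quieter = max(min(candidate_rms, target_rms), 1e-12)
    score *= max(candidate_rms, target_rms) / quieter

    # Hypothetical Voiced/Unvoiced constraint: a fixed additive penalty.
    if not candidate_voiced:
        score += unvoiced_penalty
    return score

# Illustrative values only, not data from the paper.
target = [730.0, 1090.0, 2440.0]
candidate = [700.0, 1150.0, 2500.0]
score = fitness(candidate, target, True, 0.09, 0.10)
```

Passing `scale=hz_to_cents`, or an identity function for plain Hertz, switches the feature representation without changing the rest of the calculation, which is one way to compare the three representations under the same metric.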

DOI

10.1145/3067695.3076112

Publication Date

2017-07-15

Publication Title

Proceedings of Genetic and Evolutionary Computation Conference

Organisational Unit

School of Art, Design and Architecture
