This paper studies an issue in the search and selection process of a general concatenative sound synthesis (CSS) system that may affect the synthesis result: homosonic segments. The term, introduced in this study, refers to audio files that share one or more sonic properties but do not sound the same when played, because only a limited set of audio features is extracted during the analysis process. Homosonic segments confuse the CSS selection engine. This study proposes to overcome the issue by introducing a concatenation cost in addition to the regular target cost. The experiment conducted in this study shows that using the concatenation cost to address the problem is feasible. Further evaluation suggests that the concatenation cost is effective in handling homosonic segments: sounds synthesised with the concatenation cost function are more accurate and more fluent in the transitions from one segment to the next.
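The idea of combining a target cost with a concatenation cost can be illustrated with a minimal sketch. It is not the paper's implementation: the `Unit` fields, the Euclidean distance, the weights `w_target` and `w_concat`, and the Viterbi-style search are all assumptions chosen for illustration. The example corpus contains two units with identical analysis features (a stand-in for homosonic segments) that differ only at their boundaries.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Unit:
    """Hypothetical corpus segment; field names are illustrative only."""
    features: Sequence[float]  # analysis features, compared against the target
    start: Sequence[float]     # features at the segment's first frame
    end: Sequence[float]       # features at the segment's last frame


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, an assumed stand-in for the real cost measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def select_units(targets, corpus, w_target=1.0, w_concat=1.0) -> List[int]:
    """Return corpus indices minimising the summed target and concatenation cost."""
    if not targets or not corpus:
        return []
    n = len(corpus)
    # best[j]: lowest total cost of any path that ends on corpus unit j
    best = [w_target * distance(targets[0], u.features) for u in corpus]
    backpointers = []
    for target in targets[1:]:
        new_best, pointers = [], []
        for unit in corpus:
            # cost of arriving at `unit` from each possible predecessor
            joins = [best[i] + w_concat * distance(corpus[i].end, unit.start)
                     for i in range(n)]
            i_min = min(range(n), key=joins.__getitem__)
            new_best.append(joins[i_min] + w_target * distance(target, unit.features))
            pointers.append(i_min)
        best = new_best
        backpointers.append(pointers)
    # trace the cheapest path back from the last position
    j = min(range(n), key=best.__getitem__)
    path = [j]
    for pointers in reversed(backpointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path


# Units 1 and 2 have the same analysis features but different boundaries.
corpus = [
    Unit(features=(1.0,), start=(0.0,), end=(0.0,)),
    Unit(features=(2.0,), start=(5.0,), end=(5.0,)),
    Unit(features=(2.0,), start=(0.0,), end=(0.0,)),
]
targets = [(1.0,), (2.0,)]

target_only = select_units(targets, corpus, w_concat=0.0)  # [0, 1]
with_concat = select_units(targets, corpus, w_concat=1.0)  # [0, 2]
```

With the target cost alone, units 1 and 2 tie and the first one found is returned regardless of how well it joins the previous segment. Adding the concatenation cost breaks the tie in favour of unit 2, whose start matches the end of unit 0.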



Publication Title

2016 Third International Conference on Information Retrieval and Knowledge Management (CAMP)

Organisational Unit

School of Art, Design and Architecture