An Artificial Intelligence Approach to Concatenative Sound Synthesis

Mohd Norowi, Noris

dc.contributor.supervisor	Miranda, Eduardo Reck
dc.contributor.author	Mohd Norowi, Noris
dc.contributor.other	Faculty of Arts, Humanities and Business	en_US
dc.date.accessioned	2013-08-23T09:42:24Z
dc.date.available	2013-08-23T09:42:24Z
dc.date.issued	2013
dc.identifier	10167917	en_US
dc.identifier.uri	http://hdl.handle.net/10026.1/1606
dc.description	Sound examples are included with this thesis	en_US
dc.description.abstract	Technological advances such as increases in processing power, hard disk capacity and network bandwidth have opened up many exciting new techniques for synthesising sounds, one of which is Concatenative Sound Synthesis (CSS). CSS uses a data-driven method to synthesise new sounds from a large corpus of small sound snippets. This technique closely resembles the art of mosaicing, where small tiles are arranged together to create a larger image. A ‘target’ sound is often specified by users so that segments in the database that match those of the target sound can be identified and then concatenated to generate the output sound. Whilst the practicality of CSS in synthesising sounds currently looks promising, there are still areas to be explored and improved, in particular the algorithm used to find the matching segments in the database. One of the main issues in CSS is the basis of similarity, as sound similarity can be based on many perceptual attributes, such as timbre, loudness, rhythm and tempo. An ideal CSS system needs to be able to determine which of these perceptual attributes users expect and then accommodate them by synthesising sounds that are similar with respect to that attribute. Failure to communicate the basis of sound similarity between the user and the CSS system generally results in output that does not match the sound envisioned by the user. In order to understand how humans perceive sound similarity, several elements that affect sound similarity judgement were first investigated. Of the four elements tested (timbre, melody, loudness, tempo), it was found that the basis of similarity depends on the listener’s musical training: musicians based similarity on timbral information, whilst non-musicians relied on melodic information.
Thus, for the rest of the study, only features that represent timbral information were included, as musicians are the target users for the findings of this study. Another issue with the current state of CSS systems is user control flexibility, in particular during segment matching, where features can be assigned different weights depending on their importance to the search. Typically, in the existing CSS systems that support weight assignment, the weights can only be assigned manually, resulting in a process that is both labour intensive and time consuming. Additionally, another problem was identified in this study: the lack of a mechanism to handle homosonic and equidistant segments. These conditions arise when too few features are compared, causing otherwise aurally different sounds to be represented by the same sonic values, or as a result of rounding off the values of the extracted features. This study addresses both of these problems through an extended use of Artificial Intelligence (AI). The Analytic Hierarchy Process (AHP) is employed to enable order-dependent feature selection, allowing weights to be assigned to each audio feature according to its relative importance. Concatenation distance is used to overcome the issues with homosonic and equidistant sound segments. The inclusion of AI results in a more intelligent system that can better handle tedious tasks and minimise human error, allowing users (composers) to worry less about mundane tasks and focus more on the creative aspects of music making. In addition to the above, this study also aims to enhance user control flexibility in a CSS system and improve similarity results. The key factors that affect the synthesis results of CSS were first identified and then included as parametric options which users can control in order to communicate their intended creations to the system.
Comprehensive evaluations were carried out to validate the feasibility and effectiveness of the proposed solutions (timbral-based feature set, AHP, and concatenation distance). The final part of the study investigates the relationship between perceived sound similarity and perceived sound interestingness. A new framework that integrates all these solutions, the query-based CSS framework, was then proposed. The proof-of-concept of this study, ConQuer, was developed based on this framework. This study has critically analysed the problems in existing CSS systems. Novel solutions have been proposed to overcome them, and their effectiveness has been tested and discussed; these are the main contributions of this study.	en_US
dc.description.sponsorship	Malaysian Ministry of Higher Education, Universiti Putra Malaysia	en_US
dc.language.iso	en	en_US
dc.publisher	University of Plymouth	en_US
dc.subject	Artificial Intelligence	en_US
dc.subject	Concatenative sound synthesis
dc.subject	Order dependent feature selection
dc.subject	Homosonic segments
dc.subject	Equidistant segments
dc.subject	Basis of sound similarity
dc.title	An Artificial Intelligence Approach to Concatenative Sound Synthesis	en_US
dc.type	Thesis
plymouth.version	Full version	en_US
dc.identifier.doi	http://dx.doi.org/10.24382/3849
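
The abstract in this record describes two mechanisms: AHP-derived feature weights for segment matching, and concatenation distance as a way to choose among homosonic or equidistant segments. The following is a minimal Python sketch of how these ideas might fit together; it is not taken from the thesis or from ConQuer. The function names, the row geometric mean approximation of AHP priorities, the weighted Euclidean distance, and the reading of "concatenation distance" as the distance between a candidate and the previously selected segment are all assumptions made for illustration.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights from a reciprocal pairwise
    comparison matrix using row geometric means (a common approximation
    of the principal eigenvector; assumed here, not confirmed by the record)."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def select_segment(target, corpus, weights, previous=None, tol=1e-9):
    """Return the index of the corpus segment closest to the target.

    Candidates whose target distances are equal within `tol` (identical
    feature values, or different values at the same distance) are treated
    as tied. If `previous` is given, the tie is broken by an assumed
    concatenation distance: the distance from the previously selected
    segment's features to each tied candidate.
    """
    dists = [weighted_distance(target, seg, weights) for seg in corpus]
    best = min(dists)
    tied = [i for i, d in enumerate(dists) if d - best <= tol]
    if len(tied) == 1 or previous is None:
        return tied[0]
    return min(tied, key=lambda i: weighted_distance(previous, corpus[i], weights))


# Hypothetical judgements for three features: feature 0 is moderately
# more important than feature 1 (3) and strongly more than feature 2 (5).
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
weights = ahp_weights(pairwise)

# Two candidates lie at the same distance from the target; the previously
# selected segment decides between them.
corpus = [[0.4, 0.5], [0.6, 0.5], [0.9, 0.9]]
choice = select_segment([0.5, 0.5], corpus, [0.5, 0.5], previous=[0.7, 0.5])
```

In this sketch the pairwise judgements replace manual weight entry, and the tie-break only comes into play when the target distance alone cannot separate candidates.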

Files in this item

Name:: 2013mohdnorowi10167917phd.pdf
Size:: 3.555Mb
Format:: PDF
Description:: Thesis

Name:: 2013mohdnorowi10167917phd_soun ...
Size:: 80.20Mb
Format:: Basic Audio
Description:: Sound files from Appendix A1 to A9

Name:: 2013mohdnorowi10167917phd_soun ...
Size:: 35.66Mb
Format:: Basic Audio
Description:: Sound files from Appendix A10 ...

Name:: license.txt
Size:: 3.272Kb
Format:: Text file

This item appears in the following Collection(s)

01 Research Theses Main Collection