Abstract

A typical gene expression data set consists of measurements of a large number of gene expressions on a relatively small number of subjects, who are classified according to two or more outcomes, for example cancer or non-cancer. Identifying associations between gene expressions and outcome is therefore a very large multiple testing problem. Early approaches to this problem applied thousands of univariate tests with corrections for multiplicity. Over the past decade, numerous studies have demonstrated that analyzing gene expression data structured into predefined gene sets can improve statistical power and robustness compared with alternative approaches. This thesis presents the results of research on gene set analysis. In particular, it examines the properties of some existing methods for the analysis of gene sets and introduces novel Bayesian methods for gene set analysis. A distinguishing feature of the new methods is that the model is specified conditionally on the expression data, whereas other methods of gene set analysis and individual gene analysis (IGA) generally make inferences conditionally on the outcome. Computer simulation is used to compare three commonly used established methods for gene set analysis, and a new procedure for simulating gene expression data is introduced for this purpose. The simulation studies are used to identify situations in which the established methods perform poorly. The Bayesian approaches developed in this thesis apply reversible jump Markov chain Monte Carlo (RJMCMC) techniques to model the effects of gene expression on phenotype. The reversible jump step in the modelling procedure allows posterior probabilities that each gene set is active to be produced. These mixture models reverse the generally accepted conditionality and model outcome given gene expression, which is a more intuitive assumption when modelling the pathway to phenotype.
It is demonstrated that the two models proposed may be superior to the established methods studied. There is considerable scope for further development of this line of research, which is appealing in terms of the use of mixture model priors that reflect the belief that a relatively small number of genes, restricted to a small number of gene sets, are associated with the outcome.
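As an illustration of the "corrections for multiplicity" applied in the early per-gene approach, the sketch below adjusts a list of p-values with the Benjamini-Hochberg step-up procedure. The abstract does not say which correction was used, so the choice of Benjamini-Hochberg is an assumption, and the function name `benjamini_hochberg` and the p-values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, one per gene, from univariate tests (not thesis data).
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(p_values)

# Genes declared associated with outcome at a 5% false discovery rate.
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
print(adjusted, significant)
```

With thousands of genes and few subjects, the adjustment becomes severe for all but the smallest p-values, which is one motivation for pooling evidence within predefined gene sets.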

Awarding Institution(s)

University of Plymouth

Supervisor

Rana Moyeed, William Henley

Keywords

Bayesian Statistics, Gene Set Analysis, Pathway Analysis, Epigenetics, Gene Expression

Document Type

Thesis

Publication Date

2013

Deposit Date

June 2024

Additional Links

http://dx.doi.org/10.24382/3709

Recommended Citation

Wright, A. (2013) Bayesian Pathway Analysis in Epigenetics. Thesis. University of Plymouth. Available at: http://dx.doi.org/10.24382/3709

School of Engineering, Computing and Mathematics Theses

Bayesian Pathway Analysis in Epigenetics
