ORCID
- Shang-Ming Zhou: 0000-0002-0719-9353
Abstract
Gene selection is crucial for cancer classification using microarray data. To improve cancer classification accuracy, we developed a new wrapper method for gene selection called ieGENES. First, we proposed a parsimonious kernel machine regularization (PKMR) model, which uses ridge regularization in kernel-machine-driven classification to tackle multicollinearity and obtain stable estimates in high-dimensional settings. We then developed the ieGENES algorithm to optimally identify relevant genes while iteratively eliminating redundant ones based on leave-one-out cross-validation accuracy. In particular, we developed a new methodology to optimally update the model parameters upon gene removal. The ieGENES algorithm was evaluated on six cancer microarray datasets and compared with existing methods. Classification accuracy and the number of differentially expressed genes (DEGs) identified were assessed. In terms of gene selection accuracy, ieGENES outperformed multiple wrapper methods on 5 out of 6 datasets (Colon, Leukemia, Hepato, Glioma, and Breast Cancers), with statistically significant improvements (p < 0.001). For the Colon dataset, ieGENES achieved 96.21% accuracy with 167 DEGs. The proposed ieGENES technique demonstrated superior performance in identifying DEGs for cancer diagnosis compared with existing techniques. It offers a promising tool for identifying biologically relevant genes in microarray data analysis and biomarker discovery for cancer research.
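The general wrapper idea summarised in the abstract (a ridge-regularized kernel machine scored by leave-one-out cross-validation, with genes removed iteratively) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' PKMR model or the ieGENES parameter-update rule: it substitutes plain kernel ridge regression on +/-1 labels with an RBF kernel, refits from scratch after each removal, and uses the standard closed-form leave-one-out residual for linear smoothers. The names `rbf_kernel`, `loo_accuracy`, `backward_eliminate`, `lam` and `gamma` are hypothetical.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """RBF (Gaussian) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def loo_accuracy(X, y, lam=1.0, gamma=None):
    """Leave-one-out accuracy of kernel ridge regression on +/-1 labels.

    Uses the closed form for linear smoothers:
    loo_residual_i = (y_i - yhat_i) / (1 - H_ii), with H = K (K + lam I)^-1.
    """
    n, d = X.shape
    if gamma is None:
        gamma = 1.0 / d  # assumed default, not taken from the paper
    K = rbf_kernel(X, gamma)
    H = K @ np.linalg.inv(K + lam * np.eye(n))  # hat matrix
    y_hat = H @ y
    loo_pred = y - (y - y_hat) / (1.0 - np.diag(H))
    return float(np.mean(np.sign(loo_pred) == y))


def backward_eliminate(X, y, lam=1.0, min_genes=1):
    """Greedy backward elimination: drop the gene whose removal gives the
    highest LOO accuracy, and stop once any removal would lower accuracy."""
    selected = list(range(X.shape[1]))
    best_acc = loo_accuracy(X[:, selected], y, lam)
    while len(selected) > min_genes:
        trials = []
        for g in selected:
            keep = [j for j in selected if j != g]
            trials.append((loo_accuracy(X[:, keep], y, lam), g))
        acc, g = max(trials)
        if acc < best_acc:
            break
        selected.remove(g)
        best_acc = acc
    return selected, best_acc


if __name__ == "__main__":
    # Synthetic stand-in for expression data: 30 samples, 6 "genes",
    # with only gene 0 carrying the class signal.
    rng = np.random.default_rng(0)
    y = np.where(rng.random(30) < 0.5, -1.0, 1.0)
    X = rng.normal(size=(30, 6))
    X[:, 0] += 1.5 * y
    genes, acc = backward_eliminate(X, y)
    print("selected genes:", genes, "LOO accuracy:", acc)
```

Refitting for every candidate removal costs one matrix inversion per gene per round, which is only practical for small problems; the abstract indicates that ieGENES instead updates the model parameters upon gene removal, a step this sketch does not reproduce.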
Publication Date
2025-02-28
Publication Title
Journal of Biomedical Informatics
Volume
164
ISSN
1532-0464
Keywords
Cancer, Differentially expressed genes, Gene detection, Kernel machines, Machine learning, Microarray data
First Page
104803
Last Page
104803
Recommended Citation
Xia, X., Zhou, S., Liu, Y., Lin, N., & Overton, I. (2025) 'ieGENES: A machine learning method for selecting differentially expressed genes in cancer studies', Journal of Biomedical Informatics, 164, pp. 104803-104803. Available at: 10.1016/j.jbi.2025.104803