Mining textual data from primary healthcare records: Automatic identification of patient phenotype cohorts
dc.contributor.author | Zhou, Shang-Ming | |
dc.contributor.author | Rahman, MA | |
dc.contributor.author | Atkinson, M | |
dc.contributor.author | Brophy, S | |
dc.date.accessioned | 2021-11-05T18:01:19Z | |
dc.date.available | 2021-11-05T18:01:19Z | |
dc.date.issued | 2014-07 | |
dc.identifier.isbn | 9781479914845 | |
dc.identifier.issn | 2161-4393 | |
dc.identifier.uri | http://hdl.handle.net/10026.1/18252 | |
dc.description.abstract |
Owing to advances in omics technologies, rich sources of clinical, biomedical, contextual, and environmental data about each patient have become available in the medical and health sciences. However, an enormous amount of electronic health record data is generated as text, such as descriptive terms and concepts. Efficiently harnessing these textual data would allow doctors and nurses to identify the most appropriate treatments and predicted outcomes for a given patient in real time. We used textual data to identify patient phenotypes from UK primary care records recorded using Read codes (a clinical classification system). The fine granularity of Read codes yields a huge number of clinical terms to handle, and traditional medical statistics methods have struggled to process this sort of data effectively. In this paper, we describe how patient phenotype identification can be transformed into a document classification task. We present a text mining scheme that integrates feature ranking methods with a genetic algorithm to identify the most parsimonious subset of features that can still distinguish patient phenotypes. The experimental results demonstrate that compact feature sets of 2 or 3 important terms describing clinical events were identified from 16852 Read codes, while their classification of test samples retained a high level of agreement with specialists from secondary care. | |
dc.format.extent | 3621-3627 | |
dc.language.iso | en | |
dc.publisher | IEEE | |
dc.subject | Clinical Research | |
dc.subject | Patient Safety | |
dc.subject | 8.4 Research design and methodologies (health services) | |
dc.subject | Generic health relevance | |
dc.title | Mining textual data from primary healthcare records: Automatic identification of patient phenotype cohorts | |
dc.type | conference | |
dc.type | Conference Proceeding | |
plymouth.author-url | https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000371465703106&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=11bb513d99f797142bcfeffcc58ea008 | |
plymouth.date-start | 2014-07-06 | |
plymouth.date-finish | 2014-07-11 | |
plymouth.conference-name | 2014 International Joint Conference on Neural Networks (IJCNN) | |
plymouth.publication-status | Published | |
plymouth.journal | 2014 International Joint Conference on Neural Networks (IJCNN) | |
dc.identifier.doi | 10.1109/ijcnn.2014.6889494 | |
plymouth.organisational-group | /Plymouth | |
plymouth.organisational-group | /Plymouth/Faculty of Health | |
plymouth.organisational-group | /Plymouth/Faculty of Health/School of Nursing and Midwifery | |
plymouth.organisational-group | /Plymouth/REF 2021 Researchers by UoA | |
plymouth.organisational-group | /Plymouth/REF 2021 Researchers by UoA/UoA03 Allied Health Professions, Dentistry, Nursing and Pharmacy | |
plymouth.organisational-group | /Plymouth/Users by role | |
plymouth.organisational-group | /Plymouth/Users by role/Academics | |
dc.rights.embargoperiod | Not known | |
rioxxterms.versionofrecord | 10.1109/ijcnn.2014.6889494 | |
rioxxterms.licenseref.uri | http://www.rioxx.net/licenses/all-rights-reserved | |
rioxxterms.type | Conference Paper/Proceeding/Abstract |
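
The scheme summarised in the abstract (treat each patient record as a document of Read codes, rank the codes, then let a genetic algorithm search for a small discriminating subset) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy patient data and code strings are invented, the ranking score is a simple difference in class-conditional frequencies, the classifier is a majority-vote lookup table scored on the same data it was built from, and the names `rank_features`, `accuracy`, and `select_features` are hypothetical.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy cohort: each patient is a "document" of Read-code-like tokens.
# Codes and labels (1 = case, 0 = control) are invented for illustration only.
PATIENTS = [
    ({"N040", "H33", "1371"}, 1),
    ({"N040", "C10"}, 1),
    ({"N040", "H33"}, 1),
    ({"N040", "1371", "G20"}, 1),
    ({"H33", "G20"}, 0),
    ({"C10", "1371"}, 0),
    ({"G20"}, 0),
    ({"H33", "C10"}, 0),
]


def rank_features(patients):
    """Rank codes by |P(code | case) - P(code | control)|, highest first."""
    pos = [codes for codes, y in patients if y == 1]
    neg = [codes for codes, y in patients if y == 0]
    vocab = sorted(set().union(*(codes for codes, _ in patients)))

    def score(code):
        return abs(sum(code in c for c in pos) / len(pos)
                   - sum(code in c for c in neg) / len(neg))

    return sorted(vocab, key=score, reverse=True)


def accuracy(patients, subset):
    """Resubstitution accuracy of a majority-vote lookup table on `subset`."""
    table = defaultdict(Counter)
    for codes, y in patients:
        pattern = tuple(code in codes for code in subset)
        table[pattern][y] += 1
    correct = sum(max(counts.values()) for counts in table.values())
    return correct / len(patients)


def select_features(patients, top_k=4, pop_size=12, generations=20,
                    penalty=0.05, seed=0):
    """Genetic search over bitmasks of the top_k ranked codes."""
    rng = random.Random(seed)
    candidates = rank_features(patients)[:top_k]

    def decode(mask):
        return [c for c, bit in zip(candidates, mask) if bit]

    def fitness(mask):
        # Reward accuracy, penalise subset size to favour parsimony.
        return accuracy(patients, decode(mask)) - penalty * sum(mask)

    population = [tuple(rng.randint(0, 1) for _ in candidates)
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]  # truncation selection keeps elites
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(candidates))  # one-point crossover
            child = list(a[:cut] + b[cut:])
            flip = rng.randrange(len(child))         # flip one random bit
            child[flip] = 1 - child[flip]
            children.append(tuple(child))
        population = parents + children
    best = max(population, key=fitness)
    return decode(best), accuracy(patients, decode(best))


if __name__ == "__main__":
    print(select_features(PATIENTS))
```

The size penalty in `fitness` is one simple way to express the preference for parsimonious subsets; a real study would score candidates on held-out data rather than by resubstitution.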