Automatic Classification of National Health Service Feedback
dc.contributor.author | Haynes, C | |
dc.contributor.author | PALOMINO, MARCO | |
dc.contributor.author | Stuart, EJ::0000-0001-8373-8526 | |
dc.contributor.author | Viira, D | |
dc.contributor.author | Hannon, F | |
dc.contributor.author | Crossingham, G | |
dc.contributor.author | Tantam, K | |
dc.date.accessioned | 2022-03-25T16:36:44Z | |
dc.date.available | 2022-03-25T16:36:44Z | |
dc.date.issued | 2022-03-18 | |
dc.identifier.issn | 2227-7390 | |
dc.identifier.other | 983 | |
dc.identifier.uri | http://hdl.handle.net/10026.1/18975 | |
dc.description.abstract | Text datasets come in an abundance of shapes, sizes and styles. However, determining what factors limit classification accuracy remains a difficult task which is still the subject of intensive research. Using a challenging UK National Health Service (NHS) dataset, which contains many characteristics known to increase the complexity of classification, we propose an innovative classification pipeline. This pipeline switches between different text pre-processing, scoring and classification techniques during execution. Using this flexible pipeline, a high level of accuracy has been achieved in the classification of a range of datasets, attaining a micro-averaged F1 score of 93.30% on the Reuters-21578 “ApteMod” corpus. An evaluation of this flexible pipeline was carried out using a variety of complex datasets compared against an unsupervised clustering approach. The paper describes how classification accuracy is impacted by an unbalanced category distribution, the rare use of generic terms and the subjective nature of manual human classification. | |
dc.format.extent | 983-983 | |
dc.language | en | |
dc.language.iso | en | |
dc.publisher | MDPI AG | |
dc.subject | NLP | |
dc.subject | classification | |
dc.subject | clustering | |
dc.subject | text pre-processing | |
dc.subject | machine learning | |
dc.subject | National Health Service (NHS) | |
dc.title | Automatic Classification of National Health Service Feedback | |
dc.type | journal-article | |
dc.type | Journal Article | |
plymouth.author-url | https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000774106700001&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=11bb513d99f797142bcfeffcc58ea008 | |
plymouth.issue | 6 | |
plymouth.volume | 10 | |
plymouth.publication-status | Published online | |
plymouth.journal | Mathematics | |
dc.identifier.doi | 10.3390/math10060983 | |
plymouth.organisational-group | /Plymouth | |
plymouth.organisational-group | /Plymouth/Faculty of Science and Engineering | |
plymouth.organisational-group | /Plymouth/Faculty of Science and Engineering/School of Engineering, Computing and Mathematics | |
plymouth.organisational-group | /Plymouth/REF 2021 Researchers by UoA | |
plymouth.organisational-group | /Plymouth/REF 2021 Researchers by UoA/UoA11 Computer Science and Informatics | |
plymouth.organisational-group | /Plymouth/Users by role | |
plymouth.organisational-group | /Plymouth/Users by role/Academics | |
dcterms.dateAccepted | 2022-03-16 | |
dc.rights.embargodate | 2022-03-29 | |
dc.identifier.eissn | 2227-7390 | |
dc.rights.embargoperiod | Not known | |
rioxxterms.versionofrecord | 10.3390/math10060983 | |
rioxxterms.licenseref.uri | http://www.rioxx.net/licenses/all-rights-reserved | |
rioxxterms.licenseref.startdate | 2022-03-18 | |
rioxxterms.type | Journal Article/Review | |
plymouth.funder | AGE'IN (Age Independently)::Interreg 2 Seas Mers Zeeën |