Abstract

Objectives UK statistics suggest only two-thirds of patients with dementia get a diagnosis recorded in primary care. General practitioners (GPs) report barriers to formally diagnosing dementia, so some patients may be known by GPs to have dementia but may be missing a diagnosis in their patient record. We aimed to produce a method to identify these ‘known but unlabelled’ patients with dementia using data from primary care patient records. Design Retrospective case–control study using routinely collected primary care patient records from Clinical Practice Research Datalink. Setting UK general practice. Participants English patients aged >65 years, with a coded diagnosis of dementia recorded in 2000–2012 (cases), matched 1:1 with patients with no diagnosis code for dementia (controls). Interventions Eight coded and nine keyword concepts indicating symptoms, screening tests, referrals and care for dementia recorded in the 5 years before diagnosis. We trialled machine learning classifiers to discriminate between cases and controls (logistic regression, naïve Bayes, random forest). Primary and secondary outcomes The outcome variable was dementia diagnosis code; the accuracy of classifiers was assessed using area under the receiver operating characteristic curve (AUC); the order of features contributing to discrimination was examined. Results 93 426 patients were included; the median age was 83 years (64.8% women). Three classifiers achieved high discrimination and performed very similarly. AUCs were 0.87–0.90 with coded variables, rising to 0.90–0.94 with keywords added. Feature prioritisation was different for each classifier; commonly prioritised features were Alzheimer’s prescription, dementia annual review, memory loss and dementia keywords. Conclusions It is possible to detect patients with dementia who are known to GPs but unlabelled with a diagnostic code, with a high degree of accuracy in electronic primary care record data. Using keywords from clinic notes and letters improves accuracy compared with coded data alone. This approach could improve identification of dementia cases for record-keeping, service planning and delivery of good quality care.

DOI

10.1136/bmjopen-2020-039248

Publication Date

2021-01-22

Publication Title

BMJ Open

Volume

11

Issue

1

Publisher

BMJ Publishing Group

ISSN

2044-6055

Embargo Period

2024-11-19

Share

COinS