Digital speech biomarkers for assessing cognitive decline across neurodegenerative conditions

This study investigates speech impairments in individuals with mild cognitive impairment due to Alzheimer’s disease (MCI-AD), mild cognitive impairment with Lewy bodies (MCI-LB), and Parkinson’s disease with mild cognitive impairment (PD-MCI), compared to healthy controls (HC), aiming to identify linguistic and acoustic digital biomarkers that differentiate these groups. Monologue recordings were collected from 68 HC, 42 MCI-AD, 50 MCI-LB, and 47 PD-MCI participants (ON state). Participants were instructed to speak spontaneously for one and a half minutes. Speech was automatically transcribed, manually corrected, and analyzed using natural language processing to extract eight linguistic (lexical/syntactic) and four acoustic (prosodic) biomarkers. Group differences were assessed using the Mann–Whitney U test, and Spearman’s correlation was used to examine associations with clinical and MRI measures (FDR-corrected). Machine learning models (XGBoost) were applied to evaluate the classificatory and predictive potential of speech features. Distinct speech patterns were observed across groups: MCI-AD participants exhibited reduced use of function words, resulting in increased content density; PD-MCI participants used shorter sentences and fewer coordinating conjunctions, with longer pauses; and MCI-LB participants exhibited greater lexical repetition than MCI-AD participants. Altered speech features correlated with structural brain changes but not with global cognition (MoCA) or depressive symptoms (GDS). Sentence structure and pausing features showed strong interrelationships. Machine learning models showed that adding speech biomarkers improved classification performance compared to using clinical scores alone. In regression analyses, the models predicted MoCA with a normalized error of 10%, performing similarly on automatic and manually corrected transcripts.
These findings suggest that speech biomarkers and traditional clinical assessments may offer complementary information about cognitive status and brain health, supporting their use in scalable, non-invasive cognitive monitoring.
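The abstract reports FDR-corrected statistics but does not name the correction procedure. As a minimal sketch, assuming the common Benjamini-Hochberg method, the adjustment of a set of p-values could look like the following. The function name `benjamini_hochberg` and the example p-values are hypothetical illustrations, not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the study's "FDR-corrected" step is taken to mean
    Benjamini-Hochberg; the abstract does not confirm this.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four speech features (made up for illustration).
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
# Features whose adjusted p-value stays below 0.05 would count as significant.
significant = [p < 0.05 for p in adjusted_p]
```

In this sketch only the first feature remains below the 0.05 threshold after adjustment, which shows how correction for multiple comparisons can remove results that look significant in isolation.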

Keywords

Acoustic biomarkers, Linguistic biomarkers, Machine learning, Mild cognitive impairment, Parkinson’s disease, Spontaneous speech, Statistical analysis

Citation

Computers in Biology and Medicine. 2025, vol. 198, issue November, p. 1-12.
https://doi.org/10.1016/j.compbiomed.2025.111251

Document type

Peer-reviewed

Document version

Published version

Language of document

en

DOI

10.1016/j.compbiomed.2025.111251

URI

http://hdl.handle.net/11012/255699

Collections

Ústav telekomunikací

Creative Commons license

Except where otherwise noted, this item's license is described as Creative Commons Attribution 4.0 International
