Digital speech biomarkers for assessing cognitive decline across neurodegenerative conditions
Abstract
This study investigates speech impairments in individuals with mild cognitive impairment due to Alzheimer’s disease (MCI-AD), mild cognitive impairment with Lewy bodies (MCI-LB), and Parkinson’s disease with mild cognitive impairment (PD-MCI), compared to healthy controls (HC), aiming to identify linguistic and acoustic digital biomarkers that differentiate these groups. Monologue recordings were collected from 68 HC, 42 MCI-AD, 50 MCI-LB, and 47 PD-MCI participants (ON state). Participants were instructed to speak spontaneously for one and a half minutes. Speech was automatically transcribed, manually corrected, and analyzed using natural language processing to extract eight linguistic (lexical/syntactic) and four acoustic (prosodic) biomarkers. Group differences were assessed using the Mann–Whitney U test, and Spearman’s correlation was used to examine associations with clinical and MRI measures (FDR-corrected). Machine learning models (XGBoost) were applied to evaluate the classificatory and predictive potential of speech features. Distinct speech patterns were observed across groups: MCI-AD participants exhibited reduced use of function words, resulting in increased content density; PD-MCI participants used shorter sentences and fewer coordinating conjunctions, with longer pauses; and MCI-LB participants exhibited greater lexical repetition than MCI-AD participants. Altered speech features correlated with structural brain changes but not with global cognition (MoCA) or depressive symptoms (GDS). Sentence structure and pausing features showed strong interrelationships. Machine learning models showed that adding speech biomarkers improved classification performance compared to using clinical scores alone. In regression analyses, the models predicted MoCA with a normalized error of 10%, performing similarly on automatic and manually corrected transcripts.
These findings suggest that speech biomarkers and traditional clinical assessments may offer complementary information about cognitive status and brain health, supporting their use in scalable, non-invasive cognitive monitoring.
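As an illustration of the FDR correction mentioned in the abstract, the following is a minimal sketch of the Benjamini-Hochberg procedure. The abstract does not state which FDR method the authors used, so Benjamini-Hochberg is an assumption here, and the function name `benjamini_hochberg` and the example p-values are hypothetical and not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four speech features (illustrative only,
# not results from the study).
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adj]
```

In practice the raw p-values would come from the correlation tests themselves (one per speech feature and clinical or MRI measure), and only features whose adjusted p-value stays below the chosen threshold would be reported as significant.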
Keywords
Acoustic biomarkers, Linguistic biomarkers, Machine learning, Mild cognitive impairment, Parkinson’s disease, Spontaneous speech, Statistical analysis
Citation
Computers in Biology and Medicine. 2025, vol. 198, issue November, p. 1-12.
https://doi.org/10.1016/j.compbiomed.2025.111251
Document type
Peer-reviewed
Document version
Published version
Language of document
en
Creative Commons license
Except where otherwise noted, this item's license is described as Creative Commons Attribution 4.0 International

0000-0003-2701-1802 