
Language performance as a predictor of future Alzheimer’s disease

Language sample analysis may help predict future Alzheimer’s disease in people who are cognitively normal, suggesting that language patterns may be an early, detectable biomarker for the disease. In a study published in EClinicalMedicine, researchers at IBM Thomas J. Watson Research Center and Pfizer Worldwide Research and Development analyzed written language samples and were able to predict Alzheimer’s disease more than seven years before the diagnosis.

For the study, researchers used data from 270 participants in the long-running, NIH-funded Framingham Heart Study: 190 participants were in the training set and 80 participants were in the test set. The training set was used to develop the linguistic markers via a type of machine learning called automated linguistic analysis. The test set, made up of participants whose data had been reviewed thoroughly by a panel of experts to assess their Alzheimer’s disease status, was used to assess the predictive performance of those linguistic markers in an independent sample.

In particular, for the test set, half of the 80 subjects had developed Alzheimer’s-like symptoms by age 85 (cases) and half did not (controls). Each of the 270 participants had performed a written picture analysis task when they were cognitively normal. For the training set, the researchers identified 87 language characteristics from the writing samples. They then used models to predict the future development of Alzheimer’s disease by assessing language performance.

Researchers found that language patterns such as writing short and simple phrases, repeating and misspelling words, and skipping punctuation were associated with future onset of Alzheimer’s. The language pattern analysis was about 70% accurate in predicting who developed Alzheimer’s disease. Additionally, adding language sample analysis to more traditional clinical data models, which draw on neuropsychological test scores, demographic and genetic information, and medical history, increased prediction accuracy from 59% to 69%.
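To make the kinds of language patterns described above more concrete, the following minimal Python sketch computes three simple surface measures from a writing sample: average sentence length, word repetition, and punctuation use. It is an illustration only and not the method used in the study. The function name `language_features`, the three measures, and the way they are calculated are all assumptions made for this example; they are not drawn from the 87 language characteristics the researchers identified, and misspellings are left out because detecting them would require a dictionary.

```python
import re


def language_features(text: str) -> dict:
    """Compute a few illustrative surface features of a writing sample.

    Hypothetical helper for illustration; not the study's feature set.
    """
    # Split on sentence-ending punctuation; a sample with no such
    # punctuation is treated as a single sentence.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercase word tokens (letters and apostrophes only).
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Common punctuation marks.
    punctuation = re.findall(r"[.,;:!?]", text)

    n_words = len(words)
    if n_words == 0:
        return {
            "mean_sentence_length": 0.0,
            "repetition_ratio": 0.0,
            "punctuation_per_word": 0.0,
        }

    return {
        # Shorter sentences give a lower value.
        "mean_sentence_length": n_words / len(sentences),
        # Share of words that repeat an earlier word in the sample.
        "repetition_ratio": 1 - len(set(words)) / n_words,
        # Skipped punctuation gives a lower value.
        "punctuation_per_word": len(punctuation) / n_words,
    }


if __name__ == "__main__":
    sample = "The dog runs. The dog barks."
    for name, value in language_features(sample).items():
        print(f"{name}: {value:.2f}")
```

In a real analysis, measures like these would be only inputs to a statistical model that is developed on one group of participants and evaluated on a separate group, as the researchers did with their training and test sets.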

The researchers noted that exploring the relationships of linguistic and non-linguistic variables, along with verbal language patterns, may further the development of non-invasive tests for the early detection of Alzheimer’s.

The Framingham Heart Study Consortium data used in this research was supported in part by NIA grants R01AG016495 and R01AG008122.

These activities relate to NIH's AD+ADRD Research Implementation Milestone 9.H, “Launch research programs to develop and validate sensitive neuropsychological and behavioral assessment measures to detect and track the earliest clinical manifestations of AD and AD-related dementias.”

Reference: Eyigoz E, et al. Linguistic markers predict onset of Alzheimer’s disease. EClinicalMedicine. 2020;28:100583. doi: 10.1016/j.eclinm.2020.100583.