PHILADELPHIA – Language in Facebook posts may help identify conditions such as diabetes, anxiety, depression and psychosis in patients, according to a study from Penn Medicine and Stony Brook University researchers. The researchers believe that language in posts could serve as an indicator of disease and, with patient consent, could be monitored just like physical symptoms. The study was published in PLOS ONE.
“This work is early, but our hope is that the insights gleaned from these posts could be used to better inform patients and providers about their health,” said lead author Raina Merchant, MD, MS, the director of Penn Medicine’s Center for Digital Health and an associate professor of Emergency Medicine. “As social media posts are often about someone’s lifestyle choices and experiences or how they’re feeling, this information could provide additional information about disease management and exacerbation.”
Using an automated data collection technique, the researchers analyzed the entire Facebook post history of nearly 1,000 patients who agreed to have their electronic medical record data linked to their profiles. The researchers then built three models and compared their predictive power: one that analyzed only the Facebook post language, another that used demographics such as age and sex, and a third that combined the two datasets.
Looking into 21 different conditions, researchers found that all 21 were predictable from Facebook alone. In fact, 10 of the conditions were better predicted using Facebook data than demographic information.
Some of the Facebook data found to be more predictive than demographic data seemed intuitive. For example, “drink” and “bottle” were shown to be more predictive of alcohol abuse. Other associations were less obvious. For example, the people who most often used religious language like “God” or “pray” in their posts were 15 times more likely to have diabetes than those who used these terms the least. Additionally, words expressing hostility, like “dumb” and some expletives, served as indicators of drug abuse and psychoses.
“Our digital language captures powerful aspects of our lives that are likely quite different from what is captured through traditional medical data,” said the study’s senior author Andrew Schwartz, PhD, a visiting assistant professor at Penn in Computer and Information Science, and an assistant professor of Computer Science at Stony Brook University. “Many studies have now shown a link between language patterns and specific disease, such as language predictive of depression or language that gives insights into whether someone is living with cancer. However, by looking across many medical conditions, we get a view of how conditions relate to each other, which can enable new applications of AI for medicine.”
Last year, many members of this research team were able to show that analysis of Facebook posts could predict a diagnosis of depression as much as three months earlier than a diagnosis in the clinic. This work builds on that study and shows that there may be potential for developing an opt-in system for patients that could analyze their social media posts and provide extra information for clinicians to refine care delivery. Merchant said that it’s tough to predict how widespread such a system would be, but it “could be valuable” for patients who use social media frequently.
“For instance, if someone is trying to lose weight and needs help understanding their food choices and exercise regimens, having a healthcare provider review their social media record might give them more insight into their usual patterns in order to help improve them,” Merchant said.
Later this year, Merchant will conduct a large trial in which patients will be asked to directly share social media content with their health care provider. This will provide a look into whether managing this data and applying it is feasible, as well as how many patients would actually agree to their accounts being used to supplement active care.
“One challenge with this is that there is so much data and we, as providers, aren’t trained to interpret it ourselves — or make clinical decisions based on it,” Merchant explained. “To address this, we will explore how to condense and summarize social media data.”
The current study received funding from a Robert Wood Johnson Foundation Pioneer Award.
Other authors on this study include David A. Asch, Patrick Crutchley, Lyle H. Ungar, Sharath C. Guntuku, Johannes Eichstaedt, Shawndra Hill, Kevin Padrez, and Robert J. Smith.