Social Media Data Show Language Related to Depression Didn’t Spike After Initial Pandemic Wave
Researchers who analyzed language related to depression on social media during the pandemic say the data suggest people learned to cope as the waves wore on.
University of Alberta researcher Alona Fyshe and her collaborators at the University of Western Ontario hypothesized that depression-related language would spike during each wave of COVID-19. But their study shows that wasn’t the case.
“There was a big reaction at the beginning and then people sort of found their new normal,” says Fyshe. “It’s a message of resilience, people figuring out how to keep on keeping on in a pandemic.”
The researchers started the project early in the pandemic as a way to help, since many of their standard research tools, such as collecting brain imaging data, were inaccessible due to restrictions. It was the discussion of hobbies on social media — such as the widespread focus on baking sourdough bread — that piqued their interest and sparked the idea for the study.
“It seemed like when people took on these hobbies it helped them to cope, so that’s what we were looking for — the patterns of language usage that were helpful, that were indicative of individuals figuring out how to cope,” says Fyshe, an assistant professor of computing science and psychology.
They turned their attention to online platforms such as Reddit and Twitter.
Gauging Mental Health Through Social Media
Social media is a useful tool in assessing mental health at the population level, explains Fyshe, a fellow of the Alberta Machine Intelligence Institute and Canada CIFAR AI chair.
The researchers first identified keywords by analyzing the language posters used in Reddit discussions, including subreddits devoted to depression. The self-identification that comes with posting to those subreddits and forums isn’t replicated on many other social media platforms, Fyshe explains.
“Essentially we trained a machine learning model that can differentiate between the language of people who post to /depression versus people who don’t,” says Fyshe.
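To make the idea concrete, here is a minimal sketch of a text classifier that separates two groups of posts. It is not the researchers' model, which the article does not describe in detail. It assumes a simple bag-of-words Naive Bayes approach, and the posts, the labels ("target" and "other") and the function names are all invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace.
    return text.lower().split()


def train(posts):
    """Count words per label from a list of (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    for text, label in posts:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    return word_counts, label_counts


def predict(text, word_counts, label_counts):
    """Return the label with the highest Naive Bayes log-probability."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_posts = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Log prior: share of training posts carrying this label.
        score = math.log(label_counts[label] / total_posts)
        total_words = sum(counts.values())
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy posts, not real data from the study.
TOY_POSTS = [
    ("i feel hopeless and tired all the time", "target"),
    ("nothing helps and i feel so alone", "target"),
    ("my sourdough starter finally worked today", "other"),
    ("great hike with friends this weekend", "other"),
]

model = train(TOY_POSTS)
print(predict("i feel alone and tired", *model))
print(predict("sourdough with friends", *model))
```

In a model like this, the words whose smoothed probabilities differ most between the two groups are natural candidates for keywords, which is one plausible way a classifier could feed the keyword list the article mentions.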
Using this information and the identified keywords, they turned their attention to Twitter. They analyzed data from four cities (Sydney, Mumbai, Seattle and Toronto) that experienced different waves of COVID-19, so they could determine which changes in language were due to global trends and which were local. They restricted the data to areas with a large percentage of English-language tweets so they could use the same methodology to analyze all the data.
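One simple way to picture this kind of comparison is to track, for each city and week, the share of tweets that contain at least one keyword. The sketch below is an illustration only, not the study's pipeline. The keyword set, the sample tweets and the function name `weekly_keyword_rate` are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical keywords; the study's actual keyword list is not given here.
KEYWORDS = {"hopeless", "alone"}


def weekly_keyword_rate(tweets, keywords):
    """Map (city, ISO year, ISO week) to the share of tweets with a keyword.

    `tweets` is an iterable of (city, date, text) tuples.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for city, day, text in tweets:
        iso = day.isocalendar()
        key = (city, iso[0], iso[1])  # ISO year and ISO week number
        totals[key] += 1
        # A tweet counts once, however many keywords it contains.
        if keywords & set(text.lower().split()):
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Invented sample tweets, not real data.
TOY_TWEETS = [
    ("Toronto", date(2020, 3, 16), "feeling hopeless today"),
    ("Toronto", date(2020, 3, 17), "baking bread again"),
    ("Toronto", date(2020, 6, 15), "nice walk by the lake"),
    ("Mumbai", date(2020, 3, 18), "so alone in lockdown"),
]

rates = weekly_keyword_rate(TOY_TWEETS, KEYWORDS)
for key in sorted(rates):
    print(key, rates[key])
```

With a table like this, a change that appears in every city in the same week suggests a global trend, while a change confined to one city points to something local.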
The results were surprising, says Fyshe. In general, spikes in COVID-19 cases and the various waves throughout the pandemic weren’t reflected in the data. In fact, the only city with an increase in depression-related language after the first wave was Mumbai, which saw a significant second wave.
“After the first wave, people just seemed to figure out how to cope. It just wasn’t the same sort of language being used, which I think in some ways is a really positive thing,” says Fyshe, who is also a member of the Neuroscience and Mental Health Institute.
Fyshe says the methods used to scrape Reddit subforums, identify keywords with machine learning and analyze Twitter data could be applied to a wide range of subjects. For example, when examining data from Seattle, the researchers found strong reactions to the Black Lives Matter movement.
“It was indicative of there being a large change to the general mood — what people were talking about and how people were feeling about the world they lived in.”
Twitter recently created a new program for researchers that gives them access to far more tweets than they previously had.
“They opened it up so you could scrape millions of tweets a month, which absolutely changed what we were able to do. We could now get a more complete picture rather than just a subsample of what’s available,” says Fyshe.
Next, Fyshe and collaborators are examining supportive language and how it has changed over the course of the pandemic. They plan to compare tweets related to Bell Let’s Talk Day in January 2020 with the same content in January 2021 to explore the differences in language used before and during the pandemic.
The research was supported by CIFAR, with large-scale computing support from Compute Canada.