Can applied linguistics help combat the spread of anti-vaccine misinformation online? by Milo Coffey.
We’ve all heard it. Whether it’s from a family member, a co-worker, or even just somebody in the queue outside the supermarket, at some point in the past year you’re bound to have come across at least one person who maintains that vaccines are harmful: “I’m not taking any experimental jabs,” they declare with conviction, and proceed to tell you all about how they heard on Facebook that Mary’s friend’s brother’s girlfriend once got a cold after having her flu vaccine in 2012.
It’s nonsense, of course, but it’s dangerous nonsense. Mass vaccination is key to protecting us all from diseases that may otherwise lead to serious illness or death. But when misinformation begins to spread, there is a danger that fewer people will want to get vaccinated. Not only does this put individuals at risk, but it also harms the community by reducing the effects of herd immunity, whereby unprotected people are less likely to contract an infection due to the majority of people having been vaccinated.
But what does any of this have to do with applied linguistics, you might ask. Well, to stop the spread of misinformation about vaccines, you first need to know what you’re looking for. That’s where applied linguists come in. As part of my third-year undergraduate module Language in the Media, I recently carried out a study which aimed to identify the linguistic features that characterise anti-vaccine and anti-science posts, especially those concerning Covid-19, on Twitter. In doing so, I hoped to provide a basis for future attempts to counter the spread of this misinformation and prevent people from being misled about vaccination.
The first step in my investigation was to gather the data. I needed to create a corpus of tweets so I could look for patterns of linguistic features across the texts. To keep the data as balanced (and therefore as unbiased) as possible, half of the tweets in my corpus were anti-vaccine and anti-science, and the other half were pro-vaccine and pro-science. Including both groups also allowed me to compare the language used by each. I sourced the tweets by searching Twitter for popular hashtags employed by users in either camp and entering the first ten tweets from each search into my corpus.
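For readers curious about the nuts and bolts, here is a toy sketch (not my actual pipeline) of how a balanced two-group corpus like this could be represented in Python. The example tweets are invented placeholders standing in for the collected data.

```python
# A minimal, hypothetical representation of a balanced two-group tweet corpus.
# In practice each search result would be added as one entry like these.
from collections import Counter

corpus = [
    {"text": "Get vaccinated and protect your community. #StopTheSpread",
     "group": "pro"},
    {"text": "Think before you get an unknown substance pumped into you",
     "group": "anti"},
    # ... one entry per collected tweet, half labelled "pro", half "anti"
]

# Sanity check: a balanced corpus should show equal counts per group.
group_sizes = Counter(tweet["group"] for tweet in corpus)
print(group_sizes)
```

Keeping the group label alongside each tweet makes the later step of comparing feature frequencies between the two halves straightforward.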
To analyse the tweets, I used a framework developed by Ken Hyland which helps researchers to identify how people use language to express their opinions and try to influence the views of others (Hyland calls this ‘interaction’). Interaction breaks down into two categories, which are called ‘stance’ and ‘engagement’. ‘Stance’ refers to the ways that writers use language to express themselves, their opinions and their commitments in a text, whereas ‘engagement’ refers to the ways writers use language to acknowledge the presence of the reader and try to guide them towards a particular point of view. As the diagram below shows, stance and engagement are carried out through the use of particular linguistic features. For example, a writer might use boosters (words like certainly, definitely) to express their stance towards what they are writing about, or use reader pronouns like we and you to try to involve the reader in the discussion and influence their opinions.
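To make the idea of feature annotation concrete, here is a deliberately simplified illustration (my own toy version, not Hyland's full taxonomy) of tagging two interactional features in a tweet: boosters for stance and reader pronouns for engagement. The word lists are tiny samples of much larger sets.

```python
# Toy feature tagger: counts boosters (stance) and reader pronouns
# (engagement) in a single tweet. The word lists are illustrative samples.
import re

BOOSTERS = {"certainly", "definitely", "clearly", "obviously"}
READER_PRONOUNS = {"you", "your", "we", "us", "our"}

def count_features(text: str) -> dict:
    # Lowercase and split into word tokens (keeping apostrophes).
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "boosters": sum(t in BOOSTERS for t in tokens),
        "reader_pronouns": sum(t in READER_PRONOUNS for t in tokens),
    }

result = count_features("You definitely don't want to miss your vaccine!")
print(result)  # one booster ("definitely"), two reader pronouns ("you", "your")
```

Real annotation needs a human eye as well as word lists, since many features (like directives) depend on grammar and context rather than vocabulary alone.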
After going through the corpus and annotating for stance and engagement, I used computer software to tally up the total occurrences of all the different features. The results showed no significant difference (i.e. no difference that could not be put down to chance) in how often the elements of stance occurred in the two groups of tweets in my corpus. However, there were significant differences in how often elements of engagement occurred. Directives (words that tell the reader to do something, like imagine or consider) occurred much more frequently in the pro-vaccine, pro-science corpus, whereas reader pronouns were significantly more common in the anti-vaccine, anti-science corpus. This was because the pro-vaccine tweets often included instructions for how to stay safe, like Wash your hands and Get vaccinated, while the anti-vaccine tweets often tried to engage the reader and win them over to their point of view with sentences like Think before you get an unknown substance pumped into you.
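The kind of significance check involved can be sketched with a hand-rolled chi-square test on a 2x2 table of feature counts. The counts below are invented for illustration, not my actual results.

```python
# Sketch of a chi-square test of independence for a 2x2 table,
# comparing how often a feature occurs in two subcorpora.
# All counts here are hypothetical.

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Invented counts: directives vs. all other annotated features,
# in the pro-vaccine and anti-vaccine halves of a corpus.
pro_directives, pro_other = 40, 160
anti_directives, anti_other = 12, 188

stat = chi_square_2x2(pro_directives, pro_other, anti_directives, anti_other)
# 3.841 is the critical value for 1 degree of freedom at p < .05
print(round(stat, 2), "significant" if stat > 3.841 else "not significant")
```

If the statistic exceeds the critical value, the difference in frequencies is unlikely to be down to chance alone, which is exactly the question the study's comparisons needed to answer.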
I also found that many of the features of stance and engagement that were used in the tweets I analysed could be linked to one or more themes which previous research had identified in vaccine discourse. For example, anti-vaccinators’ suspicion that people were looking to accuse them of spreading conspiracy theories was often realised through the use of hedges (words like may, might). This allowed them to refrain from fully committing to what they were saying, in case it was later disproven. Meanwhile, the instructional hashtags of the pro-vaccine and pro-science tweets like #StopTheSpread and #WearAMask link to the group’s promotion of science and safety and criticism of anti-vaccinators.
Overall, my investigation showed that anti-vaccine, anti-science and pro-vaccine, pro-science tweets use stance and engagement in different ways to express their views and try to persuade people to agree with them, and also that these features can be linked to different themes which characterise the discourse of the two groups. The next step would be to use this information to help design systems that can identify and evaluate tweets about vaccination, countering the spread of misinformation and potentially saving lives. So, next time you see a post online about vaccines, be wary of how the writer might be trying to use language to influence you. Applied linguistics knowledge can come in handy when you least expect it!