Considerations for Successful Counterspeech

Project Title

Evaluating Methods to Diminish Expressions of Hatred and Extremism Online

Lead / Author

Susan Benesch and Derek Ruths

Relevant Dates

Study conducted May 2014 to March 2016.

Description

This guide distills lessons from recent research into which types of counterspeech work to diminish expressions of hatred and extremism online. In particular, the resource draws on findings from the broader study “Evaluating Methods to Diminish Expressions of Hatred and Extremism Online,” which examines spontaneous activity on Twitter in which users respond directly to hateful or dangerous speech. Hateful speech here means an expression of hatred against a person or people based on their group identity, while dangerous speech refers to speech that can inspire or accelerate intergroup violence.

For the larger study, the authors aimed to improve methods both for identifying forms of hate speech online and for identifying types of counterspeech, towards a better understanding of which forms of counterspeech are more likely to work in which circumstances. Detailed findings are available in other publications.

According to the authors, there are two ways in which counterspeech can succeed. The first is speech, including text or visual media, that has a favourable impact on the original (hateful) Twitter user, shifting their discourse if not also their beliefs. Evidence of such change can include the original user apologizing, recanting, or deleting either their original tweet or their account.

The second type of success is to positively affect what the audience views as appropriate norms of discourse. The authors acknowledge that this is difficult to assess, but note that it may be indicated by extended dialogue that remains civil and by others joining in the counterspeech efforts. Given the current challenges of measuring the second category, this paper focuses on the first: a favourable impact on the original (hateful) Twitter user.

Select Findings

Through their research, the authors find numerous cases of successful counterspeech, as well as patterns in what seems to work, though they caution that the findings are preliminary and call for more work to build the evidence base. They note that the successful strategies are often used in combination, even within a single tweet, and that they are most likely to succeed when aimed at those less committed to hatred or extremism.

One such strategy is warning of the consequences of hateful speech, including the impact on the speaker (e.g. on their own relationships and employment) as well as on the target individual or group. The authors find evidence that this strategy can prompt deletion of the hateful tweet; while such impact was evident in the short term, they note that it is unknown whether this method can change a user’s behaviour over the medium to long term. The research also found successful counterspeech in examples of ‘shaming and labeling,’ where tweets are labeled as hateful, racist, bigoted, misogynist, etc. Given the stigma of such words, users who do not identify with these labels can be quick to alter or delete the offending tweets.

Empathy and affiliation, such as creating a connection with the speaker over a shared background or identity, is another strategy found to change the tone of an exchange and even curtail hateful speech. The authors note that while there is little evidence of long-term behaviour change, this approach can prevent escalation in the moment. They also found cases of humour shifting the conversation, de-escalating conflict, and drawing more attention to counterspeech messages. Further, the authors emphasize the persuasive impact of images – such as memes, graphics, photographs, animations and videos – over text alone, in part because images can transcend cultural and linguistic boundaries, expanding the audience and allowing counterspeech to spread virally.

The authors also highlight what does not work: strategies they find to be ineffective and in some cases even counterproductive or harmful. For example, hostility, meaning the use of an aggressive tone and/or insults, showed evidence of backfire effects, such as escalation of hateful rhetoric and reluctance among potential counterspeakers to join in. Fact-checking, that is, responding to hateful speech by correcting falsehoods or misperceptions, was also found not to work as a method of influencing the original speaker; the authors argue that corrections that insult or threaten an individual’s worldview can lead them to entrench further in their views. They also caution that some counterspeech strategies can themselves become harassment, including threats, and/or serve to silence in ways that harm the original speaker and the virtual audience. Overall, the authors recommend caution, good judgement, and adherence to the guidelines above when engaging in counterspeech.

Further Information

Considerations for Successful Counterspeech

Related Initiatives

Susan Benesch et al., “Counterspeech on Twitter: A Field Study,” Public Safety Canada, 2016.

Haji Mohammad Saleem et al., “A Web of Hate: Tackling Harmful and Hateful Speech in Online Social Spaces,” TA-COS, 2016.

Jamie Bartlett and Louis Reynolds, “The state of the art 2015: a literature review of social media intelligence capabilities for counter-terrorism,” Demos, 2015.
