Lowenstein Clinic Proposes Framework to Moderate Indirect Hate Speech Online
In light of social media's role in fueling violence in volatile regions, the Allard K. Lowenstein International Human Rights Clinic at Yale Law School has released a report proposing how social media giant Meta can take a human rights approach to moderating a particular kind of hate speech in conflict or crisis situations.
"Managing and Mitigating Indirect Hate Speech on Meta Social Media Platforms" outlines a framework for how Meta — the company best known for Facebook and Instagram — and similar social media platforms can approach how they review and monitor user-generated content for indirect hate speech in places or situations that have a heightened risk of violence. The report's focus is proxy or indirect hate speech, defined as hate speech that is likely to contribute to violence but does not explicitly name a protected characteristic, making it harder to identify than more direct forms of hate speech. Though the report looks at indirect hate speech online within a specific context, it recognizes the need to effectively address the harmful phenomenon of online hate speech as a whole.
The report follows research by others on how information shared on social media has contributed to violence in politically sensitive areas. To show how this could happen, the report cites research by Global Witness, which documented how Meta failed to detect overt hate speech in advertisements the organization submitted as a test in Myanmar, Ethiopia, and Kenya. The finding was especially concerning, according to the Lowenstein report, because hate speech has particular potential to incite violence in those countries. What’s more, the report notes, Meta’s review process did not catch hate speech in these countries even though the company has said it was putting more resources toward reducing hate speech there.
The report recommends that Meta broaden its definition of hate speech in the Facebook Community Standards, an outline of what is and is not allowed on the company's platforms, arguing that this is a necessary starting point for properly capturing indirect hate speech. Drawing on publicly available information and interviews with human rights advocates who specialize in online hate speech prevention, the report concludes that closing gaps in the standards would cover a broader swath of online hate speech as conceptualized by human rights law. Moreover, by broadening its overall definition of hate speech, Meta could improve monitoring of direct hate speech as well by clarifying the underlying principles and purpose of its hate speech policies writ large.
The report proposes that Meta adopt a signals framework for content moderation, both to determine whether specific content constitutes indirect hate speech and to help moderators decide which content to prioritize within large-scale enforcement. By examining case studies across several countries, most of which are experiencing emerging or active conflict, the report illustrates how this framework could be useful in moderating hate speech.
Using the holistic signals framework, content moderators would determine whether content should be flagged, removed, or otherwise sanctioned as hate speech by identifying online and offline signals. Online signals relate to the content of the post itself and how users interact with it, while offline signals relate to the real-world social and political context in which the post exists. As the report authors explain, online signals often include proxy language, account history, reach and engagement, and explicit disclaimers, while offline signals include local risk of conflict, identity of the target, and identity of the poster.
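To make the idea concrete, a signals framework of this kind could be sketched in code as a scoring function that combines online and offline signals into a single priority for a review queue. The sketch below is purely illustrative: the report does not prescribe an algorithm, and all field names, weights, and thresholds here are hypothetical assumptions, not drawn from the report or from any actual Meta system.

```python
from dataclasses import dataclass


@dataclass
class PostSignals:
    """Signals attached to one piece of content.

    All fields are illustrative stand-ins for the signal categories the
    report describes, not real platform attributes.
    """

    # Online signals: the post itself and how users interact with it
    uses_proxy_language: bool       # coded terms standing in for a protected group
    poster_prior_violations: int    # account history
    engagement: int                 # reach and engagement (shares, reactions)
    has_disclaimer: bool            # an explicit disclaimer may lower risk

    # Offline signals: the real-world social and political context
    local_conflict_risk: float      # 0.0-1.0, risk of violence where the post circulates
    targets_vulnerable_group: bool  # identity of the target
    poster_is_influential: bool     # identity of the poster


def risk_score(s: PostSignals) -> float:
    """Combine online and offline signals into one priority score.

    The weights are arbitrary placeholders chosen for illustration; in
    practice they would come from policy expertise and local context.
    """
    score = 0.0
    if s.uses_proxy_language:
        score += 2.0
    score += min(s.poster_prior_violations, 3) * 0.5
    score += min(s.engagement / 10_000, 2.0)   # cap the engagement contribution
    if s.has_disclaimer:
        score -= 0.5
    score += 3.0 * s.local_conflict_risk       # offline context weighs heavily
    if s.targets_vulnerable_group:
        score += 1.5
    if s.poster_is_influential:
        score += 1.0
    return score


def prioritize(posts: list[PostSignals]) -> list[PostSignals]:
    """Order a review queue so the highest-risk content is moderated first."""
    return sorted(posts, key=risk_score, reverse=True)
```

One design point the sketch captures: because the offline conflict-risk term is weighted heavily, identical posts would be prioritized differently depending on where they circulate, which mirrors the report's emphasis on context in situations of emerging or active conflict.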
The report also concludes that Meta must dedicate additional and strategic resources — intellectual and financial — to moderate both direct and indirect hate speech in situations of emerging or present conflict where rapid, large-scale enforcement is critical.