Introducing Hatebase: the world’s largest online database of hate speech

Predicting genocide is, by definition, an almost impossible task due to the scarcity of early, actionable data. There’s no chi-squared test or Monte Carlo method for reliably distributing societies along a spectrum from homogeneous to homicidal, both because the extermination of entire populations has become a relatively rare occurrence (thanks to the ever-increasing internationalization of human rights, law, media, and trade) and because those societies which do succeed at systematized annihilation are often equally resourceful at hiding evidence of their crimes.

In the information-rich twenty-first century, good data remains the Achilles’ heel of genocide studies.

At the Sentinel Project for Genocide Prevention, we’re tackling this problem on two fronts. First, to improve our data intake, we’ve begun direct field work in our situations of concern (SOCs). Earlier this month, staff from the Sentinel Project were in Kenya during the contested presidential elections, monitoring tensions in urban hubs such as Nairobi and Mombasa as well as in known regional conflict zones such as the Tana River District.

Our second strategy has been to improve the tools with which we parse and prioritize data, whether from the field, from mainstream media or from social networks. To this end, the Sentinel Project recently partnered with my own organization, Mobiocracy, on the development of Hatebase, an authoritative, multilingual, usage-based repository of structured hate speech which data-driven NGOs can use to better contextualize conversations from known conflict zones.

Hatebase is available to casual users through a Wikipedia-like web interface, and to developers through an authenticating API. Although the core of Hatebase is its community-edited vocabulary of multilingual hate speech, a critical concept in Hatebase is regionality: users can associate hate speech with geography, thus building a parallel dataset of “sightings” which can be monitored for frequency, localization, migration, and transformation.

For instance, an organization monitoring several simultaneous theaters of operation might integrate location-based Hatebase data into its monitoring software to assign additional real-time “weight” to specific conflict zones, providing guidance on how to best redeploy limited resources. For genocide monitoring organizations in particular, regional hate speech is a widely recognized indicator of elevated risk.
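
As a rough illustration of that kind of weighting, the sketch below counts recent sightings per region and normalizes the counts into relative weights. Everything in it is an assumption made for the example: the `Sighting` record, its fields, and the `regional_weights` function are hypothetical and do not reflect the actual Hatebase API or schema, and the sample data is invented.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Sighting:
    """Hypothetical record: one observed use of a term in a region."""
    term: str
    region: str
    seen_on: date


def regional_weights(sightings, today, window_days=30):
    """Return each region's share of the sightings seen in the recent window."""
    cutoff = today - timedelta(days=window_days)
    counts = Counter(s.region for s in sightings if s.seen_on >= cutoff)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {region: n / total for region, n in counts.items()}


# Invented sample data, for illustration only.
sample = [
    Sighting("term_a", "Region X", date(2013, 3, 20)),
    Sighting("term_b", "Region X", date(2013, 3, 22)),
    Sighting("term_a", "Region X", date(2013, 3, 24)),
    Sighting("term_a", "Region Y", date(2013, 3, 23)),
    Sighting("term_a", "Region Y", date(2013, 1, 2)),  # outside the window
]

weights = regional_weights(sample, today=date(2013, 3, 25))
```

A real monitoring tool would likely combine such weights with other indicators rather than rely on raw sighting counts alone.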

There are some weaknesses implicit in a solely vocabulary-based approach to linguistic analysis. Innocuous language, when localized, can adopt a sinister secondary meaning (e.g. “cockroaches,” meaning Tutsis in Rwanda), and threats can be communicated without the need for easily identified keywords (“their days are numbered”). Despite these limitations, Hatebase can provide a layer of relevance which complements other context-based information sources, not unlike traffic congestion layered onto a city map.
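
Both limitations show up in even a minimal sketch of vocabulary matching. The code below is an illustration under stated assumptions, not a description of how Hatebase itself works: `VOCABULARY` is a hypothetical in-memory term list that maps each term to the regions where it is flagged, and `flagged_terms` is an invented helper.

```python
import re

# Hypothetical vocabulary: term -> regions where it carries a hateful meaning.
VOCABULARY = {
    "cockroaches": {"Rwanda"},
}


def flagged_terms(text, region):
    """Return the vocabulary terms in `text` that are flagged for `region`."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(
        term for term, regions in VOCABULARY.items()
        if term in words and region in regions
    )


# Region tagging separates the localized meaning from use elsewhere...
in_region = flagged_terms("They called them cockroaches.", "Rwanda")
elsewhere = flagged_terms("The kitchen has cockroaches.", "Canada")

# ...but a threat with no listed keyword still passes unnoticed.
missed = flagged_terms("Their days are numbered.", "Rwanda")
```

Region tagging narrows the match, but an innocuous use of the same word inside the flagged region would still be reported, which is why vocabulary matches are best treated as one signal among several.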

In the months ahead, we’ll be adding new data attributes, visualizations, and end-user functionality to Hatebase, with a particular focus on strengthening the API in accordance with our commitment to partnership-based innovation. Our hope is that other individuals, groups and organizations will embrace this collaborative model by leveraging Hatebase data in their own applications.


This entry was posted in General News, Kenya, Situation of Concern, Team Announcements.


  1. Posted March 26, 2013 at 6:32 am | Permalink

    How can we work together?

  2. Dr. Don John Omale
    Posted March 26, 2013 at 1:48 pm | Permalink

    This is a very good and intelligent innovation. It is a good database for geographic information on hate speech, and could be used as an early warning mechanism in violence prevention and peacebuilding. Thanks a lot; I am in.

  3. naftali mutahi
    Posted March 27, 2013 at 3:29 pm | Permalink

    keep moving.

  4. Marc Altman
    Posted March 27, 2013 at 4:47 pm | Permalink

    This posting is under a sub-heading of Kenya. Is Hatebase to be a general international database or, at least at the present time, is it focused on Kenya?

  5. Sentinel Project
    Posted March 28, 2013 at 4:36 pm | Permalink

    Hi Marc – Hatebase has a general international scope. We may have mis-tagged it; I will take a look. Thanks for the comment!


  6. Olek Netzer
    Posted April 17, 2013 at 6:44 am | Permalink

    I forgot to tell you:
    1. I shall be presenting a paper at the conference of IAGS (International Association of Genocide Scholars) this coming June in Siena, Italy; I hope to see some of you there.
    2. I happen to be a survivor of Genocide-Holocaust myself.
    Olek Netzer, PhD

  7. Gerry
    Posted April 27, 2013 at 7:27 pm | Permalink

    There is an ongoing issue in Tibet…. hundreds of people immolating themselves
    in protest over the 50+ year human rights violations on the part of the Chinese
    Communist Party.

    I’m a little surprised to NOT see Tibet listed in your “SITUATIONS OF CONCERN”.
