Labor Conflict in the Middle East and North Africa (MENALC) Dataset

The Labor Conflict in the Middle East and North Africa (MENALC) Dataset represents one of the largest and most comprehensive efforts to catalogue events of labor conflict and unrest in the Middle East/North Africa region. Future iterations of the project will expand the dataset to cover a wide range of autocratic countries in Africa, Latin America, Eastern Europe and Southeast Asia.

Designed to augment policymakers' and researchers' ability to analyze labor mobilization patterns, the MENALC dataset contains information on protests, sit-ins, strikes, and other public demonstrations carried out by unions and union affiliates in the MENA region. MENALC currently includes information on over 3,500 labor protest events from 1980 to 2011.

Each event record contains information about the timing, location and magnitude of the labor protest event, as well as descriptions of the actors, targets, issues of contention and government responses. While other protest datasets contain information on social conflict events more generally, this dataset gathers information specifically on labor movements, allowing researchers to catalogue their protest activities and collective mobilization efforts.

MENALC is directed by Ashley Anderson (Harvard University) and has been funded by generous grants from the National Science Foundation (NSF), Harvard University Weatherhead Center, Harvard University Institute for Quantitative Social Science (IQSS), and the Project on Middle East Political Science (POMEPS). To access these data, please send a formal inquiry to Ashley at the following address: aaanders@g.harvard.edu. 

Methodology

The MENALC dataset is intended to provide researchers with a comprehensive, methodologically rigorous resource for analyzing labor behavior and unrest in the Middle East/North Africa (MENA). Every independent, autocratic nation in the MENA region with a legal, operative labor union (a total of 13 countries) is covered beginning in 1980. Unlike other datasets, which draw from a limited number of sources, this dataset compiles its primary information from over 200 sources in English, French and Arabic, retrieved through the Lexis-Nexis Academic, Factiva, and World News Connection news databases, as well as from archival materials held by the Harvard Library and the Library of Congress.

The MENALC dataset codes all reported protest events undertaken by labor groups, defined as workers, labor organizations, or unions. In generating these event data, MENALC combines automated text analysis, used to process event data from journalistic reports, with human coding, used to classify specific information about participants, targets, and demands that machine coding techniques cannot provide.

No event is considered too small. Ongoing events, such as strikes, occupations, lockouts, and hunger strikes, that share the same actors, targets, and issues as previous event entries are coded as a separate event for each day, giving researchers flexibility in aggregating or disaggregating the data. Reports of plans for future actions are ignored, as are threats to engage in protest actions. Only events for which coders could identify a credible start date are included in the dataset.
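Because ongoing events are stored as one record per day, researchers who want episode-level counts need to merge those daily records back together. The following is a minimal Python sketch of one way to do that, not the project's own procedure. The class name, field names and the one-day gap rule are assumptions made for illustration; the actual column names are given in the codebook.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from itertools import groupby


@dataclass(frozen=True)
class DailyEvent:
    # Hypothetical field names; the real column names are defined in the codebook.
    day: date
    country: str
    event_type: str
    actors: str
    targets: str
    issues: str


def _episode_key(event):
    """Attributes that must match for two daily records to belong to one episode."""
    return (event.country, event.event_type, event.actors, event.targets, event.issues)


def collapse_episodes(events):
    """Merge daily records with identical attributes on consecutive days.

    Returns a list of (key, first_day, last_day) tuples. A gap of more than
    one day between matching records starts a new episode (an assumption).
    """
    ordered = sorted(events, key=lambda e: (_episode_key(e), e.day))
    episodes = []
    for key, group in groupby(ordered, key=_episode_key):
        first = last = None
        for event in group:
            if first is None:
                first = last = event.day
            elif event.day - last <= timedelta(days=1):
                last = event.day
            else:
                episodes.append((key, first, last))
                first = last = event.day
        episodes.append((key, first, last))
    return episodes


def episode_length_days(first_day, last_day):
    """Inclusive length of an episode in days."""
    return (last_day - first_day).days + 1
```

Under these assumptions, three daily strike records on consecutive days collapse into a single three-day episode, while a matching record a week later is kept as a separate episode.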

Source Information

The dataset draws its information from text analysis of over 200 news sources included in the Lexis-Nexis, Factiva, and World News Connection databases. For Tunisia and Morocco specifically, sources also include major local news outlets (Al-Ittihad Ichtiraki, Al-Bayane, Le Temps, Le Renouveau, etc.) and union journals. By sourcing from so many news outlets, the MENALC dataset aims to offer the greatest comprehensiveness, coverage and reliability among extant datasets covering social conflict in the region.

Once news reports are obtained, text analysis protocols are used to sort through the thousands of news articles collected and identify those related to labor conflict. Human coders then classify descriptive information on each protest event, including date, location, actors, targets and protest demands. When discrepancies arise in news reporting, human coders take care to use the most widely reported figures and the most recent news sources.
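The two steps above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the project's actual text analysis protocols are not described here, so a simple keyword screen stands in for them, and the keyword list, function names and tie-breaking rule are all hypothetical.

```python
import re
from collections import Counter
from datetime import date

# Hypothetical keyword list standing in for the project's text analysis protocols.
LABOR_TERMS = re.compile(
    r"\b(strikes?|sit-ins?|walkouts?|lockouts?|unions?|workers?)\b",
    re.IGNORECASE,
)


def looks_labor_related(text):
    """Crude first-pass screen: does the article mention any labor term?"""
    return LABOR_TERMS.search(text) is not None


def resolve_figure(reports):
    """Pick one figure from conflicting reports.

    `reports` is a non-empty list of (published_date, figure) pairs.
    The most widely reported figure wins; ties go to the figure that
    appears in the most recent report (one reading of the rule above).
    """
    counts = Counter(figure for _, figure in reports)
    latest = {}
    for published, figure in reports:
        if figure not in latest or published > latest[figure]:
            latest[figure] = published
    return max(counts, key=lambda figure: (counts[figure], latest[figure]))
```

In practice a keyword screen of this kind would only narrow the pool of articles; the classification of dates, actors, targets and demands remains a task for human coders, as described above.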

Variables Included

MENALC tracks the following information about each protest event:

- Start date

- Duration

- Event Type

- Escalation

- Brief Description of the Event

- Actors and Targets

- Issues

- Number of Participants

- Number of arrests, injuries, deaths

- Repression

- Location

- Source
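For researchers planning to load the data programmatically, the variables listed above might map onto a record type along the following lines. This is a sketch only: every field name and type below is an assumption made for illustration, and the authoritative variable names, value codes and missing-data conventions are those in the codebook.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ProtestEvent:
    # All names and types are hypothetical; consult the codebook for the real ones.
    start_date: date
    duration_days: Optional[int]   # None when the duration is not reported
    event_type: str
    escalation: Optional[str]
    description: str
    actors: str
    targets: str
    issues: str
    participants: Optional[int]    # None when no figure is reported
    arrests: Optional[int]
    injuries: Optional[int]
    deaths: Optional[int]
    repression: Optional[str]
    location: str
    source: str


def has_casualty_data(event):
    """True if at least one of arrests, injuries or deaths is recorded."""
    return any(v is not None for v in (event.arrests, event.injuries, event.deaths))
```

Keeping unreported counts as `None` rather than zero avoids conflating "no arrests" with "arrests not reported", a distinction that matters when aggregating repression measures.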

Intercoder Reliability

To ensure continuity in the application of the coding methodology, we use one principal coder per country. To the greatest extent possible, coders are made familiar with country dynamics by reading both general historical literature on the country and more specific pieces on labor organization. Through this process coders gain an intimate knowledge of the relevant actors and issues in the country, which is invaluable when making the judgment calls that coding inevitably requires.

Additionally, to ensure the validity of results across coders, the team double-coded 10% of the country-years to check whether two coders entered the same information about protest events from the news sources. The cross-validation check showed substantial agreement between coders in all cases. For further information, please consult the codebook.
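The agreement statistic used in this check is not named here. As an illustration of how a double-coded sample can be compared, the Python sketch below computes raw percent agreement and Cohen's kappa for one categorical variable. The choice of kappa and the function names are assumptions, not a description of the team's procedure.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items on which the two coders assigned the same label."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement from each coder's marginal label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)
```

Each list holds one coder's labels for the same events in the same order, for example the event type assigned to each double-coded record.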

Codebook

The complete codebook is available here.