Framing of social issues in media

So far, I have experimented with standardized tasks and datasets that were provided and easily accessible. In the real world, though, NLP practitioners often have to solve a problem from scratch. This includes gathering and cleaning data, choosing a model, iterating on the model, and possibly going back to change the data.

Therefore, I built my own end-to-end system for this project. Starter code was provided to me, but I freely tried many things beyond it. The full process required the following steps:

Understand the task specification
Collect raw data
Annotate training and test data for development
Train and test models using this data
“Deploy” the system
Write a report/article

Task Specification

What is framing?

Framing is selecting and amplifying some aspects of a perceived reality in a communicating text. The text could be a tweet, a news article, or any other piece of media. Examples [1]:

When some news media emphasize the mental illness of gun shooters over other aspects of gun violence in covering the issue, this is framing.
When you choose to purchase a yogurt product that is advertised as “90 percent less fat” rather than one that says “10 percent fat,” the framing effect is at work.

Why does it matter?

Quoting from one of the seminal studies of the problem:

In a polarized media environment, partisan media outlets intentionally frame news stories in a way to advance certain political agendas. Even when journalists make their best efforts to pursue objectivity, media framing often favors one side over another in political disputes, thus always resulting in some degree of bias. Hence, a news framing analysis is helpful because it not only tells us whether a news article is left- or right-leaning (or positive or negative), but also reveals how the article is structured to promote a certain side of the political spectrum.

In communication research, manual identification of media frames is a challenging task due to the large amount of media data in this news-saturated environment. More importantly, there is a high level of complexity in framing analysis that often requires a careful investigation of nuances in news coverage, which is time-consuming.
Liu et al. (2019)

Hence, an NLP solution that automates media framing identification would immensely help social scientists and other analysts.

The Task

Identify the framing of a given paragraph/sentence in several languages. Effectively, this is a straightforward text classification problem in a multilingual setting.

Input

A text file with one paragraph/sentence per line. An example of the input looks like this:

Some economists say that immigrants, legal and illegal, produce a net economic gain, while others say that they create a net loss.

Output

The model outputs a .tsv file with one line per input: the sentence, a tab, and then the corresponding label.
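The input and output formats described above can be sketched in a few lines of Python. This is only an illustration of the file handling, not the system's actual code: the function names are my own, and dummy_classify is a hypothetical stand-in that labels everything "Other" where a trained model would go.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def read_sentences(path: Path) -> List[str]:
    """Read one paragraph/sentence per line, skipping blank lines."""
    with path.open(encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def write_predictions(path: Path, sentences: Iterable[str], labels: Iterable[str]) -> None:
    """Write one 'sentence<TAB>label' row per line."""
    with path.open("w", encoding="utf-8") as f:
        for sentence, label in zip(sentences, labels):
            # A stray tab inside the sentence would break the two-column layout.
            clean = sentence.replace("\t", " ")
            f.write(f"{clean}\t{label}\n")


def dummy_classify(sentence: str) -> str:
    """Hypothetical placeholder: a real system would call a trained model here."""
    return "Other"


def label_file(src: Path, dst: Path, classify: Callable[[str], str] = dummy_classify) -> None:
    """Read sentences from src, label each one, and write the .tsv to dst."""
    sentences = read_sentences(src)
    write_predictions(dst, sentences, [classify(s) for s in sentences])
```

Calling label_file(Path("input.txt"), Path("output.tsv")) with any function that maps a sentence to a label would produce the file in the expected shape; the file names here are placeholders.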

The Annotation Standard

Social scientists have created a widely accepted list of 15 cross-cutting framing dimensions, such as economics, morality, and politics. These were originally developed by Boydstun et al. (2014) and are termed the “Policy Frames Codebook”. The paper describing the codebook is publicly available. The framing dimensions are also listed in the Table below.

Economy	costs, benefits, or other financial implications
Capacity and Resources	availability of physical, human or financial resources, and capacity of current systems
Morality	religious or ethical implications
Fairness and Equality	balance or distribution of rights, responsibilities, and resources
Legality, Constitutionality, Jurisdiction	rights, freedoms, and authority of individuals, corporations, and government
Policy Prescription and Evaluation	discussion of specific policies aimed at addressing problems
Crime and Punishment	effectiveness and implications of laws and their enforcement
Security and Defence	threats to welfare of the individual, community, or nation
Health and Safety	health care, sanitation, public safety
Quality of Life	threats and opportunities for the individual’s wealth, happiness, and well-being
Cultural Identity	traditions, customs, or values of a social group in relation to a policy issue
Public Sentiment	attitudes and opinions of the general public, including polling and demographics
Political	considerations related to politics and politicians, including lobbying, elections, and attempts to sway voters
External Regulation and Reputation	international reputation or foreign policy of the US
Other	any coherent group of frames not covered by the above categories

Annotation labels
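
For a classifier, the 15 labels in the table are usually mapped to integer ids. The sketch below shows one way to do that; the ordering simply follows the table, and the names FRAMES, encode, and decode are illustrative assumptions rather than part of the codebook or of the starter code.

```python
from typing import Dict, List

# The 15 framing dimensions, in the order they appear in the table.
FRAMES: List[str] = [
    "Economy",
    "Capacity and Resources",
    "Morality",
    "Fairness and Equality",
    "Legality, Constitutionality, Jurisdiction",
    "Policy Prescription and Evaluation",
    "Crime and Punishment",
    "Security and Defence",
    "Health and Safety",
    "Quality of Life",
    "Cultural Identity",
    "Public Sentiment",
    "Political",
    "External Regulation and Reputation",
    "Other",
]

# Assumed id scheme: position in the list above.
LABEL_TO_ID: Dict[str, int] = {name: i for i, name in enumerate(FRAMES)}


def encode(label: str) -> int:
    """Map a frame name to its integer id; reject names outside the table."""
    key = label.strip()
    if key not in LABEL_TO_ID:
        raise ValueError(f"Unknown frame label: {label!r}")
    return LABEL_TO_ID[key]


def decode(label_id: int) -> str:
    """Map an integer id back to its frame name."""
    if not 0 <= label_id < len(FRAMES):
        raise ValueError(f"Unknown frame id: {label_id}")
    return FRAMES[label_id]
```

Rejecting unknown names early helps catch annotation typos before they silently become a sixteenth class.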

Domains

I applied the system to news data related to two social issues:

immigration
same-sex marriage

Languages

I had a few sentences in English already. However, considering the importance of multilingualism, the system was designed to provide labels for sentences in the following languages:

English
Mandarin (Chinese)
Hindi
Telugu
Bengali
Greek

It also had to handle two surprise languages that it had never seen before:

Russian
Turkish

References

[1] Taken from http://www.openframing.org/home.html

The Number Crunch