How might we determine the impact of the Coronavirus on the language around ‘humanitarianism’ in global media discourse?

We used Natural Language Processing (NLP) to examine a corpus of humanitarian news articles from Euro-Atlantic Countries, Gulf Donors, and New Global Media Players, published between December 2019 and August 2020. The analysis gave us insight into the directionality of humanitarian aid, the key topics of that period, and how nations approached handling the pandemic.

 

MY ROLE

Researcher

Data Analyst

Designer

Developer

TOOLS & METHODS

Python for NLP

Topic Modelling

TF-IDF

Collocation Analysis

Highcharts.js

AmCharts

Adobe XD

TEAM

Tashfeen Ahmed

Tamara Lottering

Xiaohang Xu

Minjia Zhao

Jin Mu

DURATION

3 months

 
 

Methodology and Analyses

We began with initial cleaning and tokenization. Pre-processing involved normalization, stemming, and stopword processing. Once cleaning and pre-processing were complete, we analysed the collocation of terms using n-grams. Trigrams yielded better results than bigrams because they revealed more of the context in which the terms were used.
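As a rough illustration of the n-gram step, here is a minimal sketch using only the Python standard library. It is not the project's actual code: the stopword list, the sample sentences, and the function names `preprocess`, `ngrams`, and `top_ngrams` are all made up for the example, stopwords are simply dropped, and stemming is left out to keep it short.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the list used in the project).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "for", "on", "is", "was", "by"}

def preprocess(text):
    """Lowercase, keep alphabetic tokens only, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def ngrams(tokens, n):
    """Return consecutive n-token tuples from a token list."""
    return list(zip(*(tokens[i:] for i in range(n))))

def top_ngrams(docs, n=3, k=5):
    """Count n-grams across all documents and return the k most frequent."""
    counts = Counter()
    for doc in docs:
        counts.update(ngrams(preprocess(doc), n))
    return counts.most_common(k)

# Invented sample sentences, standing in for news articles.
sample_docs = [
    "Humanitarian aid was sent to the region during the pandemic.",
    "The agency sent humanitarian aid to refugees in the region.",
]

print(top_ngrams(sample_docs, n=2, k=3))
print(top_ngrams(sample_docs, n=3, k=3))
```

Switching `n` from 2 to 3 shows the trade-off described above: trigrams carry more surrounding context per phrase, at the cost of lower counts for each one.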

 
 

To better understand the topics highlighted over time, we performed LSA and LDA topic modelling and examined TF-IDF term weights. These weights gave us a clearer idea of which topics we could compare.
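To make the TF-IDF idea concrete, the sketch below computes the plain textbook weighting by hand: term frequency within a document multiplied by the log of the inverse document frequency. This is an assumption about the variant; libraries such as scikit-learn apply smoothing and normalisation by default, so their numbers would differ. The function name `tf_idf` and the toy token lists are invented for illustration.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document.

    weight = (count of term in doc / doc length) * log(N / docs containing term)
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document

    weights = []
    for tokens in docs:
        term_counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return weights

# Invented toy documents, already tokenised.
toy_docs = [
    ["aid", "covid", "covid"],
    ["aid", "refugee"],
]

for doc_weights in tf_idf(toy_docs):
    print(sorted(doc_weights.items(), key=lambda item: -item[1]))
```

A term that appears in every document (here `aid`) gets a weight of zero, which is why TF-IDF is useful for surfacing the terms that distinguish one period or source from another.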

During the exploratory data analysis (EDA), I looked at news articles from 2010 to 2020 to understand the subjectivity of humanitarian news discourse. The pattern over the decade shows that the US and UK were poles apart in 2010 (with the UK being more subjective) but had converged to the same level by 2020. This visualisation has not been included in the ‘COVID in Pixels’ web page.

Visualisations

I used Highcharts.js and AmCharts to build the visualisations. The goal was to help viewers quickly grasp key insights from the data. The website is hosted on GitHub Pages and features a timeline of news articles along with line and bar graphs. The data was generated in Python and exported to JSON for the JavaScript-based visualisations.
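The Python-to-JSON handoff could look something like the sketch below. The exact structure the charts consumed is not documented here, so the name/data series shape, the function name `to_series`, and the monthly counts are assumptions chosen for illustration.

```python
import json

def to_series(name, counts_by_month):
    """Turn a {month: count} mapping into a chart-friendly series dict,
    with data points sorted chronologically by month string."""
    return {
        "name": name,
        "data": [[month, count] for month, count in sorted(counts_by_month.items())],
    }

# Invented counts, standing in for term frequencies per month.
example_counts = {"2020-02": 3, "2020-01": 1, "2020-03": 7}

payload = json.dumps([to_series("covid", example_counts)], indent=2)
print(payload)
```

Writing `payload` to a `.json` file lets a static site load it with a plain fetch, which suits hosting on GitHub Pages where there is no server-side code.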

This was a group project for one of the courses in the Design Informatics programme.