Beyond Gun Control: Creating a Dutch Stance Dataset for Diversity in News Recommendation

Myrthe Reuver, Kasper Welbers, Wouter van Atteveldt, Antske Fokkens, Mariken van der Velden and Felicia Locherbach

Stance detection could be used for designing a democracy-supporting, viewpoint-diverse news recommender (Reuver et al. 2021). However, the contemporary stance dataset landscape is dominated by English-language datasets (e.g. the 10 benchmark datasets in Schiller et al. (2021)). Most of these datasets also target North American socio-political debates, such as gun control and tipping in restaurants, which makes them less useful for work on Dutch democratic debate. This multidisciplinary work in progress creates a Dutch stance dataset covering multiple salient issues in the 2020 Dutch election. The dataset consists of news article sentences on political issues, together with political actors’ stances towards these issues. An example is given in (1) below.

1. “Voormalig minister van Veiligheid en Justitie Ivo Opstelten (VVD) vindt het te ver gaan om studenten te verbieden ’s avonds een biertje te drinken in een studentensocieteit.”

‘Former Minister of Security and Justice Ivo Opstelten (VVD) thinks it is going too far to prohibit students from drinking a beer in a student club in the evening’

Our current pipeline consists of a pre-annotation step and an annotation step. We use a large corpus of news articles from diverse sources, published during the 2020 Dutch parliamentary elections. First, we identify sentences that contain a political actor, which we define as a natural person or political party either in parliament or up for election in the 2020 Dutch parliamentary election, or a parliamentary or governmental institution (e.g. ‘De Tweede Kamer’). Second, we detect political issues on which political parties and actors have policy positions, such as “taxes” and “crime reduction”. Lastly, we detect the stances of these actors on these issues. We define a stance as an actor either supporting, or actively opposing or seeking to reduce, the issue under discussion, whether this is tax cuts, support for refugees, or law enforcement. Stances are annotated with CCSAnnotator, a newly designed annotation tool that lets users swipe on their mobile phone to assign one of three classes (pro, con, or neutral).
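The pre-annotation step described above can be pictured with a small sketch. It is not the project's actual implementation: the lexicons `ACTORS` and `ISSUES`, the issue label "alcohol policy", the `StanceItem` record, and the `pre_annotate` function are all hypothetical names chosen for illustration, and simple substring matching stands in for whatever actor and issue detection the pipeline really uses. The sketch only shows how candidate actor-issue pairs could be prepared for human annotation with the three stance classes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Stance(Enum):
    """The three annotation classes named in the abstract."""
    PRO = "pro"
    CON = "con"
    NEUTRAL = "neutral"


@dataclass
class StanceItem:
    """One annotation unit: a sentence, an actor, an issue, and a stance."""
    sentence: str
    actor: str
    issue: str
    stance: Optional[Stance] = None  # left empty until a human annotates it


# Hypothetical toy lexicons (assumptions, not the project's resources).
ACTORS = ["Ivo Opstelten", "VVD", "De Tweede Kamer"]
ISSUES = {
    "alcohol policy": ["biertje", "alcohol"],
    "taxes": ["belasting"],
}


def pre_annotate(sentence: str) -> List[StanceItem]:
    """Return one unlabelled item per (actor, issue) pair found in the sentence."""
    lowered = sentence.lower()
    actors = [a for a in ACTORS if a.lower() in lowered]
    issues = [
        issue for issue, keywords in ISSUES.items()
        if any(keyword in lowered for keyword in keywords)
    ]
    return [StanceItem(sentence, a, i) for a in actors for i in issues]


example = (
    "Voormalig minister van Veiligheid en Justitie Ivo Opstelten (VVD) "
    "vindt het te ver gaan om studenten te verbieden 's avonds een biertje "
    "te drinken in een studentensocieteit."
)
items = pre_annotate(example)
```

Applied to example (1), this toy version yields one candidate item for each matched actor, each with an empty stance field that an annotator would then fill in as pro, con, or neutral.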

This dataset will allow us to optimize for political actor diversity, issue diversity, and viewpoint (stance) diversity in news recommendation.
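The abstract does not specify how diversity would be measured. As one possible illustration, the sketch below scores a recommendation list with normalized Shannon entropy over its actor, issue, or stance labels. The choice of entropy and the function name `normalized_entropy` are assumptions made here, not the authors' method.

```python
import math
from collections import Counter
from typing import Iterable, Optional


def normalized_entropy(labels: Iterable[str], n_classes: Optional[int] = None) -> float:
    """Shannon entropy of the label distribution, scaled to the range [0, 1].

    n_classes is the number of possible labels (e.g. 3 for pro/con/neutral).
    If omitted, the number of distinct observed labels is used instead.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    k = n_classes if n_classes is not None else len(counts)
    if total == 0 or k < 2:
        return 0.0  # an empty list or a single possible class has no diversity
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(k)


# Stance labels of the items in a hypothetical recommendation list.
balanced = normalized_entropy(["pro", "con", "neutral"], n_classes=3)
one_sided = normalized_entropy(["pro", "pro", "pro"], n_classes=3)
```

A list with evenly spread stances scores close to 1, while a list containing a single stance scores 0. The same function could be applied to actor or issue labels.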

References

Reuver, Myrthe, Nicolas Mattis, Marijn Sax, Suzan Verberne, Nava Tintarev, Natali Helberger, Judith Moeller, Sanne Vrijenhoek, Antske Fokkens, and Wouter van Atteveldt (2021), Are we human, or are we users? The role of natural language processing in human-centric news recommenders that nudge users to diverse content, Proceedings of the 1st Workshop on NLP for Positive Impact, pp. 47–59.

Schiller, Benjamin, Johannes Daxenberger, and Iryna Gurevych (2021), Stance detection benchmark: How robust is your stance detection?, KI - Künstliche Intelligenz 35 (3), pp. 329–341, Springer.