Michiel van der Meer and Liesje van der Linden
Analyzing public feedback from online platforms on governmental decisions can provide quick insights into the support and opposition expressed. Traditionally, however, computational analyses are limited to single-dimensional stance detection (e.g., Wang et al., 2020; Martin et al., 2020), whereas an overview of the arguments for or against a policy could give authorities a more nuanced perspective. Furthermore, existing approaches to argument extraction focus on mining arguments but fail to condense the information in a structured manner (Ein-Dor et al., 2020).
In this work, we explore the capabilities of Key Point Analysis (KPA; Bar-Haim et al., 2020), an automated method for extracting key arguments, on a corpus of Dutch opinions on the decision to stop administering the AstraZeneca vaccine to citizens under the age of 60. The opinions stem from the discussion forum Nujij, hosted by the popular Dutch news website Nu.nl. The corpus has previously been manually annotated for stance, arguments, and willingness to take the vaccine (van der Linden et al., 2022).
Since the corpus is annotated using an inductive approach, we can 1) investigate the KPA method’s ability to detect all high-level arguments, irrespective of their popularity, and 2) measure the recall of the argument matching approach. Such an analysis is usually not possible because corpora are rarely annotated exhaustively, both in terms of label definitions (the set of argument classes) and label assignments (the comments assigned to each argument class).
We find that the black-box model is unable to generalize to infrequent arguments, exhibits a series of failure cases, and reports overall performance that is inflated by its performance on a single class. Most argument clusters stemming from the inductive annotation procedure are not retrieved by the KPA method, even when it is optimized to find a similar number of key argument clusters. The best performance was observed when the method was restricted to a limited number of candidates, which leads to very general key arguments.
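To illustrate how an aggregate score can be inflated by a single frequent class, the following is a minimal sketch that contrasts micro-averaged and macro-averaged recall for argument matching. It is not the evaluation code used in this work; the function names, the argument labels, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def per_class_recall(gold, predicted):
    """Recall per gold argument class.

    gold and predicted hold one label per comment; None in predicted
    means the comment was not matched to any argument.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for g, p in zip(gold, predicted):
        totals[g] += 1
        if g == p:
            hits[g] += 1
    return {label: hits[label] / totals[label] for label in totals}


def micro_recall(gold, predicted):
    """Recall pooled over all comments; dominated by frequent classes."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def macro_recall(gold, predicted):
    """Unweighted mean of per-class recall; every class counts equally."""
    recalls = per_class_recall(gold, predicted)
    return sum(recalls.values()) / len(recalls)


# Hypothetical toy data: one frequent argument and two rare ones.
gold = ["safety"] * 8 + ["age_limit", "trust"]
# A matcher that only ever recovers the frequent argument.
predicted = ["safety"] * 8 + [None, None]

print(per_class_recall(gold, predicted))
print(micro_recall(gold, predicted))
print(macro_recall(gold, predicted))
```

In this toy setting the pooled score looks strong even though both rare arguments are missed entirely, which is why a per-class breakdown is informative when the annotation is exhaustive.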
Our results indicate that the KPA method can reconstruct, to some degree, the main arguments present in the AstraZeneca discussion. However, its failure to extract arguments beyond the most frequent ones limits its usability for directly informing policy. Instead, more attention should be paid to the representation of minority opinions in the long tail, which risk being neglected (Mustafaraj et al., 2011).
Finally, we call for the incorporation of a human-in-the-loop approach to facilitate the argument analysis. Hybrid methods can help avoid some of the drawbacks of automated methods while reducing the workload relative to a fully manual approach (van der Meer et al., 2022).