Data and stimuli for Samo, Bonan and Si (2022) and Bonan and Samo (submitted).
Link to Samo, Bonan and Si (2022): https://ebooks.iospress.nl/doi/10.3233/SHTI220702
Abstract: This paper explores a methodology for bias quantification in transformer-based deep neural network language models for Chinese, English, and French. When queried with health-related mythbusters on COVID-19, we observe a bias that is not of a semantic/encyclopaedical knowledge nature, but rather a syntactic one, as predicted by theoretical insights of structural complexity. Our results highlight the need for the creation of health-communication corpora as training sets for deep learning.
The code is adapted from Renaud (2020, https://github.com/celine-renaud/Memoire).