Recap of our Tech Talk 3: “The Making-Of the Inclusive Language Twitter Corpus”

Bibiana Cortés, Dr. Diana Carter, and Lucas Quesada present to an in-person and virtual audience during our third Tech Talk of the 2022/23 Series.

On November 15th, 2022, the AMP Lab welcomed Dr. Diana Carter, Lucas Quesada, Yarubi Díaz Colmenares, and Bibiana Cortés, who presented their ongoing research on inclusive language usage among Spanish speakers on Twitter. Their presentation walked through the three steps involved in creating their Inclusive Language Twitter Corpus.

Firstly, they presented how they used Netlytic.org software to gather their tweet dataset. They set parameters to collect instances of inclusive language usage, focusing on pronouns. In Spanish, pronouns are inflected to agree with the noun they refer to, which results in gendered pronoun forms. This aspect of Spanish excludes non-binary pronoun usage. Collecting and analyzing tweets helped determine how Twitter users were modifying pronouns to enact inclusive language.

Dr. Diana Carter responds to questions from the audience.

Next, the team used Python code to filter out tweets that were not relevant to inclusive language or that did not use creative elements to make Spanish pronouns inclusive. In some instances, they manually coded and sifted through their dataset to find examples of conscious inclusive language. For instance, the research team reflected on how tweets used the ‘@’ symbol instead of an o or an a in words such as nosotros, rejecting the masculine or feminine form in favour of an inclusive, non-gendered version of the pronoun. In some tweets, however, such usages appeared accidental, or were judged not to be conscious alterations of the gendered pronoun for the purpose of language inclusivity.
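To give a sense of what this kind of filtering can look like, here is a minimal Python sketch. It is not the research team's code: the pattern, the function names `find_inclusive_at` and `filter_tweets`, and the sample tweets are all assumptions made for illustration. It flags words where ‘@’ stands in for the final gendered vowel (as in nosotr@s) while skipping user mentions and email addresses.

```python
import re

# Hypothetical pattern (an assumption, not the team's actual filter):
# two or more letters, then "@", then an optional plural "s".
# The lookbehind keeps the match at the start of a word; the lookahead
# rejects email addresses such as "ana@example.com".
INCLUSIVE_AT = re.compile(r"(?<!\w)[^\W\d_]{2,}@s?(?![\w@]|\.\w)")


def find_inclusive_at(text):
    """Return candidate '@' forms (e.g. 'nosotr@s') found in one tweet."""
    return [match.group(0) for match in INCLUSIVE_AT.finditer(text)]


def filter_tweets(tweets):
    """Keep only the tweets that contain at least one candidate form."""
    return [tweet for tweet in tweets if find_inclusive_at(tweet)]


if __name__ == "__main__":
    # Invented sample tweets, for illustration only.
    sample = [
        "Nosotr@s vamos al cine",
        "Hola @maria, escribe a ana@example.com",
        "Bienvenid@ a tod@s.",
    ]
    for tweet in filter_tweets(sample):
        print(tweet, find_inclusive_at(tweet))
```

A pattern like this can only surface candidates. Deciding whether a given usage is a conscious choice or an accident would still call for the kind of manual review the team described.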

Finally, they discussed the challenges they faced with the collection, including coding issues and the difficulty of determining the clarity and intent of tweets and their authors. In addition, the aforementioned use of the ‘@’ symbol as a shift away from gendered pronouns is complicated by the fact that the symbol itself still holds the shape of the letter a, which denotes a feminine form. Other ways tweeters modified pronouns for inclusivity included emojis, hashtags, and other symbols that obscure the inherently gendered a and o in Spanish pronouns.

Bibiana Cortés explains her coding practice using Python on this project.

This tech talk offered incredible insight into the ways that digital tools such as Python and Netlytic.org can be used for language studies, gender studies, and critical considerations around internet and social platform usage. We thank the research team for their thoughtful engagement with digital humanities through the lens of language and gender studies.

This research project is funded by SSHRC.