Kieran co-presented with Niamh on the issue of language bias. Niamh spoke about the various ways in which the field of digital preservation has established English as the default language, citing examples such as programming languages and GitHub. She also showed how some digital preservation tools handle non-English characters poorly, either distorting the spelling or simply failing to work, and highlighted practices within the field that sometimes refer to diacritics, such as fadas, as ‘illegal characters’ that should be ‘sanitised’.
In the second half of the talk, Kieran shared several real-world NLI examples, each with its own issues and solutions for handling fadas. In each instance, he demonstrated that although it might be tempting to just erase diacritics and anglicise the characters, digging a bit deeper reveals ways to preserve these characters while making them work well with digital preservation tools, or even tools like Excel. Diacritics are essential parts of a word, so it was great to hear a discussion of how these kinds of system limitations can be overcome rather than sidestepped by leaving the diacritics out.
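To make the idea concrete, here is a minimal Python sketch of two general techniques that can help fadas survive a trip through a tool like Excel. It is not taken from Kieran's examples: the function names and the sample name are hypothetical, and the approach (Unicode NFC normalisation plus a UTF-8 byte order mark on the CSV) is just one common option, assuming the target tool recognises the BOM.

```python
import csv
import io
import unicodedata


def normalise_nfc(text: str) -> str:
    """Compose base letter + combining accent into one code point (NFC).

    Hypothetical helper: 'a' followed by U+0301 becomes the single
    character 'á', which many tools handle more reliably.
    """
    return unicodedata.normalize("NFC", text)


def strip_diacritics(text: str) -> str:
    """The tempting, lossy shortcut: decompose, then drop the accents."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_bom_csv(rows: list[list[str]]) -> bytes:
    """Serialise rows as CSV bytes with a leading UTF-8 byte order mark.

    Assumption: the spreadsheet tool uses the BOM to detect UTF-8.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([normalise_nfc(cell) for cell in row])
    # "utf-8-sig" prepends the BOM bytes EF BB BF when encoding.
    return buffer.getvalue().encode("utf-8-sig")


if __name__ == "__main__":
    # Hypothetical sample value, written with a combining acute accent.
    name = "Sea\u0301n"
    print(normalise_nfc(name))     # fada preserved
    print(strip_diacritics(name))  # fada lost
    data = to_bom_csv([["name"], [name]])
    print(data[:3])                # the BOM bytes
```

The first function keeps the fada while giving it a single, predictable representation; the second shows what is lost when the characters are simply anglicised.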