Machine (Mis)translation and the Problem of Codifying Non-Anglophone Culture

At present, most natural language processing (NLP) tasks work best in English. The over-representation of English contrasts starkly with the minimal representation of the languages of the global majority within language models. These languages, otherwise known as 'low-resource languages' (LRLs), lack the data needed for NLP tasks to be performed well. Within machine translation (MT) systems, languages and, by extension, cultural identities get lost in translation. Taking the case of machine translation for three Nigerian languages, Hausa, Ìgbò and Yorùbá, I summarize findings from a series of interviews with Nigerian language experts. The findings illustrate indigenous perceptions of MT system usability within Nigerian contexts and the ideal use cases that participants imagine. Participants also discuss the technical failures they observe, situating them within the complex linguistic attributes of their native tongues. In this talk, I highlight the difficulty of holistically representing the complex social, political, and cultural contexts embedded within Nigerian languages. I thereby extend discourses of sociotechnical failure to the level of epistemological failure, which in turn invites a line of inquiry that questions the knowledge frameworks informing how a 'good machine translation' practice is defined in relation to low-resourced indigenous languages.

 

Date: 11.05.
Start: 10:00
End: 13:00
Format: Lecture
Contributor(s)