What do language learning and literary research have to do with artificial intelligence? A workshop at the University of Pittsburgh, organised by Professor Karen Park as part of Oxford’s AHRC-funded research programme in Creative Multilingualism, aimed to find out. It brought together experts in language conservation, teaching and testing with literary scholars and representatives from Duolingo, Wikitongues, Google, Amazon, TrueNorth, and other AI innovators, for a day of lively discussion.
AI brings some immediate practical benefits. In the past, you needed a human being to test how well somebody else could speak a language. Oral exams were cumbersome, expensive, and tied to a specified time and place. But now it’s possible for an online test – developed by Duolingo – to measure not only written but also spoken competence, up to a medium-to-good level of proficiency. This means a student in a developing country wanting to prove their level of English doesn’t have to travel to a city to do it: the test can be taken anywhere with internet access, at any time.
This technology has the potential to help with less commonly learned languages too. In UK schools, many students have some knowledge of languages that are not widely taught (community languages, for example); but it is not always straightforward to turn that knowledge into a qualification, because of the difficulty of finding examiners. Perhaps the architecture developed for the English test could help here too. Certainly the internet and social media are enabling many smaller languages to survive and grow. Communities that have been scattered can now chat and tweet in their languages. Software engineers like Craig Cornelius at Google work to make different scripts available in Unicode so that computers can recognise them (most recently, Cherokee). And the website-cum-activist-group Wikitongues archives videos of endangered languages while also campaigning for their support.
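To give a sense of what Unicode support means in practice: once a script such as Cherokee has been encoded, its characters become ordinary text that any Unicode-aware program can store, search and display. A minimal sketch in Python, using the standard-library `unicodedata` module (the code points shown are from the published Cherokee block, U+13A0–U+13FF):

```python
import unicodedata

# Cherokee was assigned the Unicode block U+13A0-U+13FF,
# with a later supplement (U+AB70-U+ABBF) for lowercase letters.
cherokee_a = "\u13A0"  # the syllabary character Ꭰ

# The character carries its identity with it: any Unicode-aware
# program can look up its official name and script block.
print(unicodedata.name(cherokee_a))  # CHEROKEE LETTER A

# A whole word is just a sequence of such characters -
# here "Tsalagi" (ᏣᎳᎩ), the Cherokee name for the language itself.
word = "\u13E3\u13B3\u13A9"
for ch in word:
    print(ch, unicodedata.name(ch))
```

Nothing here is specific to Cherokee: the same machinery is what lets text in any encoded script be typed, searched and rendered across the web, which is why adding a script to Unicode matters so much for a language community’s online presence.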
This growth in the variety of language used online creates a challenge for translation. Google Translate adds ever more languages to its portfolio (104, the last time I counted), but the challenge isn’t simply one of scale. As I have been exploring in my own strand of the Creative Multilingualism project, ‘Prismatic Translation’, there is also a profound question about how we frame the relationship between language (singular), languages (plural) and identity. This is one case where linguistic and literary scholars may be able to help the computer programmers.
Often, people think that a ‘good’ or ‘correct’ translation has to be done into the standard form of a language, such as ‘fluent English’ or Russian. ‘Correctness’ in the sense of getting the meaning right gets mixed up with ‘correctness’ in the sense of using an approved form of the language. But this idea is dated: it belongs to the era of print. Languages have always been spoken in a huge variety of ways, in different registers and dialects, by different groups, at different times, for different purposes. Really, there has never been such a thing as ‘English’, only ‘Englishes’. Still, when a translation was done for print, in books that were sold in a national market, it made sense for it to be written in the standard form of the language.
But now the online world is revealing the astonishing variety of ways in which people use language. What we might think of as separate (standard) languages blend and merge; innovations and idiosyncrasies catch on and spread. What we are faced with is not a countable number of separate ‘languages’, but the enormous landscape of ‘language’ as an endlessly diverse, ever-changing continuum. Standard languages are strongly regulated areas on this continuum; but groups of all kinds, whether defined by ethnicity, location, interest or age, express their identities through language in distinctive ways.
Colin Cherry, a research scientist at Google Translate, raised the question of what kind of language a translation should be made into. He suggested that Google might, before long, give you a choice: whether you want your translation to be fluent or to ‘sound like a translation’; whether formal or colloquial.
Yet the potential for variety is much wider than this. What if online translation could happen not only into standard Greek but into Greeklish; not only into standard German but into Kiezdeutsch; not only into English but into English as I speak it, my idiom, my style? Would we even want that? Technology creates these questions; but it is culture that will answer them.