Khaya African Language Translation and Speech Recognition AI Demonstrates Major Improvements

What Is The Inspiration Behind The Name Khaya AI?

Fig 1. Khaya AI is named after the Khaya African Mahogany tree. Just like the tree, it is rooted in Africa. We hope it will similarly become a nourishing, sustaining resource for Africa and Africans in the digital future. It is also a word for “home” in several Southern African languages.

What Could The Previous Version of Khaya AI Do?

Fig 2. Nine months ago, version 1.0.3 of the Khaya App was released, providing the world with Twi and Yoruba Automatic Speech Recognition (ASR) capabilities, as well as Ga, Ewe, Twi and Yoruba neural text translators.
Fig 3. Khaya AI improves over time by learning from user feedback. The improvements described in this article were achieved over a period of 9 months, during which tens of thousands of people used the translators.

What Can The New Version of Khaya AI Do?

Dagbani Introduced as Expansion in Northern Ghana Begins

A notable theme in the reviews of the Khaya app on the Android store (where the majority of the app's current users are) is the absence of Northern Ghanaian languages in version 1.0.3. We invested significant effort in Northern language research over the past year and are proud to introduce Dagbani text translation and ASR in version 1.0.4. We worked closely with the Dagbani Wikimedia Group on data and evaluation for the Dagbani technologies. This is only the tip of the proverbial iceberg for our planned Northern Ghanaian language coverage: text translators for languages like Hausa, Gurune (Frafra) and Buli are slated for release shortly, and models for languages such as Dagaare, Mamprusi, Gonja and Kasem are in early research phases.

Text Translator Improvements

Fig 4. Khaya AI can help speakers of different languages to communicate with each other, by translating African languages — such as Twi, Ewe and Ga — into English and vice versa. By learning from user feedback, it improves over time.
Table 1. Measured improvements in text translator performance using the BLEU metric (higher is better). Improvements were confirmed using human evaluators. On our benchmarks, the Yoruba text translators outperform Google Translate by 1.5 BLEU points in the Yoruba to English direction and by 9.8 BLEU points in the English to Yoruba direction. The ∞ improvement shown for Dagbani indicates a previously nonexistent model. Ga improvements are ongoing, so Ga has been left out of this table.
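
For readers unfamiliar with BLEU, here is a minimal sketch of how a sentence-level score can be computed, assuming whitespace tokenization and a single reference translation. It is an illustration only, not the evaluation code behind Table 1: real evaluations usually compute BLEU over a whole test corpus with a standard tool, and the function names and example sentences below are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple of words) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(reference, hypothesis, max_n=4):
    """Simplified, unsmoothed sentence-level BLEU on a 0-100 scale.

    Assumes one reference and whitespace tokenization (illustrative only).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not hyp:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = ngram_counts(hyp, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(hyp_counts.values())
        # Clip each n-gram count by how often it appears in the reference.
        clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        if total == 0 or clipped == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: punish translations shorter than the reference.
    brevity = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return 100 * brevity * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
print(sentence_bleu(reference, "the cat sat on the mat"))  # exact match scores highest
print(sentence_bleu(reference, "the cat sat on a mat"))    # one wrong word lowers the score
```

A higher score means the translation shares more word sequences with the reference, which is why Table 1 reads as "higher is better".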

Automatic Speech Recognition (ASR) Improvements

Fig 5. Khaya AI can help speakers of African languages communicate with their phones and other devices by transcribing speech in African languages, such as Twi, Ewe, Yoruba and Ga, into text. This text can then be used by Alexa to control a smartphone, for instance, or be further processed for a variety of applications.
Table 2. Measured improvements in Automatic Speech Recognition (ASR) performance using the WER metric (lower is better). Improvements were confirmed using human evaluators. ∞ improvement indicates a previously nonexistent model.
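
As a rough illustration of the WER metric, the sketch below computes word error rate as the word-level edit distance (substitutions, insertions and deletions) between a reference transcript and the system's transcript, divided by the number of reference words. It assumes whitespace tokenization; the function name and the example sentences are hypothetical and are not taken from our benchmarks.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is missing from the transcript.
print(word_error_rate("turn on the light", "turn the light"))  # 0.25
```

A perfect transcript has a WER of 0, and every wrong, missing or extra word pushes the value up, which is why Table 2 reads as "lower is better".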

What Comes Next?

1. API Release

We are building an API to empower African developers to build solutions for their communities on top of the AI and ML systems we have created. The API is currently scheduled to open for free trials in early May, within a couple of weeks of this writing.

2. More Languages

Improvements to the Ga text translation system are in the pipeline for release. Also on the road map this year are (1) text translators for Swahili, Hausa, Frafra (Gurune), Buli, Dagaare, Mamprusi and Shona, and (2) ASR for Swahili and Hausa.

3. Language Scaling

When we started out, our models were “one-to-one”, meaning a separate model was trained for each translation direction: Twi to English, English to Twi, English to Ga, and so on. As you can imagine, this approach is difficult to scale in production, because general-purpose translators and speech recognition systems are relatively large and computationally expensive.
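
To make the scaling problem concrete, here is a small sketch that counts how many one-to-one translation models would be needed as languages are added. The function names are hypothetical, and the language list simply reuses the text translators mentioned in this article. Two scenarios are shown: translating only to and from English, and translating directly between every pair of languages.

```python
def english_centric_model_count(languages):
    """One model per direction to and from English: 2 per language."""
    return 2 * len(languages)


def all_pairs_model_count(languages):
    """One model per ordered pair among the languages plus English."""
    n = len(languages) + 1  # add English to the set
    return n * (n - 1)


languages = ["Twi", "Ga", "Ewe", "Yoruba", "Dagbani"]
print(english_centric_model_count(languages))  # 10
print(all_pairs_model_count(languages))        # 30
```

Each of those models has to be trained, stored and served separately, so the cost grows with every language added.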

How To Support Our Work

Fig 6. We are building a more inclusive world with cutting-edge African language tech. Will you help us?
Fig 7. You can support our work by purchasing a cheap ad-free subscription to the app.

Paul Azunre

Paul Azunre holds a PhD in Computer Science from MIT and has served as a Principal Investigator on several DARPA programs. He founded Algorine and Ghana NLP.