Khaya African Language Translation and Speech Recognition AI Demonstrates Major Improvements

What Is The Inspiration Behind The Name Khaya AI?

Fig 1. Khaya AI is named after the Khaya African Mahogany tree. Just like the tree, it is rooted in Africa. We hope it will similarly become a nourishing, sustaining resource for Africa and Africans in the digital future. It is also a word for “home” in several Southern African languages.

What Could The Previous Version of Khaya AI Do?

Fig 2. Nine months ago, version 1.0.3 of the Khaya App was released, providing the world with Twi and Yoruba Automatic Speech Recognition (ASR) capabilities, as well as Ga, Ewe, Twi and Yoruba neural text translators.
Fig 3. Khaya AI improves over time by learning from user feedback. The improvements described in this article were achieved over a period of 9 months, through usage of the translators by tens of thousands of people.

What Can The New Version of Khaya AI Do?

Dagbani Introduced as Expansion in Northern Ghana Begins

Text Translator Improvements

Fig 4. Khaya AI can help speakers of different languages to communicate with each other, by translating African languages — such as Twi, Ewe and Ga — into English and vice versa. By learning from user feedback, it improves over time.
Table 1. Measured improvements in text translator performance using the BLEU metric (higher is better). Improvements were confirmed using human evaluators. On our benchmarks, the Yoruba text translators outperform Google Translate by 1.5 BLEU points in the Yoruba to English direction and by 9.8 BLEU points in the English to Yoruba direction. An ∞ improvement for Dagbani indicates that no model existed previously. Ga improvements are ongoing, so Ga has been left out of this table.
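For readers unfamiliar with BLEU, the following is a minimal sketch of how a score like the ones reported above can be computed. It assumes whitespace tokenization, a single reference translation per sentence, and no smoothing. The function names and the example sentences are illustrative only; they are not the Khaya benchmark data or the evaluation code used for this article.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple of tokens) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale.

    Assumptions (illustrative, not the article's setup): whitespace
    tokenization, one reference per hypothesis, no smoothing.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = 0
    ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hyp_tokens, n)
            ref_counts = ngram_counts(ref_tokens, n)
            # Clip each n-gram's count by how often it appears in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with zero matches gives a score of 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: punish output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    reference = ["the cat sat on a mat"]
    old_output = ["a cat is on mat"]
    new_output = ["the cat sat on the mat"]
    # A gain "in BLEU points" is the difference between two such scores.
    gain = corpus_bleu(new_output, reference) - corpus_bleu(old_output, reference)
    print(f"BLEU gain: {gain:.1f} points")
```

A difference such as "1.5 BLEU points" is a subtraction of two scores on this 0-100 scale. Scores are only comparable when both systems are evaluated on the same test sentences with the same tokenization.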

Automatic Speech Recognition (ASR) Improvements

Fig 5. Khaya AI can help speakers of African languages communicate with their phones and other devices by transcribing speech in African languages, such as Twi, Ewe, Yoruba and Ga, into text. This text can then be used by Alexa to control a smartphone, for instance, or processed further for a variety of applications.
Table 2. Measured improvements in Automatic Speech Recognition (ASR) performance using the WER metric (lower is better). Improvements were confirmed using human evaluators. ∞ improvement indicates a previously nonexistent model.
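As a rough illustration of the WER metric, the sketch below computes the word-level edit distance between a reference transcript and a recognizer's output, then divides by the number of reference words. It assumes whitespace tokenization and no text normalization. The function name and the sample strings are illustrative and are not taken from the Khaya evaluation.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word and one dropped word out of four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, WER can exceed 1.0 when the recognizer outputs many more words than were actually spoken.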

What Comes Next?

1. API Release

2. More Languages

3. Language Scaling

How To Support Our Work

Fig 6. We are building a more inclusive world with cutting-edge African language tech. Will you help us?
Fig 7. You can support our work by purchasing a cheap ad-free subscription to the app.


Paul Azunre holds a PhD in Computer Science from MIT and has served as a Principal Investigator on several DARPA programs. He founded Algorine and Ghana NLP.
