Transformer Language Modeling for Akuapem and Asante Twi

Fig. 1: We named our main model ABENA — A BERT Now in Akan

Introduction

In our previous blog post, we introduced a preliminary Twi embedding model based on fastText and visualized it using the TensorFlow Embedding Projector. As a reminder, text embeddings convert text into numbers, or vectors, on which a computer can perform arithmetic operations, enabling it to reason about human language, i.e., carry out natural language processing (NLP). A screenshot of our fastText Twi embeddings from that exercise is shown in Fig. 2.
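To make the idea of "arithmetic on text" concrete, here is a minimal Python sketch. The three-dimensional vectors are invented purely for illustration; they are not taken from our fastText Twi model, whose real vectors have many more dimensions. The sketch only shows how cosine similarity lets a computer compare words once they are represented as vectors.

```python
import math

# Hypothetical toy embeddings, made up for illustration only.
# Real embedding models learn these values from large text corpora.
toy_embeddings = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.90, 0.15],
    "water": [0.10, 0.20, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Words with similar (toy) vectors score closer to 1.0.
sim_related = cosine_similarity(toy_embeddings["king"], toy_embeddings["queen"])
sim_unrelated = cosine_similarity(toy_embeddings["king"], toy_embeddings["water"])
print(f"king vs queen: {sim_related:.3f}")
print(f"king vs water: {sim_unrelated:.3f}")
```

With these made-up numbers, "king" and "queen" score much higher than "king" and "water", which is the kind of relationship a trained embedding model is meant to capture.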



Introduction

Natural language processing (NLP) is the subfield of Machine Learning and Artificial Intelligence (AI) concerned with teaching computers to read, understand and act on human language. A major component in enabling this is converting text into a meaningful set of numbers that the computer can then analyze and manipulate to extract meaning and context. For the purpose of this article, we will restrict the discussion of NLP to the analysis of text.
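As a rough illustration of "converting text into a set of numbers", the sketch below builds simple word-count (bag-of-words) vectors in plain Python. This is one of the simplest possible schemes and is only an assumption chosen for illustration; the function name `bag_of_words` and the example sentences are hypothetical.

```python
def bag_of_words(sentences):
    """Turn each sentence into a vector of word counts.

    Returns the sorted vocabulary and one count vector per sentence,
    where position i holds the count of vocabulary[i].
    """
    tokenized = [sentence.lower().split() for sentence in sentences]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    vectors = [[tokens.count(word) for word in vocabulary] for tokens in tokenized]
    return vocabulary, vectors


# Hypothetical example sentences.
vocabulary, vectors = bag_of_words(["the cat sat", "the dog sat down"])
print(vocabulary)
for vector in vectors:
    print(vector)
```

Each sentence becomes a list of integers the computer can compare or feed into a model. Schemes like this ignore word order and meaning, which is part of why learned embeddings are usually preferred.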

Natural Language Processing (NLP) is key for human interaction with computers [image source: thinkpalm.com]

Natural language processing can be loosely described as encompassing the tools and methods for analyzing, by computer, the languages humans use for everyday communication, whether spoken or written. …



Last year, I had the privilege of attending the 2019 Neural Information Processing Systems (NeurIPS) conference in Vancouver, Canada. The Machine Learning (ML) community widely recognizes it as its biggest and most influential conference, and most people interested in the future of the Artificial Intelligence (AI) field keep an eye on it. Every year, breakthrough theoretical ideas are announced and received by the community with great anticipation and enthusiasm for debate. Some of them go on to transform our understanding of how these systems work, and how they could work even better. Many great minds attend to present their ideas, and the heavyweight industrial players in the space (Google, Facebook, etc.) prowl the attendee lists for top talent to hire. …

About

Paul Azunre

Paul Azunre holds a PhD in Computer Science from MIT and has served as a Principal Investigator on several DARPA programs. He founded Algorine & Ghana NLP.
