3 Challenges Threaten Africa's Rise of AI in 1000 Languages! Can They Solve It?


Africa is building AI tools in its many languages, but faces hurdles. Data scarcity and ethical questions around data collection threaten progress. Efforts are underway in Nigeria, Kenya, and South Africa to create inclusive AI solutions.

Hausa AI project lacked data

When the Nigerian government announced in April that it would develop a multilingual artificial intelligence (AI) tool to enhance digital inclusion throughout West Africa, 28-year-old computer science student Lwasinam Lenham Dilli was excited. Dilli had faced challenges in finding datasets online to create a large language model (LLM) for his final-year university project, which aimed to power AI chatbots in his native Hausa language.

“I needed texts in English along with their Hausa translations, but I couldn’t find any clean data online,” Dilli told the Thomson Reuters Foundation.

 

African AI needs local languages

“Developing local language LLMs is crucial for ensuring that our dialects and languages are preserved and included in the AI ecosystem,” Dilli remarked.

Globally, AI tools like OpenAI’s ChatGPT, Meta’s Llama 2, and Mistral AI have captured the attention of millions due to their capability to generate human-like text. However, for many tech-savvy Africans, this excitement is dampened by a stark reality: these advanced systems often struggle with African languages such as Hausa, Amharic, or Kinyarwanda, frequently producing nonsensical outputs.

Technology experts warn that the absence of LLMs in these languages could lead to the exclusion of millions across the continent, exacerbating both digital and economic divides.

 

Nigeria builds inclusive AI tool

In April, Nigeria’s Digital Economy Minister Bosun Tijani announced a government-led initiative aimed at developing a multilingual large language model (LLM) to promote inclusivity in AI solutions. The initiative will focus on training the LLM using five low-resource languages and accented English to ensure robust language representation.

Collaboration will occur with Nigerian AI startups, and local data collection will involve volunteers proficient in five Nigerian languages: Yoruba, Hausa, Igbo, Ibibio, and Pidgin, a widely spoken West African lingua franca. The project will leverage the expertise of over 7,000 participants from Nigeria’s tech talent program, which aims to train three million individuals in coding and programming skills.

 

Complex Nigerian AI needs hacks

Silas Adekunle, co-founder of Awarri, an AI startup involved in the initiative, highlighted the complexities of creating an AI tool that comprehends Nigeria’s diverse linguistic and cultural terrain.

“With Nigeria’s multitude of accents and languages, developing this large language model (LLM) presents numerous challenges,” Adekunle stated. “However, the LLM will empower many individuals and developers to create AI-driven products tailored for the Nigerian market.”

Adekunle emphasized the project’s ambitious scope and the need for innovative approaches due to resource constraints. “We’ve had to be inventive in how we train the model, gather data, compute, and annotate what we have,” he added.

 

African AI rising in local tongues

Africa hosts over 2,000 languages across its 54 countries, as reported by UNESCO. Despite this linguistic diversity, most African languages are underrepresented online, with English dominating approximately half of all websites, followed by Spanish, German, Japanese, and French.

In addition to the Nigerian government’s initiative, a small but growing number of African startups are addressing the challenge of developing AI tools in languages like Swahili, Amharic, Zulu, and Sesotho. For example, in Kenya, Jacaranda Health, a health tech firm, has introduced the first large language model (LLM) operating in Swahili. Named UlizaLlama (AskLlama) and built on Meta’s Llama 3 system, this initiative aims to enhance maternal healthcare in East Africa.

Currently, the platform provides automated responses to queries from Swahili-speaking expectant mothers on topics such as diet, fetal movement, and exercise during pregnancy. By the end of June, UlizaLlama plans to integrate capabilities that personalize responses based on individual needs, offering more comprehensive pregnancy guidance and emergency support.

 

African AI aids mothers, translates

Jay Patel, director of technology at Jacaranda Health, emphasized UlizaLlama’s mission to provide accurate answers swiftly to expectant mothers who may not have easy access to information through conventional means like Google. The platform is initially targeting an 85% accuracy rate and aims to reduce response times from a few minutes to less than a minute in the future.

Meanwhile, in South Africa, the Masakhane initiative is leveraging open-source machine learning to facilitate the translation of African languages. Lelapa AI, a South African AI research lab, has introduced VulaVula, a commercial language processing tool capable of translating, transcribing, and analyzing languages such as English, Afrikaans, Zulu, and Sesotho.

 

African AI faces data, ethics woes

Building large language models (LLMs) in African languages presents significant challenges, primarily due to data scarcity and ethical concerns surrounding consent, compensation, and copyright.

Many African languages are considered low-resource languages, lacking sufficient data for effective model training compared to high-resource languages like English or French. Michael Michie, co-founder of Everse Technology Africa, highlighted ethical dilemmas in data collection for LLMs. Communities where oral tradition is prevalent may be reluctant to share language data, and he said their wishes must be respected.

Currently, African countries lack specific regulations addressing consent, privacy, and fair compensation for communities contributing data to train AI tools. Michie emphasized the need for guidelines to prevent exploitation and ensure that the development of LLMs benefits the communities they are intended to serve.

Open licensing schemes like Creative Commons, which enable sharing under specified conditions such as attribution and non-commercial use, are not a straightforward solution, according to AI experts. Vukosi Marivate, associate professor at the University of Pretoria and co-founder of Lelapa AI, cautioned against a blanket application of Creative Commons for language models. He noted concerns that without proper frameworks, contributors to these models may not receive adequate recognition or compensation.

Marivate stressed the importance of safeguarding African languages amidst growing interest and investment in LLMs, ensuring that the development efforts prioritize the well-being and rights of language communities.

 
