Why AI Needs to Speak More Than English
African languages face neglect in AI safety tests. TUKABENCH aims to change that, revealing the cultural bias inherent in current models.
For all the buzz about AI's global reach, there's a glaring oversight that's hard to ignore. When it comes to testing the safety of Large Language Models (LLMs), English still dominates the conversation. But what about African languages or other low-resource ones? Enter TUKABENCH, a new benchmark aiming to shake up the status quo.
The Language Bias
Let's get real. Most AI models are English-centric, leaving African languages in the dust. TUKABENCH, focusing on seven neglected African languages, is trying to change the narrative. It builds on JailbreakBench by using human translation, culturally apt prompts, and even code-switching between English and African languages. The goal? To see how these models respond in languages other than English.
Here's the kicker: prompting in African languages results in fewer refusals, which suggests that safeguards tuned on English don't fully carry over. Culturally adapted prompts are even more effective at getting past those refusals. The challenge here isn't just about language, it's about cultural understanding. When the AI understands cultural context, it performs better. So, why aren't more models built this way?
Model Failures and Human Oversight
But wait, there are two big hiccups. First, these models often trip over comprehension when faced with lesser-known languages. Second, the reliability of these models as 'judges' drops significantly. TUKABENCH introduces a new term called 'Deflection' to capture these failures, alongside the more typical 'Refused' and 'Jailbroken'. And when compared with human judgments, AI seems to falter more as languages get less common.
This exposes a troubling truth. If AI can't reliably understand or judge in these languages, how can we trust its decisions? The gap between the promise of global reach and actual multilingual performance is enormous.
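To make the evaluation idea concrete, here is a minimal sketch of how the three outcome labels ('Refused', 'Jailbroken', 'Deflection') and the comparison between AI judges and human judgments could be tallied per language. Everything in it is an assumption made for illustration: the record layout, the function name summarize, the placeholder language lang_a, and the toy records themselves, which are invented and are not TUKABENCH results.

```python
from collections import Counter, defaultdict

# The three outcome labels named in the article.
LABELS = ("Refused", "Jailbroken", "Deflection")

# Toy records invented for illustration only (NOT benchmark results).
# Hypothetical layout: (language, human_label, ai_judge_label)
RECORDS = [
    ("english", "Refused", "Refused"),
    ("english", "Refused", "Refused"),
    ("english", "Jailbroken", "Jailbroken"),
    ("english", "Refused", "Refused"),
    ("lang_a", "Jailbroken", "Jailbroken"),
    ("lang_a", "Deflection", "Refused"),
    ("lang_a", "Refused", "Refused"),
    ("lang_a", "Deflection", "Jailbroken"),
]


def summarize(records):
    """Return per-language label rates and judge-vs-human agreement."""
    by_lang = defaultdict(list)
    for lang, human, judge in records:
        if human not in LABELS or judge not in LABELS:
            raise ValueError(f"unknown label in record: {(lang, human, judge)}")
        by_lang[lang].append((human, judge))

    summary = {}
    for lang, pairs in by_lang.items():
        n = len(pairs)
        # Rates are computed from the human labels, treated here as ground truth.
        human_counts = Counter(human for human, _ in pairs)
        # Agreement is the share of responses where the AI judge matched the human.
        agree = sum(1 for human, judge in pairs if human == judge)
        summary[lang] = {
            "n": n,
            "refusal_rate": human_counts["Refused"] / n,
            "deflection_rate": human_counts["Deflection"] / n,
            "judge_agreement": agree / n,
        }
    return summary


if __name__ == "__main__":
    for lang, stats in summarize(RECORDS).items():
        print(lang, stats)
```

Plain agreement is the simplest possible comparison. A real study would likely prefer a chance-corrected statistic and far more samples per language, but even this rough tally shows the two things worth watching: how often a model refuses in each language, and how often the automated judge matches a human reader.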
Why This Matters
So, what's the takeaway? AI needs to be more inclusive and culturally aware if it's going to be truly global. The marketing promises AI for everyone. The benchmark says otherwise. If models can't perform consistently across languages, they can't claim to be truly innovative. And let's face it, language inclusivity isn't just a 'nice-to-have'. It's essential for ethical AI deployment.
In a world where AI's influence is only growing, ignoring language diversity is a recipe for disaster. After all, how can we expect AI to serve a global audience if it doesn't even speak their language?