AI can help bridge Southeast Asia’s 1,000 languages, but the work ‘has to be done by Southeast Asians’
The full extent of Southeast Asia’s cultural diversity risks being ignored by AI models built on English and Mandarin Chinese.
With over 1,000 languages, Southeast Asia is one of the most linguistically diverse regions in the world, and that’s a challenge for businesses trying to operate with talent and customers right next door.
“The language barrier can be a huge issue,” Kisson Lin, co-founder and chief operating officer of Singaporean AI startup Mindverse AI, said at Fortune’s Brainstorm AI Singapore conference on Tuesday. “We have different colleagues from different regions speaking different languages. It’s not only that you [find] it hard to collaborate, but also hard to bond with each other.”
But can AI bridge the linguistic divide, without eradicating the cultural nuances within a diverse population of 600 million?
Answering this question could unlock new markets for global businesses. Lin pointed out that Alibaba’s sales revenue spiked once it started using AI to translate product information.
AI might even help India’s prolific, multilingual entertainment industry “propagate to the whole world,” says Sambit Sahu, senior vice president of silicon design for Ola Krutrim, an Indian AI startup.
Yet Leslie Teo, lead of the Southeast Asian Languages in One Network (Sea-Lion) project, said that hundreds of Southeast Asian languages present a unique challenge to developers. “AI models are built on data…and the region is not well represented in the digital space.” That means the richness of the area’s food, history and culture—particularly from smaller language groups like Khmer and Lao—risks being left out.
The benchmarks for judging AI’s performance are also largely driven by English and Mandarin Chinese, which can miss the nuances of even widely spoken tongues like Cantonese, said Caroline Yap, managing director of global AI business and applied engineering at Google Cloud. That’s why it’s important to “keep humans in the loop,” she said.
It’s important to share models widely and allow universities, developers and enterprises to test and find problems, Sahu suggested.
But Teo argued the only way for AI to accurately represent Southeast Asia’s distinctive character and complexity, without alienating the communities that provide the data, is to put locals in charge of the process.
“It has to be done by Southeast Asians,” he said.