Alibaba Launches AI Language Model for Southeast Asian Languages

Alibaba’s research unit, Damo Academy, has launched an artificial intelligence (AI) large language model tailored to Southeast Asian languages. The Southeast Asia LLM (SeaLLM) has been pre-trained on languages including Vietnamese, Indonesian, Thai, Malay, Khmer, Lao, Tagalog, and Burmese. The model is reported to outperform other open-source models in the same field on linguistic and safety tasks.

For Alibaba, the launch is a significant step toward expanding its business across Southeast Asia, which the company regards as a crucial growth market. For example, Lazada, Alibaba’s e-commerce platform in the region, has set a target of US$100 billion in turnover by 2030 and aims to serve 300 million consumers.

What sets SeaLLM apart from other models is its regional focus: it is the first LLM developed by Alibaba that concentrates on Southeast Asian languages. The launch underscores Chinese companies’ continuing efforts to capitalize on the generative AI wave set off by OpenAI’s ChatGPT last year. Chinese companies and research institutes had collectively released 130 LLMs as of July this year, sparking what has been dubbed the “tussle of a hundred large models” in the country.

Alibaba’s previous LLM, Tongyi Qianwen, launched in April, ranks fourth among the models tracked by the open-source AI platform Hugging Face, which assesses models on criteria including scientific knowledge, multitasking accuracy, and common-sense reasoning. The ranking indicates that Alibaba’s LLMs are competitive in the global AI landscape.

According to Damo, SeaLLM performs better than other models on non-Latin language tasks and can interpret and process text up to nine times longer. It also performs well at translating between English and low-resource languages such as Lao and Khmer, for which little training data is available for conversational AI systems. This could improve communication and engagement between businesses using the LLM and Southeast Asian markets.

Bing Lidong, director of the language technology lab at Damo, emphasized that SeaLLM embraces the cultural richness of Southeast Asia. Moreover, he believes that this innovation has the potential to empower communities that have historically been under-represented in the digital realm.

Despite the rapid development and deployment of numerous LLMs, analysts have pointed out that the Chinese AI market still faces challenges, including US chip restrictions and the need for killer apps that can attract more users in a fast-changing market. However, the launch of SeaLLM and the continuing efforts of companies like Alibaba suggest that the Chinese AI market is positioned to work through these obstacles and keep advancing in AI language models.

In conclusion, the launch of SeaLLM is a milestone for Alibaba’s work on AI language models. As the company expands its reach in Southeast Asia, this tailored LLM is intended to help break down language barriers and improve communication within the region. By serving historically under-represented communities, Alibaba is both advancing its position in the AI race and promoting inclusivity and diversity in the digital realm.


Written By

Jiri Bílek