This article introduces Typhoon, a series of Thai large language models (LLMs) developed specifically for the Thai language. Because Thai is a low-resource language, the authors apply continual training to transfer existing world knowledge from a strong LLM. They evaluate the Thai knowledge each pretrained model has acquired using ThaiExam, a benchmark built from examinations for high-school students and investment professionals in Thailand. They then fine-tune Typhoon to follow Thai instructions and evaluate the instruction-tuned models on Thai instruction-following datasets as well as on translation, summarization, and question-answering tasks. Typhoon outperforms all open-source Thai language models and performs on par with GPT-3.5 on Thai tasks, despite having only 7 billion parameters, while being 2.62 times more efficient at tokenizing Thai text.
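
The tokenizer-efficiency figure is straightforward to sanity-check. The sketch below is a rough illustration, not the paper's measurement protocol: it assumes the `transformers` and `tiktoken` packages are installed, that the Typhoon tokenizer is downloadable from the project page linked below, and that GPT-3.5's `cl100k_base` encoding is the baseline the comparison refers to. It simply counts how many tokens each tokenizer needs for the same Thai sentence.

```python
# Rough sketch: compare token counts for Thai text between the Typhoon
# tokenizer and GPT-3.5's cl100k_base encoding (assumed baseline).
# The sample sentence means "Thailand has a total of 77 provinces."
from transformers import AutoTokenizer
import tiktoken

thai_text = "ประเทศไทยมีจังหวัดทั้งหมด 77 จังหวัด"

# Typhoon's tokenizer, extended for Thai according to the paper
typhoon_tok = AutoTokenizer.from_pretrained("scb10x/typhoon-7b")
typhoon_ids = typhoon_tok.encode(thai_text, add_special_tokens=False)

# GPT-3.5's tokenizer
gpt35_enc = tiktoken.get_encoding("cl100k_base")
gpt35_ids = gpt35_enc.encode(thai_text)

print(f"Typhoon tokens: {len(typhoon_ids)}")
print(f"GPT-3.5 tokens: {len(gpt35_ids)}")
print(f"Ratio: {len(gpt35_ids) / len(typhoon_ids):.2f}x")
```

Fewer tokens per sentence means more Thai text fits in the same context window and each generation step covers more text, which is where the efficiency claim matters in practice.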

Publication date: 21 Dec 2023
Project Page: https://huggingface.co/scb10x/typhoon-7b
Paper: https://arxiv.org/pdf/2312.13951
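
For readers who want to try the released checkpoint, a minimal quick-start sketch follows. This is our assumption of typical `transformers` usage, not the official recipe; the model card on the project page above is authoritative. It loads the base (non-instruction-tuned) model and completes a Thai prompt; `accelerate` is required for `device_map="auto"`.

```python
# Minimal sketch: load the base Typhoon checkpoint and complete a Thai prompt.
# Requires `torch`, `transformers`, and `accelerate`; settings are assumptions,
# not the official recommendations from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "scb10x/typhoon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# "The capital of Thailand is" -- a completion-style prompt, since the
# base model is not instruction-tuned
prompt = "เมืองหลวงของประเทศไทยคือ"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```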