TinyLlama is a compact 1.1B-parameter language model pretrained on around 1 trillion tokens for roughly 3 epochs. Despite its modest size, it delivers strong performance across a range of downstream tasks and outperforms existing open-source language models of comparable size. By building on advances from the open-source community, such as FlashAttention, it achieves high computational efficiency during training. The model checkpoints and training code are publicly available. The work highlights the potential of training smaller models on larger datasets, a relatively under-explored direction.

Publication date: 4 Jan 2024
Project Page: https://github.com/jzhang38/TinyLlama
Paper: https://arxiv.org/pdf/2401.02385
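
Since the checkpoints are released openly, here is a minimal usage sketch, assuming the weights are published on the Hugging Face Hub under an ID like TinyLlama/TinyLlama-1.1B-Chat-v1.0 (the exact checkpoint names are listed on the project page) and that the torch and transformers packages are installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint name; see the project page for the exact published IDs.
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 1.1B parameters fit comfortably in bf16
).to(device)

prompt = "Explain why small language models are useful."
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

At 1.1B parameters in bfloat16, the weights take roughly 2 GB of memory, so the model is practical to run on consumer GPUs or even CPUs.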