TinyLlama is a compact 1.1B-parameter language model pretrained on around 1 trillion tokens for roughly 3 epochs. Despite its modest size, it delivers strong performance across a range of downstream tasks and outperforms existing open-source language models of comparable size. By building on advances from the open-source community, such as FlashAttention, it achieves high computational efficiency during training. The model checkpoints and training code are publicly available. The work highlights the potential of training smaller models on larger datasets, a relatively under-explored direction.

Publication date: 4 Jan 2024
Project Page: https://github.com/jzhang38/TinyLlama
Paper: https://arxiv.org/pdf/2401.02385
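
Since the checkpoints are released openly, here is a minimal usage sketch, assuming the weights are published on the Hugging Face Hub under an ID like TinyLlama/TinyLlama-1.1B-Chat-v1.0 (the exact checkpoint names are listed on the project page) and that the torch and transformers packages are installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint name; see the project page for the exact published IDs.
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 1.1B parameters fit comfortably in bf16
).to(device)

prompt = "Explain why small language models are useful."
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

At 1.1B parameters in bfloat16, the weights take roughly 2 GB of memory, so the model is practical to run on consumer GPUs or even CPUs.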