The article presents LongLoRA, an efficient fine-tuning method that extends the context sizes of pre-trained large language models (LLMs) at limited computational cost. Training LLMs with long context sizes is typically expensive, since the cost of self-attention grows quadratically with sequence length. LongLoRA addresses this by combining shifted sparse attention, a sparse local attention used only during fine-tuning while the model keeps its original dense attention at inference, with parameter-efficient LoRA fine-tuning for context extension. It shows strong results on a range of tasks and is compatible with most existing techniques, such as FlashAttention-2. The authors have also released LongQA, a dataset for supervised fine-tuning.
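
As a rough illustration of the core idea (a minimal sketch, not the authors' implementation; the function name, tensor shapes, and the use of PyTorch's scaled_dot_product_attention are assumptions here), the snippet below shows shifted sparse attention: tokens attend only within fixed-size groups, and half of the attention heads operate on a sequence shifted by half a group size so that information can flow between neighboring groups.

```python
import torch
import torch.nn.functional as F

def shifted_sparse_attention(q, k, v, group_size):
    # q, k, v: (batch, heads, seq_len, head_dim); seq_len divisible by group_size
    bsz, n_heads, seq_len, head_dim = q.shape
    half, shift = n_heads // 2, group_size // 2
    n_groups = seq_len // group_size

    def roll_half(x, s):
        # Shift the second half of the heads along the sequence dimension
        x = x.clone()
        x[:, half:] = x[:, half:].roll(s, dims=2)
        return x

    def to_groups(x):
        # Fold each token group into the batch dimension so attention
        # is computed only within groups (local attention)
        return (x.reshape(bsz, n_heads, n_groups, group_size, head_dim)
                 .transpose(1, 2)
                 .reshape(bsz * n_groups, n_heads, group_size, head_dim))

    q, k, v = (roll_half(t, -shift) for t in (q, k, v))
    out = F.scaled_dot_product_attention(to_groups(q), to_groups(k), to_groups(v))

    # Restore the original layout and undo the half-group shift
    out = (out.reshape(bsz, n_groups, n_heads, group_size, head_dim)
              .transpose(1, 2)
              .reshape(bsz, n_heads, seq_len, head_dim))
    return roll_half(out, shift)

# Example: 4 groups of 256 tokens over a 1024-token sequence
q = k = v = torch.randn(1, 8, 1024, 64)
out = shifted_sparse_attention(q, k, v, group_size=256)
```

Because attention is restricted to groups, the quadratic cost applies only to the group size rather than the full sequence, which is what makes long-context fine-tuning affordable in this scheme.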

Publication date: 21 Sep 2023
Project Page: github.com/dvlab-research/LongLoRA
Paper: https://arxiv.org/pdf/2309.12307