The study introduces Parameter-Efficient Sparsity Crafting (PESC), a new approach to improving the performance of Large Language Models (LLMs) during instruction tuning. Instruction tuning enhances an LLM's ability to follow natural language instructions, but performance is often limited by constrained model capacity. PESC addresses this by transitioning dense models into sparse models with a Mixture-of-Experts (MoE) architecture: lightweight adapters are inserted into the MoE layers, so experts are differentiated through small adapter updates rather than by modifying each expert's full weights. This expands model capacity while keeping computational costs and GPU memory requirements low. Empirical evaluation shows that PESC-tuned sparse models outperform comparable models on general capabilities.
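To make the idea concrete, below is a minimal sketch (not the authors' implementation) of how a dense feed-forward block might be converted into a sparsely routed MoE layer whose experts share the original dense weights and differ only in small trainable adapters. All names (`AdapterExpertFFN`, `SparseMoELayer`), the bottleneck size, and the top-k routing details are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdapterExpertFFN(nn.Module):
    """One expert: the shared dense FFN plus a small expert-specific adapter."""
    def __init__(self, shared_ffn: nn.Module, d_model: int, bottleneck: int = 64):
        super().__init__()
        self.shared_ffn = shared_ffn                  # weights copied from the dense model, shared across experts
        self.down = nn.Linear(d_model, bottleneck)    # trainable adapter (unique per expert)
        self.up = nn.Linear(bottleneck, d_model)

    def forward(self, x):
        h = self.shared_ffn(x)
        return h + self.up(F.relu(self.down(h)))      # residual adapter on top of the shared FFN output


class SparseMoELayer(nn.Module):
    """Replaces a dense FFN with a top-k routed mixture of adapter-based experts."""
    def __init__(self, dense_ffn: nn.Module, d_model: int, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            [AdapterExpertFFN(dense_ffn, d_model) for _ in range(num_experts)]
        )
        self.top_k = top_k

    def forward(self, x):                              # x: (tokens, d_model)
        logits = self.router(x)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):                 # send each token to its selected experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


# Hypothetical usage: wrap an existing dense FFN and route 10 token embeddings.
dense_ffn = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512))
moe = SparseMoELayer(dense_ffn, d_model=512, num_experts=4, top_k=2)
y = moe(torch.randn(10, 512))                          # -> shape (10, 512)
```

Because the experts share the dense FFN weights and only the adapters and router are newly trained, the trainable-parameter count grows slowly with the number of experts, which is the parameter-efficiency the summary refers to.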


Publication date: 8 Jan 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2401.02731