This study revisits the pruning of large language models (LLMs), particularly those from the BERT family, in response to the Sparsity May Cry (SMC) benchmark. The benchmark highlights how challenging pruning can be in practice: many established methods fail on it. The authors propose a set of strategies for successful pruning, including a cost-benefit analysis of which model components to prune, a method for scaling training time and learning rate schedules, and proper parameterization of Knowledge Distillation for LLMs. These insights yield state-of-the-art results on both classic BERT-pruning benchmarks and the SMC benchmark.
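To make the Knowledge Distillation point concrete, below is a minimal sketch of a standard distillation loss and its key hyperparameters (temperature and the soft/hard mixing weight). This is the generic formulation from Hinton et al. (2015), not the paper's exact recipe; the default values of `temperature` and `alpha` here are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Weighted mix of soft teacher targets and hard ground-truth labels.

    `temperature` and `alpha` are the generic KD hyperparameters whose
    proper tuning the paper emphasizes; the defaults here are placeholders,
    not the values used in the paper.
    """
    # Soften both distributions; scale by T^2 so gradient magnitudes stay
    # comparable across temperatures (standard KD convention).
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.log_softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
        log_target=True,
    ) * temperature ** 2

    # Standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)

    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```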

Publication date: 21 Dec 2023
Project Page: https://arxiv.org/abs/2312.13547v1
Paper: https://arxiv.org/pdf/2312.13547