L4Q: Parameter Efficient Quantization-Aware Training on Large Language Models via LoRA-wise LSQ
This article introduces L4Q, a novel algorithm for parameter-efficient quantization-aware training on Large Language Models (LLMs). L4Q aims to improve the generality of these models using low-rank adaptation (LoRA)…
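To make the combination concrete: LSQ (Learned Step Size Quantization) rounds each weight to a signed integer grid whose step size is a learnable parameter, and a LoRA-aware scheme quantizes the weight after merging in the low-rank update. The sketch below is a minimal pure-Python illustration of that forward pass only, not the paper's implementation; the shapes, initialization, and step-size heuristic are all assumptions for demonstration.

```python
import random

def lsq_quantize(w, step, n_bits=4):
    """LSQ-style forward pass for one weight: scale by the learnable
    step size, round, clamp to the signed integer range, rescale."""
    qn, qp = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    return max(qn, min(qp, round(w / step))) * step

# Illustrative 1-D "layer": frozen weights plus a rank-1 LoRA update.
# All names and numbers here are assumptions, not the paper's setup.
random.seed(0)
d = 8
W = [random.uniform(-0.5, 0.5) for _ in range(d)]    # frozen pretrained weights
lora_a = [random.uniform(-0.05, 0.05) for _ in range(d)]
lora_b = 0.0                                         # zero-init: no update yet

# Merge the LoRA update into the weight, then quantize the merged value --
# the order a LoRA-aware QAT scheme needs for a fused quantized deployment.
w_eff = [w + lora_b * a for w, a in zip(W, lora_a)]
step = 2 * max(abs(w) for w in w_eff) / 2 ** 4       # simple max-based init
W_q = [lsq_quantize(w, step) for w in w_eff]
```

In training, the step size (and the LoRA factors) would receive gradients through a straight-through estimator; only the frozen base weights stay untouched, which is what makes the approach parameter-efficient.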