The paper discusses the use of Microscaling (MX) data formats in deep learning. MX formats pair a shared per-block scaling factor with narrow floating-point or integer types for the individual elements in the block. The study demonstrates that MX formats are a practical drop-in replacement for baseline FP32 in both AI inference and training. Notably, the paper reports the first instance of training generative language models with sub-8-bit weights, activations, and gradients with minimal accuracy loss. The researchers conclude that MX formats strike an effective balance among hardware efficiency, model accuracy, and user friction.
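To make the block-scaling idea concrete, here is a minimal sketch of MX-style quantization: each block of 32 values shares one power-of-two scale, and each element is stored in a narrow type (int8 is used here purely for illustration; the function names, block size handling, and clipping range are this sketch's assumptions, not the paper's reference implementation).

```python
import numpy as np

BLOCK = 32  # block size k; 32 is the block size used in the MX formats

def mx_quantize(x):
    """Blockwise quantization sketch: one shared power-of-two scale per
    block of 32 elements, with int8 element values (illustrative)."""
    x = np.asarray(x, dtype=np.float32)
    pad = (-len(x)) % BLOCK
    xp = np.pad(x, (0, pad)).reshape(-1, BLOCK)
    amax = np.max(np.abs(xp), axis=1, keepdims=True)
    # Choose the smallest power-of-two scale that maps the block's
    # largest magnitude into the int8 range [-127, 127].
    safe = np.where(amax > 0, amax, 1.0)
    scale = 2.0 ** np.ceil(np.log2(safe / 127.0))
    q = np.clip(np.round(xp / scale), -127, 127).astype(np.int8)
    return q, scale.astype(np.float32), len(x)

def mx_dequantize(q, scale, n):
    """Recover approximate float values from quantized blocks."""
    return (q.astype(np.float32) * scale).reshape(-1)[:n]
```

Because the scale is a power of two shared across the block, multiplying by it is cheap in hardware, and the per-element rounding error is bounded by half the block's scale.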


Publication date: 16 Oct 2023
arXiv page: https://arxiv.org/abs/2310.10537v1
Paper: https://arxiv.org/pdf/2310.10537