The article introduces a new approach to improving the arithmetic abilities of large language models (LLMs). The authors propose training the LLM to generate a postfix expression for the arithmetic problem (operators written after their operands, e.g. 3 4 + for 3 + 4) and combining it with small pretrained models. These smaller models convert the token embeddings into dense real numbers and then use the native functions of a deep learning platform to compute the correct answer. The final result is generated by adding the small models' output back into the LLM, a method the authors refer to as ‘prompt injection’. The authors believe this method offers a different perspective on training and using language models.
Publication date: 30 Nov 2023
Project Page: https://github.com/eteced/arithmetic_finetuning_v1
Paper: https://arxiv.org/pdf/2311.18609
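
To make the pipeline in the summary above more concrete, here is a minimal sketch of its two mechanical steps: evaluating a postfix expression and appending the computed value to the prompt. It is an illustration only, not the authors' implementation. The paper's small models operate on token embeddings and call a deep learning platform's native functions, whereas this sketch parses plain text tokens with `float()`. The names `eval_postfix` and `inject_result`, the operator table, and the bracketed injection format are all hypothetical.

```python
import operator

# Hypothetical operator table; the operator set used in the paper may differ.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_postfix(expr: str) -> float:
    """Evaluate a space-separated postfix expression with a stack."""
    stack = []
    for tok in expr.split():
        if tok in OPS:
            if len(stack) < 2:
                raise ValueError(f"operator {tok!r} is missing operands")
            right = stack.pop()  # the most recent operand is the right-hand side
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            # Stand-in for the small model that maps token embeddings to numbers.
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


def inject_result(prompt: str, postfix: str) -> str:
    """Append the computed value to the prompt (assumed injection format)."""
    value = eval_postfix(postfix)
    return f"{prompt}\n[computed: {postfix} = {value:g}]"


if __name__ == "__main__":
    print(inject_result("What is (3 + 4) * 2?", "3 4 + 2 *"))
```

Postfix notation suits this kind of hand-off because it needs no parentheses or precedence rules: a single left-to-right pass with a stack is enough to compute the value.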