The article introduces a new approach to enhancing the arithmetic abilities of large language models (LLMs). The authors propose training an LLM to generate a postfix expression for the arithmetic problem and pairing it with small pretrained models. These smaller models convert the token embeddings into dense real numbers and use native operations of a deep learning framework to compute the exact answer. The final result is produced by injecting the small models' output back into the LLM, a method the authors call 'prompt injection'. They argue this approach offers a different perspective on training and using language models.
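To make the pipeline's deterministic stage concrete, here is a minimal sketch of evaluating a postfix (reverse Polish) expression of the kind the LLM would be trained to emit. This is an illustration only: the function and token names are hypothetical and not taken from the authors' code, which operates on token embeddings rather than string tokens.

```python
def eval_postfix(tokens):
    """Evaluate a postfix expression given as a list of string tokens."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for tok in tokens:
        if tok in ops:
            b = stack.pop()  # right operand sits on top of the stack
            a = stack.pop()
            stack.append(ops[tok](a, b))
        else:
            # Numeric token -> dense real number, evaluated exactly
            # rather than predicted token-by-token by the LLM.
            stack.append(float(tok))
    return stack[0]

# "12 + 3 * 4" in postfix form: 12 3 4 * +
print(eval_postfix(["12", "3", "4", "*", "+"]))  # -> 24.0
```

In the paper's setup, the result of this kind of exact evaluation is then fed back to the LLM via prompt injection to form the final answer.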
Publication date: 30 Nov 2023
Project Page: https://github.com/eteced/arithmetic_finetuning_v1
Paper: https://arxiv.org/pdf/2311.18609