This paper presents a method for improving the mathematical accuracy of Large Language Models (LLMs) such as GPT-4 when they are used to build Intelligent Tutoring Systems (ITS). The approach generates synthetic student-teacher dialogues with GPT-4: after each student response, an internal monologue, termed a ‘code soliloquy’, is triggered to assess whether a calculation is required. If so, the model scripts the necessary Python code, executes it, and uses the output to construct its reply. This notably improves the quality of synthetic conversation datasets, especially for calculation-intensive subjects. The findings show that the model reliably delegates computations to Python, improving both the accuracy and the computational dependability of its responses.
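The turn structure described above can be sketched as a small loop: check whether the student's message needs a calculation, have the model write Python for it, execute that code, and fold the result into the reply. This is a minimal illustration, not the paper's implementation; the functions `llm_decide` and `llm_write_code` are hypothetical stand-ins for the GPT-4 prompts the paper uses, stubbed here with trivial logic so the sketch is runnable.

```python
def llm_decide(student_message: str) -> bool:
    """Stub for the soliloquy's first step: does this turn need calculation?
    A real system would ask GPT-4; here a trivial heuristic stands in."""
    return any(ch.isdigit() for ch in student_message)


def llm_write_code(student_message: str) -> str:
    """Stub for the step where the model scripts Python for the problem.
    A real system would have GPT-4 generate this from the dialogue context."""
    return "result = 120 / 2  # distance / time -> speed in km/h"


def tutor_turn(student_message: str) -> str:
    """One tutor turn: run the code soliloquy, execute the generated code,
    and use the computed result to construct the response."""
    if llm_decide(student_message):
        scope: dict = {}
        exec(llm_write_code(student_message), scope)  # sandboxing omitted
        return f"Let's check: the answer works out to {scope['result']:g}."
    return "Good question - let's reason through it together."


print(tutor_turn("A car travels 120 km in 2 hours. What is its speed?"))
```

Executing the model-written code rather than letting the model do arithmetic in-text is what makes the numeric part of the reply verifiable.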
Publication date: 21 Sep 2023
Project Page: https://arxiv.org/abs/2309.12161
Paper: https://arxiv.org/pdf/2309.12161