Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing
The study examines how well Large Language Models (LLMs) handle complex reasoning tasks. It discusses reliability and faithfulness problems in the rationales generated by these…