A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains
The article presents REVEAL, a new dataset for benchmarking automatic verifiers of complex reasoning chains produced by language models. Such a benchmark is useful for reasoning tasks that require step-by-step answers, known…