The study addresses the problem of inconsistent output quality when generating natural language from logical forms with large language models. To tackle this, the authors propose a generate-and-rerank approach: a set of candidate outputs is first generated by prompting a large language model, and these candidates are then reranked with a task-specific reranker model. The reranker's effectiveness is evaluated on a manually collected dataset, and the metrics used for ranking align well with human judgements. Extensive experiments show that the reranker significantly improves the semantic consistency and fluency of the generated outputs compared to baseline methods.
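To make the pipeline concrete, below is a minimal Python sketch of the generate-then-rerank loop: sample several candidate verbalizations from an LLM, score each candidate, and keep the top-scoring one. All names here (generate_candidates, toy_reranker_score, generate_and_rerank) and the overlap-based scoring are illustrative assumptions, not the paper's implementation; the paper trains a task-specific reranker, for which the toy lexical score below merely stands in.

```python
from typing import Callable, List


def generate_candidates(logical_form: str, n: int = 5) -> List[str]:
    """Placeholder for sampling n candidate verbalizations from an LLM.

    In practice this would prompt a large language model (e.g., via an
    API call with few-shot exemplars) and sample n diverse outputs.
    Stubbed here so the sketch runs end to end.
    """
    return [f"candidate verbalization {i} of {logical_form}" for i in range(n)]


def toy_reranker_score(logical_form: str, candidate: str) -> float:
    """Stand-in for the trained reranker: rewards lexical overlap with the
    logical form as a crude proxy for semantic consistency."""
    lf_tokens = set(logical_form.lower().replace("(", " ").replace(")", " ").split())
    cand_tokens = set(candidate.lower().split())
    return len(lf_tokens & cand_tokens) / max(len(lf_tokens), 1)


def generate_and_rerank(
    logical_form: str,
    n: int = 5,
    score: Callable[[str, str], float] = toy_reranker_score,
) -> str:
    """Generate n candidates, then return the one the reranker scores highest."""
    candidates = generate_candidates(logical_form, n)
    return max(candidates, key=lambda c: score(logical_form, c))


if __name__ == "__main__":
    lf = "(count (band.members (band Beatles)))"
    print(generate_and_rerank(lf))
```

Swapping toy_reranker_score for a learned model (e.g., a fine-tuned encoder that scores a logical-form/candidate pair) recovers the structure of the approach described above without changing the surrounding loop.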
Publication date: 22 Sep 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2309.12294