This research paper addresses a limitation of autoregressive large language models (LLMs): although they compress knowledge from their training data into next-token conditional distributions, many tasks of interest (such as infilling and other forms of constrained generation) require sampling from intractable posterior distributions over that knowledge. The authors address this by using amortized Bayesian inference to sample from these intractable posteriors. The amortization is achieved by fine-tuning LLMs with a diversity-seeking reinforcement learning algorithm, generative flow networks (GFlowNets). The authors demonstrate that this approach enables efficient adaptation of LLMs to tasks requiring multi-step rationalization and tool use.
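GFlowNet fine-tuning is commonly trained with a trajectory-balance objective, which pushes the sampler's distribution over sequences toward being proportional to a reward (the unnormalized posterior). The following is a minimal toy sketch of that objective, not the paper's code: the policy, vocabulary, reward, and all names here are illustrative assumptions.

```python
import numpy as np

# Toy sketch of the trajectory-balance (TB) objective used in GFlowNet
# training (illustrative setup, not the paper's implementation).
# For a token sequence z = (z_1..z_T), the TB loss is
#   L(z) = (log Z + sum_t log p_F(z_t | z_<t) - log R(z))^2,
# which is zero exactly when the sampler draws z with probability R(z)/Z.

rng = np.random.default_rng(0)
VOCAB, LENGTH = 4, 3  # tiny vocabulary and sequence length for illustration

def sample_trajectory(logits):
    """Sample a token sequence from a per-step categorical policy."""
    tokens, logp = [], 0.0
    for t in range(LENGTH):
        p = np.exp(logits[t]) / np.exp(logits[t]).sum()
        tok = rng.choice(VOCAB, p=p)
        tokens.append(int(tok))
        logp += np.log(p[tok])
    return tokens, logp

def reward(tokens):
    """Toy unnormalized posterior: prefers sequences with many 0-tokens."""
    return np.exp(tokens.count(0))

def tb_loss(log_Z, logp_forward, r):
    """Squared trajectory-balance residual to be minimized."""
    return (log_Z + logp_forward - np.log(r)) ** 2

logits = rng.normal(size=(LENGTH, VOCAB))  # stand-in for LLM logits
tokens, logp = sample_trajectory(logits)
loss = tb_loss(log_Z=0.0, logp_forward=logp, r=reward(tokens))
print(loss >= 0.0)  # the TB residual is a squared error
```

In actual GFlowNet fine-tuning, `logits` would come from the LLM being tuned, `log_Z` is a learned scalar, and the loss is minimized by gradient descent over sampled trajectories.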
Publication date: 6 Oct 2023
Project Page: https://github.com/GFNOrg/gfn-lm-tuning
Paper: https://arxiv.org/pdf/2310.04363