Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
The article introduces Q-probing, a method that adapts a pre-trained language model to maximize a task-specific reward function. The approach sits between heavier methods like finetuning and lighter ones like few-shot prompting.
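As a rough illustration of what reward maximization with a probe can look like, here is a minimal sketch: a small linear probe scores k candidate completions via their embeddings from the frozen base model, and one completion is sampled with probability given by a softmax over those scores. All names, shapes, the temperature value, and the softmax-sampling choice are assumptions made for illustration, not details taken from the article.

```python
# Minimal sketch of probe-based reward maximization (illustrative, not the
# paper's exact implementation): a linear probe scores candidate completions
# from a frozen base model, and one candidate is sampled via a softmax over
# those scores. Embeddings and probe weights below are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)


def probe_scores(embeddings: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Linear probe: one scalar value per candidate completion."""
    return embeddings @ w


def select_completion(embeddings: np.ndarray, w: np.ndarray,
                      temperature: float = 1.0) -> int:
    """Sample a candidate index with probability softmax(values / temperature)."""
    values = probe_scores(embeddings, w)
    logits = values / temperature
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))


# Stand-ins for real data: k candidate completions, each represented by a
# d-dimensional embedding from the frozen model, plus a trained probe vector w.
k, d = 8, 16
candidate_embeddings = rng.normal(size=(k, d))
w = rng.normal(size=d)

chosen = select_completion(candidate_embeddings, w, temperature=0.5)
print(f"selected candidate index: {chosen}")
```

In this sketch the base model's weights are never updated: only the probe vector is trained against the reward, which is what keeps the approach lighter than finetuning.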