The article introduces DORAEMON, a novel method for sim-to-real transfer in reinforcement learning. DORAEMON poses the choice of training distribution as a constrained optimization problem: it maximizes the entropy of the distribution over dynamics parameters, subject to the policy retaining its generalization capabilities. In practice, it gradually increases the diversity of the sampled dynamics parameters as long as the current policy’s success probability remains high. The method has been tested successfully in a robotic manipulation setup with unknown real-world parameters.
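
The idea of widening the training distribution only while the policy keeps succeeding can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a single dynamics parameter drawn from a uniform distribution, a fixed policy represented by a hypothetical `policy_ok` predicate, and a hypothetical success threshold `alpha`. All function and parameter names are illustrative.

```python
import math
import random


def success_rate(policy_ok, low, high, rng, n=200):
    """Monte Carlo estimate of the success probability under Uniform(low, high)."""
    hits = sum(policy_ok(rng.uniform(low, high)) for _ in range(n))
    return hits / n


def widen_while_successful(policy_ok, center, width=0.1, step=0.1,
                           alpha=0.8, max_width=10.0, seed=0):
    """Grow the sampling interval while the estimated success rate stays >= alpha.

    Toy stand-in for "maximize entropy subject to a success constraint":
    the entropy of Uniform(low, high) is log(high - low), so a wider
    interval means higher entropy. `alpha`, `step`, and `max_width` are
    assumed values chosen for illustration only.
    """
    rng = random.Random(seed)
    while width + step <= max_width:
        candidate = width + step
        low, high = center - candidate / 2, center + candidate / 2
        if success_rate(policy_ok, low, high, rng) < alpha:
            break  # constraint would be violated: keep the last feasible width
        width = candidate
    return width, math.log(width)


# Hypothetical policy: succeeds only when the parameter is within 0.5 of 1.0.
width, entropy = widen_while_successful(lambda xi: abs(xi - 1.0) <= 0.5, center=1.0)
print(f"final width: {width:.2f}, entropy: {entropy:.3f}")
```

In this sketch the interval stops growing once too many sampled parameters fall outside the region the policy can handle. In an actual training loop the policy would also be updated between widening steps, which the sketch omits.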

 

Publication date: 3 Nov 2023
Project Page: https://arxiv.org/abs/2311.01885
Paper: https://arxiv.org/pdf/2311.01885