On the Curses of Future and History in Future-dependent Value Functions for Off-policy Evaluation
The article studies off-policy evaluation (OPE) in environments with complex observations, aiming to develop estimators that avoid exponential dependence on the horizon. The authors explore…