Optimal cross-learning for contextual bandits with unknown context distributions
The paper by Jon Schneider and Julian Zimmert from Google Research addresses the problem of designing contextual bandit algorithms in cross-learning settings, where the learner observes the loss for the…