The paper presents a new approach to prediction with missing data, framing it as a two-stage adaptive optimization problem. Traditionally, missing values are imputed first and a predictive model is then trained on the completed data. The authors instead propose adaptive linear regression models, whose regression coefficients adapt to the set of observed features. They show that this approach is equivalent to learning an imputation rule and a downstream linear regression model simultaneously rather than sequentially, and they extend the framework to non-linear models. In settings where data is strongly not missing at random, their methods achieve a 2-10% improvement in out-of-sample accuracy.
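
To make the idea of coefficients that adapt to the observed features concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes, for illustration only, that each coefficient depends affinely on the missingness mask and that the model is fit by ordinary least squares. The function names `adaptive_features`, `fit_adaptive`, and `predict_adaptive` are hypothetical.

```python
import numpy as np


def adaptive_features(X, M):
    """Design matrix for an affinely adaptive linear model (illustrative assumption).

    X: (n, d) array of features; entries may be NaN where missing.
    M: (n, d) array mask; 1 where the feature is missing, 0 where observed.
    """
    # Zero out missing entries so they contribute nothing to the prediction.
    Z = np.where(M == 1, 0.0, X)
    n, d = Z.shape
    # Interaction z_j * m_k lets the coefficient on feature j shift
    # when feature k is missing.
    inter = (Z[:, :, None] * M[:, None, :]).reshape(n, d * d)
    # Columns: intercept, d base terms, d * d mask interactions.
    return np.hstack([np.ones((n, 1)), Z, inter])


def fit_adaptive(X, M, y):
    """Fit all coefficients jointly by least squares (an assumed fitting method)."""
    A = adaptive_features(X, M)
    # lstsq returns the minimum-norm solution, so all-zero columns are harmless.
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta


def predict_adaptive(theta, X, M):
    """Predict with coefficients that depend on which features are observed."""
    return adaptive_features(X, M) @ theta
```

Because the columns of a zero-imputed linear regression are a subset of this design matrix, the adaptive fit can never have a larger training error than that static baseline. Whether it helps out of sample depends on the missingness mechanism, as the summary above notes.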

Publication date: 2 Feb 2024
Project Page: https://arxiv.org/abs/2402.01543v1
Paper: https://arxiv.org/pdf/2402.01543