The paper introduces ORTexME, an occlusion-robust method for 3D human shape and pose estimation from monocular video. Existing methods struggle with occlusion, which is common in in-the-wild videos. ORTexME exploits temporal information from the input video to better regularize occluded body parts. The method is based on NeRF and uses a novel average-texture learning approach to determine reliable regions for NeRF ray sampling. It also uses the human body mesh to guide the opacity-field updates in NeRF, suppressing blur and noise. Experiments show a significant improvement on the challenging multi-person 3DPW dataset.
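The average-texture idea can be illustrated with a minimal sketch: texels whose per-frame appearance stays close to the learned average texture are treated as reliable, and rays are sampled only from those regions. The function names, the simple absolute-deviation reliability score, and the threshold below are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def reliability_mask(frame_tex, avg_tex, threshold=0.1):
    """Mark texels reliable when the frame's texture stays close to the
    average texture (illustrative proxy for occlusion-free regions)."""
    # Mean absolute deviation over color channels; small deviation => reliable.
    err = np.abs(frame_tex - avg_tex).mean(axis=-1)
    return err < threshold

def sample_reliable_rays(frame_tex, avg_tex, n_rays, threshold=0.1, rng=None):
    """Sample up to n_rays pixel coordinates restricted to reliable texels."""
    rng = np.random.default_rng(rng)
    mask = reliability_mask(frame_tex, avg_tex, threshold)
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        raise ValueError("no reliable texels at this threshold")
    idx = rng.choice(len(ys), size=min(n_rays, len(ys)), replace=False)
    # Each row is a (y, x) pixel from which a NeRF ray would be cast.
    return np.stack([ys[idx], xs[idx]], axis=1)
```

In this sketch, a texel occluded in one frame (e.g., by another person) deviates strongly from the average texture, so it is masked out and no ray is sampled there.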

 

Publication date: 21 Sep 2023
arXiv: https://arxiv.org/abs/2309.12183v1
Paper: https://arxiv.org/pdf/2309.12183