The article presents a model, built on recent advances in latent diffusion, that translates between touch and images, improving visuo-tactile cross-modal synthesis. The model outperforms prior work on tactile-driven stylization, the task of manipulating an image to match a touch signal, and it is the first to generate images from touch without requiring additional scene information. It also introduces two new synthesis tasks: generating images that do not include the touch sensor or hand, and estimating an image's shading from its reflectance and a touch signal. This work is a significant contribution to multimodal learning, particularly in the area of touch sensing.
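As a rough illustration of how a touch signal might condition a latent diffusion model, the sketch below (not the authors' implementation) encodes a GelSight-style tactile image into cross-attention tokens that guide a denoising UNet during training. The `TactileEncoder` module, the pretrained VAE checkpoint, and all hyperparameters are assumptions chosen for illustration only.

```python
# Minimal sketch of touch-conditioned latent diffusion training (illustrative only).
import torch
import torch.nn as nn
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler

class TactileEncoder(nn.Module):
    """Hypothetical encoder: maps a tactile image to a sequence of
    conditioning tokens consumed by the UNet's cross-attention layers."""
    def __init__(self, embed_dim=768, num_tokens=16):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=4), nn.GELU(),
            nn.Conv2d(64, 256, 4, stride=4), nn.GELU(),
            nn.AdaptiveAvgPool2d((4, 4)),          # 4 * 4 = 16 tokens
        )
        self.proj = nn.Linear(256, embed_dim)

    def forward(self, touch):                       # touch: (B, 3, H, W)
        feats = self.backbone(touch)                # (B, 256, 4, 4)
        tokens = feats.flatten(2).transpose(1, 2)   # (B, 16, 256)
        return self.proj(tokens)                    # (B, 16, embed_dim)

# Assumed components: a pretrained VAE for the latent space, a conditional UNet,
# and a standard DDPM noise schedule.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
unet = UNet2DConditionModel(
    sample_size=32, in_channels=4, out_channels=4, cross_attention_dim=768
)
scheduler = DDPMScheduler(num_train_timesteps=1000)
touch_encoder = TactileEncoder()

def training_step(image, touch):
    """One denoising-diffusion training step conditioned on a tactile image."""
    with torch.no_grad():
        latents = vae.encode(image).latent_dist.sample() * vae.config.scaling_factor
    noise = torch.randn_like(latents)
    t = torch.randint(0, scheduler.config.num_train_timesteps, (latents.shape[0],))
    noisy = scheduler.add_noise(latents, noise, t)
    cond = touch_encoder(touch)                     # tactile conditioning tokens
    pred = unet(noisy, t, encoder_hidden_states=cond).sample
    return nn.functional.mse_loss(pred, noise)      # predict the added noise
```

At sampling time, the same tactile tokens would condition each denoising step, and the final latents would be decoded by the VAE into an image; the paper's actual architecture and training details are described at the links below.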

Publication date: 26 Sep 2023
Project Page: https://fredfyyang.github.io/vision-from-touch/
Paper: https://arxiv.org/pdf/2309.15117