The paper presents a method that lets users design custom hand gestures captured with a monocular camera, combining transformers with meta-learning to address the core challenge of few-shot learning: recognizing a new gesture from very little data. The method supports any combination of one-handed, two-handed, static, and dynamic gestures, seen from various viewpoints. In a user study covering 20 gestures collected from 21 participants, the approach reached up to 97% average recognition accuracy from a single demonstration. The authors position this work as a promising path for future advances in vision-based gesture customization.
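The authors' exact architecture and training procedure are described in the linked paper. As a rough illustration of how a transformer encoder plus metric-based meta-learning can enable one-shot recognition, the sketch below embeds each gesture clip with a small transformer, forms one prototype per custom gesture from a single demonstration, and classifies queries by embedding distance. This is a minimal sketch under stated assumptions, not the authors' implementation: the module names, input format (3D hand landmarks from an off-the-shelf pose estimator), dimensions, and the prototypical-network-style classification step are all illustrative choices.

```python
import torch
import torch.nn as nn

class GestureEncoder(nn.Module):
    """Transformer encoder over a sequence of hand-landmark frames.

    Assumed input: (batch, frames, landmarks * 3) flattened 3D keypoints,
    e.g. 21 hand landmarks per frame from an off-the-shelf pose estimator.
    Output: one embedding vector per gesture clip.
    """
    def __init__(self, in_dim=21 * 3, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.proj = nn.Linear(in_dim, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, x):                       # x: (B, T, in_dim)
        h = self.encoder(self.proj(x))          # (B, T, d_model)
        return h.mean(dim=1)                    # temporal average pooling

def prototype_logits(encoder, support, support_labels, query, n_classes):
    """Prototypical-network-style step: one demonstration per class forms
    a class prototype; queries are scored by negative embedding distance."""
    z_s = encoder(support)                      # (n_support, d)
    z_q = encoder(query)                        # (n_query, d)
    protos = torch.stack(
        [z_s[support_labels == c].mean(0) for c in range(n_classes)])
    return -torch.cdist(z_q, protos)            # (n_query, n_classes)

# One-shot episode with dummy data: 4 hypothetical custom gestures,
# 1 demonstration each, 8 query clips of 30 frames.
enc = GestureEncoder()
support = torch.randn(4, 30, 63)                # 4 clips, 30 frames, 21x3 kps
support_labels = torch.arange(4)
query = torch.randn(8, 30, 63)
logits = prototype_logits(enc, support, support_labels, query, n_classes=4)
print(logits.argmax(dim=1))                     # predicted gesture per query
```

Meta-training would repeat such episodes across many gesture classes, minimizing cross-entropy on the query logits, so that the learned embedding generalizes to custom gestures unseen during training.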

Publication date: 13 Feb 2024
arXiv Page: https://arxiv.org/abs/2402.08420v1
Paper: https://arxiv.org/pdf/2402.08420