The research explores lightweight deep neural network architectures that enable Pepper, a humanoid robot, to recognize and interpret American Sign Language (ASL), with the goal of facilitating non-verbal human-robot interaction. The recognition model is optimized for embedded systems, delivering rapid sign recognition while conserving computational resources. Large language models (LLMs) drive the robot's intelligent interactions, with a focus on generating natural Co-Speech Gesture responses. The research presents an integrated software pipeline for a socially aware AI interaction model; a sketch of the main components follows below. The results highlight the potential of non-verbal communication to enhance human-robot interaction and to make technology more accessible.
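The summary does not specify the network architecture, so the following is only a minimal sketch of what a lightweight sign classifier for embedded deployment could look like, assuming a MobileNet-style depthwise-separable design in PyTorch. The class names, the 96x96 input size, and the 26-way fingerspelling output are illustrative assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """MobileNet-style block: a depthwise 3x3 conv followed by a 1x1 pointwise conv.
    This factorization is what keeps the parameter and FLOP count low."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

class LightweightSignClassifier(nn.Module):
    """Small CNN for static ASL hand-sign classification on embedded hardware.
    num_classes=26 assumes the fingerspelling alphabet; the paper's class set may differ."""
    def __init__(self, num_classes=26):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(16),
            nn.ReLU(inplace=True),
            DepthwiseSeparableConv(16, 32, stride=2),
            DepthwiseSeparableConv(32, 64, stride=2),
            DepthwiseSeparableConv(64, 128, stride=2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)

model = LightweightSignClassifier()
logits = model(torch.randn(1, 3, 96, 96))  # one 96x96 RGB crop of the signing hand
print(logits.argmax(dim=1))
```

For the integrated pipeline, a hedged outline of how sign recognition, LLM response generation, and co-speech gestures might be glued together is sketched next. `recognize_sign`, `query_llm`, and the gesture tags are hypothetical stand-ins for the paper's components; the LLM call is stubbed out rather than invoking any real API.

```python
from dataclasses import dataclass

# All names below are hypothetical stand-ins, not the paper's actual API.

@dataclass
class RobotResponse:
    speech: str
    gesture: str  # name of a pre-defined Pepper animation

def recognize_sign(frame) -> str:
    """Run the lightweight classifier on a camera frame (stubbed here)."""
    return "HELLO"

def query_llm(sign: str) -> RobotResponse:
    """Ask an LLM for a spoken reply plus a matching co-speech gesture tag.
    In practice this would call a hosted model; stubbed for illustration."""
    replies = {"HELLO": RobotResponse("Hello! Nice to meet you.", "wave")}
    return replies.get(sign, RobotResponse("Could you sign that again?", "shrug"))

def interaction_step(frame):
    sign = recognize_sign(frame)   # 1. perceive: ASL sign -> text token
    response = query_llm(sign)     # 2. reason: LLM picks words and a gesture
    # 3. act: speak and animate together (Pepper's TTS/animation calls go here)
    print(f"Pepper says: {response.speech!r} while performing {response.gesture!r}")

interaction_step(frame=None)
```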

Publication date: 2 Oct 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2309.16898