MMToM-QA: Multimodal Theory of Mind Question Answering
The paper introduces a multimodal Theory of Mind (ToM) question answering benchmark, MMToM-QA, and a new method, BIP-ALM (Bayesian Inverse Planning Accelerated by Language Models), for engineering multimodal ToM capacity….