The article introduces SceMQA, a new benchmark for scientific multimodal question answering at the college entrance level. It covers core science subjects including Mathematics, Physics, Chemistry, and Biology. Each problem is annotated with the specific knowledge points it tests, and each answer comes with a detailed explanation. The benchmark also includes problems that share an identical context but ask different questions, which allows a more thorough assessment of reasoning abilities. Evaluation of state-of-the-art Multimodal Large Language Models (MLLMs) indicates a need for further research and development: even the strongest models achieved only 50% to 60% accuracy.
Publication date: 6 Feb 2024
Project Page: https://scemqa.github.io/
Paper: https://arxiv.org/pdf/2402.05138