The article presents OPERA, a novel decoding method for multi-modal large language models (MLLMs) that reduces hallucinations without requiring additional data, external knowledge, or extra training. The approach combines an Over-trust Penalty with a Retrospection-Allocation strategy. The underlying observation is that hallucinations are closely tied to knowledge-aggregation patterns in the self-attention matrix: during decoding, the model tends to over-trust a few "summary" tokens while generating subsequent content. OPERA therefore adds a penalty term to the model logits during beam-search decoding to counteract this over-trust, paired with a rollback strategy that returns decoding to the over-trusted summary token and re-selects among the remaining candidates when the pattern persists. OPERA has shown significant hallucination mitigation across different MLLMs and metrics.
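
For intuition, here is a minimal sketch of the two mechanisms in PyTorch. The function names, the window handling, the scaling factor, and the penalty weight `alpha` are illustrative assumptions, not OPERA's exact implementation; the idea is that a column-wise product over a local attention window is large only when many recent tokens concentrate attention on one column, which is the aggregation pattern the paper associates with hallucination.

```python
import torch

def over_trust_penalty(attn: torch.Tensor, scale: float = 50.0):
    """Score the column-wise knowledge-aggregation ("over-trust") pattern
    in a local window of the self-attention map.

    attn: (k, k) slice of the last layer's self-attention over the k most
    recently generated tokens, averaged over heads. The scale factor is an
    assumed constant that keeps products of small attention values from
    vanishing. Returns the penalty and the column index of the strongest
    aggregation pattern (the candidate "summary token" for rollback).
    """
    w = torch.tril(attn) * scale                   # keep the causal (lower-triangular) part
    w = torch.where(w > 0, w, torch.ones_like(w))  # neutral element for masked entries
    col_scores = w.prod(dim=0)                     # column-wise product = aggregation strength
    penalty, col = col_scores.max(dim=0)
    return penalty, int(col)

def rescore_candidates(log_probs: torch.Tensor, attn_window: torch.Tensor,
                       alpha: float = 1.0):
    """Subtract the over-trust penalty from each beam candidate's score."""
    penalty, col = over_trust_penalty(attn_window)
    return log_probs - alpha * penalty, col

def should_rollback(recent_cols: list, patience: int = 3) -> bool:
    """Retrospection trigger (sketch): if decoding keeps anchoring on the
    same summary-token column for `patience` consecutive steps, roll back
    to that token and re-select among the remaining beam candidates."""
    return len(recent_cols) >= patience and len(set(recent_cols[-patience:])) == 1
```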

Publication date: 29 Nov 2023
Paper: https://arxiv.org/pdf/2311.17911