Item: MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
Rating: 78.81
Author: GitHub Roast

Chaoyou Fu, Peixian Chen, Yunhang Shen, Yulei Qin et al.

78.81/100

📘 Readable

Decent, has merit

Content 68.0 · Citation bonus +10.8 · 1596 citations

💡 This paper presents MME, the first comprehensive evaluation benchmark for Multimodal Large Language Models (MLLMs), covering 14 perception and cognition subtasks. It avoids data leakage via manually d

#MLLM Exam #Multimodal Benchmark #Leak-Proof Evaluation #Perception-Cognition Test #LLM Ruler

Score breakdown

Novelty: 8.0 / 10

Rigor: 8.0 / 10

Significance: 9.0 / 10

Clarity: 9.0 / 10

Reproducibility: 9.0 / 10
