🔥 GitHub Roast
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
Chaoyou Fu, Peixian Chen, Yunhang Shen, Yulei Qin et al.
78.81/100
📘 Readable
Decent, has merit
Content 68.0 · Citation bonus +10.8 · 1596 citations

💡 This paper presents MME, the first comprehensive evaluation benchmark for Multimodal Large Language Models (MLLMs), covering 14 perception and cognition subtasks. It avoids data leakage via manually d

#MLLM Exam #Multimodal Benchmark #Leak-Proof Evaluation #Perception-Cognition Tes #LLM Ruler

Score breakdown

Novelty 8.0 / 10
Rigor 8.0 / 10
Significance 9.0 / 10
Clarity 9.0 / 10
Reproducibility 9.0 / 10
