MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly
Zhaowei Wang, Wenhao Yu, Xiyu Ren, Jipeng Zhang et al.
62.40/100
🫥 Mediocre
Incremental, thin
Content 62.4 · Citation bonus +0.0 · no citation data
💡 MMLongBench is the first systematic benchmark for long-context vision-language models, covering 5 task categories, 13331 samples, and 5 standardized input lengths from 8K to 128K tokens, whose evaluat
#long-context multimodal evaluation #domain benchmark #multimodal LLM #vision-language model #capability diagnosis #VLM evaluation
Score breakdown
Novelty: 7.0 / 10
Rigor: 8.0 / 10
Significance: 9.0 / 10
Clarity: 8.0 / 10
Reproducibility: 7.0 / 10