Item: Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context
Rating: 62
Author: GitHub Roast

Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context

Zhaowei Wang, Lishu Luo, Haodong Duan, Weiwei Liu et al.

62.00/100

🫥 Mediocre

Incremental, thin

Content 62.0 · Citation bonus +0.0 · 0 citations

💡 This paper presents a systematic study on long-context continued pre-training for vision-language models, deriving key training recipes (e.g., long-doc VQA data, balanced length distribution, retrieva

#long-context LVLM #training recipe ablation #data mixture design #context extrapolation #multi-task generalization

Score breakdown

Novelty 7.0 / 10

Rigor 8.0 / 10

Significance 8.0 / 10

Clarity 9.0 / 10

Reproducibility 7.0 / 10
