Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context
Zhaowei Wang, Lishu Luo, Haodong Duan, Weiwei Liu et al.
62.00/100
🫥 Mediocre
Incremental, thin
Content 62.0 · Citation bonus +0.0 · 0 citations
💡 This paper presents a systematic study on long-context continued pre-training for vision-language models, deriving key training recipes (e.g., long-doc VQA data, balanced length distribution, retrieva
#long-context vision-language models #training paradigm ablation #data mixture study #context extrapolation #multi-task generalization #long-context LVLM #training recipe ablation #data mixture design #multi-task generalizatio
Score breakdown
Novelty: 7.0 / 10
Rigor: 8.0 / 10
Significance: 8.0 / 10
Clarity: 9.0 / 10
Reproducibility: 7.0 / 10