🔥 GitHub Roast
NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents
Jingzhe Ding, Shengda Long, Changxin Pu, Huan Zhou et al.
63.60/100
🫥 Mediocre
Incremental, thin
Content 63.6 · Citation bonus +0.0 · no citation data

💡 This paper proposes NL2Repo-Bench, a benchmark dedicated to evaluating the long-horizon repository generation capability of coding agents, which requires models to autonomously complete architecture d

#coding agent #long-horizon code generation #software engineering benchmark #LLM capability gap #real-world deployment evaluation #coding agent truth serum

Score breakdown

Novelty: 7.0 / 10
Rigor: 8.0 / 10
Significance: 9.0 / 10
Clarity: 9.0 / 10
Reproducibility: 7.0 / 10
