Item: NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents
Rating: 63.6
Author: GitHub Roast


NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Jingzhe Ding, Shengda Long, Changxin Pu, Huan Zhou et al.

63.60/100

🫥 Mediocre

Incremental, thin

Content 63.6 · Citation bonus +0.0 · no citation data

💡 This paper proposes NL2Repo-Bench, a benchmark dedicated to evaluating the long-horizon repository generation capability of coding agents, which requires models to autonomously complete architecture d

#coding agent #long-horizon code generation #software engineering benchmark #LLM capability gap #real-world deployment evaluation #coding agent truth serum


Score breakdown

Novelty: 7.0 / 10

Rigor: 8.0 / 10

Significance: 9.0 / 10

Clarity: 9.0 / 10

Reproducibility: 7.0 / 10
