🏆 Masterpiece Leaderboard
- 1 🥇 80.00 · Attention Is All You Need · Ashish Vaswani, Noam Shazeer, Niki Parmar · #Transformer开
- 2 🥇 80.00 · Deep Residual Learning for Image Recognition · Kaiming He, Xiangyu Zhang, Shaoqing Ren · #residual skip connections
- 3 📘 78.81 · MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models · Chaoyou Fu, Peixian Chen, Yunhang Shen · #MLLM exam paper
- 4 📘 75.60 · Denoising Diffusion Probabilistic Models · Jonathan Ho, Ajay Jain, Pieter Abbeel · #pioneering work on diffusion models
- 5 📘 73.95 · Absolute Zero: Reinforced Self-play Reasoning with Zero Data · Andrew Zhao, Yiran Wu, Yang Yue · #zero-data RL
- 6 📘 71.43 · Mean Flows for One-step Generative Modeling · Zhengyang Geng, Mingyang Deng, Xingjian Bai · #one-step image generation
- 7 📘 71.20 · Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore · Junchao Wu, Runzhe Zhan, Derek F. Wong · #LLM-generated text detection
- 8 📘 70.00 · OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments · Tianbao Xie, Danyang Zhang, Jixuan Chen · #GUI agent benchmark
- 9 📘 69.87 · DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios · Junchao Wu, Runzhe Zhan, Derek F. Wong · #LLM detection benchmark
- 10 📘 68.83 · WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation · Shuangrui Ding, Xuanlang Dai, Long Xing · #sandbox fraud-buster
- 11 📘 68.64 · StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding · Junming Lin, Zheng Fang, Chi Chen · #streaming video understanding
- 12 📘 65.97 · d$^2$Cache: Accelerating Diffusion-Based LLMs via Dual Adaptive Caching · Yuchu Jiang, Yue Cai, Xiangzhong Luo · #diffusion LLM inference acceleration
- 13 📘 65.60 · Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval · Pascal Notin, Mafalda Dias, Jonathan Frazer · #protein fitness prediction
- 14 🫥 64.80 · SkillOpt: Executive Strategy for Self-Evolving Agent Skills · Yifan Yang, Ziyang Gong, Weiquan Huang · #agent skill alchemy
- 15 🫥 63.60 · NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents · Jingzhe Ding, Shengda Long, Changxin Pu · #coding agent
- 16 🫥 63.20 · VideoRoPE: What Makes for Good Video Rotary Position Embedding? · Xilin Wei, Xiaoran Liu, Yuhang Zang · #video position encoding
- 17 🫥 63.20 · Balanced Aggregation: Understanding and Fixing Aggregation Bias in GRPO · Zhiyuan Zeng, Jiameng Huang, Zhangyue Yin · #cracking the GRPO aggregation black magic
- 18 🫥 62.40 · Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing · Xiangyu Zhao, Peiyuan Zhang, Kexian Tang · #new visual editing benchmark
- 19 🫥 62.40 · MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly · Zhaowei Wang, Wenhao Yu, Xiyu Ren · #long-context multimodal evaluation
- 20 🫥 62.00 · Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games · Shengyuan Ding, Xilin Wei, Xinyu Fang · #multimodal large model evaluation
- 21 🫥 62.00 · Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context · Zhaowei Wang, Lishu Luo, Haodong Duan · #long-context vision-language model
- 22 🫥 60.80 · SetCon: Towards Open-Ended Referring Segmentation via Set-Level Concept Prediction · Zhixiong Zhang, Yizhuo Li, Shuangrui Ding · #LVLM finally knows that multiple targets are…
- 23 🫥 60.40 · GameCraft-Bench: Can Agents Build Playable Games End-to-End in a Real Game Engine? · Tongxu Luo, Rongsheng Wang, Jiaxi Bi · #game generation benchmark
- 24 🫥 59.60 · DetectRL-X: Towards Reliable Multilingual and Real-World LLM-Generated Text Detection · Junchao Wu, Yefeng Liu, Chenyu Zhu · #multilingual text detection
- 25 🫥 58.00 · MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs · Jiarui Zhang, Mahyar Khayatkhoei, Prateek Chhikara · #MLLM eye-brain separation
- 26 🫥 57.20 · Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training · Wenyu Du, Tongxu Luo, Zihan Qiu · #practical guide to model growth
- 27 🫥 56.40 · Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances · Yi Yu, Botao Ren, Peiyuan Zhang · #point-supervised oriented detection
- 28 🫥 55.60 · SeHDR: Single-Exposure HDR Novel View Synthesis via 3D Gaussian Bracketing · Yiyu Li, Haoyuan Wang, Ke Xu · #single-exposure HDR synthesis
- 29 🫥 55.60 · Agentifying Patient Dynamics within LLMs through Interacting with Clinical World Model · Minghao Wu, Yuting Yan, Zhenyang Cai · #intelligent sepsis decision-making
- 30 🫥 54.40 · ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning · Shengyuan Ding, Xinyu Fang, Ziyu Liu · #multimodal reward model
- 31 🫥 54.40 · Knowledge Index of Noah's Ark · Sheng Jin, Minghao Liu, Yunze Xiao · #LLM knowledge evaluation
- 32 🫥 54.40 · GenExam: A Multidisciplinary Text-to-Image Exam · Zhaokai Wang, Penghao Yin, Xiangyu Zhao · #text-to-image evaluation
- 33 🫥 53.60 · OneRec Technical Report · Guorui Zhou, Jiaxin Deng, Jinghao Zhang · #industrial-scale recommender system
- 34 🫥 53.20 · The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning · Qiguang Chen, Yantao Du, Ziniu Li · #long chain-of-thought analysis
- 35 🫥 52.40 · Why Multi-Step Tool-Use Reinforcement Learning Collapses and How Supervisory Signals Fix It · Yupu Hao, Zhuoran Jin, Huanxuan Liao · #tool-use RL
- 36 🫥 52.40 · SS-MAE: Spatial-Spectral Masked Auto-Encoder for Multi-Source Remote Sensing Image Classification · Junyan Lin, Feng Gao, Xiaocheng Shi · #remote sensing classification
- 37 🫥 52.40 · Fast Large Language Model Collaborative Decoding via Speculation · Jiale Fu, Yuchu Jiang, Junkai Chen · #LLM acceleration
- 38 🫥 52.40 · Trust Your Critic: Robust Reward Modeling and Reinforcement Learning for Faithful Image Editing and Generation · Xiangyu Zhao, Peiyuan Zhang, Junming Lin · #reward model de-hallucination
- 39 🫥 52.40 · DebCSE: Rethinking Unsupervised Contrastive Sentence Embedding Learning in the Debiasing Perspective · Pu Miao, Zeyao Du, Junlin Zhang · #sentence embedding debiasing
- 40 🫥 51.20 · RepreGuard: Detecting LLM-Generated Text by Revealing Hidden Representation Patterns · Xin Chen, Junchao Wu, Shu Yang · #AI-generated text detection
- 41 🫥 50.40 · Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning · Zhaoyang Wang, Canwen Xu, Boyi Liu · #agent environment savior
- 42 🫥 50.40 · Learning from Peers in Reasoning Models · Tongxu Luo, Wenyu Du, Jiaxi Bi · #prefix trap observation
- 43 🫥 50.40 · Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices · Junyan Lin, Haoran Chen, Yue Fan · #multimodal large models
- 44 🫥 50.00 · Qwen-AgentWorld: Language World Models for General Agents · Yuxin Zuo, Zikai Xiao, Li Sheng · #world model alchemy
- 45 🫥 50.00 · Kwai Keye-VL-2.0 Technical Report · Kwai Keye Team, Bin Wen, Changyi Liu · #long video understanding
- 46 🫥 50.00 · Generative Modeling via Drifting · Mingyang Deng, He Li, Tianhong Li · #one-step generation
- 47 🫥 49.60 · DynamicFace: High-Quality and Consistent Face Swapping for Image and Video using Composable 3D Facial Priors · Runqi Wang, Yang Chen, Sijie Xu · #face swapping
- 48 🫥 49.60 · MM-IFEngine: Towards Multimodal Instruction Following · Shengyuan Ding, Shenxi Wu, Xiangyu Zhao · #multimodal large models
- 49 🫥 49.44 · MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization · Xiangyu Zhao, Junming Lin, Tianhao Liang · #multimodal large models
- 50 🫥 48.00 · Hard to Read, Easy to Jailbreak: How Visual Degradation Bypasses MLLM Safety Alignment · Zhixue Song, Boyan Han, Yiwei Wang · #multimodal safety