A$^2$FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning Paper • 2510.12838 • Published 16 days ago • 22
ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems Paper • 2510.11652 • Published 16 days ago • 28
Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL Paper • 2508.13167 • Published Aug 6 • 127
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published Feb 20 • 104
CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark Paper • 2401.11944 • Published Jan 22, 2024 • 27
SciMMIR: Benchmarking Scientific Multi-modal Information Retrieval Paper • 2401.13478 • Published Jan 24, 2024 • 3
PIN: A Knowledge-Intensive Dataset for Paired and Interleaved Multimodal Documents Paper • 2406.13923 • Published Jun 20, 2024 • 24
MMRA: A Benchmark for Multi-granularity Multi-image Relational Association Paper • 2407.17379 • Published Jul 24, 2024 • 3
PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment Paper • 2410.13785 • Published Oct 17, 2024 • 19
OmniBench: Towards The Future of Universal Omni-Language Models Paper • 2409.15272 • Published Sep 23, 2024 • 30