DeepResearchEval: An Automated Framework for Deep Research Task Construction and Agentic Evaluation • Paper • 2601.09688 • Published 7 days ago • 119
Robust-R1: Degradation-Aware Reasoning for Robust Visual Understanding • Paper • 2512.17532 • Published Dec 19, 2025 • 65
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling • Paper • 2511.11793 • Published Nov 14, 2025 • 185
100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models • Paper • 2505.00551 • Published May 1, 2025 • 36
Finding the Sweet Spot: Preference Data Construction for Scaling Preference Optimization • Paper • 2502.16825 • Published Feb 24, 2025 • 7
Test-time Computing: from System-1 Thinking to System-2 Thinking • Paper • 2501.02497 • Published Jan 5, 2025 • 45