Stabilizing MoE Reinforcement Learning by Aligning Training and Inference Routers Paper • 2510.11370 • Published 16 days ago • 2
Glyph: Scaling Context Windows via Visual-Text Compression Paper • 2510.17800 • Published 9 days ago • 61
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level Paper • 2411.03562 • Published Nov 5, 2024 • 68
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model Paper • 2510.18855 • Published 8 days ago • 60
Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models Paper • 2507.17702 • Published Jul 23 • 6
Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs Paper • 2510.13795 • Published 14 days ago • 50
First Try Matters: Revisiting the Role of Reflection in Reasoning Models Paper • 2510.08308 • Published 20 days ago • 24
SR-Scientist: Scientific Equation Discovery With Agentic AI Paper • 2510.11661 • Published 16 days ago • 3
Cache-to-Cache: Direct Semantic Communication Between Large Language Models Paper • 2510.03215 • Published 26 days ago • 93
Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer Paper • 2510.06590 • Published 21 days ago • 70
BigCodeArena Collection Unveiling More Reliable Human Preferences in Code Generation via Execution • 8 items • Updated 15 days ago • 4