Unlocking Implicit Experience: Synthesizing Tool-Use Trajectories from Text • Paper • 2601.10355 • Published 5 days ago • 33
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding • Paper • 2510.14943 • Published Oct 16, 2025 • 39
Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners • Paper • 2509.26226 • Published Sep 30, 2025 • 33
Progressive Multimodal Reasoning via Active Retrieval • Paper • 2412.14835 • Published Dec 19, 2024 • 73
Model Memory Utility 🚀 • Calculate vRAM needed for model training and inference