Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment Paper • 2601.14249 • Published 2 days ago • 6
OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment Paper • 2601.01576 • Published 19 days ago • 17
Analyzing the Effects of Supervised Fine-Tuning on Model Knowledge from Token and Parameter Levels Paper • 2509.16596 • Published Sep 20, 2025 • 14
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination Paper • 2507.10532 • Published Jul 14, 2025 • 90
Pre-Trained Policy Discriminators are General Reward Models Paper • 2507.05197 • Published Jul 7, 2025 • 39
Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric Paper • 2502.17184 • Published Feb 24, 2025 • 1