Reward Inside the Model: A Lightweight Hidden-State Reward Model for LLM's Best-of-N sampling Paper • 2505.12225 • Published May 18, 2025
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published 27 days ago