Reward Inside the Model: A Lightweight Hidden-State Reward Model for LLM's Best-of-N sampling • Paper 2505.12225 • Published May 18, 2025