MOA: Multi-Objective Alignment for Role-Playing Agents • Paper • 2512.09756 • Published 23 days ago • 3
Entropy Regularizing Activation: Boosting Continuous Control, Large Language Models, and Image Classification with Activation as Entropy Constraints • Paper • 2510.08549 • Published Oct 9, 2025 • 6
Thought-Augmented Policy Optimization: Bridging External Guidance and Internal Capabilities • Paper • 2505.15692 • Published May 21, 2025 • 14
Kimi k1.5: Scaling Reinforcement Learning with LLMs • Paper • 2501.12599 • Published Jan 22, 2025 • 126