Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction Paper • 2601.05107 • Published 7 days ago • 21
DeepCritic: Deliberate Critique with Large Language Models Paper • 2505.00662 • Published May 1, 2025 • 54
Critique-RL: Training Language Models for Critiquing through Two-Stage Reinforcement Learning Paper • 2510.24320 • Published Oct 28, 2025 • 20