Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning Paper • 2505.20561 • Published May 26, 2025 • 7