ACoT-VLA: Action Chain-of-Thought for Vision-Language-Action Models • Paper 2601.11404 • Published 5 days ago
GRPO-MA: Multi-Answer Generation in GRPO for Stable and Efficient Chain-of-Thought Training • Paper 2509.24494 • Published Sep 29, 2025