Rethinking Large Language Model Distillation: A Constrained Markov Decision Process Perspective Paper • 2509.22921 • Published Sep 26 • 11
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs Paper • 2402.14740 • Published Feb 22, 2024 • 15
Experience is the Best Teacher: Grounding VLMs for Robotics through Self-Generated Memory Paper • 2507.16713 • Published Jul 22 • 21
Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving Paper • 2507.02726 • Published Jul 3 • 14
Ark: An Open-source Python-based Framework for Robot Learning Paper • 2506.21628 • Published Jun 24 • 16
Almost Surely Safe Alignment of Large Language Models at Inference-Time Paper • 2502.01208 • Published Feb 3 • 11
Article Accelerating Language Model Inference with Mixture of Attentions By hba123 and 1 other • Jan 7 • 24
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level Paper • 2411.03562 • Published Nov 5, 2024 • 68
HumanRankEval: Automatic Evaluation of LMs as Conversational Assistants Paper • 2405.09186 • Published May 15, 2024 • 22
Human-like Episodic Memory for Infinite Context LLMs Paper • 2407.09450 • Published Jul 12, 2024 • 62
ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning Paper • 2406.19741 • Published Jun 28, 2024 • 62