Aviral Kumar

aviralkumar

authored a paper 8 months ago

Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

Paper • 2503.07572 • Published Mar 10, 2025 • 47

authored 2 papers about 1 year ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 140

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6, 2024 • 63

authored 2 papers over 1 year ago

DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

Paper • 2406.11896 • Published Jun 14, 2024 • 20

Stop Regressing: Training Value Functions via Classification for Scalable Deep RL

Paper • 2403.03950 • Published Mar 6, 2024 • 16

authored 2 papers about 2 years ago

Robotic Offline RL from Internet Videos via Value-Function Pre-Training

Paper • 2309.13041 • Published Sep 22, 2023 • 9

Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

Paper • 2309.10150 • Published Sep 18, 2023 • 25