Hugging Face's logo Hugging Face
  • Models
  • Datasets
  • Spaces
  • Docs
  • Enterprise
  • Pricing

  • Log In
  • Sign Up
Aviral Kumar's picture
1

Aviral Kumar

aviralkumar
iamamanpandey's profile picture nitishpandey04's profile picture 21world's profile picture
·

AI & ML interests

None yet

Organizations

None yet

authored a paper 8 months ago

Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

Paper • 2503.07572 • Published Mar 10 • 47
authored 2 papers about 1 year ago

Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 140

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6, 2024 • 63
authored 2 papers over 1 year ago

DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

Paper • 2406.11896 • Published Jun 14, 2024 • 20

Stop Regressing: Training Value Functions via Classification for Scalable Deep RL

Paper • 2403.03950 • Published Mar 6, 2024 • 16
authored 2 papers about 2 years ago

Robotic Offline RL from Internet Videos via Value-Function Pre-Training

Paper • 2309.13041 • Published Sep 22, 2023 • 9

Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

Paper • 2309.10150 • Published Sep 18, 2023 • 25
Company
TOS Privacy About Jobs
Website
Models Datasets Spaces Pricing Docs