NVIDIA Isaac GR00T in LeRobot

Community Article · Published October 28, 2025


Introduction

NVIDIA Isaac GR00T N open reasoning Vision Language Action (VLA) models are now integrated into Hugging Face's LeRobot with the LeRobot v0.4.0 release.

This is an exciting collaborative project between the LeRobot and NVIDIA teams. You can now post-train and evaluate GR00T N1.5 directly via LeRobot. With this, we’re aiming to make it easier for the open-source robotics community to customize and deploy Isaac GR00T robot foundation models. Read more about the release in the LeRobot v0.4.0 blog! Thanks to Steven Palma and team for this collaboration.

We’re also soon releasing an Isaac Lab environment with the SO-101 arm for simulated teleoperation and model evaluation. By providing a virtual interface to collect demonstrations, this will enable developers to create larger and more diverse datasets.

Key highlights of the integration

  • Performance: Benchmarks on LIBERO and hardware confirm that the LeRobot implementation performs on par with the original GR00T repo.
  • Community impact: Expands GR00T N1.5 reach across research and development communities, ensuring better support and maintainability.
  • Customization: You can post-train GR00T on your own robot data, thanks to LeRobot's unified dataset format and training pipelines.

Why is this integration exciting?

  • Before this work, development meant dealing with separate codebases, different dataset formats, and custom hardware interfaces.
  • GR00T’s integration in LeRobot v0.4.0 enables developers to use and post-train a high-capacity, multimodal, generalist model without building infrastructure from scratch.
  • Streamlines installation with PyPI-ready builds, Hugging Face integration, and consistent dependency management through the LeRobot v0.4.0 release structure.
  • LeRobot offers a unified, modular policy and dataset API, allowing seamless comparison between models (like GR00T N1.5, pi0, pi0.5, SmolVLA) within one framework.
  • Shared evaluation tools across models: the same dataset loaders, the same metrics, the same visualization dashboards. A short sketch of this unified API follows this list.
  • LeRobot’s plugin and processor pipelines make it straightforward to:
    • Use GR00T as a drop-in policy.
    • Run inference on both simulated and real robots.
    • Combine it with other foundation models.
    • Experiment faster and with less boilerplate.
  • Clearer tutorials, notebooks, and API references suited for developers.
  • Access to new robots in the LeRobot framework.
  • Compatible with the standard LeRobot pipelines (datasets, simulators, robot drivers).
  • LeRobot v0.4.0 introduced a plugin architecture for hardware and teleoperation. Now GR00T can directly control real robots (like Reachy 2 and SO-101) through LeRobot’s drivers.
  • With GR00T integrated into LeRobot’s standardized pipelines, it’s easy to run end-to-end workflows like:
    • Collect teleoperation data → fine-tune a GR00T policy → deploy on your robot → evaluate in simulation or on real hardware.
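To make the unified API concrete, here is a minimal Python sketch that loads the bimanual handover dataset used later in this post. The import path and attribute names follow LeRobot v0.4.x as we understand it, but treat them as assumptions that may shift between releases rather than canonical API.

# Minimal sketch of LeRobot's unified dataset API (v0.4.x paths assumed).
from lerobot.datasets.lerobot_dataset import LeRobotDataset

# Pull the dataset from the Hugging Face Hub by its repo id.
dataset = LeRobotDataset("pepijn223/bimanual-so100-handover-cube")
print(f"episodes: {dataset.num_episodes}, frames: {dataset.num_frames}")

# Each sample is a dict: camera frames, robot state, actions, metadata.
sample = dataset[0]
for key, value in sample.items():
    print(key, getattr(value, "shape", value))

Because every policy in LeRobot (GR00T N1.5, pi0, pi0.5, SmolVLA) consumes this same dataset format, switching models is a one-flag change in the training commands below.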

Installation

To train and evaluate GR00T, you will need a machine with an NVIDIA GPU. We’ve tested on a Linux system with an NVIDIA H100 and an NVIDIA RTX A6000. If you don't have access to one, you can easily use a remote instance on NVIDIA Brev.

Use the commands below to create a conda environment, clone the LeRobot repository, and install dependencies. Note: this installs LeRobot from source.

conda create -y -n lerobot python=3.10
conda activate lerobot
conda install ffmpeg=7.1.1 -c conda-forge

git clone https://github.com/huggingface/lerobot.git
cd lerobot

# Check https://pytorch.org/get-started/locally/ for your system
pip install "torch>=2.2.1,<2.8.0" "torchvision>=0.21.0,<0.23.0" # --index-url https://download.pytorch.org/whl/cu1XX
pip install ninja "packaging>=24.2,<26.0" # flash attention dependencies
pip install "flash-attn>=2.5.9,<3.0.0" --no-build-isolation

pip install -e ".[libero,groot,dev,test]"
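
Before moving on, it is worth sanity-checking the environment. The snippet below is a minimal check; the lerobot.__version__ attribute and the flash_attn version lookup are assumptions about how the packages expose metadata, not documented API.

# Quick environment sanity check (run inside the lerobot conda env).
import torch
import lerobot      # __version__ attribute assumed; expect 0.4.x
import flash_attn   # fails here if the build does not match your CUDA/PyTorch

print("lerobot:", lerobot.__version__)
print("flash-attn:", flash_attn.__version__)
print("CUDA available:", torch.cuda.is_available())  # must be True to train
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))

If the flash_attn import fails, re-run the flash-attn install step with a PyTorch wheel that matches your CUDA version.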

Training and Evaluation

Follow the steps below to train and evaluate on physical hardware and in simulation.

Hardware

  1. We use the Bimanual Cube Handover dataset for training. This dataset consists of 25 episodes capturing a cube handover task between two SO-100 arms. Each episode includes both video and corresponding action data. Below is the command to train on a single GPU:
lerobot-train \
 --policy.type=groot \
 --policy.push_to_hub=false \
 --dataset.repo_id=pepijn223/bimanual-so100-handover-cube \
 --batch_size=32 \
 --steps=20000 \
 --save_checkpoint=true \
 --wandb.enable=false \
 --save_freq=10 \
 --log_freq=2 \
 --policy.tune_diffusion_model=false \
 --output_dir=./outputs/
  2. Use the following command to train with multiple GPUs:
accelerate launch \
 --multi_gpu \
 --num_processes=$(nvidia-smi -L | wc -l) \
 $(which lerobot-train) \
 --policy.type=groot \
 --policy.push_to_hub=false \
 --dataset.repo_id=pepijn223/bimanual-so100-handover-cube \
 --batch_size=32 \
 --steps=20000 \
 --save_checkpoint=true \
 --wandb.enable=false \
 --save_freq=10 \
 --log_freq=2 \
 --policy.tune_diffusion_model=false \
 --output_dir=./outputs/
  3. Run evaluation on the physical SO-100 arms with this command:
lerobot-record \
 --robot.type=bi_so100_follower \
 --robot.left_arm_port=/dev/ttyACM1 \
 --robot.right_arm_port=/dev/ttyACM0 \
 --robot.id=bimanual_follower \
 --robot.cameras='{ right: {"type": "opencv", "index_or_path": 0, "width": 640, "height": 480, "fps": 30},
   left: {"type": "opencv", "index_or_path": 2, "width": 640, "height": 480, "fps": 30},
   top: {"type": "opencv", "index_or_path": 4, "width": 640, "height": 480, "fps": 30},
 }' \
 --display_data=true \
 --dataset.repo_id=${HF_USER}/eval_groot-bimanual  \
 --dataset.num_episodes=10 \
 --dataset.single_task="Grab and handover the red cube to the other arm" \
 --dataset.episode_time_s=30 \
 --dataset.reset_time_s=10 \
 --policy.path=${HF_USER}/groot-bimanual # your trained model
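
The camera indices in the command above (0, 2, and 4) are specific to our machine. A quick way to find yours is to probe the video devices with plain OpenCV (installed as a LeRobot dependency for the "opencv" camera type); the range of indices scanned below is an arbitrary choice.

# Probe which camera indices are live before filling in --robot.cameras.
import cv2

for idx in range(8):  # scan the first 8 video devices
    cap = cv2.VideoCapture(idx)
    ok, _ = cap.read()
    print(f"camera index {idx}: {'available' if ok else 'unavailable'}")
    cap.release()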

Simulation

For simulation, we use LIBERO, a benchmark designed to study lifelong robot learning. LIBERO provides standardized tasks that focus on knowledge transfer across different scenarios. It includes task suites designed to test learning under distribution shifts in objects, goals, and layouts, serving as a comprehensive testbed for lifelong decision-making algorithms and generalist robot training.

  1. The command below launches training across multiple GPUs. The dataset used in this command is the LIBERO-Long task suite (libero_10), which has 10 long-horizon tasks from the LIBERO-100 collection.
accelerate launch \
  --multi_gpu \
  --num_processes=$(nvidia-smi -L | wc -l) \
  $(which lerobot-train) \
  --output_dir=./outputs/ \
  --save_checkpoint=true \
  --batch_size=64 \
  --steps=40000 \
  --eval_freq=0 \
  --save_freq=5000 \
  --log_freq=10 \
  --policy.push_to_hub=true \
  --policy.type=groot \
  --policy.repo_id=${HF_USER}/groot_libero_10_64_40000 \
  --policy.tune_diffusion_model=false \
  --dataset.repo_id=HuggingFaceVLA/libero \
  --env.type=libero \
  --env.task=libero_10 \
  --wandb.enable=true \
  --wandb.disable_artifact=true \
  --job_name=my-groot-libero-10-finetune
  2. Use the following command for training on a single GPU:
lerobot-train \
  --output_dir=./outputs/ \
  --save_checkpoint=true \
  --batch_size=64 \
  --steps=40000 \
  --eval_freq=0 \
  --save_freq=5000 \
  --log_freq=10 \
  --policy.push_to_hub=true \
  --policy.type=groot \
  --policy.repo_id=${HF_USER}/groot_libero_10_64_40000 \
  --policy.tune_diffusion_model=false \
  --dataset.repo_id=HuggingFaceVLA/libero \
  --env.type=libero \
  --env.task=libero_10 \
  --wandb.enable=true \
  --wandb.disable_artifact=true \
  --job_name=my-groot-libero-10-finetune
  3. Evaluate the LIBERO checkpoint using the command below:
lerobot-eval \
  --policy.path=${HF_USER}/groot_libero_10 \
  --env.type=libero \
  --env.task=libero_10 \
  --eval.batch_size=1 \
  --eval.n_episodes=10 \
  --policy.n_action_steps=50 \
  --env.max_parallel_tasks=1 \
  --output_dir=./evals/${HF_USER}/groot_libero_10
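
lerobot-eval writes per-episode results and an aggregated success rate into the output directory. The sketch below shows one way to pull out the headline number; the eval_info.json file name and key layout are assumptions carried over from earlier LeRobot releases, so check what your run actually produced.

# Read the aggregated success rate written by lerobot-eval.
# File name and keys below are assumptions from earlier LeRobot releases.
import json
import os
from pathlib import Path

out_dir = Path("./evals") / os.environ["HF_USER"] / "groot_libero_10"
info = json.loads((out_dir / "eval_info.json").read_text())
print("success rate (%):", info["aggregated"]["pc_success"])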

Performance Results

LIBERO Benchmark Results

To evaluate the LeRobot implementation of GR00T, we post-trained the GR00T N1.5 model on the LIBERO dataset for 20k-40k steps, then compared the results against the reference GR00T implementation. The LeRobot port shows strong, on-par performance across the LIBERO benchmark suites.

| Benchmark | GR00T LeRobot | Original GR00T | Training parameters used | Checkpoint |
| --- | --- | --- | --- | --- |
| LIBERO-Spatial | 82.0% | 92.0% | Batch size 128; 20,000 steps | Checkpoint |
| LIBERO-Object | 99.0% | 92.0% | Batch size 64; 40,000 steps | Checkpoint |
| LIBERO-Long | 82.0% | 76.0% | Batch size 64; 40,000 steps | Checkpoint |
| Average | 87.7% | 86.7% | | |

Hardware Results

Evaluation was also performed successfully in the real world on the bimanual cube handover task using two SO-100 arms.

GR00T N1.5 was also post-trained on a pick-and-place task and deployed successfully on the SO-100 arm.

Get Started Today

Here are some resources to help you explore this exciting project:

  • LeRobot repository: https://github.com/huggingface/lerobot
  • LeRobot v0.4.0 release blog
  • Isaac GR00T N1.5 models on the Hugging Face Hub
  • Bimanual Cube Handover training dataset: pepijn223/bimanual-so100-handover-cube
