# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Repository Overview

This is a Hugging Face Space that provides a GPU-accelerated JupyterLab environment for training and simulating robots using the MuJoCo physics engine. The space covers a wide range of robotics applications including locomotion, manipulation, motion tracking, and general physics simulation. It is designed to run in a Docker container with NVIDIA GPU support for hardware-accelerated physics rendering.

## What This Environment Supports

This is a general-purpose MuJoCo training environment with sample notebooks covering:

1. **General MuJoCo Physics** (`tutorial.ipynb`) - Comprehensive introduction to MuJoCo fundamentals including basic rendering, simulation loops, contacts, friction, tendons, actuators, sensors, and advanced rendering techniques

2. **Locomotion** (`locomotion.ipynb`) - Training quadrupedal and bipedal robots for walking, running, and acrobatic behaviors. Includes environments for Unitree Go1/G1, Boston Dynamics Spot, Google Barkour, Berkeley Humanoid, Unitree H1, and more

3. **Manipulation** (`manipulation.ipynb`) - Robot arm and dexterous hand control. Includes Franka Emika Panda pick-and-place tasks and Leap Hand dexterous manipulation with asymmetric actor-critic training

4. **Motion Tracking** (`opentrack.ipynb`) - Humanoid motion tracking and retargeting using the OpenTrack system with motion capture data

## Architecture

### Container Environment
- **Base Image**: nvidia/cuda:12.8.1-devel-ubuntu22.04
- **Python**: 3.13 (Miniconda)
- **GPU Rendering**: EGL for headless OpenGL rendering via the NVIDIA drivers
- **Web Server**: JupyterLab on port 7860

### Key Components

1. **GPU Initialization** (`init_gpu.py`): Validates GPU setup before starting JupyterLab
   - Checks NVIDIA driver accessibility via `nvidia-smi`
   - Verifies EGL library availability (libEGL.so.1, libGL.so.1, libEGL_nvidia.so.0)
   - Tests EGL device initialization with multiple fallback methods (platform device, default display, surfaceless)
   - Validates MuJoCo rendering at multiple resolutions (64x64, 240x320, 480x640)
   - Critical environment variables: `MUJOCO_GL=egl`, `PYOPENGL_PLATFORM=egl`, `EGL_PLATFORM=surfaceless` (a minimal render smoke test is sketched after this list)

2. **MuJoCo Playground Setup** (`init_mujoco.py`): Downloads MuJoCo model assets
   - Imports `mujoco_playground` which automatically clones the mujoco_menagerie repository
   - This repository contains robot models (quadrupeds, bipeds, arms, hands, etc.)

3. **Server Startup** (`start_server.sh`): Container entrypoint
   - Sets up NVIDIA EGL library symlinks at runtime (searches /usr/local/nvidia/lib64, /usr/local/cuda/lib64, /usr/lib/nvidia)
   - Runs GPU validation (`python init_gpu.py`)
   - Downloads MuJoCo assets (`python init_mujoco.py`)
   - Disables JupyterLab announcements
   - Launches JupyterLab with iframe embedding support for Hugging Face Spaces
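
A quick end-to-end check that the EGL stack works, beyond the `init_gpu.py` validation, is to render a single headless frame. This is a minimal sketch; the MJCF string and resolution are arbitrary:

```python
import os

# Must be set before MuJoCo creates a GL context; init_gpu.py exports the same values.
os.environ.setdefault("MUJOCO_GL", "egl")
os.environ.setdefault("PYOPENGL_PLATFORM", "egl")
os.environ.setdefault("EGL_PLATFORM", "surfaceless")

import mujoco

xml = """
<mujoco>
  <worldbody>
    <light pos="0 0 3"/>
    <geom type="plane" size="1 1 0.1"/>
    <body pos="0 0 0.5"><freejoint/><geom type="box" size="0.1 0.1 0.1"/></body>
  </worldbody>
</mujoco>
"""
model = mujoco.MjModel.from_xml_string(xml)
data = mujoco.MjData(model)
mujoco.mj_forward(model, data)

# Raises at Renderer construction time if the EGL stack is misconfigured.
with mujoco.Renderer(model, height=240, width=320) as renderer:
    renderer.update_scene(data)
    pixels = renderer.render()
print("Rendered frame:", pixels.shape)  # expected: (240, 320, 3)
```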

### Sample Notebooks

Sample notebooks are organized in individual folders within `samples/` and are automatically copied to `/data/workspaces/` at container startup:

- **`samples/tutorial/`** - Complete MuJoCo introduction (2258 lines) covering physics fundamentals, rendering, contacts, actuators, sensors, tendons, and camera control
- **`samples/locomotion/`** - Quadrupedal and bipedal locomotion training (1762 lines) with PPO, domain randomization, curriculum learning, and policy fine-tuning
- **`samples/manipulation/`** - Robot manipulation (649 lines) including pick-and-place (Panda arm) and dexterous manipulation (Leap Hand) with asymmetric actor-critic
- **`samples/opentrack/`** - Humanoid motion tracking/retargeting (603 lines) including dataset download, training, checkpoint conversion, and video generation

Each sample is copied to its own workspace directory (`/data/workspaces/<sample_name>/`) at runtime. Notebooks are only copied if they don't already exist, preserving any user modifications.

## Development Commands

### Running Locally with Docker

```bash
# Build the container
docker build -t mujoco-training .

# Run with GPU support
docker run --gpus all -p 7860:7860 mujoco-training
```

### Testing GPU Setup

```bash
# Validate GPU rendering capabilities (run inside container)
python init_gpu.py

# Check NVIDIA driver
nvidia-smi

# Test EGL libraries
ldconfig -p | grep EGL
```

### JupyterLab Access

- Default port: 7860
- Default token: "huggingface" (set via `JUPYTER_TOKEN` environment variable)
- Default landing page: `/lab/tree/workspaces/locomotion/locomotion.ipynb`
- Notebook working directory: `/data` (when deployed as Hugging Face Space)

### Persistent Storage and Workspaces

When deployed on Hugging Face Spaces, the `/data` directory is backed by persistent storage. At container startup, `start_server.sh` automatically:

1. Creates `/data/workspaces/` if it doesn't exist
2. For each sample in `samples/`, creates `/data/workspaces/<sample_name>/` if it doesn't exist
3. Copies the `.ipynb` file only if it doesn't already exist in the workspace (preserving user modifications)
4. Copies any additional files from the sample directory (datasets, scripts, etc.)

This ensures:
- User modifications to notebooks are preserved across container restarts
- Each sample has its own isolated workspace for generated data, models, and outputs
- Sample notebooks can include supporting files that are copied to the workspace
- Users can create additional workspaces in `/data/workspaces/` for their own projects
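
The copy-if-missing logic itself lives in `start_server.sh`; the following is only a Python illustration of the same rule (the in-container location of `samples/` is assumed):

```python
import shutil
from pathlib import Path

SAMPLES = Path("samples")              # sample folders from the repository (location assumed)
WORKSPACES = Path("/data/workspaces")  # persistent storage on Hugging Face Spaces

WORKSPACES.mkdir(parents=True, exist_ok=True)
for sample_dir in (p for p in SAMPLES.iterdir() if p.is_dir()):
    workspace = WORKSPACES / sample_dir.name
    workspace.mkdir(exist_ok=True)
    for src in sample_dir.iterdir():
        dst = workspace / src.name
        if dst.exists():
            continue  # never overwrite user edits across container restarts
        if src.is_dir():
            shutil.copytree(src, dst)
        else:
            shutil.copy2(src, dst)
```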

## Critical EGL Configuration

The container requires specific EGL configuration for headless GPU rendering:

1. **NVIDIA EGL Vendor Config**: Created at `/usr/share/glvnd/egl_vendor.d/10_nvidia.json` pointing to `libEGL_nvidia.so.0`
2. **Library Path**: `LD_LIBRARY_PATH` includes `/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/lib/x86_64-linux-gnu:/usr/local/cuda/lib64`
3. **Runtime Symlinks**: `start_server.sh` creates symlinks to `libEGL_nvidia.so.0` from mounted NVIDIA directories
4. **Environment Variables**: `__EGL_VENDOR_LIBRARY_DIRS=/usr/share/glvnd/egl_vendor.d`
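
For reference, the vendor file is a small JSON pointer to the NVIDIA EGL library; this sketch prints what GLVND will load (the `file_format_version` value may differ):

```python
import json
from pathlib import Path

vendor_file = Path("/usr/share/glvnd/egl_vendor.d/10_nvidia.json")

# Typical contents: {"file_format_version": "1.0.0",
#                    "ICD": {"library_path": "libEGL_nvidia.so.0"}}
if vendor_file.exists():
    config = json.loads(vendor_file.read_text())
    print("EGL vendor library:", config.get("ICD", {}).get("library_path"))
else:
    print("Vendor config missing; GLVND cannot find the NVIDIA EGL driver.")
```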

### Troubleshooting EGL Issues

If MuJoCo rendering fails:
1. Verify NVIDIA drivers: `nvidia-smi` should show GPU info
2. Check EGL vendor config: `cat /usr/share/glvnd/egl_vendor.d/10_nvidia.json`
3. Verify library loading: `ldconfig -p | grep EGL`
4. Run comprehensive diagnostic: `python init_gpu.py`
5. Check that `MUJOCO_GL=egl` is set: `echo $MUJOCO_GL`

## Training Workflows

### General MuJoCo Simulation (tutorial.ipynb)

Basic simulation loop:
```python
import mujoco

# Minimal MJCF model so the snippet runs standalone.
xml = """
<mujoco>
  <worldbody>
    <body><freejoint/><geom type="sphere" size="0.1"/></body>
  </worldbody>
</mujoco>
"""
duration = 2.0  # simulated seconds

model = mujoco.MjModel.from_xml_string(xml)
data = mujoco.MjData(model)

# Simulation loop
mujoco.mj_resetData(model, data)
while data.time < duration:
    mujoco.mj_step(model, data)
    # Read sensors, apply controls, etc.
```

Rendering:
```python
with mujoco.Renderer(model, height, width) as renderer:
    mujoco.mj_forward(model, data)
    # `camera` is optional; omit it to render from the default free camera.
    renderer.update_scene(data, camera="camera_name")
    pixels = renderer.render()
```

### Locomotion Training (locomotion.ipynb)

Typical workflow using Brax + MuJoCo Playground:

1. **Load environment**: `env = registry.load(env_name)`
2. **Get config**: `env_cfg = registry.get_default_config(env_name)`
3. **Configure PPO**: `ppo_params = locomotion_params.brax_ppo_config(env_name)`
4. **Apply domain randomization**: `randomizer = registry.get_domain_randomizer(env_name)`
5. **Train**: Use `brax.training.agents.ppo.train` with the environment and randomization function
6. **Save checkpoints**: Policies saved to `checkpoints/{env_name}/{step}/`
7. **Fine-tune**: Restore from checkpoint and continue training with modified config
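
Condensed into code, the workflow looks roughly like the sketch below (modeled on the MuJoCo Playground locomotion notebook; the `wrap_env_fn`/`wrapper.wrap_for_brax_training` hook and exact keyword handling can vary across Brax/Playground versions). Checkpointing and fine-tuning (steps 6-7) are handled with Orbax in the notebook and are omitted here:

```python
import functools

from brax.training.agents.ppo import networks as ppo_networks
from brax.training.agents.ppo import train as ppo
from mujoco_playground import registry, wrapper
from mujoco_playground.config import locomotion_params

env_name = "Go1JoystickFlatTerrain"

env = registry.load(env_name)                             # 1. load environment
env_cfg = registry.get_default_config(env_name)           # 2. default config
ppo_params = locomotion_params.brax_ppo_config(env_name)  # 3. PPO hyperparameters
randomizer = registry.get_domain_randomizer(env_name)     # 4. domain randomization

# 5. Train: pull the nested network settings out of the flat PPO kwargs.
training_params = dict(ppo_params)
network_factory = functools.partial(
    ppo_networks.make_ppo_networks, **training_params.pop("network_factory", {})
)
make_inference_fn, params, metrics = ppo.train(
    environment=env,
    eval_env=registry.load(env_name, config=env_cfg),
    wrap_env_fn=wrapper.wrap_for_brax_training,  # assumed hook; see the notebook
    network_factory=network_factory,
    randomization_fn=randomizer,
    **training_params,
)
```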

Available environments:
- **Quadrupedal**: Go1JoystickFlatTerrain, Go1JoystickRoughTerrain, Go1Getup, Go1Handstand, Go1Footstand, SpotFlatTerrainJoystick, SpotGetup, SpotJoystickGaitTracking, BarkourJoystick
- **Bipedal**: BerkeleyHumanoidJoystickFlatTerrain, BerkeleyHumanoidJoystickRoughTerrain, G1JoystickFlatTerrain, G1JoystickRoughTerrain, H1InplaceGaitTracking, H1JoystickGaitTracking, Op3Joystick, T1JoystickFlatTerrain, T1JoystickRoughTerrain

Full list: `registry.locomotion.ALL_ENVS`

Key training techniques:
- **Domain Randomization**: Randomizes friction, armature, center of mass, link masses for sim-to-real transfer
- **Energy Penalties**: `energy_termination_threshold`, `reward_config.energy`, `reward_config.dof_acc` to control power consumption and smoothness
- **Curriculum Learning**: Fine-tune from checkpoints with progressively modified reward configs
- **Asymmetric Actor-Critic**: Actor receives proprioception, critic receives privileged simulation state
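
For example, a curriculum-style fine-tuning pass might tighten the energy terms in the environment config before restoring a checkpoint and retraining. A sketch only: the values below are placeholders, and not every environment exposes every field:

```python
from mujoco_playground import registry

env_name = "Go1Handstand"
env_cfg = registry.get_default_config(env_name)

# Placeholder values -- tune per robot and task.
env_cfg.energy_termination_threshold = 400
env_cfg.reward_config.energy = -0.003
env_cfg.reward_config.dof_acc = -2.5e-7

env = registry.load(env_name, config=env_cfg)
# Continue training from the saved policy, e.g. via Brax PPO's
# restore_checkpoint_path argument (name may differ across versions).
```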

### Manipulation Training (manipulation.ipynb)

Similar to locomotion but focuses on:
- **Pick-and-place tasks**: PandaPickCubeOrientation (trains in ~3 minutes on RTX 4090)
- **Dexterous manipulation**: LeapCubeReorient (trains in ~33 minutes on RTX 4090)
- **Asymmetric observations**: Use `policy_obs_key` and `value_obs_key` in PPO params to train actor on sensor-like data while critic gets privileged state

Available environments: `registry.manipulation.ALL_ENVS`
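
A minimal sketch of the asymmetric-observation wiring mentioned above; the nesting under `network_factory` and the key names `state`/`privileged_state` follow the Playground manipulation notebook and may differ between versions:

```python
from mujoco_playground.config import manipulation_params

ppo_params = manipulation_params.brax_ppo_config("LeapCubeReorient")

# Actor trains on sensor-like observations; critic sees privileged simulator state.
# The key names must match entries in the environment's observation dictionary.
ppo_params.network_factory.policy_obs_key = "state"
ppo_params.network_factory.value_obs_key = "privileged_state"
```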

### Motion Tracking (opentrack.ipynb)

OpenTrack workflow for humanoid motion tracking:
1. **Clone repository**: `git clone https://github.com/GalaxyGeneralRobotics/OpenTrack.git`
2. **Download mocap data**: From `huggingface.co/datasets/robfiras/loco-mujoco-datasets` (Lafan1/UnitreeG1)
3. **Train policy**: `python train_policy.py --exp_name debug --terrain_type flat_terrain`
4. **Convert checkpoint**: `python brax2torch.py --exp_name <exp_name>` (Brax β†’ PyTorch)
5. **Generate videos**: `python play_policy.py --exp_name <exp_name> --use_renderer`

## Python Dependencies

Core stack (see `requirements.txt`):
- **JupyterLab**: 4.4.3 (with tornado 6.2 for compatibility)
- **JAX**: CUDA 12 support via `jax[cuda12]`
- **MuJoCo**: 3.3+ with MuJoCo MJX (JAX-based physics)
- **Brax**: JAX-based RL framework for massively parallel training
- **MuJoCo Playground**: Collection of robot environments and training utilities
- **Supporting libraries**: mediapy (video rendering), ipywidgets, nvidia-cusparse-cu12

## File Structure

```
/
β”œβ”€β”€ Dockerfile                      # Container with CUDA 12.8 + EGL setup
β”œβ”€β”€ start_server.sh                 # Container entrypoint
β”œβ”€β”€ init_gpu.py                     # GPU validation script (comprehensive EGL tests)
β”œβ”€β”€ init_mujoco.py                  # MuJoCo Playground asset downloader
β”œβ”€β”€ requirements.txt                # Python dependencies
β”œβ”€β”€ packages.txt                    # System packages (currently empty)
β”œβ”€β”€ on_startup.sh                   # Custom startup commands (placeholder)
β”œβ”€β”€ login.html                      # Custom JupyterLab login page
└── samples/                        # Example notebooks (organized by topic)
    β”œβ”€β”€ tutorial/
    β”‚   └── tutorial.ipynb          # MuJoCo fundamentals (2258 lines)
    β”œβ”€β”€ locomotion/
    β”‚   └── locomotion.ipynb        # Robot locomotion (1762 lines)
    β”œβ”€β”€ manipulation/
    β”‚   └── manipulation.ipynb      # Robot manipulation (649 lines)
    └── opentrack/
        └── opentrack.ipynb         # Motion tracking (603 lines)
```

When deployed as a Hugging Face Space with persistent storage:
```
/data/                              # Persistent storage volume (mounted at runtime)
└── workspaces/                     # Sample workspaces (created by start_server.sh)
    β”œβ”€β”€ tutorial/
    β”‚   β”œβ”€β”€ tutorial.ipynb          # Copied from samples/, preserves user edits
    β”‚   └── ...                     # User-generated data, models, outputs
    β”œβ”€β”€ locomotion/
    β”‚   β”œβ”€β”€ locomotion.ipynb
    β”‚   β”œβ”€β”€ checkpoints/            # Training checkpoints
    β”‚   └── ...
    β”œβ”€β”€ manipulation/
    β”‚   β”œβ”€β”€ manipulation.ipynb
    β”‚   └── ...
    └── opentrack/
        β”œβ”€β”€ opentrack.ipynb
        β”œβ”€β”€ datasets/               # Downloaded mocap data
        β”œβ”€β”€ models/                 # Trained models
        └── videos/                 # Generated videos
```

## Performance Notes

- **Physics simulation**: Can achieve 50,000+ Hz on a single GPU with JAX/MJX
- **Rendering**: Typically 30-60 Hz, much slower than physics
- **Training times** (on RTX 4090 / L40S):
  - Simple manipulation: 3 minutes
  - Quadrupedal joystick: 7 minutes
  - Bipedal locomotion: 17 minutes
  - Dexterous manipulation: 33 minutes
- **Brax parallelization**: Uses thousands of parallel environments for fast training
- **Checkpointing**: Critical for curriculum learning and fine-tuning

## Common Patterns

### Visualization Options

```python
scene_option = mujoco.MjvOption()
scene_option.flags[mujoco.mjtVisFlag.mjVIS_JOINT] = True           # Show joints
scene_option.flags[mujoco.mjtVisFlag.mjVIS_CONTACTPOINT] = True   # Show contacts
scene_option.flags[mujoco.mjtVisFlag.mjVIS_CONTACTFORCE] = True   # Show forces
scene_option.flags[mujoco.mjtVisFlag.mjVIS_TRANSPARENT] = True    # Transparency
scene_option.flags[mujoco.mjtVisFlag.mjVIS_PERTFORCE] = True      # Show perturbations
```

### Named Access Pattern

```python
# Instead of using indices
model.geom_rgba[geom_id, :]

# Use named access
model.geom('green_sphere').rgba
data.geom('box').xpos
data.joint('swing').qpos
data.sensor('accelerometer').data
```

### Rendering Modes

- **RGB rendering**: `renderer.render()` - returns pixels
- **Depth rendering**: `renderer.enable_depth_rendering()` then `renderer.render()`
- **Segmentation**: `renderer.enable_segmentation_rendering()` - returns object IDs and types
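
A short sketch cycling through the three modes with one renderer (reusing `model` and `data` from the simulation loop example above):

```python
with mujoco.Renderer(model, height=240, width=320) as renderer:
    renderer.update_scene(data)
    rgb = renderer.render()        # (H, W, 3) uint8 color image

    renderer.enable_depth_rendering()
    depth = renderer.render()      # (H, W) float32 distances from the camera
    renderer.disable_depth_rendering()

    renderer.enable_segmentation_rendering()
    seg = renderer.render()        # (H, W, 2) int32: object ID and object type
    renderer.disable_segmentation_rendering()
```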

## Important Notes

- This is designed for Hugging Face Spaces with GPU instances (NVIDIA L40S or similar)
- All training uses JAX/Brax for massive parallelization across thousands of environments
- Policies are typically saved using Orbax checkpointing for fine-tuning
- Domain randomization is critical for sim-to-real transfer
- The environment supports multiple RL algorithms (PPO, SAC) through Brax
- Asymmetric actor-critic (different observations for policy and value function) is commonly used