---
title: RTMO Checkpoint Tester
emoji: π
colorFrom: pink
colorTo: green
sdk: gradio
sdk_version: 5.27.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: RTMO PyTorch Checkpoint Tester
---
# RTMO PyTorch Checkpoint Tester
This HuggingFace Space provides a real-time 2D multi-person pose estimation demo using the RTMO model from OpenMMLab, accelerated with ZeroGPU. It supports both image and video inputs.
## Features
- Remote Checkpoint Selection: Choose from multiple pre-trained variants (COCO, BODY7, CrowdPose, retrainable RTMO-s) via a dropdown.
- Custom Checkpoint Upload: Upload your own `.pth` file; the application auto-detects RTMO-t/s/m/l variants.
- Image Input: Upload images for single-frame pose estimation.
- Video Input: Upload video files (e.g., `.mp4`, `.mov`, `.avi`, `.mkv`, `.webm`) to perform pose estimation on video sequences and view annotated outputs.
- Threshold Adjustment: Fine-tune the Bounding Box Threshold and NMS Threshold sliders to refine detections.
- Example Images: Three license-free images with people are included for quick testing via the Examples panel.
- ZeroGPU Acceleration: Utilizes the `@spaces.GPU()` decorator for GPU inference on HuggingFace Spaces (see the minimal sketch after this list).
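The sketch below shows the general pattern of a ZeroGPU-decorated Gradio entry point. The function name, inputs, and default slider values are illustrative assumptions, not the Space's exact signature:

```python
# Minimal sketch: ZeroGPU acceleration in a Gradio Space.
# Function name, inputs, and default values are illustrative; app.py may differ.
import gradio as gr
import spaces


@spaces.GPU()  # ZeroGPU attaches a GPU only while this function executes
def predict(image_path, bbox_thr, nms_thr):
    # ... run RTMO inference here and return the annotated image ...
    return image_path


demo = gr.Interface(
    fn=predict,
    inputs=[
        gr.Image(type="filepath", label="Upload Image"),
        gr.Slider(0.0, 1.0, value=0.3, label="Bounding Box Threshold"),   # assumed default
        gr.Slider(0.0, 1.0, value=0.65, label="NMS Threshold"),           # assumed default
    ],
    outputs=gr.Image(label="Annotated Image"),
)

if __name__ == "__main__":
    demo.launch()
```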
## Usage
- Upload Image: Drag-and-drop or select an image in the Upload Image component (or choose from Examples).
- Upload Video: Drag-and-drop or select a video file in the Upload Video component.
- Select Remote Checkpoint: Pick a preloaded variant from the dropdown menu.
- (Optional) Upload Your Own Checkpoint: Provide a `.pth` file to override the remote selection; the model variant is detected automatically.
- Adjust Thresholds: Set the Bounding Box Threshold (`bbox_thr`) and NMS Threshold (`nms_thr`) to control confidence and suppression behavior (see the sketch after this list).
- Run Inference: Click Run Inference.
- View Results:
  - For images, the annotated image will appear in the Annotated Image panel.
  - For videos, the annotated video will appear in the Annotated Video panel. The active checkpoint name will appear below.
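For reference, a minimal sketch of how these thresholds can be passed to MMPose's `MMPoseInferencer` outside the UI. The model alias, weights path, and threshold values are assumptions for illustration; whether `nms_thr` takes effect depends on the installed MMPose version:

```python
# Minimal sketch: running RTMO inference with explicit thresholds.
# Model alias, weights path, and values below are illustrative assumptions.
from mmpose.apis import MMPoseInferencer

inferencer = MMPoseInferencer(
    pose2d="rtmo-s_8xb32-600e_coco-640x640",  # assumed model alias / config name
    pose2d_weights="/tmp/rtmo-s_coco.pth",    # assumed local checkpoint path
)

result_generator = inferencer(
    "person.jpg",           # image (or video) path
    bbox_thr=0.3,           # drop low-confidence detections
    nms_thr=0.65,           # suppress overlapping detections
    vis_out_dir="/tmp/vis", # save annotated output here
)
results = list(result_generator)  # the inferencer yields one result per frame
```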
## Remote Checkpoints
The following variants are available out of the box:
- `rtmo-s_8xb32-600e_coco`
- `rtmo-m_16xb16-600e_coco`
- `rtmo-l_16xb16-600e_coco`
- `rtmo-t_8xb32-600e_body7`
- `rtmo-s_8xb32-600e_body7`
- `rtmo-m_16xb16-600e_body7`
- `rtmo-l_16xb16-600e_body7`
- `rtmo-s_8xb32-700e_crowdpose`
- `rtmo-m_16xb16-700e_crowdpose`
- `rtmo-l_16xb16-700e_crowdpose`
- `rtmo-s_coco_retrainable` (from Hugging Face)
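As a rough illustration of the download-on-demand behavior described under Implementation Details, the sketch below fetches a selected checkpoint to `/tmp/{key}.pth`. The URL table is a placeholder, not the Space's actual source list:

```python
# Minimal sketch: download a remote checkpoint to /tmp/{key}.pth on demand.
# The URL mapping is a placeholder; the Space's real download sources may differ.
import os
import urllib.request

CHECKPOINT_URLS = {
    "rtmo-s_8xb32-600e_coco": "https://example.com/rtmo-s_8xb32-600e_coco.pth",  # placeholder
}


def fetch_checkpoint(key: str) -> str:
    """Return a local path for the requested checkpoint, downloading it if needed."""
    path = f"/tmp/{key}.pth"
    if not os.path.exists(path):
        urllib.request.urlretrieve(CHECKPOINT_URLS[key], path)
    return path
```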
## Implementation Details
- GPU Decorator: `@spaces.GPU()` marks the `predict` function for GPU execution under ZeroGPU.
- Inference API: Leverages `MMPoseInferencer` from MMPose with `pose2d`, `pose2d_weights`, and category `[0]` for person detection.
- Monkey-Patch: Applies a regex patch to bypass mmdet's MMCV version assertion for compatibility.
- Variant Detection: Inspects the `backbone.stem.conv.conv.weight` channels in the checkpoint to select the correct RTMO variant (see the sketch after this list).
- Checkpoint Management: Remote files are downloaded to `/tmp/{key}.pth` on demand; uploaded checkpoints use the provided local path.
- Image & Video Support: The `predict` function automatically handles both image and video inputs, producing annotated frames or an annotated video.
- Output: Saves visualization images or videos to `/tmp/vis` and displays them in the UI panels.
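A minimal sketch of the variant-detection idea, assuming the checkpoint stores its weights under the usual `state_dict` key. The channel-to-variant mapping below is an illustrative assumption, not the app's exact table:

```python
# Minimal sketch: guess the RTMO variant from the stem convolution's output channels.
# The channel-to-variant mapping is an illustrative assumption.
import torch

# Hypothetical mapping from stem output channels to RTMO variant.
STEM_CHANNELS_TO_VARIANT = {
    24: "rtmo-t",
    32: "rtmo-s",
    48: "rtmo-m",
    64: "rtmo-l",
}


def detect_variant(ckpt_path: str) -> str:
    ckpt = torch.load(ckpt_path, map_location="cpu")
    state_dict = ckpt.get("state_dict", ckpt)  # also handle raw state dicts
    weight = state_dict["backbone.stem.conv.conv.weight"]
    out_channels = weight.shape[0]  # conv weights are (out_channels, in_channels, kH, kW)
    return STEM_CHANNELS_TO_VARIANT.get(out_channels, "rtmo-s")  # fall back to a default
```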
## Files
- `app.py`: Main Gradio application script.
- `requirements.txt`: Python dependencies, including MMCV and MMPose.
- `README.md`: This documentation file.