Feature Extraction
Transformers
Safetensors
English
GAR
custom_code
HaochenWang committed on
Commit 914d9cb · verified · 1 Parent(s): ef0333a

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -11,7 +11,7 @@ datasets:
 
 # GAR-8B
 
-This repository contains the **GAR-8B** model, as presented in the paper [Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs](https://github.com/Haochen-Wang409/Grasp-Any-Region).
+This repository contains the **GAR-8B** model, as presented in the paper [Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs](https://huggingface.co/papers/2510.18876).
 
 **TL; DR:** Our Grasp Any Region (GAR) supports both (1) describing a single region of an image or a video in the form of points/boxes/scribbles/masks in detail and (2) understanding multiple regions such as modeling interactions and performing complex reasoning. We also release a new benchmark, GARBench, to evaluate models on advanced region-level understanding tasks.
 