Update README.md
README.md
CHANGED
@@ -11,7 +11,7 @@ datasets:
 
 # GAR-8B
 
-This repository contains the **GAR-8B** model, as presented in the paper [Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs](https://
+This repository contains the **GAR-8B** model, as presented in the paper [Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs](https://huggingface.co/papers/2510.18876).
 
 **TL; DR:** Our Grasp Any Region (GAR) supports both (1) describing a single region of an image or a video in the form of points/boxes/scribbles/masks in detail and (2) understanding multiple regions such as modeling interactions and performing complex reasoning. We also release a new benchmark, GARBench, to evaluate models on advanced region-level understanding tasks.
 