shuai bai
committed on
Update README.md
README.md
CHANGED
@@ -428,7 +428,7 @@ The model supports a wide range of resolution inputs. By default, it uses the na
 min_pixels = 256 * 28 * 28
 max_pixels = 1280 * 28 * 28
 processor = AutoProcessor.from_pretrained(
-    "Qwen/Qwen2.5-VL-
+    "Qwen/Qwen2.5-VL-3B-Instruct", min_pixels=min_pixels, max_pixels=max_pixels
 )
 ```
 
@@ -478,6 +478,7 @@ To handle extensive inputs exceeding 32,768 tokens, we utilize [YaRN](https://ar
 
 For supported frameworks, you could add the following to `config.json` to enable YaRN:
 
+```
 {
     ...,
     "type": "yarn",
@@ -489,6 +490,7 @@ For supported frameworks, you could add the following to `config.json` to enable
     "factor": 4,
     "original_max_position_embeddings": 32768
 }
+```
 
 However, it should be noted that this method has a significant impact on the performance of temporal and spatial localization tasks, and is therefore not recommended for use.
 
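As a rough sanity check of the numbers touched by this diff, the sketch below converts the pixel budget into an approximate visual-token range and computes the context length that the YaRN settings would target. The helper names `visual_token_range` and `yarn_context_length` are hypothetical and not part of any library. The mapping of one visual token per 28x28-pixel patch is an assumption suggested by the `28 * 28` factors, and the factor-times-original-window rule is the usual reading of YaRN scaling rather than something this diff states.

```python
# Illustrative helpers only; these names are hypothetical, not a library API.

# Assumption: each 28x28-pixel patch contributes roughly one visual token.
PATCH_PIXELS = 28 * 28


def visual_token_range(min_pixels: int, max_pixels: int) -> tuple[int, int]:
    """Convert a pixel budget into an approximate (min, max) visual-token range."""
    return min_pixels // PATCH_PIXELS, max_pixels // PATCH_PIXELS


def yarn_context_length(factor: float, original_max_position_embeddings: int) -> int:
    """Context length targeted by YaRN scaling: factor times the original window."""
    return int(factor * original_max_position_embeddings)


if __name__ == "__main__":
    # Values copied from the README snippet in the diff.
    min_pixels = 256 * 28 * 28
    max_pixels = 1280 * 28 * 28
    print(visual_token_range(min_pixels, max_pixels))  # (256, 1280)

    # Values copied from the config.json fragment in the diff.
    print(yarn_context_length(4, 32768))  # 131072
```

Under these assumptions, the processor settings bound each image to roughly 256 to 1280 visual tokens, and the YaRN fragment extends the 32,768-token window to 131,072 tokens.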