Update README.md
README.md
CHANGED
@@ -8,12 +8,16 @@ tags:
 library_name: transformers
 ---
 
-# Qwen2-VL-2B
+# Qwen2-VL-2B
 
 ## Introduction
 
 We're excited to unveil **Qwen2-VL**, the latest iteration of our Qwen-VL model, representing nearly a year of innovation.
 
+> [!Important]
+> This is the base pretrained model of Qwen2-VL-2B without instruction tuning.
+
+
 ### What’s New in Qwen2-VL?
 
 #### Key Enhancements:
@@ -53,19 +57,6 @@ KeyError: 'qwen2_vl'
 ```
 
 
-## Limitations
-
-While Qwen2-VL are applicable to a wide range of visual tasks, it is equally important to understand its limitations. Here are some known restrictions:
-
-1. Lack of Audio Support: The current model does **not comprehend audio information** within videos.
-2. Data timeliness: Our image dataset is **updated until June 2023**, and information subsequent to this date may not be covered.
-3. Constraints in Individuals and Intellectual Property (IP): The model's capacity to recognize specific individuals or IPs is limited, potentially failing to comprehensively cover all well-known personalities or brands.
-4. Limited Capacity for Complex Instruction: When faced with intricate multi-step instructions, the model's understanding and execution capabilities require enhancement.
-5. Insufficient Counting Accuracy: Particularly in complex scenes, the accuracy of object counting is not high, necessitating further improvements.
-6. Weak Spatial Reasoning Skills: Especially in 3D spaces, the model's inference of object positional relationships is inadequate, making it difficult to precisely judge the relative positions of objects.
-
-These limitations serve as ongoing directions for model optimization and improvement, and we are committed to continually enhancing the model's performance and scope of application.
-
 ## Citation
 
 If you find our work helpful, feel free to give us a cite.