Qwen/Qwen2-VL-2B

jklj077 committed on Dec 6, 2024

Commit d3a53f2 · verified · 1 Parent(s): 258ddfb

Update README.md

Files changed (1)

README.md +5 -14

@@ -8,12 +8,16 @@ tags:
 library_name: transformers
 ---
-# Qwen2-VL-2B-Base
+# Qwen2-VL-2B
 ## Introduction
 We're excited to unveil **Qwen2-VL**, the latest iteration of our Qwen-VL model, representing nearly a year of innovation.
+> [!Important]
+> This is the base pretrained model of Qwen2-VL-2B without instruction tuning.
 ### What’s New in Qwen2-VL?
 #### Key Enhancements:
@@ -53,19 +57,6 @@ KeyError: 'qwen2_vl'
 ```
-## Limitations
-While Qwen2-VL are applicable to a wide range of visual tasks, it is equally important to understand its limitations. Here are some known restrictions:
-1. Lack of Audio Support: The current model does **not comprehend audio information** within videos.
-2. Data timeliness: Our image dataset is **updated until June 2023**, and information subsequent to this date may not be covered.
-3. Constraints in Individuals and Intellectual Property (IP): The model's capacity to recognize specific individuals or IPs is limited, potentially failing to comprehensively cover all well-known personalities or brands.
-4. Limited Capacity for Complex Instruction: When faced with intricate multi-step instructions, the model's understanding and execution capabilities require enhancement.
-5. Insufficient Counting Accuracy: Particularly in complex scenes, the accuracy of object counting is not high, necessitating further improvements.
-6. Weak Spatial Reasoning Skills: Especially in 3D spaces, the model's inference of object positional relationships is inadequate, making it difficult to precisely judge the relative positions of objects.
-These limitations serve as ongoing directions for model optimization and improvement, and we are committed to continually enhancing the model's performance and scope of application.
 ## Citation
 If you find our work helpful, feel free to give us a cite.