## Evaluation Results

All evaluations were conducted using [lmms_eval](https://github.com/EvolvingLMMs-Lab/lmms-eval).

| Benchmark                  | **LLaVA-OV-1.5-8B** | **Qwen2.5 VL 7B** |
|:---------------------------|:-------------------:|:-----------------:|
| MMMU (Validation)          | **55.44**           | 51.33             |
| MMMU-Pro (Standard)        | **37.40**           | 36.30             |
| MMMU-Pro (Vision)          | 25.15               | **32.83**         |
| MMBench (English; Test)    | **84.14**           | 83.40             |
| MMBench (Chinese; Test)    | 81.00               | **81.61**         |
| MME-RealWorld (English)    | **62.31**           | 57.33             |
| MME-RealWorld (Chinese)    | **56.11**           | 51.50             |
| AI2D (With Mask)           | **84.16**           | 82.58             |
| AI2D (Without Mask)        | **94.11**           | 93.36             |
| CV-Bench                   | **80.82**           | 79.95             |
| VL-RewardBench             | 45.90               | **49.65**         |
| V*                         | **78.01**           | 76.96             |
| PixmoCount                 | 62.19               | **63.33**         |
| CountBench                 | **88.19**           | 86.35             |
| ChartQA                    | **86.48**           | 84.08             |
| CharXiv (Direct Questions) | **74.10**           | 69.80             |
| DocVQA (Test)              | **95.00**           | 94.93             |
| InfoVQA (Test)             | 78.42               | **81.67**         |
| WeMath                     | **33.62**           | 33.33             |
| MathVista (Mini)           | **69.57**           | 68.60             |
| MathVision                 | **25.56**           | 22.37             |
| MMStar                     | **67.72**           | 62.54             |
| SEED-Bench (Image)         | 77.32               | **77.53**         |
| ScienceQA                  | **94.98**           | 88.75             |
| SEED-Bench 2-Plus          | 69.21               | **70.93**         |
| OCRBench                   | 82.90               | **84.20**         |
| RealWorldQA                | 68.10               | **68.50**         |
### Using 🤗 Transformers to Chat

Below is a code snippet showing how to chat with the model using `transformers` and `qwen_vl_utils`:
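The snippet itself did not survive extraction, so the sketch below follows the usual Qwen2.5-VL-style chat flow. The repo id, the `AutoModelForImageTextToText` loader class, and the image URL are assumptions — substitute whatever the model card specifies:

```python
MODEL_ID = "lmms-lab/LLaVA-OneVision-1.5-8B-Instruct"  # hypothetical repo id


def build_messages(image_url: str, question: str) -> list[dict]:
    """Assemble one user turn in the chat format expected by qwen_vl_utils."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


def chat(image_url: str, question: str, max_new_tokens: int = 256) -> str:
    # Heavy imports kept local so the message builder stays importable
    # without transformers / qwen_vl_utils installed.
    from transformers import AutoModelForImageTextToText, AutoProcessor
    from qwen_vl_utils import process_vision_info

    model = AutoModelForImageTextToText.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    processor = AutoProcessor.from_pretrained(MODEL_ID)

    messages = build_messages(image_url, question)
    # Render the chat template to a prompt string, then pull out the
    # vision inputs referenced by the messages.
    text = processor.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    image_inputs, video_inputs = process_vision_info(messages)
    inputs = processor(
        text=[text],
        images=image_inputs,
        videos=video_inputs,
        padding=True,
        return_tensors="pt",
    ).to(model.device)

    generated = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens before decoding the reply.
    trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, generated)]
    return processor.batch_decode(trimmed, skip_special_tokens=True)[0]
```

A call like `chat("https://example.com/demo.jpg", "Describe this image.")` then returns the model's answer as a string.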