Improve model card: Add pipeline tag, library name, code link, and update image paths

#1 opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +14 -9
README.md CHANGED
@@ -1,8 +1,13 @@
  ---
  license: apache-2.0
+ pipeline_tag: image-text-to-text
+ library_name: transformers
  ---
- 📖<a href="https://arxiv.org/abs/2509.22647">Paper</a> |🤗<a href="https://huggingface.co/internlm/CapRL-3B">CapRL-3B Model</a> |
- 🤗<a href="https://huggingface.co/datasets/internlm/CapRL-2M">CapRL-2M Dataset</a> |🤗<a href="https://huggingface.co/collections/long-xing1/caprl-68d64ac32ded31596c36e189">CapRL Collection</a> | 🤗<a href="https://huggingface.co/papers/2509.22647">Daily Paper</a>
+
+ # CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning
+
+ 📖<a href="https://huggingface.co/papers/2509.22647">Paper</a> | 💻<a href="https://github.com/InternLM/CapRL">Code</a> | 🤗<a href="https://huggingface.co/internlm/CapRL-3B">CapRL-3B Model</a> |
+ 🤗<a href="https://huggingface.co/datasets/internlm/CapRL-2M">CapRL-2M Dataset</a> |🤗<a href="https://huggingface.co/collections/long-xing1/caprl-68d64ac32ded31596c36e189">CapRL Collection</a>


  **CapRL-Eval-3B** is the model used for answering questions based on captions, and it is a finetuned version of Qwen2.5-VL-3B. When dealing with tasks such as ChartQA (not multiple-choice questions), it provides more stable output formatting.

@@ -25,10 +30,10 @@ By employing CapRL training framework, initializing with the Qwen2.5-VL-3B model
  filtered 75K QA dataset as the training set, we obtained a highly capable captioner, CapRL-3B.

  <p align="center">
- <img src="./assets/teaser.png" alt="Main Results on GPT2" width="750"/>
+ <img src="https://huggingface.co/internlm/CapRL-Eval-3B/resolve/main/assets/teaser.png" alt="Main Results on GPT2" width="750"/>
  </p>
  <p align="center">
- <img src="./assets/performance.png" alt="Main Results on GPT2" width="750"/>
+ <img src="https://huggingface.co/internlm/CapRL-Eval-3B/resolve/main/assets/performance.png" alt="Main Results on GPT2" width="750"/>
  </p>

  ## Key Features

@@ -105,16 +110,16 @@ print("Chat response:", chat_response)

  ## Cases
  <p align="center">
- <img src="./assets/comparison.png" alt="Main Results on GPT2" width="750"/>
+ <img src="https://huggingface.co/internlm/CapRL-Eval-3B/resolve/main/assets/comparison.png" alt="Main Results on GPT2" width="750"/>
  </p>

  <p align="center">
- <img src="./assets/info_caprl.png" alt="Main Results on GPT2" width="750"/>
+ <img src="https://huggingface.co/internlm/CapRL-Eval-3B/resolve/main/assets/info_caprl.png" alt="Main Results on GPT2" width="750"/>
  </p>

  <p align="center">
- <img src="./assets/info_caprl2.png" alt="Main Results on GPT2" width="750"/>
+ <img src="https://huggingface.co/internlm/CapRL-Eval-3B/resolve/main/assets/info_caprl2.png" alt="Main Results on GPT2" width="750"/>
  </p>
  <p align="center">
- <img src="./assets/natural_caprl.png" alt="Main Results on GPT2" width="750"/>
- </p>
+ <img src="https://huggingface.co/internlm/CapRL-Eval-3B/resolve/main/assets/natural_caprl.png" alt="Main Results on GPT2" width="750"/>
+ </p>
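
With `pipeline_tag: image-text-to-text` and `library_name: transformers` in the metadata, the Hub can list the model under the right task filter and show a transformers usage snippet. A minimal sketch of what that usage looks like, assuming the checkpoint loads like other Qwen2.5-VL models through the `image-text-to-text` pipeline; the image URL, prompt, and generation length are placeholders, not part of this repo:

```python
# Minimal sketch: load CapRL-Eval-3B via the image-text-to-text pipeline
# that the new metadata advertises. Assumes a recent transformers release
# with Qwen2.5-VL support; image URL and prompt are placeholders.
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="internlm/CapRL-Eval-3B")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/internlm/CapRL-Eval-3B/resolve/main/assets/teaser.png"},
            {"type": "text", "text": "Describe this image in detail."},
        ],
    }
]

# Returns a list with the chat plus the generated assistant turn.
out = pipe(text=messages, max_new_tokens=256)
print(out[0]["generated_text"])
```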