xiangan committed · verified
Commit 9072dbf · 1 Parent(s): ae3e394

Update README.md

Files changed (1)
  1. README.md +18 -1
README.md CHANGED
@@ -199,4 +199,21 @@ If you find *LLaVA-OneVision-1.5* useful in your research, please consider to ci
  journal={Transactions on Machine Learning Research}
  year={2024}
  }
- ```
+ ```
+
+
+ ## Acknowledgement
+
+ We extend our sincere gratitude to the **AIAK team of the** [**Baige AI computing platform**](https://cloud.baidu.com/product/aihc.html) **from Baidu AI Cloud** for providing an exceptional training framework. AIAK-Training-LLM and AIAK-Megatron significantly accelerated our training and were instrumental in achieving our research goals. For full AIAK support, you can contact Baidu Cloud.
+
+ We acknowledge the support of [Synvo AI](https://synvo.ai/) for contributing part of the data annotation in this work, and we thank the maintainers and contributors of the following open-source projects, whose work greatly inspired and supported our research:
+
+ - LLaVA: Large Language-and-Vision Assistant — [LLaVA](https://github.com/haotian-liu/LLaVA)
+ - LLaVA-NeXT: Next-generation multi-modal assistant — [LLaVA-NeXT](https://github.com/LLaVA-VL/LLaVA-NeXT)
+ - lmms-eval: A standardized evaluation framework for Large Multimodal Models — [lmms-eval](https://github.com/EvolvingLMMs-Lab/lmms-eval)
+ - Megatron-LM: Efficient, scalable training for large language models — [Megatron-LM](https://github.com/NVIDIA/Megatron-LM)
+ - Qwen2.5-VL: Strong vision-language foundation model — [Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL)
+ - InternVL: Open-source large-scale vision-language foundation model — [InternVL](https://github.com/OpenGVLab/InternVL)
+ - Qwen3: Next-generation Qwen LLM — [Qwen](https://github.com/QwenLM/Qwen)
+ - MetaCLIP: Scalable contrastive pretraining — [MetaCLIP](https://github.com/facebookresearch/MetaCLIP)
+ - FineVision: Open Data Is All You Need — [FineVision](https://huggingface.co/spaces/HuggingFaceM4/FineVision)