YOLO-World: Real-Time Open-Vocabulary Object Detection Paper • 2401.17270 • Published Jan 30, 2024 • 42
Article Introducing ColQwen-Omni: Retrieve in every modality By manu and 4 others • Jul 17 • 75
FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language Paper • 2506.20920 • Published Jun 26 • 73
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13 • 167
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7 • 200
Article Open-Source Handwritten Signature Detection Model By samuellimabraz • Mar 14 • 119
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Mar 12 • 468
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published Dec 4, 2024 • 133
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 32 items • Updated Jul 10 • 150
LayerSkip Collection Models continually pretrained using LayerSkip - https://arxiv.org/abs/2404.16710 • 8 items • Updated Nov 21, 2024 • 48
Llama 3.2 Collection Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 27 items • Updated 3 days ago • 68
Article Llama can now see and run on your device - welcome Llama 3.2 Sep 25, 2024 • 191
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 639
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models Paper • 2409.16191 • Published Sep 24, 2024 • 42
VGGHeads: A Large-Scale Synthetic Dataset for 3D Human Heads Paper • 2407.18245 • Published Jul 25, 2024 • 12