Article Asynchronous Robot Inference: Decoupling Action Prediction and Execution Jul 10 • 44
Article Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints May 1, 2024 • 80
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders Jul 9 • 701
💬Urdu ASR Models Collection Collection of fine-tuned Urdu speech recognition models. • 9 items • Updated Jul 14 • 2
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13 • 167
Article LeMaterial: an open source initiative to accelerate materials discovery and research Dec 10, 2024 • 54
D-FINE Collection State-of-the-art real-time object detection model with Apache 2.0 licence • 15 items • Updated May 5 • 55
Llama 4 Collection Meta's new Llama 4 multimodal models, Scout & Maverick. Includes Dynamic GGUFs, 16-bit & Dynamic 4-bit uploads. Run & fine-tune them with Unsloth! • 15 items • Updated 4 days ago • 50
MoshiVis v0.1 Collection MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 8 items • Updated Mar 21 • 23
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated Jul 21 • 159
Article DeepSearch Using Visual RAG in Agentic Frameworks 🔎 By paultltc and 1 other • Mar 21 • 37
Article Open-Source Handwritten Signature Detection Model By samuellimabraz • Mar 14 • 119
Gemma 3 Collection All versions of Google's new multimodal models including QAT in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 55 items • Updated 4 days ago • 90