NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 6 items • Updated 5 days ago • 112
PaCoRe Collection Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning • 3 items • Updated 27 days ago • 8
Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand Dec 4, 2025 • 63
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length Paper • 2512.04677 • Published Dec 4, 2025 • 167
DR Tulu Collection Models and data associated with DR Tulu, http://allenai-web/papers/drtulu • 5 items • Updated Nov 25, 2025 • 31
ColModernVBERT Collection Resources for ColModernVBERT – the document retrieval–optimized variant of ModernVBERT • 5 items • Updated Oct 3, 2025 • 7
Qianfan-VL Collection Qianfan-VL model series. The models are mainly domain-enhanced vision-language models targeting enterprise-level multimodal understanding scenarios. • 4 items • Updated Sep 24, 2025 • 19
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer Paper • 2509.16197 • Published Sep 19, 2025 • 56
Tiny Language Model Datasets Collection Collection of synthetic datasets that can be used in pretraining any Tiny Language Model • 14 items • Updated Sep 21, 2025 • 29