MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer Paper • 2509.16197 • Published Sep 19 • 52
Granite Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 23 items • Updated 22 days ago • 199
Article Saving Memory Using Padding-Free Transformer Layers during Finetuning By mayank-mishra • Jun 11, 2024 • 20
Article DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive By bpan • Apr 9, 2024 • 30
LangNav: Language as a Perceptual Representation for Navigation Paper • 2310.07889 • Published Oct 11, 2023 • 6