MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head • Paper • 2601.07832 • Published 5 days ago • 43
DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs • Paper • 2601.03559 • Published 11 days ago • 12
nvidia/nemotron-speech-streaming-en-0.6b • Automatic Speech Recognition • Updated 12 days ago • 5.78k • 393
💧 LFM2 • Collection • LFM2 is a new generation of hybrid models, designed for on-device deployment. • 27 items • Updated 5 days ago • 136