From Pixels to Words -- Towards Native Vision-Language Primitives at Scale Paper • 2510.14979 • Published 12 days ago • 65
Article Model statistics of the 50 most downloaded entities on Hugging Face By lbourdois • 15 days ago • 26
Phoenix-VAD: Streaming Semantic Endpoint Detection for Full-Duplex Speech Interaction Paper • 2509.20410 • Published Sep 24 • 1
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation Paper • 2510.02283 • Published 26 days ago • 91
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing Paper • 2509.26346 • Published 28 days ago • 17
Article There is no such thing as a tokenizer-free lunch By catherinearnett • Sep 25 • 84
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intelligence • 3 items • Updated 1 day ago • 128
Article Make your ZeroGPU Spaces go brrr with PyTorch ahead-of-time compilation Sep 2 • 68
MoDA: Multi-modal Diffusion Architecture for Talking Head Generation Paper • 2507.03256 • Published Jul 4 • 2
FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation Paper • 2508.11255 • Published Aug 15 • 11
Article Why We Built the OpenMDW License: A Comprehensive License for ML Models By linuxfoundation • Jul 2 • 23
Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face Jul 29 • 190
GLM-4.5 Collection GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 11 items • Updated Aug 11 • 245