✨ 1T total / 50B active params per token
✨ 20T+ reasoning-dense tokens (Evo-CoT)
✨ 128K context via YaRN
✨ FP8 training: 15%+ faster, same precision as BF16
✨ Hybrid Syntax-Function-Aesthetics reward for front-end & visual generation
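For context, YaRN-style long-context extension is usually switched on through a `rope_scaling` entry in a Hugging Face-style `config.json`. A minimal sketch, with illustrative values only (the actual scaling factor and original context length for this model are not stated above):

```json
{
  "max_position_embeddings": 131072,
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 32768
  }
}
```

Here `factor` is the ratio between the extended and original context windows; 4.0 and 32768 are assumptions for illustration.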
Qwen 3 Coder is a personal attack on K2, and I love it. It achieves near-SOTA on LiveCodeBench without reasoning. Finally people are understanding that reasoning isn't necessary for high benchmarks...