DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model Paper • 2405.04434 • Published May 7, 2024 • 22
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 130
Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels Aug 18 • 84
Article Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub Jun 12 • 148
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 421
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14 • 72
Article On the Shifting Global Compute Landscape By huggingface and 1 other • 4 days ago • 23
Article AI Policy @🤗: Response to the 2025 National AI R&D Strategic Plan By evijit and 2 others • Jun 2 • 14
Article 5 Things You Need to Know About Moonshot AI and Kimi K2, the New #1 model on the Hub By fdaudens and 1 other • Jul 15 • 24
The Gradient of Generative AI Release: Methods and Considerations Paper • 2302.04844 • Published Feb 5, 2023 • 8
Article What Open-Source Developers Need to Know about the EU AI Act's Rules for GPAI Models By yjernite and 5 others • Aug 4 • 28
Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face Jul 29 • 191
Article What is the Hugging Face Community Building? By evijit and 2 others • Jul 15 • 13