DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model Paper • 2405.04434 • Published May 7, 2024 • 24
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 137
Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels Aug 18 • 88
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 430
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14 • 73
Article 5 Things You Need to Know About Moonshot AI and Kimi K2, the New #1 model on the Hub Jul 15 • 24
The Gradient of Generative AI Release: Methods and Considerations Paper • 2302.04844 • Published Feb 5, 2023 • 8
Article What Open-Source Developers Need to Know about the EU AI Act's Rules for GPAI Models Aug 4 • 28
Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face Jul 29 • 202