view article Article BioClinical ModernBERT: an example of continued pre-training of ModernBERT Sep 10 • 6
Running 3.6k The Ultra-Scale Playbook 🌌 3.6k The ultimate guide to training LLM on large GPU Clusters
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 431
Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation Paper • 2412.03304 • Published Dec 4, 2024 • 19