The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain Paper • 2509.26507 • Published Sep 30 • 537
The Ultra-Scale Playbook 🌌 Space • The ultimate guide to training LLM on large GPU Clusters • 3.6k
Bielik-11B-v2.3 Collection A collection of models based on Bielik-11B-v2.3 (merge of Bielik models) - instruct and quantized versions. • 9 items • Updated Jun 6 • 24
Bielik 7B v0.1: A Polish Language Model -- Development, Insights, and Evaluation Paper • 2410.18565 • Published Oct 24, 2024 • 47