Does your data spark joy? Performance gains from domain upsampling at the end of training Paper • 2406.03476 • Published Jun 5, 2024 • 4
Running 131 TxT360: Trillion Extracted Text 📖 131 Explore and analyze the TxT360 dataset for LLM pre-training
MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion Paper • 2502.04235 • Published Feb 6 • 23
Running 3.61k The Ultra-Scale Playbook 🌌 3.61k The ultimate guide to training LLM on large GPU Clusters