Learning Unmasking Policies for Diffusion Language Models Paper • 2512.09106 • Published 20 days ago • 8
DocReward: A Document Reward Model for Structuring and Stylizing Paper • 2510.11391 • Published Oct 13 • 27
UniGenX: Unified Generation of Sequence and Structure with Autoregressive Diffusion Paper • 2503.06687 • Published Mar 9 • 3
FineWeb: decanting the web for the finest text data at scale 🍷 Space • Featured • 1.24k • Generate high-quality text data for LLMs using FineWeb