Improving Recursive Transformers with Mixture of LoRAs • Paper • 2512.12880 • Published 16 days ago • 5
DCAgent2/claude-4-5-sonnet-thinking-stackexchange-overflow-32ep-32k-traces • Viewer • Updated 23 days ago • 3.77k • 81 • 1
SPICE: Self-Play In Corpus Environments Improves Reasoning • Paper • 2510.24684 • Published Oct 28 • 17
SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations • Paper • 2512.14080 • Published 15 days ago • 5
Trainable Log-linear Sparse Attention for Efficient Diffusion Transformers • Paper • 2512.16615 • Published 12 days ago • 4