SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space Paper • 2511.20102 • Published Nov 25 • 26
CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation Paper • 2502.21074 • Published Feb 28 • 4