T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground Paper • 2512.10430 • Published 28 days ago • 113
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA Paper • 2510.04849 • Published Oct 6, 2025 • 114
Accelerating Nash Learning from Human Feedback via Mirror Prox Paper • 2505.19731 • Published May 26, 2025 • 6
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? Paper • 2502.14502 • Published Feb 20, 2025 • 91
SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators Paper • 2502.06394 • Published Feb 10, 2025 • 89