Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections • Paper • 2507.00018 • Published Jun 15
XY-Tokenizer: Mitigating the Semantic-Acoustic Conflict in Low-Bitrate Speech Codecs • Paper • 2506.23325 • Published Jun 29
VStyle: A Benchmark for Voice Style Adaptation with Spoken Instructions • Paper • 2509.09716 • Published Sep 9 • 10
MOSS-Speech: Towards True Speech-to-Speech Models Without Text Guidance • Paper • 2510.00499 • Published Oct 1 • 18
Evaluating Hallucinations in Chinese Large Language Models • Paper • 2310.03368 • Published Oct 5, 2023
LLM can Achieve Self-Regulation via Hyperparameter Aware Generation • Paper • 2402.11251 • Published Feb 17, 2024 • 1
Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models • Paper • 2405.12939 • Published May 21, 2024 • 1
Case2Code: Learning Inductive Reasoning with Synthetic Data • Paper • 2407.12504 • Published Jul 17, 2024 • 8
Improving Contrastive Learning of Sentence Embeddings from AI Feedback • Paper • 2305.01918 • Published May 3, 2023
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective • Paper • 2412.14135 • Published Dec 18, 2024
Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities? • Paper • 2502.12215 • Published Feb 17 • 16
World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning • Paper • 2503.10480 • Published Mar 13 • 55
VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search • Paper • 2504.09130 • Published Apr 12 • 12
Dictionary Learning Improves Patch-Free Circuit Discovery in Mechanistic Interpretability: A Case Study on Othello-GPT • Paper • 2402.12201 • Published Feb 19, 2024
InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems • Paper • 2506.16381 • Published Jun 19 • 2