-
Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
Paper • 2508.10751 • Published • 28 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 263 -
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers
Paper • 2508.14704 • Published • 43 -
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
Paper • 2508.16153 • Published • 160
Yenson
yenson-lau
·
AI & ML interests
None yet
Organizations
Papers
-
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
Paper • 2506.06395 • Published • 133 -
Magistral
Paper • 2506.10910 • Published • 66 -
Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs
Paper • 2506.07240 • Published • 7 -
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Paper • 2506.09991 • Published • 55
Starred
-
Pass@k Training for Adaptively Balancing Exploration and Exploitation of Large Reasoning Models
Paper • 2508.10751 • Published • 28 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 263 -
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers
Paper • 2508.14704 • Published • 43 -
AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs
Paper • 2508.16153 • Published • 160
Papers
-
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
Paper • 2506.06395 • Published • 133 -
Magistral
Paper • 2506.10910 • Published • 66 -
Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs
Paper • 2506.07240 • Published • 7 -
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Paper • 2506.09991 • Published • 55
datasets
0
None public yet