Jason Weston

spermwhale

AI & ML interests

None yet

Recent Activity

upvoted a paper 22 days ago

AI & Human Co-Improvement for Safer Co-Superintelligence

commented on a paper 22 days ago

AI & Human Co-Improvement for Safer Co-Superintelligence

upvoted a paper about 2 months ago

Scaling Agent Learning via Experience Synthesis

Organizations

None yet

upvoted a paper 22 days ago

AI & Human Co-Improvement for Safer Co-Superintelligence

Paper • 2512.05356 • Published 25 days ago • 8

commented on a paper 22 days ago

AI & Human Co-Improvement for Safer Co-Superintelligence

Paper • 2512.05356 • Published 25 days ago • 8

upvoted a paper about 2 months ago

Scaling Agent Learning via Experience Synthesis

Paper • 2511.03773 • Published Nov 5 • 81

commented on 2 papers about 2 months ago

The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published Oct 30 • 116

The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published Oct 30 • 116

upvoted a paper 2 months ago

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28 • 17

commented on a paper 2 months ago

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28 • 17

upvoted a paper 2 months ago

The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

Paper • 2510.08240 • Published Oct 9 • 41

upvoted a paper 3 months ago

The Era of Real-World Human Interaction: RL from User Conversations

Paper • 2509.25137 • Published Sep 29 • 18

upvoted a paper 4 months ago

The Majority is not always right: RL training for solution aggregation

Paper • 2509.06870 • Published Sep 8 • 16

commented on a paper 4 months ago

The Majority is not always right: RL training for solution aggregation

Paper • 2509.06870 • Published Sep 8 • 16

upvoted a paper 4 months ago

Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published Sep 2 • 24

authored a paper 4 months ago

Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published Sep 2 • 24

upvoted a paper 4 months ago

StepWiser: Stepwise Generative Judges for Wiser Reasoning

Paper • 2508.19229 • Published Aug 26 • 20

upvoted a paper 7 months ago

Self-Challenging Language Model Agents

Paper • 2506.01716 • Published Jun 2 • 10

commented on a paper 7 months ago

Self-Challenging Language Model Agents

Paper • 2506.01716 • Published Jun 2 • 10

authored a paper 8 months ago

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

Paper • 2505.10320 • Published May 15 • 24

authored 2 papers 9 months ago

Multi-Token Attention

Paper • 2504.00927 • Published Apr 1 • 55

SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks

Paper • 2503.15478 • Published Mar 19 • 13

authored a paper 11 months ago

Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback

Paper • 2501.10799 • Published Jan 18 • 15
