263

Joakim Lee

Reinforcement4All

AI & ML interests

None yet

Organizations

None yet

upvoted 4 papers about 19 hours ago

Multi-Agent Evolve: LLM Self-Improve through Co-evolution

Paper • 2510.23595 • Published 4 days ago • 8

MergeMix: A Unified Augmentation Paradigm for Visual and Multi-Modal Understanding

Paper • 2510.23479 • Published 4 days ago • 14

LightBagel: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation

Paper • 2510.22946 • Published 4 days ago • 16

The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation

Paper • 2510.23393 • Published 4 days ago • 20

upvoted 8 papers about 20 hours ago

Knocking-Heads Attention

Paper • 2510.23052 • Published 4 days ago • 28

Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences

Paper • 2510.23451 • Published 4 days ago • 26

Ming-Flash-Omni: A Sparse, Unified Architecture for Multimodal Perception and Generation

Paper • 2510.24821 • Published 3 days ago • 27

Parallel Loop Transformer for Efficient Test-Time Computation Scaling

Paper • 2510.24824 • Published 3 days ago • 12

PairUni: Pairwise Training for Unified Multimodal Language Models

Paper • 2510.25682 • Published 1 day ago • 12

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published 1 day ago • 60

Reasoning-Aware GRPO using Process Mining

Paper • 2510.25065 • Published 2 days ago • 39

Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning

Paper • 2510.23473 • Published 4 days ago • 76

upvoted 8 papers 1 day ago

UltraHR-100K: Enhancing UHR Image Synthesis with A Large-Scale High-Quality Dataset

Paper • 2510.20661 • Published 8 days ago • 13

Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance

Paper • 2510.24711 • Published 3 days ago • 18

Beyond Reasoning Gains: Mitigating General Capabilities Forgetting in Large Reasoning Models

Paper • 2510.21978 • Published 6 days ago • 14

OSWorld-MCP: Benchmarking MCP Tool Invocation In Computer-Use Agents

Paper • 2510.24563 • Published 3 days ago • 22

ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality

Paper • 2510.22037 • Published 6 days ago • 14

AgentFrontier: Expanding the Capability Frontier of LLM Agents with ZPD-Guided Data Synthesis

Paper • 2510.24695 • Published 3 days ago • 21

WebLeaper: Empowering Efficiency and Efficacy in WebAgent via Enabling Info-Rich Seeking

Paper • 2510.24697 • Published 3 days ago • 20

ParallelMuse: Agentic Parallel Thinking for Deep Information Seeking

Paper • 2510.24698 • Published 3 days ago • 20