KABI

dongguanting

https://dongguanting.github.io/

AI & ML interests

Reasoning and Alignment for Large Language Models

Recent Activity

authored a paper 1 day ago

DeepAgent: A General Reasoning Agent with Scalable Toolsets

upvoted a paper 1 day ago

ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints

upvoted a paper 1 day ago

DeepAnalyze: Agentic Large Language Models for Autonomous Data Science

Organizations

upvoted 4 papers 1 day ago

ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints

Paper • 2510.14847 • Published 12 days ago • 55

upvoted 2 papers 6 days ago

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

Paper • 2510.18927 • Published 7 days ago • 79

WithAnyone: Towards Controllable and ID Consistent Image Generation

Paper • 2510.14975 • Published 12 days ago • 79

upvoted 2 papers 7 days ago

Chem-R: Learning to Reason as a Chemist

Paper • 2510.16880 • Published 9 days ago • 51

MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning

Paper • 2510.14958 • Published 12 days ago • 22

upvoted 3 papers 8 days ago

Information Gain-based Policy Optimization: A Simple and Effective Approach for Multi-Turn LLM Agents

Paper • 2510.14967 • Published 12 days ago • 32

BitNet Distillation

Paper • 2510.13998 • Published 13 days ago • 51

Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation

Paper • 2510.17354 • Published 8 days ago • 32

upvoted 2 papers 9 days ago

Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents

Paper • 2510.14438 • Published 12 days ago • 12

Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset

Paper • 2510.15742 • Published 11 days ago • 49

upvoted 2 papers 12 days ago

Qwen3Guard Technical Report

Paper • 2510.14276 • Published 13 days ago • 13

Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published 12 days ago • 98

upvoted a collection 12 days ago

Qwen3-VL

Collection

25 items • Updated 7 days ago • 335

upvoted a collection 13 days ago

AEPO

Collection

The official datasets and model checkpoints of AEPO • 4 items • Updated 8 days ago • 3

upvoted 2 papers 27 days ago

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published 29 days ago • 134

Toward Effective Tool-Integrated Reasoning via Self-Evolved Preference Learning

Paper • 2509.23285 • Published Sep 27 • 13

upvoted a paper about 1 month ago

WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research

Paper • 2509.13312 • Published Sep 16 • 104
