Shaobai Jiang

shaobaij

AI & ML interests

None yet

Organizations

None yet

upvoted 2 papers about 15 hours ago

VISTA: A Test-Time Self-Improving Video Generation Agent

Paper • 2510.15831 • Published 14 days ago • 19

Robust Layerwise Scaling Rules by Proper Weight Decay Tuning

Paper • 2510.15262 • Published 14 days ago • 5

upvoted 2 papers 1 day ago

Agentic Entropy-Balanced Policy Optimization

Paper • 2510.14545 • Published 15 days ago • 101

The Geometry of Reasoning: Flowing Logics in Representation Space

Paper • 2510.09782 • Published 21 days ago • 6

upvoted a paper 3 days ago

olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models

Paper • 2502.18443 • Published Feb 25 • 6

upvoted 5 papers 4 days ago

Attention Is All You Need for KV Cache in Diffusion LLMs

Paper • 2510.14973 • Published 15 days ago • 37

Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published 15 days ago • 43

Not All Bits Are Equal: Scale-Dependent Memory Optimization Strategies for Reasoning Models

Paper • 2510.10964 • Published 18 days ago • 2

LLMs Can Get "Brain Rot"!

Paper • 2510.13928 • Published 16 days ago • 21

Tensor Logic: The Language of AI

Paper • 2510.12269 • Published 17 days ago • 7

upvoted 10 papers 5 days ago

BitNet Distillation

Paper • 2510.13998 • Published 16 days ago • 51

Learning to Grasp Anything by Playing with Random Toys

Paper • 2510.12866 • Published 17 days ago • 5

VLA-0: Building State-of-the-Art VLAs with Zero Modification

Paper • 2510.13054 • Published 16 days ago • 9

Ctrl-World: A Controllable Generative World Model for Robot Manipulation

Paper • 2510.10125 • Published 20 days ago • 1

The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

Paper • 2510.08240 • Published 22 days ago • 40

Demystifying Reinforcement Learning in Agentic Reasoning

Paper • 2510.11701 • Published 18 days ago • 31

Dr.LLM: Dynamic Layer Routing in LLMs

Paper • 2510.12773 • Published 17 days ago • 31

Which Heads Matter for Reasoning? RL-Guided KV Cache Compression

Paper • 2510.08525 • Published 22 days ago • 22

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Paper • 2510.11696 • Published 18 days ago • 168

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published 22 days ago • 32