Reasoning with Sampling: Your Base Model is Smarter Than You Think Paper • 2510.14901 • Published 13 days ago • 41
Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation Paper • 2510.21583 • Published 5 days ago • 30
Video-As-Prompt: Unified Semantic Control for Video Generation Paper • 2510.20888 • Published 6 days ago • 41
DeepAgent: A General Reasoning Agent with Scalable Toolsets Paper • 2510.21618 • Published 5 days ago • 80
Generative Universal Verifier as Multimodal Meta-Reasoner Paper • 2510.13804 • Published 14 days ago • 24
SpaceVista: All-Scale Visual Spatial Reasoning from mm to km Paper • 2510.09606 • Published 19 days ago • 17
Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels Paper • 2510.06499 • Published 21 days ago • 31
StreamingVLM: Real-Time Understanding for Infinite Video Streams Paper • 2510.09608 • Published 19 days ago • 49
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published 20 days ago • 120
UniVideo: Unified Understanding, Generation, and Editing for Videos Paper • 2510.08377 • Published 20 days ago • 67
SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models Paper • 2510.06917 • Published 21 days ago • 34
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published 22 days ago • 52
VChain: Chain-of-Visual-Thought for Reasoning in Video Generation Paper • 2510.05094 • Published 23 days ago • 36
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published 23 days ago • 107