StreamingVLM: Real-Time Understanding for Infinite Video Streams Paper • 2510.09608 • Published 18 days ago • 49
Paper2Video: Automatic Video Generation from Scientific Papers Paper • 2510.05096 • Published 22 days ago • 107
Code2Video: A Code-centric Paradigm for Educational Video Generation Paper • 2510.01174 • Published 27 days ago • 33
Qwen3 Omni Demo ⚡ Interact with a multimodal chatbot using text, audio, images, or video • Running • 175
Robix: A Unified Model for Robot Interaction, Reasoning and Planning Paper • 2509.01106 • Published Sep 1 • 48
Draw-In-Mind: Learning Precise Image Editing via Chain-of-Thought Imagination Paper • 2509.01986 • Published Sep 2 • 3
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning Paper • 2509.02544 • Published Sep 2 • 122