Kaleido: Open-Sourced Multi-Subject Reference Video Generation Model Paper • 2510.18573 • Published 10 days ago • 1
From Denoising to Refining: A Corrective Framework for Vision-Language Diffusion Model Paper • 2510.19871 • Published 9 days ago • 28
HoloCine: Holistic Generation of Cinematic Multi-Shot Long Video Narratives Paper • 2510.20822 • Published 8 days ago • 37
DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion Paper • 2510.20766 • Published 8 days ago • 31
VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning Paper • 2510.08555 • Published 22 days ago • 62
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing Paper • 2510.19808 • Published 9 days ago • 27
Learning an Image Editing Model without Image Editing Pairs Paper • 2510.14978 • Published 15 days ago • 7
UniVideo: Unified Understanding, Generation, and Editing for Videos Paper • 2510.08377 • Published 22 days ago • 67
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation Paper • 2510.02283 • Published 29 days ago • 91
UniVid: Unifying Vision Tasks with Pre-trained Video Generation Models Paper • 2509.21760 • Published Sep 26 • 14
RecA Collection • Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning! • 8 items • Updated Sep 22 • 12
X-UniMotion: Animating Human Images with Expressive, Unified and Identity-Agnostic Motion Latents Paper • 2508.09383 • Published Aug 12 • 1
Lynx: Towards High-Fidelity Personalized Video Generation Paper • 2509.15496 • Published Sep 19 • 12
WorldForge: Unlocking Emergent 3D/4D Generation in Video Diffusion Model via Training-Free Guidance Paper • 2509.15130 • Published Sep 18 • 30
Voyager: Long-Range and World-Consistent Video Diffusion for Explorable 3D Scene Generation Paper • 2506.04225 • Published Jun 4 • 28
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25 • 202