Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published 17 days ago • 160
OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models Paper • 2509.17627 • Published Sep 22 • 65
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning Paper • 2509.08519 • Published Sep 10 • 126
UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward Paper • 2509.06818 • Published Sep 8 • 29
USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning Paper • 2508.18966 • Published Aug 26 • 56
Phantom-Data : Towards a General Subject-Consistent Video Generation Dataset Paper • 2506.18851 • Published Jun 23 • 30
Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation Paper • 2506.09350 • Published Jun 11 • 48
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 130
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation Paper • 2504.02160 • Published Apr 2 • 37
Phantom: Subject-consistent video generation via cross-modal alignment Paper • 2502.11079 • Published Feb 16 • 59
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation Paper • 2306.07954 • Published Jun 13, 2023 • 111