Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence Paper • 2510.20579 • Published 6 days ago • 50
DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion Paper • 2510.20766 • Published 5 days ago • 30
Video-As-Prompt: Unified Semantic Control for Video Generation Paper • 2510.20888 • Published 5 days ago • 41
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing Paper • 2510.19808 • Published 6 days ago • 24
GigaBrain-0: A World Model-Powered Vision-Language-Action Model Paper • 2510.19430 • Published 7 days ago • 42
MUG-V 10B: High-efficiency Training Pipeline for Large Video Generation Models Paper • 2510.17519 • Published 9 days ago • 9
UltraGen: High-Resolution Video Generation with Hierarchical Attention Paper • 2510.18775 • Published 7 days ago • 15
Constantly Improving Image Models Need Constantly Improving Benchmarks Paper • 2510.15021 • Published 12 days ago • 5
ConsistEdit: Highly Consistent and Precise Training-free Visual Editing Paper • 2510.17803 • Published 8 days ago • 12
Visual Autoregressive Models Beat Diffusion Models on Inference Time Scaling Paper • 2510.16751 • Published 10 days ago • 19
PICABench: How Far Are We from Physically Realistic Image Editing? Paper • 2510.17681 • Published 8 days ago • 60
Imaginarium: Vision-guided High-Quality 3D Scene Layout Generation Paper • 2510.15564 • Published 12 days ago • 9
BLIP3o-NEXT: Next Frontier of Native Image Generation Paper • 2510.15857 • Published 11 days ago • 21
Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published 12 days ago • 47
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset Paper • 2510.15742 • Published 11 days ago • 49
LBM: Latent Bridge Matching for Fast Image-to-Image Translation Paper • 2503.07535 • Published Mar 10 • 4
Assembler: Scalable 3D Part Assembly via Anchor Point Diffusion Paper • 2506.17074 • Published Jun 20 • 2
Universal Image Restoration Pre-training via Masked Degradation Classification Paper • 2510.13282 • Published 14 days ago • 10
Trace Anything: Representing Any Video in 4D via Trajectory Fields Paper • 2510.13802 • Published 13 days ago • 30