One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models Paper • 2511.10629 • Published Nov 13 • 122
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 550
Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models Paper • 2503.18446 • Published Mar 24 • 12
view article Article Run ComfyUI workflows for free with Gradio on Hugging Face Spaces Jan 14, 2024 • 95
DrawingSpinUp: 3D Animation from Single Character Drawings Paper • 2409.08615 • Published Sep 13, 2024 • 19
An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion Paper • 2408.03178 • Published Aug 6, 2024 • 40
SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement Paper • 2408.00653 • Published Aug 1, 2024 • 31
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models Paper • 2402.17177 • Published Feb 27, 2024 • 88
ScreenAI: A Vision-Language Model for UI and Infographics Understanding Paper • 2402.04615 • Published Feb 7, 2024 • 44
BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation Paper • 2401.17053 • Published Jan 30, 2024 • 33
Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation Paper • 2401.08559 • Published Jan 16, 2024 • 9
Seamless Communication Collection A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16, 2024 • 158