Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization Paper • 2408.14547 • Published Aug 26, 2024
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning Paper • 2308.12383 • Published Aug 23, 2023
Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing Paper • 2304.02051 • Published Apr 4, 2023
The (R)Evolution of Multimodal Large Language Models: A Survey Paper • 2402.12451 • Published Feb 19, 2024
LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On Paper • 2305.13501 • Published May 22, 2023
Multimodal-Conditioned Latent Diffusion Models for Fashion Image Editing Paper • 2403.14828 • Published Mar 21, 2024