SAO-Instruct: Free-form Audio Editing using Natural Language Instructions Paper • 2510.22795 • Published Oct 26, 2025 • 5
Seeing Voices: Generating A-Roll Video from Audio with Mirage Paper • 2506.08279 • Published Jun 9, 2025 • 27
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20, 2025 • 156
LatentSwap: An Efficient Latent Code Mapping Framework for Face Swapping Paper • 2402.18351 • Published Feb 28, 2024 • 2