MOSPA: Human Motion Generation Driven by Spatial Audio Paper • 2507.11949 • Published Jul 16, 2025 • 24
SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation Paper • 2507.09862 • Published Jul 14, 2025 • 50
SkillMimic-V2: Learning Robust and Generalizable Interaction Skills from Sparse and Noisy Demonstrations Paper • 2505.02094 • Published May 4, 2025 • 18
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization Paper • 2503.19901 • Published Mar 25, 2025 • 41 • 3
MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space Paper • 2503.15451 • Published Mar 19, 2025 • 17
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization Paper • 2503.19901 • Published Mar 25, 2025 • 41
MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space Paper • 2503.15451 • Published Mar 19, 2025 • 17
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization Paper • 2503.19901 • Published Mar 25, 2025 • 41 • 3
TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization Paper • 2503.19901 • Published Mar 25, 2025 • 41