MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark Paper • 2410.19168 • Published Oct 24, 2024 • 23
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks Paper • 2411.05361 • Published Nov 8, 2024 • 3
DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment Paper • 2507.02768 • Published Jul 3 • 18
Investigating Safety Vulnerabilities of Large Audio-Language Models Under Speaker Emotional Variations Paper • 2510.16893 • Published 11 days ago • 17
SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models Paper • 2510.16917 • Published 11 days ago • 19
Awesome papers from 臺大李宏毅 (Hung-yi Lee) Collection • Recent papers authored by Hung-yi Lee. Sorted by ID • 8 items • Updated 6 days ago • 17
SHANKS: Simultaneous Hearing and Thinking for Spoken Language Models Paper • 2510.06917 • Published 22 days ago • 34
IMPACT: Iterative Mask-based Parallel Decoding for Text-to-Audio Generation with Diffusion Modeling Paper • 2506.00736 • Published May 31 • 10
Game-Time: Evaluating Temporal Dynamics in Spoken Language Models Paper • 2509.26388 • Published 30 days ago • 26