Wav2Gloss: Generating Interlinear Glossed Text from Speech • Paper • 2403.13169 • Published Mar 19, 2024
On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models • Paper • 2406.09282 • Published Jun 13, 2024
ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration • Paper • 2409.09506 • Published Sep 14, 2024 • 4
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks • Paper • 2411.05361 • Published Nov 8, 2024 • 3
OpenBEATs: A Fully Open-Source General-Purpose Audio Encoder • Paper • 2507.14129 • Published Jul 18, 2025 • 10
POWSM: A Phonetic Open Whisper-Style Speech Foundation Model • Paper • 2510.24992 • Published Oct 28, 2025 • 4
juice500/s3m-pr-ft-wavlm-ft-superblibri-phone-noblank-v1 • Feature Extraction • 0.3B • Updated Apr 15, 2025
juice500/s3m-pr-ft-wavlm-superblibri-phone-noblank-v1 • Feature Extraction • 0.3B • Updated Apr 8, 2025 • 1