---
language:
- zh
- en
pipeline_tag: automatic-speech-recognition
---

# SongPrep

Demo  |  Paper  |  Code  |  Dataset

This repository is the official weight repository for SongPrep: A Preprocessing Framework and End-to-end Model for Full-song Structure Parsing and Lyrics Transcription. It provides the SongPrep-7B model, which was trained on the Million Song Dataset.

## Model Versions

| Model | #Params | HuggingFace |
| :------: | :-----: | :----------: |
| SongPrep | 7B | you are here |

## Citation

```
@misc{tan2025songpreppreprocessingframeworkendtoend,
  title={SongPrep: A Preprocessing Framework and End-to-end Model for Full-song Structure Parsing and Lyrics Transcription},
  author={Wei Tan and Shun Lei and Huaicheng Zhang and Guangzheng Li and Yixuan Zhang and Hangting Chen and Jianwei Yu and Rongzhi Gu and Dong Yu},
  year={2025},
  eprint={2509.17404},
  archivePrefix={arXiv},
  primaryClass={eess.AS},
  url={https://arxiv.org/abs/2509.17404},
}
```

## License

The code and weights in this repository are released under the terms of the [LICENSE](LICENSE) file.