Update README.md
README.md
CHANGED
|
@@ -1,5 +1,5 @@
|
|
| 1 |
---
|
| 2 |
-
title:
|
| 3 |
emoji: 🎵
|
| 4 |
colorFrom: purple
|
| 5 |
colorTo: gray
|
|
@@ -8,56 +8,17 @@ app_port: 7860
|
|
| 8 |
---
|
| 9 |
|
| 10 |
|
| 11 |
-
|
| 12 |
|
| 13 |
-
This repository is the official
|
| 14 |
|
| 15 |
-
|
| 16 |
|
| 17 |
-
|
| 18 |
|
| 19 |
-
|
| 20 |
-
You can install the necessary dependencies using the `requirements.txt` file with Python 3.8.12:
|
| 21 |
-
|
| 22 |
-
```bash
|
| 23 |
-
pip install -r requirements.txt
|
| 24 |
-
```
|
| 25 |
-
|
| 26 |
-
Then download the prebuilt FlashAttention wheel with wget and install it:
|
| 27 |
-
|
| 28 |
-
```bash
|
| 29 |
-
wget https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.4.post1/flash_attn-2.7.4.post1+cu12torch2.2cxx11abiFALSE-cp310-cp310-linux_x86_64.whl -P /home/
|
| 30 |
-
pip install /home/flash_attn-2.7.4.post1+cu12torch2.2cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
|
| 31 |
-
```
|
| 32 |
-
|
| 33 |
-
## Start with docker
|
| 34 |
-
```bash
|
| 35 |
-
docker pull juhayna/song-generation-levo:v0.1
|
| 36 |
-
docker run -it --gpus all --network=host juhayna/song-generation-levo:v0.1 /bin/bash
|
| 37 |
-
```
|
| 38 |
-
|
| 39 |
-
## Inference
|
| 40 |
-
|
| 41 |
-
Note that both folders below must be downloaded completely for the model to load correctly. They are available [here](https://huggingface.co/waytan22/SongGeneration).
|
| 42 |
-
|
| 43 |
-
- Save `ckpt` to the root directory
|
| 44 |
-
- Save `third_party` to the root directory
|
| 45 |
-
|
| 46 |
-
To run inference, use the following command:
|
| 47 |
-
|
| 48 |
-
```bash
|
| 49 |
-
sh generate.sh sample/lyric.jsonl sample/generate
|
| 50 |
-
```
|
| 51 |
-
- Input keys in the `sample/lyric.jsonl`
|
| 52 |
-
- `idx`: name of the generated song file
|
| 53 |
-
- `descriptions`: text description, can be None or specified gender, timbre, genre, mood, instrument and BPM
|
| 54 |
-
- `prompt_audio_path`: reference audio path, can be None or 10s song audio path
|
| 55 |
-
- `gt_lyric`: lyrics, it needs to follow the format of '\[Structure\] Text', supported structures can be found in `conf/vocab.yaml`
|
| 56 |
-
|
| 57 |
-
- Outputs in the folder `sample/generate`:
|
| 58 |
-
- `audio`: generated audio files
|
| 59 |
-
- `jsonl`: output jsonls
|
| 60 |
-
- `token`: Token corresponding to the generated audio files
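
As an illustration of the input keys listed above, the following is a minimal sketch of how one record for `sample/lyric.jsonl` might be assembled in Python. The helper names `make_entry` and `to_jsonl`, the sample values, and the `[verse]` structure tag are all hypothetical; the supported structure tags are defined in `conf/vocab.yaml`. The README does not say whether "None" means a JSON `null` or a missing key, so this sketch assumes unset optional keys are simply omitted.

```python
import json


def make_entry(idx, gt_lyric, descriptions=None, prompt_audio_path=None):
    """Build one input record with the four keys described in the README."""
    entry = {"idx": idx, "gt_lyric": gt_lyric}
    # Assumption: optional fields are left out when unset. The README only
    # says they "can be None"; it does not specify the on-disk encoding.
    if descriptions is not None:
        entry["descriptions"] = descriptions
    if prompt_audio_path is not None:
        entry["prompt_audio_path"] = prompt_audio_path
    return entry


def to_jsonl(entries):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "".join(json.dumps(e, ensure_ascii=False) + "\n" for e in entries)


# Hypothetical values for illustration only; check conf/vocab.yaml for the
# structure tags that the model actually supports.
entries = [
    make_entry(
        idx="sample_01",
        gt_lyric="[verse] Example lyric text",
        descriptions="female, pop, happy, piano",
    )
]
jsonl_text = to_jsonl(entries)
```

Writing `jsonl_text` to a file would produce an input in the one-object-per-line layout that the `generate.sh` command above takes as its first argument.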
|
| 61 |
|
| 62 |
## Note
|
| 63 |
|
|
|
|
| 1 |
---
|
| 2 |
+
title: Song Generation
|
| 3 |
emoji: 🎵
|
| 4 |
colorFrom: purple
|
| 5 |
colorTo: gray
|
|
|
|
| 8 |
---
|
| 9 |
|
| 10 |
|
| 11 |
+
<p align="center">
|
| 12 |
+
<a href="https://levo-demo.github.io/">Demo</a> | <a href="https://arxiv.org/abs/2506.07520">Paper</a> | <a href="https://github.com/tencent-ailab/songgeneration">Code</a>
|
| 13 |
+
</p>
|
| 14 |
|
| 15 |
+
This repository is the official weight repository for LeVo: High-Quality Song Generation with Multi-Preference Alignment. In this repository, we provide the SongGeneration model, inference scripts, and the checkpoint that has been trained on the Million Song Dataset.
|
| 16 |
|
| 17 |
+
## Overview
|
| 18 |
|
| 19 |
+
We develop the SongGeneration model. It is an LM-based framework consisting of **LeLM** and a **music codec**. LeLM models two types of tokens in parallel: mixed tokens, which represent the combined audio of vocals and accompaniment to achieve vocal-instrument harmony, and dual-track tokens, which separately encode vocals and accompaniment for high-quality song generation. The music codec reconstructs the dual-track tokens into high-fidelity music audio. SongGeneration significantly improves over open-source music generation models and performs competitively with current state-of-the-art industry systems. For more details, please refer to our [paper](https://arxiv.org/abs/2506.07520).
|
| 20 |
|
| 21 |
+
<img src="https://github.com/tencent-ailab/songgeneration/blob/main/img/over.jpg?raw=true" alt="img" style="zoom:100%;" />
|
| 22 |
|
| 23 |
## Note
|
| 24 |
|