Spaces:
Sleeping
Sleeping
A newer version of the Gradio SDK is available:
5.49.1
CLI
WIP
Dataset
Dataset.bat webui (python webui_dataset.py) consists of slice audio and transcribe wavs.
Slice audio
python slice.py -i <input_dir> -o <output_dir> -m <min_sec> -M <max_sec>
Required:
input_dir: Path to the directory containing the audio files to slice.output_dir: Path to the directory where the sliced audio files will be saved.
Optional:
min_sec: Minimum duration of the sliced audio files in seconds (default 2).max_sec: Maximum duration of the sliced audio files in seconds (default 12).
Transcribe wavs
python transcribe.py -i <input_dir> -o <output_file> --speaker_name <speaker_name>
Required:
input_dir: Path to the directory containing the audio files to transcribe.output_file: Path to the file where the transcriptions will be saved.speaker_name: Name of the speaker.
Optional
--initial_prompt: Initial prompt to use for the transcription (default value is specific to Japanese).--device:cudaorcpu(default:cuda).--language:jp,en, oren(default:jp).--model: Whisper model, default:large-v3--compute_type: default:bfloat16
Train
Train.bat webui (python webui_train.py) consists of the following.
Preprocess audio
python resample.py -i <input_dir> -o <output_dir> [--normalize] [--trim]
Required:
input_dir: Path to the directory containing the audio files to preprocess.output_dir: Path to the directory where the preprocessed audio files will be saved.
TO BE WRITTEN (WIP)
これいる?