miipher-2-HuBERT-HiFi-GAN-v0.1 / README_spaces.md

Ayuto

init

909e414 4 months ago

metadata

title: Miipher-2 Speech Enhancement Demo
emoji: 🎵
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: apache-2.0

Miipher-2 Speech Enhancement Demo

Miipher-2 is a speech enhancement system that uses Parallel Adapters inserted into mHuBERT layers to improve audio quality.

Features

Real-time speech enhancement from noisy or degraded audio
Parallel Adapter architecture for efficient fine-tuning
Lightning SSL-Vocoder for high-quality audio synthesis
Easy-to-use Gradio interface

Model Architecture

SSL Feature Extractor: mHuBERT-147 (Layer 6)
Parallel Adapter: Lightweight feedforward network
Lightning SSL-Vocoder: HiFi-GAN-based vocoder
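
The Parallel Adapter idea can be illustrated with a minimal NumPy sketch. This README states only that the adapter is a lightweight feedforward network inserted into mHuBERT layers; the bottleneck shape, the ReLU activation, the zero-initialised up-projection, and the `frozen_layer` stand-in are all assumptions made for illustration, not the project's actual implementation.

```python
import numpy as np


class ParallelAdapter:
    """Small bottleneck feedforward network (illustrative, not the project's code)."""

    def __init__(self, dim: int, bottleneck: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        # Down-projection into a narrow bottleneck (assumed design).
        self.w_down = rng.normal(0.0, 0.02, size=(dim, bottleneck))
        self.b_down = np.zeros(bottleneck)
        # Up-projection starts at zero, so the adapter is initially a no-op (assumed design).
        self.w_up = np.zeros((bottleneck, dim))
        self.b_up = np.zeros(dim)

    def __call__(self, x: np.ndarray) -> np.ndarray:
        hidden = np.maximum(x @ self.w_down + self.b_down, 0.0)  # ReLU
        return hidden @ self.w_up + self.b_up


def frozen_layer(x: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a frozen pretrained sub-layer."""
    return np.tanh(x)


def adapted_layer(x: np.ndarray, adapter: ParallelAdapter) -> np.ndarray:
    # "Parallel" means both branches read the same input and their outputs are summed.
    return frozen_layer(x) + adapter(x)
```

Under these assumptions only the adapter weights would be trained while the pretrained layer stays frozen, which is what makes this style of fine-tuning cheap in parameters.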

Usage

Upload an audio file or record using your microphone
Click "音声を修復" (Enhance Audio)
Listen to the enhanced audio output

Models

The demo automatically downloads the unified model from:

Complete Model: Atotti/miipher-2-HuBERT-HiFi-GAN-v0.1 (includes both Adapter and Vocoder)
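
To fetch the same weights outside the demo, something like the following may work. It is a sketch that assumes the repository above is a Hugging Face model repo and that `huggingface_hub` is installed; `download_model` is a hypothetical helper name, and the file layout inside the repo is not documented here.

```python
from pathlib import Path
from typing import Optional

# Repository ID as given in this README.
REPO_ID = "Atotti/miipher-2-HuBERT-HiFi-GAN-v0.1"


def download_model(cache_dir: Optional[str] = None) -> Path:
    """Download all files of the model repo and return the local snapshot directory."""
    # Imported lazily so this module loads even when huggingface_hub is missing.
    from huggingface_hub import snapshot_download

    # Assumption: the repo is a model repo (snapshot_download's default repo type).
    local_dir = snapshot_download(repo_id=REPO_ID, cache_dir=cache_dir)
    return Path(local_dir)


# Example (requires network access):
#   model_dir = download_model()
#   print(sorted(p.name for p in model_dir.iterdir()))
```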

Technical Details

Input: Audio files (WAV, MP3, FLAC)
Output: Enhanced audio at 22050 Hz
Supported Languages: Trained primarily on Japanese; also works with other languages
Processing: Real-time inference on CPU/GPU
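
The figures above can be turned into a few small checks. This sketch uses only the formats and sample rate listed here; the helper names are hypothetical, and the real-time factor (processing time divided by audio duration, with values below 1 meaning faster than real time) is a common convention rather than a definition given by this project.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac"}  # input formats listed above
OUTPUT_SAMPLE_RATE_HZ = 22050                     # output sample rate listed above


def is_supported_input(path: str) -> bool:
    """Return True if the file extension is one of the listed input formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def expected_output_samples(duration_s: float) -> int:
    """Number of samples in the enhanced output for a clip of the given length."""
    return round(duration_s * OUTPUT_SAMPLE_RATE_HZ)


def real_time_factor(processing_s: float, audio_s: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    if audio_s <= 0:
        raise ValueError("audio duration must be positive")
    return processing_s / audio_s
```

For example, a 2-second clip yields 44100 output samples, and enhancing it in 0.5 seconds corresponds to a real-time factor of 0.25.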

License

Apache-2.0

Citation

If you use Miipher-2 in your research, please cite:

@article{miipher2,
  title={Miipher-2: Speech Enhancement with Parallel Adapters},
  author={Your Name},
  year={2024}
}