AIvry committed on
Commit e2b4ce9 · verified · 1 Parent(s): 025216d

Update app.py

Files changed (1)
  1. app.py +14 -33
app.py CHANGED
@@ -176,7 +176,7 @@ def create_interface():
  ```
 
  ### Audio Requirements
- - Format: WAV files
  - Sample rate: Any (automatically resampled to 16kHz)
  - Channels: Mono or stereo (converted to mono)
  - Number of files: Equal number of references and outputs
@@ -184,9 +184,9 @@ def create_interface():
  ## Output Format
 
  The tool generates a ZIP file containing:
- - `ps_scores_{model}.csv`: PS scores for each speaker/source
- - `pm_scores_{model}.csv`: PM scores for each speaker/source
- - `params.json`: Experiment parameters used
  - `manifest_canonical.json`: File mapping and processing details
 
  ## Available Models
@@ -194,14 +194,14 @@ def create_interface():
  | Model | Description | Default Layer | Use Case |
  |-------|-------------|---------------|----------|
  | `raw` | Raw waveform features | N/A | Baseline comparison |
- | `wavlm` | WavLM Large | 24 | Best overall performance |
- | `wav2vec2` | Wav2Vec2 Large | 24 | Strong performance |
- | `hubert` | HuBERT Large | 24 | Good for speech |
- | `wavlm_base` | WavLM Base | 12 | Faster, good quality |
- | `wav2vec2_base` | Wav2Vec2 Base | 12 | Faster processing |
- | `hubert_base` | HuBERT Base | 12 | Faster for speech |
  | `wav2vec2_xlsr` | Wav2Vec2 XLSR-53 | 24 | Multilingual |
- | `ast` | Audio Spectrogram Transformer | 12 | General audio |
 
  ## Parameters
 
@@ -213,7 +213,7 @@ def create_interface():
 
  ## Citation
 
- If you use MAPSS in your research, please cite:
 
  ```bibtex
  @article{Ivry2025MAPSS,
@@ -317,27 +317,8 @@ def create_interface():
  max_lines=10
  )
 
- gr.Markdown("""
- ## Output format:
- The results ZIP will contain:
- - `ps_scores_{model}.csv`: Perceptual Similarity scores for each speaker/source
- - `pm_scores_{model}.csv`: Perceptual Matching scores for each speaker/source
- - `params.json`: Experiment parameters
- - `manifest_canonical.json`: Processed file manifest
-
- ## Score interpretation:
- - **PS (Perceptual Similarity)**: 0-1 score, higher is better. Measures how well the separated output matches the reference compared to other sources.
- - **PM (Perceptual Matching)**: 0-1 score, higher is better. Measures robustness to audio distortions.
-
- ## Notes:
- - Processing may take several minutes depending on the audio length and model
- - Audio files are automatically resampled to 16kHz
- - The tool automatically matches outputs to references based on correlation
- - For best results, ensure equal number of reference and output files
-
- ## Citation:
- If you use this tool in your research, please cite our paper (details coming soon).
- """)
 
  # Set up the processing
  process_btn.click(
 
@@ -176,7 +176,7 @@ def create_interface():
  ```
 
  ### Audio Requirements
+ - Format: .wav files
  - Sample rate: Any (automatically resampled to 16kHz)
  - Channels: Mono or stereo (converted to mono)
  - Number of files: Equal number of references and outputs
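
Note on the Audio Requirements list above: the diff states that input is converted to mono and resampled to 16 kHz, but it does not show how. The following is a minimal pure-Python sketch of those two steps, not the code used by app.py. The function names `to_mono` and `resample_linear` are hypothetical, and linear interpolation stands in for whatever resampler the tool actually uses (a real pipeline would normally use a band-limited resampler with anti-aliasing).

```python
from typing import List, Sequence

TARGET_SR = 16_000  # target rate stated in the Audio Requirements list


def to_mono(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the channels of each frame; mono frames pass through unchanged."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(
    samples: Sequence[float], src_sr: int, dst_sr: int = TARGET_SR
) -> List[float]:
    """Resample by linear interpolation (illustrative only, no anti-aliasing)."""
    if src_sr == dst_sr or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_sr / src_sr))
    step = src_sr / dst_sr  # distance between output samples, in input samples
    last = len(samples) - 1
    out: List[float] = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        # Blend the two neighbouring input samples.
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


# Stereo frames at 32 kHz -> mono at 16 kHz.
stereo = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
mono_16k = resample_linear(to_mono(stereo), src_sr=32_000)
```

Halving the rate by plain interpolation aliases any content above 8 kHz, which is why this is only a sketch of the data flow and not a substitute for a proper resampler.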
 
@@ -184,9 +184,9 @@ def create_interface():
  ## Output Format
 
  The tool generates a ZIP file containing:
+ - `ps_scores_{model}.csv`: PS scores for each source
+ - `pm_scores_{model}.csv`: PM scores for each source
+ - `params.json`: Parameters used
  - `manifest_canonical.json`: File mapping and processing details
 
  ## Available Models
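
Note on the Output Format hunk above: a downstream script could load the four documented ZIP members roughly as sketched here. This is an illustration under assumptions, not part of app.py: the helper name `load_results` is hypothetical, the members are assumed to sit at the ZIP root and to be UTF-8 encoded, and `{model}` is assumed to be a key from the model table such as `wavlm`. The CSV column layout is not shown in the diff, so rows are returned unparsed.

```python
import csv
import io
import json
import zipfile
from typing import Dict


def load_results(zip_bytes: bytes, model: str) -> Dict[str, object]:
    """Load whichever documented members are present in a results ZIP."""
    results: Dict[str, object] = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        names = set(zf.namelist())
        for key in ("ps", "pm"):
            member = f"{key}_scores_{model}.csv"
            if member in names:
                text = zf.read(member).decode("utf-8")
                # Column layout is undocumented here, so keep raw rows.
                results[key] = list(csv.reader(io.StringIO(text)))
        for member in ("params.json", "manifest_canonical.json"):
            if member in names:
                results[member] = json.loads(zf.read(member))
    return results
```

Missing members are skipped without raising, so the caller can check for a key such as `"pm"` before using it.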
 
@@ -194,14 +194,14 @@ def create_interface():
  | Model | Description | Default Layer | Use Case |
  |-------|-------------|---------------|----------|
  | `raw` | Raw waveform features | N/A | Baseline comparison |
+ | `wavlm` | WavLM Large | 24 | Strong performance |
+ | `wav2vec2` | Wav2Vec2 Large | 24 | Best overall performance |
+ | `hubert` | HuBERT Large | 24 | |
+ | `wavlm_base` | WavLM Base | 12 | |
+ | `wav2vec2_base` | Wav2Vec2 Base | 12 | Faster, good quality |
+ | `hubert_base` | HuBERT Base | 12 | |
  | `wav2vec2_xlsr` | Wav2Vec2 XLSR-53 | 24 | Multilingual |
+ | `ast` | Audio Spectrogram Transformer | 12 | Music |
 
  ## Parameters
 
 
@@ -213,7 +213,7 @@ def create_interface():
 
  ## Citation
 
+ If you use MAPSS, please cite:
 
  ```bibtex
  @article{Ivry2025MAPSS,
 
@@ -317,27 +317,8 @@ def create_interface():
  max_lines=10
  )
 
+ # gr.Markdown("""
+ # """)
 
  # Set up the processing
  process_btn.click(