---
title: Music Classification with MIT AST
emoji: 🎵
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
pinned: false
license: mit
---

# Music Classification with MIT's AST Model 🎵

This Hugging Face Space demonstrates audio classification using MIT's Audio Spectrogram Transformer (AST) model. The model can identify various types of music, instruments, and sounds in audio files.

## Features

- Simple, user-friendly interface
- Support for multiple audio formats (WAV, MP3, OGG, FLAC)
- Top-5 predictions with confidence scores
- Results within a few seconds of upload

## How to Use

1. Click the "Upload Music File" button or drag and drop an audio file
2. Wait a few seconds for the model to process the audio
3. View the classification results with confidence scores
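The steps above map onto a small Gradio interface. The following is a minimal sketch, not the actual `app.py` of this Space: `build_demo`, `to_label_dict`, and the `classify_fn` argument are hypothetical names, and it assumes the classifier returns a list of `{"label", "score"}` dictionaries.

```python
def to_label_dict(predictions):
    """Convert [{'label': ..., 'score': ...}, ...] into {label: score}.

    gr.Label accepts a dict of label -> confidence and renders it as bars.
    """
    return {p["label"]: float(p["score"]) for p in predictions}


def build_demo(classify_fn):
    """Wire a classifier function (hypothetical) into an upload-and-view UI."""
    import gradio as gr  # imported lazily so the helper above has no dependencies

    return gr.Interface(
        fn=lambda path: to_label_dict(classify_fn(path)),
        # type="filepath" hands the uploaded file to fn as a path on disk
        inputs=gr.Audio(type="filepath", label="Upload Music File"),
        # show at most five classes, matching the top-5 output described here
        outputs=gr.Label(num_top_classes=5, label="Predictions"),
        title="Music Classification with MIT AST",
    )
```

Calling `build_demo(my_classifier).launch()` would start the interface, where `my_classifier` is any function that takes a file path and returns predictions in the assumed shape.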

## Model Details

This app uses the `MIT/ast-finetuned-audioset-10-10-0.4593` model, which is fine-tuned on AudioSet and can recognize a wide variety of sounds, instruments, and music styles. The audio is first converted into a log-mel spectrogram, which a transformer then classifies.
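One way to run this model locally is through the `transformers` audio-classification pipeline. This is a sketch under assumptions: the `classify` helper is a hypothetical name, the Space's own code may load the model differently, and decoding a file path requires `ffmpeg` to be installed.

```python
MODEL_ID = "MIT/ast-finetuned-audioset-10-10-0.4593"


def classify(path, top_k=5):
    """Return the top_k predictions for an audio file as a list of dicts."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")

    # Imported lazily so the module can be loaded without transformers present.
    from transformers import pipeline

    # A real app would build the pipeline once and reuse it across requests;
    # it is created here only to keep the sketch self-contained.
    classifier = pipeline("audio-classification", model=MODEL_ID)

    # Each entry has the form {"score": float, "label": str}.
    return classifier(path, top_k=top_k)
```

For example, `classify("song.wav")` would return five label and score pairs, where `song.wav` stands in for any local audio file.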

## Technical Notes

- The model expects audio sampled at 16 kHz
- Results show top 5 predictions with confidence scores
- Processing is done on Hugging Face's infrastructure
- No local installation required
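The first two notes can be illustrated in plain Python. This is only a sketch: `resample_linear` and `top_k` are hypothetical helpers, the linear interpolation has no anti-aliasing filter (a real pipeline would use a proper resampler), and the use of softmax to turn model outputs into confidence scores is an assumption about how the scores are computed.

```python
import math

TARGET_SR = 16_000  # sampling rate the model expects, in Hz


def resample_linear(samples, src_sr, dst_sr=TARGET_SR):
    """Resample a mono signal by linear interpolation (illustrative only)."""
    if src_sr == dst_sr or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_sr / src_sr))
    step = src_sr / dst_sr  # source samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def top_k(logits, labels, k=5):
    """Softmax the logits and return the k most likely (label, score) pairs."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

With these helpers, one second of 44.1 kHz audio shrinks to 16,000 samples before classification, and `top_k(logits, labels)` yields the five entries shown in the results panel.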

## Credits

- Model: [MIT AST](https://huggingface.co/MIT/ast-finetuned-audioset-10-10-0.4593)
- Interface: Gradio
- Deployment: Hugging Face Spaces