Update README.md
README.md
CHANGED
@@ -25,27 +25,20 @@ Notable differences from other available models include:
 1. Performance: CED with 10M parameters outperforms the majority of previous approaches (~80M).
 
 ### Model Sources
-- **
-- **Repository:** https://github.com/jimbozhang/hf_transformers_custom_model_ced
+- **Repository:** https://github.com/RicherMans/CED
 - **Paper:** [CED: Consistent ensemble distillation for audio tagging](https://arxiv.org/abs/2308.11957)
 - **Demo:** https://huggingface.co/spaces/mispeech/ced-base
 
-## Install
-```bash
-pip install git+https://github.com/jimbozhang/hf_transformers_custom_model_ced.git
-```
-
 ## Inference
 ```python
->>> from
->>> from ced_model.modeling_ced import CedForAudioClassification
+>>> from transformers import AutoModelForAudioClassification, AutoFeatureExtractor
 
 >>> model_name = "mispeech/ced-tiny"
->>> feature_extractor =
->>> model =
+>>> feature_extractor = AutoFeatureExtractor.from_pretrained(model_name, trust_remote_code=True)
+>>> model = AutoModelForAudioClassification.from_pretrained(model_name, trust_remote_code=True)
 
 >>> import torchaudio
->>> audio, sampling_rate = torchaudio.load("
+>>> audio, sampling_rate = torchaudio.load("/path-to/JeD5V5aaaoI_931_932.wav")
 >>> assert sampling_rate == 16000
 >>> inputs = feature_extractor(audio, sampling_rate=sampling_rate, return_tensors="pt")
 
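
The hunk stops at the feature-extraction call, so the step from model outputs to a readable label is not shown here. Below is a minimal sketch of that step, assuming the model yields one score per class and exposes a label map such as `model.config.id2label` (neither is confirmed by this diff). The helper `top_k_labels`, the example scores, and the example label map are all hypothetical.

```python
def top_k_labels(scores, id2label, k=3):
    """Return the k highest-scoring (label, score) pairs for one audio clip."""
    # Rank class indices by raw score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Slicing handles the case where k exceeds the number of classes.
    return [(id2label[i], scores[i]) for i in ranked[:k]]


# Hypothetical values for illustration only. With a real checkpoint the scores
# might come from something like `model(**inputs).logits[0].tolist()` and the
# mapping from `model.config.id2label`, assuming those attributes exist.
example_scores = [0.2, 2.5, -1.0, 1.1]
example_id2label = {0: "Speech", 1: "Finger snapping", 2: "Music", 3: "Clapping"}

top = top_k_labels(example_scores, example_id2label, k=2)
print(top)
```

Ranking the raw scores avoids committing to a sigmoid or a softmax, since both preserve the ordering of the scores.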