Update README.md
README.md
CHANGED
@@ -118,7 +118,6 @@ We used an AWS g5.24xlarge instance to train the NN.
#### Preprocessing [optional]
-[More Information Needed]
We first train a tokenizer on the LibriSpeech dataset. The tokenizer converts labels into tokens. For example, in English it is very common to have 's at the end of words; the tokenizer will identify that pattern and assign a dedicated token to the 's combination.
The code to obtain the tokenizer is available at https://github.com/Arm-Examples/ML-examples/blob/main/pytorch-conformer-train-quantize/training/build_sp_128_librispeech.py. The trained tokenizer is also available in the Hugging Face repository.
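The subword idea above can be illustrated with a toy greedy longest-match tokenizer over a hand-picked vocabulary. This is only a sketch of the concept, not the learned subword model produced by `build_sp_128_librispeech.py`; the vocabulary and function below are hypothetical:

```python
# Toy illustration of subword tokenization: "'s" gets its own token.
# Hypothetical vocabulary, not the actual trained tokenizer.

def tokenize(word, vocab):
    """Greedy longest-match-first subword split against a fixed vocab."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate piece first, shrinking toward one char.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # No vocab entry covers this character: emit an unknown token.
            tokens.append("<unk>")
            i += 1
    return tokens

vocab = {"john", "'s", "s"}
print(tokenize("john's", vocab))  # ['john', "'s"]
```

Because "'s" is a dedicated vocabulary entry, it is matched as a single token rather than being split into an apostrophe and an "s".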
|