kingabzpro/wav2vec2-large-xls-r-1b-Indonesian

Tags: Automatic Speech Recognition, hf-asr-leaderboard, robust-speech-event

wav2vec2-large-xls-r-1b-Indonesian

This model is a fine-tuned version of facebook/wav2vec2-xls-r-1b on the common_voice dataset. It achieves the following results on the evaluation set:

Loss: 0.9550
Wer: 0.4551
Cer: 0.1643
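
For readers unfamiliar with the metrics: WER and CER are conventionally the edit distance between reference and hypothesis, counted over words and characters respectively, divided by the reference length. The sketch below illustrates that definition only. The function names and the example sentences are assumptions made for illustration, and this is not the evaluation script that produced the numbers above.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical example pair, not taken from the evaluation data.
reference = "saya suka makan nasi"
hypothesis = "saya suka makan roti"
print(f"WER: {wer(reference, hypothesis):.4f}")
print(f"CER: {cer(reference, hypothesis):.4f}")
```

On this scale a Wer of 0.4551 corresponds to roughly 45.5 word errors per 100 reference words.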

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 64
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 400
num_epochs: 50
mixed_precision_training: Native AMP
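
Two of these values follow from the others, and a short sketch makes the arithmetic explicit. It assumes a single training device (consistent with 64 × 2 = 128) and a total of 1300 optimizer steps, which is an inference from the results table (200 steps at epoch 7.69 is about 26 steps per epoch, times 50 epochs) rather than a value stated in this card. The function names are illustrative, and the schedule reproduces the common linear warmup-then-decay shape, not necessarily the exact scheduler code used in training.

```python
LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 64
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 400
TOTAL_STEPS = 1300  # assumption: about 26 optimizer steps per epoch * 50 epochs


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples contributing to one optimizer step."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total - step) / (total - warmup))


print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
for step in (0, 200, 400, 800, 1200, 1300):
    print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at step 400 and has decayed most of the way to zero by the last logged checkpoint at step 1200.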

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
3.663	7.69	200	0.7898	0.6039	0.1848
0.7424	15.38	400	1.0215	0.5615	0.1924
0.4494	23.08	600	1.0901	0.5249	0.1932
0.5075	30.77	800	1.1013	0.5079	0.1935
0.4671	38.46	1000	1.1034	0.4916	0.1827
0.1928	46.15	1200	0.9550	0.4551	0.1643
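
The table can also be read programmatically. The following sketch copies the rows above into a list, selects the checkpoint with the lowest WER, and estimates the number of optimizer steps per epoch. The variable names are illustrative assumptions.

```python
# (training_loss, epoch, step, validation_loss, wer, cer), copied from the table above
ROWS = [
    (3.663, 7.69, 200, 0.7898, 0.6039, 0.1848),
    (0.7424, 15.38, 400, 1.0215, 0.5615, 0.1924),
    (0.4494, 23.08, 600, 1.0901, 0.5249, 0.1932),
    (0.5075, 30.77, 800, 1.1013, 0.5079, 0.1935),
    (0.4671, 38.46, 1000, 1.1034, 0.4916, 0.1827),
    (0.1928, 46.15, 1200, 0.9550, 0.4551, 0.1643),
]

best_by_wer = min(ROWS, key=lambda row: row[4])
best_by_loss = min(ROWS, key=lambda row: row[3])
steps_per_epoch = round(ROWS[0][2] / ROWS[0][1])

print("lowest WER at step", best_by_wer[2], "with WER", best_by_wer[4])
print("lowest validation loss at step", best_by_loss[2])
print("approximate steps per epoch:", steps_per_epoch)
```

In these rows the lowest validation loss occurs at step 200, while WER keeps improving through step 1200, so the two criteria would pick different checkpoints.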

Framework versions

Transformers 4.17.0.dev0
Pytorch 1.10.2+cu102
Datasets 1.18.2.dev0
Tokenizers 0.11.0


Safetensors

Model size: 1.0B params
Tensor type: F32

Model tree for kingabzpro/wav2vec2-large-xls-r-1b-Indonesian

Base model: facebook/wav2vec2-xls-r-1b
Finetuned (111): this model

Collection including kingabzpro/wav2vec2-large-xls-r-1b-Indonesian

Robust Speech Recognition Event

The event ran from January 24 to February 7, 2022. Participants used the wav2vec2 model series to develop cutting-edge speech recognition models. The collection contains 14 items.

Evaluation results

Test WER on Common Voice id (self-reported): 45.510
Test CER on Common Voice id (self-reported): 16.430
Test WER on Robust Speech Event - Dev Data (self-reported): 72.730
Test WER on Robust Speech Event - Test Data (self-reported): 79.290
