finetuned_t5_amh_en

This model is a fine-tuned version of google/mt5-small; the training dataset is not recorded in the card metadata. It achieves the following results on the evaluation set:

  • Loss: 3.6256
  • Bleu: 0.8364
  • Gen Len: 220.2784

Model description

More information needed

Intended uses & limitations

More information needed
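The card does not yet document usage. A minimal inference sketch using the Hugging Face `transformers` API (the Amharic input sentence and the generation settings below are illustrative assumptions, not taken from the training setup):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint from the Hub.
model_id = "EdwardoSunny/finetuned_t5_amh_en"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

def translate(text: str, max_new_tokens: int = 128) -> str:
    """Translate a single Amharic sentence to English with beam search."""
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, num_beams=4
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(translate("ሰላም፣ እንዴት ነህ?"))  # illustrative Amharic input
```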

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 50

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len  |
|---------------|-------|------|-----------------|--------|----------|
| No log        | 1.0   | 194  | 9.7772          | 0.1784 | 252.2577 |
| No log        | 2.0   | 388  | 6.1238          | 0.0291 | 294.6082 |
| 13.1298       | 3.0   | 582  | 4.8867          | 0.0344 | 258.2577 |
| 13.1298       | 4.0   | 776  | 4.3878          | 0.1993 | 278.1134 |
| 13.1298       | 5.0   | 970  | 4.1742          | 0.1093 | 367.701  |
| 5.9006        | 6.0   | 1164 | 4.0494          | 0.1292 | 347.8144 |
| 5.9006        | 7.0   | 1358 | 3.9637          | 0.2261 | 346.5567 |
| 5.0328        | 8.0   | 1552 | 3.9252          | 0.2499 | 290.6598 |
| 5.0328        | 9.0   | 1746 | 3.8828          | 0.2003 | 383.7938 |
| 5.0328        | 10.0  | 1940 | 3.8550          | 0.2356 | 334.6392 |
| 4.7488        | 11.0  | 2134 | 3.8296          | 0.2338 | 322.701  |
| 4.7488        | 12.0  | 2328 | 3.8144          | 0.263  | 333.8866 |
| 4.5514        | 13.0  | 2522 | 3.7928          | 0.3135 | 290.0    |
| 4.5514        | 14.0  | 2716 | 3.7756          | 0.288  | 304.4639 |
| 4.5514        | 15.0  | 2910 | 3.7622          | 0.3206 | 275.3402 |
| 4.421         | 16.0  | 3104 | 3.7508          | 0.3354 | 281.4433 |
| 4.421         | 17.0  | 3298 | 3.7357          | 0.3479 | 248.2474 |
| 4.421         | 18.0  | 3492 | 3.7273          | 0.3914 | 266.8247 |
| 4.3274        | 19.0  | 3686 | 3.7161          | 0.5372 | 206.7629 |
| 4.3274        | 20.0  | 3880 | 3.7049          | 0.4423 | 245.4021 |
| 4.2533        | 21.0  | 4074 | 3.6951          | 0.481  | 227.567  |
| 4.2533        | 22.0  | 4268 | 3.6912          | 0.5658 | 200.1134 |
| 4.2533        | 23.0  | 4462 | 3.6871          | 0.5392 | 239.3711 |
| 4.1802        | 24.0  | 4656 | 3.6760          | 0.6617 | 218.5258 |
| 4.1802        | 25.0  | 4850 | 3.6725          | 0.585  | 236.9072 |
| 4.1304        | 26.0  | 5044 | 3.6683          | 0.7355 | 230.8351 |
| 4.1304        | 27.0  | 5238 | 3.6642          | 0.5375 | 242.9588 |
| 4.1304        | 28.0  | 5432 | 3.6592          | 0.7076 | 215.3711 |
| 4.0963        | 29.0  | 5626 | 3.6572          | 0.5566 | 230.3608 |
| 4.0963        | 30.0  | 5820 | 3.6524          | 0.5795 | 221.9175 |
| 4.047         | 31.0  | 6014 | 3.6498          | 0.6899 | 208.6804 |
| 4.047         | 32.0  | 6208 | 3.6471          | 0.587  | 230.7938 |
| 4.047         | 33.0  | 6402 | 3.6453          | 0.7257 | 236.1443 |
| 4.0166        | 34.0  | 6596 | 3.6432          | 0.7644 | 219.3918 |
| 4.0166        | 35.0  | 6790 | 3.6397          | 0.6841 | 247.6598 |
| 4.0166        | 36.0  | 6984 | 3.6377          | 0.7156 | 242.6289 |
| 3.9956        | 37.0  | 7178 | 3.6362          | 0.7676 | 221.6804 |
| 3.9956        | 38.0  | 7372 | 3.6346          | 0.7589 | 218.732  |
| 3.9641        | 39.0  | 7566 | 3.6337          | 0.7666 | 216.0309 |
| 3.9641        | 40.0  | 7760 | 3.6319          | 0.877  | 199.8866 |
| 3.9641        | 41.0  | 7954 | 3.6305          | 0.8444 | 202.4845 |
| 3.95          | 42.0  | 8148 | 3.6298          | 0.8761 | 202.5361 |
| 3.95          | 43.0  | 8342 | 3.6291          | 0.8358 | 209.2165 |
| 3.9399        | 44.0  | 8536 | 3.6279          | 0.7962 | 219.6289 |
| 3.9399        | 45.0  | 8730 | 3.6276          | 0.8881 | 212.0    |
| 3.9399        | 46.0  | 8924 | 3.6265          | 0.8627 | 207.1959 |
| 3.9261        | 47.0  | 9118 | 3.6261          | 0.865  | 204.9588 |
| 3.9261        | 48.0  | 9312 | 3.6256          | 0.8207 | 213.701  |
| 3.9145        | 49.0  | 9506 | 3.6256          | 0.863  | 207.9485 |
| 3.9145        | 50.0  | 9700 | 3.6256          | 0.8364 | 220.2784 |

Framework versions

  • Transformers 4.52.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.1