# finetuned_t5_amh_en
This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 3.6256
- Bleu: 0.8364
- Gen Len: 220.2784
## Model description
More information needed
## Intended uses & limitations
More information needed
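In the absence of documented usage guidance, a minimal inference sketch for Amharic-to-English translation with this checkpoint is shown below. The model id is taken from this card; the generation settings (beam search, `max_new_tokens`) are assumptions for illustration, not values documented by the author.

```python
# Minimal sketch: translating Amharic text with this checkpoint.
# MODEL_ID comes from this card; GEN_KWARGS are assumed, not documented.
MODEL_ID = "EdwardoSunny/finetuned_t5_amh_en"
GEN_KWARGS = {"num_beams": 4, "max_new_tokens": 256}

def translate(text: str) -> str:
    # Import lazily so the constants above can be inspected without
    # transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, **GEN_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(translate("ሰላም እንዴት ነህ?"))
```

Note that the evaluation generation length averages around 200–300 tokens, so a generous `max_new_tokens` budget is likely needed to avoid truncated outputs.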
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 50
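The hyperparameters above can be collected in one place for reuse. This is a sketch that assumes the run was configured through `Seq2SeqTrainingArguments`-style keyword names; the card itself only lists the raw values.

```python
# Hyperparameters as reported in this card, gathered into a dict whose
# keys mirror Hugging Face TrainingArguments names (an assumption about
# how the run was configured).
hparams = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",        # betas=(0.9, 0.999), epsilon=1e-08
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
}
```

These values could be passed directly as `Seq2SeqTrainingArguments(**hparams, output_dir=...)` when reproducing the run.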
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| No log | 1.0 | 194 | 9.7772 | 0.1784 | 252.2577 |
| No log | 2.0 | 388 | 6.1238 | 0.0291 | 294.6082 |
| 13.1298 | 3.0 | 582 | 4.8867 | 0.0344 | 258.2577 |
| 13.1298 | 4.0 | 776 | 4.3878 | 0.1993 | 278.1134 |
| 13.1298 | 5.0 | 970 | 4.1742 | 0.1093 | 367.701 |
| 5.9006 | 6.0 | 1164 | 4.0494 | 0.1292 | 347.8144 |
| 5.9006 | 7.0 | 1358 | 3.9637 | 0.2261 | 346.5567 |
| 5.0328 | 8.0 | 1552 | 3.9252 | 0.2499 | 290.6598 |
| 5.0328 | 9.0 | 1746 | 3.8828 | 0.2003 | 383.7938 |
| 5.0328 | 10.0 | 1940 | 3.8550 | 0.2356 | 334.6392 |
| 4.7488 | 11.0 | 2134 | 3.8296 | 0.2338 | 322.701 |
| 4.7488 | 12.0 | 2328 | 3.8144 | 0.263 | 333.8866 |
| 4.5514 | 13.0 | 2522 | 3.7928 | 0.3135 | 290.0 |
| 4.5514 | 14.0 | 2716 | 3.7756 | 0.288 | 304.4639 |
| 4.5514 | 15.0 | 2910 | 3.7622 | 0.3206 | 275.3402 |
| 4.421 | 16.0 | 3104 | 3.7508 | 0.3354 | 281.4433 |
| 4.421 | 17.0 | 3298 | 3.7357 | 0.3479 | 248.2474 |
| 4.421 | 18.0 | 3492 | 3.7273 | 0.3914 | 266.8247 |
| 4.3274 | 19.0 | 3686 | 3.7161 | 0.5372 | 206.7629 |
| 4.3274 | 20.0 | 3880 | 3.7049 | 0.4423 | 245.4021 |
| 4.2533 | 21.0 | 4074 | 3.6951 | 0.481 | 227.567 |
| 4.2533 | 22.0 | 4268 | 3.6912 | 0.5658 | 200.1134 |
| 4.2533 | 23.0 | 4462 | 3.6871 | 0.5392 | 239.3711 |
| 4.1802 | 24.0 | 4656 | 3.6760 | 0.6617 | 218.5258 |
| 4.1802 | 25.0 | 4850 | 3.6725 | 0.585 | 236.9072 |
| 4.1304 | 26.0 | 5044 | 3.6683 | 0.7355 | 230.8351 |
| 4.1304 | 27.0 | 5238 | 3.6642 | 0.5375 | 242.9588 |
| 4.1304 | 28.0 | 5432 | 3.6592 | 0.7076 | 215.3711 |
| 4.0963 | 29.0 | 5626 | 3.6572 | 0.5566 | 230.3608 |
| 4.0963 | 30.0 | 5820 | 3.6524 | 0.5795 | 221.9175 |
| 4.047 | 31.0 | 6014 | 3.6498 | 0.6899 | 208.6804 |
| 4.047 | 32.0 | 6208 | 3.6471 | 0.587 | 230.7938 |
| 4.047 | 33.0 | 6402 | 3.6453 | 0.7257 | 236.1443 |
| 4.0166 | 34.0 | 6596 | 3.6432 | 0.7644 | 219.3918 |
| 4.0166 | 35.0 | 6790 | 3.6397 | 0.6841 | 247.6598 |
| 4.0166 | 36.0 | 6984 | 3.6377 | 0.7156 | 242.6289 |
| 3.9956 | 37.0 | 7178 | 3.6362 | 0.7676 | 221.6804 |
| 3.9956 | 38.0 | 7372 | 3.6346 | 0.7589 | 218.732 |
| 3.9641 | 39.0 | 7566 | 3.6337 | 0.7666 | 216.0309 |
| 3.9641 | 40.0 | 7760 | 3.6319 | 0.877 | 199.8866 |
| 3.9641 | 41.0 | 7954 | 3.6305 | 0.8444 | 202.4845 |
| 3.95 | 42.0 | 8148 | 3.6298 | 0.8761 | 202.5361 |
| 3.95 | 43.0 | 8342 | 3.6291 | 0.8358 | 209.2165 |
| 3.9399 | 44.0 | 8536 | 3.6279 | 0.7962 | 219.6289 |
| 3.9399 | 45.0 | 8730 | 3.6276 | 0.8881 | 212.0 |
| 3.9399 | 46.0 | 8924 | 3.6265 | 0.8627 | 207.1959 |
| 3.9261 | 47.0 | 9118 | 3.6261 | 0.865 | 204.9588 |
| 3.9261 | 48.0 | 9312 | 3.6256 | 0.8207 | 213.701 |
| 3.9145 | 49.0 | 9506 | 3.6256 | 0.863 | 207.9485 |
| 3.9145 | 50.0 | 9700 | 3.6256 | 0.8364 | 220.2784 |
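The schedule implied by the table can be sanity-checked with a little arithmetic: 194 optimizer steps per epoch over 50 epochs yields the final step count, and steps times batch size bounds the training-set size (assuming no gradient accumulation, which the card does not mention).

```python
# Derive the training schedule from the table above.
steps_per_epoch = 194   # step count at epoch 1.0
epochs = 50
batch_size = 8          # train_batch_size from the hyperparameters

total_steps = steps_per_epoch * epochs             # matches the final row
max_train_examples = steps_per_epoch * batch_size  # upper bound on dataset size
```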
### Framework versions
- Transformers 4.52.3
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1