# finetuned_t5_amh_en
This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 3.6256
- Bleu: 0.8364
- Gen Len: 220.2784
## Model description
More information needed
## Intended uses & limitations
More information needed
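In the absence of documented usage guidance, a minimal inference sketch for Amharic-to-English translation with this checkpoint is shown below. The model id is taken from this card; the generation settings (beam search, `max_new_tokens`) are assumptions for illustration, not values documented by the author.

```python
# Minimal sketch: translating Amharic text with this checkpoint.
# MODEL_ID comes from this card; GEN_KWARGS are assumed, not documented.
MODEL_ID = "EdwardoSunny/finetuned_t5_amh_en"
GEN_KWARGS = {"num_beams": 4, "max_new_tokens": 256}

def translate(text: str) -> str:
    # Import lazily so the constants above can be inspected without
    # transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, **GEN_KWARGS)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(translate("ሰላም እንዴት ነህ?"))
```

Note that the evaluation generation length averages around 200–300 tokens, so a generous `max_new_tokens` budget is likely needed to avoid truncated outputs.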
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 50
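The hyperparameters above can be collected in one place for reuse. This is a sketch that assumes the run was configured through `Seq2SeqTrainingArguments`-style keyword names; the card itself only lists the raw values.

```python
# Hyperparameters as reported in this card, gathered into a dict whose
# keys mirror Hugging Face TrainingArguments names (an assumption about
# how the run was configured).
hparams = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",        # betas=(0.9, 0.999), epsilon=1e-08
    "lr_scheduler_type": "linear",
    "num_train_epochs": 50,
}
```

These values could be passed directly as `Seq2SeqTrainingArguments(**hparams, output_dir=...)` when reproducing the run.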
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| No log | 1.0 | 194 | 9.7772 | 0.1784 | 252.2577 |
| No log | 2.0 | 388 | 6.1238 | 0.0291 | 294.6082 |
| 13.1298 | 3.0 | 582 | 4.8867 | 0.0344 | 258.2577 |
| 13.1298 | 4.0 | 776 | 4.3878 | 0.1993 | 278.1134 |
| 13.1298 | 5.0 | 970 | 4.1742 | 0.1093 | 367.701 |
| 5.9006 | 6.0 | 1164 | 4.0494 | 0.1292 | 347.8144 |
| 5.9006 | 7.0 | 1358 | 3.9637 | 0.2261 | 346.5567 |
| 5.0328 | 8.0 | 1552 | 3.9252 | 0.2499 | 290.6598 |
| 5.0328 | 9.0 | 1746 | 3.8828 | 0.2003 | 383.7938 |
| 5.0328 | 10.0 | 1940 | 3.8550 | 0.2356 | 334.6392 |
| 4.7488 | 11.0 | 2134 | 3.8296 | 0.2338 | 322.701 |
| 4.7488 | 12.0 | 2328 | 3.8144 | 0.263 | 333.8866 |
| 4.5514 | 13.0 | 2522 | 3.7928 | 0.3135 | 290.0 |
| 4.5514 | 14.0 | 2716 | 3.7756 | 0.288 | 304.4639 |
| 4.5514 | 15.0 | 2910 | 3.7622 | 0.3206 | 275.3402 |
| 4.421 | 16.0 | 3104 | 3.7508 | 0.3354 | 281.4433 |
| 4.421 | 17.0 | 3298 | 3.7357 | 0.3479 | 248.2474 |
| 4.421 | 18.0 | 3492 | 3.7273 | 0.3914 | 266.8247 |
| 4.3274 | 19.0 | 3686 | 3.7161 | 0.5372 | 206.7629 |
| 4.3274 | 20.0 | 3880 | 3.7049 | 0.4423 | 245.4021 |
| 4.2533 | 21.0 | 4074 | 3.6951 | 0.481 | 227.567 |
| 4.2533 | 22.0 | 4268 | 3.6912 | 0.5658 | 200.1134 |
| 4.2533 | 23.0 | 4462 | 3.6871 | 0.5392 | 239.3711 |
| 4.1802 | 24.0 | 4656 | 3.6760 | 0.6617 | 218.5258 |
| 4.1802 | 25.0 | 4850 | 3.6725 | 0.585 | 236.9072 |
| 4.1304 | 26.0 | 5044 | 3.6683 | 0.7355 | 230.8351 |
| 4.1304 | 27.0 | 5238 | 3.6642 | 0.5375 | 242.9588 |
| 4.1304 | 28.0 | 5432 | 3.6592 | 0.7076 | 215.3711 |
| 4.0963 | 29.0 | 5626 | 3.6572 | 0.5566 | 230.3608 |
| 4.0963 | 30.0 | 5820 | 3.6524 | 0.5795 | 221.9175 |
| 4.047 | 31.0 | 6014 | 3.6498 | 0.6899 | 208.6804 |
| 4.047 | 32.0 | 6208 | 3.6471 | 0.587 | 230.7938 |
| 4.047 | 33.0 | 6402 | 3.6453 | 0.7257 | 236.1443 |
| 4.0166 | 34.0 | 6596 | 3.6432 | 0.7644 | 219.3918 |
| 4.0166 | 35.0 | 6790 | 3.6397 | 0.6841 | 247.6598 |
| 4.0166 | 36.0 | 6984 | 3.6377 | 0.7156 | 242.6289 |
| 3.9956 | 37.0 | 7178 | 3.6362 | 0.7676 | 221.6804 |
| 3.9956 | 38.0 | 7372 | 3.6346 | 0.7589 | 218.732 |
| 3.9641 | 39.0 | 7566 | 3.6337 | 0.7666 | 216.0309 |
| 3.9641 | 40.0 | 7760 | 3.6319 | 0.877 | 199.8866 |
| 3.9641 | 41.0 | 7954 | 3.6305 | 0.8444 | 202.4845 |
| 3.95 | 42.0 | 8148 | 3.6298 | 0.8761 | 202.5361 |
| 3.95 | 43.0 | 8342 | 3.6291 | 0.8358 | 209.2165 |
| 3.9399 | 44.0 | 8536 | 3.6279 | 0.7962 | 219.6289 |
| 3.9399 | 45.0 | 8730 | 3.6276 | 0.8881 | 212.0 |
| 3.9399 | 46.0 | 8924 | 3.6265 | 0.8627 | 207.1959 |
| 3.9261 | 47.0 | 9118 | 3.6261 | 0.865 | 204.9588 |
| 3.9261 | 48.0 | 9312 | 3.6256 | 0.8207 | 213.701 |
| 3.9145 | 49.0 | 9506 | 3.6256 | 0.863 | 207.9485 |
| 3.9145 | 50.0 | 9700 | 3.6256 | 0.8364 | 220.2784 |
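The schedule implied by the table can be sanity-checked with a little arithmetic: 194 optimizer steps per epoch over 50 epochs yields the final step count, and steps times batch size bounds the training-set size (assuming no gradient accumulation, which the card does not mention).

```python
# Derive the training schedule from the table above.
steps_per_epoch = 194   # step count at epoch 1.0
epochs = 50
batch_size = 8          # train_batch_size from the hyperparameters

total_steps = steps_per_epoch * epochs             # matches the final row
max_train_examples = steps_per_epoch * batch_size  # upper bound on dataset size
```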
### Framework versions
- Transformers 4.52.3
- Pytorch 2.6.0+cu124
- Datasets 3.6.0
- Tokenizers 0.21.1