---
license: mit
language:
  - en
  - de
  - fr
  - fi
  - sv
  - nl
---

Preliminary Historic Multilingual and Monolingual ByT5 Models. The following languages are currently covered: English, German, French, Finnish, Swedish and Dutch.
We evaluated the hmByT5 model that was pretrained on the English AjMC corpus for 200k steps.
It turns out that the results are not on par with the current SOTA on the English AjMC corpus, see a comparison
here. Thus, we continue experiments with the Hugging Face
Transformers JAX/Flax implementation to pretrain ByT5 models on TPU.
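
As a minimal sketch of how a preliminary checkpoint could be used with the standard Transformers API (the model ID below is a placeholder assumption, not a confirmed published checkpoint name):

```python
# Minimal sketch: loading a (hypothetical) preliminary hmByT5 checkpoint
# with the standard Transformers API. The model ID is a placeholder assumption.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "hmbyt5/byt5-small-english"  # placeholder model ID (assumption)

tokenizer = AutoTokenizer.from_pretrained(model_id)  # ByT5 uses a byte-level tokenizer
model = T5ForConditionalGeneration.from_pretrained(model_id)

# ByT5 operates directly on UTF-8 bytes, so no subword vocabulary is involved.
inputs = tokenizer("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```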