
🧠 English to Spanish Translation AI Model

This repository contains a Transformer-based AI model fine-tuned for English-to-Spanish text translation. The model has been trained, quantized to FP16, and evaluated for translation quality. It delivers high-accuracy translations and is suited to real-world use cases such as educational tools, real-time communication, and travel assistants.


🚀 Features

  • 🔁 Language Pair: English → Spanish
  • 🔧 Base Model: Helsinki-NLP/opus-mt-en-es
  • 🧪 Quantized: FP16 for efficient inference
  • 🎯 High Accuracy: ~34 BLEU on the validation set
  • ⚡ CUDA Enabled: Fast training and inference
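The fine-tuned model can be tried with the 🤗 `pipeline` API. A minimal sketch, assuming network access to download the base checkpoint (substitute a local path if you have the fine-tuned weights on disk):

```python
from transformers import pipeline

# Downloads the base checkpoint on first run; swap in a local path
# to load the fine-tuned/quantized weights instead.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-es")

result = translator("How are you today?")
translation = result[0]["translation_text"]
print(translation)
```

The `pipeline` wrapper handles tokenization, generation, and decoding in one call, which is usually all a downstream application needs.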

📊 Dataset Used

Hugging Face Dataset: OscarNav/spa-eng

  • Source: OscarNav
  • Language Pair: en-es
  • Dataset Size: ~107K sentence pairs
from datasets import load_dataset

# Load the ~107K English–Spanish sentence pairs
dataset = load_dataset("OscarNav/spa-eng", lang1="en", lang2="es")
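Opus-style translation datasets typically expose one nested `translation` dict per row. The records below are hypothetical stand-ins (not taken from the actual dataset) showing the assumed shape and how parallel source/target lists would be pulled out for tokenization:

```python
# Hypothetical rows in the assumed opus-style layout:
# {"translation": {"en": ..., "es": ...}}
pairs = [
    {"translation": {"en": "Hello, how are you?", "es": "Hola, ¿cómo estás?"}},
    {"translation": {"en": "Good morning.", "es": "Buenos días."}},
    {"translation": {"en": "Thank you very much.", "es": "Muchas gracias."}},
]

# Split into parallel source/target lists, as a preprocessing step would
# before feeding them to the tokenizer.
sources = [p["translation"]["en"] for p in pairs]
targets = [p["translation"]["es"] for p in pairs]
```

If the real dataset uses a different column layout, adjust the key lookups accordingly.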

🛠️ Model Training & Fine-Tuning

  • Pretrained Base Model: Helsinki-NLP/opus-mt-en-es
  • Tokenizer: AutoTokenizer from Hugging Face Transformers
  • Training Environment: Kaggle Notebook with CUDA GPU
  • Batch Size: 16
  • Epochs: 3–5 (with early stopping)
  • Optimizer: AdamW
  • Loss Function: CrossEntropyLoss
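The hyperparameters above can be wired together with `Seq2SeqTrainer`. This is an illustrative configuration sketch, not the exact training script: the output path and the `tokenized_train` / `tokenized_val` splits are assumed, and argument values mirror the list above.

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    EarlyStoppingCallback,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "Helsinki-NLP/opus-mt-en-es"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

args = Seq2SeqTrainingArguments(
    output_dir="opus-mt-en-es-finetuned",  # hypothetical output path
    per_device_train_batch_size=16,
    num_train_epochs=5,                    # early stopping usually ends it sooner
    fp16=True,                             # mixed precision on the CUDA GPU
    eval_strategy="epoch",                 # `evaluation_strategy` on older transformers
    save_strategy="epoch",
    load_best_model_at_end=True,           # required for EarlyStoppingCallback
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized_train,         # assumed pre-tokenized splits
    eval_dataset=tokenized_val,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```

`AdamW` and the cross-entropy loss listed above are the Trainer's defaults for sequence-to-sequence models, so they need no explicit configuration here.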

🧪 Quantization (FP16)

The model was quantized to FP16 to reduce memory usage and speed up inference without compromising translation quality.

model = model.half()  # cast all weights from FP32 to FP16
model.save_pretrained("quantized_model_fp16")
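A back-of-the-envelope estimate shows why the cast matters: FP32 stores each weight in 4 bytes and FP16 in 2, so halving the precision roughly halves the footprint of the 77.5M-parameter model.

```python
# Rough memory estimate for a 77.5M-parameter model.
params = 77_500_000

fp32_bytes = params * 4  # 4 bytes per weight in FP32
fp16_bytes = params * 2  # 2 bytes per weight in FP16

fp32_mb = fp32_bytes / 1e6  # ~310 MB
fp16_mb = fp16_bytes / 1e6  # ~155 MB
```

Activations, optimizer state, and framework overhead add to this at runtime, but the 2× reduction in weight storage carries over to both disk and GPU memory.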

✅ Scoring

BLEU Score: ~34

  • Evaluation Metric: sacrebleu on the validation set
  • Inference Accuracy: Verified on real-world sample sentences
