Sanskrit-qwen-7B-Translate-v2

A specialized Sanskrit language model for translation and transliteration tasks

🌟 Model Description

This is a fine-tuned version of Qwen/Qwen2.5-7B-Instruct optimized specifically for Sanskrit language processing. The model was trained with LoRA (Low-Rank Adaptation) on the Sanskrit-transliteration-chat-dataset to excel in three key areas:

  1. Sanskrit to IAST Transliteration - Converting Devanagari script to IAST format
  2. Sanskrit to English Translation - Translating Sanskrit text to English
  3. English to Sanskrit Translation - Translating English text to Sanskrit

🚀 Key Features

✨ Multi-Task Sanskrit Processing

  • IAST Transliteration: Accurate conversion from Devanagari to IAST
  • Bidirectional Translation: Sanskrit ↔ English translation
  • Context-Aware: Preserves meaning and cultural context
  • Chat-Optimized: Uses a conversation format for natural interactions

🔧 Technical Improvements Over Previous Model

  • Enhanced Base Model: Upgraded from Qwen2.5-7B-Instruct-1M to Qwen2.5-7B-Instruct
  • Specialized Dataset: Trained on Sanskrit-transliteration-chat-dataset (vs. previous Sanskrit-llama)
  • Chat Template Format: Uses structured conversation format for better performance
  • Optimized LoRA: Improved LoRA configuration with better target modules
  • Memory Efficient: Enhanced with flash attention and gradient checkpointing

📊 Model Specifications

Parameter | Value
Base Model | Qwen/Qwen2.5-7B-Instruct
Fine-tuning Method | LoRA (Low-Rank Adaptation)
LoRA Rank | 16
LoRA Alpha | 32
Sequence Length | 512 tokens
Training Epochs | 3
Learning Rate | 2e-05
Batch Size | 2 (micro) × 4 (gradient accumulation) = 8 effective
Optimizer | AdamW 8-bit
Precision | bfloat16
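
For reference, the LoRA settings above correspond to a peft LoraConfig along the following lines. This is a sketch only: the target modules are the usual choice of Qwen2.5 attention and MLP projections, and the dropout value is assumed, since the card does not list either.

from peft import LoraConfig

# Sketch of the LoRA setup implied by the table above.
# Assumptions: target modules and dropout are not stated on the card.
lora_config = LoraConfig(
    r=16,                 # LoRA rank (from the table)
    lora_alpha=32,        # LoRA alpha (from the table)
    lora_dropout=0.05,    # assumed
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
    ],
)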

🎯 Intended Uses

✅ Recommended Use Cases

  • Academic Research: Sanskrit text analysis and translation
  • Educational Tools: Learning Sanskrit through translation
  • Cultural Preservation: Digitizing Sanskrit manuscripts
  • Linguistic Studies: Comparative language analysis
  • Content Creation: Sanskrit-English bilingual content

⚠️ Limitations

  • Experimental Model: Still in development; results may vary
  • Context Sensitivity: Performance depends on text complexity
  • Domain Specific: Optimized for classical Sanskrit texts
  • Verification Required: Important translations should be cross-checked

🛠️ Usage Examples

1. Sanskrit to IAST Transliteration

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "diabolic6045/Sanskrit-qwen-7B-Translate-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # matches the training precision
    device_map="auto",           # falls back to CPU if no GPU is available
)

# Prepare the conversation
messages = [
    {
        "role": "system",
        "content": "You are a Sanskrit transliteration expert. Convert the given Sanskrit text from Devanagari script to IAST (International Alphabet of Sanskrit Transliteration) format."
    },
    {
        "role": "user",
        "content": "Transliterate this Sanskrit text to IAST: बुद्धिश्चार्थात्परो लोभः सन्तोषः परमं सुखम् ।"
    }
]

# Apply the chat template and generate
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)

print(response)
# Output: buddhiścārthātparo lobhaḥ santoṣaḥ paramaṃ sukham |

2. Sanskrit to English Translation

messages = [
    {
        "role": "system",
        "content": "You are a Sanskrit to English translation expert. Translate the given Sanskrit text accurately while preserving the meaning and context."
    },
    {
        "role": "user",
        "content": "Translate this Sanskrit text to English: यद॒ग्नौ सूर्ये॑ वि॒षं पृ॑थि॒व्यामोष॑धीषु॒ यत् ।"
    }
]

# Generate the translation
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)

print(response)
# Output: The poison that is in the sun, in the earth and in the herbs...

3. English to Sanskrit Translation

messages = [
    {
        "role": "system",
        "content": "You are an English to Sanskrit translation expert. Translate the given English text accurately into Sanskrit while preserving the meaning and context."
    },
    {
        "role": "user",
        "content": "Translate this English text to Sanskrit: May the divine powers protect us and grant us wisdom."
    }
]

# Generate the Sanskrit translation
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)

print(response)
# Output: देवाः अस्मान् रक्षन्तु बुद्धिं च प्रयच्छन्तु ।
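
All three tasks share the same prompt-then-generate pattern, so they can be folded into one helper. Below is a minimal sketch that reuses the tokenizer and model loaded in the first example; the function name run_sanskrit_task is ours, not part of the model's API.

# Hypothetical helper (not part of the model's API): wraps the shared
# chat-template -> generate -> decode pattern from the three examples above.
def run_sanskrit_task(system_prompt: str, user_text: str, max_new_tokens: int = 150) -> str:
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=True, temperature=0.7)
    return tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)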

🎮 Interactive Demo

Try the model with our Gradio interface:

Run the interactive demo

The demo provides the following (a minimal interface sketch appears after this list):

  • Mode Selection: Choose between transliteration and translation modes
  • Real-time Processing: Instant results with adjustable parameters
  • Example Library: Pre-loaded examples for each mode
  • Parameter Tuning: Adjust temperature and max length
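
The demo's source ships with the repository; as a rough illustration of its shape, here is a minimal Gradio sketch. It is our illustration, not the actual demo code, and it assumes the run_sanskrit_task helper defined in the usage examples above.

import gradio as gr

# Minimal two-control interface: pick a mode, enter text, read the output.
# Illustration only; the real demo also exposes temperature and max length.
SYSTEM_PROMPTS = {
    "Devanagari → IAST": "You are a Sanskrit transliteration expert. Convert the given Sanskrit text from Devanagari script to IAST (International Alphabet of Sanskrit Transliteration) format.",
    "Sanskrit → English": "You are a Sanskrit to English translation expert. Translate the given Sanskrit text accurately while preserving the meaning and context.",
    "English → Sanskrit": "You are an English to Sanskrit translation expert. Translate the given English text accurately into Sanskrit while preserving the meaning and context.",
}

def process(mode: str, text: str) -> str:
    return run_sanskrit_task(SYSTEM_PROMPTS[mode], text)

demo = gr.Interface(
    fn=process,
    inputs=[gr.Dropdown(choices=list(SYSTEM_PROMPTS), label="Mode"),
            gr.Textbox(label="Input text")],
    outputs=gr.Textbox(label="Output"),
)
demo.launch()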

📈 Training Details

Dataset Information

  • Source: diabolic6045/Sanskrit-transliteration-chat-dataset
  • Format: Chat template with structured conversations
  • Size: A Sanskrit corpus covering transliteration and both translation directions
  • Validation Split: 10% for evaluation (see the loading sketch after this list)
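
The split can be reproduced with the datasets library. A sketch, assuming the Hub repo exposes a single train split (not verified here):

from datasets import load_dataset

# Load the chat dataset and carve out the 10% validation split.
# Assumption: the repo has one "train" split; the seed is arbitrary.
ds = load_dataset("diabolic6045/Sanskrit-transliteration-chat-dataset")
splits = ds["train"].train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
print(train_ds[0])  # one chat-formatted conversation record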

Training Configuration

# Key training parameters
base_model: Qwen/Qwen2.5-7B-Instruct
adapter: lora
lora_r: 16
lora_alpha: 32
sequence_len: 512
num_epochs: 3
learning_rate: 0.00002
optimizer: adamw_8bit
lr_scheduler: cosine
bf16: auto
flash_attention: true
gradient_checkpointing: true

Hardware Requirements

  • Training: Multi-GPU setup with 24GB+ VRAM per GPU
  • Inference: 8GB+ VRAM for optimal performance (see the 4-bit loading sketch below)
  • CPU: Compatible with CPU inference (slower)
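
For GPUs near the 8GB floor, 4-bit quantized loading with bitsandbytes is a common option. A sketch; the NF4 settings below are a conventional default, not something the card prescribes:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization keeps the 7B weights well under 8 GB of VRAM.
# These settings are our assumption, not a recommendation from the card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "diabolic6045/Sanskrit-qwen-7B-Translate-v2",
    quantization_config=bnb_config,
    device_map="auto",
)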

🔄 Comparison with Previous Model

Feature | Previous Model | Current Model
Base Model | Qwen2.5-7B-Instruct-1M | Qwen2.5-7B-Instruct
Dataset | Sanskrit-llama (Alpaca) | Sanskrit-transliteration-chat-dataset
Format | Alpaca format | Chat template format
Capabilities | Basic translation | Multi-task (transliteration + translation)
LoRA Rank | 32 | 16 (optimized)
Sequence Length | 1024 | 512 (focused)
Training Epochs | 1 | 3 (more thorough)
Specialization | General Sanskrit | Specialized for transliteration

🛡️ Ethical Considerations

  • Cultural Sensitivity: Respect for Sanskrit's cultural and religious significance
  • Accuracy Disclaimer: Model outputs should be verified for important translations
  • Educational Use: Primarily intended for educational and research purposes
  • Bias Awareness: May reflect biases present in training data

📚 Citation

If you use this model in your research, please cite:

@misc{sanskrit-qwen-chat-lora,
  title={Sanskrit-qwen-7B-Translate-v2: A Specialized Sanskrit Translation and Transliteration Model},
  author={Divax Shah (diabolic6045)},
  year={2024},
  url={https://huggingface.co/diabolic6045/Sanskrit-qwen-7B-Translate-v2}
}

🤝 Contributing

We welcome contributions to improve this model:

  1. Dataset Contributions: High-quality Sanskrit translation pairs
  2. Evaluation: Benchmarking and performance analysis
  3. Bug Reports: Issues and improvement suggestions
  4. Documentation: Usage examples and tutorials

📄 License

This model is released under the Apache 2.0 License. See the LICENSE file for details.

🙏 Acknowledgments

  • Qwen Team: For the excellent base model
  • Axolotl Framework: For the training infrastructure
  • Sanskrit Community: For linguistic guidance and feedback
  • Open Source Community: For tools and resources

Built with ❤️ for Sanskrit language preservation and education

Built with Axolotl
