Model Details

WARNING: This model is pre-trained only; a fine-tuned version is planned for the future.

Model Description

The first Persian SLM by YSNRFD (Yasin Aryanfard) and Amirhossein Mehrdoost. The model currently supports Persian text input only; English language support is planned for a future release.

Training Data

Sample Persian text by ysnrfd: https://huggingface.co/datasets/ysn-rfd/fibonacci_alpaca_to_sharegpt_gpt_format_convert_new_dataset_release
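As a sketch of how records from a ShareGPT-format dataset like the one above could be flattened into the "### Human / ### Assistant / ### End" prompt template that the inference script below expects (the field names "from" and "value" are an assumption about the dataset schema, not confirmed by this card):

```python
# Hypothetical sketch: flatten one ShareGPT-style conversation into a
# single training string. Field names ("from", "value") are assumed.

def format_example(conversations):
    """Join a list of {"from": ..., "value": ...} turns into one prompt string."""
    parts = []
    for turn in conversations:
        role = "Human" if turn["from"] == "human" else "Assistant"
        parts.append(f"### {role}: {turn['value']}")
    parts.append("### End")  # marker the inference script splits on
    return "\n".join(parts)

record = [
    {"from": "human", "value": "سلام"},
    {"from": "gpt", "value": "سلام! چطور می‌توانم کمک کنم؟"},
]
print(format_example(record))
```

This mirrors the `### Human: ...\n### Assistant:` prompt built in the run script below, so the same template is used at training and inference time.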

Training Hyperparameters

  • Training regime: fp32 (single precision; no mixed-precision autocast)
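If training is driven through the Hugging Face `Trainer`, the fp32 regime above corresponds to leaving half-precision off in `TrainingArguments` (a configuration sketch only; the output path and batch settings are illustrative, not from this card):

```python
from transformers import TrainingArguments

# Illustrative fp32 configuration: fp16/bf16 disabled means all forward and
# backward passes run in full single precision.
args = TrainingArguments(
    output_dir="./persian-slm",      # illustrative path
    fp16=False,                      # no float16 mixed precision
    bf16=False,                      # no bfloat16 mixed precision
    per_device_train_batch_size=8,   # illustrative value
    num_train_epochs=1,
)
```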

Evaluation

Not yet evaluated.

Testing Data, Factors & Metrics

ysnrfd sample Persian text (see Training Data above).

Testing Data

Not yet available.

Summary

The first Persian SLM trained from scratch.

  • Hardware Type: NVIDIA Tesla T4 (×1)
  • Hours used: ~1 hour
  • Cloud Provider: Google Colab

Model Architecture and Objective

YSNRFD Architecture

Hardware

NVIDIA Tesla T4

Software

Python and PyTorch; the model is implemented and trained from scratch.

Script to Run the Model

import re

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Path to the downloaded model directory
model_path = "./Path_To_Model"
print(f"Loading model from {model_path}...")

tokenizer = GPT2Tokenizer.from_pretrained(model_path)
model = GPT2LMHeadModel.from_pretrained(model_path)

# Use the GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")
model.to(device)
model.eval()

# Look up the "### End" marker the model was trained to emit;
# fall back to the default EOS token if it is not in the vocabulary
try:
    end_token_id = tokenizer.convert_tokens_to_ids("### End")
    if end_token_id == tokenizer.unk_token_id:
        end_token_id = None
except Exception:
    end_token_id = None

print("\nModel is ready for testing.")
print("Type 'exit' to quit.\n")

while True:
    user_input = input("\nYou: ").strip()

    if user_input.lower() == "exit":
        print("\nGoodbye! (ysnrfd)")
        break

    # Wrap the user message in the prompt format used during training
    prompt = f"### Human: {user_input}\n### Assistant:"

    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=512,
    ).to(device)

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=400,
            temperature=0.7,
            top_p=0.9,
            do_sample=True,
            pad_token_id=tokenizer.eos_token_id,
            eos_token_id=end_token_id or tokenizer.eos_token_id,
            repetition_penalty=1.05,
        )

    # Decode only the newly generated tokens (everything after the prompt);
    # slicing token IDs is more robust than slicing the decoded string
    assistant_response = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=False,
    ).strip()

    # Trim at the "### End" marker if the model emitted it
    if "### End" in assistant_response:
        assistant_response = assistant_response.split("### End")[0].strip()

    # Strip any leftover role tag
    assistant_response = re.sub(r'^###\s*Assistant:\s*', '', assistant_response)

    if assistant_response:
        print(f"\nBot: {assistant_response}")
    else:
        print("\nBot: Empty response, please try again.")
Model size: 19.4M params
Tensor type: F32
Format: Safetensors

Dataset used to train ysn-rfd/First_Persian_SLM_Big_Update_Version3_ysnrfd: see Training Data above.
