Spaces:

kingabzpro
/

Urdu-STT-with-GPT-OSS

Running

File size: 1,358 Bytes

f4b6d22
980c187
e4e6a48
 
 
f4b6d22
 
 
 
 
980c187
f4b6d22
 
980c187

---
title: Urdu STT with GPT-OSS
emoji: 🏎️
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: High-quality Urdu STT with Faster-Whisper and LLM.
---

# 🏎️ Faster Urdu ASR

This Space provides **state-of-the-art Urdu Automatic Speech Recognition (ASR)** built on [Faster-Whisper](https://github.com/guillaumekln/faster-whisper), fine-tuned for Urdu.  
In addition to transcription, it offers **optional polishing with Groq’s `openai/gpt-oss-120b` LLM** to improve Urdu grammar, punctuation, and fluency.

## ✨ Features
- 🎤 **Audio input** via upload or direct microphone recording  
- 📜 Multiple output formats: plain text, `.srt`, `.vtt`, `.json`  
- ⚡ Built on **Faster-Whisper (CT2)** for efficient GPU/CPU inference  
- 🤖 **Optional LLM polishing** with Groq API for natural, improved Urdu text  
- 🔑 Works with environment variable `GROQ_API_KEY` or via UI input  

## 🚀 Usage
1. Upload or record an Urdu audio file.  
2. Choose output format (`text`, `srt`, `vtt`, `json`).  
3. (Optional) Enable **LLM Polishing** to improve transcription quality.  
   - Provide a valid **`GROQ_API_KEY`** if not set in your environment.  
   - Adjust temperature and system prompt as needed.  
4. Click **Transcribe** and view/download your results.