File size: 1,358 Bytes
f4b6d22
980c187
e4e6a48
 
 
f4b6d22
 
 
 
 
980c187
f4b6d22
 
980c187
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
---
title: Urdu STT with GPT-OSS
emoji: 🏎️
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: High-quality Urdu STT with Faster-Whisper and LLM.
---

# 🏎️ Faster Urdu ASR

This Space provides **state-of-the-art Urdu Automatic Speech Recognition (ASR)** built on [Faster-Whisper](https://github.com/guillaumekln/faster-whisper), fine-tuned for Urdu.  
In addition to transcription, it offers **optional polishing with Groq’s `openai/gpt-oss-120b` LLM** to improve Urdu grammar, punctuation, and fluency.

## ✨ Features
- 🎀 **Audio input** via upload or direct microphone recording  
- πŸ“œ Multiple output formats: plain text, `.srt`, `.vtt`, `.json`  
- ⚑ Built on **Faster-Whisper (CT2)** for efficient GPU/CPU inference  
- πŸ€– **Optional LLM polishing** with Groq API for natural, improved Urdu text  
- πŸ”‘ Works with environment variable `GROQ_API_KEY` or via UI input  

## πŸš€ Usage
1. Upload or record an Urdu audio file.  
2. Choose output format (`text`, `srt`, `vtt`, `json`).  
3. (Optional) Enable **LLM Polishing** to improve transcription quality.  
   - Provide a valid **`GROQ_API_KEY`** if not set in your environment.  
   - Adjust temperature and system prompt as needed.  
4. Click **Transcribe** and view/download your results.