Spaces: NitinBot001/ttsfm


NitinBot001 committed on Aug 26

Commit 3ca5f72 (verified) · 1 parent: 9e9b02f

Upload 38 files


Files changed (38)

.env.example +31 -0
.github/ISSUE_TEMPLATE/bug_report.md +38 -38
.github/ISSUE_TEMPLATE/feature_request.md +20 -20
.github/workflows/docker-build.yml +77 -77
.github/workflows/release.yml +95 -90
.gitignore +159 -156
CHANGELOG.md +266 -191
Dockerfile +36 -34
LICENSE +21 -21
README.zh.md +792 -0
docs/websocket-streaming.md +244 -0
pyproject.toml +169 -161
requirements.txt +3 -3
ttsfm-web/app.py +988 -574
ttsfm-web/i18n.py +238 -0
ttsfm-web/requirements.txt +16 -9
ttsfm-web/run.py +15 -0
ttsfm-web/static/css/style.css +1399 -1390
ttsfm-web/static/js/i18n.js +221 -0
ttsfm-web/static/js/playground-enhanced-fixed.js +712 -0
ttsfm-web/static/js/playground.js +861 -745
ttsfm-web/static/js/websocket-tts.js +366 -0
ttsfm-web/templates/base.html +363 -356
ttsfm-web/templates/docs.html +734 -369
ttsfm-web/templates/index.html +156 -146
ttsfm-web/templates/playground.html +317 -295
ttsfm-web/templates/websocket_demo.html +390 -0
ttsfm-web/translations/en.json +224 -0
ttsfm-web/translations/zh.json +224 -0
ttsfm-web/websocket_handler.py +231 -0
ttsfm/__init__.py +193 -183
ttsfm/async_client.py +504 -464
ttsfm/cli.py +363 -362
ttsfm/client.py +530 -481
ttsfm/exceptions.py +243 -243
ttsfm/models.py +283 -283
ttsfm/utils.py +466 -421
uv.lock +0 -0

.env.example ADDED

@@ -0,0 +1,31 @@

+# TTSFM Environment Configuration
+# Server Configuration
+HOST=0.0.0.0
+PORT=7000
+# SSL Configuration
+VERIFY_SSL=true
+# Flask Configuration
+FLASK_ENV=production
+FLASK_APP=app.py
+DEBUG=false
+# API Key Protection (Optional)
+# Set REQUIRE_API_KEY=true to enable API key authentication
+REQUIRE_API_KEY=false
+# Set your API key here when protection is enabled
+# This key will be required for all TTS generation requests
+TTSFM_API_KEY=your-secret-api-key-here
+# Example usage:
+# 1. Set REQUIRE_API_KEY=true
+# 2. Set TTSFM_API_KEY to your desired secret key
+# 3. Restart the application
+# 4. All TTS requests will now require the API key in:
+#    - Authorization header (Bearer token) - OpenAI compatible
+#    - X-API-Key header
+#    - api_key query parameter
+#    - api_key in JSON body
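
Review note: the comment block in `.env.example` lists four places a client may supply the key. The sketch below shows one way a server could check them in that order. It is a minimal illustration under assumptions, not the code in `ttsfm-web/app.py`: `extract_api_key` and `is_authorized` are hypothetical names, and the request is modelled as plain mappings rather than a Flask request object.

```python
import hmac
from typing import Mapping, Optional


def extract_api_key(
    headers: Mapping[str, str],
    query: Mapping[str, str],
    body: Optional[Mapping[str, object]] = None,
) -> Optional[str]:
    """Return the first API key found, checking the four listed locations in order."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # 1. Authorization: Bearer <key> (the OpenAI-compatible form)
    auth = lowered.get("authorization", "")
    if auth.lower().startswith("bearer "):
        token = auth[len("bearer "):].strip()
        if token:
            return token

    # 2. X-API-Key header
    if lowered.get("x-api-key"):
        return lowered["x-api-key"]

    # 3. api_key query parameter
    if query.get("api_key"):
        return query["api_key"]

    # 4. api_key field in the JSON body
    if body:
        candidate = body.get("api_key")
        if isinstance(candidate, str) and candidate:
            return candidate
    return None


def is_authorized(provided: Optional[str], expected: str, required: bool) -> bool:
    """Mirror REQUIRE_API_KEY / TTSFM_API_KEY: skip the check when not required."""
    if not required:
        return True
    if not provided or not expected:
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(provided.encode(), expected.encode())
```

With `REQUIRE_API_KEY=false` the second function always returns `True`, which matches the default shipped in this file.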

.github/ISSUE_TEMPLATE/bug_report.md CHANGED

@@ -1,38 +1,38 @@
----
-name: Bug report
-about: Create a report to help us improve
-title: ''
-labels: ''
-assignees: ''
----
-**Describe the bug**
-A clear and concise description of what the bug is.
-**To Reproduce**
-Steps to reproduce the behavior:
-1. Go to '...'
-2. Click on '....'
-3. Scroll down to '....'
-4. See error
-**Expected behavior**
-A clear and concise description of what you expected to happen.
-**Screenshots**
-If applicable, add screenshots to help explain your problem.
-**Desktop (please complete the following information):**
- - OS: [e.g. iOS]
- - Browser [e.g. chrome, safari]
- - Version [e.g. 22]
-**Smartphone (please complete the following information):**
- - Device: [e.g. iPhone6]
- - OS: [e.g. iOS8.1]
- - Browser [e.g. stock browser, safari]
- - Version [e.g. 22]
-**Additional context**
-Add any other context about the problem here.

+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: ''
+assignees: ''
+---
+**Describe the bug**
+A clear and concise description of what the bug is.
+**To Reproduce**
+Steps to reproduce the behavior:
+1. Go to '...'
+2. Click on '....'
+3. Scroll down to '....'
+4. See error
+**Expected behavior**
+A clear and concise description of what you expected to happen.
+**Screenshots**
+If applicable, add screenshots to help explain your problem.
+**Desktop (please complete the following information):**
+ - OS: [e.g. iOS]
+ - Browser [e.g. chrome, safari]
+ - Version [e.g. 22]
+**Smartphone (please complete the following information):**
+ - Device: [e.g. iPhone6]
+ - OS: [e.g. iOS8.1]
+ - Browser [e.g. stock browser, safari]
+ - Version [e.g. 22]
+**Additional context**
+Add any other context about the problem here.

.github/ISSUE_TEMPLATE/feature_request.md CHANGED

@@ -1,20 +1,20 @@
----
-name: Feature request
-about: Suggest an idea for this project
-title: ''
-labels: ''
-assignees: ''
----
-**Is your feature request related to a problem? Please describe.**
-A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
-**Describe the solution you'd like**
-A clear and concise description of what you want to happen.
-**Describe alternatives you've considered**
-A clear and concise description of any alternative solutions or features you've considered.
-**Additional context**
-Add any other context or screenshots about the feature request here.

+---
+name: Feature request
+about: Suggest an idea for this project
+title: ''
+labels: ''
+assignees: ''
+---
+**Is your feature request related to a problem? Please describe.**
+A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+**Describe the solution you'd like**
+A clear and concise description of what you want to happen.
+**Describe alternatives you've considered**
+A clear and concise description of any alternative solutions or features you've considered.
+**Additional context**
+Add any other context or screenshots about the feature request here.

.github/workflows/docker-build.yml CHANGED

@@ -1,78 +1,78 @@
-name: Docker Build and Push
-on:
-  release:
-    types: [published]
-env:
-  REGISTRY_DOCKERHUB: docker.io
-  REGISTRY_GHCR: ghcr.io
-  IMAGE_NAME: ${{ github.repository }}
-jobs:
-  build-and-push:
-    runs-on: ubuntu-latest
-    permissions:
-      contents: read
-      packages: write
-    steps:
-      - name: Checkout repository
-        uses: actions/checkout@v4
-      - name: Set up QEMU
-        uses: docker/setup-qemu-action@v3
-      - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v3
-        with:
-          driver: docker-container
-      - name: Login to Docker Hub
-        uses: docker/login-action@v3
-        with:
-          username: ${{ secrets.DOCKERHUB_USERNAME }}
-          password: ${{ secrets.DOCKERHUB_TOKEN }}
-      - name: Login to GitHub Container Registry
-        uses: docker/login-action@v3
-        with:
-          registry: ${{ env.REGISTRY_GHCR }}
-          username: ${{ github.actor }}
-          password: ${{ secrets.GITHUB_TOKEN }}
-      - name: Extract metadata
-        id: meta
-        uses: docker/metadata-action@v5
-        with:
-          images: |
-            ${{ secrets.DOCKERHUB_USERNAME }}/ttsfm
-            ${{ env.REGISTRY_GHCR }}/${{ env.IMAGE_NAME }}
-          tags: |
-            type=ref,event=tag
-            type=semver,pattern={{version}}
-            type=semver,pattern={{major}}.{{minor}}
-            type=semver,pattern={{major}}
-            type=raw,value=latest
-          labels: |
-            org.opencontainers.image.source=${{ github.repositoryUrl }}
-            org.opencontainers.image.description=Free TTS API server compatible with OpenAI's TTS API format using openai.fm
-            org.opencontainers.image.licenses=MIT
-            org.opencontainers.image.title=TTSFM - Free TTS API Server
-            org.opencontainers.image.vendor=dbcccc
-      - name: Build and push
-        id: build-and-push
-        uses: docker/build-push-action@v5
-        with:
-          context: .
-          platforms: linux/amd64,linux/arm64
-          push: true
-          tags: ${{ steps.meta.outputs.tags }}
-          labels: ${{ steps.meta.outputs.labels }}
-          cache-from: type=gha
-          cache-to: type=gha,mode=max
-      - name: Show image info
-        run: |
-          echo "Pushed tags: ${{ steps.meta.outputs.tags }}"
           echo "Image digest: ${{ steps.build-and-push.outputs.digest }}"

+name: Docker Build and Push
+on:
+  release:
+    types: [published]
+env:
+  REGISTRY_DOCKERHUB: docker.io
+  REGISTRY_GHCR: ghcr.io
+  IMAGE_NAME: ${{ github.repository }}
+jobs:
+  build-and-push:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+      - name: Set up QEMU
+        uses: docker/setup-qemu-action@v3
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+        with:
+          driver: docker-container
+      - name: Login to Docker Hub
+        uses: docker/login-action@v3
+        with:
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_TOKEN }}
+      - name: Login to GitHub Container Registry
+        uses: docker/login-action@v3
+        with:
+          registry: ${{ env.REGISTRY_GHCR }}
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+      - name: Extract metadata
+        id: meta
+        uses: docker/metadata-action@v5
+        with:
+          images: |
+            ${{ secrets.DOCKERHUB_USERNAME }}/ttsfm
+            ${{ env.REGISTRY_GHCR }}/${{ env.IMAGE_NAME }}
+          tags: |
+            type=ref,event=tag
+            type=semver,pattern={{version}}
+            type=semver,pattern={{major}}.{{minor}}
+            type=semver,pattern={{major}}
+            type=raw,value=latest
+          labels: |
+            org.opencontainers.image.source=${{ github.repositoryUrl }}
+            org.opencontainers.image.description=Free TTS API server compatible with OpenAI's TTS API format using openai.fm
+            org.opencontainers.image.licenses=MIT
+            org.opencontainers.image.title=TTSFM - Free TTS API Server
+            org.opencontainers.image.vendor=dbcccc
+      - name: Build and push
+        id: build-and-push
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          platforms: linux/amd64,linux/arm64
+          push: true
+          tags: ${{ steps.meta.outputs.tags }}
+          labels: ${{ steps.meta.outputs.labels }}
+          cache-from: type=gha
+          cache-to: type=gha,mode=max
+      - name: Show image info
+        run: |
+          echo "Pushed tags: ${{ steps.meta.outputs.tags }}"
           echo "Image digest: ${{ steps.build-and-push.outputs.digest }}"

.github/workflows/release.yml CHANGED

@@ -1,90 +1,95 @@
-name: Release and Publish
-on:
-  push:
-    tags:
-      - 'v*'  # Triggers on version tags like v1.0.0, v3.0.1, etc.
-permissions:
-  contents: write
-  id-token: write
-jobs:
-  release-and-publish:
-    runs-on: ubuntu-latest
-    steps:
-    - uses: actions/checkout@v4
-    - name: Set up Python
-      uses: actions/setup-python@v4
-      with:
-        python-version: '3.11'
-    - name: Install dependencies
-      run: |
-        python -m pip install --upgrade pip
-        pip install build twine
-    - name: Test package import
-      run: |
-        pip install -e .
-        python -c "import ttsfm; print(f'✅ TTSFM imported successfully')"
-        python -c "from ttsfm import TTSClient; print('✅ TTSClient imported successfully')"
-    - name: Build package
-      run: |
-        python -m build
-        echo "📦 Package built successfully"
-        ls -la dist/
-    - name: Check package
-      run: |
-        twine check dist/*
-        echo "✅ Package validation passed"
-    - name: Publish to PyPI
-      uses: pypa/gh-action-pypi-publish@release/v1
-      with:
-        password: ${{ secrets.PYPI_API_TOKEN }}
-    - name: Create GitHub Release
-      uses: softprops/action-gh-release@v1
-      with:
-        body: |
-          ## 🎉 TTSFM ${{ github.ref_name }}
-          New release of TTSFM - Free Text-to-Speech API with OpenAI compatibility.
-          ### 📦 Installation
-          ```bash
-          pip install ttsfm==${{ github.ref_name }}
-          ```
-          ### 🚀 Quick Start
-          ```python
-          from ttsfm import TTSClient
-          client = TTSClient()
-          response = client.generate_speech("Hello from TTSFM!")
-          response.save_to_file("hello")
-          ```
-          ### 🐳 Docker
-          ```bash
-          docker run -p 8000:8000 dbcccc/ttsfm:latest
-          ```
-          ### ✨ Features
-          - 🆓 Completely free (uses openai.fm service)
-          - 🎯 OpenAI-compatible API
-          - 🗣️ 11 voices available
-          - 🎵 6 audio formats (MP3, WAV, OPUS, AAC, FLAC, PCM)
-          - ⚡ Async and sync clients
-          - 🌐 Web interface included
-          - 🔧 CLI tool available
-          ### 📚 Documentation
-          See [README](https://github.com/dbccccccc/ttsfm#readme) for full documentation.
-        draft: false
-        prerelease: false

+name: Release and Publish
+on:
+  push:
+    tags:
+      - 'v*'  # Triggers on version tags like v1.0.0, v3.0.1, etc.
+permissions:
+  contents: write
+  id-token: write
+jobs:
+  release-and-publish:
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v4
+    - name: Set up Python
+      uses: actions/setup-python@v4
+      with:
+        python-version: '3.11'
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install build twine
+    - name: Test package import
+      run: |
+        pip install -e .
+        python -c "import ttsfm; print(f'✅ TTSFM imported successfully')"
+        python -c "from ttsfm import TTSClient; print('✅ TTSClient imported successfully')"
+    - name: Build package
+      run: |
+        python -m build
+        echo "📦 Package built successfully"
+        ls -la dist/
+    - name: Check package
+      run: |
+        twine check dist/*
+        echo "✅ Package validation passed"
+    - name: Publish to PyPI
+      uses: pypa/gh-action-pypi-publish@release/v1
+      with:
+        attestations: true
+        skip-existing: true
+    - name: Extract version (strip leading v)
+      id: ver
+      run: echo "version=${GITHUB_REF_NAME#v}" >> "$GITHUB_OUTPUT"
+    - name: Create GitHub Release
+      uses: softprops/action-gh-release@v1
+      with:
+        body: |
+          ## 🎉 TTSFM ${{ github.ref_name }}
+          New release of TTSFM - Free Text-to-Speech API with OpenAI compatibility.
+          ### 📦 Installation
+          ```bash
+          pip install ttsfm==${{ steps.ver.outputs.version }}
+          ```
+          ### 🚀 Quick Start
+          ```python
+          from ttsfm import TTSClient
+          client = TTSClient()
+          response = client.generate_speech("Hello from TTSFM!")
+          response.save_to_file("hello")
+          ```
+          ### 🐳 Docker
+          ```bash
+          docker run -p 8000:8000 dbcccc/ttsfm:latest
+          ```
+          ### ✨ Features
+          - 🆓 Completely free (uses openai.fm service)
+          - 🎯 OpenAI-compatible API
+          - 🗣️ 11 voices available
+          - 🎵 6 audio formats (MP3, WAV, OPUS, AAC, FLAC, PCM)
+          - ⚡ Async and sync clients
+          - 🌐 Web interface included
+          - 🔧 CLI tool available
+          ### 📚 Documentation
+          See [README](https://github.com/dbccccccc/ttsfm#readme) for full documentation.
+        draft: false
+        prerelease: false
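
Review note: the new "Extract version" step uses the shell expansion `${GITHUB_REF_NAME#v}` so that the install command in the release body shows the bare version number instead of the tag name. A rough Python analogue, for readers less familiar with that expansion (`strip_leading_v` is a hypothetical helper and is not part of the workflow):

```python
def strip_leading_v(ref_name: str) -> str:
    """Python analogue of the shell expansion ${GITHUB_REF_NAME#v}."""
    # str.removeprefix (Python 3.9+) drops the prefix at most once and
    # returns the string unchanged when the prefix is absent, which is
    # the same behaviour as the `#v` pattern removal in the workflow.
    return ref_name.removeprefix("v")


if __name__ == "__main__":
    for tag in ("v3.2.3", "3.2.3"):
        print(f"pip install ttsfm=={strip_leading_v(tag)}")
```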

.gitignore CHANGED

@@ -1,156 +1,159 @@
-# Python
-__pycache__/
-*.py[cod]
-*$py.class
-*.so
-.Python
-build/
-develop-eggs/
-dist/
-downloads/
-eggs/
-.eggs/
-lib/
-lib64/
-parts/
-sdist/
-var/
-wheels/
-*.egg-info/
-.installed.cfg
-*.egg
-MANIFEST
-# Virtual Environment
-venv/
-env/
-ENV/
-.venv/
-# Environment variables
-.env
-.env.local
-.env.production
-# IDE
-.idea/
-.vscode/
-*.swp
-*.swo
-.spyderproject
-.spyproject
-# OS
-.DS_Store
-.DS_Store?
-._*
-.Spotlight-V100
-.Trashes
-ehthumbs.db
-Thumbs.db
-# Generated audio files (for testing)
-*.mp3
-*.wav
-*.opus
-*.aac
-*.flac
-*.pcm
-test_output.*
-output.*
-hello.*
-speech.*
-# Logs
-*.log
-logs/
-.pytest_cache/
-# Temporary files
-tmp/
-temp/
-.tmp/
-# Coverage reports
-htmlcov/
-.coverage
-.coverage.*
-coverage.xml
-*.cover
-.hypothesis/
-# Documentation builds
-docs/_build/
-site/
-# Package builds
-*.tar.gz
-*.whl
-dist/
-build/
-# MyPy
-.mypy_cache/
-.dmypy.json
-dmypy.json
-# Jupyter Notebook
-.ipynb_checkpoints
-# pyenv
-.python-version
-# pipenv
-Pipfile.lock
-# PEP 582
-__pypackages__/
-# Celery
-celerybeat-schedule
-celerybeat.pid
-# SageMath parsed files
-*.sage.py
-# Rope project settings
-.ropeproject
-# mkdocs documentation
-/site
-# Pyre type checker
-.pyre/
-# Additional exclusions for GitHub
-# API Keys and Secrets
-config.json
-secrets.json
-.secrets
-api_keys.txt
-# Database files
-*.db
-*.sqlite
-*.sqlite3
-# Backup files
-*.bak
-*.backup
-*~
-# Node.js (if using any JS tools)
-node_modules/
-npm-debug.log*
-yarn-debug.log*
-yarn-error.log*
-# Docker
-.dockerignore
-Dockerfile.dev
-docker-compose.override.yml
-# Local configuration
-local_settings.py
-local_config.py

+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# Virtual Environment
+venv/
+env/
+ENV/
+.venv/
+# Environment variables
+.env
+.env.local
+.env.production
+# IDE
+.idea/
+.vscode/
+*.swp
+*.swo
+.spyderproject
+.spyproject
+# OS
+.DS_Store
+.DS_Store?
+._*
+.Spotlight-V100
+.Trashes
+ehthumbs.db
+Thumbs.db
+# Generated audio files (for testing)
+*.mp3
+*.wav
+*.opus
+*.aac
+*.flac
+*.pcm
+test_output.*
+output.*
+hello.*
+speech.*
+# Logs
+*.log
+logs/
+.pytest_cache/
+# Temporary files
+tmp/
+temp/
+.tmp/
+# Coverage reports
+htmlcov/
+.coverage
+.coverage.*
+coverage.xml
+*.cover
+.hypothesis/
+# Documentation builds
+docs/_build/
+site/
+# Package builds
+*.tar.gz
+*.whl
+dist/
+build/
+# MyPy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Jupyter Notebook
+.ipynb_checkpoints
+# pyenv
+.python-version
+# pipenv
+Pipfile.lock
+# PEP 582
+__pypackages__/
+# Celery
+celerybeat-schedule
+celerybeat.pid
+# SageMath parsed files
+*.sage.py
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# Pyre type checker
+.pyre/
+# Additional exclusions for GitHub
+# API Keys and Secrets
+config.json
+secrets.json
+.secrets
+api_keys.txt
+# Database files
+*.db
+*.sqlite
+*.sqlite3
+# Backup files
+*.bak
+*.backup
+*~
+# Node.js (if using any JS tools)
+node_modules/
+npm-debug.log*
+yarn-debug.log*
+yarn-error.log*
+# Docker
+.dockerignore
+Dockerfile.dev
+docker-compose.override.yml
+# Local configuration
+local_settings.py
+local_config.py
+# Claude
+.claude/

CHANGELOG.md CHANGED

@@ -1,191 +1,266 @@
-# Changelog
-All notable changes to this project will be documented in this file.
-The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
-and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
-## [3.1.0] - 2024-12-19
-### 🔧 Format Support Improvements
-This release focuses on fixing audio format handling and improving format delivery optimization.
-### ✨ Added
-- **Smart Header Selection**: Intelligent HTTP header selection to optimize format delivery from openai.fm service
-- **Format Mapping Functions**: Helper functions for better format handling and optimization
-- **Enhanced Web Interface**: Improved format selection with detailed descriptions for each format
-- **Comprehensive Format Documentation**: Updated README and documentation with complete format information
-### 🔄 Changed
-- **File Naming Logic**: Files are now saved with extensions based on the actual returned format, not the requested format
-- **Enhanced Logging**: Added format-specific log messages for better debugging
-- **Web API Enhancement**: `/api/formats` endpoint now provides detailed information about all supported formats
-- **Documentation Updates**: README and package documentation now include comprehensive format guides
-### 🐛 Fixed
-- **MAJOR FIX**: Resolved file naming issue where files were saved with incorrect double extensions (e.g., `test.wav.mp3`, `test.opus.wav`)
-- **Correct File Extensions**: Files now save with proper single extensions based on actual audio format (e.g., `test.mp3`, `test.wav`)
-- **Format Optimization**: Improved format delivery through smart request optimization
-- **Format Handling**: Better handling of all supported audio formats
-### 📝 Technical Details
-- **Format Optimization**: Smart request optimization to deliver the best quality for each format
-- **Backward Compatibility**: Existing code continues to work unchanged
-- **Enhanced Format Support**: Improved support for all 6 audio formats (MP3, WAV, OPUS, AAC, FLAC, PCM)
-## [3.0.0] - 2025-06-06
-### 🎉 First Python Package Release
-This is the first release of TTSFM as an installable Python package. Previous versions (v1.x and v2.x) were service-only releases that provided the API server but not a pip-installable package.
-### ✨ Added
-- **Complete Package Restructure**: Modern Python package structure with proper typing
-- **Async Support**: Full asynchronous client implementation with `asyncio`
-- **OpenAI API Compatibility**: Drop-in replacement for OpenAI TTS API
-- **Type Hints**: Complete type annotation support throughout the codebase
-- **CLI Interface**: Command-line tool for easy TTS generation
-- **Web Application**: Optional Flask-based web interface
-- **Docker Support**: Multi-architecture Docker images (linux/amd64, linux/arm64)
-- **Comprehensive Error Handling**: Detailed exception hierarchy
-- **Multiple Audio Formats**: Support for MP3, WAV, FLAC, and more
-- **Voice Options**: Multiple voice models (alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, shimmer)
-- **Text Processing**: Automatic text length validation and splitting
-- **Rate Limiting**: Built-in rate limiting and retry mechanisms
-- **Configuration**: Environment variable and configuration file support
-### 🔧 Technical Improvements
-- **Modern Build System**: Using `pyproject.toml` with setuptools
-- **GitHub Actions**: Automated Docker builds and PyPI publishing
-- **Development Tools**: Pre-commit hooks, linting, testing setup
-- **Documentation**: Comprehensive README and inline documentation
-- **Package Management**: Proper dependency management with optional extras
-### 🌐 API Changes
-- **Breaking**: Complete API redesign for better usability
-- **OpenAI Compatible**: `/v1/audio/speech` endpoint compatibility
-- **RESTful Design**: Clean REST API design
-- **Health Checks**: Built-in health check endpoints
-- **CORS Support**: Cross-origin resource sharing enabled
-### 📦 Installation Options
-```bash
-# Basic installation
-pip install ttsfm
-# With web application support
-pip install ttsfm[web]
-# With development tools
-pip install ttsfm[dev]
-# Docker
-docker run -p 8000:8000 ghcr.io/dbccccccc/ttsfm:latest
-```
-### 🚀 Quick Start
-```python
-from ttsfm import TTSClient, Voice
-client = TTSClient()
-response = client.generate_speech(
-    text="Hello! This is TTSFM v3.0.0",
-    voice=Voice.CORAL
-)
-with open("speech.mp3", "wb") as f:
-    f.write(response.audio_data)
-```
-### 📦 Package vs Service History
-**Important Note**: This v3.0.0 is the first release of TTSFM as a Python package available on PyPI. Previous versions (v1.x and v2.x) were service/API server releases only and were not available as installable packages.
-- **v1.x - v2.x**: Service releases (API server only, not pip-installable)
-- **v3.0.0+**: Full Python package releases (pip-installable with service capabilities)
-### 🐛 Bug Fixes
-- Fixed Docker build issues with dependency resolution
-- Improved error handling and user feedback
-- Better handling of long text inputs
-- Enhanced stability and performance
-### 📚 Documentation
-- Complete API documentation
-- Usage examples and tutorials
-- Docker deployment guide
-- Development setup instructions
----
-## Previous Service Releases (Not Available as Python Packages)
-The following versions were service/API server releases only and were not available as pip-installable packages:
-### [2.0.0-alpha9] - 2025-04-09
-- Service improvements (alpha release)
-### [2.0.0-alpha8] - 2025-04-09
-- Service improvements (alpha release)
-### [2.0.0-alpha7] - 2025-04-07
-- Service improvements (alpha release)
-### [2.0.0-alpha6] - 2025-04-07
-- Service improvements (alpha release)
-### [2.0.0-alpha5] - 2025-04-07
-- Service improvements (alpha release)
-### [2.0.0-alpha4] - 2025-04-07
-- Service improvements (alpha release)
-### [2.0.0-alpha3] - 2025-04-07
-- Service improvements (alpha release)
-### [2.0.0-alpha2] - 2025-04-07
-- Service improvements (alpha release)
-### [2.0.0-alpha1] - 2025-04-07
-- Alpha release (DO NOT USE)
-### [1.3.0] - 2025-03-28
-- Support for additional audio file formats in the API
-- Alignment with formats supported by the official API
-### [1.2.2] - 2025-03-28
-- Fixed Docker support
-### [1.2.1] - 2025-03-28
-- Color change for indicator for status
-- Voice preview on webpage for each voice
-### [1.2.0] - 2025-03-26
-- Enhanced stability and availability by implementing advanced request handling mechanisms
-- Removed the proxy pool
-### [1.1.2] - 2025-03-26
-- Version display on webpage
-- Last version of 1.1.x
-### [1.1.1] - 2025-03-26
-- Build fixes
-### [1.1.0] - 2025-03-26
-- Project restructuring for better future development experiences
-- Added .env settings
-### [1.0.0] - 2025-03-26
-- First service release

+# Changelog
+All notable changes to this project will be documented in this file.
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [3.2.3] - 2025-06-27
+### 🔄 Enhanced OpenAI API Compatibility
+This release consolidates the OpenAI-compatible API endpoints and introduces intelligent auto-combine functionality.
+### ✨ Added
+- **Auto-Combine Parameter**: New optional `auto_combine` parameter in `/v1/audio/speech` endpoint (default: `true`)
+- **Intelligent Text Handling**: Automatically detects long text and combines audio chunks when `auto_combine=true`
+- **Enhanced Error Messages**: Better error handling for long text when auto-combine is disabled
+- **Response Headers**: Added `X-Auto-Combine` and `X-Chunks-Combined` headers for transparency
+### 🔄 Changed
+- **Unified Endpoint**: Combined `/v1/audio/speech` and `/v1/audio/speech-combined` into single endpoint
+- **Backward Compatibility**: Maintains full OpenAI API compatibility while adding TTSFM-specific features
+- **Default Behavior**: Long text is now automatically split and combined by default (can be disabled)
+### 🗑️ Removed
+- **Deprecated Endpoint**: Removed `/v1/audio/speech-combined` endpoint (functionality moved to main endpoint)
+- **Legacy Web Options**: Removed confusing batch processing options from web interface for cleaner UX
+- **Complex UI Elements**: Simplified playground interface to focus on auto-combine
+### 🧹 Streamlined Web Experience
+- **User-Focused Design**: Web interface now emphasizes auto-combine as the primary approach
+- **Developer Features Preserved**: All advanced functionality remains in Python package
+- **Clear Separation**: Web for users, Python package for developers
+### 📋 Migration Guide
+- **No Breaking Changes**: Existing API calls continue to work unchanged
+- **Long Text**: Now automatically handled by default - no need to use separate endpoint
+- **Disable Auto-Combine**: Add `"auto_combine": false` to request body to get original behavior
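
Review note: the migration guide above says `auto_combine` goes in the request body of `/v1/audio/speech`. The sketch below builds such a request without sending it. Only the endpoint path and the `auto_combine` field come from the changelog. The other body fields follow the OpenAI speech API shape, and the base URL, the model name, and the helper name `build_speech_request` are assumptions made for illustration.

```python
import json
import urllib.request


def build_speech_request(
    text: str,
    voice: str = "alloy",
    auto_combine: bool = True,
    base_url: str = "http://localhost:8000",  # assumed local server address
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/audio/speech."""
    payload = {
        "model": "gpt-4o-mini-tts",  # placeholder; the server may ignore or require another value
        "input": text,
        "voice": voice,
        "response_format": "mp3",
        # Set to False to restore the pre-3.2.3 behaviour for long text.
        "auto_combine": auto_combine,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_speech_request("Hello from TTSFM!", auto_combine=False)
    print(request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it. According to the changelog, the response should then carry the `X-Auto-Combine` and `X-Chunks-Combined` headers.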
+## [3.2.2] - 2025-06-26
+### 🎵 Combined Audio Functionality
+This release introduces the revolutionary combined audio feature that allows generating single, seamless audio files from long text content.
+### ✨ Added
+- **Combined Audio Endpoints**: New `/api/generate-combined` and `/v1/audio/speech-combined` endpoints
+- **Intelligent Text Splitting**: Smart algorithm that splits text at sentence boundaries, then word boundaries, preserving natural speech flow
+- **Seamless Audio Combination**: Professional audio processing to merge chunks into single continuous files
+- **OpenAI Compatibility**: Full OpenAI TTS API compatibility for combined audio generation
+- **Advanced Fallback System**: Multiple fallback mechanisms for audio combination (PyDub → WAV concatenation → raw concatenation)
+- **Rich Metadata**: Response headers with chunk count, file size, and processing information
+- **Comprehensive Testing**: Full test suite with unit tests, integration tests, and performance benchmarks
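
Review note: the "Intelligent Text Splitting" entry describes splitting at sentence boundaries first and word boundaries second. The sketch below is one possible reading of that description, not the algorithm in `ttsfm/utils.py`. The function name `split_text`, the sentence regex, and the hard cut for a single oversized word are assumptions. The 4096 default mirrors the per-request limit mentioned in this changelog entry.

```python
import re
from typing import List


def split_text(text: str, max_length: int = 4096) -> List[str]:
    """Split text into chunks of at most max_length characters."""
    chunks: List[str] = []
    current = ""

    def flush() -> None:
        """Move the chunk being built into the result list."""
        nonlocal current
        if current:
            chunks.append(current)
            current = ""

    def add(piece: str) -> None:
        """Append piece to the current chunk, starting a new chunk if it would overflow."""
        nonlocal current
        candidate = f"{current} {piece}" if current else piece
        if len(candidate) <= max_length:
            current = candidate
        else:
            flush()
            current = piece

    # First pass: a sentence ends at '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not sentence:
            continue
        if len(sentence) <= max_length:
            add(sentence)
            continue
        # Fallback: the sentence alone is too long, so split it on whitespace.
        flush()
        for word in sentence.split():
            # Last resort: hard-cut a single word that exceeds the limit.
            while len(word) > max_length:
                flush()
                chunks.append(word[:max_length])
                word = word[max_length:]
            add(word)
        flush()
    flush()
    return chunks
```

Each chunk could then be synthesized separately and the resulting audio concatenated, which is the combination step the following entries describe.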
+### 🔄 Changed
+- **Extended Character Limits**: No longer limited to 4096 characters per request
+- **Enhanced Web Interface**: Updated documentation with combined audio endpoint information
+- **Improved Error Handling**: Better validation and error messages for long text processing
+### 🛠️ Technical Features
+- **Concurrent Processing**: Parallel chunk processing for faster generation
+- **Memory Optimization**: Efficient memory usage for large text processing
+- **Format Support**: Works with all supported audio formats (MP3, WAV, OPUS, AAC, FLAC, PCM)
+- **Performance Monitoring**: Built-in performance tracking and optimization
+- **Unicode Support**: Full Unicode text handling for international content
+### 📋 Use Cases
+- **Long Articles**: Convert blog posts and articles to single audio files
+- **Audiobooks**: Generate chapters as continuous audio
+- **Educational Content**: Transform learning materials to audio format
+- **Accessibility**: Enhanced support for visually impaired users
+- **Podcast Creation**: Convert scripts to professional audio content
+## [3.1.0] - 2024-12-19
+### 🔧 Format Support Improvements
+This release focuses on fixing audio format handling and improving format delivery optimization.
+### ✨ Added
+- **Smart Header Selection**: Intelligent HTTP header selection to optimize format delivery from openai.fm service
+- **Format Mapping Functions**: Helper functions for better format handling and optimization
+- **Enhanced Web Interface**: Improved format selection with detailed descriptions for each format
+- **Comprehensive Format Documentation**: Updated README and documentation with complete format information
+### 🔄 Changed
+- **File Naming Logic**: Files are now saved with extensions based on the actual returned format, not the requested format
+- **Enhanced Logging**: Added format-specific log messages for better debugging
+- **Web API Enhancement**: `/api/formats` endpoint now provides detailed information about all supported formats
+- **Documentation Updates**: README and package documentation now include comprehensive format guides
+### 🐛 Fixed
+- **MAJOR FIX**: Resolved file naming issue where files were saved with incorrect double extensions (e.g., `test.wav.mp3`, `test.opus.wav`)
+- **Correct File Extensions**: Files now save with proper single extensions based on actual audio format (e.g., `test.mp3`, `test.wav`)
+- **Format Optimization**: Improved format delivery through smart request optimization
+- **Format Handling**: Better handling of all supported audio formats
+### 📝 Technical Details
+- **Format Optimization**: Smart request optimization to deliver the best quality for each format
+- **Backward Compatibility**: Existing code continues to work unchanged
+- **Enhanced Format Support**: Improved support for all 6 audio formats (MP3, WAV, OPUS, AAC, FLAC, PCM)
+## [3.0.0] - 2025-06-06
+### 🎉 First Python Package Release
+This is the first release of TTSFM as an installable Python package. Previous versions (v1.x and v2.x) were service-only releases that provided the API server but not a pip-installable package.
+### ✨ Added
+- **Complete Package Restructure**: Modern Python package structure with proper typing
+- **Async Support**: Full asynchronous client implementation with `asyncio`
+- **OpenAI API Compatibility**: Drop-in replacement for OpenAI TTS API
+- **Type Hints**: Complete type annotation support throughout the codebase
+- **CLI Interface**: Command-line tool for easy TTS generation
+- **Web Application**: Optional Flask-based web interface
+- **Docker Support**: Multi-architecture Docker images (linux/amd64, linux/arm64)
+- **Comprehensive Error Handling**: Detailed exception hierarchy
+- **Multiple Audio Formats**: Support for MP3, WAV, FLAC, and more
+- **Voice Options**: Multiple voice models (alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, shimmer)
+- **Text Processing**: Automatic text length validation and splitting
+- **Rate Limiting**: Built-in rate limiting and retry mechanisms
+- **Configuration**: Environment variable and configuration file support
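The "retry mechanisms" bullet above does not say how retries are scheduled. A common approach is exponential backoff with jitter; the sketch below shows that technique under assumed names (`with_retries`, `base_delay`, `retry_on`) and is not taken from the TTSFM source.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on transient errors with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # The delay doubles each attempt; jitter spreads out clients
            # that would otherwise retry at the same instant.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function testable without real waiting.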
+### 🔧 Technical Improvements
+- **Modern Build System**: Using `pyproject.toml` with setuptools
+- **GitHub Actions**: Automated Docker builds and PyPI publishing
+- **Development Tools**: Pre-commit hooks, linting, testing setup
+- **Documentation**: Comprehensive README and inline documentation
+- **Package Management**: Proper dependency management with optional extras
+### 🌐 API Changes
+- **Breaking**: Complete API redesign for better usability
+- **OpenAI Compatible**: `/v1/audio/speech` endpoint compatibility
+- **RESTful Design**: Clean REST API design
+- **Health Checks**: Built-in health check endpoints
+- **CORS Support**: Cross-origin resource sharing enabled
+### 📦 Installation Options
+```bash
+# Basic installation
+pip install ttsfm
+# With web application support
+pip install ttsfm[web]
+# With development tools
+pip install ttsfm[dev]
+# Docker
+docker run -p 8000:8000 ghcr.io/dbccccccc/ttsfm:latest
+```
+### 🚀 Quick Start
+```python
+from ttsfm import TTSClient, Voice
+client = TTSClient()
+response = client.generate_speech(
+    text="Hello! This is TTSFM v3.0.0",
+    voice=Voice.CORAL
+)
+with open("speech.mp3", "wb") as f:
+    f.write(response.audio_data)
+```
+### 📦 Package vs Service History
+**Important Note**: This v3.0.0 is the first release of TTSFM as a Python package available on PyPI. Previous versions (v1.x and v2.x) were service/API server releases only and were not available as installable packages.
+- **v1.x - v2.x**: Service releases (API server only, not pip-installable)
+- **v3.0.0+**: Full Python package releases (pip-installable with service capabilities)
+### 🐛 Bug Fixes
+- Fixed Docker build issues with dependency resolution
+- Improved error handling and user feedback
+- Better handling of long text inputs
+- Enhanced stability and performance
+### 📚 Documentation
+- Complete API documentation
+- Usage examples and tutorials
+- Docker deployment guide
+- Development setup instructions
+---
+## Previous Service Releases (Not Available as Python Packages)
+The following versions were service/API server releases only and were not available as pip-installable packages:
+### [2.0.0-alpha9] - 2025-04-09
+- Service improvements (alpha release)
+### [2.0.0-alpha8] - 2025-04-09
+- Service improvements (alpha release)
+### [2.0.0-alpha7] - 2025-04-07
+- Service improvements (alpha release)
+### [2.0.0-alpha6] - 2025-04-07
+- Service improvements (alpha release)
+### [2.0.0-alpha5] - 2025-04-07
+- Service improvements (alpha release)
+### [2.0.0-alpha4] - 2025-04-07
+- Service improvements (alpha release)
+### [2.0.0-alpha3] - 2025-04-07
+- Service improvements (alpha release)
+### [2.0.0-alpha2] - 2025-04-07
+- Service improvements (alpha release)
+### [2.0.0-alpha1] - 2025-04-07
+- Alpha release (DO NOT USE)
+### [1.3.0] - 2025-03-28
+- Support for additional audio file formats in the API
+- Alignment with formats supported by the official API
+### [1.2.2] - 2025-03-28
+- Fixed Docker support
+### [1.2.1] - 2025-03-28
+- Changed the color of the status indicator
+- Voice preview on webpage for each voice
+### [1.2.0] - 2025-03-26
+- Enhanced stability and availability by implementing advanced request handling mechanisms
+- Removed the proxy pool
+### [1.1.2] - 2025-03-26
+- Version display on webpage
+- Last version of 1.1.x
+### [1.1.1] - 2025-03-26
+- Build fixes
+### [1.1.0] - 2025-03-26
+- Restructured the project to make future development easier
+- Added .env settings
+### [1.0.0] - 2025-03-26
+- First service release

Dockerfile CHANGED Viewed

@@ -1,34 +1,36 @@
-FROM python:3.11-slim
-WORKDIR /app
-ENV PYTHONDONTWRITEBYTECODE=1 \
-    PYTHONUNBUFFERED=1 \
-    PORT=8000
-# Install dependencies
-RUN apt-get update && apt-get install -y gcc curl && rm -rf /var/lib/apt/lists/*
-# Copy source code first
-COPY ttsfm/ ./ttsfm/
-COPY ttsfm-web/ ./ttsfm-web/
-COPY pyproject.toml ./
-COPY requirements.txt ./
-# Install the TTSFM package with web dependencies
-RUN pip install --no-cache-dir -e .[web]
-# Install additional web dependencies
-RUN pip install --no-cache-dir python-dotenv>=1.0.0
-# Create non-root user
-RUN useradd --create-home ttsfm && chown -R ttsfm:ttsfm /app
-USER ttsfm
-EXPOSE 7860
-HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
-    CMD curl -f http://localhost:8000/api/health || exit 1
-WORKDIR /app/ttsfm-web
-CMD ["python", "-m", "waitress", "--host=0.0.0.0", "--port=7860", "app:app"]

+FROM python:3.11-slim
+WORKDIR /app
+ENV PYTHONDONTWRITEBYTECODE=1 \
+    PYTHONUNBUFFERED=1 \
+    PORT=8000
+# Install dependencies
+RUN apt-get update && apt-get install -y gcc curl git && rm -rf /var/lib/apt/lists/*
+# Copy source code first
+COPY ttsfm/ ./ttsfm/
+COPY ttsfm-web/ ./ttsfm-web/
+COPY pyproject.toml ./
+COPY requirements.txt ./
+COPY .git/ ./.git/
+# Install the TTSFM package with web dependencies
+RUN pip install --no-cache-dir -e .[web]
+# Install additional web dependencies
+RUN pip install --no-cache-dir "python-dotenv>=1.0.0" "flask-socketio>=5.3.0" "python-socketio>=5.10.0" "eventlet>=0.33.3"
+# Create non-root user
+RUN useradd --create-home ttsfm && chown -R ttsfm:ttsfm /app
+USER ttsfm
+EXPOSE 8000
+HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
+    CMD curl -f http://localhost:8000/api/health || exit 1
+WORKDIR /app/ttsfm-web
+# Use run.py for proper eventlet initialization
+CMD ["python", "run.py"]
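The `HEALTHCHECK` instruction above polls `/api/health` with `curl -f`. The same probe can be approximated from Python, for example in a deployment script that waits for the container to come up. This is a minimal sketch: `http_status` and `is_healthy` are hypothetical helpers, and the loop only loosely mirrors Docker's interval/timeout/retries semantics.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float) -> int:
    """Return the HTTP status code, or 0 if the request could not complete."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:  # server answered with 4xx/5xx
        return exc.code
    except (urllib.error.URLError, OSError):  # refused, DNS failure, timeout
        return 0


def is_healthy(
    url: str = "http://localhost:8000/api/health",
    retries: int = 3,
    interval: float = 30.0,
    timeout: float = 10.0,
    probe: Callable[[str, float], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe the endpoint up to `retries` times, like `curl -f ... || exit 1`."""
    for attempt in range(retries):
        status = probe(url, timeout)
        if 0 < status < 400:  # curl -f fails on status codes >= 400
            return True
        if attempt < retries - 1:
            sleep(interval)
    return False
```

The defaults copy the values in the Dockerfile (30 s interval, 10 s timeout, 3 retries).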

LICENSE CHANGED Viewed

@@ -1,21 +1,21 @@
-MIT License
-Copyright (c) 2025 dbcccc
-Permission is hereby granted, free of charge, to any person obtaining a copy
-of this software and associated documentation files (the "Software"), to deal
-in the Software without restriction, including without limitation the rights
-to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the Software is
-furnished to do so, subject to the following conditions:
-The above copyright notice and this permission notice shall be included in all
-copies or substantial portions of the Software.
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
-IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
-FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
-AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
-LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.

+MIT License
+Copyright (c) 2025 dbcccc
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

README.zh.md ADDED Viewed

	@@ -0,0 +1,792 @@

+# TTSFM - Text-to-Speech API Client
+> **Language / 语言**: [English](README.md) | [中文](README.zh.md)
+[![Docker Pulls](https://img.shields.io/docker/pulls/dbcccc/ttsfm?style=flat-square&logo=docker)](https://hub.docker.com/r/dbcccc/ttsfm)
+[![GitHub Stars](https://img.shields.io/github/stars/dbccccccc/ttsfm?style=social)](https://github.com/dbccccccc/ttsfm)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=flat-square)](https://opensource.org/licenses/MIT)
+## Star History
+[![Star History Chart](https://api.star-history.com/svg?repos=dbccccccc/ttsfm&type=Date)](https://www.star-history.com/#dbccccccc/ttsfm&Date)
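+🎤 **A modern, free Text-to-Speech API client with OpenAI compatibility**
+TTSFM provides synchronous and asynchronous Python clients for text-to-speech generation, built on the reverse-engineered openai.fm service. No API key required - completely free to use!
+## ✨ Key Features
+- 🆓 **Completely Free** - Uses the reverse-engineered openai.fm service (no API key required)
+- 🎯 **OpenAI-Compatible** - Drop-in replacement for the OpenAI TTS API (`/v1/audio/speech`)
+- ⚡ **Async & Sync** - Both `asyncio` and synchronous clients are available
+- 🗣️ **11 Voices** - All OpenAI-compatible voices (alloy, echo, fable, onyx, nova, shimmer, and more)
+- 🎵 **6 Audio Formats** - MP3, WAV, OPUS, AAC, FLAC, and PCM
+- 🐳 **Docker Ready** - One-command deployment with the web interface included
+- 🌐 **Web Interface** - Interactive playground for testing voices and formats
+- 🔧 **CLI Tool** - Command-line interface for quick TTS generation
+- 📦 **Type Hints** - Full type annotations for a better IDE experience
+- 🛡️ **Error Handling** - Comprehensive exception hierarchy and retry logic
+- ✨ **Auto-Combine** - Long text is handled automatically, with seamless audio merging
+- 📊 **Text Validation** - Automatic text length validation and splitting
+- 🔐 **API Key Protection** - Optional OpenAI-compatible authentication for secure deployments
+## 📦 Installation
+### Quick Install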
+```bash
+pip install ttsfm
+```
+### Installation Options
+```bash
+# Basic installation (sync client only)
+pip install ttsfm
+# With web application support
+pip install ttsfm[web]
+# With development tools
+pip install ttsfm[dev]
+# With documentation tools
+pip install ttsfm[docs]
+# Install all optional dependencies
+pip install ttsfm[web,dev,docs]
+```
+### System Requirements
+- **Python**: 3.8+ (tested on 3.8, 3.9, 3.10, 3.11, 3.12)
+- **Operating Systems**: Windows, macOS, Linux
+- **Dependencies**: `requests`, `aiohttp`, `fake-useragent`
+## 🚀 Quick Start
+### 🐳 Docker (Recommended)
+Run TTSFM with the web interface and the OpenAI-compatible API:
+```bash
+# Using GitHub Container Registry
+docker run -p 8000:8000 ghcr.io/dbccccccc/ttsfm:latest
+# Using Docker Hub
+docker run -p 8000:8000 dbcccc/ttsfm:latest
+```
+**Available endpoints:**
+- 🌐 **Web Interface**: http://localhost:8000
+- 🔗 **OpenAI API**: http://localhost:8000/v1/audio/speech
+- 📊 **Health Check**: http://localhost:8000/api/health
+**Test the API:**
+```bash
+curl -X POST http://localhost:8000/v1/audio/speech \
+  -H "Content-Type: application/json" \
+  -d '{"model":"gpt-4o-mini-tts","input":"Hello, world!","voice":"alloy"}' \
+  --output speech.mp3
+```
+### 📦 Python Package
+#### Synchronous Client
+```python
+from ttsfm import TTSClient, Voice, AudioFormat
+# Create a client (uses the free openai.fm service)
+client = TTSClient()
+# Generate speech
+response = client.generate_speech(
+    text="Hello! This is TTSFM - a free TTS service.",
+    voice=Voice.CORAL,
+    response_format=AudioFormat.MP3
+)
+# Save the audio file
+response.save_to_file("output")  # Saved as output.mp3
+# Or get the raw audio data
+audio_bytes = response.audio_data
+print(f"Generated {len(audio_bytes)} bytes of audio")
+```
+#### Asynchronous Client
+```python
+import asyncio
+from ttsfm import AsyncTTSClient, Voice
+async def generate_speech():
+    async with AsyncTTSClient() as client:
+        response = await client.generate_speech(
+            text="Async TTS generation!",
+            voice=Voice.NOVA
+        )
+        response.save_to_file("async_output")
+# Run the async function
+asyncio.run(generate_speech())
+```
+#### Long Text Processing (Python Package)
+For developers who need fine-grained control over text splitting:
+```python
+from ttsfm import TTSClient, Voice, AudioFormat
+# Create a client
+client = TTSClient()
+# Generate speech from long text (creates a separate file for each chunk)
+responses = client.generate_speech_long_text(
+    text="Very long text that exceeds 4096 characters...",
+    voice=Voice.ALLOY,
+    response_format=AudioFormat.MP3,
+    max_length=2000,
+    preserve_words=True
+)
+# Save each chunk as a separate file
+for i, response in enumerate(responses, 1):
+    response.save_to_file(f"part_{i:03d}")  # Saved as part_001.mp3, part_002.mp3, etc.
+print(f"Generated {len(responses)} audio files from long text")
+```
+#### OpenAI Python Client Compatibility
+```python
+from openai import OpenAI
+# Point to the TTSFM Docker container (no API key required by default)
+client = OpenAI(
+    api_key="not-needed",  # TTSFM is free by default
+    base_url="http://localhost:8000/v1"
+)
+# When API key protection is enabled
+client_with_auth = OpenAI(
+    api_key="your-secret-api-key",  # Your TTSFM API key
+    base_url="http://localhost:8000/v1"
+)
+# Generate speech (exactly as with OpenAI)
+response = client.audio.speech.create(
+    model="gpt-4o-mini-tts",
+    voice="alloy",
+    input="Greetings from TTSFM!"
+)
+response.stream_to_file("output.mp3")
+```
+#### Auto-Combine for Long Text
+TTSFM automatically handles long text (>4096 characters) with the new auto-combine feature:
+```python
+from openai import OpenAI
+client = OpenAI(
+    api_key="not-needed",
+    base_url="http://localhost:8000/v1"
+)
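+# Long text is automatically split and combined into a single audio file
+long_article = """
+Your very long article or document content goes here...
+This can be thousands of characters long, and TTSFM will
+automatically split it into chunks, generate audio for each chunk,
+and combine them into one seamless audio file.
+""" * 100  # Make it really long
+# This works seamlessly - no manual splitting needed!
+response = client.audio.speech.create(
+    model="gpt-4o-mini-tts",
+    voice="nova",
+    input=long_article,
+    # auto_combine=True is the default
+)
+response.stream_to_file("long_article.mp3")  # A single combined file!
+# Disable auto-combine for strict OpenAI compatibility
+# (sent via extra_body because the OpenAI SDK rejects unknown keyword arguments)
+response = client.audio.speech.create(
+    model="gpt-4o-mini-tts",
+    voice="nova",
+    input="Short text only",
+    extra_body={"auto_combine": False}  # Errors if the text is >4096 characters
+)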
+```
+### 🖥️ Command-Line Interface
+```bash
+# Basic usage
+ttsfm "Hello, world!" --output hello.mp3
+# Specify voice and format
+ttsfm "Hello, world!" --voice nova --format wav --output hello.wav
+# Read from a file
+ttsfm --text-file input.txt --output speech.mp3
+# Custom service URL
+ttsfm "Hello, world!" --url http://localhost:7000 --output hello.mp3
+# List available voices
+ttsfm --list-voices
+# Get help
+ttsfm --help
+```
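+## ⚙️ Configuration
+TTSFM uses the free openai.fm service automatically - **no configuration or API key is required by default!**
+### Environment Variables
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `REQUIRE_API_KEY` | `false` | Enable API key protection |
+| `TTSFM_API_KEY` | `None` | Your secret API key |
+| `HOST` | `localhost` | Server host |
+| `PORT` | `8000` | Server port |
+| `DEBUG` | `false` | Debug mode |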
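As an illustration of how the variables in the table above could be read on the server side, here is a minimal sketch. `ServerConfig` and `load_config` are hypothetical names, and the accepted boolean spellings (`1`, `true`, `yes`, `on`) are an assumption rather than documented TTSFM behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    require_api_key: bool
    api_key: Optional[str]
    host: str
    port: int
    debug: bool


def _as_bool(value: str) -> bool:
    # Assumed spellings; the real server may accept a different set.
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Build a config from environment variables, using the table's defaults."""
    config = ServerConfig(
        require_api_key=_as_bool(env.get("REQUIRE_API_KEY", "false")),
        api_key=env.get("TTSFM_API_KEY") or None,
        host=env.get("HOST", "localhost"),
        port=int(env.get("PORT", "8000")),
        debug=_as_bool(env.get("DEBUG", "false")),
    )
    # Fail fast instead of starting a "protected" server with no key.
    if config.require_api_key and not config.api_key:
        raise ValueError("REQUIRE_API_KEY is true but TTSFM_API_KEY is not set")
    return config
```

Taking the environment as a parameter makes the function easy to exercise with a plain dictionary.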
+### Python Client Configuration
+```python
+from ttsfm import TTSClient
+# Default client (uses openai.fm, no API key required)
+client = TTSClient()
+# Custom configuration
+client = TTSClient(
+    base_url="https://www.openai.fm",  # Default
+    timeout=30.0,                     # Request timeout
+    max_retries=3,                    # Number of retries
+    verify_ssl=True                   # SSL verification
+)
+# For a TTSFM server with API key protection
+protected_client = TTSClient(
+    base_url="http://localhost:8000",
+    api_key="your-ttsfm-api-key"
+)
+# For other custom TTS services
+custom_client = TTSClient(
+    base_url="http://your-tts-service.com",
+    api_key="your-api-key-if-needed"
+)
+```
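+## 🗣️ Available Voices
+TTSFM supports all **11 OpenAI-compatible voices**:
+| Voice | Description | Best For |
+|-------|-------------|----------|
+| `alloy` | Balanced and versatile | General purpose, neutral tone |
+| `ash` | Clear and articulate | Professional, business content |
+| `ballad` | Smooth and melodic | Storytelling, audiobooks |
+| `coral` | Warm and friendly | Customer service, tutorials |
+| `echo` | Resonant and clear | Announcements, presentations |
+| `fable` | Expressive and dynamic | Creative content, entertainment |
+| `nova` | Bright and energetic | Marketing, upbeat content |
+| `onyx` | Deep and authoritative | News, serious content |
+| `sage` | Wise and measured | Educational, informational |
+| `shimmer` | Light and airy | Casual, conversational |
+| `verse` | Rhythmic and flowing | Poetry, artistic content |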
+```python
+from ttsfm import Voice
+# Use enum values
+response = client.generate_speech("Hello!", voice=Voice.CORAL)
+# Or use string values
+response = client.generate_speech("Hello!", voice="coral")
+# Try different voices
+for voice in Voice:
+    response = client.generate_speech(f"This is the {voice.value} voice", voice=voice)
+    response.save_to_file(f"test_{voice.value}")
+```
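+## 🎵 Audio Formats
+TTSFM supports **6 audio formats** with different quality and compression options:
+| Format | Extension | Quality | File Size | Use Case |
+|--------|-----------|---------|-----------|----------|
+| `mp3` | `.mp3` | Good | Small | Web, mobile apps, general use |
+| `opus` | `.opus` | Excellent | Small | Web streaming, VoIP |
+| `aac` | `.aac` | Good | Medium | Apple devices, streaming |
+| `flac` | `.flac` | Lossless | Large | High-quality archiving |
+| `wav` | `.wav` | Lossless | Large | Professional audio |
+| `pcm` | `.pcm` | Raw | Large | Audio processing |
+### **Usage Example**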
+```python
+from ttsfm import TTSClient, AudioFormat
+client = TTSClient()
+# Generate different formats
+formats = [
+    AudioFormat.MP3,   # Most common
+    AudioFormat.OPUS,  # Best compression
+    AudioFormat.AAC,   # Apple-compatible
+    AudioFormat.FLAC,  # Lossless
+    AudioFormat.WAV,   # Uncompressed
+    AudioFormat.PCM    # Raw audio
+]
+for fmt in formats:
+    response = client.generate_speech(
+        text="Testing audio formats",
+        response_format=fmt
+    )
+    response.save_to_file(f"test.{fmt.value}")
+```
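+### **Format Selection Guide**
+- **Choose MP3** for:
+  - Web applications
+  - Mobile apps
+  - Smaller file sizes
+  - General-purpose audio
+- **Choose OPUS** for:
+  - Web streaming
+  - VoIP applications
+  - Best compression ratio
+  - Real-time audio
+- **Choose AAC** for:
+  - Apple devices
+  - Streaming services
+  - A good quality/size balance
+- **Choose FLAC** for:
+  - Archival purposes
+  - Lossless compression
+  - Professional workflows
+- **Choose WAV** for:
+  - Professional audio production
+  - Maximum compatibility
+  - When file size is not a concern
+- **Choose PCM** for:
+  - Audio processing
+  - Raw audio data
+  - Custom applications
+> **Note**: The library automatically optimizes requests to deliver the best quality for your chosen format. Files are always saved with the correct extension for the audio format.
+## 🌐 Web Interface
+TTSFM includes a **polished web interface** for testing and experimentation: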
+![Web Interface](https://img.shields.io/badge/Web%20Interface-Available-brightgreen?style=flat-square)
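+**Features:**
+- 🎮 **Interactive Playground** - Test voices and formats in real time
+- 📝 **Text Validation** - Character count and length validation
+- 🎛️ **Advanced Options** - Voice instructions, automatic splitting of long text
+- 📊 **Audio Player** - Built-in player showing duration and file size
+- 📥 **Download Support** - Download individual or batch audio files
+- 🎲 **Random Text** - Generate random sample text for testing
+- 📱 **Responsive Design** - Works on desktop, tablet, and mobile devices
+Available at: http://localhost:8000 (when running the Docker container)
+## 🔗 API Endpoints
+When running the Docker container, these endpoints are available:
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/` | GET | Web interface |
+| `/playground` | GET | Interactive TTS playground |
+| `/v1/audio/speech` | POST | OpenAI-compatible TTS API |
+| `/v1/models` | GET | List available models |
+| `/api/health` | GET | Health check endpoint |
+| `/api/voices` | GET | List available voices |
+| `/api/formats` | GET | List supported audio formats |
+| `/api/validate-text` | POST | Validate text length |
+### OpenAI-Compatible API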
+```bash
+# Generate speech (short text) - no API key required by default
+curl -X POST http://localhost:8000/v1/audio/speech \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-4o-mini-tts",
+    "input": "Hello, this is a test!",
+    "voice": "alloy",
+    "response_format": "mp3"
+  }' \
+  --output speech.mp3
+# Generate speech with an API key (when protection is enabled)
+curl -X POST http://localhost:8000/v1/audio/speech \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer your-secret-api-key" \
+  -d '{
+    "model": "gpt-4o-mini-tts",
+    "input": "Hello, this is a test!",
+    "voice": "alloy",
+    "response_format": "mp3"
+  }' \
+  --output speech.mp3
+# Generate speech from long text with auto-combine (default behavior)
+curl -X POST http://localhost:8000/v1/audio/speech \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-4o-mini-tts",
+    "input": "This is a very long text that exceeds the 4096-character limit...",
+    "voice": "alloy",
+    "response_format": "mp3",
+    "auto_combine": true
+  }' \
+  --output long_speech.mp3
+# Generate speech from long text without auto-combine (returns an error if the text is >4096 characters)
+curl -X POST http://localhost:8000/v1/audio/speech \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-4o-mini-tts",
+    "input": "Your text here...",
+    "voice": "alloy",
+    "response_format": "mp3",
+    "auto_combine": false
+  }' \
+  --output speech.mp3
+# List models
+curl http://localhost:8000/v1/models
+# Health check
+curl http://localhost:8000/api/health
+```
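+#### **New Parameter: `auto_combine`**
+TTSFM extends the OpenAI API with an optional `auto_combine` parameter:
+- **`auto_combine`** (boolean, optional, default: `true`)
+  - When `true`: automatically splits long text (>4096 characters) into chunks, generates audio for each chunk, and combines them into one seamless audio file
+  - When `false`: returns an error if the text exceeds the 4096-character limit (standard OpenAI behavior)
+  - **Benefit**: no need to manually manage text splitting or audio file merging for long content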
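The split/generate/combine behavior described for `auto_combine` can be sketched roughly as follows. This is an illustration, not the server's code: `split_text`, `speech_for`, and the `synthesize` callback are hypothetical, and joining raw bytes is a simplification (container formats such as WAV carry headers that a real combiner must handle).

```python
from typing import Callable, List

MAX_CHARS = 4096  # per-request limit quoted in this README


def split_text(text: str, max_length: int = MAX_CHARS) -> List[str]:
    """Greedily pack whitespace-separated words into chunks <= max_length."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit has to be hard-split.
        while len(word) > max_length:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_length])
            word = word[max_length:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_length:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def speech_for(
    text: str,
    synthesize: Callable[[str], bytes],
    auto_combine: bool = True,
    max_length: int = MAX_CHARS,
) -> bytes:
    """Return audio for text, splitting and joining when it is too long."""
    if len(text) <= max_length:
        return synthesize(text)
    if not auto_combine:
        raise ValueError(f"input exceeds {max_length} characters and auto_combine is false")
    # Naive concatenation of per-chunk audio; see the caveat above.
    return b"".join(synthesize(chunk) for chunk in split_text(text, max_length))
```

Splitting on whitespace keeps words intact, which avoids audible cuts in the middle of a word at chunk boundaries.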
+## 🐳 Docker Deployment
+### Quick Start
+```bash
+# Run with default settings (no API key required)
+docker run -p 8000:8000 ghcr.io/dbccccccc/ttsfm:latest
+# Run with API key protection enabled
+docker run -p 8000:8000 \
+  -e REQUIRE_API_KEY=true \
+  -e TTSFM_API_KEY=your-secret-api-key \
+  ghcr.io/dbccccccc/ttsfm:latest
+# Run on a custom port
+docker run -p 3000:8000 ghcr.io/dbccccccc/ttsfm:latest
+# Run in the background
+docker run -d -p 8000:8000 --name ttsfm ghcr.io/dbccccccc/ttsfm:latest
+```
+### Docker Compose
+```yaml
+version: '3.8'
+services:
+  ttsfm:
+    image: ghcr.io/dbccccccc/ttsfm:latest
+    ports:
+      - "8000:8000"
+    environment:
+      - PORT=8000
+      # Optional: enable API key protection
+      - REQUIRE_API_KEY=false
+      - TTSFM_API_KEY=your-secret-api-key-here
+    restart: unless-stopped
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:8000/api/health"]
+      interval: 30s
+      timeout: 10s
+      retries: 3
+```
+### Available Images
+| Registry | Image | Description |
+|----------|-------|-------------|
+| GitHub Container Registry | `ghcr.io/dbccccccc/ttsfm:latest` | Latest stable release |
+| Docker Hub | `dbcccc/ttsfm:latest` | Docker Hub image |
+| GitHub Container Registry | `ghcr.io/dbccccccc/ttsfm:v3.2.2` | Specific version |
+## 🛠️ Advanced Usage
+### Error Handling
+```python
+from ttsfm import TTSClient, TTSException, APIException, NetworkException
+client = TTSClient()
+try:
+    response = client.generate_speech("Hello, world!")
+    response.save_to_file("output")
+except NetworkException as e:
+    print(f"Network error: {e}")
+except APIException as e:
+    print(f"API error: {e}")
+except TTSException as e:
+    print(f"TTS error: {e}")
+```
+### Text Validation and Splitting
+```python
+from ttsfm import TTSClient
+from ttsfm.utils import validate_text_length, split_text_by_length
+client = TTSClient()
+# Validate text length
+text = "Your long text here..."
+is_valid, length = validate_text_length(text, max_length=4096)
+if not is_valid:
+    # Split the long text into chunks
+    chunks = split_text_by_length(text, max_length=4000)
+    # Generate speech for each chunk
+    for i, chunk in enumerate(chunks):
+        response = client.generate_speech(chunk)
+        response.save_to_file(f"output_part_{i}")
+```
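+### Custom Headers and User Agents
+```python
+from ttsfm import TTSClient
+# The client automatically uses realistic request headers
+client = TTSClient()
+# Headers include:
+# - A realistic User-Agent string
+# - Accept headers for audio content
+# - Connection keep-alive
+# - Accept-Encoding for compression
+```
+## 🔧 Development
+### Local Development
+```bash
+# Clone the repository
+git clone https://github.com/dbccccccc/ttsfm.git
+cd ttsfm
+# Install in development mode
+pip install -e .[dev]
+# Run tests
+pytest
+# Run the web application
+cd ttsfm-web
+python app.py
+```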
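+### Building the Docker Image
+```bash
+# Build the image
+docker build -t ttsfm:local .
+# Run the local image
+docker run -p 8000:8000 ttsfm:local
+```
+### Contributing
+1. Fork the repository
+2. Create a feature branch (`git checkout -b feature/amazing-feature`)
+3. Commit your changes (`git commit -m 'Add amazing feature'`)
+4. Push to the branch (`git push origin feature/amazing-feature`)
+5. Open a Pull Request
+## 📊 Performance
+### Benchmarks
+- **Latency**: ~1-3 seconds for typical text (depends on the openai.fm service)
+- **Throughput**: The async client supports concurrent requests
+- **Text Limits**: No limit with auto-combine! Text of any length is handled automatically
+- **Audio Quality**: High-quality synthesis comparable to OpenAI
+### Optimization Tips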
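+```python
+import asyncio
+from ttsfm import AsyncTTSClient, TTSClient
+# Use the async client for better performance
+async def main():
+    async with AsyncTTSClient() as client:
+        # Process multiple requests concurrently
+        tasks = [
+            client.generate_speech(f"Text {i}")
+            for i in range(10)
+        ]
+        return await asyncio.gather(*tasks)
+responses = asyncio.run(main())
+# Reuse a client instance
+client = TTSClient()
+texts = ["First sentence.", "Second sentence."]  # Example inputs
+for text in texts:
+    response = client.generate_speech(text)  # Reuses the connection
+```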
+## 🔐 API Key Protection (Optional)
+TTSFM supports **OpenAI-compatible API key authentication** for secure deployments:
+### Quick Setup
+```bash
+# Enable API key protection
+export REQUIRE_API_KEY=true
+export TTSFM_API_KEY=your-secret-api-key
+# Run with protection enabled
+docker run -p 8000:8000 \
+  -e REQUIRE_API_KEY=true \
+  -e TTSFM_API_KEY=your-secret-api-key \
+  ghcr.io/dbccccccc/ttsfm:latest
+```
+### Authentication Methods
+API keys are accepted in the **OpenAI-compatible format**:
+```python
+from openai import OpenAI
+# Standard OpenAI format
+client = OpenAI(
+    api_key="your-secret-api-key",
+    base_url="http://localhost:8000/v1"
+)
+```
+Or with curl:
+```bash
+curl -X POST http://localhost:8000/v1/audio/speech \
+  -H "Authorization: Bearer your-secret-api-key" \
+  -H "Content-Type: application/json" \
+  -d '{"model":"gpt-4o-mini-tts","input":"Hello!","voice":"alloy"}'
+```
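+### Features
+- 🔑 **OpenAI-Compatible**: Uses the standard `Authorization: Bearer` header
+- 🛡️ **Multiple Auth Methods**: Header, query parameter, or JSON body
+- 🎛️ **Configurable**: Easily enabled or disabled through environment variables
+- 📊 **Security Logging**: Tracks invalid access attempts
+- 🌐 **Web Interface**: Automatic API key field detection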
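The list above says a key may arrive in a header, a query parameter, or the JSON body. A minimal sketch of that lookup order is shown below. The `api_key` field name for the query string and JSON body is an assumption (the README does not name it), and `extract_api_key` and `is_authorized` are hypothetical helpers, not the server's code.

```python
import hmac
from typing import Any, Mapping, Optional


def extract_api_key(
    headers: Mapping[str, str],
    query: Mapping[str, str],
    json_body: Mapping[str, Any],
) -> Optional[str]:
    """Look for a key in the Bearer header, then the query, then the body."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    scheme, _, token = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    # "api_key" is an assumed field name for the two fallbacks.
    return query.get("api_key") or json_body.get("api_key")


def is_authorized(provided: Optional[str], expected: str) -> bool:
    """Compare keys in constant time to avoid leaking timing information."""
    if not provided:
        return False
    return hmac.compare_digest(provided.encode(), expected.encode())
```

Checking the header first keeps the behavior aligned with OpenAI clients, which always send `Authorization: Bearer`.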
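+### Protected Endpoints
+When enabled, these endpoints require authentication:
+- `POST /v1/audio/speech` - Speech generation
+- `POST /api/generate` - Legacy speech generation
+- `POST /api/generate-combined` - Combined speech generation
+### Public Endpoints
+These endpoints remain accessible without authentication:
+- `GET /` - Web interface
+- `GET /playground` - Interactive playground
+- `GET /api/health` - Health check
+- `GET /api/voices` - Available voices
+- `GET /api/formats` - Supported formats
+## 🔒 Security and Privacy
+- **Optional API Keys**: Free by default, secured when needed
+- **No Data Storage**: Audio is generated on demand and not stored
+- **HTTPS Support**: Secure connections to the TTS service
+- **No Tracking**: TTSFM does not collect or store user data
+- **Open Source**: Full source code is available for audit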
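+## 📋 Changelog
+See [CHANGELOG.md](CHANGELOG.md) for the detailed version history.
+### Latest Changes (v3.2.3)
+- ✨ **Auto-Combine by Default**: Long text is now automatically split and combined into a single audio file
+- 🔄 **Unified API Endpoint**: A single `/v1/audio/speech` endpoint handles both short and long text intelligently
+- 🎛️ **Configurable Behavior**: The new `auto_combine` parameter (default: `true`) gives full control
+- 🤖 **Enhanced OpenAI Compatibility**: Drop-in replacement with intelligent long-text handling
+- 📊 **Rich Response Headers**: `X-Auto-Combine`, `X-Chunks-Combined`, and processing metadata
+- 🧹 **Streamlined Web Interface**: Removed legacy batch processing for a cleaner user experience
+- 📖 **Simplified Documentation**: Web docs emphasize the modern auto-combine approach
+- 🎮 **Enhanced Playground**: A clean interface focused on auto-combine
+- 🔐 **API Key Protection**: Optional OpenAI-compatible authentication for secure deployments
+- 🛡️ **Security Features**: Comprehensive access control with detailed logging
+## 🤝 Support and Community
+- 🐛 **Bug Reports**: [GitHub Issues](https://github.com/dbccccccc/ttsfm/issues)
+- 💬 **Discussions**: [GitHub Discussions](https://github.com/dbccccccc/ttsfm/discussions)
+- 👤 **Author**: [@dbcccc](https://github.com/dbccccccc)
+- ⭐ **Star the Project**: If you find TTSFM useful, please star it on GitHub!
+## 📄 License
+MIT License - see the [LICENSE](LICENSE) file for details.
+## 🙏 Acknowledgments
+- **OpenAI**: For the original TTS API design
+- **openai.fm**: For providing the free TTS service
+- **Community**: Thanks to all users and contributors who help improve TTSFM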
+---
+<div align="center">
+**TTSFM** - Free Text-to-Speech API with OpenAI compatibility
+[![GitHub](https://img.shields.io/badge/GitHub-dbccccccc/ttsfm-blue?style=flat-square&logo=github)](https://github.com/dbccccccc/ttsfm)
+[![PyPI](https://img.shields.io/badge/PyPI-ttsfm-blue?style=flat-square&logo=pypi)](https://pypi.org/project/ttsfm/)
+[![Docker](https://img.shields.io/badge/Docker-dbcccc/ttsfm-blue?style=flat-square&logo=docker)](https://hub.docker.com/r/dbcccc/ttsfm)
+---
+## 📖 Documentation
+- 🇺🇸 **English**: [README.md](README.md)
+- 🇨🇳 **中文**: [README.zh.md](README.zh.md)
+Made with ❤️ by [@dbcccc](https://github.com/dbccccccc)
+</div>

docs/websocket-streaming.md ADDED Viewed

	@@ -0,0 +1,244 @@

+# 🚀 WebSocket Streaming for TTSFM
+Real-time audio streaming for text-to-speech generation using WebSockets.
+## Overview
+The WebSocket streaming feature provides:
+- **Real-time audio chunk delivery** as they're generated
+- **Progress tracking** with live updates
+- **Lower perceived latency** - start receiving audio before complete generation
+- **Cancellable operations** - stop mid-generation if needed
+## Quick Start
+### 1. Docker Deployment (Recommended)
+```bash
+# Build with WebSocket support
+docker build -t ttsfm-websocket .
+# Run with WebSocket enabled
+docker run -p 8000:8000 \
+  -e DEBUG=false \
+  ttsfm-websocket
+```
+### 2. Test WebSocket Connection
+Visit `http://localhost:8000/websocket-demo` for an interactive demo.
+### 3. Client Usage
+```javascript
+// Initialize WebSocket client
+const client = new WebSocketTTSClient({
+    socketUrl: 'http://localhost:8000',
+    debug: true
+});
+// Generate speech with streaming
+const result = await client.generateSpeech('Hello, WebSocket world!', {
+    voice: 'alloy',
+    format: 'mp3',
+    onProgress: (progress) => {
+        console.log(`Progress: ${progress.progress}%`);
+    },
+    onChunk: (chunk) => {
+        console.log(`Received chunk ${chunk.chunkIndex + 1}`);
+        // Process audio chunk in real-time
+    },
+    onComplete: (result) => {
+        console.log('Generation complete!');
+        // Play or download the combined audio
+    }
+});
+```
+## API Reference
+### WebSocket Events
+#### Client → Server
+**`generate_stream`**
+```javascript
+{
+    text: string,          // Text to convert
+    voice: string,         // Voice ID (alloy, echo, etc.)
+    format: string,        // Audio format (mp3, wav, opus)
+    chunk_size: number     // Optional, default 1024
+}
+```
+**`cancel_stream`**
+```javascript
+{
+    request_id: string     // Request ID to cancel
+}
+```
+#### Server → Client
+**`stream_started`**
+```javascript
+{
+    request_id: string,
+    timestamp: number
+}
+```
+**`audio_chunk`**
+```javascript
+{
+    request_id: string,
+    chunk_index: number,
+    total_chunks: number,
+    audio_data: string,    // Hex-encoded audio data
+    format: string,
+    duration: number,
+    generation_time: number,
+    chunk_text: string     // Preview of chunk text
+}
+```
+**`stream_progress`**
+```javascript
+{
+    request_id: string,
+    progress: number,      // 0-100
+    total_chunks: number,
+    chunks_completed: number,
+    status: string
+}
+```
+**`stream_complete`**
+```javascript
+{
+    request_id: string,
+    total_chunks: number,
+    status: 'completed',
+    timestamp: number
+}
+```
+**`stream_error`**
+```javascript
+{
+    request_id: string,
+    error: string,
+    timestamp: number
+}
+```
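Because `audio_chunk` events carry hex-encoded data and may arrive out of order, a consumer has to decode and reorder them before playback. The sketch below shows one way to do that in Python using the field names from the event reference above. `assemble_audio` and `progress_percent` are hypothetical helpers; zero-based `chunk_index` is inferred from the client example, and joining raw bytes is a simplification that ignores per-chunk container headers.

```python
from typing import Dict, Iterable, Mapping, Optional


def assemble_audio(events: Iterable[Mapping[str, object]]) -> bytes:
    """Decode hex payloads and join them in chunk_index order."""
    by_index: Dict[int, bytes] = {}
    total: Optional[int] = None
    for event in events:
        index = int(event["chunk_index"])  # assumed zero-based
        by_index[index] = bytes.fromhex(str(event["audio_data"]))
        total = int(event["total_chunks"])
    if total is None:
        raise ValueError("no audio_chunk events received")
    missing = [i for i in range(total) if i not in by_index]
    if missing:
        raise ValueError(f"missing chunks: {missing}")
    return b"".join(by_index[i] for i in range(total))


def progress_percent(chunks_completed: int, total_chunks: int) -> int:
    """Compute a 0-100 value like the one reported in stream_progress."""
    if total_chunks <= 0:
        return 0
    return round(100 * chunks_completed / total_chunks)
```

Keying the buffer by index makes arrival order irrelevant and lets the function report exactly which chunks never arrived.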
+## Performance Considerations
+1. **Chunk Size**: Smaller chunks (512-1024 chars) provide more frequent updates but increase overhead
+2. **Network Latency**: WebSocket reduces latency compared to HTTP polling
+3. **Audio Buffering**: Client should buffer chunks for smooth playback
+4. **Concurrent Streams**: Server supports multiple concurrent streaming sessions
+## Browser Support
+- Chrome/Edge: Full support
+- Firefox: Full support
+- Safari: Full support (iOS 11.3+)
+- IE11: Not supported (use polling fallback)
+## Troubleshooting
+### Connection Issues
+```javascript
+// Check WebSocket status
+fetch('/api/websocket/status')
+    .then(res => res.json())
+    .then(data => console.log('WebSocket status:', data));
+```
+### Debug Mode
+```javascript
+const client = new WebSocketTTSClient({
+    debug: true  // Enable console logging
+});
+```
+### Common Issues
+1. **"WebSocket connection failed"**
+   - Check if port 8000 is accessible
+   - Ensure eventlet is installed: `pip install "eventlet>=0.33.3"` (quoted so the shell does not treat `>` as a redirect)
+   - Try polling transport as fallback
+2. **"Chunks arriving out of order"**
+   - Client automatically sorts chunks by index
+   - Check network stability
+3. **"Audio playback stuttering"**
+   - Increase chunk size for better buffering
+   - Check client-side audio buffer implementation
+## Advanced Usage
+### Custom Chunk Processing
+```javascript
+client.generateSpeech(text, {
+    onChunk: async (chunk) => {
+        // Custom processing per chunk
+        const processed = await processAudioChunk(chunk.audioData);
+        audioQueue.push(processed);
+        // Start playback after first chunk
+        if (chunk.chunkIndex === 0) {
+            startStreamingPlayback(audioQueue);
+        }
+    }
+});
+```
+### Progress Visualization
+```javascript
+client.generateSpeech(text, {
+    onProgress: (progress) => {
+        // Update UI progress bar
+        progressBar.style.width = `${progress.progress}%`;
+        statusText.textContent = `Processing chunk ${progress.chunksCompleted}/${progress.totalChunks}`;
+    }
+});
+```
+## Security
+- WebSocket connections respect API key authentication if enabled
+- CORS is configured for cross-origin requests
+- SSL/TLS recommended for production deployments
+## Deployment Notes
+For production deployment with your existing setup:
+```bash
+# Build new image with WebSocket support
+docker build -t ttsfm-websocket:latest .
+# Deploy to your server (192.168.1.150)
+docker stop ttsfm-container
+docker rm ttsfm-container
+docker run -d \
+  --name ttsfm-container \
+  -p 8000:8000 \
+  -e REQUIRE_API_KEY=true \
+  -e TTSFM_API_KEY=your-secret-key \
+  -e DEBUG=false \
+  ttsfm-websocket:latest
+```
+## Performance Metrics
+Based on testing with openai.fm backend:
+- First chunk delivery: ~0.5-1s
+- Streaming overhead: ~10-15% vs batch processing
+- Concurrent connections: 100+ (limited by server resources)
+- Memory usage: ~50MB per active stream
+*Built by a grumpy senior engineer who thinks HTTP was good enough*

pyproject.toml CHANGED Viewed

@@ -1,161 +1,169 @@
-[build-system]
-requires = ["setuptools>=45", "wheel", "setuptools_scm[toml]>=6.2"]
-build-backend = "setuptools.build_meta"
-[project]
-name = "ttsfm"
-version = "3.1.0"
-description = "Text-to-Speech API Client with OpenAI compatibility"
-readme = "README.md"
-license = "MIT"
-authors = [
-    {name = "dbcccc", email = "120614547+dbccccccc@users.noreply.github.com"}
-]
-maintainers = [
-    {name = "dbcccc", email = "120614547+dbccccccc@users.noreply.github.com"}
-]
-classifiers = [
-    "Development Status :: 4 - Beta",
-    "Intended Audience :: Developers",
-    "Operating System :: OS Independent",
-    "Programming Language :: Python :: 3",
-    "Programming Language :: Python :: 3.8",
-    "Programming Language :: Python :: 3.9",
-    "Programming Language :: Python :: 3.10",
-    "Programming Language :: Python :: 3.11",
-    "Programming Language :: Python :: 3.12",
-    "Topic :: Multimedia :: Sound/Audio :: Speech",
-    "Topic :: Software Development :: Libraries :: Python Modules",
-    "Topic :: Internet :: WWW/HTTP :: Dynamic Content",
-]
-keywords = [
-    "tts",
-    "text-to-speech",
-    "speech-synthesis",
-    "openai",
-    "api-client",
-    "audio",
-    "voice",
-    "speech"
-]
-requires-python = ">=3.8"
-dependencies = [
-    "requests>=2.25.0",
-    "aiohttp>=3.8.0",
-    "fake-useragent>=1.4.0",
-]
-[project.optional-dependencies]
-dev = [
-    "pytest>=6.0",
-    "pytest-asyncio>=0.18.0",
-    "pytest-cov>=2.0",
-    "black>=22.0",
-    "isort>=5.0",
-    "flake8>=4.0",
-    "mypy>=0.900",
-    "pre-commit>=2.0",
-]
-docs = [
-    "sphinx>=4.0",
-    "sphinx-rtd-theme>=1.0",
-    "myst-parser>=0.17",
-]
-web = [
-    "flask>=2.0.0",
-    "flask-cors>=3.0.10",
-    "waitress>=3.0.0",
-]
-[project.urls]
-Homepage = "https://github.com/dbccccccc/ttsfm"
-Documentation = "https://github.com/dbccccccc/ttsfm/blob/main/docs/"
-Repository = "https://github.com/dbccccccc/ttsfm"
-"Bug Tracker" = "https://github.com/dbccccccc/ttsfm/issues"
-[project.scripts]
-ttsfm = "ttsfm.cli:main"
-[tool.setuptools]
-packages = ["ttsfm"]
-[tool.setuptools.package-data]
-ttsfm = ["py.typed"]
-[tool.black]
-line-length = 100
-target-version = ['py38']
-include = '\.pyi?$'
-extend-exclude = '''
-/(
-  # directories
-  \.eggs
-  | \.git
-  | \.hg
-  | \.mypy_cache
-  | \.tox
-  | \.venv
-  | build
-  | dist
-)/
-'''
-[tool.isort]
-profile = "black"
-line_length = 100
-multi_line_output = 3
-include_trailing_comma = true
-force_grid_wrap = 0
-use_parentheses = true
-ensure_newline_before_comments = true
-[tool.mypy]
-python_version = "3.8"
-warn_return_any = true
-warn_unused_configs = true
-disallow_untyped_defs = true
-disallow_incomplete_defs = true
-check_untyped_defs = true
-disallow_untyped_decorators = true
-no_implicit_optional = true
-warn_redundant_casts = true
-warn_unused_ignores = true
-warn_no_return = true
-warn_unreachable = true
-strict_equality = true
-[tool.pytest.ini_options]
-minversion = "6.0"
-addopts = "-ra -q --strict-markers --strict-config"
-testpaths = ["tests"]
-python_files = ["test_*.py", "*_test.py"]
-python_classes = ["Test*"]
-python_functions = ["test_*"]
-markers = [
-    "slow: marks tests as slow (deselect with '-m \"not slow\"')",
-    "integration: marks tests as integration tests",
-    "unit: marks tests as unit tests",
-]
-[tool.coverage.run]
-source = ["ttsfm"]
-omit = [
-    "*/tests/*",
-    "*/test_*",
-    "setup.py",
-]
-[tool.coverage.report]
-exclude_lines = [
-    "pragma: no cover",
-    "def __repr__",
-    "if self.debug:",
-    "if settings.DEBUG",
-    "raise AssertionError",
-    "raise NotImplementedError",
-    "if 0:",
-    "if __name__ == .__main__.:",
-    "class .*\\bProtocol\\):",
-    "@(abc\\.)?abstractmethod",
-]

+[build-system]
+requires = ["setuptools>=45", "wheel", "setuptools_scm[toml]>=6.2"]
+build-backend = "setuptools.build_meta"
+[project]
+name = "ttsfm"
+dynamic = ["version"]
+description = "Text-to-Speech API Client with OpenAI compatibility"
+readme = "README.md"
+license = "MIT"
+authors = [
+    {name = "dbcccc", email = "120614547+dbccccccc@users.noreply.github.com"}
+]
+maintainers = [
+    {name = "dbcccc", email = "120614547+dbccccccc@users.noreply.github.com"}
+]
+classifiers = [
+    "Development Status :: 4 - Beta",
+    "Intended Audience :: Developers",
+    "Operating System :: OS Independent",
+    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3.8",
+    "Programming Language :: Python :: 3.9",
+    "Programming Language :: Python :: 3.10",
+    "Programming Language :: Python :: 3.11",
+    "Programming Language :: Python :: 3.12",
+    "Topic :: Multimedia :: Sound/Audio :: Speech",
+    "Topic :: Software Development :: Libraries :: Python Modules",
+    "Topic :: Internet :: WWW/HTTP :: Dynamic Content",
+]
+keywords = [
+    "tts",
+    "text-to-speech",
+    "speech-synthesis",
+    "openai",
+    "api-client",
+    "audio",
+    "voice",
+    "speech"
+]
+requires-python = ">=3.8"
+dependencies = [
+    "requests>=2.25.0",
+    "aiohttp>=3.8.0",
+    "fake-useragent>=1.4.0",
+    "python-dotenv>=1.0.1",
+]
+[project.optional-dependencies]
+dev = [
+    "pytest>=6.0",
+    "pytest-asyncio>=0.18.0",
+    "pytest-cov>=2.0",
+    "black>=22.0",
+    "isort>=5.0",
+    "flake8>=4.0",
+    "mypy>=0.900",
+    "pre-commit>=2.0",
+]
+docs = [
+    "sphinx>=4.0",
+    "sphinx-rtd-theme>=1.0",
+    "myst-parser>=0.17",
+]
+web = [
+    "flask>=2.0.0",
+    "flask-cors>=3.0.10",
+    "flask-socketio>=5.3.0",
+    "python-socketio>=5.10.0",
+    "eventlet>=0.33.3",
+    "waitress>=3.0.0",
+]
+[project.urls]
+Homepage = "https://github.com/dbccccccc/ttsfm"
+Documentation = "https://github.com/dbccccccc/ttsfm/blob/main/docs/"
+Repository = "https://github.com/dbccccccc/ttsfm"
+"Bug Tracker" = "https://github.com/dbccccccc/ttsfm/issues"
+[project.scripts]
+ttsfm = "ttsfm.cli:main"
+[tool.setuptools_scm]
+version_scheme = "no-guess-dev"
+local_scheme = "no-local-version"
+[tool.setuptools]
+packages = ["ttsfm"]
+[tool.setuptools.package-data]
+ttsfm = ["py.typed"]
+[tool.black]
+line-length = 100
+target-version = ['py38']
+include = '\.pyi?$'
+extend-exclude = '''
+/(
+  # directories
+  \.eggs
+  | \.git
+  | \.hg
+  | \.mypy_cache
+  | \.tox
+  | \.venv
+  | build
+  | dist
+)/
+'''
+[tool.isort]
+profile = "black"
+line_length = 100
+multi_line_output = 3
+include_trailing_comma = true
+force_grid_wrap = 0
+use_parentheses = true
+ensure_newline_before_comments = true
+[tool.mypy]
+python_version = "3.8"
+warn_return_any = true
+warn_unused_configs = true
+disallow_untyped_defs = true
+disallow_incomplete_defs = true
+check_untyped_defs = true
+disallow_untyped_decorators = true
+no_implicit_optional = true
+warn_redundant_casts = true
+warn_unused_ignores = true
+warn_no_return = true
+warn_unreachable = true
+strict_equality = true
+[tool.pytest.ini_options]
+minversion = "6.0"
+addopts = "-ra -q --strict-markers --strict-config"
+testpaths = ["tests"]
+python_files = ["test_*.py", "*_test.py"]
+python_classes = ["Test*"]
+python_functions = ["test_*"]
+markers = [
+    "slow: marks tests as slow (deselect with '-m \"not slow\"')",
+    "integration: marks tests as integration tests",
+    "unit: marks tests as unit tests",
+]
+[tool.coverage.run]
+source = ["ttsfm"]
+omit = [
+    "*/tests/*",
+    "*/test_*",
+    "setup.py",
+]
+[tool.coverage.report]
+exclude_lines = [
+    "pragma: no cover",
+    "def __repr__",
+    "if self.debug:",
+    "if settings.DEBUG",
+    "raise AssertionError",
+    "raise NotImplementedError",
+    "if 0:",
+    "if __name__ == .__main__.:",
+    "class .*\\bProtocol\\):",
+    "@(abc\\.)?abstractmethod",
+]
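
The new `pyproject.toml` replaces the static `version = "3.1.0"` with `dynamic = ["version"]` and a `[tool.setuptools_scm]` table, so the version is derived from Git tags at build time. Code that reports a version (the old `/api/status` handler hard-coded `"3.0.0"`) could read it from the installed package metadata instead. A minimal sketch, assuming the package is installed under the distribution name `ttsfm`; the function name `installed_version` and the fallback string are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name="ttsfm", fallback="0.0.0+unknown"):
    """Return the installed distribution's version, or a fallback string.

    The fallback covers source checkouts where the package was never
    installed (for example when app.py adds the parent directory to sys.path).
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return fallback
```

`importlib.metadata` is in the standard library from Python 3.8, which matches the `requires-python = ">=3.8"` constraint, so no extra dependency is needed.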

requirements.txt CHANGED

@@ -1,4 +1,4 @@
-# Core dependencies for the TTSFM package
-requests>=2.25.0
-aiohttp>=3.8.0
 fake-useragent>=1.4.0

+# Core dependencies for the TTSFM package
+requests>=2.25.0
+aiohttp>=3.8.0
 fake-useragent>=1.4.0

ttsfm-web/app.py CHANGED

@@ -1,574 +1,988 @@
-"""
-TTSFM Web Application
-A Flask web application that provides a user-friendly interface
-for the TTSFM text-to-speech package.
-"""
-import os
-import json
-import logging
-from datetime import datetime
-from pathlib import Path
-from typing import Dict, Any, Optional
-from flask import Flask, request, jsonify, send_file, Response, render_template
-from flask_cors import CORS
-from dotenv import load_dotenv
-# Import the TTSFM package
-try:
-    from ttsfm import TTSClient, Voice, AudioFormat, TTSException
-    from ttsfm.exceptions import APIException, NetworkException, ValidationException
-    from ttsfm.utils import validate_text_length, split_text_by_length
-except ImportError:
-    # Fallback for development when package is not installed
-    import sys
-    sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
-    from ttsfm import TTSClient, Voice, AudioFormat, TTSException
-    from ttsfm.exceptions import APIException, NetworkException, ValidationException
-    from ttsfm.utils import validate_text_length, split_text_by_length
-# Load environment variables
-load_dotenv()
-# Configure logging
-logging.basicConfig(
-    level=logging.INFO,
-    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
-)
-logger = logging.getLogger(__name__)
-# Create Flask app
-app = Flask(__name__, static_folder='static', static_url_path='/static')
-CORS(app)
-# Configuration
-HOST = os.getenv("HOST", "localhost")
-PORT = int(os.getenv("PORT", "8000"))
-DEBUG = os.getenv("DEBUG", "false").lower() == "true"
-# Create TTS client - now uses openai.fm directly, no configuration needed
-tts_client = TTSClient()
-logger.info("Initialized web app with TTSFM using openai.fm free service")
-@app.route('/')
-def index():
-    """Serve the main web interface."""
-    return render_template('index.html')
-@app.route('/playground')
-def playground():
-    """Serve the interactive playground."""
-    return render_template('playground.html')
-@app.route('/docs')
-def docs():
-    """Serve the API documentation."""
-    return render_template('docs.html')
-@app.route('/api/voices', methods=['GET'])
-def get_voices():
-    """Get list of available voices."""
-    try:
-        voices = [
-            {
-                "id": voice.value,
-                "name": voice.value.title(),
-                "description": f"{voice.value.title()} voice"
-            }
-            for voice in Voice
-        ]
-        return jsonify({
-            "voices": voices,
-            "count": len(voices)
-        })
-    except Exception as e:
-        logger.error(f"Error getting voices: {e}")
-        return jsonify({"error": "Failed to get voices"}), 500
-@app.route('/api/formats', methods=['GET'])
-def get_formats():
-    """Get list of supported audio formats."""
-    try:
-        formats = [
-            {
-                "id": "mp3",
-                "name": "MP3",
-                "mime_type": "audio/mpeg",
-                "description": "MP3 audio format - good quality, small file size",
-                "quality": "Good",
-                "file_size": "Small",
-                "use_case": "Web, mobile apps, general use"
-            },
-            {
-                "id": "opus",
-                "name": "OPUS",
-                "mime_type": "audio/opus",
-                "description": "OPUS audio format - excellent quality, small file size",
-                "quality": "Excellent",
-                "file_size": "Small",
-                "use_case": "Web streaming, VoIP"
-            },
-            {
-                "id": "aac",
-                "name": "AAC",
-                "mime_type": "audio/aac",
-                "description": "AAC audio format - good quality, medium file size",
-                "quality": "Good",
-                "file_size": "Medium",
-                "use_case": "Apple devices, streaming"
-            },
-            {
-                "id": "flac",
-                "name": "FLAC",
-                "mime_type": "audio/flac",
-                "description": "FLAC audio format - lossless quality, large file size",
-                "quality": "Lossless",
-                "file_size": "Large",
-                "use_case": "High-quality archival"
-            },
-            {
-                "id": "wav",
-                "name": "WAV",
-                "mime_type": "audio/wav",
-                "description": "WAV audio format - lossless quality, large file size",
-                "quality": "Lossless",
-                "file_size": "Large",
-                "use_case": "Professional audio"
-            },
-            {
-                "id": "pcm",
-                "name": "PCM",
-                "mime_type": "audio/pcm",
-                "description": "PCM audio format - raw audio data, large file size",
-                "quality": "Raw",
-                "file_size": "Large",
-                "use_case": "Audio processing"
-            }
-        ]
-        return jsonify({
-            "formats": formats,
-            "count": len(formats)
-        })
-    except Exception as e:
-        logger.error(f"Error getting formats: {e}")
-        return jsonify({"error": "Failed to get formats"}), 500
-@app.route('/api/validate-text', methods=['POST'])
-def validate_text():
-    """Validate text length and provide splitting suggestions."""
-    try:
-        data = request.get_json()
-        if not data:
-            return jsonify({"error": "No JSON data provided"}), 400
-        text = data.get('text', '').strip()
-        max_length = data.get('max_length', 4096)
-        if not text:
-            return jsonify({"error": "Text is required"}), 400
-        text_length = len(text)
-        is_valid = text_length <= max_length
-        result = {
-            "text_length": text_length,
-            "max_length": max_length,
-            "is_valid": is_valid,
-            "needs_splitting": not is_valid
-        }
-        if not is_valid:
-            # Provide splitting suggestions
-            chunks = split_text_by_length(text, max_length, preserve_words=True)
-            result.update({
-                "suggested_chunks": len(chunks),
-                "chunk_preview": [chunk[:100] + "..." if len(chunk) > 100 else chunk for chunk in chunks[:3]]
-            })
-        return jsonify(result)
-    except Exception as e:
-        logger.error(f"Text validation error: {e}")
-        return jsonify({"error": "Text validation failed"}), 500
-@app.route('/api/generate', methods=['POST'])
-def generate_speech():
-    """Generate speech from text using the TTSFM package."""
-    try:
-        # Parse request data
-        data = request.get_json()
-        if not data:
-            return jsonify({"error": "No JSON data provided"}), 400
-        # Extract parameters
-        text = data.get('text', '').strip()
-        voice = data.get('voice', Voice.ALLOY.value)
-        response_format = data.get('format', AudioFormat.MP3.value)
-        instructions = data.get('instructions', '').strip() or None
-        max_length = data.get('max_length', 4096)
-        validate_length = data.get('validate_length', True)
-        # Validate required fields
-        if not text:
-            return jsonify({"error": "Text is required"}), 400
-        # Validate voice
-        try:
-            voice_enum = Voice(voice.lower())
-        except ValueError:
-            return jsonify({
-                "error": f"Invalid voice: {voice}. Must be one of: {[v.value for v in Voice]}"
-            }), 400
-        # Validate format
-        try:
-            format_enum = AudioFormat(response_format.lower())
-        except ValueError:
-            return jsonify({
-                "error": f"Invalid format: {response_format}. Must be one of: {[f.value for f in AudioFormat]}"
-            }), 400
-        logger.info(f"Generating speech: text='{text[:50]}...', voice={voice}, format={response_format}")
-        # Generate speech using the TTSFM package with validation
-        response = tts_client.generate_speech(
-            text=text,
-            voice=voice_enum,
-            response_format=format_enum,
-            instructions=instructions,
-            max_length=max_length,
-            validate_length=validate_length
-        )
-        # Return audio data
-        return Response(
-            response.audio_data,
-            mimetype=response.content_type,
-            headers={
-                'Content-Disposition': f'attachment; filename="speech.{response.format.value}"',
-                'Content-Length': str(response.size),
-                'X-Audio-Format': response.format.value,
-                'X-Audio-Size': str(response.size)
-            }
-        )
-    except ValidationException as e:
-        logger.warning(f"Validation error: {e}")
-        return jsonify({"error": str(e)}), 400
-    except APIException as e:
-        logger.error(f"API error: {e}")
-        return jsonify({
-            "error": str(e),
-            "status_code": getattr(e, 'status_code', 500)
-        }), getattr(e, 'status_code', 500)
-    except NetworkException as e:
-        logger.error(f"Network error: {e}")
-        return jsonify({
-            "error": "TTS service is currently unavailable",
-            "details": str(e)
-        }), 503
-    except TTSException as e:
-        logger.error(f"TTS error: {e}")
-        return jsonify({"error": str(e)}), 500
-    except Exception as e:
-        logger.error(f"Unexpected error: {e}")
-        return jsonify({"error": "Internal server error"}), 500
-@app.route('/api/generate-batch', methods=['POST'])
-def generate_speech_batch():
-    """Generate speech from long text by splitting into chunks."""
-    try:
-        data = request.get_json()
-        if not data:
-            return jsonify({"error": "No JSON data provided"}), 400
-        text = data.get('text', '').strip()
-        voice = data.get('voice', Voice.ALLOY.value)
-        response_format = data.get('format', AudioFormat.MP3.value)
-        instructions = data.get('instructions', '').strip() or None
-        max_length = data.get('max_length', 4096)
-        preserve_words = data.get('preserve_words', True)
-        if not text:
-            return jsonify({"error": "Text is required"}), 400
-        # Validate voice and format
-        try:
-            voice_enum = Voice(voice.lower())
-            format_enum = AudioFormat(response_format.lower())
-        except ValueError as e:
-            return jsonify({"error": f"Invalid voice or format: {e}"}), 400
-        # Split text into chunks
-        chunks = split_text_by_length(text, max_length, preserve_words)
-        if not chunks:
-            return jsonify({"error": "No valid text chunks found"}), 400
-        logger.info(f"Processing {len(chunks)} chunks for batch generation")
-        # Generate speech for each chunk
-        results = []
-        for i, chunk in enumerate(chunks):
-            try:
-                response = tts_client.generate_speech(
-                    text=chunk,
-                    voice=voice_enum,
-                    response_format=format_enum,
-                    instructions=instructions,
-                    max_length=max_length,
-                    validate_length=False  # Already split
-                )
-                # Convert to base64 for JSON response
-                import base64
-                audio_b64 = base64.b64encode(response.audio_data).decode('utf-8')
-                results.append({
-                    "chunk_index": i + 1,
-                    "chunk_text": chunk[:100] + "..." if len(chunk) > 100 else chunk,
-                    "audio_data": audio_b64,
-                    "content_type": response.content_type,
-                    "size": response.size,
-                    "format": response.format.value
-                })
-            except Exception as e:
-                logger.error(f"Failed to generate chunk {i+1}: {e}")
-                results.append({
-                    "chunk_index": i + 1,
-                    "chunk_text": chunk[:100] + "..." if len(chunk) > 100 else chunk,
-                    "error": str(e)
-                })
-        return jsonify({
-            "total_chunks": len(chunks),
-            "successful_chunks": len([r for r in results if "audio_data" in r]),
-            "results": results
-        })
-    except Exception as e:
-        logger.error(f"Batch generation error: {e}")
-        return jsonify({"error": "Batch generation failed"}), 500
-@app.route('/api/status', methods=['GET'])
-def get_status():
-    """Get service status."""
-    try:
-        # Try to make a simple request to check if the TTS service is available
-        test_response = tts_client.generate_speech(
-            text="test",
-            voice=Voice.ALLOY,
-            response_format=AudioFormat.MP3
-        )
-        return jsonify({
-            "status": "online",
-            "tts_service": "openai.fm (free)",
-            "package_version": "3.0.0",
-            "timestamp": datetime.now().isoformat()
-        })
-    except Exception as e:
-        logger.error(f"Status check failed: {e}")
-        return jsonify({
-            "status": "error",
-            "tts_service": "openai.fm (free)",
-            "error": str(e),
-            "timestamp": datetime.now().isoformat()
-        }), 503
-@app.route('/api/health', methods=['GET'])
-def health_check():
-    """Simple health check endpoint."""
-    return jsonify({
-        "status": "healthy",
-        "timestamp": datetime.now().isoformat()
-    })
-# OpenAI-compatible API endpoints
-@app.route('/v1/audio/speech', methods=['POST'])
-def openai_speech():
-    """OpenAI-compatible speech generation endpoint."""
-    try:
-        # Parse request data
-        data = request.get_json()
-        if not data:
-            return jsonify({
-                "error": {
-                    "message": "No JSON data provided",
-                    "type": "invalid_request_error",
-                    "code": "missing_data"
-                }
-            }), 400
-        # Extract OpenAI-compatible parameters
-        model = data.get('model', 'gpt-4o-mini-tts')  # Accept but ignore model
-        input_text = data.get('input', '').strip()
-        voice = data.get('voice', 'alloy')
-        response_format = data.get('response_format', 'mp3')
-        instructions = data.get('instructions', '').strip() or None
-        speed = data.get('speed', 1.0)  # Accept but ignore speed
-        # Validate required fields
-        if not input_text:
-            return jsonify({
-                "error": {
-                    "message": "Input text is required",
-                    "type": "invalid_request_error",
-                    "code": "missing_input"
-                }
-            }), 400
-        # Validate voice
-        try:
-            voice_enum = Voice(voice.lower())
-        except ValueError:
-            return jsonify({
-                "error": {
-                    "message": f"Invalid voice: {voice}. Must be one of: {[v.value for v in Voice]}",
-                    "type": "invalid_request_error",
-                    "code": "invalid_voice"
-                }
-            }), 400
-        # Validate format
-        try:
-            format_enum = AudioFormat(response_format.lower())
-        except ValueError:
-            return jsonify({
-                "error": {
-                    "message": f"Invalid response_format: {response_format}. Must be one of: {[f.value for f in AudioFormat]}",
-                    "type": "invalid_request_error",
-                    "code": "invalid_format"
-                }
-            }), 400
-        logger.info(f"OpenAI API: Generating speech: text='{input_text[:50]}...', voice={voice}, format={response_format}")
-        # Generate speech using the TTSFM package
-        response = tts_client.generate_speech(
-            text=input_text,
-            voice=voice_enum,
-            response_format=format_enum,
-            instructions=instructions,
-            max_length=4096,
-            validate_length=True
-        )
-        # Return audio data in OpenAI format
-        return Response(
-            response.audio_data,
-            mimetype=response.content_type,
-            headers={
-                'Content-Type': response.content_type,
-                'Content-Length': str(response.size),
-                'X-Audio-Format': response.format.value,
-                'X-Audio-Size': str(response.size),
-                'X-Powered-By': 'TTSFM-OpenAI-Compatible'
-            }
-        )
-    except ValidationException as e:
-        logger.warning(f"OpenAI API validation error: {e}")
-        return jsonify({
-            "error": {
-                "message": str(e),
-                "type": "invalid_request_error",
-                "code": "validation_error"
-            }
-        }), 400
-    except APIException as e:
-        logger.error(f"OpenAI API error: {e}")
-        return jsonify({
-            "error": {
-                "message": str(e),
-                "type": "api_error",
-                "code": "tts_error"
-            }
-        }), getattr(e, 'status_code', 500)
-    except NetworkException as e:
-        logger.error(f"OpenAI API network error: {e}")
-        return jsonify({
-            "error": {
-                "message": "TTS service is currently unavailable",
-                "type": "service_unavailable_error",
-                "code": "service_unavailable"
-            }
-        }), 503
-    except Exception as e:
-        logger.error(f"OpenAI API unexpected error: {e}")
-        return jsonify({
-            "error": {
-                "message": "An unexpected error occurred",
-                "type": "internal_error",
-                "code": "internal_error"
-            }
-        }), 500
-@app.route('/v1/models', methods=['GET'])
-def openai_models():
-    """OpenAI-compatible models endpoint."""
-    return jsonify({
-        "object": "list",
-        "data": [
-            {
-                "id": "gpt-4o-mini-tts",
-                "object": "model",
-                "created": 1699564800,
-                "owned_by": "ttsfm",
-                "permission": [],
-                "root": "gpt-4o-mini-tts",
-                "parent": None
-            }
-        ]
-    })
-@app.errorhandler(404)
-def not_found(error):
-    """Handle 404 errors."""
-    return jsonify({"error": "Endpoint not found"}), 404
-@app.errorhandler(405)
-def method_not_allowed(error):
-    """Handle 405 errors."""
-    return jsonify({"error": "Method not allowed"}), 405
-@app.errorhandler(500)
-def internal_error(error):
-    """Handle 500 errors."""
-    logger.error(f"Internal server error: {error}")
-    return jsonify({"error": "Internal server error"}), 500
-if __name__ == '__main__':
-    logger.info(f"Starting TTSFM web application on {HOST}:{PORT}")
-    logger.info("Using openai.fm free TTS service")
-    logger.info(f"Debug mode: {DEBUG}")
-    try:
-        app.run(
-            host=HOST,
-            port=PORT,
-            debug=DEBUG
-        )
-    except KeyboardInterrupt:
-        logger.info("Application stopped by user")
-    except Exception as e:
-        logger.error(f"Failed to start application: {e}")
-    finally:
-        # Clean up TTS client
-        tts_client.close()

+"""
+TTSFM Web Application
+A Flask web application that provides a user-friendly interface
+for the TTSFM text-to-speech package.
+"""
+import os
+import json
+import logging
+import tempfile
+import io
+from datetime import datetime
+from pathlib import Path
+from typing import Dict, Any, Optional, List
+from functools import wraps
+from urllib.parse import urlparse, urljoin
+from flask import Flask, request, jsonify, send_file, Response, render_template, redirect, url_for
+from flask_cors import CORS
+from flask_socketio import SocketIO
+from dotenv import load_dotenv
+# Import i18n support
+from i18n import init_i18n, get_locale, set_locale, _
+# Import the TTSFM package
+try:
+    from ttsfm import TTSClient, Voice, AudioFormat, TTSException
+    from ttsfm.exceptions import APIException, NetworkException, ValidationException
+    from ttsfm.utils import validate_text_length, split_text_by_length
+except ImportError:
+    # Fallback for development when package is not installed
+    import sys
+    sys.path.insert(0, os.path.join(os.path.dirname(__file__), '..'))
+    from ttsfm import TTSClient, Voice, AudioFormat, TTSException
+    from ttsfm.exceptions import APIException, NetworkException, ValidationException
+    from ttsfm.utils import validate_text_length, split_text_by_length
+# Load environment variables
+load_dotenv()
+# Configure logging
+logging.basicConfig(
+    level=logging.INFO,
+    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+)
+logger = logging.getLogger(__name__)
+# Create Flask app
+app = Flask(__name__, static_folder='static', static_url_path='/static')
+app.secret_key = os.getenv("SECRET_KEY", "ttsfm-secret-key-change-in-production")
+CORS(app)
+# Configuration (moved up for socketio initialization)
+HOST = os.getenv("HOST", "localhost")
+PORT = int(os.getenv("PORT", "8000"))
+DEBUG = os.getenv("DEBUG", "false").lower() == "true"
+# Initialize SocketIO with proper async mode
+# Using eventlet for production, threading for development
+async_mode = 'eventlet' if not DEBUG else 'threading'
+socketio = SocketIO(app, cors_allowed_origins="*", async_mode=async_mode)
+# Initialize i18n support
+init_i18n(app)
+# API Key configuration
+API_KEY = os.getenv("TTSFM_API_KEY")  # Set this environment variable for API protection
+REQUIRE_API_KEY = os.getenv("REQUIRE_API_KEY", "false").lower() == "true"
+# Create TTS client - now uses openai.fm directly, no configuration needed
+tts_client = TTSClient()
+# Initialize WebSocket handler
+from websocket_handler import WebSocketTTSHandler
+websocket_handler = WebSocketTTSHandler(socketio, tts_client)
+logger.info("Initialized web app with TTSFM using openai.fm free service")
+logger.info(f"WebSocket support enabled with {async_mode} async mode")
+# API Key validation decorator
+def require_api_key(f):
+    """Decorator to require API key for protected endpoints."""
+    @wraps(f)
+    def decorated_function(*args, **kwargs):
+        # Skip API key check if not required
+        if not REQUIRE_API_KEY:
+            return f(*args, **kwargs)
+        # Check if API key is configured
+        if not API_KEY:
+            logger.warning("API key protection is enabled but TTSFM_API_KEY is not set")
+            return jsonify({
+                "error": "API key protection is enabled but not configured properly"
+            }), 500
+        # Get API key from request headers - prioritize Authorization header (OpenAI compatible)
+        provided_key = None
+        # 1. Check Authorization header first (OpenAI standard)
+        auth_header = request.headers.get('Authorization')
+        if auth_header and auth_header.startswith('Bearer '):
+            provided_key = auth_header[7:]  # Remove 'Bearer ' prefix
+        # 2. Check X-API-Key header as fallback
+        if not provided_key:
+            provided_key = request.headers.get('X-API-Key')
+        # 3. Check API key from query parameters as fallback
+        if not provided_key:
+            provided_key = request.args.get('api_key')
+        # 4. Check API key from JSON body as fallback
+        if not provided_key and request.is_json:
+            data = request.get_json(silent=True)
+            if data:
+                provided_key = data.get('api_key')
+        # Validate API key
+        if not provided_key or provided_key != API_KEY:
+            logger.warning(f"Invalid API key attempt from {request.remote_addr}")
+            return jsonify({
+                "error": {
+                    "message": "Invalid API key provided",
+                    "type": "invalid_request_error",
+                    "code": "invalid_api_key"
+                }
+            }), 401
+        return f(*args, **kwargs)
+    return decorated_function
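
Review note: the decorator above compares keys with `!=`, which can return early on the first differing character. The sketch below is not part of the commit. It shows the same lookup order (bearer token, `X-API-Key`, query parameter, JSON body) as a pure function, together with a constant-time comparison via `hmac.compare_digest`. The names `extract_api_key` and `key_matches` are hypothetical, and plain dictionaries stand in for Flask's `request.headers`, `request.args`, and the parsed JSON body.

```python
import hmac


def extract_api_key(headers, query, body=None):
    """Return the first API key found, using the decorator's lookup order."""
    auth = headers.get("Authorization") or ""
    # Strip the "Bearer " prefix; an empty token falls through to the next source.
    bearer = auth[len("Bearer "):] if auth.startswith("Bearer ") else None
    return (
        bearer
        or headers.get("X-API-Key")
        or query.get("api_key")
        or (body or {}).get("api_key")
    )


def key_matches(provided, expected):
    """Compare two keys in constant time; missing values never match."""
    if not provided or not expected:
        return False
    # Encode to bytes so non-ASCII keys are accepted by compare_digest.
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))
```

Inside the decorator, the final check would then read `if not key_matches(provided_key, API_KEY):` with the rest of the 401 response unchanged.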
+def combine_audio_chunks(audio_chunks: List[bytes], format_type: str = "mp3") -> bytes:
+    """
+    Combine multiple audio chunks into a single audio file.
+    Args:
+        audio_chunks: List of audio data as bytes
+        format_type: Audio format (mp3, wav, etc.)
+    Returns:
+        bytes: Combined audio data
+    """
+    try:
+        # Try to use pydub for audio processing if available
+        try:
+            from pydub import AudioSegment
+            # Convert each chunk to AudioSegment
+            audio_segments = []
+            for chunk in audio_chunks:
+                if format_type.lower() == "mp3":
+                    segment = AudioSegment.from_mp3(io.BytesIO(chunk))
+                elif format_type.lower() == "wav":
+                    segment = AudioSegment.from_wav(io.BytesIO(chunk))
+                elif format_type.lower() == "opus":
+                    # For OPUS, we'll treat it as WAV since openai.fm returns WAV for OPUS requests
+                    segment = AudioSegment.from_wav(io.BytesIO(chunk))
+                else:
+                    # For other formats, try to auto-detect or default to WAV
+                    try:
+                        segment = AudioSegment.from_file(io.BytesIO(chunk))
+                    except Exception:
+                        segment = AudioSegment.from_wav(io.BytesIO(chunk))
+                audio_segments.append(segment)
+            # Combine all segments
+            combined = audio_segments[0]
+            for segment in audio_segments[1:]:
+                combined += segment
+            # Export to bytes
+            output_buffer = io.BytesIO()
+            if format_type.lower() == "mp3":
+                combined.export(output_buffer, format="mp3")
+            elif format_type.lower() == "wav":
+                combined.export(output_buffer, format="wav")
+            else:
+                # Default to the original format or WAV
+                try:
+                    combined.export(output_buffer, format=format_type.lower())
+                except Exception:
+                    combined.export(output_buffer, format="wav")
+            return output_buffer.getvalue()
+        except ImportError:
+            # Fallback: Simple concatenation for WAV files
+            logger.warning("pydub not available, using simple concatenation for WAV files")
+            if format_type.lower() == "wav":
+                return _simple_wav_concatenation(audio_chunks)
+            else:
+                # For non-WAV formats without pydub, just concatenate raw bytes
+                # This won't produce valid audio but is better than failing
+                logger.warning(f"Cannot properly combine {format_type} files without pydub, using raw concatenation")
+                return b''.join(audio_chunks)
+    except Exception as e:
+        logger.error(f"Error combining audio chunks: {e}")
+        # Fallback to simple concatenation
+        return b''.join(audio_chunks)
+def _simple_wav_concatenation(wav_chunks: List[bytes]) -> bytes:
+    """
+    Simple WAV file concatenation without external dependencies.
+    Assumes every chunk starts with a canonical 44-byte PCM header and that all
+    chunks share the same sample rate, channel count, and sample width.
+    """
+    if not wav_chunks:
+        return b''
+    if len(wav_chunks) == 1:
+        return wav_chunks[0]
+    try:
+        # For WAV files, we can do a simple concatenation by:
+        # 1. Taking the header from the first file
+        # 2. Concatenating all the audio data
+        # 3. Updating the file size in the header
+        first_wav = wav_chunks[0]
+        if len(first_wav) < 44:  # WAV header is at least 44 bytes
+            return b''.join(wav_chunks)
+        # Extract header from first file (first 44 bytes)
+        header = bytearray(first_wav[:44])
+        # Collect all audio data (skip headers for subsequent files)
+        audio_data = first_wav[44:]  # Audio data from first file
+        for wav_chunk in wav_chunks[1:]:
+            if len(wav_chunk) > 44:
+                audio_data += wav_chunk[44:]  # Skip header, append audio data
+        # Update file size in header (bytes 4-7)
+        total_size = len(header) + len(audio_data) - 8
+        header[4:8] = total_size.to_bytes(4, byteorder='little')
+        # Update data chunk size in header (bytes 40-43)
+        data_size = len(audio_data)
+        header[40:44] = data_size.to_bytes(4, byteorder='little')
+        return bytes(header) + audio_data
+    except Exception as e:
+        logger.error(f"Error in simple WAV concatenation: {e}")
+        # Ultimate fallback
+        return b''.join(wav_chunks)
+def _is_safe_url(target: Optional[str]) -> bool:
+    """Validate that a target URL is safe for redirection.
+    Allows only relative URLs or absolute URLs that match this server's host
+    and http/https schemes. Prevents open redirects to external domains.
+    """
+    if not target:
+        return False
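+    # Reject protocol-relative targets and backslashes, which some browsers
+    # normalize to forward slashes (turning "/\host" into "//host")
+    if target.startswith('//') or '\\' in target:
+        return False
+    # Resolve relative targets against this server, then require the same host.
+    # request.referrer is an absolute URL, so same-host absolute URLs must pass.
+    host = urlparse(request.host_url)
+    j = urlparse(urljoin(request.host_url, target))
+    return j.scheme in ("http", "https") and j.netloc == host.netloc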
+@app.route('/set-language/<lang_code>')
+def set_language(lang_code):
+    """Set the user's language preference."""
+    if set_locale(lang_code):
+        # Redirect back only if the referrer is safe; otherwise go home
+        target = request.referrer
+        if _is_safe_url(target):
+            return redirect(target)
+        return redirect(url_for('index'))
+    else:
+        # Invalid language code, redirect to home
+        return redirect(url_for('index'))
+@app.route('/')
+def index():
+    """Serve the main web interface."""
+    return render_template('index.html')
+@app.route('/playground')
+def playground():
+    """Serve the interactive playground."""
+    return render_template('playground.html')
+@app.route('/docs')
+def docs():
+    """Serve the API documentation."""
+    return render_template('docs.html')
+@app.route('/websocket-demo')
+def websocket_demo():
+    """Serve the WebSocket streaming demo page."""
+    return render_template('websocket_demo.html')
+@app.route('/api/voices', methods=['GET'])
+def get_voices():
+    """Get list of available voices."""
+    try:
+        voices = [
+            {
+                "id": voice.value,
+                "name": voice.value.title(),
+                "description": f"{voice.value.title()} voice"
+            }
+            for voice in Voice
+        ]
+        return jsonify({
+            "voices": voices,
+            "count": len(voices)
+        })
+    except Exception as e:
+        logger.error(f"Error getting voices: {e}")
+        return jsonify({"error": "Failed to get voices"}), 500
+@app.route('/api/formats', methods=['GET'])
+def get_formats():
+    """Get list of supported audio formats."""
+    try:
+        formats = [
+            {
+                "id": "mp3",
+                "name": "MP3",
+                "mime_type": "audio/mpeg",
+                "description": "MP3 audio format - good quality, small file size",
+                "quality": "Good",
+                "file_size": "Small",
+                "use_case": "Web, mobile apps, general use"
+            },
+            {
+                "id": "opus",
+                "name": "OPUS",
+                "mime_type": "audio/opus",
+                "description": "OPUS audio format - excellent quality, small file size",
+                "quality": "Excellent",
+                "file_size": "Small",
+                "use_case": "Web streaming, VoIP"
+            },
+            {
+                "id": "aac",
+                "name": "AAC",
+                "mime_type": "audio/aac",
+                "description": "AAC audio format - good quality, medium file size",
+                "quality": "Good",
+                "file_size": "Medium",
+                "use_case": "Apple devices, streaming"
+            },
+            {
+                "id": "flac",
+                "name": "FLAC",
+                "mime_type": "audio/flac",
+                "description": "FLAC audio format - lossless quality, large file size",
+                "quality": "Lossless",
+                "file_size": "Large",
+                "use_case": "High-quality archival"
+            },
+            {
+                "id": "wav",
+                "name": "WAV",
+                "mime_type": "audio/wav",
+                "description": "WAV audio format - lossless quality, large file size",
+                "quality": "Lossless",
+                "file_size": "Large",
+                "use_case": "Professional audio"
+            },
+            {
+                "id": "pcm",
+                "name": "PCM",
+                "mime_type": "audio/pcm",
+                "description": "PCM audio format - raw audio data, large file size",
+                "quality": "Raw",
+                "file_size": "Large",
+                "use_case": "Audio processing"
+            }
+        ]
+        return jsonify({
+            "formats": formats,
+            "count": len(formats)
+        })
+    except Exception as e:
+        logger.error(f"Error getting formats: {e}")
+        return jsonify({"error": "Failed to get formats"}), 500
+@app.route('/api/validate-text', methods=['POST'])
+def validate_text():
+    """Validate text length and provide splitting suggestions."""
+    try:
+        data = request.get_json(silent=True)
+        if not data:
+            return jsonify({"error": "No JSON data provided"}), 400
+        text = data.get('text', '').strip()
+        max_length = data.get('max_length', 4096)
+        if not text:
+            return jsonify({"error": "Text is required"}), 400
+        text_length = len(text)
+        is_valid = text_length <= max_length
+        result = {
+            "text_length": text_length,
+            "max_length": max_length,
+            "is_valid": is_valid,
+            "needs_splitting": not is_valid
+        }
+        if not is_valid:
+            # Provide splitting suggestions
+            chunks = split_text_by_length(text, max_length, preserve_words=True)
+            result.update({
+                "suggested_chunks": len(chunks),
+                "chunk_preview": [chunk[:100] + "..." if len(chunk) > 100 else chunk for chunk in chunks[:3]]
+            })
+        return jsonify(result)
+    except Exception as e:
+        logger.error(f"Text validation error: {e}")
+        return jsonify({"error": "Text validation failed"}), 500
+@app.route('/api/generate', methods=['POST'])
+@require_api_key
+def generate_speech():
+    """Generate speech from text using the TTSFM package."""
+    try:
+        # Parse request data
+        data = request.get_json(silent=True)
+        if not data:
+            return jsonify({"error": "No JSON data provided"}), 400
+        # Extract parameters
+        text = data.get('text', '').strip()
+        voice = data.get('voice', Voice.ALLOY.value)
+        response_format = data.get('format', AudioFormat.MP3.value)
+        instructions = data.get('instructions', '').strip() or None
+        max_length = data.get('max_length', 4096)
+        validate_length = data.get('validate_length', True)
+        # Validate required fields
+        if not text:
+            return jsonify({"error": "Text is required"}), 400
+        # Validate voice
+        try:
+            voice_enum = Voice(voice.lower())
+        except (ValueError, AttributeError):  # AttributeError: non-string value
+            return jsonify({
+                "error": f"Invalid voice: {voice}. Must be one of: {[v.value for v in Voice]}"
+            }), 400
+        # Validate format
+        try:
+            format_enum = AudioFormat(response_format.lower())
+        except (ValueError, AttributeError):  # AttributeError: non-string value
+            return jsonify({
+                "error": f"Invalid format: {response_format}. Must be one of: {[f.value for f in AudioFormat]}"
+            }), 400
+        logger.info(f"Generating speech: text='{text[:50]}...', voice={voice}, format={response_format}")
+        # Generate speech using the TTSFM package with validation
+        response = tts_client.generate_speech(
+            text=text,
+            voice=voice_enum,
+            response_format=format_enum,
+            instructions=instructions,
+            max_length=max_length,
+            validate_length=validate_length
+        )
+        # Return audio data
+        return Response(
+            response.audio_data,
+            mimetype=response.content_type,
+            headers={
+                'Content-Disposition': f'attachment; filename="speech.{response.format.value}"',
+                'Content-Length': str(response.size),
+                'X-Audio-Format': response.format.value,
+                'X-Audio-Size': str(response.size)
+            }
+        )
+    except ValidationException as e:
+        logger.warning(f"Validation error: {e}")
+        return jsonify({"error": "Invalid input parameters"}), 400
+    except APIException as e:
+        logger.error(f"API error: {e}")
+        return jsonify({
+            "error": "TTS service error",
+            "status_code": getattr(e, 'status_code', 500)
+        }), getattr(e, 'status_code', 500)
+    except NetworkException as e:
+        logger.error(f"Network error: {e}")
+        return jsonify({
+            "error": "TTS service is currently unavailable"
+        }), 503
+    except TTSException as e:
+        logger.error(f"TTS error: {e}")
+        return jsonify({"error": "Text-to-speech generation failed"}), 500
+    except Exception as e:
+        logger.error(f"Unexpected error: {e}")
+        return jsonify({"error": "Internal server error"}), 500
+@app.route('/api/generate-combined', methods=['POST'])
+@require_api_key
+def generate_speech_combined():
+    """Generate speech from long text and return a single combined audio file."""
+    try:
+        data = request.get_json(silent=True)
+        if not data:
+            return jsonify({"error": "No JSON data provided"}), 400
+        text = data.get('text', '').strip()
+        voice = data.get('voice', Voice.ALLOY.value)
+        response_format = data.get('format', AudioFormat.MP3.value)
+        instructions = data.get('instructions', '').strip() or None
+        max_length = data.get('max_length', 4096)
+        preserve_words = data.get('preserve_words', True)
+        if not text:
+            return jsonify({"error": "Text is required"}), 400
+        # Check if text needs splitting
+        if len(text) <= max_length:
+            # Text is short enough, use regular generation
+            try:
+                voice_enum = Voice(voice.lower())
+                format_enum = AudioFormat(response_format.lower())
+            except (ValueError, AttributeError) as e:
+                logger.warning(f"Invalid voice or format: {e}")
+                return jsonify({"error": "Invalid voice or format specified"}), 400
+            response = tts_client.generate_speech(
+                text=text,
+                voice=voice_enum,
+                response_format=format_enum,
+                instructions=instructions,
+                max_length=max_length,
+                validate_length=True
+            )
+            return Response(
+                response.audio_data,
+                mimetype=response.content_type,
+                headers={
+                    'Content-Disposition': f'attachment; filename="combined_speech.{response.format.value}"',
+                    'Content-Length': str(response.size),
+                    'X-Audio-Format': response.format.value,
+                    'X-Audio-Size': str(response.size),
+                    'X-Chunks-Combined': '1'
+                }
+            )
+        # Text is long, split and combine
+        try:
+            voice_enum = Voice(voice.lower())
+            format_enum = AudioFormat(response_format.lower())
+        except (ValueError, AttributeError) as e:
+            logger.warning(f"Invalid voice or format: {e}")
+            return jsonify({"error": "Invalid voice or format specified"}), 400
+        logger.info(f"Generating combined speech for long text: {len(text)} characters, splitting into chunks")
+        # Generate speech chunks
+        try:
+            responses = tts_client.generate_speech_long_text(
+                text=text,
+                voice=voice_enum,
+                response_format=format_enum,
+                instructions=instructions,
+                max_length=max_length,
+                preserve_words=preserve_words
+            )
+        except Exception as e:
+            logger.error(f"Long text generation failed: {e}")
+            return jsonify({"error": "Long text generation failed"}), 500
+        if not responses:
+            return jsonify({"error": "No valid text chunks found"}), 400
+        logger.info(f"Generated {len(responses)} chunks, combining into single audio file")
+        # Extract audio data from responses
+        audio_chunks = [response.audio_data for response in responses]
+        # Combine audio chunks
+        try:
+            combined_audio = combine_audio_chunks(audio_chunks, format_enum.value)
+        except Exception as e:
+            logger.error(f"Failed to combine audio chunks: {e}")
+            return jsonify({"error": "Failed to combine audio chunks"}), 500
+        if not combined_audio:
+            return jsonify({"error": "Failed to generate combined audio"}), 500
+        # Determine content type
+        content_type = responses[0].content_type  # Use content type from first chunk
+        logger.info(f"Successfully combined {len(responses)} chunks into single audio file ({len(combined_audio)} bytes)")
+        return Response(
+            combined_audio,
+            mimetype=content_type,
+            headers={
+                'Content-Disposition': f'attachment; filename="combined_speech.{format_enum.value}"',
+                'Content-Length': str(len(combined_audio)),
+                'X-Audio-Format': format_enum.value,
+                'X-Audio-Size': str(len(combined_audio)),
+                'X-Chunks-Combined': str(len(responses)),
+                'X-Original-Text-Length': str(len(text))
+            }
+        )
+    except ValidationException as e:
+        logger.warning(f"Validation error: {e}")
+        return jsonify({"error": "Invalid input parameters"}), 400
+    except APIException as e:
+        logger.error(f"API error: {e}")
+        return jsonify({
+            "error": "TTS service error",
+            "status_code": getattr(e, 'status_code', 500)
+        }), getattr(e, 'status_code', 500)
+    except NetworkException as e:
+        logger.error(f"Network error: {e}")
+        return jsonify({
+            "error": "TTS service is currently unavailable"
+        }), 503
+    except TTSException as e:
+        logger.error(f"TTS error: {e}")
+        return jsonify({"error": "Text-to-speech generation failed"}), 500
+    except Exception as e:
+        logger.error(f"Combined generation error: {e}")
+        return jsonify({"error": "Combined audio generation failed"}), 500
+@app.route('/api/status', methods=['GET'])
+def get_status():
+    """Get service status."""
+    try:
+        # Try to make a simple request to check if the TTS service is available
+        tts_client.generate_speech(
+            text="test",
+            voice=Voice.ALLOY,
+            response_format=AudioFormat.MP3
+        )
+        return jsonify({
+            "status": "online",
+            "tts_service": "openai.fm (free)",
+            "package_version": "3.2.3",
+            "timestamp": datetime.now().isoformat()
+        })
+    except Exception as e:
+        logger.error(f"Status check failed: {e}")
+        return jsonify({
+            "status": "error",
+            "tts_service": "openai.fm (free)",
+            "error": "Service status check failed",
+            "timestamp": datetime.now().isoformat()
+        }), 503
+@app.route('/api/health', methods=['GET'])
+def health_check():
+    """Simple health check endpoint."""
+    return jsonify({
+        "status": "healthy",
+        "package_version": "3.2.3",
+        "timestamp": datetime.now().isoformat()
+    })
+@app.route('/api/websocket/status', methods=['GET'])
+def websocket_status():
+    """Get WebSocket server status and active connections."""
+    return jsonify({
+        "websocket_enabled": True,
+        "async_mode": async_mode,
+        "active_sessions": websocket_handler.get_active_sessions_count(),
+        "transport_options": ["websocket", "polling"],
+        "endpoint": f"ws{'s' if request.is_secure else ''}://{request.host}/socket.io/",
+        "timestamp": datetime.now().isoformat()
+    })
+@app.route('/api/auth-status', methods=['GET'])
+def auth_status():
+    """Get authentication status and requirements."""
+    return jsonify({
+        "api_key_required": REQUIRE_API_KEY,
+        "api_key_configured": bool(API_KEY) if REQUIRE_API_KEY else None,
+        "timestamp": datetime.now().isoformat()
+    })
+@app.route('/api/translations/<lang_code>', methods=['GET'])
+def get_translations(lang_code):
+    """Get translations for a specific language."""
+    try:
+        if hasattr(app, 'language_manager'):
+            translations = app.language_manager.translations.get(lang_code, {})
+            return jsonify(translations)
+        else:
+            return jsonify({}), 404
+    except Exception as e:
+        logger.error(f"Error getting translations for {lang_code}: {e}")
+        return jsonify({"error": "Failed to get translations"}), 500
+# OpenAI-compatible API endpoints
+@app.route('/v1/audio/speech', methods=['POST'])
+@require_api_key
+def openai_speech():
+    """OpenAI-compatible speech generation endpoint with auto-combine feature."""
+    try:
+        # Parse request data
+        data = request.get_json(silent=True)
+        if not data:
+            return jsonify({
+                "error": {
+                    "message": "No JSON data provided",
+                    "type": "invalid_request_error",
+                    "code": "missing_data"
+                }
+            }), 400
+        # Extract OpenAI-compatible parameters
+        model = data.get('model', 'gpt-4o-mini-tts')  # Accept but ignore model
+        input_text = data.get('input', '').strip()
+        voice = data.get('voice', 'alloy')
+        response_format = data.get('response_format', 'mp3')
+        instructions = data.get('instructions', '').strip() or None
+        speed = data.get('speed', 1.0)  # Accept but ignore speed
+        # TTSFM-specific parameters
+        auto_combine = data.get('auto_combine', True)  # New parameter: auto-combine long text (default: True)
+        max_length = data.get('max_length', 4096)  # Custom parameter for chunk size
+        # Validate required fields
+        if not input_text:
+            return jsonify({
+                "error": {
+                    "message": "Input text is required",
+                    "type": "invalid_request_error",
+                    "code": "missing_input"
+                }
+            }), 400
+        # Validate voice
+        try:
+            voice_enum = Voice(voice.lower())
+        except (ValueError, AttributeError):  # AttributeError: non-string value
+            return jsonify({
+                "error": {
+                    "message": f"Invalid voice: {voice}. Must be one of: {[v.value for v in Voice]}",
+                    "type": "invalid_request_error",
+                    "code": "invalid_voice"
+                }
+            }), 400
+        # Validate format
+        try:
+            format_enum = AudioFormat(response_format.lower())
+        except (ValueError, AttributeError):  # AttributeError: non-string value
+            return jsonify({
+                "error": {
+                    "message": f"Invalid response_format: {response_format}. Must be one of: {[f.value for f in AudioFormat]}",
+                    "type": "invalid_request_error",
+                    "code": "invalid_format"
+                }
+            }), 400
+        logger.info(f"OpenAI API: Generating speech: text='{input_text[:50]}...', voice={voice}, format={response_format}, auto_combine={auto_combine}")
+        # Check if text exceeds limit and auto_combine is enabled
+        if len(input_text) > max_length and auto_combine:
+            # Long text with auto-combine enabled: split and combine
+            logger.info(f"Long text detected ({len(input_text)} chars), auto-combining enabled")
+            # Generate speech chunks
+            responses = tts_client.generate_speech_long_text(
+                text=input_text,
+                voice=voice_enum,
+                response_format=format_enum,
+                instructions=instructions,
+                max_length=max_length,
+                preserve_words=True
+            )
+            if not responses:
+                return jsonify({
+                    "error": {
+                        "message": "No valid text chunks found",
+                        "type": "processing_error",
+                        "code": "no_chunks"
+                    }
+                }), 400
+            # Extract audio data and combine
+            audio_chunks = [response.audio_data for response in responses]
+            combined_audio = combine_audio_chunks(audio_chunks, format_enum.value)
+            if not combined_audio:
+                return jsonify({
+                    "error": {
+                        "message": "Failed to combine audio chunks",
+                        "type": "processing_error",
+                        "code": "combine_failed"
+                    }
+                }), 500
+            content_type = responses[0].content_type
+            logger.info(f"Successfully combined {len(responses)} chunks into single audio file")
+            return Response(
+                combined_audio,
+                mimetype=content_type,
+                headers={
+                    'Content-Type': content_type,
+                    'Content-Length': str(len(combined_audio)),
+                    'X-Audio-Format': format_enum.value,
+                    'X-Audio-Size': str(len(combined_audio)),
+                    'X-Chunks-Combined': str(len(responses)),
+                    'X-Original-Text-Length': str(len(input_text)),
+                    'X-Auto-Combine': 'true',
+                    'X-Powered-By': 'TTSFM-OpenAI-Compatible'
+                }
+            )
+        else:
+            # Short text or auto_combine disabled: use regular generation
+            if len(input_text) > max_length and not auto_combine:
+                # Text is too long but auto_combine is disabled - return error
+                return jsonify({
+                    "error": {
+                        "message": f"Input text is too long ({len(input_text)} characters). Maximum allowed length is {max_length} characters. Enable auto_combine parameter to automatically split and combine long text.",
+                        "type": "invalid_request_error",
+                        "code": "text_too_long"
+                    }
+                }), 400
+            # Generate speech using the TTSFM package
+            response = tts_client.generate_speech(
+                text=input_text,
+                voice=voice_enum,
+                response_format=format_enum,
+                instructions=instructions,
+                max_length=max_length,
+                validate_length=True
+            )
+            # Return audio data in OpenAI format
+            return Response(
+                response.audio_data,
+                mimetype=response.content_type,
+                headers={
+                    'Content-Type': response.content_type,
+                    'Content-Length': str(response.size),
+                    'X-Audio-Format': response.format.value,
+                    'X-Audio-Size': str(response.size),
+                    'X-Chunks-Combined': '1',
+                    'X-Auto-Combine': str(auto_combine).lower(),
+                    'X-Powered-By': 'TTSFM-OpenAI-Compatible'
+                }
+            )
+    except ValidationException as e:
+        logger.warning(f"OpenAI API validation error: {e}")
+        return jsonify({
+            "error": {
+                "message": "Invalid request parameters",
+                "type": "invalid_request_error",
+                "code": "validation_error"
+            }
+        }), 400
+    except APIException as e:
+        logger.error(f"OpenAI API error: {e}")
+        return jsonify({
+            "error": {
+                "message": "Text-to-speech generation failed",
+                "type": "api_error",
+                "code": "tts_error"
+            }
+        }), getattr(e, 'status_code', 500)
+    except NetworkException as e:
+        logger.error(f"OpenAI API network error: {e}")
+        return jsonify({
+            "error": {
+                "message": "TTS service is currently unavailable",
+                "type": "service_unavailable_error",
+                "code": "service_unavailable"
+            }
+        }), 503
+    except Exception as e:
+        logger.error(f"OpenAI API unexpected error: {e}")
+        return jsonify({
+            "error": {
+                "message": "An unexpected error occurred",
+                "type": "internal_error",
+                "code": "internal_error"
+            }
+        }), 500
+@app.route('/v1/models', methods=['GET'])
+def openai_models():
+    """OpenAI-compatible models endpoint."""
+    return jsonify({
+        "object": "list",
+        "data": [
+            {
+                "id": "gpt-4o-mini-tts",
+                "object": "model",
+                "created": 1699564800,
+                "owned_by": "ttsfm",
+                "permission": [],
+                "root": "gpt-4o-mini-tts",
+                "parent": None
+            }
+        ]
+    })
+@app.errorhandler(404)
+def not_found(error):
+    """Handle 404 errors."""
+    return jsonify({"error": "Endpoint not found"}), 404
+@app.errorhandler(405)
+def method_not_allowed(error):
+    """Handle 405 errors."""
+    return jsonify({"error": "Method not allowed"}), 405
+@app.errorhandler(500)
+def internal_error(error):
+    """Handle 500 errors."""
+    logger.error(f"Internal server error: {error}")
+    return jsonify({"error": "Internal server error"}), 500
+if __name__ == '__main__':
+    logger.info(f"Starting TTSFM web application on {HOST}:{PORT}")
+    logger.info("Using openai.fm free TTS service")
+    logger.info(f"Debug mode: {DEBUG}")
+    # Log API key protection status
+    if REQUIRE_API_KEY:
+        if API_KEY:
+            logger.info("🔒 API key protection is ENABLED")
+            logger.info("All TTS generation requests require a valid API key")
+        else:
+            logger.warning("⚠️  API key protection is enabled but TTSFM_API_KEY is not set!")
+            logger.warning("Please set the TTSFM_API_KEY environment variable")
+    else:
+        logger.info("🔓 API key protection is DISABLED - all requests are allowed")
+        logger.info("Set REQUIRE_API_KEY=true to enable API key protection")
+    try:
+        logger.info(f"Starting with {async_mode} async mode")
+        socketio.run(app, host=HOST, port=PORT, debug=DEBUG)
+    except KeyboardInterrupt:
+        logger.info("Application stopped by user")
+    except Exception as e:
+        logger.error(f"Failed to start application: {e}")
+    finally:
+        # Clean up TTS client
+        tts_client.close()
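
A note on `_simple_wav_concatenation` in the `ttsfm-web/app.py` diff above: it slices a fixed 44 bytes off each chunk, which only holds for WAV files with a canonical header and no extra chunks. A minimal sketch of an alternative using the standard-library `wave` module follows. The function name `concat_wav_chunks` is hypothetical and not part of this commit, and the sketch assumes every chunk is uncompressed PCM with the same channel count, sample width, and frame rate.

```python
import io
import wave
from typing import List


def concat_wav_chunks(wav_chunks: List[bytes]) -> bytes:
    """Concatenate PCM WAV byte strings (hypothetical helper, not in the commit)."""
    if not wav_chunks:
        return b""

    params = None  # (channels, sample width, frame rate) of the first chunk
    frames = []
    for chunk in wav_chunks:
        # Let the wave module locate the data chunk instead of assuming 44 bytes
        with wave.open(io.BytesIO(chunk), "rb") as reader:
            current = (
                reader.getnchannels(),
                reader.getsampwidth(),
                reader.getframerate(),
            )
            if params is None:
                params = current
            elif current != params:
                raise ValueError("WAV chunks have mismatched audio parameters")
            frames.append(reader.readframes(reader.getnframes()))

    out = io.BytesIO()
    with wave.open(out, "wb") as writer:
        writer.setnchannels(params[0])
        writer.setsampwidth(params[1])
        writer.setframerate(params[2])
        # The writer fills in the RIFF and data sizes when it is closed
        writer.writeframes(b"".join(frames))
    return out.getvalue()
```

Unlike the byte-slicing version, this raises on mismatched chunks rather than producing audio that plays at the wrong rate, so a caller would still need its own fallback.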

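The docstring of `_is_safe_url` in the `ttsfm-web/app.py` diff above describes allowing relative URLs and absolute URLs on the server's own host. A minimal sketch of that rule outside Flask is shown here so it can be checked in isolation. The name `is_same_host_url` and the explicit `host_url` parameter are hypothetical, with `host_url` standing in for `request.host_url`.

```python
from typing import Optional
from urllib.parse import urljoin, urlparse


def is_same_host_url(target: Optional[str], host_url: str) -> bool:
    """Return True only for targets that resolve to host_url's own host."""
    if not target:
        return False
    # Backslashes may be normalized to slashes by browsers, so reject them
    if "\\" in target:
        return False
    host = urlparse(host_url)
    # Relative targets are resolved against the server's base URL
    resolved = urlparse(urljoin(host_url, target))
    return resolved.scheme in ("http", "https") and resolved.netloc == host.netloc


base = "http://localhost:7000/"
print(is_same_host_url("/docs", base))
print(is_same_host_url("http://localhost:7000/playground", base))
print(is_same_host_url("https://evil.example/x", base))
```

Under these assumptions the first two calls are accepted and the third is rejected, which is the behavior a `Referer`-based redirect needs, since browsers send the referrer as an absolute URL.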
ttsfm-web/i18n.py ADDED

	@@ -0,0 +1,238 @@

+"""
+Internationalization (i18n) support for TTSFM Web Application
+This module provides multi-language support for the Flask web application,
+including language detection, translation management, and template functions.
+"""
+import json
+import os
+from typing import Dict, Any, Optional
+from flask import request, session, current_app
+class LanguageManager:
+    """Manages language detection, translation loading, and text translation."""
+    def __init__(self, app=None, translations_dir: str = "translations"):
+        """
+        Initialize the LanguageManager.
+        Args:
+            app: Flask application instance
+            translations_dir: Directory containing translation files
+        """
+        self.translations_dir = translations_dir
+        self.translations: Dict[str, Dict[str, Any]] = {}
+        self.supported_languages = ['en', 'zh']
+        self.default_language = 'en'
+        if app is not None:
+            self.init_app(app)
+    def init_app(self, app):
+        """Initialize the Flask application with i18n support."""
+        app.config.setdefault('LANGUAGES', self.supported_languages)
+        app.config.setdefault('DEFAULT_LANGUAGE', self.default_language)
+        # Load translations
+        self.load_translations()
+        # Register template functions
+        app.jinja_env.globals['_'] = self.translate
+        app.jinja_env.globals['get_locale'] = self.get_locale
+        app.jinja_env.globals['get_supported_languages'] = self.get_supported_languages
+        # Store reference to this instance
+        app.language_manager = self
+    def load_translations(self):
+        """Load all translation files from the translations directory."""
+        translations_path = os.path.join(
+            os.path.dirname(__file__),
+            self.translations_dir
+        )
+        if not os.path.exists(translations_path):
+            print(f"Warning: Translations directory not found: {translations_path}")
+            return
+        for lang_code in self.supported_languages:
+            file_path = os.path.join(translations_path, f"{lang_code}.json")
+            if os.path.exists(file_path):
+                try:
+                    with open(file_path, 'r', encoding='utf-8') as f:
+                        self.translations[lang_code] = json.load(f)
+                    print(f"Info: Loaded translations for language: {lang_code}")
+                except Exception as e:
+                    print(f"Error: Failed to load translations for {lang_code}: {e}")
+            else:
+                print(f"Warning: Translation file not found: {file_path}")
+    def get_locale(self) -> str:
+        """
+        Get the current locale based on user preference, session, or browser settings.
+        Returns:
+            Language code (e.g., 'en', 'zh')
+        """
+        # 1. Check URL parameter (for language switching)
+        if 'lang' in request.args:
+            lang = request.args.get('lang')
+            if lang in self.supported_languages:
+                session['language'] = lang
+                return lang
+        # 2. Check session (user's previous choice)
+        if 'language' in session:
+            lang = session['language']
+            if lang in self.supported_languages:
+                return lang
+        # 3. Check browser's Accept-Language header
+        if request.headers.get('Accept-Language'):
+            browser_langs = request.headers.get('Accept-Language').split(',')
+            for browser_lang in browser_langs:
+                # Extract language code (e.g., 'zh-CN' -> 'zh')
+                lang_code = browser_lang.split(';')[0].split('-')[0].strip().lower()
+                if lang_code in self.supported_languages:
+                    session['language'] = lang_code
+                    return lang_code
+        # 4. Fall back to default language
+        return self.default_language
+    def set_locale(self, lang_code: str) -> bool:
+        """
+        Set the current locale.
+        Args:
+            lang_code: Language code to set
+        Returns:
+            True if successful, False if language not supported
+        """
+        if lang_code in self.supported_languages:
+            session['language'] = lang_code
+            return True
+        return False
+    def translate(self, key: str, **kwargs) -> str:
+        """
+        Translate a text key to the current locale.
+        Args:
+            key: Translation key in dot notation (e.g., 'nav.home')
+            **kwargs: Variables for string formatting
+        Returns:
+            Translated text or the key if translation not found
+        """
+        locale = self.get_locale()
+        # Get translation for current locale
+        translation = self._get_nested_value(
+            self.translations.get(locale, {}),
+            key
+        )
+        # Fall back to default language if not found
+        if translation is None and locale != self.default_language:
+            translation = self._get_nested_value(
+                self.translations.get(self.default_language, {}),
+                key
+            )
+        # Fall back to key if still not found
+        if translation is None:
+            translation = key
+        # Format with variables if provided
+        if kwargs and isinstance(translation, str):
+            try:
+                translation = translation.format(**kwargs)
+            except (KeyError, IndexError, ValueError):
+                pass  # Ignore formatting errors
+        return translation
+    def _get_nested_value(self, data: Dict[str, Any], key: str) -> Optional[str]:
+        """
+        Get a nested value from a dictionary using dot notation.
+        Args:
+            data: Dictionary to search in
+            key: Dot-separated key (e.g., 'nav.home')
+        Returns:
+            Value if found, None otherwise
+        """
+        keys = key.split('.')
+        current = data
+        for k in keys:
+            if isinstance(current, dict) and k in current:
+                current = current[k]
+            else:
+                return None
+        return current if isinstance(current, str) else None
+    def get_supported_languages(self) -> Dict[str, str]:
+        """
+        Get a dictionary of supported languages with their display names.
+        Returns:
+            Dictionary mapping language codes to display names
+        """
+        return {
+            'en': 'English',
+            'zh': '中文'
+        }
+    def get_language_info(self, lang_code: str) -> Dict[str, str]:
+        """
+        Get information about a specific language.
+        Args:
+            lang_code: Language code
+        Returns:
+            Dictionary with language information
+        """
+        language_names = {
+            'en': {'name': 'English', 'native': 'English'},
+            'zh': {'name': 'Chinese', 'native': '中文'}
+        }
+        return language_names.get(lang_code, {
+            'name': lang_code.upper(),
+            'native': lang_code.upper()
+        })
+# Global instance
+language_manager = LanguageManager()
+def init_i18n(app):
+    """Initialize i18n support for the Flask application."""
+    language_manager.init_app(app)
+    return language_manager
+# Template helper functions
+def _(key: str, **kwargs) -> str:
+    """Shorthand translation function for use in templates and code."""
+    return language_manager.translate(key, **kwargs)
+def get_locale() -> str:
+    """Get the current locale."""
+    return language_manager.get_locale()
+def set_locale(lang_code: str) -> bool:
+    """Set the current locale."""
+    return language_manager.set_locale(lang_code)
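
A minimal standalone sketch of the two behaviours `LanguageManager` implements above, with Flask removed so it runs on its own: the locale resolution order (URL parameter, then session, then `Accept-Language`, then default) and the dot-notation lookup that falls back to the default language and finally to the key itself. The names `resolve_locale`, `lookup`, and `translate`, and the sample `TRANSLATIONS` dictionary, are hypothetical and not part of the commit. As in the module, `Accept-Language` entries are tried in header order and q-values are ignored.

```python
from typing import Any, Dict, Optional

SUPPORTED = ["en", "zh"]
DEFAULT = "en"

# Hypothetical sample data; the real app loads translations/<lang>.json.
TRANSLATIONS: Dict[str, Dict[str, Any]] = {
    "en": {"nav": {"home": "Home", "docs": "Docs"}, "greet": "Hello, {name}!"},
    "zh": {"nav": {"home": "首页"}},
}


def resolve_locale(query_lang: Optional[str], session_lang: Optional[str],
                   accept_language: Optional[str]) -> str:
    """Mirror get_locale(): URL parameter, session, Accept-Language, default."""
    if query_lang in SUPPORTED:
        return query_lang
    if session_lang in SUPPORTED:
        return session_lang
    for entry in (accept_language or "").split(","):
        # 'zh-CN;q=0.9' -> 'zh'; q-values are ignored, header order wins.
        code = entry.split(";")[0].split("-")[0].strip().lower()
        if code in SUPPORTED:
            return code
    return DEFAULT


def lookup(data: Dict[str, Any], key: str) -> Optional[str]:
    """Walk a nested dict with a dot-separated key; None if absent or not a string."""
    current: Any = data
    for part in key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current if isinstance(current, str) else None


def translate(locale: str, key: str, **kwargs: Any) -> str:
    """Try the locale, then the default language, then return the key unchanged."""
    text = lookup(TRANSLATIONS.get(locale, {}), key)
    if text is None and locale != DEFAULT:
        text = lookup(TRANSLATIONS.get(DEFAULT, {}), key)
    if text is None:
        text = key
    if kwargs:
        try:
            text = text.format(**kwargs)
        except (KeyError, IndexError, ValueError):
            pass  # keep the unformatted text on formatting errors
    return text
```

Keeping resolution and lookup as pure functions like this makes the fallback chain easy to unit-test without a Flask request context; a key that resolves to a nested dictionary rather than a string (for example `nav`) is treated as missing and returned unchanged.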

ttsfm-web/requirements.txt CHANGED

@@ -1,9 +1,16 @@
-# Web application dependencies
-flask>=2.0.0
-flask-cors>=3.0.10
-waitress>=3.0.0
-python-dotenv>=1.0.0
-# TTSFM package (install from local directory or PyPI)
-# For local development: pip install -e ../
-# For Docker/production: installed via pyproject.toml[web] dependencies

+# Web application dependencies
+flask>=2.0.0
+flask-cors>=3.0.10
+flask-socketio>=5.3.0
+python-socketio>=5.10.0
+eventlet>=0.33.3
+waitress>=3.0.0
+python-dotenv>=1.0.0
+# Audio processing (optional, for combining audio files)
+# If not installed, will fall back to simple concatenation for WAV files
+pydub>=0.25.0
+# TTSFM package (install from local directory or PyPI)
+# For local development: pip install -e ../
+# For Docker/production: installed via pyproject.toml[web] dependencies
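
The comment above says `pydub` is optional and that WAV files fall back to simple concatenation when it is missing. The application's actual fallback code is not shown in this part of the diff, so the following is only a sketch of how such an optional import might look, using the standard-library `wave` module for the fallback. The function names `concat_wav_simple` and `combine_wav` are hypothetical, and the fallback assumes every chunk shares the same channel count, sample width, and frame rate.

```python
import io
import wave
from typing import List

try:
    from pydub import AudioSegment  # optional dependency
except ImportError:
    AudioSegment = None


def concat_wav_simple(chunks: List[bytes]) -> bytes:
    """Concatenate WAV byte strings that share the same audio parameters."""
    if not chunks:
        raise ValueError("no audio chunks given")
    out = io.BytesIO()
    writer = None
    params = None
    for chunk in chunks:
        with wave.open(io.BytesIO(chunk), "rb") as reader:
            fmt = reader.getparams()[:3]  # (channels, sample width, frame rate)
            if writer is None:
                params = fmt
                writer = wave.open(out, "wb")
                writer.setnchannels(fmt[0])
                writer.setsampwidth(fmt[1])
                writer.setframerate(fmt[2])
            elif fmt != params:
                raise ValueError("WAV chunks have mismatched formats")
            writer.writeframes(reader.readframes(reader.getnframes()))
    writer.close()  # patches the header; does not close the BytesIO
    return out.getvalue()


def combine_wav(chunks: List[bytes]) -> bytes:
    """Use pydub when it is installed, otherwise the stdlib fallback."""
    if AudioSegment is None:
        return concat_wav_simple(chunks)
    combined = AudioSegment.empty()
    for chunk in chunks:
        combined += AudioSegment.from_file(io.BytesIO(chunk), format="wav")
    buf = io.BytesIO()
    combined.export(buf, format="wav")
    return buf.getvalue()
```

Rewriting the header through `wave` rather than joining raw bytes matters because each WAV chunk carries its own header and length fields; naive byte concatenation would leave stale headers embedded in the audio data.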

ttsfm-web/run.py ADDED

	@@ -0,0 +1,15 @@

+#!/usr/bin/env python
+"""
+Run script for TTSFM web application with proper eventlet initialization
+"""
+# MUST be the first imports for eventlet to work properly
+import eventlet
+eventlet.monkey_patch()
+# Now import the app
+from app import app, socketio, HOST, PORT, DEBUG
+if __name__ == '__main__':
+    print(f"Starting TTSFM with WebSocket support on {HOST}:{PORT}")
+    socketio.run(app, host=HOST, port=PORT, debug=DEBUG, allow_unsafe_werkzeug=True)

ttsfm-web/static/css/style.css CHANGED

@@ -1,1390 +1,1399 @@
-/* TTSFM Web Application Custom Styles */
-:root {
-    /* Clean Color Palette */
-    --primary-color: #2563eb;
-    --primary-dark: #1d4ed8;
-    --primary-light: #3b82f6;
-    --secondary-color: #64748b;
-    --secondary-dark: #475569;
-    --accent-color: #10b981;
-    --accent-dark: #059669;
-    /* Status Colors */
-    --success-color: #10b981;
-    --warning-color: #f59e0b;
-    --danger-color: #ef4444;
-    --info-color: #3b82f6;
-    /* Clean Neutral Colors */
-    --light-color: #ffffff;
-    --light-gray: #f8fafc;
-    --medium-gray: #64748b;
-    --dark-color: #1e293b;
-    --text-color: #374151;
-    --text-muted: #6b7280;
-    /* Design System */
-    --border-radius: 0.75rem;
-    --border-radius-sm: 0.5rem;
-    --border-radius-lg: 1rem;
-    --box-shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.1), 0 2px 4px -1px rgba(0, 0, 0, 0.06);
-    --box-shadow-lg: 0 20px 25px -5px rgba(0, 0, 0, 0.1), 0 10px 10px -5px rgba(0, 0, 0, 0.04);
-    --box-shadow-xl: 0 25px 50px -12px rgba(0, 0, 0, 0.25);
-    --transition: all 0.3s cubic-bezier(0.4, 0, 0.2, 1);
-    --transition-fast: all 0.15s cubic-bezier(0.4, 0, 0.2, 1);
-    /* Gradients */
-    --gradient-primary: linear-gradient(135deg, var(--primary-color) 0%, var(--primary-light) 100%);
-    --gradient-secondary: linear-gradient(135deg, var(--secondary-color) 0%, var(--secondary-dark) 100%);
-    --gradient-accent: linear-gradient(135deg, var(--accent-color) 0%, var(--accent-dark) 100%);
-    --gradient-hero: linear-gradient(135deg, var(--primary-color) 0%, var(--secondary-color) 50%, var(--accent-color) 100%);
-}
-/* Global Styles */
-body {
-    font-family: 'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
-    line-height: 1.6;
-    color: var(--text-color);
-    background-color: #ffffff;
-    font-weight: 400;
-    -webkit-font-smoothing: antialiased;
-    -moz-osx-font-smoothing: grayscale;
-}
-/* Enhanced Typography */
-h1, h2, h3, h4, h5, h6 {
-    font-weight: 700;
-    line-height: 1.3;
-    color: var(--dark-color);
-    letter-spacing: -0.025em;
-}
-.display-1, .display-2, .display-3, .display-4 {
-    font-weight: 800;
-    letter-spacing: -0.05em;
-}
-.lead {
-    font-size: 1.125rem;
-    font-weight: 400;
-    color: var(--text-muted);
-    line-height: 1.8;
-}
-/* Simplified Button Styles */
-.btn {
-    font-weight: 600;
-    border-radius: var(--border-radius-sm);
-    transition: all 0.2s ease;
-    letter-spacing: 0.025em;
-}
-.btn-primary {
-    background-color: var(--primary-color);
-    border-color: var(--primary-color);
-    color: white;
-}
-.btn-primary:hover {
-    background-color: var(--primary-dark);
-    border-color: var(--primary-dark);
-    color: white;
-}
-.btn-outline-primary {
-    border: 2px solid var(--primary-color);
-    color: var(--primary-color);
-    background: transparent;
-}
-.btn-outline-primary:hover {
-    background: var(--primary-color);
-    border-color: var(--primary-color);
-    color: white;
-}
-.btn-lg {
-    padding: 0.875rem 2rem;
-    font-size: 1.125rem;
-    border-radius: var(--border-radius);
-}
-.btn-sm {
-    padding: 0.5rem 1rem;
-    font-size: 0.875rem;
-    border-radius: var(--border-radius-sm);
-}
-/* Clean Card Styles */
-.card {
-    border: 1px solid #e5e7eb;
-    box-shadow: 0 1px 2px rgba(0, 0, 0, 0.05);
-    transition: all 0.2s ease;
-    border-radius: 12px;
-    background: white;
-}
-.card:hover {
-    box-shadow: 0 4px 6px rgba(0, 0, 0, 0.07);
-    border-color: #d1d5db;
-}
-.card-body {
-    padding: 2rem;
-}
-/* Clean Hero Section */
-.hero-section {
-    background: linear-gradient(135deg, #f8fafc 0%, #ffffff 100%);
-    color: var(--text-color);
-    padding: 6rem 0;
-    min-height: 80vh;
-    display: flex;
-    align-items: center;
-    border-bottom: 1px solid #e5e7eb;
-}
-.min-vh-75 {
-    min-height: 75vh;
-}
-/* Status Indicators */
-.status-indicator {
-    display: inline-block;
-    width: 8px;
-    height: 8px;
-    border-radius: 50%;
-    background-color: #6c757d;
-}
-.status-online {
-    background-color: #28a745;
-}
-.status-offline {
-    background-color: #dc3545;
-}
-/* Footer */
-.footer {
-    margin-top: auto;
-}
-/* Clean Code Blocks */
-pre {
-    background-color: #f8fafc !important;
-    border: 1px solid #e5e7eb;
-    border-radius: 8px;
-    font-size: 0.875rem;
-}
-code {
-    color: #374151;
-    font-family: 'SF Mono', Monaco, 'Cascadia Code', 'Roboto Mono', Consolas, 'Courier New', monospace;
-}
-/* Enhanced Form Styles */
-.form-control, .form-select {
-    border-radius: var(--border-radius-sm);
-    border: 2px solid #e2e8f0;
-    transition: var(--transition);
-    padding: 0.875rem 1rem;
-    font-size: 1rem;
-    background-color: #ffffff;
-    color: var(--text-color);
-}
-.form-control:focus, .form-select:focus {
-    border-color: var(--primary-color);
-    box-shadow: 0 0 0 3px rgba(99, 102, 241, 0.1);
-    outline: none;
-    background-color: #ffffff;
-}
-.form-control:hover, .form-select:hover {
-    border-color: #cbd5e1;
-}
-.form-label {
-    font-weight: 600;
-    color: var(--dark-color);
-    margin-bottom: 0.75rem;
-    font-size: 0.95rem;
-}
-.form-text {
-    color: var(--text-muted);
-    font-size: 0.875rem;
-    margin-top: 0.5rem;
-}
-.form-check-input {
-    border-radius: var(--border-radius-sm);
-    border: 2px solid #e2e8f0;
-    width: 1.25rem;
-    height: 1.25rem;
-}
-.form-check-input:checked {
-    background-color: var(--primary-color);
-    border-color: var(--primary-color);
-}
-.form-check-input:focus {
-    box-shadow: 0 0 0 3px rgba(99, 102, 241, 0.1);
-}
-.form-check-label {
-    color: var(--text-color);
-    font-weight: 500;
-    margin-left: 0.5rem;
-}
-/* Enhanced Status Indicators */
-.status-indicator {
-    display: inline-block;
-    width: 12px;
-    height: 12px;
-    border-radius: 50%;
-    margin-right: 8px;
-    position: relative;
-    animation: statusPulse 2s infinite;
-}
-.status-indicator::before {
-    content: '';
-    position: absolute;
-    top: -2px;
-    left: -2px;
-    right: -2px;
-    bottom: -2px;
-    border-radius: 50%;
-    opacity: 0.3;
-    animation: statusRing 2s infinite;
-}
-.status-online {
-    background-color: var(--success-color);
-    box-shadow: 0 0 8px rgba(16, 185, 129, 0.4);
-}
-.status-online::before {
-    background-color: var(--success-color);
-}
-.status-offline {
-    background-color: var(--danger-color);
-    box-shadow: 0 0 8px rgba(239, 68, 68, 0.4);
-}
-.status-offline::before {
-    background-color: var(--danger-color);
-}
-@keyframes statusPulse {
-    0%, 100% { opacity: 1; }
-    50% { opacity: 0.7; }
-}
-@keyframes statusRing {
-    0% { transform: scale(0.8); opacity: 0.8; }
-    100% { transform: scale(1.4); opacity: 0; }
-}
-/* Enhanced Audio Player */
-.audio-player {
-    width: 100%;
-    margin-top: 1rem;
-    border-radius: var(--border-radius);
-    box-shadow: var(--box-shadow);
-    background: var(--light-color);
-    padding: 0.5rem;
-}
-.audio-player::-webkit-media-controls-panel {
-    background-color: var(--light-color);
-    border-radius: var(--border-radius-sm);
-}
-/* Enhanced Sections */
-.features-section {
-    padding: 6rem 0;
-    background: linear-gradient(180deg, #ffffff 0%, var(--light-color) 100%);
-}
-.stats-section {
-    padding: 4rem 0;
-    background: var(--gradient-primary);
-    color: white;
-    position: relative;
-    overflow: hidden;
-}
-.stats-section::before {
-    content: '';
-    position: absolute;
-    top: 0;
-    left: 0;
-    right: 0;
-    bottom: 0;
-    background: url('data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100"><defs><pattern id="stats-pattern" width="40" height="40" patternUnits="userSpaceOnUse"><circle cx="20" cy="20" r="1" fill="white" opacity="0.1"/></pattern></defs><rect width="100" height="100" fill="url(%23stats-pattern)"/></svg>');
-}
-.stat-card {
-    text-align: center;
-    padding: 2rem 1rem;
-    background: rgba(255, 255, 255, 0.1);
-    border-radius: var(--border-radius);
-    backdrop-filter: blur(10px);
-    border: 1px solid rgba(255, 255, 255, 0.2);
-    transition: var(--transition);
-}
-.stat-card:hover {
-    transform: translateY(-5px);
-    background: rgba(255, 255, 255, 0.15);
-}
-.stat-icon {
-    font-size: 2.5rem;
-    margin-bottom: 1rem;
-    color: rgba(255, 255, 255, 0.9);
-}
-.stat-number {
-    font-size: 3rem;
-    font-weight: 800;
-    color: white;
-    margin-bottom: 0.5rem;
-    display: block;
-}
-.stat-label {
-    color: rgba(255, 255, 255, 0.9);
-    font-weight: 500;
-    font-size: 0.95rem;
-}
-.quick-start-section {
-    padding: 6rem 0;
-}
-.use-cases-section {
-    padding: 6rem 0;
-    background: var(--light-color);
-}
-.tech-specs-section {
-    padding: 6rem 0;
-}
-.faq-section {
-    padding: 6rem 0;
-    background: var(--light-color);
-}
-.final-cta-section {
-    padding: 6rem 0;
-    background: var(--gradient-hero);
-    color: white;
-    position: relative;
-    overflow: hidden;
-}
-.cta-background-animation {
-    position: absolute;
-    top: 0;
-    left: 0;
-    right: 0;
-    bottom: 0;
-    background: linear-gradient(45deg, transparent 30%, rgba(255,255,255,0.05) 50%, transparent 70%);
-    animation: shimmer 4s ease-in-out infinite;
-}
-.section-badge {
-    display: inline-block;
-    background: var(--gradient-primary);
-    color: white;
-    padding: 0.5rem 1.5rem;
-    border-radius: 2rem;
-    font-size: 0.875rem;
-    font-weight: 600;
-    margin-bottom: 1.5rem;
-    box-shadow: 0 4px 14px 0 rgba(99, 102, 241, 0.3);
-}
-/* Enhanced Loading States */
-.loading-spinner {
-    display: none;
-}
-.loading .loading-spinner {
-    display: inline-block;
-}
-.loading .btn-text {
-    display: none;
-}
-.loading {
-    position: relative;
-    overflow: hidden;
-}
-.loading::after {
-    content: '';
-    position: absolute;
-    top: 0;
-    left: -100%;
-    width: 100%;
-    height: 100%;
-    background: linear-gradient(90deg, transparent, rgba(255,255,255,0.3), transparent);
-    animation: loading-shimmer 1.5s infinite;
-}
-@keyframes loading-shimmer {
-    0% { left: -100%; }
-    100% { left: 100%; }
-}
-/* Enhanced Code Blocks */
-.code-card {
-    background: white;
-    border-radius: var(--border-radius);
-    box-shadow: var(--box-shadow);
-    overflow: hidden;
-    border: 1px solid #e2e8f0;
-    transition: var(--transition);
-}
-.code-card:hover {
-    transform: translateY(-2px);
-    box-shadow: var(--box-shadow-lg);
-}
-.code-header {
-    background: var(--light-gray);
-    padding: 1rem 1.5rem;
-    border-bottom: 1px solid #e2e8f0;
-    display: flex;
-    justify-content: between;
-    align-items: center;
-}
-.code-header h4 {
-    margin: 0;
-    font-size: 1.1rem;
-    color: var(--dark-color);
-}
-.code-content {
-    padding: 1.5rem;
-    background: #f8fafc;
-    margin: 0;
-    overflow-x: auto;
-}
-.code-content code {
-    font-family: 'Monaco', 'Menlo', 'Ubuntu Mono', monospace;
-    font-size: 0.9rem;
-    line-height: 1.6;
-    color: var(--text-color);
-}
-.code-footer {
-    padding: 1rem 1.5rem;
-    background: white;
-    border-top: 1px solid #e2e8f0;
-}
-.copy-btn {
-    font-size: 0.8rem;
-    padding: 0.25rem 0.75rem;
-}
-/* Enhanced Use Case Cards */
-.use-case-card {
-    background: white;
-    border-radius: var(--border-radius);
-    padding: 2rem;
-    box-shadow: var(--box-shadow);
-    transition: var(--transition);
-    border: 1px solid #e2e8f0;
-    height: 100%;
-    text-align: center;
-}
-.use-case-card:hover {
-    transform: translateY(-4px);
-    box-shadow: var(--box-shadow-lg);
-    border-color: rgba(99, 102, 241, 0.2);
-}
-.use-case-icon {
-    width: 4rem;
-    height: 4rem;
-    background: var(--gradient-primary);
-    border-radius: 50%;
-    display: flex;
-    align-items: center;
-    justify-content: center;
-    font-size: 1.5rem;
-    color: white;
-    margin: 0 auto 1.5rem;
-    box-shadow: 0 4px 14px 0 rgba(99, 102, 241, 0.3);
-}
-.use-case-title {
-    font-size: 1.25rem;
-    font-weight: 700;
-    color: var(--dark-color);
-    margin-bottom: 1rem;
-}
-.use-case-description {
-    color: var(--text-muted);
-    margin-bottom: 1.5rem;
-    line-height: 1.7;
-}
-.use-case-examples {
-    display: flex;
-    flex-wrap: wrap;
-    gap: 0.5rem;
-    justify-content: center;
-}
-.use-case-examples .badge {
-    font-size: 0.75rem;
-    padding: 0.4rem 0.8rem;
-    border-radius: 1rem;
-    background: var(--light-gray);
-    color: var(--text-color);
-    border: 1px solid #e2e8f0;
-}
-/* Enhanced Tech Spec Cards */
-.tech-spec-card {
-    background: white;
-    border-radius: var(--border-radius);
-    padding: 2rem;
-    box-shadow: var(--box-shadow);
-    transition: var(--transition);
-    border: 1px solid #e2e8f0;
-    height: 100%;
-}
-.tech-spec-card:hover {
-    transform: translateY(-2px);
-    box-shadow: var(--box-shadow-lg);
-}
-.tech-spec-icon {
-    width: 3rem;
-    height: 3rem;
-    background: var(--gradient-accent);
-    border-radius: var(--border-radius-sm);
-    display: flex;
-    align-items: center;
-    justify-content: center;
-    font-size: 1.25rem;
-    color: white;
-    margin: 0 auto 1rem;
-}
-.tech-spec-card h4, .tech-spec-card h5 {
-    color: var(--dark-color);
-    margin-bottom: 1.5rem;
-}
-.tech-spec-card ul {
-    list-style: none;
-    padding: 0;
-}
-.tech-spec-card li {
-    padding: 0.5rem 0;
-    color: var(--text-color);
-    border-bottom: 1px solid #f1f5f9;
-}
-.tech-spec-card li:last-child {
-    border-bottom: none;
-}
-/* Enhanced Validation Styles */
-.badge {
-    font-size: 0.75em;
-    padding: 0.4em 0.8em;
-    border-radius: 1rem;
-    font-weight: 600;
-    letter-spacing: 0.025em;
-}
-.validation-result {
-    animation: slideDown 0.3s ease;
-}
-@keyframes slideDown {
-    from {
-        opacity: 0;
-        transform: translateY(-10px);
-    }
-    to {
-        opacity: 1;
-        transform: translateY(0);
-    }
-}
-/* Enhanced Alert Styles */
-.alert {
-    border-radius: var(--border-radius);
-    border: none;
-    box-shadow: var(--box-shadow);
-    padding: 1rem 1.5rem;
-}
-.alert-success {
-    background: linear-gradient(135deg, rgba(16, 185, 129, 0.1) 0%, rgba(16, 185, 129, 0.05) 100%);
-    color: #065f46;
-    border-left: 4px solid var(--success-color);
-}
-.alert-warning {
-    background: linear-gradient(135deg, rgba(245, 158, 11, 0.1) 0%, rgba(245, 158, 11, 0.05) 100%);
-    color: #92400e;
-    border-left: 4px solid var(--warning-color);
-}
-.alert-danger {
-    background: linear-gradient(135deg, rgba(239, 68, 68, 0.1) 0%, rgba(239, 68, 68, 0.05) 100%);
-    color: #991b1b;
-    border-left: 4px solid var(--danger-color);
-}
-.alert-info {
-    background: linear-gradient(135deg, rgba(59, 130, 246, 0.1) 0%, rgba(59, 130, 246, 0.05) 100%);
-    color: #1e40af;
-    border-left: 4px solid var(--info-color);
-}
-/* Enhanced Accordion */
-.accordion-item {
-    border: none;
-    margin-bottom: 1rem;
-    border-radius: var(--border-radius) !important;
-    box-shadow: var(--box-shadow);
-    overflow: hidden;
-}
-.accordion-button {
-    background: white;
-    border: none;
-    padding: 1.5rem;
-    font-weight: 600;
-    color: var(--dark-color);
-    border-radius: var(--border-radius) !important;
-}
-.accordion-button:not(.collapsed) {
-    background: var(--light-gray);
-    color: var(--primary-color);
-    box-shadow: none;
-}
-.accordion-button:focus {
-    box-shadow: 0 0 0 3px rgba(99, 102, 241, 0.1);
-    border-color: transparent;
-}
-.accordion-body {
-    padding: 1.5rem;
-    background: white;
-    color: var(--text-color);
-    line-height: 1.7;
-}
-/* Enhanced CTA Buttons */
-.cta-btn-primary, .cta-btn-secondary {
-    position: relative;
-    overflow: hidden;
-    backdrop-filter: blur(10px);
-    border-radius: var(--border-radius);
-}
-.cta-btn-primary small, .cta-btn-secondary small {
-    font-size: 0.75rem;
-    opacity: 0.9;
-    font-weight: 400;
-}
-.cta-content {
-    position: relative;
-    z-index: 2;
-}
-.cta-buttons {
-    margin: 2rem 0;
-}
-.cta-stats {
-    margin-top: 3rem;
-}
-.cta-stat h4 {
-    font-size: 2rem;
-    font-weight: 800;
-    margin-bottom: 0.25rem;
-}
-.cta-stat small {
-    font-size: 0.9rem;
-    opacity: 0.9;
-}
-/* Enhanced Quick Start */
-.quick-start-cta {
-    background: white;
-    border-radius: var(--border-radius-lg);
-    padding: 3rem;
-    box-shadow: var(--box-shadow-lg);
-    text-align: center;
-    border: 1px solid #e2e8f0;
-}
-.quick-start-cta h4 {
-    color: var(--dark-color);
-    margin-bottom: 1.5rem;
-}
-/* Enhanced Batch Processing */
-.batch-chunk-card {
-    transition: var(--transition);
-    border: 1px solid #e2e8f0;
-    border-radius: var(--border-radius);
-    overflow: hidden;
-}
-.batch-chunk-card:hover {
-    transform: translateY(-2px);
-    box-shadow: var(--box-shadow-lg);
-    border-color: rgba(99, 102, 241, 0.2);
-}
-.batch-chunk-card .card-body {
-    padding: 1.5rem;
-}
-.batch-chunk-card .card-title {
-    font-size: 1rem;
-    font-weight: 600;
-    color: var(--dark-color);
-}
-.batch-chunk-card .card-text {
-    color: var(--text-muted);
-    line-height: 1.6;
-}
-.download-chunk {
-    transition: var(--transition-fast);
-}
-.download-chunk:hover {
-    transform: scale(1.1);
-}
-/* Enhanced Navigation */
-.navbar {
-    backdrop-filter: blur(10px);
-    background: rgba(255, 255, 255, 0.95) !important;
-    border-bottom: 1px solid rgba(226, 232, 240, 0.8);
-    box-shadow: 0 1px 3px 0 rgba(0, 0, 0, 0.1);
-}
-.navbar-brand {
-    font-weight: 800;
-    font-size: 1.5rem;
-    color: var(--primary-color) !important;
-    transition: var(--transition);
-}
-.navbar-brand:hover {
-    transform: scale(1.05);
-}
-.navbar-nav .nav-link {
-    font-weight: 500;
-    transition: var(--transition);
-    color: var(--text-color) !important;
-    position: relative;
-    padding: 0.75rem 1rem !important;
-}
-.navbar-nav .nav-link::after {
-    content: '';
-    position: absolute;
-    bottom: 0;
-    left: 50%;
-    width: 0;
-    height: 2px;
-    background: var(--gradient-primary);
-    transition: var(--transition);
-    transform: translateX(-50%);
-}
-.navbar-nav .nav-link:hover::after {
-    width: 80%;
-}
-.navbar-nav .nav-link:hover {
-    color: var(--primary-color) !important;
-}
-.navbar-text {
-    color: var(--text-muted) !important;
-    font-weight: 500;
-}
-/* Enhanced Footer */
-.footer {
-    background: linear-gradient(135deg, var(--dark-color) 0%, #2d3748 100%);
-    color: white;
-    padding: 3rem 0 2rem;
-    margin-top: 6rem;
-    position: relative;
-    overflow: hidden;
-}
-.footer::before {
-    content: '';
-    position: absolute;
-    top: 0;
-    left: 0;
-    right: 0;
-    bottom: 0;
-    background: url('data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100"><defs><pattern id="footer-pattern" width="20" height="20" patternUnits="userSpaceOnUse"><circle cx="10" cy="10" r="0.5" fill="white" opacity="0.1"/></pattern></defs><rect width="100" height="100" fill="url(%23footer-pattern)"/></svg>');
-}
-.footer h5 {
-    color: white;
-    font-weight: 700;
-    margin-bottom: 1rem;
-}
-.footer p, .footer a {
-    color: rgba(255, 255, 255, 0.8);
-    transition: var(--transition);
-}
-.footer a:hover {
-    color: white;
-    text-decoration: none;
-}
-/* Enhanced Responsive Design */
-@media (max-width: 1200px) {
-    .hero-section {
-        padding: 4rem 0;
-    }
-    .floating-icon-container {
-        width: 250px;
-        height: 250px;
-    }
-    .floating-icon {
-        width: 50px;
-        height: 50px;
-        font-size: 1.25rem;
-    }
-    .hero-main-icon {
-        width: 100px;
-        height: 100px;
-        font-size: 2.5rem;
-    }
-}
-@media (max-width: 992px) {
-    .hero-section {
-        padding: 3rem 0;
-        min-height: auto;
-    }
-    .display-3 {
-        font-size: 2.5rem;
-    }
-    .features-section, .stats-section, .quick-start-section,
-    .use-cases-section, .tech-specs-section, .faq-section,
-    .final-cta-section {
-        padding: 4rem 0;
-    }
-    .floating-icon-container {
-        display: none;
-    }
-    .hero-visual {
-        margin-top: 2rem;
-    }
-}
-@media (max-width: 768px) {
-    .hero-section {
-        padding: 2rem 0;
-        text-align: center;
-    }
-    .display-3 {
-        font-size: 2rem;
-    }
-    .lead {
-        font-size: 1rem;
-    }
-    .btn-lg {
-        padding: 0.75rem 1.5rem;
-        font-size: 1rem;
-        width: 100%;
-        margin-bottom: 1rem;
-    }
-    .hero-stats .col-4 {
-        margin-bottom: 1rem;
-    }
-    .stat-item h3 {
-        font-size: 2rem;
-    }
-    .features-section, .stats-section, .quick-start-section,
-    .use-cases-section, .tech-specs-section, .faq-section,
-    .final-cta-section {
-        padding: 3rem 0;
-    }
-    .feature-card-enhanced, .use-case-card, .tech-spec-card {
-        margin-bottom: 2rem;
-    }
-    .code-card {
-        margin-bottom: 1.5rem;
-    }
-    .code-header {
-        flex-direction: column;
-        gap: 1rem;
-        text-align: center;
-    }
-    .quick-start-cta {
-        padding: 2rem 1rem;
-    }
-    .cta-buttons .btn {
-        width: 100%;
-        margin-bottom: 1rem;
-    }
-    .navbar-nav {
-        text-align: center;
-        padding: 1rem 0;
-    }
-    .toc {
-        position: static;
-        margin-bottom: 2rem;
-        max-height: none;
-    }
-}
-@media (max-width: 576px) {
-    .container {
-        padding-left: 1rem;
-        padding-right: 1rem;
-    }
-    .hero-section {
-        padding: 1.5rem 0;
-    }
-    .display-3 {
-        font-size: 1.75rem;
-    }
-    .card-body {
-        padding: 1.5rem;
-    }
-    .feature-card-enhanced, .use-case-card, .tech-spec-card {
-        padding: 1.5rem;
-    }
-    .stat-number {
-        font-size: 2.5rem;
-    }
-    .hero-main-icon {
-        width: 80px;
-        height: 80px;
-        font-size: 2rem;
-    }
-    .pulse-ring {
-        width: 100px;
-        height: 100px;
-    }
-}
-/* Enhanced Accessibility */
-.btn:focus,
-.form-control:focus,
-.form-select:focus,
-.form-check-input:focus {
-    outline: 3px solid rgba(99, 102, 241, 0.3);
-    outline-offset: 2px;
-}
-.btn:focus-visible,
-.form-control:focus-visible,
-.form-select:focus-visible {
-    outline: 3px solid var(--primary-color);
-    outline-offset: 2px;
-}
-/* Skip to content link for screen readers */
-.skip-link {
-    position: absolute;
-    top: -40px;
-    left: 6px;
-    background: var(--primary-color);
-    color: white;
-    padding: 8px;
-    text-decoration: none;
-    border-radius: 4px;
-    z-index: 1000;
-}
-.skip-link:focus {
-    top: 6px;
-}
-/* Enhanced Animation Classes */
-.fade-in {
-    animation: fadeIn 0.6s cubic-bezier(0.4, 0, 0.2, 1);
-}
-@keyframes fadeIn {
-    from {
-        opacity: 0;
-        transform: translateY(10px);
-    }
-    to {
-        opacity: 1;
-        transform: translateY(0);
-    }
-}
-.slide-up {
-    animation: slideUp 0.6s cubic-bezier(0.4, 0, 0.2, 1);
-}
-@keyframes slideUp {
-    from {
-        opacity: 0;
-        transform: translateY(30px);
-    }
-    to {
-        opacity: 1;
-        transform: translateY(0);
-    }
-}
-.scale-in {
-    animation: scaleIn 0.5s cubic-bezier(0.4, 0, 0.2, 1);
-}
-@keyframes scaleIn {
-    from {
-        opacity: 0;
-        transform: scale(0.9);
-    }
-    to {
-        opacity: 1;
-        transform: scale(1);
-    }
-}
-/* Enhanced Utility Classes */
-.text-gradient {
-    background: var(--gradient-primary);
-    -webkit-background-clip: text;
-    -webkit-text-fill-color: transparent;
-    background-clip: text;
-}
-.text-gradient-secondary {
-    background: var(--gradient-secondary);
-    -webkit-background-clip: text;
-    -webkit-text-fill-color: transparent;
-    background-clip: text;
-}
-.shadow-custom {
-    box-shadow: var(--box-shadow);
-}
-.shadow-lg-custom {
-    box-shadow: var(--box-shadow-lg);
-}
-.shadow-xl-custom {
-    box-shadow: var(--box-shadow-xl);
-}
-.border-radius-custom {
-    border-radius: var(--border-radius);
-}
-.bg-gradient-primary {
-    background: var(--gradient-primary);
-}
-.bg-gradient-secondary {
-    background: var(--gradient-secondary);
-}
-.bg-gradient-accent {
-    background: var(--gradient-accent);
-}
-/* Enhanced Progress Indicators */
-.progress-custom {
-    height: 10px;
-    border-radius: var(--border-radius-sm);
-    background-color: #e2e8f0;
-    overflow: hidden;
-    box-shadow: inset 0 1px 3px rgba(0, 0, 0, 0.1);
-}
-.progress-bar-custom {
-    height: 100%;
-    background: var(--gradient-primary);
-    transition: width 0.6s cubic-bezier(0.4, 0, 0.2, 1);
-    position: relative;
-    overflow: hidden;
-}
-.progress-bar-custom::after {
-    content: '';
-    position: absolute;
-    top: 0;
-    left: 0;
-    right: 0;
-    bottom: 0;
-    background: linear-gradient(90deg, transparent, rgba(255,255,255,0.3), transparent);
-    animation: progress-shimmer 2s infinite;
-}
-@keyframes progress-shimmer {
-    0% { transform: translateX(-100%); }
-    100% { transform: translateX(100%); }
-}
-/* Enhanced Tooltip */
-.tooltip-inner {
-    background-color: var(--dark-color);
-    border-radius: var(--border-radius-sm);
-    font-size: 0.875rem;
-    padding: 0.5rem 0.75rem;
-    box-shadow: var(--box-shadow);
-}
-/* Enhanced Custom Scrollbar */
-::-webkit-scrollbar {
-    width: 10px;
-    height: 10px;
-}
-::-webkit-scrollbar-track {
-    background: var(--light-gray);
-    border-radius: var(--border-radius-sm);
-}
-::-webkit-scrollbar-thumb {
-    background: var(--gradient-primary);
-    border-radius: var(--border-radius-sm);
-    border: 2px solid var(--light-gray);
-}
-::-webkit-scrollbar-thumb:hover {
-    background: var(--gradient-secondary);
-}
-::-webkit-scrollbar-corner {
-    background: var(--light-gray);
-}
-/* Print Styles */
-@media print {
-    .navbar, .footer, .hero-scroll-indicator, .floating-icon-container {
-        display: none !important;
-    }
-    .hero-section {
-        background: white !important;
-        color: black !important;
-        padding: 1rem 0 !important;
-    }
-    .card {
-        box-shadow: none !important;
-        border: 1px solid #ddd !important;
-    }
-    .btn {
-        border: 1px solid #ddd !important;
-        background: white !important;
-        color: black !important;
-    }
-}
-/* Playground-Specific Styles */
-.playground-visual {
-    position: relative;
-    display: flex;
-    justify-content: center;
-    align-items: center;
-    height: 200px;
-}
-.playground-icon {
-    width: 100px;
-    height: 100px;
-    background: rgba(255, 255, 255, 0.15);
-    border-radius: 50%;
-    display: flex;
-    align-items: center;
-    justify-content: center;
-    font-size: 2.5rem;
-    color: white;
-    backdrop-filter: blur(20px);
-    border: 2px solid rgba(255, 255, 255, 0.3);
-    position: relative;
-}
-.audio-player-container {
-    border: 2px solid #e2e8f0;
-    transition: var(--transition);
-}
-.audio-player-container:hover {
-    border-color: var(--primary-color);
-    box-shadow: 0 0 0 3px rgba(99, 102, 241, 0.1);
-}
-.stat-item {
-    padding: 1rem;
-    text-align: center;
-}
-.stat-item i {
-    font-size: 1.5rem;
-    margin-bottom: 0.5rem;
-    display: block;
-}
-.stat-value {
-    font-size: 1.25rem;
-    font-weight: 700;
-    color: var(--dark-color);
-    margin-bottom: 0.25rem;
-}
-.stat-label {
-    font-size: 0.875rem;
-    color: var(--text-muted);
-    font-weight: 500;
-}
-.card-header {
-    border-bottom: none;
-    border-radius: var(--border-radius) var(--border-radius) 0 0 !important;
-}
-/* Enhanced Form Controls for Playground */
-.playground .form-control,
-.playground .form-select {
-    border: 2px solid #e2e8f0;
-    border-radius: var(--border-radius-sm);
-    padding: 1rem;
-    font-size: 1rem;
-    transition: var(--transition);
-}
-.playground .form-control:focus,
-.playground .form-select:focus {
-    border-color: var(--primary-color);
-    box-shadow: 0 0 0 4px rgba(99, 102, 241, 0.1);
-    transform: translateY(-1px);
-}
-.playground .btn-group .btn {
-    border-radius: var(--border-radius-sm);
-}
-.playground .btn-group .btn:first-child {
-    border-top-right-radius: 0;
-    border-bottom-right-radius: 0;
-}
-.playground .btn-group .btn:last-child {
-    border-top-left-radius: 0;
-    border-bottom-left-radius: 0;
-}
-/* Audio Player Enhancements */
-audio::-webkit-media-controls-panel {
-    background-color: var(--light-gray);
-    border-radius: var(--border-radius-sm);
-}
-audio::-webkit-media-controls-play-button,
-audio::-webkit-media-controls-pause-button {
-    background-color: var(--primary-color);
-    border-radius: 50%;
-}
-audio::-webkit-media-controls-timeline {
-    background-color: var(--light-gray);
-    border-radius: var(--border-radius-sm);
-}
-audio::-webkit-media-controls-current-time-display,
-audio::-webkit-media-controls-time-remaining-display {
-    color: var(--text-color);
-    font-weight: 500;
-}
-/* Reduced Motion Support */
-@media (prefers-reduced-motion: reduce) {
-    *,
-    *::before,
-    *::after {
-        animation-duration: 0.01ms !important;
-        animation-iteration-count: 1 !important;
-        transition-duration: 0.01ms !important;
-    }
-    .hero-background-animation,
-    .floating-icon,
-    .pulse-ring,
-    .hero-scroll-indicator,
-    .playground-icon {
-        animation: none !important;
-    }
-}

+/* TTSFM Web Application Custom Styles */
+:root {
+    /* Clean Color Palette */
+    --primary-color: #4f46e5;
+    --primary-dark: #3730a3;
+    --primary-light: #6366f1;
+    --secondary-color: #6b7280;
+    --secondary-dark: #4b5563;
+    --accent-color: #059669;
+    --accent-dark: #047857;
+    /* Status Colors */
+    --success-color: #059669;
+    --warning-color: #d97706;
+    --danger-color: #dc2626;
+    --info-color: #2563eb;
+    /* Clean Neutral Colors */
+    --light-color: #ffffff;
+    --light-gray: #f9fafb;
+    --medium-gray: #6b7280;
+    --dark-color: #111827;
+    --text-color: #374151;
+    --text-muted: #6b7280;
+    /* Design System */
+    --border-radius: 0.75rem;
+    --border-radius-sm: 0.5rem;
+    --border-radius-lg: 1rem;
+    --box-shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.1), 0 2px 4px -1px rgba(0, 0, 0, 0.06);
+    --box-shadow-lg: 0 20px 25px -5px rgba(0, 0, 0, 0.1), 0 10px 10px -5px rgba(0, 0, 0, 0.04);
+    --box-shadow-xl: 0 25px 50px -12px rgba(0, 0, 0, 0.25);
+    --transition: all 0.3s cubic-bezier(0.4, 0, 0.2, 1);
+    --transition-fast: all 0.15s cubic-bezier(0.4, 0, 0.2, 1);
+    /* Gradients */
+    --gradient-primary: linear-gradient(135deg, var(--primary-color) 0%, var(--primary-light) 100%);
+    --gradient-secondary: linear-gradient(135deg, var(--secondary-color) 0%, var(--secondary-dark) 100%);
+    --gradient-accent: linear-gradient(135deg, var(--accent-color) 0%, var(--accent-dark) 100%);
+    --gradient-hero: linear-gradient(135deg, var(--primary-color) 0%, var(--secondary-color) 50%, var(--accent-color) 100%);
+}
+/* Global Styles */
+body {
+    font-family: 'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
+    line-height: 1.6;
+    color: var(--text-color);
+    background-color: #ffffff;
+    font-weight: 400;
+    -webkit-font-smoothing: antialiased;
+    -moz-osx-font-smoothing: grayscale;
+}
+/* Enhanced Typography */
+h1, h2, h3, h4, h5, h6 {
+    font-weight: 700;
+    line-height: 1.3;
+    color: var(--dark-color);
+    letter-spacing: -0.025em;
+}
+.display-1, .display-2, .display-3, .display-4 {
+    font-weight: 800;
+    letter-spacing: -0.05em;
+}
+.lead {
+    font-size: 1.125rem;
+    font-weight: 400;
+    color: var(--text-muted);
+    line-height: 1.8;
+}
+/* Simplified Button Styles */
+.btn {
+    font-weight: 600;
+    border-radius: 12px;
+    transition: all 0.3s ease;
+    letter-spacing: 0.025em;
+    border: none;
+    box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
+}
+.btn-primary {
+    background: linear-gradient(135deg, var(--primary-color) 0%, var(--primary-light) 100%);
+    color: white;
+}
+.btn-primary:hover {
+    background: linear-gradient(135deg, var(--primary-dark) 0%, var(--primary-color) 100%);
+    color: white;
+    transform: translateY(-1px);
+    box-shadow: 0 4px 8px rgba(0, 0, 0, 0.15);
+}
+.btn-outline-primary {
+    border: 2px solid var(--primary-color);
+    color: var(--primary-color);
+    background: transparent;
+    box-shadow: none;
+}
+.btn-outline-primary:hover {
+    background: var(--primary-color);
+    border-color: var(--primary-color);
+    color: white;
+    transform: translateY(-1px);
+    box-shadow: 0 4px 8px rgba(0, 0, 0, 0.15);
+}
+.btn-lg {
+    padding: 0.875rem 2rem;
+    font-size: 1.125rem;
+    border-radius: var(--border-radius);
+}
+.btn-sm {
+    padding: 0.5rem 1rem;
+    font-size: 0.875rem;
+    border-radius: var(--border-radius-sm);
+}
+/* Clean Card Styles */
+.card {
+    border: 1px solid #e5e7eb;
+    box-shadow: 0 1px 3px rgba(0, 0, 0, 0.1);
+    transition: all 0.3s ease;
+    border-radius: 16px;
+    background: white;
+}
+.card:hover {
+    box-shadow: 0 10px 25px rgba(0, 0, 0, 0.1);
+    border-color: var(--primary-light);
+    transform: translateY(-2px);
+}
+.card-body {
+    padding: 2rem;
+}
+/* Clean Hero Section */
+.hero-section {
+    background: linear-gradient(135deg, #f9fafb 0%, #ffffff 100%);
+    color: var(--text-color);
+    padding: 5rem 0;
+    min-height: 75vh;
+    display: flex;
+    align-items: center;
+    border-bottom: 1px solid #e5e7eb;
+}
+.min-vh-75 {
+    min-height: 75vh;
+}
+/* Status Indicators */
+.status-indicator {
+    display: inline-block;
+    width: 8px;
+    height: 8px;
+    border-radius: 50%;
+    background-color: #6c757d;
+}
+.status-online {
+    background-color: #28a745;
+}
+.status-offline {
+    background-color: #dc3545;
+}
+/* Footer */
+.footer {
+    margin-top: auto;
+}
+/* Clean Code Blocks */
+pre {
+    background-color: #f8fafc !important;
+    border: 1px solid #e5e7eb;
+    border-radius: 8px;
+    font-size: 0.875rem;
+}
+code {
+    color: #374151;
+    font-family: 'SF Mono', Monaco, 'Cascadia Code', 'Roboto Mono', Consolas, 'Courier New', monospace;
+}
+/* Enhanced Form Styles */
+.form-control, .form-select {
+    border-radius: 12px;
+    border: 2px solid #e5e7eb;
+    transition: var(--transition);
+    padding: 1rem 1.25rem;
+    font-size: 1rem;
+    background-color: #ffffff;
+    color: var(--text-color);
+    box-shadow: 0 1px 3px rgba(0, 0, 0, 0.1);
+}
+.form-control:focus, .form-select:focus {
+    border-color: var(--primary-color);
+    box-shadow: 0 0 0 4px rgba(79, 70, 229, 0.1);
+    outline: none;
+    background-color: #ffffff;
+    transform: translateY(-1px);
+}
+.form-control:hover, .form-select:hover {
+    border-color: var(--primary-light);
+    box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
+}
+.form-label {
+    font-weight: 600;
+    color: var(--dark-color);
+    margin-bottom: 0.75rem;
+    font-size: 0.95rem;
+}
+.form-text {
+    color: var(--text-muted);
+    font-size: 0.875rem;
+    margin-top: 0.5rem;
+}
+.form-check-input {
+    border-radius: var(--border-radius-sm);
+    border: 2px solid #e2e8f0;
+    width: 1.25rem;
+    height: 1.25rem;
+}
+.form-check-input:checked {
+    background-color: var(--primary-color);
+    border-color: var(--primary-color);
+}
+.form-check-input:focus {
+    box-shadow: 0 0 0 3px rgba(99, 102, 241, 0.1);
+}
+.form-check-label {
+    color: var(--text-color);
+    font-weight: 500;
+    margin-left: 0.5rem;
+}
+/* Enhanced Status Indicators */
+.status-indicator {
+    display: inline-block;
+    width: 12px;
+    height: 12px;
+    border-radius: 50%;
+    margin-right: 8px;
+    position: relative;
+    animation: statusPulse 2s infinite;
+}
+.status-indicator::before {
+    content: '';
+    position: absolute;
+    top: -2px;
+    left: -2px;
+    right: -2px;
+    bottom: -2px;
+    border-radius: 50%;
+    opacity: 0.3;
+    animation: statusRing 2s infinite;
+}
+.status-online {
+    background-color: var(--success-color);
+    box-shadow: 0 0 8px rgba(16, 185, 129, 0.4);
+}
+.status-online::before {
+    background-color: var(--success-color);
+}
+.status-offline {
+    background-color: var(--danger-color);
+    box-shadow: 0 0 8px rgba(239, 68, 68, 0.4);
+}
+.status-offline::before {
+    background-color: var(--danger-color);
+}
+@keyframes statusPulse {
+    0%, 100% { opacity: 1; }
+    50% { opacity: 0.7; }
+}
+@keyframes statusRing {
+    0% { transform: scale(0.8); opacity: 0.8; }
+    100% { transform: scale(1.4); opacity: 0; }
+}
+/* Enhanced Audio Player */
+.audio-player {
+    width: 100%;
+    margin-top: 1rem;
+    border-radius: var(--border-radius);
+    box-shadow: var(--box-shadow);
+    background: var(--light-color);
+    padding: 0.5rem;
+}
+.audio-player::-webkit-media-controls-panel {
+    background-color: var(--light-color);
+    border-radius: var(--border-radius-sm);
+}
+/* Enhanced Sections */
+.features-section {
+    padding: 6rem 0;
+    background: linear-gradient(180deg, #ffffff 0%, var(--light-color) 100%);
+}
+.stats-section {
+    padding: 4rem 0;
+    background: var(--gradient-primary);
+    color: white;
+    position: relative;
+    overflow: hidden;
+}
+.stats-section::before {
+    content: '';
+    position: absolute;
+    top: 0;
+    left: 0;
+    right: 0;
+    bottom: 0;
+    background: url('data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100"><defs><pattern id="stats-pattern" width="40" height="40" patternUnits="userSpaceOnUse"><circle cx="20" cy="20" r="1" fill="white" opacity="0.1"/></pattern></defs><rect width="100" height="100" fill="url(%23stats-pattern)"/></svg>');
+}
+.stat-card {
+    text-align: center;
+    padding: 2rem 1rem;
+    background: rgba(255, 255, 255, 0.1);
+    border-radius: var(--border-radius);
+    backdrop-filter: blur(10px);
+    border: 1px solid rgba(255, 255, 255, 0.2);
+    transition: var(--transition);
+}
+.stat-card:hover {
+    transform: translateY(-5px);
+    background: rgba(255, 255, 255, 0.15);
+}
+.stat-icon {
+    font-size: 2.5rem;
+    margin-bottom: 1rem;
+    color: rgba(255, 255, 255, 0.9);
+}
+.stat-number {
+    font-size: 3rem;
+    font-weight: 800;
+    color: white;
+    margin-bottom: 0.5rem;
+    display: block;
+}
+.stat-label {
+    color: rgba(255, 255, 255, 0.9);
+    font-weight: 500;
+    font-size: 0.95rem;
+}
+.quick-start-section {
+    padding: 6rem 0;
+}
+.use-cases-section {
+    padding: 6rem 0;
+    background: var(--light-color);
+}
+.tech-specs-section {
+    padding: 6rem 0;
+}
+.faq-section {
+    padding: 6rem 0;
+    background: var(--light-color);
+}
+.final-cta-section {
+    padding: 6rem 0;
+    background: var(--gradient-hero);
+    color: white;
+    position: relative;
+    overflow: hidden;
+}
+.cta-background-animation {
+    position: absolute;
+    top: 0;
+    left: 0;
+    right: 0;
+    bottom: 0;
+    background: linear-gradient(45deg, transparent 30%, rgba(255,255,255,0.05) 50%, transparent 70%);
+    animation: shimmer 4s ease-in-out infinite;
+}
+.section-badge {
+    display: inline-block;
+    background: var(--gradient-primary);
+    color: white;
+    padding: 0.5rem 1.5rem;
+    border-radius: 2rem;
+    font-size: 0.875rem;
+    font-weight: 600;
+    margin-bottom: 1.5rem;
+    box-shadow: 0 4px 14px 0 rgba(99, 102, 241, 0.3);
+}
+/* Enhanced Loading States */
+.loading-spinner {
+    display: none;
+}
+.loading .loading-spinner {
+    display: inline-block;
+}
+.loading .btn-text {
+    display: none;
+}
+.loading {
+    position: relative;
+    overflow: hidden;
+}
+.loading::after {
+    content: '';
+    position: absolute;
+    top: 0;
+    left: -100%;
+    width: 100%;
+    height: 100%;
+    background: linear-gradient(90deg, transparent, rgba(255,255,255,0.3), transparent);
+    animation: loading-shimmer 1.5s infinite;
+}
+@keyframes loading-shimmer {
+    0% { left: -100%; }
+    100% { left: 100%; }
+}
+/* Enhanced Code Blocks */
+.code-card {
+    background: white;
+    border-radius: var(--border-radius);
+    box-shadow: var(--box-shadow);
+    overflow: hidden;
+    border: 1px solid #e2e8f0;
+    transition: var(--transition);
+}
+.code-card:hover {
+    transform: translateY(-2px);
+    box-shadow: var(--box-shadow-lg);
+}
+.code-header {
+    background: var(--light-gray);
+    padding: 1rem 1.5rem;
+    border-bottom: 1px solid #e2e8f0;
+    display: flex;
+    justify-content: space-between;
+    align-items: center;
+}
+.code-header h4 {
+    margin: 0;
+    font-size: 1.1rem;
+    color: var(--dark-color);
+}
+.code-content {
+    padding: 1.5rem;
+    background: #f8fafc;
+    margin: 0;
+    overflow-x: auto;
+}
+.code-content code {
+    font-family: 'Monaco', 'Menlo', 'Ubuntu Mono', monospace;
+    font-size: 0.9rem;
+    line-height: 1.6;
+    color: var(--text-color);
+}
+.code-footer {
+    padding: 1rem 1.5rem;
+    background: white;
+    border-top: 1px solid #e2e8f0;
+}
+.copy-btn {
+    font-size: 0.8rem;
+    padding: 0.25rem 0.75rem;
+}
+/* Enhanced Use Case Cards */
+.use-case-card {
+    background: white;
+    border-radius: var(--border-radius);
+    padding: 2rem;
+    box-shadow: var(--box-shadow);
+    transition: var(--transition);
+    border: 1px solid #e2e8f0;
+    height: 100%;
+    text-align: center;
+}
+.use-case-card:hover {
+    transform: translateY(-4px);
+    box-shadow: var(--box-shadow-lg);
+    border-color: rgba(99, 102, 241, 0.2);
+}
+.use-case-icon {
+    width: 4rem;
+    height: 4rem;
+    background: var(--gradient-primary);
+    border-radius: 50%;
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    font-size: 1.5rem;
+    color: white;
+    margin: 0 auto 1.5rem;
+    box-shadow: 0 4px 14px 0 rgba(99, 102, 241, 0.3);
+}
+.use-case-title {
+    font-size: 1.25rem;
+    font-weight: 700;
+    color: var(--dark-color);
+    margin-bottom: 1rem;
+}
+.use-case-description {
+    color: var(--text-muted);
+    margin-bottom: 1.5rem;
+    line-height: 1.7;
+}
+.use-case-examples {
+    display: flex;
+    flex-wrap: wrap;
+    gap: 0.5rem;
+    justify-content: center;
+}
+.use-case-examples .badge {
+    font-size: 0.75rem;
+    padding: 0.4rem 0.8rem;
+    border-radius: 1rem;
+    background: var(--light-gray);
+    color: var(--text-color);
+    border: 1px solid #e2e8f0;
+}
+/* Enhanced Tech Spec Cards */
+.tech-spec-card {
+    background: white;
+    border-radius: var(--border-radius);
+    padding: 2rem;
+    box-shadow: var(--box-shadow);
+    transition: var(--transition);
+    border: 1px solid #e2e8f0;
+    height: 100%;
+}
+.tech-spec-card:hover {
+    transform: translateY(-2px);
+    box-shadow: var(--box-shadow-lg);
+}
+.tech-spec-icon {
+    width: 3rem;
+    height: 3rem;
+    background: var(--gradient-accent);
+    border-radius: var(--border-radius-sm);
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    font-size: 1.25rem;
+    color: white;
+    margin: 0 auto 1rem;
+}
+.tech-spec-card h4, .tech-spec-card h5 {
+    color: var(--dark-color);
+    margin-bottom: 1.5rem;
+}
+.tech-spec-card ul {
+    list-style: none;
+    padding: 0;
+}
+.tech-spec-card li {
+    padding: 0.5rem 0;
+    color: var(--text-color);
+    border-bottom: 1px solid #f1f5f9;
+}
+.tech-spec-card li:last-child {
+    border-bottom: none;
+}
+/* Enhanced Validation Styles */
+.badge {
+    font-size: 0.75em;
+    padding: 0.4em 0.8em;
+    border-radius: 1rem;
+    font-weight: 600;
+    letter-spacing: 0.025em;
+}
+.validation-result {
+    animation: slideDown 0.3s ease;
+}
+@keyframes slideDown {
+    from {
+        opacity: 0;
+        transform: translateY(-10px);
+    }
+    to {
+        opacity: 1;
+        transform: translateY(0);
+    }
+}
+/* Enhanced Alert Styles */
+.alert {
+    border-radius: var(--border-radius);
+    border: none;
+    box-shadow: var(--box-shadow);
+    padding: 1rem 1.5rem;
+}
+.alert-success {
+    background: linear-gradient(135deg, rgba(16, 185, 129, 0.1) 0%, rgba(16, 185, 129, 0.05) 100%);
+    color: #065f46;
+    border-left: 4px solid var(--success-color);
+}
+.alert-warning {
+    background: linear-gradient(135deg, rgba(245, 158, 11, 0.1) 0%, rgba(245, 158, 11, 0.05) 100%);
+    color: #92400e;
+    border-left: 4px solid var(--warning-color);
+}
+.alert-danger {
+    background: linear-gradient(135deg, rgba(239, 68, 68, 0.1) 0%, rgba(239, 68, 68, 0.05) 100%);
+    color: #991b1b;
+    border-left: 4px solid var(--danger-color);
+}
+.alert-info {
+    background: linear-gradient(135deg, rgba(59, 130, 246, 0.1) 0%, rgba(59, 130, 246, 0.05) 100%);
+    color: #1e40af;
+    border-left: 4px solid var(--info-color);
+}
+/* Enhanced Accordion */
+.accordion-item {
+    border: none;
+    margin-bottom: 1rem;
+    border-radius: var(--border-radius) !important;
+    box-shadow: var(--box-shadow);
+    overflow: hidden;
+}
+.accordion-button {
+    background: white;
+    border: none;
+    padding: 1.5rem;
+    font-weight: 600;
+    color: var(--dark-color);
+    border-radius: var(--border-radius) !important;
+}
+.accordion-button:not(.collapsed) {
+    background: var(--light-gray);
+    color: var(--primary-color);
+    box-shadow: none;
+}
+.accordion-button:focus {
+    box-shadow: 0 0 0 3px rgba(99, 102, 241, 0.1);
+    border-color: transparent;
+}
+.accordion-body {
+    padding: 1.5rem;
+    background: white;
+    color: var(--text-color);
+    line-height: 1.7;
+}
+/* Enhanced CTA Buttons */
+.cta-btn-primary, .cta-btn-secondary {
+    position: relative;
+    overflow: hidden;
+    backdrop-filter: blur(10px);
+    border-radius: var(--border-radius);
+}
+.cta-btn-primary small, .cta-btn-secondary small {
+    font-size: 0.75rem;
+    opacity: 0.9;
+    font-weight: 400;
+}
+.cta-content {
+    position: relative;
+    z-index: 2;
+}
+.cta-buttons {
+    margin: 2rem 0;
+}
+.cta-stats {
+    margin-top: 3rem;
+}
+.cta-stat h4 {
+    font-size: 2rem;
+    font-weight: 800;
+    margin-bottom: 0.25rem;
+}
+.cta-stat small {
+    font-size: 0.9rem;
+    opacity: 0.9;
+}
+/* Enhanced Quick Start */
+.quick-start-cta {
+    background: white;
+    border-radius: var(--border-radius-lg);
+    padding: 3rem;
+    box-shadow: var(--box-shadow-lg);
+    text-align: center;
+    border: 1px solid #e2e8f0;
+}
+.quick-start-cta h4 {
+    color: var(--dark-color);
+    margin-bottom: 1.5rem;
+}
+/* Enhanced Batch Processing */
+.batch-chunk-card {
+    transition: var(--transition);
+    border: 1px solid #e2e8f0;
+    border-radius: var(--border-radius);
+    overflow: hidden;
+}
+.batch-chunk-card:hover {
+    transform: translateY(-2px);
+    box-shadow: var(--box-shadow-lg);
+    border-color: rgba(99, 102, 241, 0.2);
+}
+.batch-chunk-card .card-body {
+    padding: 1.5rem;
+}
+.batch-chunk-card .card-title {
+    font-size: 1rem;
+    font-weight: 600;
+    color: var(--dark-color);
+}
+.batch-chunk-card .card-text {
+    color: var(--text-muted);
+    line-height: 1.6;
+}
+.download-chunk {
+    transition: var(--transition-fast);
+}
+.download-chunk:hover {
+    transform: scale(1.1);
+}
+/* Enhanced Navigation */
+.navbar {
+    backdrop-filter: blur(10px);
+    background: rgba(255, 255, 255, 0.95) !important;
+    border-bottom: 1px solid rgba(226, 232, 240, 0.8);
+    box-shadow: 0 1px 3px 0 rgba(0, 0, 0, 0.1);
+}
+.navbar-brand {
+    font-weight: 800;
+    font-size: 1.5rem;
+    color: var(--primary-color) !important;
+    transition: var(--transition);
+}
+.navbar-brand:hover {
+    transform: scale(1.05);
+}
+.navbar-nav .nav-link {
+    font-weight: 500;
+    transition: var(--transition);
+    color: var(--text-color) !important;
+    position: relative;
+    padding: 0.75rem 1rem !important;
+}
+.navbar-nav .nav-link::after {
+    content: '';
+    position: absolute;
+    bottom: 0;
+    left: 50%;
+    width: 0;
+    height: 2px;
+    background: var(--gradient-primary);
+    transition: var(--transition);
+    transform: translateX(-50%);
+}
+.navbar-nav .nav-link:hover::after {
+    width: 80%;
+}
+.navbar-nav .nav-link:hover {
+    color: var(--primary-color) !important;
+}
+.navbar-text {
+    color: var(--text-muted) !important;
+    font-weight: 500;
+}
+/* Enhanced Footer */
+.footer {
+    background: linear-gradient(135deg, var(--dark-color) 0%, #2d3748 100%);
+    color: white;
+    padding: 3rem 0 2rem;
+    margin-top: 6rem;
+    position: relative;
+    overflow: hidden;
+}
+.footer::before {
+    content: '';
+    position: absolute;
+    top: 0;
+    left: 0;
+    right: 0;
+    bottom: 0;
+    background: url('data:image/svg+xml,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100"><defs><pattern id="footer-pattern" width="20" height="20" patternUnits="userSpaceOnUse"><circle cx="10" cy="10" r="0.5" fill="white" opacity="0.1"/></pattern></defs><rect width="100" height="100" fill="url(%23footer-pattern)"/></svg>');
+}
+.footer h5 {
+    color: white;
+    font-weight: 700;
+    margin-bottom: 1rem;
+}
+.footer p, .footer a {
+    color: rgba(255, 255, 255, 0.8);
+    transition: var(--transition);
+}
+.footer a:hover {
+    color: white;
+    text-decoration: none;
+}
+/* Enhanced Responsive Design */
+@media (max-width: 1200px) {
+    .hero-section {
+        padding: 4rem 0;
+    }
+    .floating-icon-container {
+        width: 250px;
+        height: 250px;
+    }
+    .floating-icon {
+        width: 50px;
+        height: 50px;
+        font-size: 1.25rem;
+    }
+    .hero-main-icon {
+        width: 100px;
+        height: 100px;
+        font-size: 2.5rem;
+    }
+}
+@media (max-width: 992px) {
+    .hero-section {
+        padding: 3rem 0;
+        min-height: auto;
+    }
+    .display-3 {
+        font-size: 2.5rem;
+    }
+    .features-section, .stats-section, .quick-start-section,
+    .use-cases-section, .tech-specs-section, .faq-section,
+    .final-cta-section {
+        padding: 4rem 0;
+    }
+    .floating-icon-container {
+        display: none;
+    }
+    .hero-visual {
+        margin-top: 2rem;
+    }
+}
+@media (max-width: 768px) {
+    .hero-section {
+        padding: 2rem 0;
+        text-align: center;
+    }
+    .display-3 {
+        font-size: 2rem;
+    }
+    .lead {
+        font-size: 1rem;
+    }
+    .btn-lg {
+        padding: 0.75rem 1.5rem;
+        font-size: 1rem;
+        width: 100%;
+        margin-bottom: 1rem;
+    }
+    .hero-stats .col-4 {
+        margin-bottom: 1rem;
+    }
+    .stat-item h3 {
+        font-size: 2rem;
+    }
+    .features-section, .stats-section, .quick-start-section,
+    .use-cases-section, .tech-specs-section, .faq-section,
+    .final-cta-section {
+        padding: 3rem 0;
+    }
+    .feature-card-enhanced, .use-case-card, .tech-spec-card {
+        margin-bottom: 2rem;
+    }
+    .code-card {
+        margin-bottom: 1.5rem;
+    }
+    .code-header {
+        flex-direction: column;
+        gap: 1rem;
+        text-align: center;
+    }
+    .quick-start-cta {
+        padding: 2rem 1rem;
+    }
+    .cta-buttons .btn {
+        width: 100%;
+        margin-bottom: 1rem;
+    }
+    .navbar-nav {
+        text-align: center;
+        padding: 1rem 0;
+    }
+    .toc {
+        position: static;
+        margin-bottom: 2rem;
+        max-height: none;
+    }
+}
+@media (max-width: 576px) {
+    .container {
+        padding-left: 1rem;
+        padding-right: 1rem;
+    }
+    .hero-section {
+        padding: 1.5rem 0;
+    }
+    .display-3 {
+        font-size: 1.75rem;
+    }
+    .card-body {
+        padding: 1.5rem;
+    }
+    .feature-card-enhanced, .use-case-card, .tech-spec-card {
+        padding: 1.5rem;
+    }
+    .stat-number {
+        font-size: 2.5rem;
+    }
+    .hero-main-icon {
+        width: 80px;
+        height: 80px;
+        font-size: 2rem;
+    }
+    .pulse-ring {
+        width: 100px;
+        height: 100px;
+    }
+}
+/* Enhanced Accessibility */
+.btn:focus,
+.form-control:focus,
+.form-select:focus,
+.form-check-input:focus {
+    outline: 3px solid rgba(99, 102, 241, 0.3);
+    outline-offset: 2px;
+}
+.btn:focus-visible,
+.form-control:focus-visible,
+.form-select:focus-visible {
+    outline: 3px solid var(--primary-color);
+    outline-offset: 2px;
+}
+/* Skip to content link for screen readers */
+.skip-link {
+    position: absolute;
+    top: -40px;
+    left: 6px;
+    background: var(--primary-color);
+    color: white;
+    padding: 8px;
+    text-decoration: none;
+    border-radius: 4px;
+    z-index: 1000;
+}
+.skip-link:focus {
+    top: 6px;
+}
+/* Enhanced Animation Classes */
+.fade-in {
+    animation: fadeIn 0.6s cubic-bezier(0.4, 0, 0.2, 1);
+}
+@keyframes fadeIn {
+    from {
+        opacity: 0;
+        transform: translateY(10px);
+    }
+    to {
+        opacity: 1;
+        transform: translateY(0);
+    }
+}
+.slide-up {
+    animation: slideUp 0.6s cubic-bezier(0.4, 0, 0.2, 1);
+}
+@keyframes slideUp {
+    from {
+        opacity: 0;
+        transform: translateY(30px);
+    }
+    to {
+        opacity: 1;
+        transform: translateY(0);
+    }
+}
+.scale-in {
+    animation: scaleIn 0.5s cubic-bezier(0.4, 0, 0.2, 1);
+}
+@keyframes scaleIn {
+    from {
+        opacity: 0;
+        transform: scale(0.9);
+    }
+    to {
+        opacity: 1;
+        transform: scale(1);
+    }
+}
+/* Enhanced Utility Classes */
+.text-gradient {
+    background: var(--gradient-primary);
+    -webkit-background-clip: text;
+    -webkit-text-fill-color: transparent;
+    background-clip: text;
+}
+.text-gradient-secondary {
+    background: var(--gradient-secondary);
+    -webkit-background-clip: text;
+    -webkit-text-fill-color: transparent;
+    background-clip: text;
+}
+.shadow-custom {
+    box-shadow: var(--box-shadow);
+}
+.shadow-lg-custom {
+    box-shadow: var(--box-shadow-lg);
+}
+.shadow-xl-custom {
+    box-shadow: var(--box-shadow-xl);
+}
+.border-radius-custom {
+    border-radius: var(--border-radius);
+}
+.bg-gradient-primary {
+    background: var(--gradient-primary);
+}
+.bg-gradient-secondary {
+    background: var(--gradient-secondary);
+}
+.bg-gradient-accent {
+    background: var(--gradient-accent);
+}
+/* Enhanced Progress Indicators */
+.progress-custom {
+    height: 10px;
+    border-radius: var(--border-radius-sm);
+    background-color: #e2e8f0;
+    overflow: hidden;
+    box-shadow: inset 0 1px 3px rgba(0, 0, 0, 0.1);
+}
+.progress-bar-custom {
+    height: 100%;
+    background: var(--gradient-primary);
+    transition: width 0.6s cubic-bezier(0.4, 0, 0.2, 1);
+    position: relative;
+    overflow: hidden;
+}
+.progress-bar-custom::after {
+    content: '';
+    position: absolute;
+    top: 0;
+    left: 0;
+    right: 0;
+    bottom: 0;
+    background: linear-gradient(90deg, transparent, rgba(255,255,255,0.3), transparent);
+    animation: progress-shimmer 2s infinite;
+}
+@keyframes progress-shimmer {
+    0% { transform: translateX(-100%); }
+    100% { transform: translateX(100%); }
+}
+/* Enhanced Tooltip */
+.tooltip-inner {
+    background-color: var(--dark-color);
+    border-radius: var(--border-radius-sm);
+    font-size: 0.875rem;
+    padding: 0.5rem 0.75rem;
+    box-shadow: var(--box-shadow);
+}
+/* Enhanced Custom Scrollbar */
+::-webkit-scrollbar {
+    width: 10px;
+    height: 10px;
+}
+::-webkit-scrollbar-track {
+    background: var(--light-gray);
+    border-radius: var(--border-radius-sm);
+}
+::-webkit-scrollbar-thumb {
+    background: var(--gradient-primary);
+    border-radius: var(--border-radius-sm);
+    border: 2px solid var(--light-gray);
+}
+::-webkit-scrollbar-thumb:hover {
+    background: var(--gradient-secondary);
+}
+::-webkit-scrollbar-corner {
+    background: var(--light-gray);
+}
+/* Print Styles */
+@media print {
+    .navbar, .footer, .hero-scroll-indicator, .floating-icon-container {
+        display: none !important;
+    }
+    .hero-section {
+        background: white !important;
+        color: black !important;
+        padding: 1rem 0 !important;
+    }
+    .card {
+        box-shadow: none !important;
+        border: 1px solid #ddd !important;
+    }
+    .btn {
+        border: 1px solid #ddd !important;
+        background: white !important;
+        color: black !important;
+    }
+}
+/* Playground-Specific Styles */
+.playground-visual {
+    position: relative;
+    display: flex;
+    justify-content: center;
+    align-items: center;
+    height: 200px;
+}
+.playground-icon {
+    width: 100px;
+    height: 100px;
+    background: rgba(255, 255, 255, 0.15);
+    border-radius: 50%;
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    font-size: 2.5rem;
+    color: white;
+    backdrop-filter: blur(20px);
+    border: 2px solid rgba(255, 255, 255, 0.3);
+    position: relative;
+}
+.audio-player-container {
+    border: 2px solid #e2e8f0;
+    transition: var(--transition);
+}
+.audio-player-container:hover {
+    border-color: var(--primary-color);
+    box-shadow: 0 0 0 3px rgba(99, 102, 241, 0.1);
+}
+.stat-item {
+    padding: 1rem;
+    text-align: center;
+}
+.stat-item i {
+    font-size: 1.5rem;
+    margin-bottom: 0.5rem;
+    display: block;
+}
+.stat-value {
+    font-size: 1.25rem;
+    font-weight: 700;
+    color: var(--dark-color);
+    margin-bottom: 0.25rem;
+}
+.stat-label {
+    font-size: 0.875rem;
+    color: var(--text-muted);
+    font-weight: 500;
+}
+.card-header {
+    border-bottom: none;
+    border-radius: var(--border-radius) var(--border-radius) 0 0 !important;
+}
+/* Enhanced Form Controls for Playground */
+.playground .form-control,
+.playground .form-select {
+    border: 2px solid #e2e8f0;
+    border-radius: var(--border-radius-sm);
+    padding: 1rem;
+    font-size: 1rem;
+    transition: var(--transition);
+}
+.playground .form-control:focus,
+.playground .form-select:focus {
+    border-color: var(--primary-color);
+    box-shadow: 0 0 0 4px rgba(99, 102, 241, 0.1);
+    transform: translateY(-1px);
+}
+.playground .btn-group .btn {
+    border-radius: var(--border-radius-sm);
+}
+.playground .btn-group .btn:first-child {
+    border-top-right-radius: 0;
+    border-bottom-right-radius: 0;
+}
+.playground .btn-group .btn:last-child {
+    border-top-left-radius: 0;
+    border-bottom-left-radius: 0;
+}
+/* Audio Player Enhancements */
+audio::-webkit-media-controls-panel {
+    background-color: var(--light-gray);
+    border-radius: var(--border-radius-sm);
+}
+audio::-webkit-media-controls-play-button,
+audio::-webkit-media-controls-pause-button {
+    background-color: var(--primary-color);
+    border-radius: 50%;
+}
+audio::-webkit-media-controls-timeline {
+    background-color: var(--light-gray);
+    border-radius: var(--border-radius-sm);
+}
+audio::-webkit-media-controls-current-time-display,
+audio::-webkit-media-controls-time-remaining-display {
+    color: var(--text-color);
+    font-weight: 500;
+}
+/* Reduced Motion Support */
+@media (prefers-reduced-motion: reduce) {
+    *,
+    *::before,
+    *::after {
+        animation-duration: 0.01ms !important;
+        animation-iteration-count: 1 !important;
+        transition-duration: 0.01ms !important;
+    }
+    .hero-background-animation,
+    .floating-icon,
+    .pulse-ring,
+    .hero-scroll-indicator,
+    .playground-icon {
+        animation: none !important;
+    }
+}
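
The reduced-motion rule above only covers CSS animations and transitions. As a minimal sketch, assuming script-driven effects should honor the same preference, the check can be mirrored in JavaScript with `window.matchMedia`. The helper names `prefersReducedMotion` and `transitionDuration` are hypothetical and do not appear in this commit.

```javascript
// Returns true when the user has asked for reduced motion.
// `win` defaults to the global object (window in a browser); passing it in
// keeps the function testable outside a browser.
function prefersReducedMotion(win = globalThis) {
    if (!win || typeof win.matchMedia !== 'function') {
        return false; // no media-query support: assume motion is acceptable
    }
    return win.matchMedia('(prefers-reduced-motion: reduce)').matches;
}

// Hypothetical helper: choose a duration in milliseconds for a scripted effect.
function transitionDuration(defaultMs, win = globalThis) {
    return prefersReducedMotion(win) ? 0 : defaultMs;
}
```

A caller could use `transitionDuration(600)` when animating the progress bar width from script, so that the effect collapses to an instant update for users who opt out of motion.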

ttsfm-web/static/js/i18n.js ADDED

	@@ -0,0 +1,221 @@

+// JavaScript Internationalization Support for TTSFM
+// Translation data - this will be populated by the server
+window.i18nData = window.i18nData || {};
+// Current locale
+window.currentLocale = document.documentElement.lang || 'en';
+// Translation function
+function _(key, params = {}) {
+    const keys = key.split('.');
+    let value = window.i18nData;
+    // Navigate through the nested object
+    for (const k of keys) {
+        if (value && typeof value === 'object' && k in value) {
+            value = value[k];
+        } else {
+            // Fallback to key if translation not found
+            return key;
+        }
+    }
+    // If we found a string, apply parameters
+    if (typeof value === 'string') {
+        return formatString(value, params);
+    }
+    // Fallback to key
+    return key;
+}
+// Format string with parameters
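+function formatString(str, params) {
+    return str.replace(/\{(\w+)\}/g, (match, key) => {
+        // Use the prototype method so a params object without hasOwnProperty still works
+        return Object.prototype.hasOwnProperty.call(params, key) ? params[key] : match;
+    });
+}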
+// Load translations from server
+async function loadTranslations() {
+    try {
+        const response = await fetch(`/api/translations/${window.currentLocale}`);
+        if (response.ok) {
+            window.i18nData = await response.json();
+        }
+    } catch (error) {
+        console.warn('Failed to load translations:', error);
+    }
+}
+// Sample texts for different languages
+const sampleTexts = {
+    en: {
+        welcome: "Welcome to TTSFM! This is a free text-to-speech service that converts your text into high-quality audio using advanced AI technology.",
+        story: "Once upon a time, in a digital world far away, there lived a small Python package that could transform any text into beautiful speech. This package was called TTSFM, and it brought joy to developers everywhere.",
+        technical: "TTSFM is a Python client for text-to-speech APIs that provides both synchronous and asynchronous interfaces. It supports multiple voices and audio formats, making it perfect for various applications.",
+        multilingual: "TTSFM supports multiple languages and voices, allowing you to create diverse audio content for global audiences. The service is completely free and requires no API keys.",
+        long: "This is a longer text sample designed to test the auto-combine feature of TTSFM. When text exceeds the maximum length limit, TTSFM automatically splits it into smaller chunks, generates audio for each chunk, and then seamlessly combines them into a single audio file. This process is completely transparent to the user and ensures that you can convert text of any length without worrying about technical limitations. The resulting audio maintains consistent quality and natural flow throughout the entire content."
+    },
+    zh: {
+        welcome: "欢迎使用TTSFM！这是一个免费的文本转语音服务，使用先进的AI技术将您的文本转换为高质量音频。",
+        story: "很久很久以前，在一个遥远的数字世界里，住着一个小小的Python包，它能够将任何文本转换成美妙的语音。这个包叫做TTSFM，它为世界各地的开发者带来了快乐。",
+        technical: "TTSFM是一个用于文本转语音API的Python客户端，提供同步和异步接口。它支持多种声音和音频格式，非常适合各种应用。",
+        multilingual: "TTSFM支持多种语言和声音，让您能够为全球受众创建多样化的音频内容。该服务完全免费，无需API密钥。",
+        long: "这是一个较长的文本示例，用于测试TTSFM的自动合并功能。当文本超过最大长度限制时，TTSFM会自动将其分割成较小的片段，为每个片段生成音频，然后无缝地将它们合并成一个音频文件。这个过程对用户完全透明，确保您可以转换任何长度的文本，而无需担心技术限制。生成的音频在整个内容中保持一致的质量和自然的流畅性。"
+    }
+};
+// Get sample text for current locale
+function getSampleText(type) {
+    const locale = window.currentLocale;
+    const texts = sampleTexts[locale] || sampleTexts.en;
+    return texts[type] || texts.welcome;
+}
+// Error messages
+const errorMessages = {
+    en: {
+        empty_text: "Please enter some text to convert.",
+        generation_failed: "Failed to generate speech. Please try again.",
+        network_error: "Network error. Please check your connection and try again.",
+        invalid_format: "Invalid audio format selected.",
+        invalid_voice: "Invalid voice selected.",
+        text_too_long: "Text is too long. Please reduce the length or enable auto-combine.",
+        server_error: "Server error. Please try again later."
+    },
+    zh: {
+        empty_text: "请输入要转换的文本。",
+        generation_failed: "语音生成失败。请重试。",
+        network_error: "网络错误。请检查您的连接并重试。",
+        invalid_format: "选择的音频格式无效。",
+        invalid_voice: "选择的声音无效。",
+        text_too_long: "文本太长。请减少长度或启用自动合并。",
+        server_error: "服务器错误。请稍后重试。"
+    }
+};
+// Success messages
+const successMessages = {
+    en: {
+        generation_complete: "Speech generated successfully!",
+        text_copied: "Text copied to clipboard!",
+        download_started: "Download started!"
+    },
+    zh: {
+        generation_complete: "语音生成成功！",
+        text_copied: "文本已复制到剪贴板！",
+        download_started: "下载已开始！"
+    }
+};
+// Get error message
+function getErrorMessage(key) {
+    const locale = window.currentLocale;
+    const messages = errorMessages[locale] || errorMessages.en;
+    return messages[key] || key;
+}
+// Get success message
+function getSuccessMessage(key) {
+    const locale = window.currentLocale;
+    const messages = successMessages[locale] || successMessages.en;
+    return messages[key] || key;
+}
+// Format file size
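+function formatFileSize(bytes) {
+    const k = 1024;
+    const sizes = window.currentLocale === 'zh'
+        ? ['字节', 'KB', 'MB', 'GB']
+        : ['Bytes', 'KB', 'MB', 'GB'];
+    // Guard against zero, negative, and non-numeric input
+    if (!Number.isFinite(bytes) || bytes <= 0) return '0 ' + sizes[0];
+    // Clamp the unit index so values below 1 byte or above 1024 GB still map to a valid unit
+    const i = Math.min(Math.max(Math.floor(Math.log(bytes) / Math.log(k)), 0), sizes.length - 1);
+    return parseFloat((bytes / Math.pow(k, i)).toFixed(2)) + ' ' + sizes[i];
+}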
+// Format duration
+function formatDuration(seconds) {
+    if (isNaN(seconds) || seconds < 0) {
+        return window.currentLocale === 'zh' ? '未知' : 'Unknown';
+    }
+    const minutes = Math.floor(seconds / 60);
+    const remainingSeconds = Math.floor(seconds % 60);
+    if (minutes > 0) {
+        return window.currentLocale === 'zh'
+            ? `${minutes}分${remainingSeconds}秒`
+            : `${minutes}m ${remainingSeconds}s`;
+    } else {
+        return window.currentLocale === 'zh'
+            ? `${remainingSeconds}秒`
+            : `${remainingSeconds}s`;
+    }
+}
+// Update UI text based on current locale
+function updateUIText() {
+    // Update button texts
+    const generateBtn = document.getElementById('generate-btn');
+    if (generateBtn && !generateBtn.disabled) {
+        generateBtn.innerHTML = window.currentLocale === 'zh'
+            ? '<i class="fas fa-magic me-2"></i>生成语音'
+            : '<i class="fas fa-magic me-2"></i>Generate Speech';
+    }
+    // Update other dynamic text elements
+    const charCountElement = document.querySelector('#char-count');
+    if (charCountElement) {
+        const count = charCountElement.textContent;
+        const parent = charCountElement.parentElement;
+        if (parent) {
+            // Escape HTML characters to prevent XSS
+            const escapedCount = count.replace(/&/g, '&amp;')
+                                     .replace(/</g, '&lt;')
+                                     .replace(/>/g, '&gt;')
+                                     .replace(/"/g, '&quot;')
+                                     .replace(/'/g, '&#x27;');
+            parent.innerHTML = window.currentLocale === 'zh'
+                ? `<i class="fas fa-keyboard me-1"></i><span id="char-count">${escapedCount}</span> 字符`
+                : `<i class="fas fa-keyboard me-1"></i><span id="char-count">${escapedCount}</span> characters`;
+        }
+    }
+}
+// Initialize i18n
+function initI18n() {
+    // Load translations if needed
+    loadTranslations();
+    // Update UI text
+    updateUIText();
+    // Listen for language changes
+    document.addEventListener('languageChanged', function(event) {
+        window.currentLocale = event.detail.locale;
+        loadTranslations().then(() => {
+            updateUIText();
+        });
+    });
+}
+// Export functions for global use
+window._ = _;
+window.getSampleText = getSampleText;
+window.getErrorMessage = getErrorMessage;
+window.getSuccessMessage = getSuccessMessage;
+window.formatFileSize = formatFileSize;
+window.formatDuration = formatDuration;
+window.initI18n = initI18n;
+// Auto-initialize when DOM is ready
+if (document.readyState === 'loading') {
+    document.addEventListener('DOMContentLoaded', initI18n);
+} else {
+    initI18n();
+}
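
For readers who want to exercise the lookup logic outside the browser, the following is a minimal sketch of how `_()` and `formatString()` in i18n.js behave, rewritten as pure functions. The names `translate` and `sampleData` are hypothetical and exist only for this illustration; the file itself reads from `window.i18nData`, which the server populates.

```javascript
// Hypothetical stand-in for window.i18nData; the real data comes from the server.
const sampleData = {
    playground: { chars: '{count} characters', title: 'Playground' }
};

// Replace {name} placeholders; unknown placeholders are left untouched.
function formatString(str, params) {
    return str.replace(/\{(\w+)\}/g, (match, key) =>
        Object.prototype.hasOwnProperty.call(params, key) ? String(params[key]) : match
    );
}

// Walk a dotted key through nested objects; fall back to the key itself.
function translate(data, key, params = {}) {
    let value = data;
    for (const part of key.split('.')) {
        if (value && typeof value === 'object' && part in value) {
            value = value[part];
        } else {
            return key;
        }
    }
    return typeof value === 'string' ? formatString(value, params) : key;
}

console.log(translate(sampleData, 'playground.chars', { count: 42 })); // 42 characters
console.log(translate(sampleData, 'playground.missing'));              // playground.missing
```

Returning the key on a miss keeps the UI readable while translations are still loading, at the cost of hiding missing entries; logging misses in a debug build would be one way to surface them.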

ttsfm-web/static/js/playground-enhanced-fixed.js ADDED

	@@ -0,0 +1,712 @@

+// TTSFM Enhanced Playground with WebSocket Streaming Support - Fixed Version
+// Global variables
+let currentAudioBlob = null;
+let currentFormat = 'mp3';
+let batchResults = [];
+let wsClient = null;
+let streamingMode = false;
+let currentStreamRequest = null;
+// Initialize playground
+document.addEventListener('DOMContentLoaded', function() {
+    initializePlayground();
+    initializeWebSocket();
+});
+// Initialize WebSocket client
+function initializeWebSocket() {
+    // Check if Socket.IO is available
+    if (typeof io === 'undefined') {
+        console.warn('Socket.IO not loaded. WebSocket streaming will be disabled.');
+        return;
+    }
+    // Initialize WebSocket client
+    wsClient = new WebSocketTTSClient({
+        socketUrl: window.location.origin,
+        debug: true,
+        onConnect: () => {
+            console.log('WebSocket connected');
+            updateStreamingStatus('connected');
+        },
+        onDisconnect: () => {
+            console.log('WebSocket disconnected');
+            updateStreamingStatus('disconnected');
+        },
+        onError: (error) => {
+            console.error('WebSocket error:', error);
+            updateStreamingStatus('error');
+        }
+    });
+}
+// Update streaming status indicator
+function updateStreamingStatus(status) {
+    const indicator = document.getElementById('streaming-indicator');
+    if (!indicator) return;
+    indicator.className = 'streaming-status';
+    switch(status) {
+        case 'connected':
+            indicator.classList.add('connected');
+            indicator.innerHTML = '<i class="fas fa-bolt"></i> Streaming Ready';
+            enableStreamingMode(true);
+            break;
+        case 'disconnected':
+            indicator.classList.add('disconnected');
+            indicator.innerHTML = '<i class="fas fa-plug"></i> Streaming Offline';
+            enableStreamingMode(false);
+            break;
+        case 'error':
+            indicator.classList.add('error');
+            indicator.innerHTML = '<i class="fas fa-exclamation-triangle"></i> Connection Error';
+            enableStreamingMode(false);
+            break;
+        case 'streaming':
+            indicator.classList.add('streaming');
+            indicator.innerHTML = '<i class="fas fa-stream"></i> Streaming...';
+            break;
+    }
+}
+// Enable/disable streaming mode
+function enableStreamingMode(enabled) {
+    const streamToggle = document.getElementById('stream-mode-toggle');
+    if (streamToggle) {
+        streamToggle.disabled = !enabled;
+        if (!enabled && streamingMode) {
+            streamingMode = false;
+            streamToggle.checked = false;
+        }
+    }
+}
+// Check authentication status
+async function checkAuthStatus() {
+    try {
+        const response = await fetch('/api/auth-status');
+        const data = await response.json();
+        const apiKeySection = document.getElementById('api-key-section');
+        if (apiKeySection) {
+            if (data.api_key_required) {
+                apiKeySection.style.display = 'block';
+                const apiKeyInput = document.getElementById('api-key-input');
+                if (apiKeyInput) {
+                    apiKeyInput.required = true;
+                }
+            } else {
+                apiKeySection.style.display = 'none';
+            }
+        }
+    } catch (error) {
+        console.warn('Could not check auth status:', error);
+    }
+}
+function initializePlayground() {
+    console.log('Initializing enhanced playground...');
+    checkAuthStatus();
+    loadVoices();
+    loadFormats();
+    updateCharCount();
+    setupEventListeners();
+    setupStreamingControls();
+    console.log('Enhanced playground initialization complete');
+}
+function setupStreamingControls() {
+    // Add streaming mode toggle
+    const generateButton = document.getElementById('generate-btn');
+    if (generateButton && generateButton.parentElement) {
+        const streamingControls = document.createElement('div');
+        streamingControls.className = 'streaming-controls mt-3';
+        streamingControls.innerHTML = `
+            <div class="form-check form-switch">
+                <input class="form-check-input" type="checkbox" id="stream-mode-toggle" disabled>
+                <label class="form-check-label" for="stream-mode-toggle">
+                    <i class="fas fa-bolt me-1"></i>
+                    Enable WebSocket Streaming
+                    <small class="text-muted">(Real-time audio chunks)</small>
+                </label>
+            </div>
+            <div id="streaming-indicator" class="streaming-status mt-2"></div>
+        `;
+        generateButton.parentElement.appendChild(streamingControls);
+        // Add toggle event listener
+        const toggle = document.getElementById('stream-mode-toggle');
+        if (toggle) {
+            toggle.addEventListener('change', (e) => {
+                streamingMode = e.target.checked;
+                console.log('Streaming mode:', streamingMode ? 'ON' : 'OFF');
+                // Update button text
+                const btnText = generateButton.querySelector('.btn-text');
+                if (btnText) {
+                    if (streamingMode) {
+                        btnText.innerHTML = '<i class="fas fa-bolt me-2"></i>Stream Speech';
+                    } else {
+                        btnText.innerHTML = '<i class="fas fa-magic me-2"></i>' +
+                            (window.currentLocale === 'zh' ? '生成语音' : 'Generate Speech');
+                    }
+                }
+            });
+        }
+    }
+    // Add streaming progress section and error message div
+    const audioResult = document.getElementById('audio-result');
+    if (audioResult && audioResult.parentElement) {
+        // Add error message div
+        const errorDiv = document.createElement('div');
+        errorDiv.id = 'error-message';
+        errorDiv.className = 'alert alert-danger';
+        errorDiv.style.display = 'none';
+        audioResult.parentElement.insertBefore(errorDiv, audioResult);
+        // Add loading section
+        const loadingDiv = document.createElement('div');
+        loadingDiv.id = 'loading-section';
+        loadingDiv.className = 'text-center';
+        loadingDiv.style.display = 'none';
+        loadingDiv.innerHTML = `
+            <div class="spinner-border text-primary" role="status">
+                <span class="visually-hidden">Loading...</span>
+            </div>
+            <p class="mt-2">Generating speech...</p>
+        `;
+        audioResult.parentElement.insertBefore(loadingDiv, audioResult);
+        // Add progress section
+        const progressSection = document.createElement('div');
+        progressSection.id = 'streaming-progress';
+        progressSection.className = 'streaming-progress-section';
+        progressSection.style.display = 'none';
+        progressSection.innerHTML = `
+            <div class="card border-primary">
+                <div class="card-body">
+                    <h5 class="card-title">
+                        <i class="fas fa-stream me-2"></i>Streaming Progress
+                    </h5>
+                    <div class="progress mb-3" style="height: 25px;">
+                        <div class="progress-bar progress-bar-striped progress-bar-animated"
+                             id="stream-progress-bar"
+                             role="progressbar"
+                             style="width: 0%">
+                            <span id="stream-progress-text">0%</span>
+                        </div>
+                    </div>
+                    <div class="row text-center">
+                        <div class="col-md-4">
+                            <h6>Chunks</h6>
+                            <p class="h5"><span id="chunks-count">0</span> / <span id="total-chunks">0</span></p>
+                        </div>
+                        <div class="col-md-4">
+                            <h6>Data</h6>
+                            <p class="h5" id="data-transferred">0 KB</p>
+                        </div>
+                        <div class="col-md-4">
+                            <h6>Time</h6>
+                            <p class="h5" id="stream-time">0.0s</p>
+                        </div>
+                    </div>
+                    <div id="chunks-visualization" class="chunks-visual mt-3"></div>
+                </div>
+            </div>
+        `;
+        audioResult.parentElement.insertBefore(progressSection, audioResult);
+    }
+}
+function setupEventListeners() {
+    console.log('Setting up event listeners...');
+    // Form and input events
+    const textInput = document.getElementById('text-input');
+    if (textInput) {
+        textInput.addEventListener('input', updateCharCount);
+    }
+    // Form submit
+    const form = document.getElementById('tts-form');
+    if (form) {
+        form.addEventListener('submit', function(event) {
+            event.preventDefault();
+            event.stopPropagation();
+            if (streamingMode && wsClient && wsClient.isConnected()) {
+                generateSpeechStreaming(event);
+            } else {
+                generateSpeech(event);
+            }
+            return false;
+        });
+    }
+    // Download button
+    const downloadBtn = document.getElementById('download-btn');
+    if (downloadBtn) {
+        downloadBtn.addEventListener('click', downloadAudio);
+    }
+}
+// Generate speech using WebSocket streaming
+async function generateSpeechStreaming(event) {
+    event.preventDefault();
+    const text = document.getElementById('text-input').value.trim();
+    const voice = document.getElementById('voice-select').value;
+    const format = document.getElementById('format-select').value;
+    if (!text) {
+        showError('Please enter some text to convert');
+        return;
+    }
+    // Reset UI
+    hideError();
+    hideResults();
+    disableForm();
+    // Show streaming progress
+    const progressSection = document.getElementById('streaming-progress');
+    if (progressSection) progressSection.style.display = 'block';
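+    // Reset progress, including counters left over from a previous stream
+    updateStreamingProgress(0, 0, 0);
+    const chunksViz = document.getElementById('chunks-visualization');
+    if (chunksViz) chunksViz.innerHTML = '';
+    const dataCounter = document.getElementById('data-transferred');
+    if (dataCounter) dataCounter.textContent = '0 KB';
+    const timeCounter = document.getElementById('stream-time');
+    if (timeCounter) timeCounter.textContent = '0.0s';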
+    // Update status
+    updateStreamingStatus('streaming');
+    const startTime = Date.now();
+    let audioChunks = [];
+    try {
+        const result = await wsClient.generateSpeech(text, {
+            voice: voice,
+            format: format,
+            chunkSize: 512,
+            onStart: (data) => {
+                currentStreamRequest = data.request_id;
+                console.log('Streaming started:', data);
+            },
+            onProgress: (progress) => {
+                updateStreamingProgress(
+                    progress.progress,
+                    progress.chunksCompleted,
+                    progress.totalChunks
+                );
+                const elapsed = (Date.now() - startTime) / 1000;
+                const timeEl = document.getElementById('stream-time');
+                if (timeEl) timeEl.textContent = `${elapsed.toFixed(1)}s`;
+            },
+            onChunk: (chunk) => {
+                // Visualize chunk
+                const chunksViz = document.getElementById('chunks-visualization');
+                if (chunksViz) {
+                    const chunkViz = document.createElement('div');
+                    chunkViz.className = 'chunk-indicator';
+                    chunkViz.title = `Chunk ${chunk.chunkIndex + 1} - ${(chunk.audioData.byteLength / 1024).toFixed(1)}KB`;
+                    chunkViz.innerHTML = `<i class="fas fa-music"></i>`;
+                    chunksViz.appendChild(chunkViz);
+                }
+                // Update data transferred
+                const dataEl = document.getElementById('data-transferred');
+                if (dataEl) {
+                    const currentData = parseFloat(dataEl.textContent) || 0;
+                    const newData = currentData + (chunk.audioData.byteLength / 1024);
+                    dataEl.textContent = `${newData.toFixed(1)} KB`;
+                }
+                audioChunks.push(chunk);
+            },
+            onComplete: (result) => {
+                console.log('Streaming complete:', result);
+                // Create blob from audio data
+                currentAudioBlob = new Blob([result.audioData], { type: `audio/${result.format}` });
+                currentFormat = result.format;
+                // Show results
+                showResults(currentAudioBlob, result.format);
+                // Update final stats
+                const totalTime = (Date.now() - startTime) / 1000;
+                showStreamingStats({
+                    chunks: result.chunks.length,
+                    totalSize: (result.audioData.byteLength / 1024).toFixed(1),
+                    totalTime: totalTime.toFixed(2),
+                    format: result.format
+                });
+            },
+            onError: (error) => {
+                showError(`Streaming error: ${error.message}`);
+                enableForm();
+                if (progressSection) progressSection.style.display = 'none';
+            }
+        });
+    } catch (error) {
+        showError(`Failed to stream speech: ${error.message}`);
+        enableForm();
+        if (progressSection) progressSection.style.display = 'none';
+    } finally {
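+        // Reflect the actual connection state instead of assuming the socket is still up
+        updateStreamingStatus(wsClient && wsClient.isConnected() ? 'connected' : 'disconnected');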
+        currentStreamRequest = null;
+    }
+}
+function updateStreamingProgress(progress, chunks, totalChunks) {
+    const progressBar = document.getElementById('stream-progress-bar');
+    const progressText = document.getElementById('stream-progress-text');
+    const chunksCount = document.getElementById('chunks-count');
+    const totalChunksEl = document.getElementById('total-chunks');
+    if (progressBar) {
+        progressBar.style.width = `${progress}%`;
+        if (progressText) progressText.textContent = `${progress}%`;
+    }
+    if (chunksCount) chunksCount.textContent = chunks;
+    if (totalChunksEl) totalChunksEl.textContent = totalChunks;
+}
+function showStreamingStats(stats) {
+    const progressSection = document.getElementById('streaming-progress');
+    if (!progressSection) return;
+    const statsHtml = `
+        <div class="alert alert-success mt-3">
+            <h6><i class="fas fa-check-circle me-2"></i>Streaming Complete!</h6>
+            <div class="row mt-2">
+                <div class="col-md-3">
+                    <strong>Chunks:</strong> ${stats.chunks}
+                </div>
+                <div class="col-md-3">
+                    <strong>Total Size:</strong> ${stats.totalSize} KB
+                </div>
+                <div class="col-md-3">
+                    <strong>Time:</strong> ${stats.totalTime}s
+                </div>
+                <div class="col-md-3">
+                    <strong>Format:</strong> ${stats.format.toUpperCase()}
+                </div>
+            </div>
+        </div>
+    `;
+    const statsDiv = document.createElement('div');
+    statsDiv.innerHTML = statsHtml;
+    progressSection.appendChild(statsDiv);
+}
+// Load available voices
+async function loadVoices() {
+    try {
+        const response = await fetch('/api/voices');
+        const data = await response.json();
+        const voiceSelect = document.getElementById('voice-select');
+        if (voiceSelect) {
+            voiceSelect.innerHTML = '';
+            data.voices.forEach(voice => {
+                const option = document.createElement('option');
+                option.value = voice.id;
+                option.textContent = voice.name;
+                if (voice.id === 'alloy') {
+                    option.selected = true;
+                }
+                voiceSelect.appendChild(option);
+            });
+        }
+    } catch (error) {
+        console.error('Failed to load voices:', error);
+    }
+}
+// Load available formats
+async function loadFormats() {
+    try {
+        const response = await fetch('/api/formats');
+        const data = await response.json();
+        const formatSelect = document.getElementById('format-select');
+        if (formatSelect) {
+            formatSelect.innerHTML = '';
+            data.formats.forEach(format => {
+                const option = document.createElement('option');
+                option.value = format.id;
+                option.textContent = `${format.name} - ${format.quality}`;
+                if (format.id === 'mp3') {
+                    option.selected = true;
+                }
+                formatSelect.appendChild(option);
+            });
+        }
+    } catch (error) {
+        console.error('Failed to load formats:', error);
+    }
+}
+// Update character count
+function updateCharCount() {
+    const textInput = document.getElementById('text-input');
+    const charCount = document.getElementById('char-count');
+    const maxLengthInput = document.getElementById('max-length-input');
+    if (textInput && charCount) {
+        const currentLength = textInput.value.length;
+        // Parse as base 10 and fall back to 4096 when the input is missing or not a number
+        const maxLength = (maxLengthInput && parseInt(maxLengthInput.value, 10)) || 4096;
+        charCount.textContent = currentLength;
+        if (currentLength > maxLength) {
+            charCount.className = 'text-danger fw-bold';
+        } else if (currentLength > maxLength * 0.8) {
+            charCount.className = 'text-warning fw-bold';
+        } else {
+            charCount.className = '';
+        }
+    }
+}
+// Generate speech (original HTTP method)
+async function generateSpeech(event) {
+    event.preventDefault();
+    const text = document.getElementById('text-input').value.trim();
+    const voice = document.getElementById('voice-select').value;
+    const format = document.getElementById('format-select').value;
+    const instructions = document.getElementById('instructions-input')?.value.trim() || '';
+    const apiKey = document.getElementById('api-key-input')?.value.trim() || '';
+    if (!text) {
+        showError('Please enter some text to convert');
+        return;
+    }
+    hideError();
+    hideResults();
+    showLoading();
+    disableForm();
+    try {
+        const headers = {
+            'Content-Type': 'application/json'
+        };
+        if (apiKey) {
+            headers['Authorization'] = `Bearer ${apiKey}`;
+        }
+        const requestBody = {
+            text: text,
+            voice: voice,
+            format: format
+        };
+        if (instructions) {
+            requestBody.instructions = instructions;
+        }
+        const response = await fetch('/api/generate', {
+            method: 'POST',
+            headers: headers,
+            body: JSON.stringify(requestBody)
+        });
+        if (!response.ok) {
+            let errorMessage = `Error: ${response.status} ${response.statusText}`;
+            try {
+                const errorData = await response.json();
+                if (errorData.error?.message) {
+                    errorMessage = errorData.error.message;
+                }
+            } catch (e) {
+                // Use default error message
+            }
+            throw new Error(errorMessage);
+        }
+        const blob = await response.blob();
+        currentAudioBlob = blob;
+        currentFormat = format;
+        showResults(blob, format);
+    } catch (error) {
+        showError(error.message);
+    } finally {
+        hideLoading();
+        enableForm();
+    }
+}
+// Show/hide functions
+function showLoading() {
+    const loading = document.getElementById('loading-section');
+    if (loading) loading.style.display = 'block';
+}
+function hideLoading() {
+    const loading = document.getElementById('loading-section');
+    if (loading) loading.style.display = 'none';
+}
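+function showResults(blob, format) {
+    const audioPlayer = document.getElementById('audio-player');
+    // Release the object URL from any previous result before creating a new one
+    if (audioPlayer && audioPlayer.src.startsWith('blob:')) {
+        URL.revokeObjectURL(audioPlayer.src);
+    }
+    const audioUrl = URL.createObjectURL(blob);
+    if (audioPlayer) {
+        audioPlayer.src = audioUrl;
+    }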
+    const audioResult = document.getElementById('audio-result');
+    if (audioResult) {
+        audioResult.classList.remove('d-none');
+    }
+    const downloadBtn = document.getElementById('download-btn');
+    if (downloadBtn) {
+        downloadBtn.disabled = false;
+    }
+    enableForm();
+}
+function hideResults() {
+    const audioResult = document.getElementById('audio-result');
+    if (audioResult) {
+        audioResult.classList.add('d-none');
+    }
+}
+function showError(message) {
+    const errorDiv = document.getElementById('error-message');
+    if (errorDiv) {
+        errorDiv.textContent = message;
+        errorDiv.style.display = 'block';
+    }
+}
+function hideError() {
+    const errorDiv = document.getElementById('error-message');
+    if (errorDiv) {
+        errorDiv.style.display = 'none';
+    }
+}
+function disableForm() {
+    const elements = ['generate-btn', 'text-input', 'voice-select', 'format-select'];
+    elements.forEach(id => {
+        const el = document.getElementById(id);
+        if (el) el.disabled = true;
+    });
+}
+function enableForm() {
+    const elements = ['generate-btn', 'text-input', 'voice-select', 'format-select'];
+    elements.forEach(id => {
+        const el = document.getElementById(id);
+        if (el) el.disabled = false;
+    });
+}
+// Download audio
+function downloadAudio() {
+    if (!currentAudioBlob) return;
+    const url = URL.createObjectURL(currentAudioBlob);
+    const a = document.createElement('a');
+    a.href = url;
+    a.download = `tts_${Date.now()}.${currentFormat}`;
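+    // Attach the link before clicking; some browsers ignore clicks on detached anchors
+    document.body.appendChild(a);
+    a.click();
+    document.body.removeChild(a);
+    // Defer revocation so the download can start before the URL is released
+    setTimeout(() => URL.revokeObjectURL(url), 0);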
+}
+// Add CSS for streaming visualization
+const style = document.createElement('style');
+style.textContent = `
+.streaming-controls {
+    padding: 15px;
+    background-color: #f8f9fa;
+    border-radius: 8px;
+}
+.streaming-status {
+    display: inline-block;
+    padding: 5px 10px;
+    border-radius: 20px;
+    font-size: 0.875rem;
+    font-weight: 500;
+}
+.streaming-status.connected {
+    background-color: #d4edda;
+    color: #155724;
+}
+.streaming-status.disconnected {
+    background-color: #f8d7da;
+    color: #721c24;
+}
+.streaming-status.error {
+    background-color: #fff3cd;
+    color: #856404;
+}
+.streaming-status.streaming {
+    background-color: #cce5ff;
+    color: #004085;
+    animation: pulse 1.5s infinite;
+}
+@keyframes pulse {
+    0% { opacity: 1; }
+    50% { opacity: 0.7; }
+    100% { opacity: 1; }
+}
+.streaming-progress-section {
+    margin-bottom: 20px;
+}
+.chunks-visual {
+    display: flex;
+    flex-wrap: wrap;
+    gap: 5px;
+}
+.chunk-indicator {
+    width: 30px;
+    height: 30px;
+    background-color: #007bff;
+    color: white;
+    border-radius: 4px;
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    font-size: 0.75rem;
+    animation: chunkAppear 0.3s ease-out;
+}
+@keyframes chunkAppear {
+    from {
+        transform: scale(0);
+        opacity: 0;
+    }
+    to {
+        transform: scale(1);
+        opacity: 1;
+    }
+}
+`;
+document.head.appendChild(style);
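
Review note on the error handling in the generate handler above: the new code reads `errorData.error.message`, while the removed `playground.js` code read `errorData.error` directly. A minimal sketch of a helper that tolerates both shapes follows. `extractErrorMessage` is a hypothetical name and is not part of this commit, and the two payload shapes are assumptions inferred from the diff rather than from the server code.

```javascript
// Hypothetical helper, not part of the commit: picks the message to display
// for a failed response. `body` is the parsed JSON payload, or null/undefined
// when the response body could not be parsed.
function extractErrorMessage(status, statusText, body) {
    const fallback = `Error: ${status} ${statusText}`;
    if (!body || typeof body !== 'object') return fallback;
    const err = body.error;
    // Nested shape read by the new code: { error: { message: '...' } }
    if (err && typeof err === 'object' && typeof err.message === 'string' && err.message) {
        return err.message;
    }
    // Flat shape read by the removed code: { error: '...' }
    if (typeof err === 'string' && err) return err;
    return fallback;
}
```

In the handler this could be called as `extractErrorMessage(response.status, response.statusText, await response.json().catch(() => null))`, which keeps the fallback behaviour of the existing `try`/`catch`.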

ttsfm-web/static/js/playground.js CHANGED

@@ -1,745 +1,861 @@
-// TTSFM Playground JavaScript
-// Global variables
-let currentAudioBlob = null;
-let currentFormat = 'mp3';
-let batchResults = [];
-// Initialize playground
-document.addEventListener('DOMContentLoaded', function() {
-    initializePlayground();
-});
-function initializePlayground() {
-    loadVoices();
-    loadFormats();
-    updateCharCount();
-    setupEventListeners();
-    // Initialize tooltips if Bootstrap is available
-    if (typeof bootstrap !== 'undefined') {
-        const tooltipTriggerList = [].slice.call(document.querySelectorAll('[data-bs-toggle="tooltip"]'));
-        tooltipTriggerList.map(function (tooltipTriggerEl) {
-            return new bootstrap.Tooltip(tooltipTriggerEl);
-        });
-    }
-}
-function setupEventListeners() {
-    // Form and input events
-    document.getElementById('text-input').addEventListener('input', updateCharCount);
-    document.getElementById('tts-form').addEventListener('submit', generateSpeech);
-    document.getElementById('max-length-input').addEventListener('input', updateCharCount);
-    document.getElementById('auto-split-check').addEventListener('change', updateGenerateButton);
-    // Enhanced button events
-    document.getElementById('validate-text-btn').addEventListener('click', validateText);
-    document.getElementById('random-text-btn').addEventListener('click', loadRandomText);
-    document.getElementById('download-btn').addEventListener('click', downloadAudio);
-    document.getElementById('download-all-btn').addEventListener('click', downloadAllAudio);
-    // New button events
-    const clearTextBtn = document.getElementById('clear-text-btn');
-    if (clearTextBtn) {
-        clearTextBtn.addEventListener('click', clearText);
-    }
-    const resetFormBtn = document.getElementById('reset-form-btn');
-    if (resetFormBtn) {
-        resetFormBtn.addEventListener('click', resetForm);
-    }
-    const replayBtn = document.getElementById('replay-btn');
-    if (replayBtn) {
-        replayBtn.addEventListener('click', replayAudio);
-    }
-    const shareBtn = document.getElementById('share-btn');
-    if (shareBtn) {
-        shareBtn.addEventListener('click', shareAudio);
-    }
-    // Voice and format selection events
-    document.getElementById('voice-select').addEventListener('change', updateVoiceInfo);
-    document.getElementById('format-select').addEventListener('change', updateFormatInfo);
-    // Example text buttons
-    document.querySelectorAll('.use-example').forEach(button => {
-        button.addEventListener('click', function() {
-            document.getElementById('text-input').value = this.dataset.text;
-            updateCharCount();
-            // Add visual feedback
-            this.classList.add('btn-success');
-            setTimeout(() => {
-                this.classList.remove('btn-success');
-                this.classList.add('btn-outline-primary');
-            }, 1000);
-        });
-    });
-    // Keyboard shortcuts
-    document.addEventListener('keydown', function(e) {
-        // Ctrl/Cmd + Enter to generate speech
-        if ((e.ctrlKey || e.metaKey) && e.key === 'Enter') {
-            e.preventDefault();
-            document.getElementById('generate-btn').click();
-        }
-        // Escape to clear results
-        if (e.key === 'Escape') {
-            clearResults();
-        }
-    });
-}
-async function loadVoices() {
-    try {
-        const response = await fetch('/api/voices');
-        const data = await response.json();
-        const select = document.getElementById('voice-select');
-        select.innerHTML = '';
-        data.voices.forEach(voice => {
-            const option = document.createElement('option');
-            option.value = voice.id;
-            option.textContent = `${voice.name} - ${voice.description}`;
-            select.appendChild(option);
-        });
-        // Select default voice
-        select.value = 'alloy';
-    } catch (error) {
-        console.error('Failed to load voices:', error);
-        console.log('Failed to load voices. Please refresh the page.');
-    }
-}
-async function loadFormats() {
-    try {
-        const response = await fetch('/api/formats');
-        const data = await response.json();
-        const select = document.getElementById('format-select');
-        select.innerHTML = '';
-        data.formats.forEach(format => {
-            const option = document.createElement('option');
-            option.value = format.id;
-            option.textContent = `${format.name} - ${format.description}`;
-            select.appendChild(option);
-        });
-        // Select default format
-        select.value = 'mp3';
-        updateFormatInfo();
-    } catch (error) {
-        console.error('Failed to load formats:', error);
-        console.log('Failed to load formats. Please refresh the page.');
-    }
-}
-function updateCharCount() {
-    const text = document.getElementById('text-input').value;
-    const maxLength = parseInt(document.getElementById('max-length-input').value) || 4096;
-    const charCount = text.length;
-    document.getElementById('char-count').textContent = charCount.toLocaleString();
-    // Update length status with better visual feedback
-    const statusElement = document.getElementById('length-status');
-    const percentage = (charCount / maxLength) * 100;
-    if (charCount > maxLength) {
-        statusElement.innerHTML = '<span class="badge bg-danger"><i class="fas fa-exclamation-triangle me-1"></i>Exceeds limit</span>';
-    } else if (percentage > 80) {
-        statusElement.innerHTML = '<span class="badge bg-warning"><i class="fas fa-exclamation me-1"></i>Near limit</span>';
-    } else if (percentage > 50) {
-        statusElement.innerHTML = '<span class="badge bg-info"><i class="fas fa-info me-1"></i>Good</span>';
-    } else {
-        statusElement.innerHTML = '<span class="badge bg-success"><i class="fas fa-check me-1"></i>OK</span>';
-    }
-    updateGenerateButton();
-}
-function updateGenerateButton() {
-    const text = document.getElementById('text-input').value;
-    const maxLength = parseInt(document.getElementById('max-length-input').value) || 4096;
-    const autoSplit = document.getElementById('auto-split-check').checked;
-    const generateBtn = document.getElementById('generate-btn');
-    const btnText = generateBtn.querySelector('.btn-text');
-    if (text.length > maxLength && autoSplit) {
-        btnText.innerHTML = '<i class="fas fa-layer-group me-2"></i>Generate Speech (Batch Mode)';
-        generateBtn.classList.add('btn-warning');
-        generateBtn.classList.remove('btn-primary');
-    } else {
-        btnText.innerHTML = '<i class="fas fa-magic me-2"></i>Generate Speech';
-        generateBtn.classList.add('btn-primary');
-        generateBtn.classList.remove('btn-warning');
-    }
-}
-async function validateText() {
-    const text = document.getElementById('text-input').value.trim();
-    const maxLength = parseInt(document.getElementById('max-length-input').value) || 4096;
-    if (!text) {
-        console.log('Please enter some text to validate');
-        return;
-    }
-    const validateBtn = document.getElementById('validate-text-btn');
-    setLoading(validateBtn, true);
-    try {
-        const response = await fetch('/api/validate-text', {
-            method: 'POST',
-            headers: { 'Content-Type': 'application/json' },
-            body: JSON.stringify({ text, max_length: maxLength })
-        });
-        const data = await response.json();
-        const resultDiv = document.getElementById('validation-result');
-        if (data.is_valid) {
-            resultDiv.innerHTML = `
-                <div class="alert alert-success fade-in">
-                    <i class="fas fa-check-circle me-2"></i>
-                    <strong>Text is valid!</strong> (${data.text_length.toLocaleString()} characters)
-                    <div class="progress progress-custom mt-2">
-                        <div class="progress-bar-custom" style="width: ${(data.text_length / data.max_length) * 100}%"></div>
-                    </div>
-                </div>
-            `;
-        } else {
-            resultDiv.innerHTML = `
-                <div class="alert alert-warning fade-in">
-                    <i class="fas fa-exclamation-triangle me-2"></i>
-                    <strong>Text exceeds limit!</strong> (${data.text_length.toLocaleString()}/${data.max_length.toLocaleString()} characters)
-                    <br><small class="mt-2 d-block">Suggested chunks: ${data.suggested_chunks}</small>
-                    <div class="mt-3">
-                        <strong>Preview of chunks:</strong>
-                        <div class="mt-2">
-                            ${data.chunk_preview.map((chunk, i) => `
-                                <div class="border rounded p-2 mb-2 bg-light">
-                                    <small class="text-muted">Chunk ${i+1}:</small>
-                                    <div class="small">${chunk}</div>
-                                </div>
-                            `).join('')}
-                        </div>
-                        <button class="btn btn-sm btn-outline-primary mt-2" onclick="enableAutoSplit()">
-                            <i class="fas fa-magic me-1"></i>Enable Auto-Split
-                        </button>
-                    </div>
-                </div>
-            `;
-        }
-        resultDiv.classList.remove('d-none');
-        resultDiv.scrollIntoView({ behavior: 'smooth', block: 'nearest' });
-    } catch (error) {
-        console.error('Validation failed:', error);
-        console.log('Failed to validate text. Please try again.');
-    } finally {
-        setLoading(validateBtn, false);
-    }
-}
-function enableAutoSplit() {
-    document.getElementById('auto-split-check').checked = true;
-    updateGenerateButton();
-    console.log('Auto-split enabled! Click Generate Speech to process in batch mode.');
-}
-async function generateSpeech(event) {
-    event.preventDefault();
-    const button = document.getElementById('generate-btn');
-    const audioResult = document.getElementById('audio-result');
-    const batchResult = document.getElementById('batch-result');
-    // Get form data
-    const formData = getFormData();
-    if (!validateFormData(formData)) {
-        return;
-    }
-    // Check if we need batch processing
-    const needsBatch = formData.text.length > formData.maxLength && formData.autoSplit;
-    // Show loading state
-    setLoading(button, true);
-    clearResults();
-    try {
-        if (needsBatch) {
-            await generateBatchSpeech(formData);
-        } else {
-            await generateSingleSpeech(formData);
-        }
-    } catch (error) {
-        console.error('Generation failed:', error);
-        console.log(`Failed to generate speech: ${error.message}`);
-    } finally {
-        setLoading(button, false);
-    }
-}
-function getFormData() {
-    return {
-        text: document.getElementById('text-input').value.trim(),
-        voice: document.getElementById('voice-select').value,
-        format: document.getElementById('format-select').value,
-        instructions: document.getElementById('instructions-input').value.trim(),
-        maxLength: parseInt(document.getElementById('max-length-input').value) || 4096,
-        validateLength: document.getElementById('validate-length-check').checked,
-        autoSplit: document.getElementById('auto-split-check').checked
-    };
-}
-function validateFormData(formData) {
-    if (!formData.text || !formData.voice || !formData.format) {
-        console.log('Please fill in all required fields');
-        return false;
-    }
-    if (formData.text.length > formData.maxLength && formData.validateLength && !formData.autoSplit) {
-        console.log(`Text is too long (${formData.text.length} characters). Enable auto-split or reduce text length.`);
-        return false;
-    }
-    return true;
-}
-function clearResults() {
-    document.getElementById('audio-result').classList.add('d-none');
-    document.getElementById('batch-result').classList.add('d-none');
-    document.getElementById('validation-result').classList.add('d-none');
-}
-// Utility functions
-function setLoading(button, loading) {
-    if (loading) {
-        button.classList.add('loading');
-        button.disabled = true;
-    } else {
-        button.classList.remove('loading');
-        button.disabled = false;
-    }
-}
-async function generateSingleSpeech(formData) {
-    const audioResult = document.getElementById('audio-result');
-    const response = await fetch('/api/generate', {
-        method: 'POST',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({
-            text: formData.text,
-            voice: formData.voice,
-            format: formData.format,
-            instructions: formData.instructions || undefined,
-            max_length: formData.maxLength,
-            validate_length: formData.validateLength
-        })
-    });
-    if (!response.ok) {
-        const errorData = await response.json();
-        throw new Error(errorData.error || `HTTP ${response.status}`);
-    }
-    // Get audio data
-    const audioBlob = await response.blob();
-    currentAudioBlob = audioBlob;
-    currentFormat = formData.format;
-    // Create audio URL and setup player
-    const audioUrl = URL.createObjectURL(audioBlob);
-    const audioPlayer = document.getElementById('audio-player');
-    audioPlayer.src = audioUrl;
-    // Use enhanced display function
-    displayAudioResult(audioBlob, formData.format, formData.voice, formData.text);
-    console.log('Speech generated successfully! Click play to listen.');
-    // Auto-play if user prefers
-    if (localStorage.getItem('autoPlay') === 'true') {
-        audioPlayer.play().catch(() => {
-            // Auto-play blocked, that's fine
-        });
-    }
-}
-async function generateBatchSpeech(formData) {
-    const batchResult = document.getElementById('batch-result');
-    const response = await fetch('/api/generate-batch', {
-        method: 'POST',
-        headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({
-            text: formData.text,
-            voice: formData.voice,
-            format: formData.format,
-            instructions: formData.instructions || undefined,
-            max_length: formData.maxLength,
-            preserve_words: true
-        })
-    });
-    if (!response.ok) {
-        const errorData = await response.json();
-        throw new Error(errorData.error || `HTTP ${response.status}`);
-    }
-    const data = await response.json();
-    batchResults = data.results;
-    // Update batch summary
-    const summaryDiv = document.getElementById('batch-summary');
-    summaryDiv.innerHTML = `
-        <i class="fas fa-layer-group me-2"></i>
-        <strong>Batch Processing Complete!</strong>
-        Generated ${data.successful_chunks} of ${data.total_chunks} audio chunks successfully.
-        ${data.successful_chunks < data.total_chunks ?
-            `<br><small class="text-warning">⚠️ ${data.total_chunks - data.successful_chunks} chunks failed to generate.</small>` :
-            '<br><small class="text-success">✅ All chunks generated successfully!</small>'
-        }
-    `;
-    // Display chunks
-    displayBatchChunks(data.results, formData.format);
-    // Show batch result with animation
-    batchResult.classList.remove('d-none');
-    batchResult.classList.add('fade-in');
-    console.log(`Batch processing completed! Generated ${data.successful_chunks} audio files.`);
-}
-function displayBatchChunks(results, format) {
-    const chunksDiv = document.getElementById('batch-chunks');
-    chunksDiv.innerHTML = '';
-    results.forEach((result, index) => {
-        const chunkDiv = document.createElement('div');
-        chunkDiv.className = 'col-md-6 col-lg-4 mb-3';
-        if (result.audio_data) {
-            // Convert base64 to blob
-            const audioBlob = base64ToBlob(result.audio_data, result.content_type);
-            const audioUrl = URL.createObjectURL(audioBlob);
-            chunkDiv.innerHTML = `
-                <div class="card batch-chunk-card h-100">
-                    <div class="card-body">
-                        <div class="d-flex justify-content-between align-items-start mb-2">
-                            <h6 class="card-title mb-0">
-                                <i class="fas fa-music me-1"></i>Chunk ${result.chunk_index}
-                            </h6>
-                            <span class="badge bg-success">
-                                <i class="fas fa-check me-1"></i>Success
-                            </span>
-                        </div>
-                        <p class="card-text small text-muted mb-3">${result.chunk_text}</p>
-                        <audio controls class="w-100 mb-3" preload="metadata">
-                            <source src="${audioUrl}" type="${result.content_type}">
-                            Your browser does not support audio playback.
-                        </audio>
-                        <div class="d-flex justify-content-between align-items-center">
-                            <small class="text-muted">
-                                <i class="fas fa-file-audio me-1"></i>
-                                ${(result.size / 1024).toFixed(1)} KB
-                            </small>
-                            <button class="btn btn-sm btn-outline-primary download-chunk"
-                                    data-url="${audioUrl}"
-                                    data-filename="chunk_${result.chunk_index}.${result.format}"
-                                    title="Download this chunk">
-                                <i class="fas fa-download"></i>
-                            </button>
-                        </div>
-                    </div>
-                </div>
-            `;
-        } else {
-            chunkDiv.innerHTML = `
-                <div class="card border-danger h-100">
-                    <div class="card-body">
-                        <div class="d-flex justify-content-between align-items-start mb-2">
-                            <h6 class="card-title mb-0 text-danger">
-                                <i class="fas fa-exclamation-triangle me-1"></i>Chunk ${result.chunk_index}
-                            </h6>
-                            <span class="badge bg-danger">
-                                <i class="fas fa-times me-1"></i>Failed
-                            </span>
-                        </div>
-                        <p class="card-text small text-muted mb-3">${result.chunk_text}</p>
-                        <div class="alert alert-danger small mb-0">
-                            <i class="fas fa-exclamation-circle me-1"></i>
-                            ${result.error}
-                        </div>
-                    </div>
-                </div>
-            `;
-        }
-        chunksDiv.appendChild(chunkDiv);
-    });
-    // Add download event listeners
-    document.querySelectorAll('.download-chunk').forEach(btn => {
-        btn.addEventListener('click', function() {
-            const url = this.dataset.url;
-            const filename = this.dataset.filename;
-            downloadFromUrl(url, filename);
-            // Visual feedback
-            const icon = this.querySelector('i');
-            icon.className = 'fas fa-check';
-            setTimeout(() => {
-                icon.className = 'fas fa-download';
-            }, 1000);
-        });
-    });
-}
-function downloadAudio() {
-    if (!currentAudioBlob) {
-        console.log('No audio to download');
-        return;
-    }
-    const url = URL.createObjectURL(currentAudioBlob);
-    const timestamp = new Date().toISOString().slice(0, 19).replace(/:/g, '-');
-    downloadFromUrl(url, `ttsfm-speech-${timestamp}.${currentFormat}`);
-    URL.revokeObjectURL(url);
-}
-function downloadAllAudio() {
-    const downloadButtons = document.querySelectorAll('.download-chunk');
-    if (downloadButtons.length === 0) {
-        console.log('No batch audio files to download');
-        return;
-    }
-    console.log(`Starting download of ${downloadButtons.length} files...`);
-    downloadButtons.forEach((btn, index) => {
-        setTimeout(() => {
-            btn.click();
-        }, index * 500); // Stagger downloads to avoid browser limits
-    });
-}
-function base64ToBlob(base64, contentType) {
-    const byteCharacters = atob(base64);
-    const byteNumbers = new Array(byteCharacters.length);
-    for (let i = 0; i < byteCharacters.length; i++) {
-        byteNumbers[i] = byteCharacters.charCodeAt(i);
-    }
-    const byteArray = new Uint8Array(byteNumbers);
-    return new Blob([byteArray], { type: contentType });
-}
-function downloadFromUrl(url, filename) {
-    const a = document.createElement('a');
-    a.href = url;
-    a.download = filename;
-    a.style.display = 'none';
-    document.body.appendChild(a);
-    a.click();
-    document.body.removeChild(a);
-}
-// New enhanced functions
-function clearText() {
-    document.getElementById('text-input').value = '';
-    updateCharCount();
-    clearResults();
-    console.log('Text cleared successfully');
-}
-function loadRandomText() {
-    const randomTexts = [
-        // News & Information
-        "Breaking news: Scientists have discovered a revolutionary new method for generating incredibly natural synthetic speech using advanced neural networks and machine learning algorithms.",
-        "Weather update: Today will be partly cloudy with temperatures reaching 75 degrees Fahrenheit. Light winds from the southwest at 5 to 10 miles per hour.",
-        "Technology report: The latest advancements in artificial intelligence are revolutionizing how we interact with digital devices and services.",
-        // Educational & Informative
-        "The human brain contains approximately 86 billion neurons, each connected to thousands of others, creating a complex network that enables consciousness, memory, and thought.",
-        "Photosynthesis is the process by which plants convert sunlight, carbon dioxide, and water into glucose and oxygen, forming the foundation of most life on Earth.",
-        "The speed of light in a vacuum is exactly 299,792,458 meters per second, making it one of the fundamental constants of physics.",
-        // Creative & Storytelling
-        "Once upon a time, in a land far away, there lived a wise old wizard who could speak to the stars and understand their ancient secrets.",
-        "The mysterious lighthouse stood alone on the rocky cliff, its beacon cutting through the fog like a sword of light, guiding lost ships safely home.",
-        "In the depths of the enchanted forest, where sunbeams danced through emerald leaves, a young adventurer discovered a hidden path to destiny.",
-        // Business & Professional
-        "Our quarterly results demonstrate strong growth across all market segments, with revenue increasing by 23% compared to the same period last year.",
-        "The new product launch exceeded expectations, capturing 15% market share within the first six months and establishing our brand as an industry leader.",
-        "We are committed to sustainable business practices that benefit our customers, employees, and the environment for generations to come.",
-        // Technical & Programming
-        "The TTSFM package provides a comprehensive API for text-to-speech generation with support for multiple voices and audio formats.",
-        "Machine learning algorithms process vast amounts of data to identify patterns and make predictions with remarkable accuracy.",
-        "Cloud computing has transformed how businesses store, process, and access their data, enabling scalability and flexibility like never before.",
-        // Conversational & Casual
-        "Welcome to TTSFM! Experience the future of text-to-speech technology with our premium AI voices.",
-        "Good morning! Today is a beautiful day to learn something new and explore the possibilities of text-to-speech technology.",
-        "Have you ever wondered what it would be like if your computer could speak with perfect human-like intonation and emotion?"
-    ];
-    const randomText = randomTexts[Math.floor(Math.random() * randomTexts.length)];
-    document.getElementById('text-input').value = randomText;
-    updateCharCount();
-    console.log('Random text loaded successfully');
-}
-function resetForm() {
-    // Reset form to default values
-    document.getElementById('text-input').value = 'Welcome to TTSFM! Experience the future of text-to-speech technology with our premium AI voices. Generate natural, expressive speech for any application.';
-    document.getElementById('voice-select').value = 'alloy';
-    document.getElementById('format-select').value = 'mp3';
-    document.getElementById('instructions-input').value = '';
-    document.getElementById('max-length-input').value = '4096';
-    document.getElementById('validate-length-check').checked = true;
-    document.getElementById('auto-split-check').checked = false;
-    updateCharCount();
-    updateGenerateButton();
-    clearResults();
-    console.log('Form reset to default values');
-}
-function replayAudio() {
-    const audioPlayer = document.getElementById('audio-player');
-    if (audioPlayer && audioPlayer.src) {
-        audioPlayer.currentTime = 0;
-        audioPlayer.play().catch(() => {
-            console.log('Unable to replay audio. Please check your browser settings.');
-        });
-    }
-}
-function shareAudio() {
-    if (navigator.share && currentAudioBlob) {
-        const file = new File([currentAudioBlob], `ttsfm-speech.${currentFormat}`, {
-            type: `audio/${currentFormat}`
-        });
-        navigator.share({
-            title: 'TTSFM Generated Speech',
-            text: 'Check out this speech generated with TTSFM!',
-            files: [file]
-        }).catch(() => {
-            // Fallback to copying link
-            copyAudioLink();
-        });
-    } else {
-        copyAudioLink();
-    }
-}
-function copyAudioLink() {
-    const audioPlayer = document.getElementById('audio-player');
-    if (audioPlayer && audioPlayer.src) {
-        navigator.clipboard.writeText(audioPlayer.src).then(() => {
-            console.log('Audio link copied to clipboard!');
-        }).catch(() => {
-            console.log('Unable to copy link. Please try downloading the audio instead.');
-        });
-    }
-}
-function updateVoiceInfo() {
-    const voiceSelect = document.getElementById('voice-select');
-    const previewBtn = document.getElementById('preview-voice-btn');
-    if (voiceSelect.value) {
-        previewBtn.disabled = false;
-        previewBtn.onclick = () => previewVoice(voiceSelect.value);
-    } else {
-        previewBtn.disabled = true;
-    }
-}
-function updateFormatInfo() {
-    const formatSelect = document.getElementById('format-select');
-    const formatInfo = document.getElementById('format-info');
-    const formatDescriptions = {
-        'mp3': '🎵 MP3 - Good quality, small file size. Best for web and general use.',
-        'opus': '📻 OPUS - Excellent quality, small file size. Best for streaming and VoIP.',
-        'aac': '📱 AAC - Good quality, medium file size. Best for Apple devices and streaming.',
-        'flac': '💿 FLAC - Lossless quality, large file size. Best for archival and high-quality audio.',
-        'wav': '🎧 WAV - Lossless quality, large file size. Best for professional audio production.',
-        'pcm': '🔊 PCM - Raw audio data, large file size. Best for audio processing.'
-    };
-    if (formatInfo && formatSelect.value) {
-        formatInfo.textContent = formatDescriptions[formatSelect.value] || 'High-quality audio format';
-    }
-}
-function previewVoice(voiceId) {
-    // This would typically play a short preview of the voice
-    console.log(`Voice preview for ${voiceId} - Feature coming soon!`);
-}
-// Enhanced audio result display
-function displayAudioResult(audioBlob, format, voice, text) {
-    const audioResult = document.getElementById('audio-result');
-    const audioPlayer = document.getElementById('audio-player');
-    const audioInfo = document.getElementById('audio-info');
-    // Create audio URL and setup player
-    const audioUrl = URL.createObjectURL(audioBlob);
-    audioPlayer.src = audioUrl;
-    // Update audio stats
-    const sizeKB = (audioBlob.size / 1024).toFixed(1);
-    document.getElementById('audio-size').textContent = `${sizeKB} KB`;
-    document.getElementById('audio-format').textContent = format.toUpperCase();
-    document.getElementById('audio-voice').textContent = voice.charAt(0).toUpperCase() + voice.slice(1);
-    // Update audio info
-    audioInfo.innerHTML = `
-        <i class="fas fa-check-circle text-success me-1"></i>
-        Generated successfully • ${sizeKB} KB • ${format.toUpperCase()}
-    `;
-    // Show result with animation
-    audioResult.classList.remove('d-none');
-    audioResult.classList.add('fade-in');
-    // Update duration when metadata loads
-    audioPlayer.addEventListener('loadedmetadata', function() {
-        const duration = Math.round(audioPlayer.duration);
-        document.getElementById('audio-duration').textContent = `${duration}s`;
-    }, { once: true });
-    // Scroll to result
-    audioResult.scrollIntoView({ behavior: 'smooth', block: 'nearest' });
-}
-// Export functions for use in HTML
-window.enableAutoSplit = enableAutoSplit;
-window.clearText = clearText;
-window.loadRandomText = loadRandomText;
-window.resetForm = resetForm;

+// TTSFM Playground JavaScript
+// Global variables
+let currentAudioBlob = null;
+let currentFormat = 'mp3';
+let batchResults = [];
+// Initialize playground
+document.addEventListener('DOMContentLoaded', function() {
+    initializePlayground();
+});
+// Check authentication status and show/hide API key field
+async function checkAuthStatus() {
+    try {
+        const response = await fetch('/api/auth-status');
+        const data = await response.json();
+        const apiKeySection = document.getElementById('api-key-section');
+        if (apiKeySection) {
+            if (data.api_key_required) {
+                // Show API key field and mark as required
+                apiKeySection.style.display = 'block';
+                const apiKeyInput = document.getElementById('api-key-input');
+                const label = apiKeySection.querySelector('label');
+                if (apiKeyInput) {
+                    apiKeyInput.required = true;
+                    apiKeyInput.placeholder = 'Enter your API key (required)';
+                }
+                if (label) {
+                    label.innerHTML = '<i class="fas fa-key me-2"></i>' + (window.currentLocale === 'zh' ? 'API密钥（必需）' : 'API Key (Required)');
+                }
+                // Update form text
+                const formText = apiKeySection.querySelector('.form-text');
+                if (formText) {
+                    formText.innerHTML = '<i class="fas fa-exclamation-triangle me-1 text-warning"></i>API key protection is enabled - this field is required';
+                }
+            } else {
+                // Hide the API key field when no key is required
+                apiKeySection.style.display = 'none';
+            }
+        }
+    } catch (error) {
+        console.warn('Could not check auth status:', error);
+        // If we can't check, assume API key might be required and show the field
+        const apiKeySection = document.getElementById('api-key-section');
+        if (apiKeySection) {
+            apiKeySection.style.display = 'block';
+        }
+    }
+}
+function initializePlayground() {
+    console.log('Initializing playground...');
+    checkAuthStatus();
+    loadVoices();
+    loadFormats();
+    updateCharCount();
+    setupEventListeners();
+    console.log('Playground initialization complete');
+    // Initialize tooltips if Bootstrap is available
+    if (typeof bootstrap !== 'undefined') {
+        document.querySelectorAll('[data-bs-toggle="tooltip"]').forEach(function (tooltipTriggerEl) {
+            new bootstrap.Tooltip(tooltipTriggerEl);
+        });
+    }
+}
+function setupEventListeners() {
+    console.log('Setting up event listeners...');
+    // Form and input events
+    const textInput = document.getElementById('text-input');
+    if (textInput) {
+        textInput.addEventListener('input', updateCharCount);
+        console.log('Text input event listener added');
+    } else {
+        console.error('Text input element not found!');
+    }
+    // Add form submit event listener with better error handling
+    const form = document.getElementById('tts-form');
+    if (form) {
+        form.addEventListener('submit', function(event) {
+            console.log('Form submit event triggered');
+            event.preventDefault(); // Prevent default form submission
+            event.stopPropagation(); // Stop event bubbling
+            generateSpeech(event);
+            return false; // Additional prevention
+        });
+    } else {
+        console.error('TTS form not found!');
+    }
+    const maxLengthInput = document.getElementById('max-length-input');
+    if (maxLengthInput) {
+        maxLengthInput.addEventListener('input', updateCharCount);
+        console.log('Max length input event listener added');
+    } else {
+        console.error('Max length input element not found!');
+    }
+    const autoCombineCheck = document.getElementById('auto-combine-check');
+    if (autoCombineCheck) {
+        autoCombineCheck.addEventListener('change', updateAutoCombineStatus);
+    }
+    // Enhanced button events
+    const validateBtn = document.getElementById('validate-text-btn');
+    if (validateBtn) {
+        validateBtn.addEventListener('click', validateText);
+        console.log('Validate button event listener added');
+    } else {
+        console.error('Validate button not found!');
+    }
+    const randomBtn = document.getElementById('random-text-btn');
+    if (randomBtn) {
+        randomBtn.addEventListener('click', loadRandomText);
+        console.log('Random text button event listener added');
+    } else {
+        console.error('Random text button not found!');
+    }
+    const downloadBtn = document.getElementById('download-btn');
+    if (downloadBtn) {
+        downloadBtn.addEventListener('click', downloadAudio);
+        console.log('Download button event listener added');
+    } else {
+        console.error('Download button not found!');
+    }
+    // Add direct click event listener for generate button as backup
+    const generateBtn = document.getElementById('generate-btn');
+    if (generateBtn) {
+        generateBtn.addEventListener('click', function(event) {
+            console.log('Generate button clicked directly');
+            event.preventDefault();
+            event.stopPropagation();
+            generateSpeech(event);
+            return false;
+        });
+    }
+    // New button events
+    const clearTextBtn = document.getElementById('clear-text-btn');
+    if (clearTextBtn) {
+        clearTextBtn.addEventListener('click', clearText);
+    }
+    const resetFormBtn = document.getElementById('reset-form-btn');
+    if (resetFormBtn) {
+        resetFormBtn.addEventListener('click', resetForm);
+    }
+    const replayBtn = document.getElementById('replay-btn');
+    if (replayBtn) {
+        replayBtn.addEventListener('click', replayAudio);
+    }
+    const shareBtn = document.getElementById('share-btn');
+    if (shareBtn) {
+        shareBtn.addEventListener('click', shareAudio);
+    }
+    // API Key visibility toggle
+    const toggleApiKeyBtn = document.getElementById('toggle-api-key-visibility');
+    if (toggleApiKeyBtn) {
+        toggleApiKeyBtn.addEventListener('click', toggleApiKeyVisibility);
+    }
+    // Voice and format selection events
+    const voiceSelect = document.getElementById('voice-select');
+    if (voiceSelect) {
+        voiceSelect.addEventListener('change', updateVoiceInfo);
+        console.log('Voice select event listener added');
+    } else {
+        console.error('Voice select element not found!');
+    }
+    const formatSelect = document.getElementById('format-select');
+    if (formatSelect) {
+        formatSelect.addEventListener('change', updateFormatInfo);
+        console.log('Format select event listener added');
+    } else {
+        console.error('Format select element not found!');
+    }
+    // Example text buttons
+    document.querySelectorAll('.use-example').forEach(button => {
+        button.addEventListener('click', function() {
+            document.getElementById('text-input').value = this.dataset.text;
+            updateCharCount();
+            // Add visual feedback
+            this.classList.add('btn-success');
+            setTimeout(() => {
+                this.classList.remove('btn-success');
+                this.classList.add('btn-outline-primary');
+            }, 1000);
+        });
+    });
+    // Keyboard shortcuts
+    document.addEventListener('keydown', function(e) {
+        // Ctrl/Cmd + Enter to generate speech
+        if ((e.ctrlKey || e.metaKey) && e.key === 'Enter') {
+            e.preventDefault();
+            document.getElementById('generate-btn').click();
+        }
+        // Escape to clear results
+        if (e.key === 'Escape') {
+            clearResults();
+        }
+    });
+    // Initialize auto-combine status
+    updateAutoCombineStatus();
+}
+async function loadVoices() {
+    try {
+        // Prepare headers for API key if available (OpenAI compatible format)
+        const headers = {};
+        const apiKeyInput = document.getElementById('api-key-input');
+        if (apiKeyInput && apiKeyInput.value.trim()) {
+            headers['Authorization'] = `Bearer ${apiKeyInput.value.trim()}`;
+        }
+        const response = await fetch('/api/voices', { headers });
+        const data = await response.json();
+        const select = document.getElementById('voice-select');
+        select.innerHTML = '';
+        data.voices.forEach(voice => {
+            const option = document.createElement('option');
+            option.value = voice.id;
+            option.textContent = `${voice.name} - ${voice.description}`;
+            select.appendChild(option);
+        });
+        // Select default voice
+        select.value = 'alloy';
+    } catch (error) {
+        console.error('Failed to load voices:', error);
+        console.log('Failed to load voices. Please refresh the page.');
+    }
+}
+async function loadFormats() {
+    try {
+        // Prepare headers for API key if available (OpenAI compatible format)
+        const headers = {};
+        const apiKeyInput = document.getElementById('api-key-input');
+        if (apiKeyInput && apiKeyInput.value.trim()) {
+            headers['Authorization'] = `Bearer ${apiKeyInput.value.trim()}`;
+        }
+        const response = await fetch('/api/formats', { headers });
+        const data = await response.json();
+        const select = document.getElementById('format-select');
+        select.innerHTML = '';
+        data.formats.forEach(format => {
+            const option = document.createElement('option');
+            option.value = format.id;
+            option.textContent = `${format.name} - ${format.description}`;
+            select.appendChild(option);
+        });
+        // Select default format
+        select.value = 'mp3';
+        updateFormatInfo();
+    } catch (error) {
+        console.error('Failed to load formats:', error);
+        console.log('Failed to load formats. Please refresh the page.');
+    }
+}
+function updateCharCount() {
+    const textInput = document.getElementById('text-input');
+    const maxLengthInput = document.getElementById('max-length-input');
+    const charCountElement = document.getElementById('char-count');
+    if (!textInput || !maxLengthInput || !charCountElement) {
+        console.warn('Required elements not found for updateCharCount');
+        return;
+    }
+    const text = textInput.value;
+    const maxLength = parseInt(maxLengthInput.value) || 4096;
+    const charCount = text.length;
+    charCountElement.textContent = charCount.toLocaleString();
+    // Update length status with better visual feedback
+    const statusElement = document.getElementById('length-status');
+    if (statusElement) {
+        const percentage = (charCount / maxLength) * 100;
+        if (charCount > maxLength) {
+            statusElement.innerHTML = '<span class="badge bg-danger"><i class="fas fa-exclamation-triangle me-1"></i>Exceeds limit</span>';
+        } else if (percentage > 80) {
+            statusElement.innerHTML = '<span class="badge bg-warning"><i class="fas fa-exclamation me-1"></i>Near limit</span>';
+        } else if (percentage > 50) {
+            statusElement.innerHTML = '<span class="badge bg-info"><i class="fas fa-info me-1"></i>Good</span>';
+        } else {
+            statusElement.innerHTML = '<span class="badge bg-success"><i class="fas fa-check me-1"></i>OK</span>';
+        }
+    }
+    updateGenerateButton();
+    updateAutoCombineStatus();
+}
+function updateGenerateButton() {
+    const text = document.getElementById('text-input').value;
+    const maxLength = parseInt(document.getElementById('max-length-input').value) || 4096;
+    const autoCombineCheck = document.getElementById('auto-combine-check');
+    const autoCombine = autoCombineCheck ? autoCombineCheck.checked : false;
+    const generateBtn = document.getElementById('generate-btn');
+    if (!generateBtn) {
+        console.warn('Generate button not found');
+        return;
+    }
+    const btnText = generateBtn.querySelector('.btn-text');
+    if (!btnText) {
+        console.warn('Button text element not found');
+        return;
+    }
+    if (text.length > maxLength && autoCombine) {
+        btnText.innerHTML = '<i class="fas fa-magic me-2"></i>Generate Speech (Auto-Combine)';
+        generateBtn.classList.add('btn-warning');
+        generateBtn.classList.remove('btn-primary');
+    } else {
+        btnText.innerHTML = '<i class="fas fa-magic me-2"></i>Generate Speech';
+        generateBtn.classList.add('btn-primary');
+        generateBtn.classList.remove('btn-warning');
+    }
+}
+async function validateText() {
+    const text = document.getElementById('text-input').value.trim();
+    const maxLength = parseInt(document.getElementById('max-length-input').value) || 4096;
+    if (!text) {
+        console.log('Please enter some text to validate');
+        return;
+    }
+    const validateBtn = document.getElementById('validate-text-btn');
+    setLoading(validateBtn, true);
+    try {
+        const response = await fetch('/api/validate-text', {
+            method: 'POST',
+            headers: { 'Content-Type': 'application/json' },
+            body: JSON.stringify({ text, max_length: maxLength })
+        });
+        const data = await response.json();
+        const resultDiv = document.getElementById('validation-result');
+        if (data.is_valid) {
+            resultDiv.innerHTML = `
+                <div class="alert alert-success fade-in">
+                    <i class="fas fa-check-circle me-2"></i>
+                    <strong>Text is valid!</strong> (${data.text_length.toLocaleString()} characters)
+                    <div class="progress progress-custom mt-2">
+                        <div class="progress-bar-custom" style="width: ${(data.text_length / data.max_length) * 100}%"></div>
+                    </div>
+                </div>
+            `;
+        } else {
+            resultDiv.innerHTML = `
+                <div class="alert alert-warning fade-in">
+                    <i class="fas fa-exclamation-triangle me-2"></i>
+                    <strong>Text exceeds limit!</strong> (${data.text_length.toLocaleString()}/${data.max_length.toLocaleString()} characters)
+                    <br><small class="mt-2 d-block">Suggested chunks: ${data.suggested_chunks}</small>
+                    <div class="mt-3">
+                        <strong>Preview of chunks:</strong>
+                        <div class="mt-2">
+                            ${data.chunk_preview.map((chunk, i) => `
+                                <div class="border rounded p-2 mb-2 bg-light">
+                                    <small class="text-muted">Chunk ${i+1}:</small>
+                                    <div class="small">${String(chunk).replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')}</div>
+                                </div>
+                            `).join('')}
+                        </div>
+                    </div>
+                </div>
+            `;
+        }
+        resultDiv.classList.remove('d-none');
+        resultDiv.scrollIntoView({ behavior: 'smooth', block: 'nearest' });
+    } catch (error) {
+        console.error('Validation failed:', error);
+        console.log('Failed to validate text. Please try again.');
+    } finally {
+        setLoading(validateBtn, false);
+    }
+}
+function updateAutoCombineStatus() {
+    const autoCombineCheck = document.getElementById('auto-combine-check');
+    const statusBadge = document.getElementById('auto-combine-status');
+    const textInput = document.getElementById('text-input');
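+    const maxLengthInput = document.getElementById('max-length-input');
+    // Bail out before reading any values if a required element is missing
+    if (!autoCombineCheck || !statusBadge || !textInput || !maxLengthInput) return;
+    const maxLength = parseInt(maxLengthInput.value, 10) || 4096;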
+    const isAutoCombineEnabled = autoCombineCheck.checked;
+    const textLength = textInput.value.length;
+    const isLongText = textLength > maxLength;
+    // Show/hide status badge
+    if (isAutoCombineEnabled && isLongText) {
+        statusBadge.classList.remove('d-none');
+        statusBadge.classList.add('bg-success');
+        statusBadge.classList.remove('bg-warning');
+        statusBadge.innerHTML = '<i class="fas fa-magic me-1"></i>Auto-combine enabled';
+    } else if (!isAutoCombineEnabled && isLongText) {
+        statusBadge.classList.remove('d-none');
+        statusBadge.classList.add('bg-warning');
+        statusBadge.classList.remove('bg-success');
+        statusBadge.innerHTML = '<i class="fas fa-exclamation-triangle me-1"></i>Long text detected';
+    } else {
+        statusBadge.classList.add('d-none');
+    }
+    // Do not call updateCharCount() here: it calls this function, so doing so would recurse infinitely
+}
+async function generateSpeech(event) {
+    console.log('generateSpeech function called');
+    // Prevent default form submission behavior
+    if (event) {
+        event.preventDefault();
+        event.stopPropagation();
+    }
+    const button = document.getElementById('generate-btn');
+    const audioResult = document.getElementById('audio-result');
+    // Get form data
+    const formData = getFormData();
+    if (!validateFormData(formData)) {
+        console.log('Form validation failed');
+        return false;
+    }
+    // Show loading state
+    setLoading(button, true);
+    clearResults();
+    try {
+        console.log('Starting speech generation...');
+        // Always use the unified endpoint with auto-combine
+        await generateUnifiedSpeech(formData);
+        console.log('Speech generation completed successfully');
+    } catch (error) {
+        console.error('Generation failed:', error);
+        console.log(`Failed to generate speech: ${error.message}`);
+    } finally {
+        setLoading(button, false);
+    }
+    return false; // Ensure form doesn't submit
+}
+function getFormData() {
+    return {
+        text: document.getElementById('text-input').value.trim(),
+        voice: document.getElementById('voice-select').value,
+        format: document.getElementById('format-select').value,
+        instructions: document.getElementById('instructions-input').value.trim(),
+        maxLength: parseInt(document.getElementById('max-length-input').value) || 4096,
+        validateLength: document.getElementById('validate-length-check').checked,
+        autoCombine: document.getElementById('auto-combine-check').checked,
+        apiKey: (document.getElementById('api-key-input')?.value || '').trim()
+    };
+}
+function validateFormData(formData) {
+    if (!formData.text || !formData.voice || !formData.format) {
+        console.log('Please fill in all required fields');
+        return false;
+    }
+    if (formData.text.length > formData.maxLength && formData.validateLength && !formData.autoCombine) {
+        console.log(`Text is too long (${formData.text.length} characters). Enable auto-combine or reduce text length.`);
+        return false;
+    }
+    return true;
+}
+function clearResults() {
+    document.getElementById('audio-result').classList.add('d-none');
+    const batchResult = document.getElementById('batch-result');
+    if (batchResult) {
+        batchResult.classList.add('d-none');
+    }
+    document.getElementById('validation-result').classList.add('d-none');
+}
+// Utility functions
+function setLoading(button, loading) {
+    if (loading) {
+        button.classList.add('loading');
+        button.disabled = true;
+    } else {
+        button.classList.remove('loading');
+        button.disabled = false;
+    }
+}
+// New unified function using OpenAI-compatible endpoint with auto-combine
+async function generateUnifiedSpeech(formData) {
+    const audioResult = document.getElementById('audio-result');
+    // Prepare headers
+    const headers = { 'Content-Type': 'application/json' };
+    // Add API key if provided (OpenAI compatible format)
+    if (formData.apiKey) {
+        headers['Authorization'] = `Bearer ${formData.apiKey}`;
+    }
+    const response = await fetch('/v1/audio/speech', {
+        method: 'POST',
+        headers: headers,
+        body: JSON.stringify({
+            model: 'gpt-4o-mini-tts',
+            input: formData.text,
+            voice: formData.voice,
+            response_format: formData.format,
+            instructions: formData.instructions || undefined,
+            auto_combine: formData.autoCombine,
+            max_length: formData.maxLength
+        })
+    });
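+    if (!response.ok) {
+        // The error body may not be JSON (e.g. a proxy error page), so fall back to an empty object
+        const errorData = await response.json().catch(() => ({}));
+        const errorMessage = errorData?.error?.message || errorData?.error || `HTTP ${response.status}`;
+        throw new Error(errorMessage);
+    }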
+    // Get audio data
+    const audioBlob = await response.blob();
+    currentAudioBlob = audioBlob;
+    currentFormat = formData.format;
+    // displayAudioResult() creates the object URL and assigns it to the player,
+    // so only the player reference is needed here (avoids a second, leaked URL)
+    const audioPlayer = document.getElementById('audio-player');
+    // Get response headers for enhanced display
+    const chunksCount = response.headers.get('X-Chunks-Combined') || '1';
+    const autoCombineUsed = response.headers.get('X-Auto-Combine') === 'true';
+    const originalLength = response.headers.get('X-Original-Text-Length');
+    // Use enhanced display function with new metadata
+    displayAudioResult(audioBlob, formData.format, formData.voice, formData.text, {
+        chunksCount,
+        autoCombineUsed,
+        originalLength
+    });
+    console.log('Speech generated successfully! Click play to listen.');
+    if (autoCombineUsed && chunksCount > 1) {
+        console.log(`Auto-combine feature combined ${chunksCount} chunks into a single audio file.`);
+    }
+    // Auto-play if user prefers
+    if (localStorage.getItem('autoPlay') === 'true') {
+        audioPlayer.play().catch(() => {
+            // Auto-play blocked, that's fine
+        });
+    }
+}
+// Legacy function for backward compatibility
+async function generateSingleSpeech(formData) {
+    // Use the new unified function
+    await generateUnifiedSpeech(formData);
+}
+function downloadAudio() {
+    if (!currentAudioBlob) {
+        console.log('No audio to download');
+        return;
+    }
+    const url = URL.createObjectURL(currentAudioBlob);
+    const timestamp = new Date().toISOString().slice(0, 19).replace(/:/g, '-');
+    downloadFromUrl(url, `ttsfm-speech-${timestamp}.${currentFormat}`);
+    // Revoke on a later tick so the browser has started the download first
+    setTimeout(() => URL.revokeObjectURL(url), 0);
+}
+function downloadFromUrl(url, filename) {
+    const a = document.createElement('a');
+    a.href = url;
+    a.download = filename;
+    a.style.display = 'none';
+    document.body.appendChild(a);
+    a.click();
+    document.body.removeChild(a);
+}
+// New enhanced functions
+function clearText() {
+    document.getElementById('text-input').value = '';
+    updateCharCount();
+    clearResults();
+    console.log('Text cleared successfully');
+}
+function loadRandomText() {
+    const randomTexts = [
+        // News & Information
+        "Breaking news: Scientists have discovered a revolutionary new method for generating incredibly natural synthetic speech using advanced neural networks and machine learning algorithms.",
+        "Weather update: Today will be partly cloudy with temperatures reaching 75 degrees Fahrenheit. Light winds from the southwest at 5 to 10 miles per hour.",
+        "Technology report: The latest advancements in artificial intelligence are revolutionizing how we interact with digital devices and services.",
+        // Educational & Informative
+        "The human brain contains approximately 86 billion neurons, each connected to thousands of others, creating a complex network that enables consciousness, memory, and thought.",
+        "Photosynthesis is the process by which plants convert sunlight, carbon dioxide, and water into glucose and oxygen, forming the foundation of most life on Earth.",
+        "The speed of light in a vacuum is exactly 299,792,458 meters per second, making it one of the fundamental constants of physics.",
+        // Creative & Storytelling
+        "Once upon a time, in a land far away, there lived a wise old wizard who could speak to the stars and understand their ancient secrets.",
+        "The mysterious lighthouse stood alone on the rocky cliff, its beacon cutting through the fog like a sword of light, guiding lost ships safely home.",
+        "In the depths of the enchanted forest, where sunbeams danced through emerald leaves, a young adventurer discovered a hidden path to destiny.",
+        // Business & Professional
+        "Our quarterly results demonstrate strong growth across all market segments, with revenue increasing by 23% compared to the same period last year.",
+        "The new product launch exceeded expectations, capturing 15% market share within the first six months and establishing our brand as an industry leader.",
+        "We are committed to sustainable business practices that benefit our customers, employees, and the environment for generations to come.",
+        // Technical & Programming
+        "The TTSFM package provides a comprehensive API for text-to-speech generation with support for multiple voices and audio formats.",
+        "Machine learning algorithms process vast amounts of data to identify patterns and make predictions with remarkable accuracy.",
+        "Cloud computing has transformed how businesses store, process, and access their data, enabling scalability and flexibility like never before.",
+        // Conversational & Casual
+        "Welcome to TTSFM! Experience the future of text-to-speech technology with our premium AI voices.",
+        "Good morning! Today is a beautiful day to learn something new and explore the possibilities of text-to-speech technology.",
+        "Have you ever wondered what it would be like if your computer could speak with perfect human-like intonation and emotion?"
+    ];
+    const randomText = randomTexts[Math.floor(Math.random() * randomTexts.length)];
+    document.getElementById('text-input').value = randomText;
+    updateCharCount();
+    console.log('Random text loaded successfully');
+}
+function resetForm() {
+    // Reset form to default values
+    document.getElementById('text-input').value = 'Welcome to TTSFM! Experience the future of text-to-speech technology with our premium AI voices. Generate natural, expressive speech for any application.';
+    document.getElementById('voice-select').value = 'alloy';
+    document.getElementById('format-select').value = 'mp3';
+    document.getElementById('instructions-input').value = '';
+    document.getElementById('max-length-input').value = '4096';
+    document.getElementById('validate-length-check').checked = true;
+    const autoCombineCheck = document.getElementById('auto-combine-check');
+    if (autoCombineCheck) {
+        autoCombineCheck.checked = true;
+    }
+    updateCharCount();
+    updateGenerateButton();
+    clearResults();
+    console.log('Form reset to default values');
+}
+function replayAudio() {
+    const audioPlayer = document.getElementById('audio-player');
+    if (audioPlayer && audioPlayer.src) {
+        audioPlayer.currentTime = 0;
+        audioPlayer.play().catch(() => {
+            console.log('Unable to replay audio. Please check your browser settings.');
+        });
+    }
+}
+function shareAudio() {
+    if (navigator.share && currentAudioBlob) {
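+        // 'audio/mpeg' is the registered MIME type for MP3; other formats keep the audio/<format> form
+        const mimeType = currentFormat === 'mp3' ? 'audio/mpeg' : `audio/${currentFormat}`;
+        const file = new File([currentAudioBlob], `ttsfm-speech.${currentFormat}`, {
+            type: mimeType
+        });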
+        navigator.share({
+            title: 'TTSFM Generated Speech',
+            text: 'Check out this speech generated with TTSFM!',
+            files: [file]
+        }).catch(() => {
+            // Fallback to copying link
+            copyAudioLink();
+        });
+    } else {
+        copyAudioLink();
+    }
+}
+function copyAudioLink() {
+    const audioPlayer = document.getElementById('audio-player');
+    if (audioPlayer && audioPlayer.src) {
+        // Note: the src is a blob: URL, which only resolves inside this page session
+        navigator.clipboard.writeText(audioPlayer.src).then(() => {
+            console.log('Audio link copied to clipboard!');
+        }).catch(() => {
+            console.log('Unable to copy link. Please try downloading the audio instead.');
+        });
+    }
+}
+function updateVoiceInfo() {
+    const voiceSelect = document.getElementById('voice-select');
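+    const previewBtn = document.getElementById('preview-voice-btn');
+    // The preview button is optional; skip the update if it is not on the page
+    if (!voiceSelect || !previewBtn) return;
+    if (voiceSelect.value) {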
+        previewBtn.disabled = false;
+        previewBtn.onclick = () => previewVoice(voiceSelect.value);
+    } else {
+        previewBtn.disabled = true;
+    }
+}
+function updateFormatInfo() {
+    const formatSelect = document.getElementById('format-select');
+    const formatInfo = document.getElementById('format-info');
+    const formatDescriptions = {
+        'mp3': '🎵 MP3 - Good quality, small file size. Best for web and general use.',
+        'opus': '📻 OPUS - Excellent quality, small file size. Best for streaming and VoIP.',
+        'aac': '📱 AAC - Good quality, medium file size. Best for Apple devices and streaming.',
+        'flac': '💿 FLAC - Lossless quality, large file size. Best for archival and high-quality audio.',
+        'wav': '🎧 WAV - Lossless quality, large file size. Best for professional audio production.',
+        'pcm': '🔊 PCM - Raw audio data, large file size. Best for audio processing.'
+    };
+    if (formatInfo && formatSelect.value) {
+        formatInfo.textContent = formatDescriptions[formatSelect.value] || 'High-quality audio format';
+    }
+}
+function previewVoice(voiceId) {
+    // This would typically play a short preview of the voice
+    console.log(`Voice preview for ${voiceId} - Feature coming soon!`);
+}
+// Enhanced audio result display with auto-combine metadata
+function displayAudioResult(audioBlob, format, voice, text, metadata = {}) {
+    const audioResult = document.getElementById('audio-result');
+    const audioPlayer = document.getElementById('audio-player');
+    const audioInfo = document.getElementById('audio-info');
+    // Create audio URL and setup player
+    const audioUrl = URL.createObjectURL(audioBlob);
+    audioPlayer.src = audioUrl;
+    // Update audio stats
+    const sizeKB = (audioBlob.size / 1024).toFixed(1);
+    document.getElementById('audio-size').textContent = `${sizeKB} KB`;
+    document.getElementById('audio-format').textContent = format.toUpperCase();
+    document.getElementById('audio-voice').textContent = voice.charAt(0).toUpperCase() + voice.slice(1);
+    // Update audio info safely without innerHTML
+    // Clear existing content
+    audioInfo.textContent = '';
+    // Create and append icon element
+    const icon = document.createElement('i');
+    icon.className = 'fas fa-check-circle text-success me-1';
+    audioInfo.appendChild(icon);
+    // Create info text with auto-combine details
+    let infoText = `Generated successfully • ${sizeKB} KB • ${format.toUpperCase()}`;
+    if (metadata.autoCombineUsed && metadata.chunksCount > 1) {
+        infoText += ` • Auto-combined ${metadata.chunksCount} chunks`;
+        // Add a special badge for auto-combine
+        const badge = document.createElement('span');
+        badge.className = 'badge bg-primary ms-2';
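+        // Build the badge with DOM nodes, consistent with the no-innerHTML approach above
+        const badgeIcon = document.createElement('i');
+        badgeIcon.className = 'fas fa-magic me-1';
+        badge.appendChild(badgeIcon);
+        badge.appendChild(document.createTextNode('Auto-combined'));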
+        audioInfo.appendChild(document.createTextNode(infoText));
+        audioInfo.appendChild(badge);
+    } else {
+        // Create and append text content (safely escaped)
+        const textNode = document.createTextNode(infoText);
+        audioInfo.appendChild(textNode);
+    }
+    // Show result with animation
+    audioResult.classList.remove('d-none');
+    audioResult.classList.add('fade-in');
+    // Update duration when metadata loads
+    audioPlayer.addEventListener('loadedmetadata', function() {
+        const duration = Math.round(audioPlayer.duration);
+        document.getElementById('audio-duration').textContent = `${duration}s`;
+    }, { once: true });
+    // Scroll to result
+    audioResult.scrollIntoView({ behavior: 'smooth', block: 'nearest' });
+}
+// API Key visibility toggle function
+function toggleApiKeyVisibility() {
+    const apiKeyInput = document.getElementById('api-key-input');
+    const eyeIcon = document.getElementById('api-key-eye-icon');
+    if (apiKeyInput.type === 'password') {
+        apiKeyInput.type = 'text';
+        eyeIcon.className = 'fas fa-eye-slash';
+    } else {
+        apiKeyInput.type = 'password';
+        eyeIcon.className = 'fas fa-eye';
+    }
+}
+// Export functions for use in HTML
+window.clearText = clearText;
+window.loadRandomText = loadRandomText;
+window.resetForm = resetForm;
+window.toggleApiKeyVisibility = toggleApiKeyVisibility;

ttsfm-web/static/js/websocket-tts.js ADDED

	@@ -0,0 +1,366 @@

+/**
+ * WebSocket TTS Streaming Client
+ *
+ * Because apparently HTTP requests are so 2023.
+ * Now we need real-time streaming for everything.
+ */
+class WebSocketTTSClient {
+    constructor(options = {}) {
+        this.socketUrl = options.socketUrl || window.location.origin;
+        this.socket = null;
+        this.activeRequests = new Map();
+        this.reconnectAttempts = 0;
+        this.maxReconnectAttempts = options.maxReconnectAttempts || 5;
+        this.reconnectDelay = options.reconnectDelay || 1000;
+        this.debug = options.debug || false;
+        // Audio context for seamless playback
+        this.audioContext = null;
+        this.audioQueue = new Map(); // request_id -> audio chunks
+        // Event handlers
+        this.onConnect = options.onConnect || (() => {});
+        this.onDisconnect = options.onDisconnect || (() => {});
+        this.onError = options.onError || ((error) => console.error('WebSocket error:', error));
+        // Initialize
+        this.connect();
+    }
+    connect() {
+        if (this.socket && this.socket.connected) {
+            this.log('Already connected');
+            return;
+        }
+        this.log('Connecting to WebSocket server...');
+        // Initialize Socket.IO connection
+        this.socket = io(this.socketUrl, {
+            transports: ['websocket', 'polling'],
+            reconnection: true,
+            reconnectionAttempts: this.maxReconnectAttempts,
+            reconnectionDelay: this.reconnectDelay
+        });
+        // Set up event handlers
+        this.setupEventHandlers();
+    }
+    setupEventHandlers() {
+        // Connection events
+        this.socket.on('connect', () => {
+            this.log('Connected to WebSocket server');
+            this.reconnectAttempts = 0;
+            this.onConnect();
+        });
+        this.socket.on('disconnect', (reason) => {
+            this.log('Disconnected from WebSocket server:', reason);
+            this.onDisconnect(reason);
+        });
+        this.socket.on('connect_error', (error) => {
+            this.log('Connection error:', error);
+            this.reconnectAttempts++;
+            this.onError({
+                type: 'connection_error',
+                message: error.message,
+                attempts: this.reconnectAttempts
+            });
+        });
+        // TTS streaming events
+        this.socket.on('connected', (data) => {
+            this.log('Session established:', data.session_id);
+        });
+        this.socket.on('stream_started', (data) => {
+            this.log('Stream started:', data.request_id);
+            const request = this.activeRequests.get(data.request_id);
+            if (request && request.onStart) {
+                request.onStart(data);
+            }
+        });
+        this.socket.on('audio_chunk', (data) => {
+            this.handleAudioChunk(data);
+        });
+        this.socket.on('stream_progress', (data) => {
+            this.handleProgress(data);
+        });
+        this.socket.on('stream_complete', (data) => {
+            this.handleStreamComplete(data);
+        });
+        this.socket.on('stream_error', (data) => {
+            this.handleStreamError(data);
+        });
+    }
+    /**
+     * Generate speech with real-time streaming
+     */
+    generateSpeech(text, options = {}) {
+        return new Promise((resolve, reject) => {
+            if (!this.socket || !this.socket.connected) {
+                reject(new Error('WebSocket not connected'));
+                return;
+            }
+            const requestId = this.generateRequestId();
+            const audioChunks = [];
+            // Store request info
+            this.activeRequests.set(requestId, {
+                resolve,
+                reject,
+                audioChunks,
+                options,
+                startTime: Date.now(),
+                onStart: options.onStart,
+                onProgress: options.onProgress,
+                onChunk: options.onChunk,
+                onComplete: options.onComplete,
+                onError: options.onError
+            });
+            // Initialize audio queue for this request
+            this.audioQueue.set(requestId, []);
+            // Emit generation request
+            this.socket.emit('generate_stream', {
+                request_id: requestId,
+                text: text,
+                voice: options.voice || 'alloy',
+                format: options.format || 'mp3',
+                chunk_size: options.chunkSize || 1024
+            });
+            this.log('Requested speech generation:', requestId);
+        });
+    }
+    handleAudioChunk(data) {
+        const request = this.activeRequests.get(data.request_id);
+        if (!request) {
+            this.log('Received chunk for unknown request:', data.request_id);
+            return;
+        }
+        // Convert hex string back to binary
+        const audioData = this.hexToArrayBuffer(data.audio_data);
+        // Store chunk
+        request.audioChunks.push({
+            index: data.chunk_index,
+            data: audioData,
+            duration: data.duration,
+            format: data.format
+        });
+        // Add to audio queue for streaming playback
+        const queue = this.audioQueue.get(data.request_id);
+        if (queue) {
+            queue.push(audioData);
+        }
+        // Call chunk handler if provided
+        if (request.onChunk) {
+            request.onChunk({
+                chunkIndex: data.chunk_index,
+                totalChunks: data.total_chunks,
+                audioData: audioData,
+                duration: data.duration,
+                text: data.chunk_text
+            });
+        }
+        this.log(`Received chunk ${data.chunk_index + 1}/${data.total_chunks} for request ${data.request_id}`);
+    }
+    handleProgress(data) {
+        const request = this.activeRequests.get(data.request_id);
+        if (request && request.onProgress) {
+            request.onProgress({
+                progress: data.progress,
+                chunksCompleted: data.chunks_completed,
+                totalChunks: data.total_chunks,
+                status: data.status
+            });
+        }
+    }
+    handleStreamComplete(data) {
+        const request = this.activeRequests.get(data.request_id);
+        if (!request) {
+            this.log('Completion for unknown request:', data.request_id);
+            return;
+        }
+        // Sort chunks by index
+        request.audioChunks.sort((a, b) => a.index - b.index);
+        // Combine all audio chunks
+        const combinedAudio = this.combineAudioChunks(request.audioChunks);
+        const result = {
+            requestId: data.request_id,
+            audioData: combinedAudio,
+            chunks: request.audioChunks,
+            // Treat a missing per-chunk duration as 0 so the total never becomes NaN
+            duration: request.audioChunks.reduce((sum, chunk) => sum + (chunk.duration || 0), 0),
+            generationTime: Date.now() - request.startTime,
+            format: request.audioChunks[0]?.format || 'mp3'
+        };
+        // Call complete handler
+        if (request.onComplete) {
+            request.onComplete(result);
+        }
+        // Resolve promise
+        request.resolve(result);
+        // Cleanup
+        this.activeRequests.delete(data.request_id);
+        this.audioQueue.delete(data.request_id);
+        this.log('Stream completed:', data.request_id);
+    }
+    handleStreamError(data) {
+        const request = this.activeRequests.get(data.request_id);
+        if (!request) {
+            this.log('Error for unknown request:', data.request_id);
+            return;
+        }
+        const error = new Error(data.error);
+        error.requestId = data.request_id;
+        error.timestamp = data.timestamp;
+        // Call error handler
+        if (request.onError) {
+            request.onError(error);
+        }
+        // Reject promise
+        request.reject(error);
+        // Cleanup
+        this.activeRequests.delete(data.request_id);
+        this.audioQueue.delete(data.request_id);
+        this.log('Stream error:', data.request_id, data.error);
+    }
+    /**
+     * Cancel an active stream
+     */
+    cancelStream(requestId) {
+        if (!this.socket || !this.socket.connected) {
+            throw new Error('WebSocket not connected');
+        }
+        this.socket.emit('cancel_stream', { request_id: requestId });
+        // Clean up local state
+        const request = this.activeRequests.get(requestId);
+        if (request) {
+            request.reject(new Error('Stream cancelled by user'));
+            this.activeRequests.delete(requestId);
+            this.audioQueue.delete(requestId);
+        }
+    }
+    /**
+     * Combine audio chunks into a single buffer
+     */
+    combineAudioChunks(chunks) {
+        if (chunks.length === 0) return new ArrayBuffer(0);
+        // Calculate total size
+        const totalSize = chunks.reduce((sum, chunk) => sum + chunk.data.byteLength, 0);
+        // Create combined buffer
+        const combined = new ArrayBuffer(totalSize);
+        const view = new Uint8Array(combined);
+        let offset = 0;
+        for (const chunk of chunks) {
+            view.set(new Uint8Array(chunk.data), offset);
+            offset += chunk.data.byteLength;
+        }
+        return combined;
+    }
+    /**
+     * Play audio directly (experimental streaming playback)
+     */
+    async playAudioStream(requestId) {
+        if (!this.audioContext) {
+            this.audioContext = new (window.AudioContext || window.webkitAudioContext)();
+        }
+        const queue = this.audioQueue.get(requestId);
+        if (!queue) {
+            throw new Error('No audio queue found for request');
+        }
+        // This is a simplified version - real implementation would need
+        // proper audio decoding and buffering for seamless playback
+        this.log('Streaming audio playback not fully implemented yet');
+    }
+    /**
+     * Utility functions
+     */
+    hexToArrayBuffer(hex) {
+        const bytes = new Uint8Array(hex.length / 2);
+        for (let i = 0; i < hex.length; i += 2) {
+            bytes[i / 2] = parseInt(hex.slice(i, i + 2), 16);
+        }
+        return bytes.buffer;
+    }
+    generateRequestId() {
+        return `req_${Date.now()}_${Math.random().toString(36).slice(2, 11)}`;
+    }
+    log(...args) {
+        if (this.debug) {
+            console.log('[WebSocketTTS]', ...args);
+        }
+    }
+    /**
+     * Get connection status
+     */
+    isConnected() {
+        return Boolean(this.socket && this.socket.connected);
+    }
+    /**
+     * Disconnect from server
+     */
+    disconnect() {
+        if (this.socket) {
+            this.socket.disconnect();
+            this.socket = null;
+        }
+        // Clear all active requests
+        for (const request of this.activeRequests.values()) {
+            request.reject(new Error('Client disconnected'));
+        }
+        this.activeRequests.clear();
+        this.audioQueue.clear();
+    }
+}
+// Export for use
+window.WebSocketTTSClient = WebSocketTTSClient;
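
Note on the client above: the core of `handleAudioChunk` and `handleStreamComplete` is decoding each hex-encoded `audio_data` payload, ordering the chunks by `chunk_index`, and concatenating the bytes. The following is a minimal standalone sketch of that step, not code from this commit. The helper names `hexToBytes` and `combineChunks`, the input validation, and the sample payloads are assumptions made for illustration; only the field names `chunk_index` and `audio_data` come from the diff.

```javascript
// Illustrative sketch (not part of the commit): mirrors the decode-and-combine
// steps of WebSocketTTSClient without a socket or a browser.

// Decode a hex string (two characters per byte) into a Uint8Array.
// The validation here is an added assumption; the client in the diff does not validate.
function hexToBytes(hex) {
    if (typeof hex !== 'string' || hex.length % 2 !== 0 || /[^0-9a-fA-F]/.test(hex)) {
        throw new Error('Invalid hex payload');
    }
    const bytes = new Uint8Array(hex.length / 2);
    for (let i = 0; i < hex.length; i += 2) {
        bytes[i / 2] = parseInt(hex.slice(i, i + 2), 16);
    }
    return bytes;
}

// Order chunks by index and concatenate their payloads into one buffer.
function combineChunks(chunks) {
    const ordered = [...chunks].sort((a, b) => a.index - b.index);
    const total = ordered.reduce((sum, chunk) => sum + chunk.data.byteLength, 0);
    const combined = new Uint8Array(total);
    let offset = 0;
    for (const chunk of ordered) {
        combined.set(chunk.data, offset);
        offset += chunk.data.byteLength;
    }
    return combined;
}

// Hypothetical audio_chunk payloads, arriving out of order.
const received = [
    { chunk_index: 1, audio_data: '0304' },
    { chunk_index: 0, audio_data: '0102' },
    { chunk_index: 2, audio_data: 'ff' }
];

const chunks = received.map((msg) => ({
    index: msg.chunk_index,
    data: hexToBytes(msg.audio_data)
}));

const combined = combineChunks(chunks);
console.log(Array.from(combined)); // [ 1, 2, 3, 4, 255 ]
```

Sorting a copy rather than the original array keeps the received order available for debugging. Whether the concatenated bytes form a playable file depends on the audio format and on how the server splits the stream, which this sketch does not address.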

ttsfm-web/templates/base.html CHANGED

@@ -1,356 +1,363 @@
-<!DOCTYPE html>
-<html lang="en">
-<head>
-        <!-- Cronitor RUM -->
-    <script async src="https://rum.cronitor.io/script.js"></script>
-    <script>
-        window.cronitor = window.cronitor || function() { (window.cronitor.q = window.cronitor.q || []).push(arguments); };
-        cronitor('config', { clientKey: 'bdc4a3faf9c16d842b5099e1a0e3ba6f' });
-    </script>
-    <meta charset="UTF-8">
-    <meta name="viewport" content="width=device-width, initial-scale=1.0">
-    <title>{% block title %}TTSFM - Text-to-Speech{% endblock %}</title>
-    <!-- Bootstrap CSS -->
-    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css" rel="stylesheet">
-    <!-- Font Awesome -->
-    <link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0/css/all.min.css" rel="stylesheet">
-    <!-- Google Fonts -->
-    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&display=swap" rel="stylesheet">
-    <!-- Custom CSS -->
-    <link href="{{ url_for('static', filename='css/style.css') }}" rel="stylesheet">
-    <!-- Additional Performance Optimizations -->
-    <link rel="preconnect" href="https://fonts.googleapis.com">
-    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
-    <!-- Favicon -->
-    <link rel="icon" type="image/svg+xml" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>🎤</text></svg>">
-    <!-- Meta tags for better SEO and social sharing -->
-    <meta name="description" content="TTSFM - A Python client for text-to-speech APIs. Simple to use with support for multiple voices and audio formats.">
-    <meta name="keywords" content="text-to-speech, TTS, python, API, voice synthesis, audio generation">
-    <meta name="author" content="TTSFM">
-    <!-- Open Graph / Facebook -->
-    <meta property="og:type" content="website">
-    <meta property="og:url" content="{{ request.url }}">
-    <meta property="og:title" content="{% block og_title %}TTSFM - Python Text-to-Speech Client{% endblock %}">
-    <meta property="og:description" content="A Python client for text-to-speech APIs. Simple to use with support for multiple voices and audio formats.">
-    <!-- Twitter -->
-    <meta property="twitter:card" content="summary">
-    <meta property="twitter:url" content="{{ request.url }}">
-    <meta property="twitter:title" content="{% block twitter_title %}TTSFM - Python Text-to-Speech Client{% endblock %}">
-    <meta property="twitter:description" content="A Python client for text-to-speech APIs. Simple to use with support for multiple voices and audio formats.">
-    {% block extra_css %}{% endblock %}
-</head>
-<body>
-    <!-- Skip to content link for accessibility -->
-    <a href="#main-content" class="skip-link">Skip to main content</a>
-    <!-- Clean Navigation -->
-    <nav class="navbar navbar-expand-lg fixed-top" style="background-color: rgba(255, 255, 255, 0.95); backdrop-filter: blur(10px); border-bottom: 1px solid #e5e7eb;">
-        <div class="container">
-            <a class="navbar-brand" href="{{ url_for('index') }}">
-                <i class="fas fa-microphone-alt me-2"></i>
-                <span class="fw-bold">TTSFM</span>
-                <span class="badge bg-primary ms-2 small">v3.0</span>
-            </a>
-            <button class="navbar-toggler border-0" type="button" data-bs-toggle="collapse" data-bs-target="#navbarNav" aria-controls="navbarNav" aria-expanded="false" aria-label="Toggle navigation">
-                <span class="navbar-toggler-icon"></span>
-            </button>
-            <div class="collapse navbar-collapse" id="navbarNav">
-                <ul class="navbar-nav me-auto">
-                    <li class="nav-item">
-                        <a class="nav-link" href="{{ url_for('index') }}" aria-label="Home page">
-                            <i class="fas fa-home me-1"></i>Home
-                        </a>
-                    </li>
-                    <li class="nav-item">
-                        <a class="nav-link" href="{{ url_for('playground') }}" aria-label="Interactive playground">
-                            <i class="fas fa-play me-1"></i>Playground
-                        </a>
-                    </li>
-                    <li class="nav-item">
-                        <a class="nav-link" href="{{ url_for('docs') }}" aria-label="API documentation">
-                            <i class="fas fa-book me-1"></i>Documentation
-                        </a>
-                    </li>
-                </ul>
-                <ul class="navbar-nav">
-                    <li class="nav-item">
-                        <span class="navbar-text d-flex align-items-center">
-                            <span id="status-indicator" class="status-indicator status-offline" aria-hidden="true"></span>
-                            <span id="status-text" class="small">Checking...</span>
-                        </span>
-                    </li>
-                    <li class="nav-item ms-2">
-                        <a class="btn btn-outline-primary btn-sm" href="https://github.com/dbccccccc/ttsfm" target="_blank" rel="noopener noreferrer" aria-label="View source code on GitHub">
-                            <i class="fab fa-github me-1"></i>GitHub
-                        </a>
-                    </li>
-                </ul>
-            </div>
-        </div>
-    </nav>
-    <!-- Main Content -->
-    <main id="main-content" style="padding-top: 76px;">
-        {% block content %}{% endblock %}
-    </main>
-    <!-- Simplified Footer -->
-    <footer class="footer py-4" style="background-color: #f8fafc; border-top: 1px solid #e5e7eb;" role="contentinfo">
-        <div class="container">
-            <div class="row align-items-center">
-                <div class="col-md-6">
-                    <div class="d-flex align-items-center mb-2 mb-md-0">
-                        <i class="fas fa-microphone-alt me-2 text-primary"></i>
-                        <strong class="text-dark">TTSFM</strong>
-                        <span class="ms-2 text-muted">Free Text-to-Speech for Python</span>
-                    </div>
-                </div>
-                <div class="col-md-6 text-md-end">
-                    <div class="d-flex justify-content-md-end gap-3">
-                        <a href="{{ url_for('playground') }}" class="text-decoration-none" style="color: #6b7280;">
-                            <i class="fas fa-play me-1"></i>Demo
-                        </a>
-                        <a href="{{ url_for('docs') }}" class="text-decoration-none" style="color: #6b7280;">
-                            <i class="fas fa-book me-1"></i>Docs
-                        </a>
-                        <a href="https://github.com/dbccccccc/ttsfm" class="text-decoration-none" style="color: #6b7280;" target="_blank" rel="noopener noreferrer">
-                            <i class="fab fa-github me-1"></i>GitHub
-                        </a>
-                    </div>
-                </div>
-            </div>
-            <hr class="my-3" style="border-color: #e5e7eb;">
-            <div class="row align-items-center">
-                <div class="col-md-6">
-                    <small class="text-muted">&copy; 2024 TTSFM. MIT License.</small>
-                </div>
-                <div class="col-md-6 text-md-end">
-                    <small class="text-muted">
-                        <span id="footer-status" class="d-inline-flex align-items-center">
-                            <span class="status-indicator status-offline me-2"></span>
-                            Status: <span id="footer-status-text" class="ms-1">Checking...</span>
-                        </span>
-                    </small>
-                </div>
-            </div>
-        </div>
-    </footer>
-    <!-- Bootstrap JS -->
-    <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/js/bootstrap.bundle.min.js"></script>
-    <!-- Enhanced Common JavaScript -->
-    <script>
-        // Enhanced service status checking
-        async function checkStatus() {
-            try {
-                const response = await fetch('/api/health');
-                const data = await response.json();
-                const indicator = document.getElementById('status-indicator');
-                const text = document.getElementById('status-text');
-                const footerIndicator = document.querySelector('#footer-status .status-indicator');
-                const footerText = document.getElementById('footer-status-text');
-                if (response.ok && data.status === 'healthy') {
-                    // Update navbar status
-                    indicator.className = 'status-indicator status-online';
-                    text.textContent = 'Online';
-                    // Update footer status
-                    if (footerIndicator) footerIndicator.className = 'status-indicator status-online';
-                    if (footerText) footerText.textContent = 'Online';
-                } else {
-                    // Update navbar status
-                    indicator.className = 'status-indicator status-offline';
-                    text.textContent = 'Offline';
-                    // Update footer status
-                    if (footerIndicator) footerIndicator.className = 'status-indicator status-offline';
-                    if (footerText) footerText.textContent = 'Offline';
-                }
-            } catch (error) {
-                // Update navbar status
-                const indicator = document.getElementById('status-indicator');
-                const text = document.getElementById('status-text');
-                indicator.className = 'status-indicator status-offline';
-                text.textContent = 'Offline';
-                // Update footer status
-                const footerIndicator = document.querySelector('#footer-status .status-indicator');
-                const footerText = document.getElementById('footer-status-text');
-                if (footerIndicator) footerIndicator.className = 'status-indicator status-offline';
-                if (footerText) footerText.textContent = 'Offline';
-            }
-        }
-        // Enhanced page initialization
-        document.addEventListener('DOMContentLoaded', function() {
-            // Check status immediately and periodically
-            checkStatus();
-            setInterval(checkStatus, 30000); // Check every 30 seconds
-            // Initialize tooltips
-            if (typeof bootstrap !== 'undefined') {
-                const tooltipTriggerList = [].slice.call(document.querySelectorAll('[data-bs-toggle="tooltip"]'));
-                tooltipTriggerList.map(function (tooltipTriggerEl) {
-                    return new bootstrap.Tooltip(tooltipTriggerEl);
-                });
-            }
-            // Add smooth scrolling for anchor links
-            document.querySelectorAll('a[href^="#"]').forEach(anchor => {
-                anchor.addEventListener('click', function (e) {
-                    const target = document.querySelector(this.getAttribute('href'));
-                    if (target) {
-                        e.preventDefault();
-                        target.scrollIntoView({
-                            behavior: 'smooth',
-                            block: 'start'
-                        });
-                    }
-                });
-            });
-            // Add fade-in animation to main content
-            const mainContent = document.querySelector('main');
-            if (mainContent) {
-                mainContent.classList.add('fade-in');
-            }
-            // Add loading states to external links
-            document.querySelectorAll('a[target="_blank"]').forEach(link => {
-                link.addEventListener('click', function() {
-                    this.style.opacity = '0.7';
-                    setTimeout(() => {
-                        this.style.opacity = '1';
-                    }, 1000);
-                });
-            });
-        });
-        // Enhanced utility function to show loading state
-        function setLoading(button, loading) {
-            if (loading) {
-                button.classList.add('loading');
-                button.disabled = true;
-                button.style.cursor = 'wait';
-            } else {
-                button.classList.remove('loading');
-                button.disabled = false;
-                button.style.cursor = 'pointer';
-            }
-        }
-        // Enhanced utility function to show alerts
-        function showAlert(message, type = 'info', duration = 5000) {
-            const alertDiv = document.createElement('div');
-            alertDiv.className = `alert alert-${type} alert-dismissible fade show fade-in`;
-            alertDiv.style.position = 'relative';
-            alertDiv.style.zIndex = '1050';
-            alertDiv.innerHTML = `
-                <i class="fas fa-${getAlertIcon(type)} me-2"></i>
-                ${message}
-                <button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Close"></button>
-            `;
-            // Find the best container to insert the alert
-            const container = document.querySelector('main .container') || document.querySelector('.container') || document.body;
-            if (container) {
-                container.insertBefore(alertDiv, container.firstChild);
-                // Auto-dismiss after specified duration
-                setTimeout(() => {
-                    if (alertDiv.parentNode) {
-                        alertDiv.classList.remove('show');
-                        setTimeout(() => {
-                            if (alertDiv.parentNode) {
-                                alertDiv.remove();
-                            }
-                        }, 150);
-                    }
-                }, duration);
-                // Scroll to alert if it's not visible
-                alertDiv.scrollIntoView({ behavior: 'smooth', block: 'nearest' });
-            }
-        }
-        // Helper function to get appropriate icon for alert type
-        function getAlertIcon(type) {
-            const icons = {
-                'success': 'check-circle',
-                'danger': 'exclamation-triangle',
-                'warning': 'exclamation-triangle',
-                'info': 'info-circle',
-                'primary': 'info-circle'
-            };
-            return icons[type] || 'info-circle';
-        }
-        // Enhanced error handling for fetch requests
-        async function safeFetch(url, options = {}) {
-            try {
-                const response = await fetch(url, options);
-                if (!response.ok) {
-                    throw new Error(`HTTP ${response.status}: ${response.statusText}`);
-                }
-                return response;
-            } catch (error) {
-                console.error('Fetch error:', error);
-                showAlert(`Network error: ${error.message}`, 'danger');
-                throw error;
-            }
-        }
-        // Performance monitoring
-        window.addEventListener('load', function() {
-            // Log page load time
-            const loadTime = performance.now();
-            console.log(`Page loaded in ${Math.round(loadTime)}ms`);
-            // Check for slow loading resources
-            if (loadTime > 3000) {
-                console.warn('Page load time is slow. Consider optimizing resources.');
-            }
-        });
-        // Keyboard shortcuts
-        document.addEventListener('keydown', function(e) {
-            // Alt + H for home
-            if (e.altKey && e.key === 'h') {
-                e.preventDefault();
-                window.location.href = '{{ url_for("index") }}';
-            }
-            // Alt + P for playground
-            if (e.altKey && e.key === 'p') {
-                e.preventDefault();
-                window.location.href = '{{ url_for("playground") }}';
-            }
-            // Alt + D for docs
-            if (e.altKey && e.key === 'd') {
-                e.preventDefault();
-                window.location.href = '{{ url_for("docs") }}';
-            }
-        });
-    </script>
-    {% block extra_js %}{% endblock %}
-</body>
-</html>

+<!DOCTYPE html>
+<html lang="{{ get_locale() }}">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>{% block title %}TTSFM - {{ _('nav.home') }}{% endblock %}</title>
+    <!-- Bootstrap CSS -->
+    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css" rel="stylesheet">
+    <!-- Font Awesome -->
+    <link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0/css/all.min.css" rel="stylesheet">
+    <!-- Google Fonts -->
+    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&display=swap" rel="stylesheet">
+    <!-- Custom CSS -->
+    <link href="{{ url_for('static', filename='css/style.css') }}" rel="stylesheet">
+    <!-- Additional Performance Optimizations -->
+    <link rel="preconnect" href="https://fonts.googleapis.com">
+    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+    <!-- Favicon -->
+    <link rel="icon" type="image/svg+xml" href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>🎤</text></svg>">
+    <!-- Meta tags for better SEO and social sharing -->
+    <meta name="description" content="TTSFM - A Python client for text-to-speech APIs. Simple to use with support for multiple voices and audio formats.">
+    <meta name="keywords" content="text-to-speech, TTS, python, API, voice synthesis, audio generation">
+    <meta name="author" content="TTSFM">
+    <!-- Open Graph / Facebook -->
+    <meta property="og:type" content="website">
+    <meta property="og:url" content="{{ request.url }}">
+    <meta property="og:title" content="{% block og_title %}TTSFM - Python Text-to-Speech Client{% endblock %}">
+    <meta property="og:description" content="A Python client for text-to-speech APIs. Simple to use with support for multiple voices and audio formats.">
+    <!-- Twitter -->
+    <meta name="twitter:card" content="summary">
+    <meta name="twitter:url" content="{{ request.url }}">
+    <meta name="twitter:title" content="{% block twitter_title %}TTSFM - Python Text-to-Speech Client{% endblock %}">
+    <meta name="twitter:description" content="A Python client for text-to-speech APIs. Simple to use with support for multiple voices and audio formats.">
+    {% block extra_css %}{% endblock %}
+    <!-- Language button styling -->
+    <style>
+        /* Language dropdown button styling */
+        #languageDropdown {
+            border-color: #6c757d;
+            color: #6c757d;
+            transition: all 0.2s ease-in-out;
+            font-size: 0.875rem;
+        }
+        #languageDropdown:hover {
+            border-color: #495057;
+            color: #495057;
+            background-color: #f8f9fa;
+        }
+        #languageDropdown:focus {
+            box-shadow: 0 0 0 0.2rem rgba(108, 117, 125, 0.25);
+        }
+        /* Responsive language button */
+        @media (max-width: 576px) {
+            #languageDropdown {
+                font-size: 0.75rem;
+                padding: 0.25rem 0.5rem;
+            }
+        }
+        /* Ensure consistent button heights */
+        .navbar-nav .btn {
+            display: inline-flex;
+            align-items: center;
+        }
+    </style>
+</head>
+<body>
+    <!-- Skip to content link for accessibility -->
+    <a href="#main-content" class="skip-link">Skip to main content</a>
+    <!-- Clean Navigation -->
+    <nav class="navbar navbar-expand-lg fixed-top" style="background-color: rgba(255, 255, 255, 0.95); backdrop-filter: blur(10px); border-bottom: 1px solid #e5e7eb;">
+        <div class="container">
+            <a class="navbar-brand" href="{{ url_for('index') }}">
+                <i class="fas fa-microphone-alt me-2"></i>
+                <span class="fw-bold">TTSFM</span>
+                <span class="badge bg-primary ms-2 small">v3.2.2</span>
+            </a>
+            <button class="navbar-toggler border-0" type="button" data-bs-toggle="collapse" data-bs-target="#navbarNav" aria-controls="navbarNav" aria-expanded="false" aria-label="Toggle navigation">
+                <span class="navbar-toggler-icon"></span>
+            </button>
+            <div class="collapse navbar-collapse" id="navbarNav">
+                <ul class="navbar-nav me-auto">
+                    <li class="nav-item">
+                        <a class="nav-link" href="{{ url_for('index') }}" aria-label="{{ _('nav.home') }}">
+                            <i class="fas fa-home me-1"></i>{{ _('nav.home') }}
+                        </a>
+                    </li>
+                    <li class="nav-item">
+                        <a class="nav-link" href="{{ url_for('playground') }}" aria-label="{{ _('nav.playground') }}">
+                            <i class="fas fa-play me-1"></i>{{ _('nav.playground') }}
+                        </a>
+                    </li>
+                    <li class="nav-item">
+                        <a class="nav-link" href="{{ url_for('docs') }}" aria-label="{{ _('nav.documentation') }}">
+                            <i class="fas fa-book me-1"></i>{{ _('nav.documentation') }}
+                        </a>
+                    </li>
+                </ul>
+                <ul class="navbar-nav">
+                    <li class="nav-item">
+                        <span class="navbar-text d-flex align-items-center">
+                            <span id="status-indicator" class="status-indicator status-offline" aria-hidden="true"></span>
+                            <span id="status-text" class="small">{{ _('nav.status_checking') }}</span>
+                        </span>
+                    </li>
+                    <li class="nav-item dropdown ms-3">
+                        <button class="btn btn-outline-secondary btn-sm dropdown-toggle" type="button" id="languageDropdown" data-bs-toggle="dropdown" aria-expanded="false" title="{{ _('common.language') }}">
+                            {% if get_locale() == 'zh' %}🇨🇳 中文{% else %}🇺🇸 English{% endif %}
+                        </button>
+                        <ul class="dropdown-menu" aria-labelledby="languageDropdown">
+                            {% for lang_code, lang_name in get_supported_languages().items() %}
+                            <li>
+                                <a class="dropdown-item{% if get_locale() == lang_code %} active{% endif %}"
+                                   href="{{ url_for('set_language', lang_code=lang_code) }}">
+                                    {% if lang_code == 'en' %}🇺🇸{% elif lang_code == 'zh' %}🇨🇳{% endif %} {{ lang_name }}
+                                </a>
+                            </li>
+                            {% endfor %}
+                        </ul>
+                    </li>
+                    <li class="nav-item ms-3">
+                        <a class="btn btn-outline-primary btn-sm" href="https://github.com/dbccccccc/ttsfm" target="_blank" rel="noopener noreferrer" aria-label="{{ _('nav.github') }}">
+                            <i class="fab fa-github me-1"></i>{{ _('nav.github') }}
+                        </a>
+                    </li>
+                </ul>
+            </div>
+        </div>
+    </nav>
+    <!-- Main Content -->
+    <main id="main-content" style="padding-top: 76px;">
+        {% block content %}{% endblock %}
+    </main>
+    <!-- Simplified Footer -->
+    <footer class="footer py-3" style="background-color: #f9fafb; border-top: 1px solid #e5e7eb;" role="contentinfo">
+        <div class="container">
+            <div class="row align-items-center">
+                <div class="col-md-6">
+                    <div class="d-flex align-items-center">
+                        <i class="fas fa-microphone-alt me-2 text-primary"></i>
+                        <strong class="text-dark">TTSFM</strong>
+                        <span class="ms-2 text-muted">v3.2.2</span>
+                    </div>
+                </div>
+                <div class="col-md-6 text-md-end">
+                    <small class="text-muted">
+                        {{ _('home.footer_copyright') }} •
+                        <a href="{{ url_for('docs') }}" class="text-decoration-none text-muted">{{ _('nav.documentation') }}</a> •
+                        <a href="https://github.com/dbccccccc/ttsfm" class="text-decoration-none text-muted" target="_blank" rel="noopener noreferrer">{{ _('nav.github') }}</a>
+                    </small>
+                </div>
+            </div>
+        </div>
+    </footer>
+    <!-- Bootstrap JS -->
+    <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/js/bootstrap.bundle.min.js"></script>
+    <!-- Internationalization Support -->
+    <script src="{{ url_for('static', filename='js/i18n.js') }}"></script>
+    <!-- Enhanced Common JavaScript -->
+    <script>
+        // Enhanced service status checking
+        async function checkStatus() {
+            try {
+                const response = await fetch('/api/health');
+                const data = await response.json();
+                const indicator = document.getElementById('status-indicator');
+                const text = document.getElementById('status-text');
+                if (response.ok && data.status === 'healthy') {
+                    indicator.className = 'status-indicator status-online';
+                    text.textContent = {{ _("nav.status_online")|tojson }};
+                } else {
+                    indicator.className = 'status-indicator status-offline';
+                    text.textContent = {{ _("nav.status_offline")|tojson }};
+                }
+            } catch (error) {
+                const indicator = document.getElementById('status-indicator');
+                const text = document.getElementById('status-text');
+                indicator.className = 'status-indicator status-offline';
+                text.textContent = {{ _("nav.status_offline")|tojson }};
+            }
+        }
+        // Enhanced page initialization
+        document.addEventListener('DOMContentLoaded', function() {
+            // Check status immediately and periodically
+            checkStatus();
+            setInterval(checkStatus, 30000); // Check every 30 seconds
+            // Initialize tooltips
+            if (typeof bootstrap !== 'undefined') {
+                document.querySelectorAll('[data-bs-toggle="tooltip"]').forEach(function (tooltipTriggerEl) {
+                    new bootstrap.Tooltip(tooltipTriggerEl);
+                });
+            }
+            // Add smooth scrolling for anchor links
+            document.querySelectorAll('a[href^="#"]').forEach(anchor => {
+                anchor.addEventListener('click', function (e) {
+                    const target = document.querySelector(this.getAttribute('href'));
+                    if (target) {
+                        e.preventDefault();
+                        target.scrollIntoView({
+                            behavior: 'smooth',
+                            block: 'start'
+                        });
+                    }
+                });
+            });
+            // Add fade-in animation to main content
+            const mainContent = document.querySelector('main');
+            if (mainContent) {
+                mainContent.classList.add('fade-in');
+            }
+            // Add loading states to external links
+            document.querySelectorAll('a[target="_blank"]').forEach(link => {
+                link.addEventListener('click', function() {
+                    this.style.opacity = '0.7';
+                    setTimeout(() => {
+                        this.style.opacity = '1';
+                    }, 1000);
+                });
+            });
+        });
+        // Enhanced utility function to show loading state
+        function setLoading(button, loading) {
+            if (loading) {
+                button.classList.add('loading');
+                button.disabled = true;
+                button.style.cursor = 'wait';
+            } else {
+                button.classList.remove('loading');
+                button.disabled = false;
+                button.style.cursor = 'pointer';
+            }
+        }
+        // Enhanced utility function to show alerts
+        function showAlert(message, type = 'info', duration = 5000) {
+            const alertDiv = document.createElement('div');
+            alertDiv.className = `alert alert-${type} alert-dismissible fade show fade-in`;
+            alertDiv.style.position = 'relative';
+            alertDiv.style.zIndex = '1050';
+            // Note: message is inserted as HTML, so callers must pass trusted or pre-escaped text
+            alertDiv.innerHTML = `
+                <i class="fas fa-${getAlertIcon(type)} me-2"></i>
+                ${message}
+                <button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Close"></button>
+            `;
+            // Find the best container to insert the alert
+            const container = document.querySelector('main .container') || document.querySelector('.container') || document.body;
+            if (container) {
+                container.insertBefore(alertDiv, container.firstChild);
+                // Auto-dismiss after specified duration
+                setTimeout(() => {
+                    if (alertDiv.parentNode) {
+                        alertDiv.classList.remove('show');
+                        setTimeout(() => {
+                            if (alertDiv.parentNode) {
+                                alertDiv.remove();
+                            }
+                        }, 150);
+                    }
+                }, duration);
+                // Scroll to alert if it's not visible
+                alertDiv.scrollIntoView({ behavior: 'smooth', block: 'nearest' });
+            }
+        }
+        // Helper function to get appropriate icon for alert type
+        function getAlertIcon(type) {
+            const icons = {
+                'success': 'check-circle',
+                'danger': 'exclamation-triangle',
+                'warning': 'exclamation-triangle',
+                'info': 'info-circle',
+                'primary': 'info-circle'
+            };
+            return icons[type] || 'info-circle';
+        }
+        // Enhanced error handling for fetch requests
+        async function safeFetch(url, options = {}) {
+            try {
+                const response = await fetch(url, options);
+                if (!response.ok) {
+                    throw new Error(`HTTP ${response.status}: ${response.statusText}`);
+                }
+                return response;
+            } catch (error) {
+                console.error('Fetch error:', error);
+                showAlert(`Network error: ${error.message}`, 'danger');
+                throw error;
+            }
+        }
+        // Performance monitoring
+        window.addEventListener('load', function() {
+            // Log page load time
+            const loadTime = performance.now();
+            console.log(`Page loaded in ${Math.round(loadTime)}ms`);
+            // Check for slow loading resources
+            if (loadTime > 3000) {
+                console.warn('Page load time is slow. Consider optimizing resources.');
+            }
+        });
+        // Keyboard shortcuts
+        document.addEventListener('keydown', function(e) {
+            // Alt + H for home
+            if (e.altKey && e.key === 'h') {
+                e.preventDefault();
+                window.location.href = '{{ url_for("index") }}';
+            }
+            // Alt + P for playground
+            if (e.altKey && e.key === 'p') {
+                e.preventDefault();
+                window.location.href = '{{ url_for("playground") }}';
+            }
+            // Alt + D for docs
+            if (e.altKey && e.key === 'd') {
+                e.preventDefault();
+                window.location.href = '{{ url_for("docs") }}';
+            }
+        });
+    </script>
+    {% block extra_js %}{% endblock %}
+</body>
+</html>
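
Note on `showAlert` in base.html: the function interpolates `message` into `innerHTML`, and `safeFetch` routes `error.message` through it. The following is a minimal sketch of one way to escape the message before building the markup. The names `escapeHtml` and `buildAlertMarkup` are hypothetical and not part of this commit, and this diff does not show whether other callers depend on passing HTML in `message`, so treat it as an illustration rather than a drop-in change.

```javascript
// Hypothetical helper (not part of the commit): escape the five characters
// that are significant in HTML text and attribute contexts.
function escapeHtml(value) {
    const replacements = {
        '&': '&amp;',
        '<': '&lt;',
        '>': '&gt;',
        '"': '&quot;',
        "'": '&#39;'
    };
    return String(value).replace(/[&<>"']/g, ch => replacements[ch]);
}

// Hypothetical helper: builds the same alert markup as showAlert, but with
// the message escaped. The icon name is passed in so the sketch is
// self-contained and does not depend on getAlertIcon.
function buildAlertMarkup(message, iconName) {
    return `
        <i class="fas fa-${escapeHtml(iconName)} me-2"></i>
        ${escapeHtml(message)}
        <button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Close"></button>
    `;
}
```

Under these assumptions, `showAlert` could assign `buildAlertMarkup(message, getAlertIcon(type))` to `alertDiv.innerHTML`, which would render any markup in the message as literal text.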

ttsfm-web/templates/docs.html CHANGED

@@ -1,369 +1,734 @@
-{% extends "base.html" %}
-{% block title %}TTSFM API Documentation{% endblock %}
-{% block extra_css %}
-<style>
-    .code-block {
-        background-color: #f8f9fa;
-        border: 1px solid #e9ecef;
-        border-radius: 0.375rem;
-        padding: 1rem;
-        margin: 1rem 0;
-        overflow-x: auto;
-    }
-    .endpoint-card {
-        border-left: 4px solid #007bff;
-        margin-bottom: 2rem;
-    }
-    .method-badge {
-        font-size: 0.75rem;
-        padding: 0.25rem 0.5rem;
-        border-radius: 0.25rem;
-        font-weight: bold;
-        margin-right: 0.5rem;
-    }
-    .method-get { background-color: #28a745; color: white; }
-    .method-post { background-color: #007bff; color: white; }
-    .method-put { background-color: #ffc107; color: black; }
-    .method-delete { background-color: #dc3545; color: white; }
-    .response-example {
-        background-color: #f1f3f4;
-        border-radius: 0.375rem;
-        padding: 1rem;
-        margin-top: 1rem;
-    }
-    .toc {
-        position: sticky;
-        top: 2rem;
-        max-height: calc(100vh - 4rem);
-        overflow-y: auto;
-    }
-    .toc a {
-        color: #6c757d;
-        text-decoration: none;
-        display: block;
-        padding: 0.25rem 0;
-        border-left: 2px solid transparent;
-        padding-left: 1rem;
-    }
-    .toc a:hover, .toc a.active {
-        color: #007bff;
-        border-left-color: #007bff;
-    }
-</style>
-{% endblock %}
-{% block content %}
-<div class="container py-5">
-    <div class="row">
-        <div class="col-12 text-center mb-5">
-            <h1 class="display-4 fw-bold">
-                <i class="fas fa-book me-3"></i>API Documentation
-            </h1>
-            <p class="lead text-muted">
-                Complete reference for the TTSFM Text-to-Speech API
-            </p>
-        </div>
-    </div>
-    <div class="row">
-        <!-- Table of Contents -->
-        <div class="col-lg-3">
-            <div class="toc">
-                <h5 class="fw-bold mb-3">Contents</h5>
-                <a href="#overview">Overview</a>
-                <a href="#authentication">Authentication</a>
-                <a href="#text-validation">Text Validation</a>
-                <a href="#endpoints">API Endpoints</a>
-                <a href="#voices">Voices</a>
-                <a href="#formats">Audio Formats</a>
-                <a href="#generate">Generate Speech</a>
-                <a href="#batch">Batch Processing</a>
-                <a href="#status">Status & Health</a>
-                <a href="#errors">Error Handling</a>
-                <a href="#examples">Code Examples</a>
-                <a href="#python-package">Python Package</a>
-            </div>
-        </div>
-        <!-- Documentation Content -->
-        <div class="col-lg-9">
-            <!-- Overview -->
-            <section id="overview" class="mb-5">
-                <h2 class="fw-bold mb-3">Overview</h2>
-                <p>
-                    The TTSFM API provides a modern, OpenAI-compatible interface for text-to-speech generation.
-                    It supports multiple voices, audio formats, and includes advanced features like text length
-                    validation and batch processing.
-                </p>
-                <div class="alert alert-info">
-                    <i class="fas fa-info-circle me-2"></i>
-                    <strong>Base URL:</strong> <code>{{ request.url_root }}api/</code>
-                </div>
-                <h4>Key Features</h4>
-                <ul>
-                    <li>11 different voice options</li>
-                    <li>Multiple audio formats (MP3, WAV, OPUS, etc.)</li>
-                    <li>Text length validation (4096 character limit)</li>
-                    <li>Automatic text splitting for long content</li>
-                    <li>Batch processing capabilities</li>
-                    <li>Real-time status monitoring</li>
-                </ul>
-            </section>
-            <!-- Authentication -->
-            <section id="authentication" class="mb-5">
-                <h2 class="fw-bold mb-3">Authentication</h2>
-                <p>
-                    Currently, the API supports optional API key authentication. If configured,
-                    include your API key in the request headers.
-                </p>
-                <div class="code-block">
-                    <pre><code>Authorization: Bearer YOUR_API_KEY</code></pre>
-                </div>
-            </section>
-            <!-- Text Validation -->
-            <section id="text-validation" class="mb-5">
-                <h2 class="fw-bold mb-3">Text Length Validation</h2>
-                <p>
-                    TTSFM includes built-in text length validation to ensure compatibility with TTS models.
-                    The default maximum length is 4096 characters, but this can be customized.
-                </p>
-                <div class="alert alert-warning">
-                    <i class="fas fa-exclamation-triangle me-2"></i>
-                    <strong>Important:</strong> Text exceeding the maximum length will be rejected unless
-                    validation is disabled or the text is split into chunks.
-                </div>
-                <h4>Validation Options</h4>
-                <ul>
-                    <li><code>max_length</code>: Maximum allowed characters (default: 4096)</li>
-                    <li><code>validate_length</code>: Enable/disable validation (default: true)</li>
-                    <li><code>preserve_words</code>: Avoid splitting words when chunking (default: true)</li>
-                </ul>
-            </section>
-            <!-- API Endpoints -->
-            <section id="endpoints" class="mb-5">
-                <h2 class="fw-bold mb-3">API Endpoints</h2>
-                <!-- Voices Endpoint -->
-                <div class="card endpoint-card" id="voices">
-                    <div class="card-body">
-                        <h4 class="card-title">
-                            <span class="method-badge method-get">GET</span>
-                            /api/voices
-                        </h4>
-                        <p class="card-text">Get list of available voices.</p>
-                        <h6>Response Example:</h6>
-                        <div class="response-example">
-                            <pre><code>{
-  "voices": [
-    {
-      "id": "alloy",
-      "name": "Alloy",
-      "description": "Alloy voice"
-    },
-    {
-      "id": "echo",
-      "name": "Echo",
-      "description": "Echo voice"
-    }
-  ],
-  "count": 6
-}</code></pre>
-                        </div>
-                    </div>
-                </div>
-                <!-- Formats Endpoint -->
-                <div class="card endpoint-card" id="formats">
-                    <div class="card-body">
-                        <h4 class="card-title">
-                            <span class="method-badge method-get">GET</span>
-                            /api/formats
-                        </h4>
-                        <p class="card-text">Get list of supported audio formats.</p>
-                        <h6>Response Example:</h6>
-                        <div class="response-example">
-                            <pre><code>{
-  "formats": [
-    {
-      "id": "mp3",
-      "name": "MP3",
-      "mime_type": "audio/mp3",
-      "description": "MP3 audio format"
-    }
-  ],
-  "count": 6
-}</code></pre>
-                        </div>
-                    </div>
-                </div>
-                <!-- Text Validation Endpoint -->
-                <div class="card endpoint-card">
-                    <div class="card-body">
-                        <h4 class="card-title">
-                            <span class="method-badge method-post">POST</span>
-                            /api/validate-text
-                        </h4>
-                        <p class="card-text">Validate text length and get splitting suggestions.</p>
-                        <h6>Request Body:</h6>
-                        <div class="code-block">
-                            <pre><code>{
-  "text": "Your text to validate",
-  "max_length": 4096
-}</code></pre>
-                        </div>
-                        <h6>Response Example:</h6>
-                        <div class="response-example">
-                            <pre><code>{
-  "text_length": 5000,
-  "max_length": 4096,
-  "is_valid": false,
-  "needs_splitting": true,
-  "suggested_chunks": 2,
-  "chunk_preview": [
-    "First chunk preview...",
-    "Second chunk preview..."
-  ]
-}</code></pre>
-                        </div>
-                    </div>
-                </div>
-                <!-- Generate Speech Endpoint -->
-                <div class="card endpoint-card" id="generate">
-                    <div class="card-body">
-                        <h4 class="card-title">
-                            <span class="method-badge method-post">POST</span>
-                            /api/generate
-                        </h4>
-                        <p class="card-text">Generate speech from text.</p>
-                        <h6>Request Body:</h6>
-                        <div class="code-block">
-                            <pre><code>{
-  "text": "Hello, world!",
-  "voice": "alloy",
-  "format": "mp3",
-  "instructions": "Speak cheerfully",
-  "max_length": 4096,
-  "validate_length": true
-}</code></pre>
-                        </div>
-                        <h6>Parameters:</h6>
-                        <ul>
-                            <li><code>text</code> (required): Text to convert to speech</li>
-                            <li><code>voice</code> (optional): Voice ID (default: "alloy")</li>
-                            <li><code>format</code> (optional): Audio format (default: "mp3")</li>
-                            <li><code>instructions</code> (optional): Voice modulation instructions</li>
-                            <li><code>max_length</code> (optional): Maximum text length (default: 4096)</li>
-                            <li><code>validate_length</code> (optional): Enable validation (default: true)</li>
-                        </ul>
-                        <h6>Response:</h6>
-                        <p>Returns audio file with appropriate Content-Type header.</p>
-                    </div>
-                </div>
-                <!-- Batch Processing Endpoint -->
-                <div class="card endpoint-card" id="batch">
-                    <div class="card-body">
-                        <h4 class="card-title">
-                            <span class="method-badge method-post">POST</span>
-                            /api/generate-batch
-                        </h4>
-                        <p class="card-text">Generate speech from long text by automatically splitting into chunks.</p>
-                        <h6>Request Body:</h6>
-                        <div class="code-block">
-                            <pre><code>{
-  "text": "Very long text that exceeds the limit...",
-  "voice": "alloy",
-  "format": "mp3",
-  "max_length": 4096,
-  "preserve_words": true
-}</code></pre>
-                        </div>
-                        <h6>Response Example:</h6>
-                        <div class="response-example">
-                            <pre><code>{
-  "total_chunks": 3,
-  "successful_chunks": 3,
-  "results": [
-    {
-      "chunk_index": 1,
-      "chunk_text": "First chunk text...",
-      "audio_data": "base64_encoded_audio",
-      "content_type": "audio/mp3",
-      "size": 12345,
-      "format": "mp3"
-    }
-  ]
-}</code></pre>
-                        </div>
-                    </div>
-                </div>
-            </section>
-        </div>
-    </div>
-</div>
-{% endblock %}
-{% block extra_js %}
-<script>
-    // Smooth scrolling for TOC links
-    document.querySelectorAll('.toc a').forEach(link => {
-        link.addEventListener('click', function(e) {
-            e.preventDefault();
-            const target = document.querySelector(this.getAttribute('href'));
-            if (target) {
-                target.scrollIntoView({ behavior: 'smooth' });
-                // Update active link
-                document.querySelectorAll('.toc a').forEach(l => l.classList.remove('active'));
-                this.classList.add('active');
-            }
-        });
-    });
-    // Highlight current section in TOC
-    window.addEventListener('scroll', function() {
-        const sections = document.querySelectorAll('section[id]');
-        const scrollPos = window.scrollY + 100;
-        sections.forEach(section => {
-            const top = section.offsetTop;
-            const bottom = top + section.offsetHeight;
-            const id = section.getAttribute('id');
-            const link = document.querySelector(`.toc a[href="#${id}"]`);
-            if (scrollPos >= top && scrollPos < bottom) {
-                document.querySelectorAll('.toc a').forEach(l => l.classList.remove('active'));
-                if (link) link.classList.add('active');
-            }
-        });
-    });
-</script>
-{% endblock %}

+{% extends "base.html" %}
+{% block title %}TTSFM {{ _('docs.title') }}{% endblock %}
+{% block extra_css %}
+<style>
+    .code-block {
+        background-color: #f8f9fa;
+        border: 1px solid #e9ecef;
+        border-radius: 0.375rem;
+        padding: 1rem;
+        margin: 1rem 0;
+        overflow-x: auto;
+    }
+    .endpoint-card {
+        border-left: 4px solid #007bff;
+        margin-bottom: 2rem;
+    }
+    .method-badge {
+        font-size: 0.75rem;
+        padding: 0.25rem 0.5rem;
+        border-radius: 0.25rem;
+        font-weight: bold;
+        margin-right: 0.5rem;
+    }
+    .method-get { background-color: #28a745; color: white; }
+    .method-post { background-color: #007bff; color: white; }
+    .method-put { background-color: #ffc107; color: black; }
+    .method-delete { background-color: #dc3545; color: white; }
+    .response-example {
+        background-color: #f1f3f4;
+        border-radius: 0.375rem;
+        padding: 1rem;
+        margin-top: 1rem;
+    }
+    .toc {
+        position: sticky;
+        top: 2rem;
+        max-height: calc(100vh - 4rem);
+        overflow-y: auto;
+    }
+    .toc a {
+        color: #6c757d;
+        text-decoration: none;
+        display: block;
+        padding: 0.25rem 0;
+        border-left: 2px solid transparent;
+        padding-left: 1rem;
+    }
+    .toc a:hover, .toc a.active {
+        color: #007bff;
+        border-left-color: #007bff;
+    }
+</style>
+{% endblock %}
+{% block content %}
+<div class="container py-5">
+    <div class="row">
+        <div class="col-12 text-center mb-5">
+            <h1 class="display-4 fw-bold">
+                <i class="fas fa-book me-3 text-primary"></i>{{ _('docs.title') }}
+            </h1>
+            <p class="lead text-muted">
+                {{ _('docs.subtitle') }}
+            </p>
+        </div>
+    </div>
+    <div class="row">
+        <!-- Table of Contents -->
+        <div class="col-lg-3">
+            <div class="toc">
+                <h5 class="fw-bold mb-3">{{ _('docs.contents') }}</h5>
+                <a href="#overview">{{ _('docs.overview') }}</a>
+                <a href="#authentication">{{ _('docs.authentication') }}</a>
+                <a href="#text-validation">{{ _('docs.text_validation') }}</a>
+                <a href="#endpoints">{{ _('docs.endpoints') }}</a>
+                <a href="#voices">{{ _('docs.voices') }}</a>
+                <a href="#formats">{{ _('docs.formats') }}</a>
+                <a href="#generate">{{ _('docs.generate') }}</a>
+                <a href="#combined">{{ _('docs.combined') }}</a>
+                <a href="#status">{{ _('docs.status') }}</a>
+                <a href="#errors">{{ _('docs.errors') }}</a>
+                <a href="#examples">{{ _('docs.examples') }}</a>
+                <a href="#python-package">{{ _('docs.python_package') }}</a>
+                <a href="#websocket">WebSocket Streaming</a>
+            </div>
+        </div>
+        <!-- Documentation Content -->
+        <div class="col-lg-9">
+            <!-- Overview -->
+            <section id="overview" class="mb-5">
+                <h2 class="fw-bold mb-3">{{ _('docs.overview_title') }}</h2>
+                <p>
+                    {{ _('docs.overview_desc') }}
+                </p>
+                <div class="alert alert-info">
+                    <i class="fas fa-info-circle me-2"></i>
+                    <strong>{{ _('docs.base_url') }}</strong> <code>{{ request.url_root }}api/</code>
+                </div>
+                <h4>{{ _('docs.key_features') }}</h4>
+                <ul>
+                    <li><strong>🎤 {{ _('docs.feature_voices') }}</strong></li>
+                    <li><strong>🎵 {{ _('docs.feature_formats') }}</strong></li>
+                    <li><strong>🤖 {{ _('docs.feature_openai') }}</strong></li>
+                    <li><strong>✨ {{ _('docs.feature_auto_combine') }}</strong></li>
+                    <li><strong>📊 {{ _('docs.feature_validation') }}</strong></li>
+                    <li><strong>📈 {{ _('docs.feature_monitoring') }}</strong></li>
+                </ul>
+                <div class="alert alert-success">
+                    <i class="fas fa-star me-2"></i>
+                    <strong>{{ _('docs.new_version') }}</strong> {{ _('docs.new_version_desc') }}
+                </div>
+            </section>
+            <!-- Authentication -->
+            <section id="authentication" class="mb-5">
+                <h2 class="fw-bold mb-3">{{ _('docs.authentication_title') }}</h2>
+                <p>
+                    {{ _('docs.authentication_desc') }}
+                </p>
+                <div class="code-block">
+                    <pre><code>Authorization: Bearer YOUR_API_KEY</code></pre>
+                </div>
+            </section>
+            <!-- Text Validation -->
+            <section id="text-validation" class="mb-5">
+                <h2 class="fw-bold mb-3">{{ _('docs.text_validation_title') }}</h2>
+                <p>
+                    {{ _('docs.text_validation_desc') }}
+                </p>
+                <div class="alert alert-warning">
+                    <i class="fas fa-exclamation-triangle me-2"></i>
+                    <strong>{{ _('docs.important') }}</strong> {{ _('docs.text_validation_warning') }}
+                </div>
+                <h4>{{ _('docs.validation_options') }}</h4>
+                <ul>
+                    <li><code>max_length</code>: {{ _('docs.max_length_option') }}</li>
+                    <li><code>validate_length</code>: {{ _('docs.validate_length_option') }}</li>
+                    <li><code>preserve_words</code>: {{ _('docs.preserve_words_option') }}</li>
+                </ul>
+            </section>
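The `max_length` and `preserve_words` options above can be illustrated with a minimal sketch. This is not the package's actual splitting code; `split_text` is a hypothetical helper that only shows what word-preserving chunking means in practice.

```python
def split_text(text, max_length=4096, preserve_words=True):
    """Split text into chunks of at most max_length characters.

    Hypothetical helper for illustration; not part of the ttsfm API.
    """
    if max_length < 1:
        raise ValueError("max_length must be positive")
    if not preserve_words:
        # Hard split at fixed character offsets, possibly mid-word.
        return [text[i:i + max_length] for i in range(0, len(text), max_length)]

    chunks = []
    current = ""
    for word in text.split():
        # A single word longer than the limit has to be cut anyway.
        while len(word) > max_length:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_length])
            word = word[max_length:]
        candidate = word if not current else current + " " + word
        if len(candidate) > max_length:
            # Adding this word would overflow, so close the current chunk.
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

With `preserve_words=True` a chunk may end up shorter than `max_length`, because the split point moves back to the previous word boundary.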
+            <!-- API Endpoints -->
+            <section id="endpoints" class="mb-5">
+                <h2 class="fw-bold mb-3">{{ _('docs.endpoints_title') }}</h2>
+                <!-- Voices Endpoint -->
+                <div class="card endpoint-card" id="voices">
+                    <div class="card-body">
+                        <h4 class="card-title">
+                            <span class="method-badge method-get">GET</span>
+                            /api/voices
+                        </h4>
+                        <p class="card-text">{{ _('docs.get_voices_desc') }}</p>
+                        <h6>{{ _('docs.response_example') }}</h6>
+                        <div class="response-example">
+                            <pre><code>{
+  "voices": [
+    {
+      "id": "alloy",
+      "name": "Alloy",
+      "description": "Alloy voice"
+    },
+    {
+      "id": "echo",
+      "name": "Echo",
+      "description": "Echo voice"
+    }
+  ],
+  "count": 6
+}</code></pre>
+                        </div>
+                    </div>
+                </div>
+                <!-- Formats Endpoint -->
+                <div class="card endpoint-card" id="formats">
+                    <div class="card-body">
+                        <h4 class="card-title">
+                            <span class="method-badge method-get">GET</span>
+                            /api/formats
+                        </h4>
+                        <p class="card-text">Get available audio formats for speech generation.</p>
+                        <h6>Available Formats</h6>
+                        <p>The API accepts several format names, but internally only two output formats are produced:</p>
+                        <ul>
+                            <li><strong>mp3</strong> - Returns actual MP3 format</li>
+                            <li><strong>All other formats</strong> (opus, aac, flac, wav, pcm) - Mapped to WAV format</li>
+                        </ul>
+                        <div class="alert alert-info">
+                            <i class="fas fa-info-circle me-2"></i>
+                            <strong>Note:</strong> When you request opus, aac, flac, wav, or pcm, you'll receive WAV audio data.
+                        </div>
+                        <h6>{{ _('docs.response_example') }}</h6>
+                        <div class="response-example">
+                            <pre><code>{
+  "formats": [
+    {
+      "id": "mp3",
+      "name": "MP3",
+      "mime_type": "audio/mp3",
+      "description": "MP3 audio format"
+    },
+    {
+      "id": "opus",
+      "name": "Opus",
+      "mime_type": "audio/wav",
+      "description": "Returns WAV format"
+    },
+    {
+      "id": "aac",
+      "name": "AAC",
+      "mime_type": "audio/wav",
+      "description": "Returns WAV format"
+    },
+    {
+      "id": "flac",
+      "name": "FLAC",
+      "mime_type": "audio/wav",
+      "description": "Returns WAV format"
+    },
+    {
+      "id": "wav",
+      "name": "WAV",
+      "mime_type": "audio/wav",
+      "description": "WAV audio format"
+    },
+    {
+      "id": "pcm",
+      "name": "PCM",
+      "mime_type": "audio/wav",
+      "description": "Returns WAV format"
+    }
+  ],
+  "count": 6
+}</code></pre>
+                        </div>
+                    </div>
+                </div>
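The format mapping described above can be mirrored on the client side, for example to pick the right file extension before saving. The sketch below assumes only what the list states (mp3 stays mp3, everything else becomes WAV); `delivered_format` and `file_extension` are hypothetical helper names.

```python
# Format names the API accepts, per the list above.
ACCEPTED_FORMATS = ("mp3", "opus", "aac", "flac", "wav", "pcm")


def delivered_format(requested):
    """Return the format of the audio data actually sent back."""
    fmt = requested.strip().lower()
    if fmt not in ACCEPTED_FORMATS:
        raise ValueError(f"unsupported format: {requested!r}")
    # Only mp3 is returned as-is; every other request yields WAV data.
    return "mp3" if fmt == "mp3" else "wav"


def file_extension(requested):
    """Pick a file extension that matches the delivered audio data."""
    return "." + delivered_format(requested)
```

Saving a response requested as `flac` under a `.wav` name avoids players rejecting the file because of a mismatched extension.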
+                <!-- Text Validation Endpoint -->
+                <div class="card endpoint-card">
+                    <div class="card-body">
+                        <h4 class="card-title">
+                            <span class="method-badge method-post">POST</span>
+                            /api/validate-text
+                        </h4>
+                        <p class="card-text">{{ _('docs.validate_text_desc') }}</p>
+                        <h6>{{ _('docs.request_body') }}</h6>
+                        <div class="code-block">
+                            <pre><code>{
+  "text": "Your text to validate",
+  "max_length": 4096
+}</code></pre>
+                        </div>
+                        <h6>{{ _('docs.response_example') }}</h6>
+                        <div class="response-example">
+                            <pre><code>{
+  "text_length": 5000,
+  "max_length": 4096,
+  "is_valid": false,
+  "needs_splitting": true,
+  "suggested_chunks": 2,
+  "chunk_preview": [
+    "First chunk preview...",
+    "Second chunk preview..."
+  ]
+}</code></pre>
+                        </div>
+                    </div>
+                </div>
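The response fields above can be approximated locally before calling the endpoint. The following is a sketch under the assumption that `suggested_chunks` is roughly the text length divided by `max_length`, rounded up; the server may compute it differently, and `validation_summary` is a hypothetical name.

```python
import math


def validation_summary(text, max_length=4096):
    """Approximate the /api/validate-text response fields locally."""
    text_length = len(text)
    is_valid = text_length <= max_length
    return {
        "text_length": text_length,
        "max_length": max_length,
        "is_valid": is_valid,
        "needs_splitting": not is_valid,
        # Lower bound only: word-preserving splitting can need more chunks.
        "suggested_chunks": max(1, math.ceil(text_length / max_length)),
    }
```

For a 5000-character input and the default limit this reproduces the `is_valid`, `needs_splitting`, and `suggested_chunks` values in the response example; `chunk_preview` is left out because it depends on the server's splitting logic.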
+                <!-- Generate Speech Endpoint -->
+                <div class="card endpoint-card" id="generate">
+                    <div class="card-body">
+                        <h4 class="card-title">
+                            <span class="method-badge method-post">POST</span>
+                            /api/generate
+                        </h4>
+                        <p class="card-text">{{ _('docs.generate_speech_desc') }}</p>
+                        <h6>{{ _('docs.request_body') }}</h6>
+                        <div class="code-block">
+                            <pre><code>{
+  "text": "Hello, world!",
+  "voice": "alloy",
+  "format": "mp3",
+  "instructions": "Speak cheerfully",
+  "max_length": 4096,
+  "validate_length": true
+}</code></pre>
+                        </div>
+                        <h6>{{ _('docs.parameters') }}</h6>
+                        <ul>
+                            <li><code>text</code> ({{ _('docs.required') }}): {{ _('docs.text_param') }}</li>
+                            <li><code>voice</code> ({{ _('docs.optional') }}): {{ _('docs.voice_param') }}</li>
+                            <li><code>format</code> ({{ _('docs.optional') }}): {{ _('docs.format_param') }}</li>
+                            <li><code>instructions</code> ({{ _('docs.optional') }}): {{ _('docs.instructions_param') }}</li>
+                            <li><code>max_length</code> ({{ _('docs.optional') }}): {{ _('docs.max_length_param') }}</li>
+                            <li><code>validate_length</code> ({{ _('docs.optional') }}): {{ _('docs.validate_length_param') }}</li>
+                        </ul>
+                        <h6>{{ _('docs.response') }}</h6>
+                        <p>{{ _('docs.response_audio') }}</p>
+                    </div>
+                </div>
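A request to this endpoint can be assembled with the Python standard library alone. The sketch below only builds the request object; the base URL `http://localhost:7000` is an assumption (it matches the default port in `.env.example`), and `build_generate_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_generate_request(base_url, text, api_key=None, voice="alloy", audio_format="mp3"):
    """Build a POST request for /api/generate without sending it."""
    payload = {"text": text, "voice": voice, "format": audio_format}
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when the server runs with API key protection enabled.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Sending the request needs a running server, for example:
# req = build_generate_request("http://localhost:7000", "Hello, world!")
# with urllib.request.urlopen(req, timeout=60) as resp:
#     with open("hello.mp3", "wb") as f:
#         f.write(resp.read())
```

Keeping request construction separate from sending makes it easy to inspect the payload and headers before any network traffic happens.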
+            </section>
+            <!-- Python Package -->
+            <section id="python-package" class="mb-5">
+                <h3 class="fw-bold mb-4">
+                    <i class="fab fa-python me-2 text-warning"></i>{{ _('docs.python_package_title') }}
+                </h3>
+                <div class="card">
+                    <div class="card-body">
+                        <h5>{{ _('docs.long_text_support') }}</h5>
+                        <p>{{ _('docs.long_text_desc') }}</p>
+                        <div class="code-block">
+                            <pre><code>from ttsfm import TTSClient, Voice, AudioFormat
+# Create client
+client = TTSClient()
+# Generate speech from long text (split into chunks, one response per chunk)
+responses = client.generate_speech_long_text(
+    text="Very long text that exceeds 4096 characters...",
+    voice=Voice.ALLOY,
+    response_format=AudioFormat.MP3,
+    max_length=2000,
+    preserve_words=True
+)
+# Save each chunk as a separate file
+for i, response in enumerate(responses, 1):
+    response.save_to_file(f"part_{i:03d}.mp3")</code></pre>
+                        </div>
+                        <h6 class="mt-4">{{ _('docs.developer_features') }}</h6>
+                        <ul>
+                            <li><strong>{{ _('docs.manual_splitting') }}</strong></li>
+                            <li><strong>{{ _('docs.word_preservation') }}</strong></li>
+                            <li><strong>{{ _('docs.separate_files') }}</strong></li>
+                            <li><strong>{{ _('docs.cli_support') }}</strong></li>
+                        </ul>
+                        <div class="alert alert-info">
+                            <i class="fas fa-info-circle me-2"></i>
+                            <strong>{{ _('docs.note') }}</strong> {{ _('docs.auto_combine_note') }}
+                        </div>
+                    </div>
+                </div>
+                <!-- Combined Audio Endpoints -->
+                <div class="card endpoint-card" id="combined">
+                    <div class="card-body">
+                        <h4 class="card-title">
+                            <span class="method-badge method-post">POST</span>
+                            /api/generate-combined
+                        </h4>
+                        <p class="card-text">{{ _('docs.combined_audio_desc') }}</p>
+                        <h6>{{ _('docs.request_body') }}</h6>
+                        <div class="code-block">
+                            <pre><code>{
+  "text": "Very long text that exceeds the limit...",
+  "voice": "alloy",
+  "format": "mp3",
+  "instructions": "Optional voice instructions",
+  "max_length": 4096,
+  "preserve_words": true
+}</code></pre>
+                        </div>
+                        <h6>{{ _('docs.response') }}</h6>
+                        <p>{{ _('docs.response_combined_audio') }}</p>
+                        <h6>{{ _('docs.response_headers') }}</h6>
+                        <ul>
+                            <li><code>X-Chunks-Combined</code>: {{ _('docs.chunks_combined_header') }}</li>
+                            <li><code>X-Original-Text-Length</code>: {{ _('docs.original_text_length_header') }}</li>
+                            <li><code>X-Audio-Size</code>: {{ _('docs.audio_size_header') }}</li>
+                        </ul>
+                    </div>
+                </div>
+                <!-- OpenAI Compatible Endpoint with Auto-Combine -->
+                <div class="card endpoint-card">
+                    <div class="card-body">
+                        <h4 class="card-title">
+                            <span class="method-badge method-post">POST</span>
+                            /v1/audio/speech
+                        </h4>
+                        <p class="card-text">{{ _('docs.openai_compatible_desc') }}</p>
+                        <h6>{{ _('docs.request_body') }}</h6>
+                        <div class="code-block">
+                            <pre><code>{
+  "model": "gpt-4o-mini-tts",
+  "input": "Text of any length...",
+  "voice": "alloy",
+  "response_format": "mp3",
+  "instructions": "Optional voice instructions",
+  "speed": 1.0,
+  "auto_combine": true,
+  "max_length": 4096
+}</code></pre>
+                        </div>
+                        <h6>{{ _('docs.enhanced_parameters') }}</h6>
+                        <ul>
+                            <li><strong>auto_combine</strong> (boolean, default: true):
+                                <ul>
+                                    <li><code>true</code>: {{ _('docs.auto_combine_param') }}</li>
+                                    <li><code>false</code>: {{ _('docs.auto_combine_false') }}</li>
+                                </ul>
+                            </li>
+                            <li><strong>max_length</strong> (integer, default: 4096): {{ _('docs.max_length_chunk_param') }}</li>
+                        </ul>
+                        <h6>{{ _('docs.response_headers') }}</h6>
+                        <ul>
+                            <li><code>X-Auto-Combine</code>: {{ _('docs.auto_combine_header') }}</li>
+                            <li><code>X-Chunks-Combined</code>: {{ _('docs.chunks_combined_response') }}</li>
+                            <li><code>X-Original-Text-Length</code>: {{ _('docs.original_text_response') }}</li>
+                            <li><code>X-Audio-Format</code>: {{ _('docs.audio_format_header') }}</li>
+                            <li><code>X-Audio-Size</code>: {{ _('docs.audio_size_response') }}</li>
+                        </ul>
+                        <h6>{{ _('docs.examples_title') }}</h6>
+                        <div class="code-block">
+                            <pre><code># {{ _('docs.short_text_comment') }}
+curl -X POST {{ request.url_root }}v1/audio/speech \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-4o-mini-tts",
+    "input": "Hello world!",
+    "voice": "alloy"
+  }'
+# {{ _('docs.long_text_auto_comment') }}
+curl -X POST {{ request.url_root }}v1/audio/speech \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-4o-mini-tts",
+    "input": "Very long text...",
+    "voice": "alloy",
+    "auto_combine": true
+  }'
+# {{ _('docs.long_text_no_auto_comment') }}
+curl -X POST {{ request.url_root }}v1/audio/speech \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "gpt-4o-mini-tts",
+    "input": "Very long text...",
+    "voice": "alloy",
+    "auto_combine": false
+  }'</code></pre>
+                        </div>
+                        <div class="alert alert-info mt-3">
+                            <i class="fas fa-info-circle me-2"></i>
+                            <strong>{{ _('docs.audio_combination') }}</strong> {{ _('docs.audio_combination_desc') }}
+                        </div>
+                        <h6 class="mt-4">{{ _('docs.use_cases') }}</h6>
+                        <ul>
+                            <li><strong>{{ _('docs.use_case_articles') }}</strong></li>
+                            <li><strong>{{ _('docs.use_case_audiobooks') }}</strong></li>
+                            <li><strong>{{ _('docs.use_case_podcasts') }}</strong></li>
+                            <li><strong>{{ _('docs.use_case_education') }}</strong></li>
+                        </ul>
+                        <h6 class="mt-4">{{ _('docs.example_usage') }}</h6>
+                        <div class="code-block">
+                            <pre><code># {{ _('docs.python_example_comment') }}
+import requests
+response = requests.post(
+    "{{ request.url_root }}api/generate-combined",
+    json={
+        "text": "Your very long text content here...",
+        "voice": "nova",
+        "format": "mp3",
+        "max_length": 2000
+    }
+)
+if response.status_code == 200:
+    with open("combined_audio.mp3", "wb") as f:
+        f.write(response.content)
+    chunks = response.headers.get('X-Chunks-Combined')
+    print(f"Combined {chunks} chunks into single file")</code></pre>
+                        </div>
+                    </div>
+                </div>
+            </section>
+            <!-- WebSocket Streaming -->
+            <section id="websocket" class="mb-5">
+                <h2 class="mb-4">
+                    <i class="fas fa-bolt text-warning me-2"></i>WebSocket Streaming
+                </h2>
+                <p class="lead">
+                    Stream audio in real time: receive each audio chunk as soon as it is generated instead of waiting for the complete file.
+                </p>
+                <div class="alert alert-info">
+                    <i class="fas fa-info-circle me-2"></i>
+                    WebSocket streaming provides lower perceived latency and real-time progress tracking for TTS generation.
+                </div>
+                <h3 class="mt-4">Connection</h3>
+                <div class="code-block">
+                    <pre><code>// JavaScript WebSocket client
+const client = new WebSocketTTSClient({
+    socketUrl: '{{ request.url_root[:-1] }}',
+    debug: true
+});
+// Connection events
+client.onConnect = () => console.log('Connected');
+client.onDisconnect = () => console.log('Disconnected');</code></pre>
+                </div>
+                <h3 class="mt-4">Streaming TTS Generation</h3>
+                <div class="code-block">
+                    <pre><code>// Generate speech with real-time streaming
+const result = await client.generateSpeech('Hello, WebSocket world!', {
+    voice: 'alloy',
+    format: 'mp3',
+    chunkSize: 1024,  // Characters per chunk
+    // Progress callback
+    onProgress: (progress) => {
+        console.log(`Progress: ${progress.progress}%`);
+        console.log(`Chunks: ${progress.chunksCompleted}/${progress.totalChunks}`);
+    },
+    // Receive audio chunks in real-time
+    onChunk: (chunk) => {
+        console.log(`Received chunk ${chunk.chunkIndex + 1}`);
+        // Process or play the chunk immediately (processAudioChunk is an application-defined handler)
+        processAudioChunk(chunk.audioData);
+    },
+    // Completion callback
+    onComplete: (result) => {
+        console.log('Streaming complete!');
+        // result.audioData contains the complete audio
+    }
+});</code></pre>
+                </div>
+                <h3 class="mt-4">WebSocket Events</h3>
+                <div class="endpoint-card card">
+                    <div class="card-body">
+                        <h5>Client → Server Events</h5>
+                        <table class="table table-sm">
+                            <thead>
+                                <tr>
+                                    <th>Event</th>
+                                    <th>Description</th>
+                                    <th>Payload</th>
+                                </tr>
+                            </thead>
+                            <tbody>
+                                <tr>
+                                    <td><code>generate_stream</code></td>
+                                    <td>Start TTS generation</td>
+                                    <td><code>{text, voice, format, chunk_size}</code></td>
+                                </tr>
+                                <tr>
+                                    <td><code>cancel_stream</code></td>
+                                    <td>Cancel active stream</td>
+                                    <td><code>{request_id}</code></td>
+                                </tr>
+                            </tbody>
+                        </table>
+                        <h5 class="mt-4">Server → Client Events</h5>
+                        <table class="table table-sm">
+                            <thead>
+                                <tr>
+                                    <th>Event</th>
+                                    <th>Description</th>
+                                    <th>Payload</th>
+                                </tr>
+                            </thead>
+                            <tbody>
+                                <tr>
+                                    <td><code>stream_started</code></td>
+                                    <td>Stream initiated</td>
+                                    <td><code>{request_id, timestamp}</code></td>
+                                </tr>
+                                <tr>
+                                    <td><code>audio_chunk</code></td>
+                                    <td>Audio chunk ready</td>
+                                    <td><code>{request_id, chunk_index, audio_data, duration}</code></td>
+                                </tr>
+                                <tr>
+                                    <td><code>stream_progress</code></td>
+                                    <td>Progress update</td>
+                                    <td><code>{progress, chunks_completed, total_chunks}</code></td>
+                                </tr>
+                                <tr>
+                                    <td><code>stream_complete</code></td>
+                                    <td>Generation complete</td>
+                                    <td><code>{request_id, total_chunks, status}</code></td>
+                                </tr>
+                                <tr>
+                                    <td><code>stream_error</code></td>
+                                    <td>Error occurred</td>
+                                    <td><code>{request_id, error, timestamp}</code></td>
+                                </tr>
+                            </tbody>
+                        </table>
+                    </div>
+                </div>
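The server-to-client events in the table above can be consumed by a small state machine. The sketch below is transport-agnostic and makes two assumptions that the table does not confirm: `audio_data` arrives as raw bytes, and `chunk_index` is zero-based. `StreamCollector` is a hypothetical class, not part of the project.

```python
class StreamCollector:
    """Collect audio_chunk events and reassemble them in index order."""

    def __init__(self):
        self.chunks = {}          # chunk_index -> audio bytes
        self.total_chunks = None  # known once stream_complete arrives

    def handle(self, event, payload):
        if event == "audio_chunk":
            # Chunks may arrive out of order, so key them by index.
            self.chunks[payload["chunk_index"]] = payload["audio_data"]
        elif event == "stream_complete":
            self.total_chunks = payload["total_chunks"]
        elif event == "stream_error":
            raise RuntimeError(payload["error"])
        # stream_started and stream_progress carry no audio and are ignored.

    @property
    def done(self):
        return self.total_chunks is not None and len(self.chunks) == self.total_chunks

    def audio(self):
        if not self.done:
            raise ValueError("stream is not complete yet")
        return b"".join(self.chunks[i] for i in range(self.total_chunks))
```

Whether a plain byte-wise join yields a playable file depends on the audio format and on how the server encodes each chunk, so treat `audio()` as a starting point rather than a guaranteed decoder.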
+                <h3 class="mt-4">Benefits</h3>
+                <ul>
+                    <li><strong>Real-time feedback:</strong> Users see progress as audio generates</li>
+                    <li><strong>Lower perceived latency:</strong> The first audio chunk arrives before the full text has been synthesized</li>
+                    <li><strong>Cancellable:</strong> Stop generation mid-stream if needed</li>
+                    <li><strong>Efficient:</strong> Process chunks as they arrive</li>
+                </ul>
+                <h3 class="mt-4">Example: Streaming Audio Player</h3>
+                <div class="code-block">
+                    <pre><code>// Create a streaming audio player (startStreamingPlayback and finishPlayback are application-defined helpers)
+const audioChunks = [];
+let isPlaying = false;
+const streamingPlayer = await client.generateSpeech(longText, {
+    voice: 'nova',
+    format: 'mp3',
+    onChunk: (chunk) => {
+        // Store chunk
+        audioChunks.push(chunk.audioData);
+        // Start playing once three chunks are buffered
+        if (!isPlaying && audioChunks.length >= 3) {
+            startStreamingPlayback(audioChunks);
+            isPlaying = true;
+        }
+    },
+    onComplete: (result) => {
+        // Ensure all chunks are played
+        finishPlayback(result.audioData);
+    }
+});</code></pre>
+                </div>
+                <div class="alert alert-success mt-4">
+                    <h6><i class="fas fa-rocket me-2"></i>Try It Out!</h6>
+                    <p class="mb-0">
+                        Experience WebSocket streaming in action at the
+                        <a href="/websocket-demo" class="alert-link">WebSocket Demo</a> or enable streaming mode in the
+                        <a href="/playground" class="alert-link">Playground</a>.
+                    </p>
+                </div>
+            </section>
+        </div>
+    </div>
+</div>
+{% endblock %}
+{% block extra_js %}
+<script>
+    // Smooth scrolling for TOC links
+    document.querySelectorAll('.toc a').forEach(link => {
+        link.addEventListener('click', function(e) {
+            e.preventDefault();
+            const target = document.querySelector(this.getAttribute('href'));
+            if (target) {
+                target.scrollIntoView({ behavior: 'smooth' });
+                // Update active link
+                document.querySelectorAll('.toc a').forEach(l => l.classList.remove('active'));
+                this.classList.add('active');
+            }
+        });
+    });
+    // Highlight current section in TOC
+    window.addEventListener('scroll', function() {
+        const sections = document.querySelectorAll('section[id]');
+        const scrollPos = window.scrollY + 100;
+        sections.forEach(section => {
+            const top = section.offsetTop;
+            const bottom = top + section.offsetHeight;
+            const id = section.getAttribute('id');
+            const link = document.querySelector(`.toc a[href="#${id}"]`);
+            if (scrollPos >= top && scrollPos < bottom) {
+                document.querySelectorAll('.toc a').forEach(l => l.classList.remove('active'));
+                if (link) link.classList.add('active');
+            }
+        });
+    });
+</script>
+{% endblock %}

ttsfm-web/templates/index.html CHANGED Viewed

@@ -1,146 +1,156 @@
-{% extends "base.html" %}
-{% block title %}TTSFM - Free Text-to-Speech for Python{% endblock %}
-{% block content %}
-<!-- Hero Section -->
-<section class="hero-section">
-    <div class="container">
-        <div class="row align-items-center min-vh-75">
-            <div class="col-lg-8 mx-auto text-center">
-                <div class="hero-content">
-                    <div class="badge bg-primary text-white mb-3 px-3 py-2">
-                        <i class="fas fa-code me-2"></i>Python Package
-                    </div>
-                    <h1 class="display-4 fw-bold mb-4">
-                        Free Text-to-Speech for Python
-                    </h1>
-                    <p class="lead mb-4">
-                        Access free text-to-speech using openai.fm's service. No API keys required,
-                        just install and use immediately.
-                    </p>
-                    <div class="d-flex flex-wrap gap-3 justify-content-center">
-                        <a href="{{ url_for('playground') }}" class="btn btn-primary btn-lg">
-                            <i class="fas fa-play me-2"></i>Try Demo
-                        </a>
-                        <a href="{{ url_for('docs') }}" class="btn btn-outline-secondary btn-lg">
-                            <i class="fas fa-book me-2"></i>Documentation
-                        </a>
-                        <a href="https://github.com/dbccccccc/ttsfm" class="btn btn-outline-secondary btn-lg" target="_blank" rel="noopener noreferrer">
-                            <i class="fab fa-github me-2"></i>GitHub
-                        </a>
-                    </div>
-                </div>
-            </div>
-        </div>
-    </div>
-</section>
-<!-- Features Section -->
-<section class="py-5" style="background-color: #f8fafc;">
-    <div class="container">
-        <div class="row">
-            <div class="col-12 text-center mb-5">
-                <h2 class="fw-bold mb-4">Key Features</h2>
-                <p class="lead text-muted">
-                    Simple, free, and powerful text-to-speech for Python developers.
-                </p>
-            </div>
-        </div>
-        <div class="row g-4">
-            <div class="col-lg-4">
-                <div class="text-center">
-                    <div class="feature-icon text-white rounded-circle d-inline-flex align-items-center justify-content-center mb-3" style="width: 4rem; height: 4rem; background-color: #2563eb;">
-                        <i class="fas fa-key"></i>
-                    </div>
-                    <h5 class="fw-bold">No API Keys</h5>
-                    <p class="text-muted">Completely free service with no registration or API keys required.</p>
-                </div>
-            </div>
-            <div class="col-lg-4">
-                <div class="text-center">
-                    <div class="feature-icon text-white rounded-circle d-inline-flex align-items-center justify-content-center mb-3" style="width: 4rem; height: 4rem; background-color: #10b981;">
-                        <i class="fas fa-bolt"></i>
-                    </div>
-                    <h5 class="fw-bold">Easy to Use</h5>
-                    <p class="text-muted">Simple Python API with both sync and async support for all use cases.</p>
-                </div>
-            </div>
-            <div class="col-lg-4">
-                <div class="text-center">
-                    <div class="feature-icon text-white rounded-circle d-inline-flex align-items-center justify-content-center mb-3" style="width: 4rem; height: 4rem; background-color: #64748b;">
-                        <i class="fas fa-microphone-alt"></i>
-                    </div>
-                    <h5 class="fw-bold">Multiple Voices</h5>
-                    <p class="text-muted">Access to various voice options and audio formats for your needs.</p>
-                </div>
-            </div>
-        </div>
-    </div>
-</section>
-<!-- Quick Start Section -->
-<section class="py-5">
-    <div class="container">
-        <div class="row">
-            <div class="col-12 text-center mb-5">
-                <h2 class="fw-bold mb-4">Getting Started</h2>
-                <p class="lead text-muted">
-                    Install TTSFM and start generating speech with just a few lines of code.
-                </p>
-            </div>
-        </div>
-        <div class="row g-4">
-            <div class="col-lg-6">
-                <div class="card h-100">
-                    <div class="card-body">
-                        <h5 class="card-title">
-                            <i class="fas fa-download me-2 text-primary"></i>Installation
-                        </h5>
-                        <pre class="bg-light p-3 rounded"><code>pip install ttsfm</code></pre>
-                        <small class="text-muted">Requires Python 3.8+</small>
-                    </div>
-                </div>
-            </div>
-            <div class="col-lg-6">
-                <div class="card h-100">
-                    <div class="card-body">
-                        <h5 class="card-title">
-                            <i class="fas fa-play me-2 text-success"></i>Basic Usage
-                        </h5>
-                        <pre class="bg-light p-3 rounded"><code>from ttsfm import TTSClient
-client = TTSClient()
-response = client.generate_speech(
-    text="Hello, world!",
-    voice="alloy"
-)
-response.save_to_file("hello.wav")</code></pre>
-                        <small class="text-muted">No API keys required</small>
-                    </div>
-                </div>
-            </div>
-        </div>
-        <div class="row mt-4">
-            <div class="col-12 text-center">
-                <div class="d-flex justify-content-center gap-3 flex-wrap">
-                    <a href="{{ url_for('playground') }}" class="btn btn-primary">
-                        <i class="fas fa-play me-2"></i>Try Demo
-                    </a>
-                    <a href="{{ url_for('docs') }}" class="btn btn-outline-primary">
-                        <i class="fas fa-book me-2"></i>Documentation
-                    </a>
-                </div>
-            </div>
-        </div>
-    </div>
-</section>
-{% endblock %}

+{% extends "base.html" %}
+{% block title %}TTSFM - {{ _('home.title') }}{% endblock %}
+{% block content %}
+<!-- Hero Section -->
+<section class="hero-section">
+    <div class="container">
+        <div class="row align-items-center min-vh-75">
+            <div class="col-lg-8 mx-auto text-center">
+                <div class="hero-content">
+                    <div class="badge bg-primary text-white mb-3 px-3 py-2">
+                        <i class="fas fa-code me-2"></i>Python Package
+                    </div>
+                    <h1 class="display-4 fw-bold mb-4">
+                        {{ _('home.title') }}
+                    </h1>
+                    <p class="lead mb-4">
+                        {{ _('home.subtitle') }}
+                    </p>
+                    <div class="d-flex flex-wrap gap-3 justify-content-center">
+                        <a href="{{ url_for('playground') }}" class="btn btn-primary btn-lg">
+                            <i class="fas fa-play me-2"></i>{{ _('home.try_demo') }}
+                        </a>
+                        <a href="{{ url_for('docs') }}" class="btn btn-outline-secondary btn-lg">
+                            <i class="fas fa-book me-2"></i>{{ _('home.documentation') }}
+                        </a>
+                        <a href="https://github.com/dbccccccc/ttsfm" class="btn btn-outline-secondary btn-lg" target="_blank" rel="noopener noreferrer">
+                            <i class="fab fa-github me-2"></i>{{ _('home.github') }}
+                        </a>
+                    </div>
+                </div>
+            </div>
+        </div>
+    </div>
+</section>
+<!-- Features Section -->
+<section class="py-5" style="background-color: #f8fafc;">
+    <div class="container">
+        <div class="row">
+            <div class="col-12 text-center mb-5">
+                <h2 class="fw-bold mb-4">{{ _('home.features_title') }}</h2>
+                <p class="lead text-muted">
+                    {{ _('home.features_subtitle') }}
+                </p>
+            </div>
+        </div>
+        <div class="row g-4">
+            <div class="col-lg-3">
+                <div class="text-center">
+                    <div class="feature-icon text-white rounded-circle d-inline-flex align-items-center justify-content-center mb-3" style="width: 4rem; height: 4rem; background: linear-gradient(135deg, #4f46e5 0%, #6366f1 100%);">
+                        <i class="fas fa-key"></i>
+                    </div>
+                    <h5 class="fw-bold">{{ _('home.feature_free_title') }}</h5>
+                    <p class="text-muted">{{ _('home.feature_free_desc') }}</p>
+                </div>
+            </div>
+            <div class="col-lg-3">
+                <div class="text-center">
+                    <div class="feature-icon text-white rounded-circle d-inline-flex align-items-center justify-content-center mb-3" style="width: 4rem; height: 4rem; background: linear-gradient(135deg, #f59e0b 0%, #fbbf24 100%);">
+                        <i class="fas fa-magic"></i>
+                    </div>
+                    <h5 class="fw-bold">{{ _('home.feature_openai_title') }} <span class="badge bg-success ms-1">v3.2.3</span></h5>
+                    <p class="text-muted">{{ _('home.feature_openai_desc') }}</p>
+                </div>
+            </div>
+            <div class="col-lg-3">
+                <div class="text-center">
+                    <div class="feature-icon text-white rounded-circle d-inline-flex align-items-center justify-content-center mb-3" style="width: 4rem; height: 4rem; background: linear-gradient(135deg, #059669 0%, #10b981 100%);">
+                        <i class="fas fa-bolt"></i>
+                    </div>
+                    <h5 class="fw-bold">{{ _('home.feature_async_title') }}</h5>
+                    <p class="text-muted">{{ _('home.feature_async_desc') }}</p>
+                </div>
+            </div>
+            <div class="col-lg-3">
+                <div class="text-center">
+                    <div class="feature-icon text-white rounded-circle d-inline-flex align-items-center justify-content-center mb-3" style="width: 4rem; height: 4rem; background: linear-gradient(135deg, #6b7280 0%, #9ca3af 100%);">
+                        <i class="fas fa-microphone-alt"></i>
+                    </div>
+                    <h5 class="fw-bold">{{ _('home.feature_voices_title') }} & {{ _('home.feature_formats_title') }}</h5>
+                    <p class="text-muted">{{ _('home.feature_voices_desc') }} {{ _('home.feature_formats_desc') }}</p>
+                </div>
+            </div>
+        </div>
+    </div>
+</section>
+<!-- Quick Start Section -->
+<section class="py-5">
+    <div class="container">
+        <div class="row">
+            <div class="col-12 text-center mb-5">
+                <h2 class="fw-bold mb-4">{{ _('home.quick_start_title') }}</h2>
+                <p class="lead text-muted">
+                    {{ _('home.subtitle') }}
+                </p>
+            </div>
+        </div>
+        <div class="row g-4">
+            <div class="col-lg-6">
+                <div class="card h-100">
+                    <div class="card-body">
+                        <h5 class="card-title">
+                            <i class="fas fa-download me-2 text-primary"></i>{{ _('home.installation_title') }}
+                        </h5>
+                        <pre class="bg-light p-3 rounded"><code>{{ _('home.installation_code') }}</code></pre>
+                        <small class="text-muted">Requires Python 3.8+</small>
+                    </div>
+                </div>
+            </div>
+            <div class="col-lg-6">
+                <div class="card h-100">
+                    <div class="card-body">
+                        <h5 class="card-title">
+                            <i class="fas fa-play me-2 text-success"></i>{{ _('home.usage_title') }}
+                        </h5>
+                        <pre class="bg-light p-3 rounded"><code>from ttsfm import TTSClient, Voice, AudioFormat
+client = TTSClient()
+response = client.generate_speech(
+    text="Hello, world!",
+    voice=Voice.ALLOY,
+    response_format=AudioFormat.MP3
+)
+response.save_to_file("hello")</code></pre>
+                        <small class="text-muted">No API keys required</small>
+                    </div>
+                </div>
+            </div>
+        </div>
+        <div class="row mt-4">
+            <div class="col-12 text-center">
+                <div class="d-flex justify-content-center gap-3 flex-wrap">
+                    <a href="{{ url_for('playground') }}" class="btn btn-primary">
+                        <i class="fas fa-play me-2"></i>{{ _('home.try_demo') }}
+                    </a>
+                    <a href="{{ url_for('docs') }}" class="btn btn-outline-primary">
+                        <i class="fas fa-book me-2"></i>{{ _('home.documentation') }}
+                    </a>
+                </div>
+            </div>
+        </div>
+    </div>
+</section>
+{% endblock %}

ttsfm-web/templates/playground.html CHANGED

@@ -1,295 +1,317 @@
-{% extends "base.html" %}
-{% block title %}TTSFM Playground - Try Text-to-Speech{% endblock %}
-{% block content %}
-<!-- Clean Playground Header -->
-<section class="py-5" style="background-color: white; border-bottom: 1px solid #e5e7eb;">
-    <div class="container">
-        <div class="row align-items-center">
-            <div class="col-lg-8">
-                <div class="fade-in">
-                    <div class="badge bg-primary text-white mb-3 px-3 py-2">
-                        <i class="fas fa-flask me-2"></i>Demo
-                    </div>
-                    <h1 class="display-4 fw-bold mb-3 text-dark">
-                        <i class="fas fa-play-circle me-3 text-primary"></i>TTS Playground
-                    </h1>
-                    <p class="lead mb-4 text-muted">
-                        Test the TTSFM text-to-speech functionality with different voices and formats.
-                    </p>
-                </div>
-            </div>
-            <div class="col-lg-4 text-center">
-                <div class="playground-visual fade-in" style="animation-delay: 0.3s;">
-                    <div class="playground-icon">
-                        <i class="fas fa-waveform-lines text-primary"></i>
-                        <div class="pulse-ring"></div>
-                        <div class="pulse-ring pulse-ring-delay"></div>
-                    </div>
-                </div>
-            </div>
-        </div>
-    </div>
-</section>
-<div class="container py-5 playground">
-    <div class="row">
-        <div class="col-lg-10 mx-auto">
-            <div class="card shadow-lg-custom border-0 fade-in">
-                <div class="card-header bg-gradient-primary text-white">
-                    <h4 class="mb-0 d-flex align-items-center">
-                        <i class="fas fa-microphone me-2"></i>
-                        Text-to-Speech Generator
-                    </h4>
-                </div>
-                <div class="card-body p-4">
-                    <form id="tts-form">
-                        <!-- Enhanced Text Input -->
-                        <div class="mb-4">
-                            <label for="text-input" class="form-label fw-bold d-flex align-items-center">
-                                <i class="fas fa-edit me-2 text-primary"></i>
-                                Text to Convert
-                            </label>
-                            <div class="position-relative">
-                                <textarea
-                                    class="form-control shadow-sm"
-                                    id="text-input"
-                                    rows="4"
-                                    placeholder="Enter the text you want to convert to speech..."
-                                    required
-                                >Hello! This is a test of the TTSFM text-to-speech system.</textarea>
-                                <div class="position-absolute top-0 end-0 p-2">
-                                    <button type="button" class="btn btn-sm btn-outline-secondary" id="clear-text-btn" title="Clear text">
-                                        <i class="fas fa-times"></i>
-                                    </button>
-                                </div>
-                            </div>
-                            <div class="form-text d-flex justify-content-between align-items-center">
-                                <div class="d-flex align-items-center gap-3">
-                                    <span class="text-muted">
-                                        <i class="fas fa-keyboard me-1"></i>
-                                        <span id="char-count">0</span> characters
-                                    </span>
-                                    <span id="length-status" class=""></span>
-                                    <span class="text-muted small">
-                                        <i class="fas fa-lightbulb me-1"></i>
-                                        Tip: Use Ctrl+Enter to generate
-                                    </span>
-                                </div>
-                                <div class="btn-group" role="group">
-                                    <button type="button" class="btn btn-sm btn-outline-primary" id="validate-text-btn">
-                                        <i class="fas fa-check me-1"></i>Validate
-                                    </button>
-                                    <button type="button" class="btn btn-sm btn-outline-secondary" id="random-text-btn">
-                                        <i class="fas fa-dice me-1"></i>Random
-                                    </button>
-                                </div>
-                            </div>
-                            <div id="validation-result" class="mt-2 d-none"></div>
-                        </div>
-                        <div class="row">
-                            <!-- Enhanced Voice Selection -->
-                            <div class="col-md-6 mb-4">
-                                <label for="voice-select" class="form-label fw-bold d-flex align-items-center">
-                                    <i class="fas fa-microphone me-2 text-primary"></i>
-                                    Voice
-                                </label>
-                                <select class="form-select shadow-sm" id="voice-select" required>
-                                    <option value="">Loading voices...</option>
-                                </select>
-                                <div class="form-text">
-                                    <span>Choose from available voices</span>
-                                </div>
-                            </div>
-                            <!-- Enhanced Format Selection -->
-                            <div class="col-md-6 mb-4">
-                                <label for="format-select" class="form-label fw-bold d-flex align-items-center">
-                                    <i class="fas fa-file-audio me-2 text-primary"></i>
-                                    Audio Format
-                                </label>
-                                <select class="form-select shadow-sm" id="format-select" required>
-                                    <option value="">Loading formats...</option>
-                                </select>
-                                <div class="form-text">
-                                    <span>Select your preferred audio format</span>
-                                </div>
-                            </div>
-                        </div>
-                        <!-- Advanced Options -->
-                        <div class="row">
-                            <div class="col-md-6 mb-4">
-                                <label for="max-length-input" class="form-label fw-bold">
-                                    <i class="fas fa-ruler me-2"></i>Max Length
-                                </label>
-                                <input
-                                    type="number"
-                                    class="form-control"
-                                    id="max-length-input"
-                                    value="4096"
-                                    min="100"
-                                    max="10000"
-                                >
-                                <div class="form-text">
-                                    Maximum characters per request (default: 4096)
-                                </div>
-                            </div>
-                            <div class="col-md-6 mb-4">
-                                <label class="form-label fw-bold">
-                                    <i class="fas fa-cog me-2"></i>Options
-                                </label>
-                                <div class="form-check">
-                                    <input class="form-check-input" type="checkbox" id="validate-length-check" checked>
-                                    <label class="form-check-label" for="validate-length-check">
-                                        Enable length validation
-                                    </label>
-                                </div>
-                                <div class="form-check">
-                                    <input class="form-check-input" type="checkbox" id="auto-split-check">
-                                    <label class="form-check-label" for="auto-split-check">
-                                        Auto-split long text
-                                    </label>
-                                </div>
-                            </div>
-                        </div>
-                        <!-- Instructions (Optional) -->
-                        <div class="mb-4">
-                            <label for="instructions-input" class="form-label fw-bold">
-                                <i class="fas fa-magic me-2"></i>Instructions (Optional)
-                            </label>
-                            <input
-                                type="text"
-                                class="form-control"
-                                id="instructions-input"
-                                placeholder="e.g., Speak in a cheerful and upbeat tone"
-                            >
-                            <div class="form-text">
-                                Provide optional instructions for voice modulation
-                            </div>
-                        </div>
-                        <!-- Enhanced Generate Button -->
-                        <div class="text-center mb-4">
-                            <div class="d-grid gap-2 d-md-block">
-                                <button type="submit" class="btn btn-primary btn-lg px-4 py-3" id="generate-btn">
-                                    <span class="btn-text">
-                                        <i class="fas fa-magic me-2"></i>Generate Speech
-                                    </span>
-                                    <span class="loading-spinner">
-                                        <i class="fas fa-spinner fa-spin me-2"></i>Generating...
-                                    </span>
-                                </button>
-                                <button type="button" class="btn btn-outline-secondary btn-lg ms-md-3" id="reset-form-btn">
-                                    <i class="fas fa-redo me-2"></i>Reset
-                                </button>
-                            </div>
-                        </div>
-                    </form>
-                    <!-- Enhanced Audio Player -->
-                    <div id="audio-result" class="d-none">
-                        <div class="border-top pt-4 mt-4">
-                            <div class="d-flex align-items-center justify-content-between mb-3">
-                                <h5 class="mb-0 d-flex align-items-center">
-                                    <i class="fas fa-volume-up me-2 text-success"></i>
-                                    Generated Audio
-                                    <span class="badge bg-success ms-2">
-                                        <i class="fas fa-check me-1"></i>Ready
-                                    </span>
-                                </h5>
-                                <div class="btn-group" role="group">
-                                    <button type="button" class="btn btn-sm btn-outline-primary" id="replay-btn" title="Replay audio">
-                                        <i class="fas fa-redo"></i>
-                                    </button>
-                                    <button type="button" class="btn btn-sm btn-outline-secondary" id="share-btn" title="Share audio">
-                                        <i class="fas fa-share"></i>
-                                    </button>
-                                </div>
-                            </div>
-                            <div class="audio-player-container bg-light rounded p-3 mb-3">
-                                <audio controls class="audio-player w-100" id="audio-player" preload="metadata">
-                                    Your browser does not support the audio element.
-                                </audio>
-                                <div class="audio-controls mt-2 d-flex justify-content-between align-items-center">
-                                    <div class="audio-info">
-                                        <span id="audio-info" class="text-muted small"></span>
-                                    </div>
-                                    <div class="audio-actions">
-                                        <button type="button" class="btn btn-success btn-sm" id="download-btn">
-                                            <i class="fas fa-download me-1"></i>Download
-                                        </button>
-                                    </div>
-                                </div>
-                            </div>
-                            <div class="audio-stats row text-center">
-                                <div class="col-md-3 col-6">
-                                    <div class="stat-item">
-                                        <i class="fas fa-clock text-primary"></i>
-                                        <div class="stat-value" id="audio-duration">--</div>
-                                        <div class="stat-label">Duration</div>
-                                    </div>
-                                </div>
-                                <div class="col-md-3 col-6">
-                                    <div class="stat-item">
-                                        <i class="fas fa-file text-info"></i>
-                                        <div class="stat-value" id="audio-size">--</div>
-                                        <div class="stat-label">File Size</div>
-                                    </div>
-                                </div>
-                                <div class="col-md-3 col-6">
-                                    <div class="stat-item">
-                                        <i class="fas fa-microphone text-warning"></i>
-                                        <div class="stat-value" id="audio-voice">--</div>
-                                        <div class="stat-label">Voice</div>
-                                    </div>
-                                </div>
-                                <div class="col-md-3 col-6">
-                                    <div class="stat-item">
-                                        <i class="fas fa-music text-success"></i>
-                                        <div class="stat-value" id="audio-format">--</div>
-                                        <div class="stat-label">Format</div>
-                                    </div>
-                                </div>
-                            </div>
-                        </div>
-                    </div>
-                    <!-- Batch Results -->
-                    <div id="batch-result" class="d-none">
-                        <hr>
-                        <h5 class="mb-3">
-                            <i class="fas fa-layer-group me-2"></i>Batch Processing Results
-                        </h5>
-                        <div class="alert alert-info" id="batch-summary"></div>
-                        <div id="batch-chunks" class="row g-3"></div>
-                        <div class="mt-3">
-                            <button type="button" class="btn btn-outline-primary" id="download-all-btn">
-                                <i class="fas fa-download me-2"></i>Download All Audio Files
-                            </button>
-                        </div>
-                    </div>
-                </div>
-            </div>
-        </div>
-    </div>
-</div>
-{% endblock %}
-{% block extra_js %}
-<!-- Playground JavaScript -->
-<script src="{{ url_for('static', filename='js/playground.js') }}"></script>
-<script>
-    // Additional playground-specific functionality
-    console.log('TTSFM Playground loaded successfully!');
-</script>
-{% endblock %}

+{% extends "base.html" %}
+{% block title %}TTSFM {{ _('nav.playground') }} - {{ _('playground.title') }}{% endblock %}
+{% block content %}
+<!-- Clean Playground Header -->
+<section class="py-5" style="background-color: white; border-bottom: 1px solid #e5e7eb;">
+    <div class="container">
+        <div class="row align-items-center">
+            <div class="col-lg-8">
+                <div class="fade-in">
+                    <div class="badge bg-primary text-white mb-3 px-3 py-2">
+                        <i class="fas fa-flask me-2"></i>Demo
+                    </div>
+                    <h1 class="display-4 fw-bold mb-3 text-dark">
+                        <i class="fas fa-play-circle me-3 text-primary"></i>{{ _('playground.title') }}
+                    </h1>
+                    <p class="lead mb-4 text-muted">
+                        {{ _('playground.subtitle') }}
+                    </p>
+                </div>
+            </div>
+            <div class="col-lg-4 text-center">
+                <div class="playground-visual fade-in" style="animation-delay: 0.3s;">
+                    <div class="playground-icon">
+                        <i class="fas fa-waveform-lines text-primary"></i>
+                        <div class="pulse-ring"></div>
+                        <div class="pulse-ring pulse-ring-delay"></div>
+                    </div>
+                </div>
+            </div>
+        </div>
+    </div>
+</section>
+<div class="container py-5 playground">
+    <div class="row">
+        <div class="col-lg-10 mx-auto">
+            <div class="card shadow-lg-custom border-0 fade-in">
+                <div class="card-header bg-gradient-primary text-white">
+                    <h4 class="mb-0 d-flex align-items-center">
+                        <i class="fas fa-microphone me-2"></i>
+                        {{ _('playground.title') }}
+                    </h4>
+                </div>
+                <div class="card-body p-4">
+                    <form id="tts-form" onsubmit="return false;">
+                        <!-- Enhanced Text Input -->
+                        <div class="mb-4">
+                            <label for="text-input" class="form-label fw-bold d-flex align-items-center">
+                                <i class="fas fa-edit me-2 text-primary"></i>
+                                {{ _('playground.text_input_label') }}
+                            </label>
+                            <div class="position-relative">
+                                <textarea
+                                    class="form-control shadow-sm"
+                                    id="text-input"
+                                    rows="4"
+                                    placeholder="{{ _('playground.text_input_placeholder') }}"
+                                    required
+                                >Hello! This is a test of the TTSFM text-to-speech system.</textarea>
+                                <div class="position-absolute top-0 end-0 p-2">
+                                    <button type="button" class="btn btn-sm btn-outline-secondary" id="clear-text-btn" title="Clear text">
+                                        <i class="fas fa-times"></i>
+                                    </button>
+                                </div>
+                            </div>
+                            <div class="form-text d-flex justify-content-between align-items-center">
+                                <div class="d-flex align-items-center gap-3">
+                                    <span class="text-muted">
+                                        <i class="fas fa-keyboard me-1"></i>
+                                        <span id="char-count">0</span> {{ _('playground.character_count') }}
+                                    </span>
+                                    <span id="length-status" class=""></span>
+                                    <span id="auto-combine-status" class="badge bg-success d-none">
+                                        <i class="fas fa-magic me-1"></i>{{ _('playground.max_length_warning') }}
+                                    </span>
+                                    <span class="text-muted small">
+                                        <i class="fas fa-lightbulb me-1"></i>
+                                        Tip: Use Ctrl+Enter to generate
+                                    </span>
+                                </div>
+                                <div class="btn-group" role="group">
+                                    <button type="button" class="btn btn-sm btn-outline-primary" id="validate-text-btn">
+                                        <i class="fas fa-check me-1"></i>{{ _('common.validate') if _('common.validate') != 'common.validate' else 'Validate' }}
+                                    </button>
+                                    <button type="button" class="btn btn-sm btn-outline-secondary" id="random-text-btn">
+                                        <i class="fas fa-dice me-1"></i>{{ _('playground.random_text') }}
+                                    </button>
+                                </div>
+                            </div>
+                            <div id="validation-result" class="mt-2 d-none"></div>
+                        </div>
+                        <div class="row">
+                            <!-- Enhanced Voice Selection -->
+                            <div class="col-md-6 mb-4">
+                                <label for="voice-select" class="form-label fw-bold d-flex align-items-center">
+                                    <i class="fas fa-microphone me-2 text-primary"></i>
+                                    {{ _('playground.voice_label') }}
+                                </label>
+                                <select class="form-select shadow-sm" id="voice-select" required>
+                                    <option value="">{{ _('common.loading_voices') }}</option>
+                                </select>
+                                <div class="form-text">
+                                    <span>{{ _('common.choose_voice') }}</span>
+                                </div>
+                            </div>
+                            <!-- Enhanced Format Selection -->
+                            <div class="col-md-6 mb-4">
+                                <label for="format-select" class="form-label fw-bold d-flex align-items-center">
+                                    <i class="fas fa-file-audio me-2 text-primary"></i>
+                                    {{ _('playground.format_label') }}
+                                </label>
+                                <select class="form-select shadow-sm" id="format-select" required>
+                                    <option value="">{{ _('common.loading_formats') }}</option>
+                                </select>
+                                <div class="form-text">
+                                    <span>{{ _('common.select_format') }}</span>
+                                </div>
+                            </div>
+                        </div>
+                        <!-- Advanced Options -->
+                        <div class="row">
+                            <div class="col-md-6 mb-4">
+                                <label for="max-length-input" class="form-label fw-bold">
+                                    <i class="fas fa-ruler me-2"></i>{{ _('common.max_length') }}
+                                </label>
+                                <input
+                                    type="number"
+                                    class="form-control"
+                                    id="max-length-input"
+                                    value="4096"
+                                    min="100"
+                                    max="10000"
+                                >
+                                <div class="form-text">
+                                    {{ _('playground.max_length_description') }}
+                                </div>
+                            </div>
+                            <div class="col-md-6 mb-4">
+                                <label class="form-label fw-bold">
+                                    <i class="fas fa-cog me-2"></i>{{ _('common.options') }}
+                                </label>
+                                <div class="form-check">
+                                    <input class="form-check-input" type="checkbox" id="validate-length-check" checked>
+                                    <label class="form-check-label" for="validate-length-check">
+                                        {{ _('playground.enable_length_validation') }}
+                                    </label>
+                                </div>
+                                <div class="form-check">
+                                    <input class="form-check-input" type="checkbox" id="auto-combine-check" checked>
+                                    <label class="form-check-label" for="auto-combine-check">
+                                        <span class="fw-bold text-primary">{{ _('playground.auto_combine_long_text') }}</span>
+                                        <i class="fas fa-info-circle ms-1" data-bs-toggle="tooltip"
+                                           title="{{ _('playground.auto_combine_tooltip') }}"></i>
+                                    </label>
+                                    <div class="form-text small">
+                                        <i class="fas fa-magic me-1"></i>
+                                        {{ _('playground.auto_combine_description') }}
+                                    </div>
+                                </div>
+                            </div>
+                        </div>
+                        <!-- Instructions (Optional) -->
+                        <div class="mb-4">
+                            <label for="instructions-input" class="form-label fw-bold">
+                                <i class="fas fa-magic me-2"></i>{{ _('playground.instructions_label') }}
+                            </label>
+                            <input
+                                type="text"
+                                class="form-control"
+                                id="instructions-input"
+                                placeholder="{{ _('playground.instructions_placeholder') }}"
+                            >
+                            <div class="form-text">
+                                {{ _('playground.instructions_description') }}
+                            </div>
+                        </div>
+                        <!-- API Key (Optional) -->
+                        <div class="mb-4" id="api-key-section">
+                            <label for="api-key-input" class="form-label fw-bold">
+                                <i class="fas fa-key me-2"></i>{{ _('playground.api_key_optional') }}
+                            </label>
+                            <div class="input-group">
+                                <input
+                                    type="password"
+                                    class="form-control"
+                                    id="api-key-input"
+                                    placeholder="{{ _('playground.api_key_placeholder') }}"
+                                >
+                                <button class="btn btn-outline-secondary" type="button" id="toggle-api-key-visibility">
+                                    <i class="fas fa-eye" id="api-key-eye-icon"></i>
+                                </button>
+                            </div>
+                            <div class="form-text">
+                                <i class="fas fa-info-circle me-1"></i>
+                                {{ _('playground.api_key_description') }}
+                            </div>
+                        </div>
+                        <!-- Enhanced Generate Button -->
+                        <div class="text-center mb-4">
+                            <div class="d-grid gap-2 d-md-block">
+                                <button type="submit" class="btn btn-primary btn-lg px-4 py-3" id="generate-btn">
+                                    <span class="btn-text">
+                                        <i class="fas fa-magic me-2"></i>{{ _('playground.generate_speech') }}
+                                    </span>
+                                    <span class="loading-spinner">
+                                        <i class="fas fa-spinner fa-spin me-2"></i>{{ _('playground.generating') }}
+                                    </span>
+                                </button>
+                                <button type="button" class="btn btn-outline-secondary btn-lg ms-md-3" id="reset-form-btn">
+                                    <i class="fas fa-redo me-2"></i>{{ _('common.reset') }}
+                                </button>
+                            </div>
+                        </div>
+                    </form>
+                    <!-- Enhanced Audio Player -->
+                    <div id="audio-result" class="d-none">
+                        <div class="border-top pt-4 mt-4">
+                            <div class="d-flex align-items-center justify-content-between mb-3">
+                                <h5 class="mb-0 d-flex align-items-center">
+                                    <i class="fas fa-volume-up me-2 text-success"></i>
+                                    {{ _('playground.audio_player_title') }}
+                                    <span class="badge bg-success ms-2">
+                                        <i class="fas fa-check me-1"></i>Ready
+                                    </span>
+                                </h5>
+                                <div class="btn-group" role="group">
+                                    <button type="button" class="btn btn-sm btn-outline-primary" id="replay-btn" title="Replay audio">
+                                        <i class="fas fa-redo"></i>
+                                    </button>
+                                    <button type="button" class="btn btn-sm btn-outline-secondary" id="share-btn" title="Share audio">
+                                        <i class="fas fa-share"></i>
+                                    </button>
+                                </div>
+                            </div>
+                            <div class="audio-player-container bg-light rounded p-3 mb-3">
+                                <audio controls class="audio-player w-100" id="audio-player" preload="metadata">
+                                    Your browser does not support the audio element.
+                                </audio>
+                                <div class="audio-controls mt-2 d-flex justify-content-between align-items-center">
+                                    <div class="audio-info">
+                                        <span id="audio-info" class="text-muted small"></span>
+                                    </div>
+                                    <div class="audio-actions">
+                                        <button type="button" class="btn btn-success btn-sm" id="download-btn">
+                                            <i class="fas fa-download me-1"></i>{{ _('playground.download_audio') }}
+                                        </button>
+                                    </div>
+                                </div>
+                            </div>
+                            <div class="audio-stats row text-center">
+                                <div class="col-md-3 col-6">
+                                    <div class="stat-item">
+                                        <i class="fas fa-clock text-primary"></i>
+                                        <div class="stat-value" id="audio-duration">--</div>
+                                        <div class="stat-label">{{ _('playground.duration') }}</div>
+                                    </div>
+                                </div>
+                                <div class="col-md-3 col-6">
+                                    <div class="stat-item">
+                                        <i class="fas fa-file text-info"></i>
+                                        <div class="stat-value" id="audio-size">--</div>
+                                        <div class="stat-label">{{ _('playground.file_size') }}</div>
+                                    </div>
+                                </div>
+                                <div class="col-md-3 col-6">
+                                    <div class="stat-item">
+                                        <i class="fas fa-microphone text-warning"></i>
+                                        <div class="stat-value" id="audio-voice">--</div>
+                                        <div class="stat-label">{{ _('playground.voice') }}</div>
+                                    </div>
+                                </div>
+                                <div class="col-md-3 col-6">
+                                    <div class="stat-item">
+                                        <i class="fas fa-music text-success"></i>
+                                        <div class="stat-value" id="audio-format">--</div>
+                                        <div class="stat-label">{{ _('playground.format') }}</div>
+                                    </div>
+                                </div>
+                            </div>
+                        </div>
+                    </div>
+                </div>
+            </div>
+        </div>
+    </div>
+</div>
+{% endblock %}
+{% block extra_js %}
+<!-- Socket.IO for WebSocket support -->
+<script src="https://cdn.socket.io/4.6.0/socket.io.min.js"></script>
+<!-- WebSocket TTS Client -->
+<script src="{{ url_for('static', filename='js/websocket-tts.js') }}"></script>
+<!-- Enhanced Playground JavaScript with WebSocket Support -->
+<script src="{{ url_for('static', filename='js/playground-enhanced-fixed.js') }}"></script>
+<script>
+    // Additional playground-specific functionality
+    console.log('TTSFM Enhanced Playground with WebSocket support loaded successfully!');
+</script>
+{% endblock %}
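
The playground options in this template (a `max-length-input` that defaults to 4096 and an `auto-combine-check` checkbox) imply that long text is split into pieces no longer than the limit before synthesis. The commit's own splitting logic is not shown in this template, so the following is only a minimal JavaScript sketch of one possible approach. `splitText` is a hypothetical name, and preferring sentence boundaries first and word breaks second is an assumption, not the project's documented behavior.

```javascript
// Hypothetical helper (not part of the commit): split text into chunks of at
// most maxLength characters, preferring to cut after sentence punctuation,
// then at a space, and only as a last resort in the middle of a word.
function splitText(text, maxLength = 4096) {
    if (!Number.isInteger(maxLength) || maxLength < 1) {
        throw new RangeError('maxLength must be a positive integer');
    }
    const chunks = [];
    let rest = text.trim();
    while (rest.length > maxLength) {
        const head = rest.slice(0, maxLength);
        // Index just after the last sentence-ending punctuation in the head.
        let cut = Math.max(
            head.lastIndexOf('. '),
            head.lastIndexOf('! '),
            head.lastIndexOf('? ')
        ) + 1;
        if (cut <= 0) cut = head.lastIndexOf(' '); // fall back to a word break
        if (cut <= 0) cut = maxLength;             // no break point: hard cut
        chunks.push(rest.slice(0, cut).trim());
        rest = rest.slice(cut).trim();
    }
    if (rest.length > 0) chunks.push(rest);
    return chunks;
}
```

For example, `splitText('One. Two. Three.', 10)` yields `['One. Two.', 'Three.']`, and text without any spaces falls back to fixed-width cuts.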

ttsfm-web/templates/websocket_demo.html ADDED

	@@ -0,0 +1,390 @@

+{% extends "base.html" %}
+{% block title %}{{ _('websocket.title', 'WebSocket Streaming Demo') }} - TTSFM{% endblock %}
+{% block content %}
+<div class="container mt-5">
+    <div class="row">
+        <div class="col-lg-10 mx-auto">
+            <h1 class="text-center mb-4">
+                <i class="fas fa-bolt text-warning"></i>
+                {{ _('websocket.title', 'WebSocket Streaming Demo') }}
+            </h1>
+            <!-- Connection Status -->
+            <div class="alert alert-info" id="connection-status">
+                <i class="fas fa-plug me-2"></i>
+                <span id="status-text">Connecting to WebSocket server...</span>
+            </div>
+            <!-- Input Form -->
+            <div class="card shadow-sm mb-4">
+                <div class="card-body">
+                    <h5 class="card-title">{{ _('playground.generate_speech', 'Generate Speech') }}</h5>
+                    <form id="streaming-form">
+                        <div class="mb-3">
+                            <label for="text-input" class="form-label">
+                                {{ _('playground.text_input', 'Text to Convert') }}
+                            </label>
+                            <textarea
+                                class="form-control"
+                                id="text-input"
+                                rows="4"
+                                maxlength="4096"
+                                placeholder="{{ _('playground.text_placeholder', 'Enter your text here...') }}"
+                            >Experience the future of text-to-speech with real-time WebSocket streaming! This innovative feature delivers audio chunks as they're generated, providing a more responsive and engaging user experience.</textarea>
+                            <div class="form-text">
+                                <i class="fas fa-info-circle me-1"></i>
+                                Streaming will split text into chunks for real-time delivery
+                            </div>
+                        </div>
+                        <div class="row">
+                            <div class="col-md-6 mb-3">
+                                <label for="voice-select" class="form-label">
+                                    {{ _('playground.voice', 'Voice') }}
+                                </label>
+                                <select class="form-select" id="voice-select">
+                                    <option value="alloy">Alloy</option>
+                                    <option value="echo">Echo</option>
+                                    <option value="fable">Fable</option>
+                                    <option value="onyx">Onyx</option>
+                                    <option value="nova">Nova</option>
+                                    <option value="shimmer">Shimmer</option>
+                                </select>
+                            </div>
+                            <div class="col-md-6 mb-3">
+                                <label for="format-select" class="form-label">
+                                    {{ _('playground.format', 'Audio Format') }}
+                                </label>
+                                <select class="form-select" id="format-select">
+                                    <option value="mp3">MP3</option>
+                                    <option value="wav">WAV</option>
+                                    <option value="opus">OPUS</option>
+                                </select>
+                            </div>
+                        </div>
+                        <div class="d-grid gap-2 d-md-flex justify-content-md-end">
+                            <button type="submit" class="btn btn-primary" id="stream-btn">
+                                <i class="fas fa-bolt me-2"></i>
+                                Start Streaming
+                            </button>
+                            <button type="button" class="btn btn-danger" id="cancel-btn" style="display: none;">
+                                <i class="fas fa-stop me-2"></i>
+                                Cancel
+                            </button>
+                        </div>
+                    </form>
+                </div>
+            </div>
+            <!-- Progress Section -->
+            <div class="card shadow-sm mb-4" id="progress-section" style="display: none;">
+                <div class="card-body">
+                    <h5 class="card-title">Streaming Progress</h5>
+                    <div class="progress mb-3" style="height: 25px;">
+                        <div
+                            class="progress-bar progress-bar-striped progress-bar-animated"
+                            id="progress-bar"
+                            role="progressbar"
+                            style="width: 0%"
+                        >
+                            <span id="progress-text">0%</span>
+                        </div>
+                    </div>
+                    <div class="row text-center">
+                        <div class="col-md-4">
+                            <h6>Chunks Received</h6>
+                            <p class="h4"><span id="chunks-received">0</span> / <span id="total-chunks">0</span></p>
+                        </div>
+                        <div class="col-md-4">
+                            <h6>Data Transferred</h6>
+                            <p class="h4" id="data-transferred">0 KB</p>
+                        </div>
+                        <div class="col-md-4">
+                            <h6>Generation Time</h6>
+                            <p class="h4" id="generation-time">0.0s</p>
+                        </div>
+                    </div>
+                </div>
+            </div>
+            <!-- Audio Chunks Display -->
+            <div class="card shadow-sm mb-4" id="chunks-section" style="display: none;">
+                <div class="card-body">
+                    <h5 class="card-title">Audio Chunks</h5>
+                    <div id="chunks-container" class="row g-2">
+                        <!-- Chunks will be added here dynamically -->
+                    </div>
+                </div>
+            </div>
+            <!-- Final Audio Player -->
+            <div class="card shadow-sm" id="audio-section" style="display: none;">
+                <div class="card-body">
+                    <h5 class="card-title">Generated Audio</h5>
+                    <audio id="audio-player" controls class="w-100"></audio>
+                    <div class="mt-2">
+                        <button class="btn btn-success" id="download-btn">
+                            <i class="fas fa-download me-2"></i>
+                            Download Audio
+                        </button>
+                    </div>
+                </div>
+            </div>
+            <!-- Info Section -->
+            <div class="card shadow-sm mt-4">
+                <div class="card-body">
+                    <h5 class="card-title">
+                        <i class="fas fa-info-circle text-info me-2"></i>
+                        About WebSocket Streaming
+                    </h5>
+                    <p>
+                        This demo showcases real-time audio streaming using WebSockets. Instead of waiting
+                        for the entire audio to be generated, you receive chunks as they're processed,
+                        providing immediate feedback and a more responsive experience.
+                    </p>
+                    <ul>
+                        <li><strong>Lower Perceived Latency:</strong> Start receiving audio before generation completes</li>
+                        <li><strong>Progress Tracking:</strong> Real-time updates on generation progress</li>
+                        <li><strong>Cancellable:</strong> Stop generation mid-stream if needed</li>
+                        <li><strong>Efficient:</strong> Stream chunks as they're ready, no waiting</li>
+                    </ul>
+                </div>
+            </div>
+        </div>
+    </div>
+</div>
+<!-- Include Socket.IO -->
+<script src="https://cdn.socket.io/4.6.0/socket.io.min.js"></script>
+<!-- Include our WebSocket client -->
+<script src="{{ url_for('static', filename='js/websocket-tts.js') }}"></script>
+<script>
+// Initialize WebSocket client
+let wsClient = null;
+let currentRequestId = null;
+let startTime = null;
+// Initialize on page load
+document.addEventListener('DOMContentLoaded', function() {
+    // Create WebSocket client
+    wsClient = new WebSocketTTSClient({
+        debug: true,
+        onConnect: () => {
+            updateConnectionStatus('connected');
+        },
+        onDisconnect: () => {
+            updateConnectionStatus('disconnected');
+        },
+        onError: (error) => {
+            updateConnectionStatus('error');
+            showError(`Connection error: ${error.message}`);
+        }
+    });
+    // Form submission
+    document.getElementById('streaming-form').addEventListener('submit', handleStreamingSubmit);
+    // Cancel button
+    document.getElementById('cancel-btn').addEventListener('click', handleCancel);
+});
+function updateConnectionStatus(status) {
+    const statusEl = document.getElementById('connection-status');
+    const statusText = document.getElementById('status-text');
+    statusEl.className = 'alert';
+    switch(status) {
+        case 'connected':
+            statusEl.classList.add('alert-success');
+            statusText.innerHTML = '<i class="fas fa-check-circle me-2"></i>Connected to WebSocket server';
+            break;
+        case 'disconnected':
+            statusEl.classList.add('alert-warning');
+            statusText.innerHTML = '<i class="fas fa-exclamation-triangle me-2"></i>Disconnected from server';
+            break;
+        case 'error':
+            statusEl.classList.add('alert-danger');
+            statusText.innerHTML = '<i class="fas fa-times-circle me-2"></i>Connection error';
+            break;
+        default:
+            statusEl.classList.add('alert-info');
+            statusText.innerHTML = '<i class="fas fa-plug me-2"></i>Connecting...';
+    }
+}
+async function handleStreamingSubmit(e) {
+    e.preventDefault();
+    if (!wsClient || !wsClient.isConnected()) {
+        showError('WebSocket not connected. Please refresh the page.');
+        return;
+    }
+    // Get form values
+    const text = document.getElementById('text-input').value.trim();
+    const voice = document.getElementById('voice-select').value;
+    const format = document.getElementById('format-select').value;
+    if (!text) {
+        showError('Please enter some text to convert.');
+        return;
+    }
+    // Reset UI
+    resetUI();
+    // Show progress section
+    document.getElementById('progress-section').style.display = 'block';
+    document.getElementById('chunks-section').style.display = 'block';
+    document.getElementById('stream-btn').disabled = true;
+    document.getElementById('cancel-btn').style.display = 'inline-block';
+    startTime = Date.now();
+    try {
+        const result = await wsClient.generateSpeech(text, {
+            voice: voice,
+            format: format,
+            chunkSize: 512, // Smaller chunks for more updates
+            onStart: (data) => {
+                currentRequestId = data.request_id;
+                console.log('Stream started:', data);
+            },
+            onProgress: (progress) => {
+                updateProgress(progress);
+            },
+            onChunk: (chunk) => {
+                handleAudioChunk(chunk);
+            },
+            onComplete: (result) => {
+                handleStreamComplete(result);
+            },
+            onError: (error) => {
+                showError(`Streaming error: ${error.message}`);
+            }
+        });
+        console.log('Streaming completed:', result);
+    } catch (error) {
+        showError(`Failed to generate speech: ${error.message}`);
+        resetUI();
+    }
+}
+function updateProgress(progress) {
+    const progressBar = document.getElementById('progress-bar');
+    const progressText = document.getElementById('progress-text');
+    const chunksReceived = document.getElementById('chunks-received');
+    const totalChunks = document.getElementById('total-chunks');
+    const generationTime = document.getElementById('generation-time');
+    progressBar.style.width = `${progress.progress}%`;
+    progressText.textContent = `${progress.progress}%`;
+    chunksReceived.textContent = progress.chunksCompleted;
+    totalChunks.textContent = progress.totalChunks;
+    if (startTime) {
+        const elapsed = (Date.now() - startTime) / 1000;
+        generationTime.textContent = `${elapsed.toFixed(1)}s`;
+    }
+}
+function handleAudioChunk(chunk) {
+    const container = document.getElementById('chunks-container');
+    // Create chunk visualization
+    const chunkEl = document.createElement('div');
+    chunkEl.className = 'col-auto';
+    chunkEl.innerHTML = `
+        <div class="badge bg-primary p-2" title="Chunk ${chunk.chunkIndex + 1}">
+            <i class="fas fa-music me-1"></i>
+            ${chunk.chunkIndex + 1}
+            <small class="d-block">${(chunk.audioData.byteLength / 1024).toFixed(1)}KB</small>
+        </div>
+    `;
+    container.appendChild(chunkEl);
+    // Update data transferred
+    const currentData = parseFloat(document.getElementById('data-transferred').textContent);
+    const newData = currentData + (chunk.audioData.byteLength / 1024);
+    document.getElementById('data-transferred').textContent = `${newData.toFixed(1)} KB`;
+}
+function handleStreamComplete(result) {
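+    // Create blob from combined audio ('mp3' maps to the registered 'audio/mpeg' type)
+    const mimeType = result.format === 'mp3' ? 'audio/mpeg' : `audio/${result.format}`;
+    const blob = new Blob([result.audioData], { type: mimeType });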
+    const url = URL.createObjectURL(blob);
+    // Set up audio player
+    const audioPlayer = document.getElementById('audio-player');
+    audioPlayer.src = url;
+    // Show audio section
+    document.getElementById('audio-section').style.display = 'block';
+    // Set up download button
+    document.getElementById('download-btn').onclick = () => {
+        const a = document.createElement('a');
+        a.href = url;
+        a.download = `tts_stream_${Date.now()}.${result.format}`;
+        a.click();
+    };
+    // Update final stats
+    document.getElementById('generation-time').textContent = `${(result.generationTime / 1000).toFixed(2)}s`;
+    // Reset buttons
+    document.getElementById('stream-btn').disabled = false;
+    document.getElementById('cancel-btn').style.display = 'none';
+    // Update progress bar to success
+    const progressBar = document.getElementById('progress-bar');
+    progressBar.classList.remove('progress-bar-animated');
+    progressBar.classList.add('bg-success');
+}
+function handleCancel() {
+    if (currentRequestId) {
+        wsClient.cancelStream(currentRequestId);
+        showInfo('Stream cancelled');
+        resetUI();
+    }
+}
+function resetUI() {
+    document.getElementById('progress-section').style.display = 'none';
+    document.getElementById('chunks-section').style.display = 'none';
+    document.getElementById('audio-section').style.display = 'none';
+    document.getElementById('stream-btn').disabled = false;
+    document.getElementById('cancel-btn').style.display = 'none';
+    document.getElementById('chunks-container').innerHTML = '';
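+    document.getElementById('progress-bar').style.width = '0%';
+    document.getElementById('progress-bar').className = 'progress-bar progress-bar-striped progress-bar-animated';
+    // Also clear the counters so a new run does not show stale values
+    document.getElementById('progress-text').textContent = '0%';
+    document.getElementById('chunks-received').textContent = '0';
+    document.getElementById('total-chunks').textContent = '0';
+    document.getElementById('data-transferred').textContent = '0 KB';
+    document.getElementById('generation-time').textContent = '0.0s';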
+    currentRequestId = null;
+    startTime = null;
+}
+function showError(message) {
+    console.error(message);
+    // TODO: surface the message in the page (e.g. a toast); for now it is only logged
+}
+function showInfo(message) {
+    console.info(message);
+    // TODO: surface the message in the page (e.g. a toast); for now it is only logged
+}
+</script>
+{% endblock %}
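
The demo's `handleStreamComplete` derives the Blob's MIME type from `result.format`. Format names and MIME subtypes do not always match (`mp3` is registered as `audio/mpeg`), so a small lookup table may be more robust than string interpolation. The sketch below is an illustration only: `mimeTypeFor` and `AUDIO_MIME_TYPES` are hypothetical names that do not appear in the commit, and the `opus` entry assumes the server returns Opus in an Ogg container.

```javascript
// Hypothetical lookup table (not part of the commit).
const AUDIO_MIME_TYPES = new Map([
    ['mp3', 'audio/mpeg'],
    ['wav', 'audio/wav'],
    ['opus', 'audio/ogg; codecs=opus'], // assumption: Ogg-contained Opus
    ['aac', 'audio/aac'],
    ['flac', 'audio/flac'],
]);

// Return a MIME type for a TTS format name, falling back to "audio/<format>"
// for names the table does not know about.
function mimeTypeFor(format) {
    const key = String(format).toLowerCase();
    return AUDIO_MIME_TYPES.get(key) ?? `audio/${key}`;
}
```

With this helper the Blob could be created as `new Blob([result.audioData], { type: mimeTypeFor(result.format) })`.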

ttsfm-web/translations/en.json ADDED

	@@ -0,0 +1,224 @@

+{
+  "nav": {
+    "home": "Home",
+    "playground": "Playground",
+    "documentation": "Documentation",
+    "github": "GitHub",
+    "status_checking": "Checking...",
+    "status_online": "Online",
+    "status_offline": "Offline"
+  },
+  "common": {
+    "loading": "Loading...",
+    "error": "Error",
+    "success": "Success",
+    "warning": "Warning",
+    "info": "Info",
+    "close": "Close",
+    "save": "Save",
+    "cancel": "Cancel",
+    "confirm": "Confirm",
+    "download": "Download",
+    "upload": "Upload",
+    "generate": "Generate",
+    "play": "Play",
+    "stop": "Stop",
+    "pause": "Pause",
+    "resume": "Resume",
+    "clear": "Clear",
+    "reset": "Reset",
+    "copy": "Copy",
+    "copied": "Copied!",
+    "language": "Language",
+    "english": "English",
+    "chinese": "中文",
+    "validate": "Validate",
+    "options": "Options",
+    "max_length": "Max Length",
+    "tip": "Tip",
+    "choose_voice": "Choose from available voices",
+    "select_format": "Select your preferred audio format",
+    "loading_voices": "Loading voices...",
+    "loading_formats": "Loading formats...",
+    "ctrl_enter_tip": "Use Ctrl+Enter to generate",
+    "auto_combine_enabled": "Auto-combine enabled"
+  },
+  "home": {
+    "title": "Free Text-to-Speech for Python",
+    "subtitle": "Generate high-quality speech from text using the free openai.fm service. No API keys, no registration - just install and start creating audio.",
+    "try_demo": "Try Demo",
+    "documentation": "Documentation",
+    "github": "GitHub",
+    "features_title": "Key Features",
+    "features_subtitle": "Simple, free, and powerful text-to-speech for Python developers.",
+    "feature_free_title": "Completely Free",
+    "feature_free_desc": "No API keys or registration required. Uses the free openai.fm service.",
+    "feature_voices_title": "11 Voices",
+    "feature_voices_desc": "All OpenAI-compatible voices available for different use cases.",
+    "feature_formats_title": "6 Audio Formats",
+    "feature_formats_desc": "MP3, WAV, OPUS, AAC, FLAC, and PCM support for any application.",
+    "feature_docker_title": "Docker Ready",
+    "feature_docker_desc": "One-command deployment with web interface and API endpoints.",
+    "feature_openai_title": "OpenAI Compatible",
+    "feature_openai_desc": "Drop-in replacement for OpenAI's TTS API with auto-combine for long text.",
+    "feature_async_title": "Async & Sync",
+    "feature_async_desc": "Both asyncio and synchronous clients for maximum flexibility.",
+    "quick_start_title": "Quick Start",
+    "installation_title": "Installation",
+    "installation_code": "pip install ttsfm",
+    "usage_title": "Basic Usage",
+    "docker_title": "Docker Deployment",
+    "docker_desc": "Run TTSFM with web interface:",
+    "api_title": "OpenAI-Compatible API",
+    "api_desc": "Use with OpenAI Python client:",
+    "footer_copyright": "© 2024 dbcccc"
+  },
+  "playground": {
+    "title": "Interactive TTS Playground",
+    "subtitle": "Test different voices and audio formats in real-time",
+    "text_input_label": "Text to Convert",
+    "text_input_placeholder": "Enter the text you want to convert to speech...",
+    "voice_label": "Voice",
+    "format_label": "Audio Format",
+    "instructions_label": "Voice Instructions (Optional)",
+    "instructions_placeholder": "Additional instructions for voice generation...",
+    "character_count": "characters",
+    "max_length_warning": "Text exceeds maximum length. It will be automatically split and combined.",
+    "generate_speech": "Generate Speech",
+    "generating": "Generating...",
+    "download_audio": "Download Audio",
+    "audio_player_title": "Generated Audio",
+    "file_size": "File Size",
+    "duration": "Duration",
+    "format": "Format",
+    "voice": "Voice",
+    "chunks_combined": "Chunks Combined",
+    "random_text": "Random Text",
+    "clear_text": "Clear Text",
+    "max_length_description": "Maximum characters per request (default: 4096)",
+    "enable_length_validation": "Enable length validation",
+    "auto_combine_long_text": "Auto-combine long text",
+    "auto_combine_tooltip": "Automatically split long text and combine audio chunks into a single file",
+    "auto_combine_description": "Automatically handles text longer than the limit",
+    "instructions_description": "Provide optional instructions for voice modulation",
+    "api_key_optional": "API Key (Optional)",
+    "api_key_placeholder": "Enter your API key if required",
+    "api_key_description": "Only required if API key protection is enabled on the server",
+    "sample_texts": {
+      "welcome": "Welcome to TTSFM! This is a free text-to-speech service that converts your text into high-quality audio using advanced AI technology.",
+      "story": "Once upon a time, in a digital world far away, there lived a small Python package that could transform any text into beautiful speech. This package was called TTSFM, and it brought joy to developers everywhere.",
+      "technical": "TTSFM is a Python client for text-to-speech APIs that provides both synchronous and asynchronous interfaces. It supports multiple voices and audio formats, making it perfect for various applications.",
+      "multilingual": "TTSFM supports multiple languages and voices, allowing you to create diverse audio content for global audiences. The service is completely free and requires no API keys.",
+      "long": "This is a longer text sample designed to test the auto-combine feature of TTSFM. When text exceeds the maximum length limit, TTSFM automatically splits it into smaller chunks, generates audio for each chunk, and then seamlessly combines them into a single audio file. This process is completely transparent to the user and ensures that you can convert text of any length without worrying about technical limitations. The resulting audio maintains consistent quality and natural flow throughout the entire content."
+    },
+    "error_messages": {
+      "empty_text": "Please enter some text to convert.",
+      "generation_failed": "Failed to generate speech. Please try again.",
+      "network_error": "Network error. Please check your connection and try again.",
+      "invalid_format": "Invalid audio format selected.",
+      "invalid_voice": "Invalid voice selected.",
+      "text_too_long": "Text is too long. Please reduce the length or enable auto-combine.",
+      "server_error": "Server error. Please try again later."
+    },
+    "success_messages": {
+      "generation_complete": "Speech generated successfully!",
+      "text_copied": "Text copied to clipboard!",
+      "download_started": "Download started!"
+    }
+  },
+  "docs": {
+    "title": "API Documentation",
+    "subtitle": "Complete reference for the TTSFM Text-to-Speech API. Free, simple, and powerful.",
+    "contents": "Contents",
+    "overview": "Overview",
+    "authentication": "Authentication",
+    "text_validation": "Text Validation",
+    "endpoints": "API Endpoints",
+    "voices": "Voices",
+    "formats": "Audio Formats",
+    "generate": "Generate Speech",
+    "combined": "Combined Audio",
+    "status": "Status & Health",
+    "errors": "Error Handling",
+    "examples": "Code Examples",
+    "python_package": "Python Package",
+    "overview_title": "Overview",
+    "overview_desc": "The TTSFM API provides a modern, OpenAI-compatible interface for text-to-speech generation. It supports multiple voices, audio formats, and includes advanced features like text length validation and intelligent auto-combine functionality.",
+    "base_url": "Base URL:",
+    "key_features": "Key Features",
+    "feature_voices": "11 different voice options - Choose from alloy, echo, nova, and more",
+    "feature_formats": "Multiple audio formats - MP3, WAV, OPUS, AAC, FLAC, PCM support",
+    "feature_openai": "OpenAI compatibility - Drop-in replacement for OpenAI's TTS API",
+    "feature_auto_combine": "Auto-combine feature - Automatically handles long text (>4096 chars) by splitting and combining audio",
+    "feature_validation": "Text length validation - Smart validation with configurable limits",
+    "feature_monitoring": "Real-time monitoring - Status endpoints and health checks",
+    "new_version": "New in v3.2.3:",
+    "new_version_desc": "Enhanced `/v1/audio/speech` endpoint with intelligent auto-combine feature. Streamlined web interface with clean, user-friendly design and automatic long-text handling!",
+    "authentication_title": "Authentication",
+    "authentication_desc": "Currently, the API supports optional API key authentication. If configured, include your API key in the request headers.",
+    "text_validation_title": "Text Length Validation",
+    "text_validation_desc": "TTSFM includes built-in text length validation to ensure compatibility with TTS models. The default maximum length is 4096 characters, but this can be customized.",
+    "important": "Important:",
+    "text_validation_warning": "Text exceeding the maximum length will be rejected unless validation is disabled or the text is split into chunks.",
+    "validation_options": "Validation Options",
+    "max_length_option": "Maximum allowed characters (default: 4096)",
+    "validate_length_option": "Enable/disable validation (default: true)",
+    "preserve_words_option": "Avoid splitting words when chunking (default: true)",
+    "endpoints_title": "API Endpoints",
+    "get_voices_desc": "Get list of available voices.",
+    "get_formats_desc": "Get list of supported audio formats.",
+    "validate_text_desc": "Validate text length and get splitting suggestions.",
+    "generate_speech_desc": "Generate speech from text.",
+    "response_example": "Response Example:",
+    "request_body": "Request Body:",
+    "parameters": "Parameters:",
+    "text_param": "Text to convert to speech",
+    "voice_param": "Voice ID (default: \"alloy\")",
+    "format_param": "Audio format (default: \"mp3\")",
+    "instructions_param": "Voice modulation instructions",
+    "max_length_param": "Maximum text length (default: 4096)",
+    "validate_length_param": "Enable validation (default: true)",
+    "response": "Response:",
+    "response_audio": "Returns audio file with appropriate Content-Type header.",
+    "response_combined_audio": "Returns a single audio file containing all chunks combined seamlessly.",
+    "required": "required",
+    "optional": "optional",
+    "python_package_title": "Python Package",
+    "long_text_support": "Long Text Support",
+    "long_text_desc": "The TTSFM Python package includes built-in long text splitting functionality for developers who need fine-grained control:",
+    "developer_features": "Developer Features:",
+    "manual_splitting": "Manual Splitting: Full control over text chunking for advanced use cases",
+    "word_preservation": "Word Preservation: Maintains word boundaries for natural speech",
+    "separate_files": "Separate Files: Each chunk saved as individual audio file",
+    "cli_support": "CLI Support: Use `--split-long-text` flag for command-line usage",
+    "note": "Note:",
+    "auto_combine_note": "For web users, the auto-combine feature in `/v1/audio/speech` is recommended as it automatically handles long text and returns a single seamless audio file.",
+    "combined_audio_desc": "Generate a single combined audio file from long text. Automatically splits text into chunks, generates speech for each chunk, and combines them into one seamless audio file.",
+    "response_headers": "Response Headers:",
+    "chunks_combined_header": "Number of chunks that were combined",
+    "original_text_length_header": "Original text length in characters",
+    "audio_size_header": "Final audio file size in bytes",
+    "openai_compatible_desc": "Enhanced OpenAI-compatible endpoint with auto-combine feature. Automatically handles long text by splitting and combining audio chunks when needed.",
+    "enhanced_parameters": "Enhanced Parameters:",
+    "auto_combine_param": "Automatically split long text and combine audio chunks into a single file",
+    "auto_combine_false": "Return error if text exceeds max_length (standard OpenAI behavior)",
+    "max_length_chunk_param": "Maximum characters per chunk when splitting",
+    "auto_combine_header": "Whether auto-combine was enabled (true/false)",
+    "chunks_combined_response": "Number of audio chunks combined (1 for short text)",
+    "original_text_response": "Original text length (for long text processing)",
+    "audio_format_header": "Audio format of the response",
+    "audio_size_response": "Audio file size in bytes",
+    "short_text_comment": "Short text (works normally)",
+    "long_text_auto_comment": "Long text with auto-combine (default)",
+    "long_text_no_auto_comment": "Long text without auto-combine (will error)",
+    "audio_combination": "Audio Combination:",
+    "audio_combination_desc": "Uses advanced audio processing (PyDub) when available, with intelligent fallbacks for different environments. Supports all audio formats.",
+    "use_cases": "Use Cases:",
+    "use_case_articles": "Long Articles: Convert blog posts or articles to single audio files",
+    "use_case_audiobooks": "Audiobooks: Generate chapters as single audio files",
+    "use_case_podcasts": "Podcasts: Create podcast episodes from scripts",
+    "use_case_education": "Educational Content: Convert learning materials to audio",
+    "example_usage": "Example Usage:",
+    "python_example_comment": "Python example"
+  }
+}
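
The auto-combine strings above describe splitting text longer than the limit (4096 characters by default) into chunks without breaking words. A minimal sketch of that splitting step follows. The helper name `split_preserving_words` is hypothetical and this is not the ttsfm implementation (the package exposes its own `split_text_by_length`); it only illustrates the idea of greedy, whitespace-based chunking.

```python
from typing import List


def split_preserving_words(text: str, max_length: int = 4096) -> List[str]:
    """Greedily pack whitespace-separated words into chunks of at most max_length.

    Hypothetical illustration only. A single word longer than max_length is
    hard-split so that no chunk ever exceeds the limit.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")

    chunks: List[str] = []
    current = ""
    for word in text.split():
        # A word that cannot fit in any chunk is cut into max_length pieces.
        while len(word) > max_length:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_length])
            word = word[max_length:]

        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_length:
            current = candidate
        else:
            # The word does not fit: close the current chunk and start a new one.
            chunks.append(current)
            current = word

    if current:
        chunks.append(current)
    return chunks
```

Each resulting chunk would then be sent as its own speech request, and the audio pieces joined afterwards. Note that this sketch collapses runs of whitespace, which may or may not match the package's behavior.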

ttsfm-web/translations/zh.json ADDED

	@@ -0,0 +1,224 @@

+{
+  "nav": {
+    "home": "首页",
+    "playground": "试用平台",
+    "documentation": "文档",
+    "github": "GitHub",
+    "status_checking": "检查中...",
+    "status_online": "在线",
+    "status_offline": "离线"
+  },
+  "common": {
+    "loading": "加载中...",
+    "error": "错误",
+    "success": "成功",
+    "warning": "警告",
+    "info": "信息",
+    "close": "关闭",
+    "save": "保存",
+    "cancel": "取消",
+    "confirm": "确认",
+    "download": "下载",
+    "upload": "上传",
+    "generate": "生成",
+    "play": "播放",
+    "stop": "停止",
+    "pause": "暂停",
+    "resume": "继续",
+    "clear": "清除",
+    "reset": "重置",
+    "copy": "复制",
+    "copied": "已复制！",
+    "language": "语言",
+    "english": "English",
+    "chinese": "中文",
+    "validate": "验证",
+    "options": "选项",
+    "max_length": "最大长度",
+    "tip": "提示",
+    "choose_voice": "从可用声音中选择",
+    "select_format": "选择您偏好的音频格式",
+    "loading_voices": "加载声音中...",
+    "loading_formats": "加载格式中...",
+    "ctrl_enter_tip": "使用 Ctrl+Enter 生成",
+    "auto_combine_enabled": "自动合并已启用"
+  },
+  "home": {
+    "title": "免费的Python文本转语音",
+    "subtitle": "使用免费的openai.fm服务从文本生成高质量语音。无需API密钥，无需注册 - 只需安装即可开始创建音频。",
+    "try_demo": "试用演示",
+    "documentation": "文档",
+    "github": "GitHub",
+    "features_title": "主要特性",
+    "features_subtitle": "简单、免费且强大的Python开发者文本转语音工具。",
+    "feature_free_title": "完全免费",
+    "feature_free_desc": "无需API密钥或注册。使用免费的openai.fm服务。",
+    "feature_voices_title": "11种声音",
+    "feature_voices_desc": "提供所有OpenAI兼容的声音，适用于不同使用场景。",
+    "feature_formats_title": "6种音频格式",
+    "feature_formats_desc": "支持MP3、WAV、OPUS、AAC、FLAC和PCM格式，适用于任何应用。",
+    "feature_docker_title": "Docker就绪",
+    "feature_docker_desc": "一键部署，包含Web界面和API端点。",
+    "feature_openai_title": "OpenAI兼容",
+    "feature_openai_desc": "OpenAI TTS API的直接替代品，支持长文本自动合并。",
+    "feature_async_title": "异步和同步",
+    "feature_async_desc": "提供asyncio和同步客户端，最大化灵活性。",
+    "quick_start_title": "快速开始",
+    "installation_title": "安装",
+    "installation_code": "pip install ttsfm",
+    "usage_title": "基本用法",
+    "docker_title": "Docker部署",
+    "docker_desc": "运行带有Web界面的TTSFM：",
+    "api_title": "OpenAI兼容API",
+    "api_desc": "与OpenAI Python客户端一起使用：",
+    "footer_copyright": "© 2024 dbcccc"
+  },
+  "playground": {
+    "title": "交互式TTS试用平台",
+    "subtitle": "实时测试不同的声音和音频格式",
+    "text_input_label": "要转换的文本",
+    "text_input_placeholder": "输入您想要转换为语音的文本...",
+    "voice_label": "声音",
+    "format_label": "音频格式",
+    "instructions_label": "声音指令（可选）",
+    "instructions_placeholder": "语音生成的额外指令...",
+    "character_count": "字符",
+    "max_length_warning": "文本超过最大长度。将自动分割并合并。",
+    "generate_speech": "生成语音",
+    "generating": "生成中...",
+    "download_audio": "下载音频",
+    "audio_player_title": "生成的音频",
+    "file_size": "文件大小",
+    "duration": "时长",
+    "format": "格式",
+    "voice": "声音",
+    "chunks_combined": "合并片段",
+    "random_text": "随机文本",
+    "clear_text": "清除文本",
+    "max_length_description": "每个请求的最大字符数（默认：4096）",
+    "enable_length_validation": "启用长度验证",
+    "auto_combine_long_text": "自动合并长文本",
+    "auto_combine_tooltip": "自动分割长文本并将音频片段合并为单个文件",
+    "auto_combine_description": "自动处理超过限制的文本",
+    "instructions_description": "为声音调制提供可选指令",
+    "api_key_optional": "API密钥（可选）",
+    "api_key_placeholder": "如果需要，请输入您的API密钥",
+    "api_key_description": "仅在服务器启用API密钥保护时需要",
+    "sample_texts": {
+      "welcome": "欢迎使用TTSFM！这是一个免费的文本转语音服务，使用先进的AI技术将您的文本转换为高质量音频。",
+      "story": "很久很久以前，在一个遥远的数字世界里，住着一个小小的Python包，它能够将任何文本转换成美妙的语音。这个包叫做TTSFM，它为世界各地的开发者带来了快乐。",
+      "technical": "TTSFM是一个用于文本转语音API的Python客户端，提供同步和异步接口。它支持多种声音和音频格式，非常适合各种应用。",
+      "multilingual": "TTSFM支持多种语言和声音，让您能够为全球受众创建多样化的音频内容。该服务完全免费，无需API密钥。",
+      "long": "这是一个较长的文本示例，用于测试TTSFM的自动合并功能。当文本超过最大长度限制时，TTSFM会自动将其分割成较小的片段，为每个片段生成音频，然后无缝地将它们合并成一个音频文件。这个过程对用户完全透明，确保您可以转换任何长度的文本，而无需担心技术限制。生成的音频在整个内容中保持一致的质量和自然的流畅性。"
+    },
+    "error_messages": {
+      "empty_text": "请输入要转换的文本。",
+      "generation_failed": "语音生成失败。请重试。",
+      "network_error": "网络错误。请检查您的连接并重试。",
+      "invalid_format": "选择的音频格式无效。",
+      "invalid_voice": "选择的声音无效。",
+      "text_too_long": "文本太长。请减少长度或启用自动合并。",
+      "server_error": "服务器错误。请稍后重试。"
+    },
+    "success_messages": {
+      "generation_complete": "语音生成成功！",
+      "text_copied": "文本已复制到剪贴板！",
+      "download_started": "下载已开始！"
+    }
+  },
+  "docs": {
+    "title": "API文档",
+    "subtitle": "TTSFM文本转语音API的完整参考。免费、简单且强大。",
+    "contents": "目录",
+    "overview": "概述",
+    "authentication": "身份验证",
+    "text_validation": "文本验证",
+    "endpoints": "API端点",
+    "voices": "声音",
+    "formats": "音频格式",
+    "generate": "生成语音",
+    "combined": "合并音频",
+    "status": "状态和健康检查",
+    "errors": "错误处理",
+    "examples": "代码示例",
+    "python_package": "Python包",
+    "overview_title": "概述",
+    "overview_desc": "TTSFM API提供现代的、OpenAI兼容的文本转语音生成接口。它支持多种声音、音频格式，并包含高级功能，如文本长度验证和智能自动合并功能。",
+    "base_url": "基础URL：",
+    "key_features": "主要特性",
+    "feature_voices": "11种不同的声音选项 - 从alloy、echo、nova等中选择",
+    "feature_formats": "多种音频格式 - 支持MP3、WAV、OPUS、AAC、FLAC、PCM",
+    "feature_openai": "OpenAI兼容性 - OpenAI TTS API的直接替代品",
+    "feature_auto_combine": "自动合并功能 - 自动处理长文本（>4096字符），通过分割和合并音频",
+    "feature_validation": "文本长度验证 - 智能验证，可配置限制",
+    "feature_monitoring": "实时监控 - 状态端点和健康检查",
+    "new_version": "v3.2.3新功能：",
+    "new_version_desc": "增强的`/v1/audio/speech`端点，具有智能自动合并功能。简化的Web界面，设计简洁、用户友好，自动处理长文本！",
+    "authentication_title": "身份验证",
+    "authentication_desc": "目前，API支持可选的API密钥身份验证。如果已配置，请在请求头中包含您的API密钥。",
+    "text_validation_title": "文本长度验证",
+    "text_validation_desc": "TTSFM包含内置的文本长度验证，以确保与TTS模型的兼容性。默认最大长度为4096个字符，但可以自定义。",
+    "important": "重要：",
+    "text_validation_warning": "超过最大长度的文本将被拒绝，除非禁用验证或将文本分割成块。",
+    "validation_options": "验证选项",
+    "max_length_option": "允许的最大字符数（默认：4096）",
+    "validate_length_option": "启用/禁用验证（默认：true）",
+    "preserve_words_option": "分块时避免分割单词（默认：true）",
+    "endpoints_title": "API端点",
+    "get_voices_desc": "获取可用声音列表。",
+    "get_formats_desc": "获取支持的音频格式列表。",
+    "validate_text_desc": "验证文本长度并获取分割建议。",
+    "generate_speech_desc": "从文本生成语音。",
+    "response_example": "响应示例：",
+    "request_body": "请求体：",
+    "parameters": "参数：",
+    "text_param": "要转换为语音的文本",
+    "voice_param": "声音ID（默认：\"alloy\"）",
+    "format_param": "音频格式（默认：\"mp3\"）",
+    "instructions_param": "声音调制指令",
+    "max_length_param": "最大文本长度（默认：4096）",
+    "validate_length_param": "启用验证（默认：true）",
+    "response": "响应：",
+    "response_audio": "返回带有适当Content-Type头的音频文件。",
+    "response_combined_audio": "返回包含所有块无缝合并的单个音频文件。",
+    "required": "必需",
+    "optional": "可选",
+    "python_package_title": "Python包",
+    "long_text_support": "长文本支持",
+    "long_text_desc": "TTSFM Python包包含内置的长文本分割功能，为需要精细控制的开发者提供支持：",
+    "developer_features": "开发者功能：",
+    "manual_splitting": "手动分割：对高级用例的文本分块进行完全控制",
+    "word_preservation": "单词保护：维护单词边界以获得自然语音",
+    "separate_files": "单独文件：每个块保存为单独的音频文件",
+    "cli_support": "CLI支持：使用`--split-long-text`标志进行命令行使用",
+    "note": "注意：",
+    "auto_combine_note": "对于Web用户，建议使用`/v1/audio/speech`中的自动合并功能，因为它会自动处理长文本并返回单个无缝音频文件。",
+    "combined_audio_desc": "从长文本生成单个合并的音频文件。自动将文本分割成块，为每个块生成语音，并将它们合并成一个无缝的音频文件。",
+    "response_headers": "响应头：",
+    "chunks_combined_header": "合并的块数",
+    "original_text_length_header": "原始文本长度（字符数）",
+    "audio_size_header": "最终音频文件大小（字节）",
+    "openai_compatible_desc": "增强的OpenAI兼容端点，具有自动合并功能。在需要时自动处理长文本，通过分割和合并音频块。",
+    "enhanced_parameters": "增强参数：",
+    "auto_combine_param": "自动分割长文本并将音频块合并为单个文件",
+    "auto_combine_false": "如果文本超过max_length则返回错误（标准OpenAI行为）",
+    "max_length_chunk_param": "分割时每个块的最大字符数",
+    "auto_combine_header": "是否启用了自动合并（true/false）",
+    "chunks_combined_response": "合并的音频块数（短文本为1）",
+    "original_text_response": "原始文本长度（用于长文本处理）",
+    "audio_format_header": "响应的音频格式",
+    "audio_size_response": "音频文件大小（字节）",
+    "short_text_comment": "短文本（正常工作）",
+    "long_text_auto_comment": "带自动合并的长文本（默认）",
+    "long_text_no_auto_comment": "不带自动合并的长文本（将出错）",
+    "audio_combination": "音频合并：",
+    "audio_combination_desc": "在可用时使用高级音频处理（PyDub），在不同环境中具有智能回退。支持所有音频格式。",
+    "use_cases": "使用场景：",
+    "use_case_articles": "长文章：将博客文章或文章转换为单个音频文件",
+    "use_case_audiobooks": "有声书：将章节生成为单个音频文件",
+    "use_case_podcasts": "播客：从脚本创建播客剧集",
+    "use_case_education": "教育内容：将学习材料转换为音频",
+    "example_usage": "使用示例：",
+    "python_example_comment": "Python示例"
+  }
+}
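
Both locale files share the same nested key layout (`docs.title`, `playground.error_messages.empty_text`, and so on), so a lookup can walk a dotted key and fall back to English when a key is missing in the requested locale. The sketch below is an assumption about how such a lookup could work, not the code in `ttsfm-web/i18n.py`. The `TRANSLATIONS` dictionary and the `translate` function are hypothetical, and `docs.contents` is deliberately left out of the `zh` stand-in to demonstrate the fallback.

```python
from typing import Any, Dict

# In-memory stand-ins for en.json and zh.json, using values shown in the files above.
# "docs.contents" is intentionally omitted from "zh" to exercise the fallback path.
TRANSLATIONS: Dict[str, Dict[str, Any]] = {
    "en": {"docs": {"title": "API Documentation", "contents": "Contents"}},
    "zh": {"docs": {"title": "API文档"}},
}


def translate(key: str, locale: str, default_locale: str = "en") -> str:
    """Resolve a dotted key, falling back to default_locale and then to the key itself."""
    for lang in (locale, default_locale):
        node: Any = TRANSLATIONS.get(lang, {})
        for part in key.split("."):
            if not isinstance(node, dict) or part not in node:
                node = None
                break
            node = node[part]
        # Only leaf strings count as translations; sections (dicts) do not.
        if isinstance(node, str):
            return node
    return key
```

Returning the key itself for unknown entries keeps missing translations visible in the UI instead of raising at render time.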

ttsfm-web/websocket_handler.py ADDED

	@@ -0,0 +1,231 @@

+"""
+WebSocket handler for real-time TTS streaming.
+Because apparently waiting 2 seconds for audio generation is too much for modern users.
+At least this will make it FEEL faster.
+"""
+import asyncio
+import json
+import logging
+import uuid
+import time
+from typing import Optional, Dict, Any
+from datetime import datetime
+from flask_socketio import SocketIO, emit, disconnect
+from flask import request
+from ttsfm import TTSClient, Voice, AudioFormat, TTSException
+from ttsfm.utils import split_text_by_length, estimate_audio_duration
+logger = logging.getLogger(__name__)
+class WebSocketTTSHandler:
+    """
+    Handles WebSocket connections for streaming TTS generation.
+    Because your users can't wait 2 seconds for a complete response.
+    """
+    def __init__(self, socketio: SocketIO, tts_client: TTSClient):
+        self.socketio = socketio
+        self.tts_client = tts_client
+        self.active_sessions: Dict[str, Dict[str, Any]] = {}
+        # Register WebSocket events
+        self._register_events()
+    def _register_events(self):
+        """Register all WebSocket event handlers."""
+        @self.socketio.on('connect')
+        def handle_connect():
+            """Handle new WebSocket connection."""
+            session_id = request.sid
+            self.active_sessions[session_id] = {
+                'connected_at': datetime.now(),
+                'request_count': 0,
+                'last_request': None
+            }
+            logger.info(f"WebSocket client connected: {session_id}")
+            emit('connected', {'session_id': session_id, 'status': 'ready'})
+        @self.socketio.on('disconnect')
+        def handle_disconnect():
+            """Handle WebSocket disconnection."""
+            session_id = request.sid
+            if session_id in self.active_sessions:
+                del self.active_sessions[session_id]
+            logger.info(f"WebSocket client disconnected: {session_id}")
+        @self.socketio.on('generate_stream')
+        def handle_generate_stream(data):
+            """
+            Handle streaming TTS generation request.
+            Expected data format:
+            {
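+                'text': str,
+                'voice': str (optional, default 'alloy'),
+                'format': str (optional, default 'mp3'),
+                'request_id': str (optional, a UUID is generated if omitted),
+                'chunk_size': int (optional, default 1024 chars),
+                'instructions': str (optional, voice modulation instructions)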
+            }
+            """
+            session_id = request.sid
+            request_id = data.get('request_id', str(uuid.uuid4()))
+            # Update session info
+            if session_id in self.active_sessions:
+                self.active_sessions[session_id]['request_count'] += 1
+                self.active_sessions[session_id]['last_request'] = datetime.now()
+            # Emit acknowledgment
+            emit('stream_started', {
+                'request_id': request_id,
+                'timestamp': time.time()
+            })
+            # Start async generation
+            self.socketio.start_background_task(
+                self._generate_stream,
+                session_id,
+                request_id,
+                data
+            )
+        @self.socketio.on('cancel_stream')
+        def handle_cancel_stream(data):
+            """Handle stream cancellation request."""
+            request_id = data.get('request_id')
+            session_id = request.sid
+            # In a real implementation, you'd track and cancel the actual generation
+            logger.info(f"Stream cancellation requested: {request_id}")
+            emit('stream_cancelled', {'request_id': request_id})
+    def _generate_stream(self, session_id: str, request_id: str, data: Dict[str, Any]):
+        """
+        Generate TTS audio in chunks and stream to client.
+        This is where the magic happens. And by magic, I mean
+        chunking text and pretending it's real-time.
+        """
+        try:
+            # Extract parameters
+            text = data.get('text', '')
+            voice = data.get('voice', 'alloy')
+            format_str = data.get('format', 'mp3')
+            chunk_size = data.get('chunk_size', 1024)
+            instructions = data.get('instructions', None)  # Voice instructions support!
+            if not text:
+                self._emit_error(session_id, request_id, "No text provided")
+                return
+            # Convert string parameters to enums
+            try:
+                voice_enum = Voice(voice.lower())
+                format_enum = AudioFormat(format_str.lower())
+            except ValueError as e:
+                self._emit_error(session_id, request_id, f"Invalid parameter: {str(e)}")
+                return
+            # Split text into chunks for "streaming" effect
+            chunks = split_text_by_length(text, chunk_size, preserve_words=True)
+            total_chunks = len(chunks)
+            logger.info(f"Starting stream generation: {request_id} with {total_chunks} chunks")
+            # Emit initial progress
+            self.socketio.emit('stream_progress', {
+                'request_id': request_id,
+                'progress': 0,
+                'total_chunks': total_chunks,
+                'status': 'processing'
+            }, room=session_id)
+            # Process each chunk
+            for i, chunk in enumerate(chunks):
+                # Check if client is still connected
+                if session_id not in self.active_sessions:
+                    logger.warning(f"Client disconnected during generation: {session_id}")
+                    break
+                try:
+                    # Generate audio for chunk
+                    start_time = time.time()
+                    response = self.tts_client.generate_speech(
+                        text=chunk,
+                        voice=voice_enum,
+                        response_format=format_enum,
+                        instructions=instructions,  # Pass voice instructions!
+                        validate_length=False  # We already chunked it
+                    )
+                    generation_time = time.time() - start_time
+                    # Emit chunk data
+                    chunk_data = {
+                        'request_id': request_id,
+                        'chunk_index': i,
+                        'total_chunks': total_chunks,
+                        'audio_data': response.audio_data.hex(),  # Convert bytes to hex string
+                        'format': format_enum.value,
+                        'duration': response.duration,
+                        'generation_time': generation_time,
+                        'chunk_text': (chunk[:50] + '...') if len(chunk) > 50 else chunk
+                    }
+                    self.socketio.emit('audio_chunk', chunk_data, room=session_id)
+                    # Emit progress update
+                    progress = int(((i + 1) / total_chunks) * 100)
+                    self.socketio.emit('stream_progress', {
+                        'request_id': request_id,
+                        'progress': progress,
+                        'total_chunks': total_chunks,
+                        'chunks_completed': i + 1,
+                        'status': 'processing'
+                    }, room=session_id)
+                    # Small delay to prevent overwhelming the client
+                    # (and to make it feel more "real-time")
+                    self.socketio.sleep(0.1)
+                except Exception as e:
+                    logger.error(f"Error generating chunk {i}: {str(e)}")
+                    self._emit_error(session_id, request_id, f"Chunk {i} generation failed: {str(e)}")
+                    # Continue with next chunk instead of failing completely
+                    continue
+            # Emit completion
+            self.socketio.emit('stream_complete', {
+                'request_id': request_id,
+                'total_chunks': total_chunks,
+                'status': 'completed',
+                'timestamp': time.time()
+            }, room=session_id)
+            logger.info(f"Stream generation completed: {request_id}")
+        except Exception as e:
+            logger.error(f"Stream generation failed: {str(e)}")
+            self._emit_error(session_id, request_id, str(e))
+    def _emit_error(self, session_id: str, request_id: str, error_message: str):
+        """Emit error to specific session."""
+        self.socketio.emit('stream_error', {
+            'request_id': request_id,
+            'error': error_message,
+            'timestamp': time.time()
+        }, room=session_id)
+    def get_active_sessions_count(self) -> int:
+        """Get count of active WebSocket sessions."""
+        return len(self.active_sessions)
+    def get_session_info(self, session_id: str) -> Optional[Dict[str, Any]]:
+        """Get information about a specific session."""
+        return self.active_sessions.get(session_id)
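
On the receiving side, each `audio_chunk` event from the handler above carries `chunk_index`, `total_chunks`, and `audio_data` as a hex string. Because the handler continues past a failed chunk, some indices may never arrive. The following sketch shows one way a client could reassemble the payloads. The function names `assemble_audio` and `missing_indices` are hypothetical, and plain byte concatenation is an assumption that suits frame-based formats such as MP3 but not container formats with per-file headers such as WAV.

```python
from typing import Any, Dict, Iterable, List


def assemble_audio(chunks: Iterable[Dict[str, Any]]) -> bytes:
    """Decode hex-encoded chunk payloads and join them in chunk_index order."""
    ordered = sorted(chunks, key=lambda c: c["chunk_index"])
    return b"".join(bytes.fromhex(c["audio_data"]) for c in ordered)


def missing_indices(chunks: Iterable[Dict[str, Any]], total_chunks: int) -> List[int]:
    """List the chunk indices that never arrived (for example, after a chunk error)."""
    received = {c["chunk_index"] for c in chunks}
    return [i for i in range(total_chunks) if i not in received]


# Example payloads shaped like the handler's 'audio_chunk' events, arriving out of order.
received_chunks = [
    {"chunk_index": 2, "total_chunks": 3, "audio_data": b"\x03".hex()},
    {"chunk_index": 0, "total_chunks": 3, "audio_data": b"\x01\x02".hex()},
]
audio = assemble_audio(received_chunks)
gaps = missing_indices(received_chunks, total_chunks=3)
```

A client would typically collect payloads until `stream_complete` arrives, then check for gaps before treating the audio as complete.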

ttsfm/__init__.py CHANGED

@@ -1,183 +1,193 @@
-"""
-TTSFM - Text-to-Speech for Free using OpenAI.fm
-A Python library for generating high-quality text-to-speech audio using the free OpenAI.fm service.
-Supports multiple voices and audio formats with a simple, intuitive API.
-Features:
-- 🎤 6 premium AI voices (alloy, echo, fable, nova, onyx, shimmer)
-- 🎵 6 audio formats (MP3, WAV, OPUS, AAC, FLAC, PCM)
-- 🚀 Fast and reliable speech generation
-- 📝 Comprehensive text processing and validation
-- 🔄 Automatic retry with exponential backoff
-- 📊 Detailed response metadata and statistics
-- 🌐 Both synchronous and asynchronous APIs
-- 🎯 OpenAI-compatible API format
-- 🔧 Smart format optimization for best quality
-Audio Format Support:
-- MP3: Good quality, small file size - ideal for web and general use
-- WAV: Lossless quality, large file size - ideal for professional use
-- OPUS: High-quality compressed audio - ideal for streaming
-- AAC: Advanced audio codec - ideal for mobile devices
-- FLAC: Lossless compression - ideal for archival
-- PCM: Raw audio data - ideal for processing
-Example:
-    >>> from ttsfm import TTSClient, Voice, AudioFormat
-    >>>
-    >>> client = TTSClient()
-    >>>
-    >>> # Generate MP3 audio
-    >>> mp3_response = client.generate_speech(
-    ...     text="Hello, world!",
-    ...     voice=Voice.ALLOY,
-    ...     response_format=AudioFormat.MP3
-    ... )
-    >>> mp3_response.save_to_file("hello")  # Saves as hello.mp3
-    >>>
-    >>> # Generate WAV audio
-    >>> wav_response = client.generate_speech(
-    ...     text="High quality audio",
-    ...     voice=Voice.NOVA,
-    ...     response_format=AudioFormat.WAV
-    ... )
-    >>> wav_response.save_to_file("audio")  # Saves as audio.wav
-    >>>
-    >>> # Generate OPUS audio
-    >>> opus_response = client.generate_speech(
-    ...     text="Compressed audio",
-    ...     voice=Voice.ECHO,
-    ...     response_format=AudioFormat.OPUS
-    ... )
-    >>> opus_response.save_to_file("compressed")  # Saves as compressed.wav
-"""
-from .client import TTSClient
-from .async_client import AsyncTTSClient
-from .models import (
-    TTSRequest,
-    TTSResponse,
-    Voice,
-    AudioFormat,
-    TTSError,
-    APIError,
-    NetworkError,
-    ValidationError
-)
-from .exceptions import (
-    TTSException,
-    APIException,
-    NetworkException,
-    ValidationException,
-    RateLimitException,
-    AuthenticationException
-)
-from .utils import (
-    validate_text_length,
-    split_text_by_length
-)
-__version__ = "3.0.0"
-__author__ = "dbcccc"
-__email__ = "120614547+dbccccccc@users.noreply.github.com"
-__description__ = "Text-to-Speech API Client with OpenAI compatibility"
-__url__ = "https://github.com/dbccccccc/ttsfm"
-# Default client instance for convenience
-default_client = None
-def create_client(base_url: str = None, api_key: str = None, **kwargs) -> TTSClient:
-    """
-    Create a new TTS client instance.
-    Args:
-        base_url: Base URL for the TTS service
-        api_key: API key for authentication (if required)
-        **kwargs: Additional client configuration
-    Returns:
-        TTSClient: Configured client instance
-    """
-    return TTSClient(base_url=base_url, api_key=api_key, **kwargs)
-def create_async_client(base_url: str = None, api_key: str = None, **kwargs) -> AsyncTTSClient:
-    """
-    Create a new async TTS client instance.
-    Args:
-        base_url: Base URL for the TTS service
-        api_key: API key for authentication (if required)
-        **kwargs: Additional client configuration
-    Returns:
-        AsyncTTSClient: Configured async client instance
-    """
-    return AsyncTTSClient(base_url=base_url, api_key=api_key, **kwargs)
-def set_default_client(client: TTSClient) -> None:
-    """Set the default client instance for convenience functions."""
-    global default_client
-    default_client = client
-def generate_speech(text: str, voice: str = "alloy", **kwargs) -> bytes:
-    """
-    Convenience function to generate speech using the default client.
-    Args:
-        text: Text to convert to speech
-        voice: Voice to use for generation
-        **kwargs: Additional generation parameters
-    Returns:
-        bytes: Generated audio data
-    Raises:
-        TTSException: If no default client is set or generation fails
-    """
-    if default_client is None:
-        raise TTSException("No default client set. Use create_client() first.")
-    return default_client.generate_speech(text=text, voice=voice, **kwargs)
-# Export all public components
-__all__ = [
-    # Main classes
-    "TTSClient",
-    "AsyncTTSClient",
-    # Models
-    "TTSRequest",
-    "TTSResponse",
-    "Voice",
-    "AudioFormat",
-    "TTSError",
-    "APIError",
-    "NetworkError",
-    "ValidationError",
-    # Exceptions
-    "TTSException",
-    "APIException",
-    "NetworkException",
-    "ValidationException",
-    "RateLimitException",
-    "AuthenticationException",
-    # Factory functions
-    "create_client",
-    "create_async_client",
-    "set_default_client",
-    "generate_speech",
-    # Utility functions
-    "validate_text_length",
-    "split_text_by_length",
-    # Package metadata
-    "__version__",
-    "__author__",
-    "__email__",
-    "__description__",
-    "__url__"
-]

+"""
+TTSFM - Text-to-Speech for Free using OpenAI.fm
+A Python library for generating high-quality text-to-speech audio using the free OpenAI.fm service.
+Supports multiple voices and audio formats with a simple, intuitive API.
+Example:
+    >>> from ttsfm import TTSClient, Voice, AudioFormat
+    >>>
+    >>> client = TTSClient()
+    >>>
+    >>> # Generate MP3 audio
+    >>> mp3_response = client.generate_speech(
+    ...     text="Hello, world!",
+    ...     voice=Voice.ALLOY,
+    ...     response_format=AudioFormat.MP3
+    ... )
+    >>> mp3_response.save_to_file("hello")  # Saves as hello.mp3
+    >>>
+    >>> # Generate WAV audio
+    >>> wav_response = client.generate_speech(
+    ...     text="High quality audio",
+    ...     voice=Voice.NOVA,
+    ...     response_format=AudioFormat.WAV
+    ... )
+    >>> wav_response.save_to_file("audio")  # Saves as audio.wav
+    >>>
+    >>> # Generate OPUS audio
+    >>> opus_response = client.generate_speech(
+    ...     text="Compressed audio",
+    ...     voice=Voice.ECHO,
+    ...     response_format=AudioFormat.OPUS
+    ... )
+    >>> opus_response.save_to_file("compressed")  # Saves as compressed.wav
+"""
+from .client import TTSClient
+from .async_client import AsyncTTSClient
+from .models import (
+    TTSRequest,
+    TTSResponse,
+    Voice,
+    AudioFormat,
+    TTSError,
+    APIError,
+    NetworkError,
+    ValidationError
+)
+from .exceptions import (
+    TTSException,
+    APIException,
+    NetworkException,
+    ValidationException,
+    RateLimitException,
+    AuthenticationException,
+    ServiceUnavailableException,
+    QuotaExceededException,
+    AudioProcessingException
+)
+from .utils import (
+    validate_text_length,
+    split_text_by_length
+)
+__version__ = "3.2.3"
+__author__ = "dbcccc"
+__email__ = "120614547+dbccccccc@users.noreply.github.com"
+__description__ = "Text-to-Speech API Client with OpenAI compatibility"
+__url__ = "https://github.com/dbccccccc/ttsfm"
+# Default client instance for convenience
+default_client = None
+def create_client(base_url: str = None, api_key: str = None, **kwargs) -> TTSClient:
+    """
+    Create a new TTS client instance.
+    Args:
+        base_url: Base URL for the TTS service
+        api_key: API key for authentication (if required)
+        **kwargs: Additional client configuration
+    Returns:
+        TTSClient: Configured client instance
+    """
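+    # Forward base_url only when given, so the client's own default is not replaced by None
+    if base_url is not None:
+        kwargs["base_url"] = base_url
+    return TTSClient(api_key=api_key, **kwargs)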
+def create_async_client(base_url: str = None, api_key: str = None, **kwargs) -> AsyncTTSClient:
+    """
+    Create a new async TTS client instance.
+    Args:
+        base_url: Base URL for the TTS service
+        api_key: API key for authentication (if required)
+        **kwargs: Additional client configuration
+    Returns:
+        AsyncTTSClient: Configured async client instance
+    """
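+    # Forward base_url only when given; AsyncTTSClient calls base_url.rstrip('/'), which fails on None
+    if base_url is not None:
+        kwargs["base_url"] = base_url
+    return AsyncTTSClient(api_key=api_key, **kwargs)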
+def set_default_client(client: TTSClient) -> None:
+    """Set the default client instance for convenience functions."""
+    global default_client
+    default_client = client
+def generate_speech(text: str, voice: str = "alloy", **kwargs) -> TTSResponse:
+    """
+    Convenience function to generate speech using the default client.
+    Args:
+        text: Text to convert to speech
+        voice: Voice to use for generation
+        **kwargs: Additional generation parameters
+    Returns:
+        TTSResponse: Generated audio response
+    Raises:
+        TTSException: If no default client is set or generation fails
+    """
+    if default_client is None:
+        raise TTSException("No default client set. Call set_default_client(create_client()) first.")
+    return default_client.generate_speech(text=text, voice=voice, **kwargs)
+def generate_speech_long_text(text: str, voice: str = "alloy", **kwargs) -> list:
+    """
+    Convenience function to generate speech from long text using the default client.
+    Automatically splits long text into chunks and generates speech for each chunk.
+    Args:
+        text: Text to convert to speech (can be longer than 4096 characters)
+        voice: Voice to use for generation
+        **kwargs: Additional generation parameters (max_length, preserve_words, etc.)
+    Returns:
+        list: List of TTSResponse objects for each chunk
+    Raises:
+        TTSException: If no default client is set or generation fails
+    """
+    if default_client is None:
+        raise TTSException("No default client set. Call set_default_client(create_client()) first.")
+    return default_client.generate_speech_long_text(text=text, voice=voice, **kwargs)
+# Export all public components
+__all__ = [
+    # Main classes
+    "TTSClient",
+    "AsyncTTSClient",
+    # Models
+    "TTSRequest",
+    "TTSResponse",
+    "Voice",
+    "AudioFormat",
+    "TTSError",
+    "APIError",
+    "NetworkError",
+    "ValidationError",
+    # Exceptions
+    "TTSException",
+    "APIException",
+    "NetworkException",
+    "ValidationException",
+    "RateLimitException",
+    "AuthenticationException",
+    "ServiceUnavailableException",
+    "QuotaExceededException",
+    "AudioProcessingException",
+    # Factory functions
+    "create_client",
+    "create_async_client",
+    "set_default_client",
+    "generate_speech",
+    "generate_speech_long_text",
+    # Utility functions
+    "validate_text_length",
+    "split_text_by_length",
+    # Package metadata
+    "__version__",
+    "__author__",
+    "__email__",
+    "__description__",
+    "__url__"
+]
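
The module-level helpers above only work after a client has been registered with `set_default_client()`. The following is a minimal sketch of that pattern, not the package's own code: `StubClient` is a hypothetical stand-in for `TTSClient` so the example runs without network access, and the `RuntimeError` stands in for `TTSException`.

```python
# Hypothetical stand-in for ttsfm.TTSClient; it only mimics the call shape.
class StubClient:
    def generate_speech(self, text, voice="alloy", **kwargs):
        return {"text": text, "voice": voice, **kwargs}


_default_client = None  # mirrors the module-level default_client


def set_default_client(client):
    """Register the client used by the module-level helper."""
    global _default_client
    _default_client = client


def generate_speech(text, voice="alloy", **kwargs):
    """Delegate to the registered client, failing loudly if none is set."""
    if _default_client is None:
        raise RuntimeError("No default client set. Call set_default_client() first.")
    return _default_client.generate_speech(text=text, voice=voice, **kwargs)


def demo():
    set_default_client(StubClient())
    return generate_speech("Hello, world!", voice="nova")
```

With the real package, the equivalent setup would presumably be `set_default_client(create_client())` followed by `generate_speech(...)`.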

ttsfm/async_client.py CHANGED

@@ -1,464 +1,504 @@
-"""
-Asynchronous TTS client implementation.
-This module provides the AsyncTTSClient class for asynchronous
-text-to-speech generation with OpenAI-compatible API.
-"""
-import json
-import uuid
-import asyncio
-import logging
-from typing import Optional, Dict, Any, Union, List
-import aiohttp
-from aiohttp import ClientTimeout, ClientSession
-from .models import (
-    TTSRequest, TTSResponse, Voice, AudioFormat,
-    get_content_type, get_format_from_content_type
-)
-from .exceptions import (
-    TTSException, APIException, NetworkException, ValidationException,
-    create_exception_from_response
-)
-from .utils import (
-    get_realistic_headers, sanitize_text, validate_url, build_url,
-    exponential_backoff, estimate_audio_duration, format_file_size,
-    validate_text_length, split_text_by_length
-)
-logger = logging.getLogger(__name__)
-class AsyncTTSClient:
-    """
-    Asynchronous TTS client for text-to-speech generation.
-    This client provides an async interface for generating speech from text
-    using OpenAI-compatible TTS services with support for concurrent requests.
-    Attributes:
-        base_url: Base URL for the TTS service
-        api_key: API key for authentication (if required)
-        timeout: Request timeout in seconds
-        max_retries: Maximum number of retry attempts
-        verify_ssl: Whether to verify SSL certificates
-        max_concurrent: Maximum concurrent requests
-    """
-    def __init__(
-        self,
-        base_url: str = "https://www.openai.fm",
-        api_key: Optional[str] = None,
-        timeout: float = 30.0,
-        max_retries: int = 3,
-        verify_ssl: bool = True,
-        max_concurrent: int = 10,
-        **kwargs
-    ):
-        """
-        Initialize the async TTS client.
-        Args:
-            base_url: Base URL for the TTS service
-            api_key: API key for authentication
-            timeout: Request timeout in seconds
-            max_retries: Maximum retry attempts
-            verify_ssl: Whether to verify SSL certificates
-            max_concurrent: Maximum concurrent requests
-            **kwargs: Additional configuration options
-        """
-        self.base_url = base_url.rstrip('/')
-        self.api_key = api_key
-        self.timeout = timeout
-        self.max_retries = max_retries
-        self.verify_ssl = verify_ssl
-        self.max_concurrent = max_concurrent
-        # Validate base URL
-        if not validate_url(self.base_url):
-            raise ValidationException(f"Invalid base URL: {self.base_url}")
-        # Session will be created when needed
-        self._session: Optional[ClientSession] = None
-        self._semaphore = asyncio.Semaphore(max_concurrent)
-        logger.info(f"Initialized async TTS client with base URL: {self.base_url}")
-    async def __aenter__(self):
-        """Async context manager entry."""
-        await self._ensure_session()
-        return self
-    async def __aexit__(self, exc_type, exc_val, exc_tb):
-        """Async context manager exit."""
-        await self.close()
-    async def _ensure_session(self):
-        """Ensure HTTP session is created."""
-        if self._session is None or self._session.closed:
-            # Setup headers
-            headers = get_realistic_headers()
-            if self.api_key:
-                headers["Authorization"] = f"Bearer {self.api_key}"
-            # Create timeout configuration
-            timeout = ClientTimeout(total=self.timeout)
-            # Create session
-            connector = aiohttp.TCPConnector(
-                verify_ssl=self.verify_ssl,
-                limit=self.max_concurrent * 2
-            )
-            self._session = ClientSession(
-                headers=headers,
-                timeout=timeout,
-                connector=connector
-            )
-    async def generate_speech(
-        self,
-        text: str,
-        voice: Union[Voice, str] = Voice.ALLOY,
-        response_format: Union[AudioFormat, str] = AudioFormat.MP3,
-        instructions: Optional[str] = None,
-        max_length: int = 4096,
-        validate_length: bool = True,
-        **kwargs
-    ) -> TTSResponse:
-        """
-        Generate speech from text asynchronously.
-        Args:
-            text: Text to convert to speech
-            voice: Voice to use for generation
-            response_format: Audio format for output
-            instructions: Optional instructions for voice modulation
-            max_length: Maximum allowed text length in characters (default: 4096)
-            validate_length: Whether to validate text length (default: True)
-            **kwargs: Additional parameters
-        Returns:
-            TTSResponse: Generated audio response
-        Raises:
-            TTSException: If generation fails
-            ValueError: If text exceeds max_length and validate_length is True
-        """
-        # Create and validate request
-        request = TTSRequest(
-            input=sanitize_text(text),
-            voice=voice,
-            response_format=response_format,
-            instructions=instructions,
-            max_length=max_length,
-            validate_length=validate_length,
-            **kwargs
-        )
-        return await self._make_request(request)
-    async def generate_speech_long_text(
-        self,
-        text: str,
-        voice: Union[Voice, str] = Voice.ALLOY,
-        response_format: Union[AudioFormat, str] = AudioFormat.MP3,
-        instructions: Optional[str] = None,
-        max_length: int = 4096,
-        preserve_words: bool = True,
-        **kwargs
-    ) -> List[TTSResponse]:
-        """
-        Generate speech from long text by splitting it into chunks asynchronously.
-        This method automatically splits text that exceeds max_length into
-        smaller chunks and generates speech for each chunk concurrently.
-        Args:
-            text: Text to convert to speech
-            voice: Voice to use for generation
-            response_format: Audio format for output
-            instructions: Optional instructions for voice modulation
-            max_length: Maximum length per chunk (default: 4096)
-            preserve_words: Whether to avoid splitting words (default: True)
-            **kwargs: Additional parameters
-        Returns:
-            List[TTSResponse]: List of generated audio responses
-        Raises:
-            TTSException: If generation fails for any chunk
-        """
-        # Sanitize text first
-        clean_text = sanitize_text(text)
-        # Split text into chunks
-        chunks = split_text_by_length(clean_text, max_length, preserve_words)
-        if not chunks:
-            raise ValueError("No valid text chunks found after processing")
-        # Create requests for all chunks
-        requests = []
-        for chunk in chunks:
-            request = TTSRequest(
-                input=chunk,
-                voice=voice,
-                response_format=response_format,
-                instructions=instructions,
-                max_length=max_length,
-                validate_length=False,  # We already split the text
-                **kwargs
-            )
-            requests.append(request)
-        # Process all chunks concurrently
-        return await self.generate_speech_batch(requests)
-    async def generate_speech_batch(
-        self,
-        requests: List[TTSRequest]
-    ) -> List[TTSResponse]:
-        """
-        Generate speech for multiple requests concurrently.
-        Args:
-            requests: List of TTS requests
-        Returns:
-            List[TTSResponse]: List of generated audio responses
-        Raises:
-            TTSException: If any generation fails
-        """
-        if not requests:
-            return []
-        # Process requests concurrently with semaphore limiting
-        tasks = [self._make_request(request) for request in requests]
-        responses = await asyncio.gather(*tasks, return_exceptions=True)
-        # Check for exceptions and convert them
-        results = []
-        for i, response in enumerate(responses):
-            if isinstance(response, Exception):
-                raise TTSException(f"Request {i} failed: {str(response)}")
-            results.append(response)
-        return results
-    async def generate_speech_from_request(self, request: TTSRequest) -> TTSResponse:
-        """
-        Generate speech from a TTSRequest object asynchronously.
-        Args:
-            request: TTS request object
-        Returns:
-            TTSResponse: Generated audio response
-        """
-        return await self._make_request(request)
-    async def _make_request(self, request: TTSRequest) -> TTSResponse:
-        """
-        Make the actual HTTP request to the TTS service.
-        Args:
-            request: TTS request object
-        Returns:
-            TTSResponse: Generated audio response
-        Raises:
-            TTSException: If request fails
-        """
-        await self._ensure_session()
-        async with self._semaphore:  # Limit concurrent requests
-            url = build_url(self.base_url, "api/generate")
-            # Prepare form data for openai.fm API
-            form_data = {
-                'input': request.input,
-                'voice': request.voice.value,
-                'generation': str(uuid.uuid4()),
-                'response_format': request.response_format.value if hasattr(request.response_format, 'value') else str(request.response_format)
-            }
-            # Add prompt/instructions if provided
-            if request.instructions:
-                form_data['prompt'] = request.instructions
-            else:
-                # Default prompt for better quality
-                form_data['prompt'] = (
-                    "Affect/personality: Natural and clear\n\n"
-                    "Tone: Friendly and professional, creating a pleasant listening experience.\n\n"
-                    "Pronunciation: Clear, articulate, and steady, ensuring each word is easily understood "
-                    "while maintaining a natural, conversational flow.\n\n"
-                    "Pause: Brief, purposeful pauses between sentences to allow time for the listener "
-                    "to process the information.\n\n"
-                    "Emotion: Warm and engaging, conveying the intended message effectively."
-                )
-            logger.info(f"Generating speech for text: '{request.input[:50]}...' with voice: {request.voice}")
-            # Make request with retries
-            for attempt in range(self.max_retries + 1):
-                try:
-                    # Add random delay for rate limiting (except first attempt)
-                    if attempt > 0:
-                        delay = exponential_backoff(attempt - 1)
-                        logger.info(f"Retrying request after {delay:.2f}s (attempt {attempt + 1})")
-                        await asyncio.sleep(delay)
-                    # Use form data as required by openai.fm
-                    async with self._session.post(url, data=form_data) as response:
-                        # Handle different response types
-                        if response.status == 200:
-                            return await self._process_openai_fm_response(response, request)
-                        else:
-                            # Try to parse error response
-                            try:
-                                error_data = await response.json()
-                            except (json.JSONDecodeError, ValueError):
-                                text = await response.text()
-                                error_data = {"error": {"message": text or "Unknown error"}}
-                            # Create appropriate exception
-                            exception = create_exception_from_response(
-                                response.status,
-                                error_data,
-                                f"TTS request failed with status {response.status}"
-                            )
-                            # Don't retry for certain errors
-                            if response.status in [400, 401, 403, 404]:
-                                raise exception
-                            # For retryable errors, continue to next attempt
-                            if attempt == self.max_retries:
-                                raise exception
-                            logger.warning(f"Request failed with status {response.status}, retrying...")
-                            continue
-                except asyncio.TimeoutError:
-                    if attempt == self.max_retries:
-                        raise NetworkException(
-                            f"Request timed out after {self.timeout}s",
-                            timeout=self.timeout,
-                            retry_count=attempt
-                        )
-                    logger.warning(f"Request timed out, retrying...")
-                    continue
-                except aiohttp.ClientError as e:
-                    if attempt == self.max_retries:
-                        raise NetworkException(
-                            f"Client error: {str(e)}",
-                            retry_count=attempt
-                        )
-                    logger.warning(f"Client error, retrying...")
-                    continue
-            # This should never be reached, but just in case
-            raise TTSException("Maximum retries exceeded")
-    async def _process_openai_fm_response(
-        self,
-        response: aiohttp.ClientResponse,
-        request: TTSRequest
-    ) -> TTSResponse:
-        """
-        Process a successful response from the openai.fm TTS service.
-        Args:
-            response: HTTP response object
-            request: Original TTS request
-        Returns:
-            TTSResponse: Processed response object
-        """
-        # Get content type from response headers
-        content_type = response.headers.get("content-type", "audio/mpeg")
-        # Get audio data
-        audio_data = await response.read()
-        if not audio_data:
-            raise APIException("Received empty audio data from openai.fm")
-        # Determine format from content type
-        if "audio/mpeg" in content_type or "audio/mp3" in content_type:
-            actual_format = AudioFormat.MP3
-        elif "audio/wav" in content_type:
-            actual_format = AudioFormat.WAV
-        elif "audio/opus" in content_type:
-            actual_format = AudioFormat.OPUS
-        elif "audio/aac" in content_type:
-            actual_format = AudioFormat.AAC
-        elif "audio/flac" in content_type:
-            actual_format = AudioFormat.FLAC
-        else:
-            # Default to MP3 for openai.fm
-            actual_format = AudioFormat.MP3
-        # Estimate duration based on text length
-        estimated_duration = estimate_audio_duration(request.input)
-        # Check if returned format differs from requested format
-        requested_format = request.response_format
-        if isinstance(requested_format, str):
-            try:
-                requested_format = AudioFormat(requested_format.lower())
-            except ValueError:
-                requested_format = AudioFormat.MP3  # Default fallback
-        # Import here to avoid circular imports
-        from .models import maps_to_wav
-        # Check if format differs from request
-        if actual_format != requested_format:
-            if maps_to_wav(requested_format.value) and actual_format.value == "wav":
-                logger.debug(
-                    f"Format '{requested_format.value}' requested, returning WAV format."
-                )
-            else:
-                logger.warning(
-                    f"Requested format '{requested_format.value}' but received '{actual_format.value}' "
-                    f"from service."
-                )
-        # Create response object
-        tts_response = TTSResponse(
-            audio_data=audio_data,
-            content_type=content_type,
-            format=actual_format,
-            size=len(audio_data),
-            duration=estimated_duration,
-            metadata={
-                "response_headers": dict(response.headers),
-                "status_code": response.status,
-                "url": str(response.url),
-                "service": "openai.fm",
-                "voice": request.voice.value,
-                "original_text": request.input[:100] + "..." if len(request.input) > 100 else request.input,
-                "requested_format": requested_format.value,
-                "actual_format": actual_format.value
-            }
-        )
-        logger.info(
-            f"Successfully generated {format_file_size(len(audio_data))} "
-            f"of {actual_format.value.upper()} audio from openai.fm using voice '{request.voice.value}'"
-        )
-        return tts_response
-    async def close(self):
-        """Close the HTTP session."""
-        if self._session and not self._session.closed:
-            await self._session.close()

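The async client caps concurrency with an `asyncio.Semaphore` and collects results with `asyncio.gather(..., return_exceptions=True)`. The sketch below isolates that pattern using only the standard library. `run_bounded`, `fake_request`, and the peak counter are hypothetical names introduced for illustration, and `RuntimeError` stands in for `TTSException`.

```python
import asyncio


async def run_bounded(jobs, max_concurrent=2):
    """Run coroutine factories with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0  # highest number of jobs observed running at once

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:  # blocks while max_concurrent jobs are running
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    # return_exceptions=True keeps one failure from cancelling the others
    results = await asyncio.gather(
        *(guarded(job) for job in jobs), return_exceptions=True
    )
    for index, result in enumerate(results):
        if isinstance(result, Exception):
            # Chain the original exception so its traceback is preserved
            raise RuntimeError(f"Request {index} failed: {result}") from result
    return results, peak


async def fake_request(n):
    """Stand-in for an HTTP call; sleeps briefly and returns a value."""
    await asyncio.sleep(0.01)
    return n * 2


def demo():
    jobs = [lambda n=n: fake_request(n) for n in range(5)]
    return asyncio.run(run_bounded(jobs, max_concurrent=2))
```

Results come back in submission order regardless of completion order, which is what lets the long-text method return chunks in their original sequence.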
+"""
+Asynchronous TTS client implementation.
+This module provides the AsyncTTSClient class for asynchronous
+text-to-speech generation with OpenAI-compatible API.
+"""
+import json
+import uuid
+import asyncio
+import logging
+from typing import Optional, Dict, Any, Union, List
+import aiohttp
+from aiohttp import ClientTimeout, ClientSession
+from .models import (
+    TTSRequest, TTSResponse, Voice, AudioFormat,
+    get_content_type, get_format_from_content_type
+)
+from .exceptions import (
+    TTSException, APIException, NetworkException, ValidationException,
+    create_exception_from_response
+)
+from .utils import (
+    get_realistic_headers, sanitize_text, validate_url, build_url,
+    exponential_backoff, estimate_audio_duration, format_file_size,
+    validate_text_length, split_text_by_length
+)
+logger = logging.getLogger(__name__)
+class AsyncTTSClient:
+    """
+    Asynchronous TTS client for text-to-speech generation.
+    This client provides an async interface for generating speech from text
+    using OpenAI-compatible TTS services with support for concurrent requests.
+    Attributes:
+        base_url: Base URL for the TTS service
+        api_key: API key for authentication (if required)
+        timeout: Request timeout in seconds
+        max_retries: Maximum number of retry attempts
+        verify_ssl: Whether to verify SSL certificates
+        max_concurrent: Maximum concurrent requests
+    """
+    def __init__(
+        self,
+        base_url: str = "https://www.openai.fm",
+        api_key: Optional[str] = None,
+        timeout: float = 30.0,
+        max_retries: int = 3,
+        verify_ssl: bool = True,
+        max_concurrent: int = 10,
+        **kwargs
+    ):
+        """
+        Initialize the async TTS client.
+        Args:
+            base_url: Base URL for the TTS service
+            api_key: API key for authentication
+            timeout: Request timeout in seconds
+            max_retries: Maximum retry attempts
+            verify_ssl: Whether to verify SSL certificates
+            max_concurrent: Maximum concurrent requests
+            **kwargs: Additional configuration options
+        """
+        self.base_url = base_url.rstrip('/')
+        self.api_key = api_key
+        self.timeout = timeout
+        self.max_retries = max_retries
+        self.verify_ssl = verify_ssl
+        self.max_concurrent = max_concurrent
+        # Validate base URL
+        if not validate_url(self.base_url):
+            raise ValidationException(f"Invalid base URL: {self.base_url}")
+        # Session will be created when needed
+        self._session: Optional[ClientSession] = None
+        self._semaphore = asyncio.Semaphore(max_concurrent)
+        logger.info(f"Initialized async TTS client with base URL: {self.base_url}")
+    async def __aenter__(self):
+        """Async context manager entry."""
+        await self._ensure_session()
+        return self
+    async def __aexit__(self, exc_type, exc_val, exc_tb):
+        """Async context manager exit."""
+        await self.close()
+    async def _ensure_session(self):
+        """Ensure HTTP session is created."""
+        if self._session is None or self._session.closed:
+            # Setup headers
+            headers = get_realistic_headers()
+            if self.api_key:
+                headers["Authorization"] = f"Bearer {self.api_key}"
+            # Create timeout configuration
+            timeout = ClientTimeout(total=self.timeout)
+            # Create session
+            connector = aiohttp.TCPConnector(
+                verify_ssl=self.verify_ssl,
+                limit=self.max_concurrent * 2
+            )
+            self._session = ClientSession(
+                headers=headers,
+                timeout=timeout,
+                connector=connector
+            )
+    async def generate_speech(
+        self,
+        text: str,
+        voice: Union[Voice, str] = Voice.ALLOY,
+        response_format: Union[AudioFormat, str] = AudioFormat.MP3,
+        instructions: Optional[str] = None,
+        max_length: int = 4096,
+        validate_length: bool = True,
+        **kwargs
+    ) -> TTSResponse:
+        """
+        Generate speech from text asynchronously.
+        Args:
+            text: Text to convert to speech
+            voice: Voice to use for generation
+            response_format: Audio format for output
+            instructions: Optional instructions for voice modulation
+            max_length: Maximum allowed text length in characters (default: 4096)
+            validate_length: Whether to validate text length (default: True)
+            **kwargs: Additional parameters
+        Returns:
+            TTSResponse: Generated audio response
+        Raises:
+            TTSException: If generation fails
+            ValueError: If text exceeds max_length and validate_length is True
+        """
+        # Create and validate request
+        request = TTSRequest(
+            input=sanitize_text(text),
+            voice=voice,
+            response_format=response_format,
+            instructions=instructions,
+            max_length=max_length,
+            validate_length=validate_length,
+            **kwargs
+        )
+        return await self._make_request(request)
+    async def generate_speech_long_text(
+        self,
+        text: str,
+        voice: Union[Voice, str] = Voice.ALLOY,
+        response_format: Union[AudioFormat, str] = AudioFormat.MP3,
+        instructions: Optional[str] = None,
+        max_length: int = 4096,
+        preserve_words: bool = True,
+        **kwargs
+    ) -> List[TTSResponse]:
+        """
+        Generate speech from long text by splitting it into chunks asynchronously.
+        This method automatically splits text that exceeds max_length into
+        smaller chunks and generates speech for each chunk concurrently.
+        Args:
+            text: Text to convert to speech
+            voice: Voice to use for generation
+            response_format: Audio format for output
+            instructions: Optional instructions for voice modulation
+            max_length: Maximum length per chunk (default: 4096)
+            preserve_words: Whether to avoid splitting words (default: True)
+            **kwargs: Additional parameters
+        Returns:
+            List[TTSResponse]: List of generated audio responses
+        Raises:
+            TTSException: If generation fails for any chunk
+            ValueError: If no text chunks remain after sanitizing and splitting
+        """
+        # Sanitize text first
+        clean_text = sanitize_text(text)
+        # Split text into chunks
+        chunks = split_text_by_length(clean_text, max_length, preserve_words)
+        if not chunks:
+            raise ValueError("No valid text chunks found after processing")
+        # Create requests for all chunks
+        requests = []
+        for chunk in chunks:
+            request = TTSRequest(
+                input=chunk,
+                voice=voice,
+                response_format=response_format,
+                instructions=instructions,
+                max_length=max_length,
+                validate_length=False,  # We already split the text
+                **kwargs
+            )
+            requests.append(request)
+        # Process all chunks concurrently
+        return await self.generate_speech_batch(requests)
+    async def generate_speech_from_long_text(
+        self,
+        text: str,
+        voice: Union[Voice, str] = Voice.ALLOY,
+        response_format: Union[AudioFormat, str] = AudioFormat.MP3,
+        instructions: Optional[str] = None,
+        max_length: int = 4096,
+        preserve_words: bool = True,
+        **kwargs
+    ) -> List[TTSResponse]:
+        """
+        Generate speech from long text by splitting it into chunks asynchronously.
+        This is an alias for generate_speech_long_text for consistency.
+        Args:
+            text: Text to convert to speech
+            voice: Voice to use for generation
+            response_format: Audio format for output
+            instructions: Optional instructions for voice modulation
+            max_length: Maximum length per chunk (default: 4096)
+            preserve_words: Whether to avoid splitting words (default: True)
+            **kwargs: Additional parameters
+        Returns:
+            List[TTSResponse]: List of generated audio responses
+        Raises:
+            TTSException: If generation fails for any chunk
+            ValueError: If no text chunks remain after sanitizing and splitting
+        """
+        return await self.generate_speech_long_text(
+            text=text,
+            voice=voice,
+            response_format=response_format,
+            instructions=instructions,
+            max_length=max_length,
+            preserve_words=preserve_words,
+            **kwargs
+        )
+    async def generate_speech_batch(
+        self,
+        requests: List[TTSRequest]
+    ) -> List[TTSResponse]:
+        """
+        Generate speech for multiple requests concurrently.
+        Args:
+            requests: List of TTS requests
+        Returns:
+            List[TTSResponse]: List of generated audio responses
+        Raises:
+            TTSException: If any generation fails
+        """
+        if not requests:
+            return []
+        # Process requests concurrently with semaphore limiting
+        tasks = [self._make_request(request) for request in requests]
+        responses = await asyncio.gather(*tasks, return_exceptions=True)
+        # Check for exceptions and convert them
+        results = []
+        for i, response in enumerate(responses):
+            if isinstance(response, Exception):
+                raise TTSException(f"Request {i} failed: {str(response)}") from response
+            results.append(response)
+        return results
+    async def generate_speech_from_request(self, request: TTSRequest) -> TTSResponse:
+        """
+        Generate speech from a TTSRequest object asynchronously.
+        Args:
+            request: TTS request object
+        Returns:
+            TTSResponse: Generated audio response
+        """
+        return await self._make_request(request)
+    async def _make_request(self, request: TTSRequest) -> TTSResponse:
+        """
+        Make the actual HTTP request to the TTS service.
+        Args:
+            request: TTS request object
+        Returns:
+            TTSResponse: Generated audio response
+        Raises:
+            TTSException: If request fails
+        """
+        await self._ensure_session()
+        async with self._semaphore:  # Limit concurrent requests
+            url = build_url(self.base_url, "api/generate")
+            # Prepare form data for openai.fm API
+            form_data = {
+                'input': request.input,
+                'voice': request.voice.value,
+                'generation': str(uuid.uuid4()),
+                'response_format': request.response_format.value if hasattr(request.response_format, 'value') else str(request.response_format)
+            }
+            # Add prompt/instructions if provided
+            if request.instructions:
+                form_data['prompt'] = request.instructions
+            else:
+                # Default prompt for better quality
+                form_data['prompt'] = (
+                    "Affect/personality: Natural and clear\n\n"
+                    "Tone: Friendly and professional, creating a pleasant listening experience.\n\n"
+                    "Pronunciation: Clear, articulate, and steady, ensuring each word is easily understood "
+                    "while maintaining a natural, conversational flow.\n\n"
+                    "Pause: Brief, purposeful pauses between sentences to allow time for the listener "
+                    "to process the information.\n\n"
+                    "Emotion: Warm and engaging, conveying the intended message effectively."
+                )
+            logger.info(f"Generating speech for text: '{request.input[:50]}...' with voice: {request.voice}")
+            # Make request with retries
+            for attempt in range(self.max_retries + 1):
+                try:
+                    # Add random delay for rate limiting (except first attempt)
+                    if attempt > 0:
+                        delay = exponential_backoff(attempt - 1)
+                        logger.info(f"Retrying request after {delay:.2f}s (attempt {attempt + 1})")
+                        await asyncio.sleep(delay)
+                    # Use form data as required by openai.fm
+                    async with self._session.post(url, data=form_data) as response:
+                        # Handle different response types
+                        if response.status == 200:
+                            return await self._process_openai_fm_response(response, request)
+                        else:
+                            # Try to parse error response
+                            try:
+                                error_data = await response.json()
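+                            except (aiohttp.ContentTypeError, json.JSONDecodeError, ValueError):
+                                # aiohttp raises ContentTypeError (a ClientError subclass) for non-JSON bodies;
+                                # catch it here so the outer ClientError handler does not retry the request.
+                                text = await response.text()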
+                                error_data = {"error": {"message": text or "Unknown error"}}
+                            # Create appropriate exception
+                            exception = create_exception_from_response(
+                                response.status,
+                                error_data,
+                                f"TTS request failed with status {response.status}"
+                            )
+                            # Don't retry for certain errors
+                            if response.status in [400, 401, 403, 404]:
+                                raise exception
+                            # For retryable errors, continue to next attempt
+                            if attempt == self.max_retries:
+                                raise exception
+                            logger.warning(f"Request failed with status {response.status}, retrying...")
+                            continue
+                except asyncio.TimeoutError:
+                    if attempt == self.max_retries:
+                        raise NetworkException(
+                            f"Request timed out after {self.timeout}s",
+                            timeout=self.timeout,
+                            retry_count=attempt
+                        )
+                    logger.warning("Request timed out, retrying...")
+                    continue
+                except aiohttp.ClientError as e:
+                    if attempt == self.max_retries:
+                        raise NetworkException(
+                            f"Client error: {str(e)}",
+                            retry_count=attempt
+                        )
+                    logger.warning("Client error, retrying...")
+                    continue
+            # This should never be reached, but just in case
+            raise TTSException("Maximum retries exceeded")
+    async def _process_openai_fm_response(
+        self,
+        response: aiohttp.ClientResponse,
+        request: TTSRequest
+    ) -> TTSResponse:
+        """
+        Process a successful response from the openai.fm TTS service.
+        Args:
+            response: HTTP response object
+            request: Original TTS request
+        Returns:
+            TTSResponse: Processed response object
+        """
+        # Get content type from response headers
+        content_type = response.headers.get("content-type", "audio/mpeg")
+        # Get audio data
+        audio_data = await response.read()
+        if not audio_data:
+            raise APIException("Received empty audio data from openai.fm")
+        # Determine format from content type
+        if "audio/mpeg" in content_type or "audio/mp3" in content_type:
+            actual_format = AudioFormat.MP3
+        elif "audio/wav" in content_type:
+            actual_format = AudioFormat.WAV
+        elif "audio/opus" in content_type:
+            actual_format = AudioFormat.OPUS
+        elif "audio/aac" in content_type:
+            actual_format = AudioFormat.AAC
+        elif "audio/flac" in content_type:
+            actual_format = AudioFormat.FLAC
+        else:
+            # Default to MP3 for openai.fm
+            actual_format = AudioFormat.MP3
+        # Estimate duration based on text length
+        estimated_duration = estimate_audio_duration(request.input)
+        # Check if returned format differs from requested format
+        requested_format = request.response_format
+        if isinstance(requested_format, str):
+            try:
+                requested_format = AudioFormat(requested_format.lower())
+            except ValueError:
+                requested_format = AudioFormat.MP3  # Default fallback
+        # Import here to avoid circular imports
+        from .models import maps_to_wav
+        # Check if format differs from request
+        if actual_format != requested_format:
+            if maps_to_wav(requested_format.value) and actual_format.value == "wav":
+                logger.debug(
+                    f"Format '{requested_format.value}' requested, returning WAV format."
+                )
+            else:
+                logger.warning(
+                    f"Requested format '{requested_format.value}' but received '{actual_format.value}' "
+                    f"from service."
+                )
+        # Create response object
+        tts_response = TTSResponse(
+            audio_data=audio_data,
+            content_type=content_type,
+            format=actual_format,
+            size=len(audio_data),
+            duration=estimated_duration,
+            metadata={
+                "response_headers": dict(response.headers),
+                "status_code": response.status,
+                "url": str(response.url),
+                "service": "openai.fm",
+                "voice": request.voice.value,
+                "original_text": request.input[:100] + "..." if len(request.input) > 100 else request.input,
+                "requested_format": requested_format.value,
+                "actual_format": actual_format.value
+            }
+        )
+        logger.info(
+            f"Successfully generated {format_file_size(len(audio_data))} "
+            f"of {actual_format.value.upper()} audio from openai.fm using voice '{request.voice.value}'"
+        )
+        return tts_response
+    async def close(self):
+        """Close the HTTP session."""
+        if self._session and not self._session.closed:
+            await self._session.close()
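
Reviewer note: the retry loop in `_make_request` above is easier to follow in isolation. The following is a minimal sketch, not the project's implementation. `backoff_delay` is a hypothetical stand-in for `ttsfm.utils.exponential_backoff`, whose body is not shown in this diff, so the doubling-with-jitter formula is an assumption. `call_with_retries` and its `(status, payload)` convention are likewise illustrative names only.

```python
import asyncio
import random


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Hypothetical backoff: double the delay per attempt, cap it, add jitter."""
    delay = min(cap, base * (2 ** attempt))
    # Jitter keeps concurrent clients from retrying in lockstep.
    return delay * random.uniform(0.5, 1.0)


async def call_with_retries(
    operation,
    max_retries: int = 3,
    non_retryable=(400, 401, 403, 404),
    sleep=asyncio.sleep,
):
    """Call `operation()` until it reports status 200 or the retries run out.

    `operation` is an async callable returning a (status, payload) tuple.
    """
    for attempt in range(max_retries + 1):
        # No delay before the first attempt, backoff before each retry.
        if attempt > 0:
            await sleep(backoff_delay(attempt - 1))
        status, payload = await operation()
        if status == 200:
            return payload
        # Client errors will not succeed on retry, so fail immediately.
        if status in non_retryable or attempt == max_retries:
            raise RuntimeError(f"request failed with status {status}")
    raise RuntimeError("unreachable")
```

As in the diff, statuses 400, 401, 403 and 404 fail on the first attempt, while any other non-200 status is retried up to `max_retries` times. The `sleep` parameter exists only so the delay can be replaced in tests.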

ttsfm/cli.py CHANGED

@@ -1,362 +1,363 @@
-#!/usr/bin/env python3
-"""
-Command-line interface for TTSFM.
-This module provides a command-line interface for the TTSFM package,
-allowing users to generate speech from text using various options.
-"""
-import argparse
-import sys
-import os
-from typing import Optional
-from pathlib import Path
-from .client import TTSClient
-from .models import Voice, AudioFormat
-from .exceptions import TTSException, APIException, NetworkException
-def create_parser() -> argparse.ArgumentParser:
-    """Create and configure the argument parser."""
-    parser = argparse.ArgumentParser(
-        prog="ttsfm",
-        description="TTSFM - Text-to-Speech API Client",
-        formatter_class=argparse.RawDescriptionHelpFormatter,
-        epilog="""
-Examples:
-  ttsfm "Hello, world!" --output hello.mp3
-  ttsfm "Hello, world!" --voice nova --format wav --output hello.wav
-  ttsfm "Hello, world!" --url http://localhost:7000 --output hello.mp3
-  ttsfm --text-file input.txt --output speech.mp3
-        """
-    )
-    # Text input options (mutually exclusive)
-    text_group = parser.add_mutually_exclusive_group(required=True)
-    text_group.add_argument(
-        "text",
-        nargs="?",
-        help="Text to convert to speech"
-    )
-    text_group.add_argument(
-        "--text-file", "-f",
-        type=str,
-        help="Read text from file"
-    )
-    # Output options
-    parser.add_argument(
-        "--output", "-o",
-        type=str,
-        required=True,
-        help="Output file path"
-    )
-    # TTS options
-    parser.add_argument(
-        "--voice", "-v",
-        type=str,
-        default="alloy",
-        choices=["alloy", "echo", "fable", "onyx", "nova", "shimmer"],
-        help="Voice to use for speech generation (default: alloy)"
-    )
-    parser.add_argument(
-        "--format",
-        type=str,
-        default="mp3",
-        choices=["mp3", "opus", "aac", "flac", "wav", "pcm"],
-        help="Audio format (default: mp3)"
-    )
-    parser.add_argument(
-        "--speed",
-        type=float,
-        default=1.0,
-        help="Speech speed (0.25 to 4.0, default: 1.0)"
-    )
-    # Client options
-    parser.add_argument(
-        "--url", "-u",
-        type=str,
-        default="http://localhost:7000",
-        help="TTS service URL (default: http://localhost:7000)"
-    )
-    parser.add_argument(
-        "--api-key", "-k",
-        type=str,
-        help="API key for authentication"
-    )
-    parser.add_argument(
-        "--timeout",
-        type=float,
-        default=30.0,
-        help="Request timeout in seconds (default: 30.0)"
-    )
-    parser.add_argument(
-        "--retries",
-        type=int,
-        default=3,
-        help="Maximum number of retries (default: 3)"
-    )
-    # Text length validation options
-    parser.add_argument(
-        "--max-length",
-        type=int,
-        default=4096,
-        help="Maximum text length in characters (default: 4096)"
-    )
-    parser.add_argument(
-        "--no-length-validation",
-        action="store_true",
-        help="Disable text length validation"
-    )
-    parser.add_argument(
-        "--split-long-text",
-        action="store_true",
-        help="Automatically split long text into chunks"
-    )
-    # Other options
-    parser.add_argument(
-        "--verbose", "-V",
-        action="store_true",
-        help="Enable verbose output"
-    )
-    parser.add_argument(
-        "--version",
-        action="version",
-        version=f"%(prog)s {get_version()}"
-    )
-    return parser
-def get_version() -> str:
-    """Get the package version."""
-    try:
-        from . import __version__
-        return __version__
-    except ImportError:
-        return "unknown"
-def read_text_file(file_path: str) -> str:
-    """Read text from a file."""
-    try:
-        with open(file_path, 'r', encoding='utf-8') as f:
-            return f.read().strip()
-    except FileNotFoundError:
-        print(f"Error: File '{file_path}' not found.", file=sys.stderr)
-        sys.exit(1)
-    except Exception as e:
-        print(f"Error reading file '{file_path}': {e}", file=sys.stderr)
-        sys.exit(1)
-def validate_speed(speed: float) -> float:
-    """Validate and return the speed parameter."""
-    if not 0.25 <= speed <= 4.0:
-        print("Error: Speed must be between 0.25 and 4.0", file=sys.stderr)
-        sys.exit(1)
-    return speed
-def get_voice_enum(voice_str: str) -> Voice:
-    """Convert voice string to Voice enum."""
-    voice_map = {
-        "alloy": Voice.ALLOY,
-        "echo": Voice.ECHO,
-        "fable": Voice.FABLE,
-        "onyx": Voice.ONYX,
-        "nova": Voice.NOVA,
-        "shimmer": Voice.SHIMMER,
-    }
-    return voice_map[voice_str.lower()]
-def get_format_enum(format_str: str) -> AudioFormat:
-    """Convert format string to AudioFormat enum."""
-    format_map = {
-        "mp3": AudioFormat.MP3,
-        "opus": AudioFormat.OPUS,
-        "aac": AudioFormat.AAC,
-        "flac": AudioFormat.FLAC,
-        "wav": AudioFormat.WAV,
-        "pcm": AudioFormat.PCM,
-    }
-    return format_map[format_str.lower()]
-def handle_long_text(args, text: str, voice: Voice, audio_format: AudioFormat, speed: float) -> None:
-    """Handle long text by splitting it into chunks and generating multiple files."""
-    from .utils import split_text_by_length
-    import os
-    # Split text into chunks
-    chunks = split_text_by_length(text, args.max_length, preserve_words=True)
-    if not chunks:
-        print("Error: No valid text chunks found after processing.", file=sys.stderr)
-        sys.exit(1)
-    print(f"Split text into {len(chunks)} chunks")
-    # Create client
-    try:
-        client = TTSClient(
-            base_url=args.url,
-            api_key=args.api_key,
-            timeout=args.timeout,
-            max_retries=args.retries
-        )
-        # Generate speech for each chunk
-        base_name, ext = os.path.splitext(args.output)
-        for i, chunk in enumerate(chunks, 1):
-            if args.verbose:
-                print(f"Processing chunk {i}/{len(chunks)} ({len(chunk)} characters)...")
-            # Generate filename for this chunk
-            if len(chunks) == 1:
-                output_file = args.output
-            else:
-                output_file = f"{base_name}_part{i:03d}{ext}"
-            # Generate speech for this chunk
-            audio_data = client.generate_speech(
-                text=chunk,
-                voice=voice,
-                response_format=audio_format,
-                speed=speed,
-                max_length=args.max_length,
-                validate_length=False  # We already split the text
-            )
-            # Save to file
-            with open(output_file, 'wb') as f:
-                f.write(audio_data)
-            print(f"Generated: {output_file}")
-        if len(chunks) > 1:
-            print(f"\nGenerated {len(chunks)} audio files from long text.")
-            print(f"Files: {base_name}_part001{ext} to {base_name}_part{len(chunks):03d}{ext}")
-    except Exception as e:
-        print(f"Error processing long text: {e}", file=sys.stderr)
-        if args.verbose:
-            import traceback
-            traceback.print_exc()
-        sys.exit(1)
-def main() -> None:
-    """Main CLI entry point."""
-    parser = create_parser()
-    args = parser.parse_args()
-    # Get text input
-    if args.text:
-        text = args.text
-    else:
-        text = read_text_file(args.text_file)
-    if not text:
-        print("Error: No text provided.", file=sys.stderr)
-        sys.exit(1)
-    # Validate parameters
-    speed = validate_speed(args.speed)
-    voice = get_voice_enum(args.voice)
-    audio_format = get_format_enum(args.format)
-    # Create output directory if needed
-    output_path = Path(args.output)
-    output_path.parent.mkdir(parents=True, exist_ok=True)
-    # Check text length and handle accordingly
-    text_length = len(text)
-    validate_length = not args.no_length_validation
-    if args.verbose:
-        print(f"Text: {text[:50]}{'...' if len(text) > 50 else ''}")
-        print(f"Text length: {text_length} characters")
-        print(f"Max length: {args.max_length}")
-        print(f"Length validation: {'enabled' if validate_length else 'disabled'}")
-        print(f"Voice: {args.voice}")
-        print(f"Format: {args.format}")
-        print(f"Speed: {speed}")
-        print(f"URL: {args.url}")
-        print(f"Output: {args.output}")
-        print()
-    # Handle long text
-    if text_length > args.max_length:
-        if args.split_long_text:
-            print(f"Text is {text_length} characters, splitting into chunks...")
-            return handle_long_text(args, text, voice, audio_format, speed)
-        elif validate_length:
-            print(f"Error: Text is too long ({text_length} characters). "
-                  f"Maximum allowed is {args.max_length} characters.", file=sys.stderr)
-            print("Use --split-long-text to automatically split the text, "
-                  "or --no-length-validation to disable this check.", file=sys.stderr)
-            sys.exit(1)
-    # Create client
-    try:
-        client = TTSClient(
-            base_url=args.url,
-            api_key=args.api_key,
-            timeout=args.timeout,
-            max_retries=args.retries
-        )
-        if args.verbose:
-            print("Generating speech...")
-        # Generate speech
-        audio_data = client.generate_speech(
-            text=text,
-            voice=voice,
-            response_format=audio_format,
-            speed=speed,
-            max_length=args.max_length,
-            validate_length=validate_length
-        )
-        # Save to file
-        with open(args.output, 'wb') as f:
-            f.write(audio_data)
-        print(f"Speech generated successfully: {args.output}")
-    except NetworkException as e:
-        print(f"Network error: {e}", file=sys.stderr)
-        sys.exit(1)
-    except APIException as e:
-        print(f"API error: {e}", file=sys.stderr)
-        sys.exit(1)
-    except TTSException as e:
-        print(f"TTS error: {e}", file=sys.stderr)
-        sys.exit(1)
-    except Exception as e:
-        print(f"Unexpected error: {e}", file=sys.stderr)
-        if args.verbose:
-            import traceback
-            traceback.print_exc()
-        sys.exit(1)
-if __name__ == "__main__":
-    main()

+#!/usr/bin/env python3
+"""
+Command-line interface for TTSFM.
+This module provides a command-line interface for the TTSFM package,
+allowing users to generate speech from text using various options.
+"""
+import argparse
+import sys
+import os
+from typing import Optional
+from pathlib import Path
+from .client import TTSClient
+from .models import Voice, AudioFormat
+from .exceptions import TTSException, APIException, NetworkException
+def create_parser() -> argparse.ArgumentParser:
+    """Create and configure the argument parser."""
+    parser = argparse.ArgumentParser(
+        prog="ttsfm",
+        description="TTSFM - Text-to-Speech API Client",
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog="""
+Examples:
+  ttsfm "Hello, world!" --output hello.mp3
+  ttsfm "Hello, world!" --voice nova --format wav --output hello.wav
+  ttsfm "Hello, world!" --url http://localhost:7000 --output hello.mp3
+  ttsfm --text-file input.txt --output speech.mp3
+        """
+    )
+    # Text input options (mutually exclusive)
+    text_group = parser.add_mutually_exclusive_group(required=True)
+    text_group.add_argument(
+        "text",
+        nargs="?",
+        help="Text to convert to speech"
+    )
+    text_group.add_argument(
+        "--text-file", "-f",
+        type=str,
+        help="Read text from file"
+    )
+    # Output options
+    parser.add_argument(
+        "--output", "-o",
+        type=str,
+        required=True,
+        help="Output file path"
+    )
+    # TTS options
+    parser.add_argument(
+        "--voice", "-v",
+        type=str,
+        default="alloy",
+        choices=["alloy", "ash", "ballad", "coral", "echo", "fable", "nova", "onyx", "sage", "shimmer", "verse"],
+        help="Voice to use for speech generation (default: alloy)"
+    )
+    parser.add_argument(
+        "--format",
+        type=str,
+        default="mp3",
+        choices=["mp3", "opus", "aac", "flac", "wav", "pcm"],
+        help="Audio format (default: mp3)"
+    )
+    parser.add_argument(
+        "--speed",
+        type=float,
+        default=1.0,
+        help="Speech speed (0.25 to 4.0, default: 1.0)"
+    )
+    # Client options
+    parser.add_argument(
+        "--url", "-u",
+        type=str,
+        default="http://localhost:7000",
+        help="TTS service URL (default: http://localhost:7000)"
+    )
+    parser.add_argument(
+        "--api-key", "-k",
+        type=str,
+        help="API key for authentication"
+    )
+    parser.add_argument(
+        "--timeout",
+        type=float,
+        default=30.0,
+        help="Request timeout in seconds (default: 30.0)"
+    )
+    parser.add_argument(
+        "--retries",
+        type=int,
+        default=3,
+        help="Maximum number of retries (default: 3)"
+    )
+    # Text length validation options
+    parser.add_argument(
+        "--max-length",
+        type=int,
+        default=4096,
+        help="Maximum text length in characters (default: 4096)"
+    )
+    parser.add_argument(
+        "--no-length-validation",
+        action="store_true",
+        help="Disable text length validation"
+    )
+    parser.add_argument(
+        "--split-long-text",
+        action="store_true",
+        help="Automatically split long text into chunks"
+    )
+    # Other options
+    parser.add_argument(
+        "--verbose", "-V",
+        action="store_true",
+        help="Enable verbose output"
+    )
+    parser.add_argument(
+        "--version",
+        action="version",
+        version=f"%(prog)s {get_version()}"
+    )
+    return parser
+def get_version() -> str:
+    """Get the package version."""
+    try:
+        from . import __version__
+        return __version__
+    except ImportError:
+        return "unknown"
+def read_text_file(file_path: str) -> str:
+    """Read text from a file."""
+    try:
+        with open(file_path, 'r', encoding='utf-8') as f:
+            return f.read().strip()
+    except FileNotFoundError:
+        print(f"Error: File '{file_path}' not found.", file=sys.stderr)
+        sys.exit(1)
+    except Exception as e:
+        print(f"Error reading file '{file_path}': {e}", file=sys.stderr)
+        sys.exit(1)
+def validate_speed(speed: float) -> float:
+    """Validate and return the speed parameter."""
+    if not 0.25 <= speed <= 4.0:
+        print("Error: Speed must be between 0.25 and 4.0", file=sys.stderr)
+        sys.exit(1)
+    return speed
+def get_voice_enum(voice_str: str) -> Voice:
+    """Convert voice string to Voice enum."""
+    voice_map = {
+        "alloy": Voice.ALLOY,
+        "ash": Voice.ASH,
+        "ballad": Voice.BALLAD,
+        "coral": Voice.CORAL,
+        "echo": Voice.ECHO,
+        "fable": Voice.FABLE,
+        "nova": Voice.NOVA,
+        "onyx": Voice.ONYX,
+        "sage": Voice.SAGE,
+        "shimmer": Voice.SHIMMER,
+        "verse": Voice.VERSE,
+    }
+    return voice_map[voice_str.lower()]
+def get_format_enum(format_str: str) -> AudioFormat:
+    """Convert format string to AudioFormat enum."""
+    format_map = {
+        "mp3": AudioFormat.MP3,
+        "opus": AudioFormat.OPUS,
+        "aac": AudioFormat.AAC,
+        "flac": AudioFormat.FLAC,
+        "wav": AudioFormat.WAV,
+        "pcm": AudioFormat.PCM,
+    }
+    return format_map[format_str.lower()]
+def handle_long_text(args, text: str, voice: Voice, audio_format: AudioFormat, speed: float) -> None:
+    """Handle long text by splitting it into chunks and generating multiple files."""
+    # Create client
+    try:
+        client = TTSClient(
+            base_url=args.url,
+            api_key=args.api_key,
+            timeout=args.timeout,
+            max_retries=args.retries
+        )
+        # Use the new long text method
+        responses = client.generate_speech_long_text(
+            text=text,
+            voice=voice,
+            response_format=audio_format,
+            speed=speed,
+            max_length=args.max_length,
+            preserve_words=True
+        )
+        if not responses:
+            print("Error: No valid text chunks found after processing.", file=sys.stderr)
+            sys.exit(1)
+        print(f"Generated {len(responses)} audio chunks")
+        # Save each response to a file
+        base_name, ext = os.path.splitext(args.output)
+        for i, response in enumerate(responses, 1):
+            if args.verbose:
+                print(f"Saving chunk {i}/{len(responses)}...")
+            # Generate filename for this chunk
+            if len(responses) == 1:
+                output_file = args.output
+            else:
+                output_file = f"{base_name}_part{i:03d}{ext}"
+            # Save to file
+            with open(output_file, 'wb') as f:
+                f.write(response.audio_data)
+            print(f"Generated: {output_file}")
+        if len(responses) > 1:
+            print(f"\nGenerated {len(responses)} audio files from long text.")
+            print(f"Files: {base_name}_part001{ext} to {base_name}_part{len(responses):03d}{ext}")
+    except Exception as e:
+        print(f"Error processing long text: {e}", file=sys.stderr)
+        if args.verbose:
+            import traceback
+            traceback.print_exc()
+        sys.exit(1)
+def main() -> None:
+    """Main CLI entry point."""
+    parser = create_parser()
+    args = parser.parse_args()
+    # Get text input
+    if args.text:
+        text = args.text
+    else:
+        text = read_text_file(args.text_file)
+    if not text:
+        print("Error: No text provided.", file=sys.stderr)
+        sys.exit(1)
+    # Validate parameters
+    speed = validate_speed(args.speed)
+    voice = get_voice_enum(args.voice)
+    audio_format = get_format_enum(args.format)
+    # Create output directory if needed
+    output_path = Path(args.output)
+    output_path.parent.mkdir(parents=True, exist_ok=True)
+    # Check text length and handle accordingly
+    text_length = len(text)
+    validate_length = not args.no_length_validation
+    if args.verbose:
+        print(f"Text: {text[:50]}{'...' if len(text) > 50 else ''}")
+        print(f"Text length: {text_length} characters")
+        print(f"Max length: {args.max_length}")
+        print(f"Length validation: {'enabled' if validate_length else 'disabled'}")
+        print(f"Voice: {args.voice}")
+        print(f"Format: {args.format}")
+        print(f"Speed: {speed}")
+        print(f"URL: {args.url}")
+        print(f"Output: {args.output}")
+        print()
+    # Handle long text
+    if text_length > args.max_length:
+        if args.split_long_text:
+            print(f"Text is {text_length} characters, splitting into chunks...")
+            return handle_long_text(args, text, voice, audio_format, speed)
+        elif validate_length:
+            print(f"Error: Text is too long ({text_length} characters). "
+                  f"Maximum allowed is {args.max_length} characters.", file=sys.stderr)
+            print("Use --split-long-text to automatically split the text, "
+                  "or --no-length-validation to disable this check.", file=sys.stderr)
+            sys.exit(1)
+    # Create client
+    try:
+        client = TTSClient(
+            base_url=args.url,
+            api_key=args.api_key,
+            timeout=args.timeout,
+            max_retries=args.retries
+        )
+        if args.verbose:
+            print("Generating speech...")
+        # Generate speech
+        response = client.generate_speech(
+            text=text,
+            voice=voice,
+            response_format=audio_format,
+            speed=speed,
+            max_length=args.max_length,
+            validate_length=validate_length
+        )
+        # Save to file
+        with open(args.output, 'wb') as f:
+            f.write(response.audio_data)
+        print(f"Speech generated successfully: {args.output}")
+    except NetworkException as e:
+        print(f"Network error: {e}", file=sys.stderr)
+        sys.exit(1)
+    except APIException as e:
+        print(f"API error: {e}", file=sys.stderr)
+        sys.exit(1)
+    except TTSException as e:
+        print(f"TTS error: {e}", file=sys.stderr)
+        sys.exit(1)
+    except Exception as e:
+        print(f"Unexpected error: {e}", file=sys.stderr)
+        if args.verbose:
+            import traceback
+            traceback.print_exc()
+        sys.exit(1)
+if __name__ == "__main__":
+    main()
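
Reviewer note: the new `handle_long_text` writes either the single `--output` path or a series of `_partNNN` files. The naming rule can be sketched on its own as below. `chunk_output_paths` is a hypothetical helper name introduced for illustration and does not exist in the package.

```python
import os
from typing import List


def chunk_output_paths(output: str, count: int) -> List[str]:
    """Return the file paths used for `count` audio chunks (hypothetical helper)."""
    # A single chunk keeps the path the user asked for.
    if count == 1:
        return [output]
    # Several chunks get a zero-padded, 1-based part suffix before the extension.
    base, ext = os.path.splitext(output)
    return [f"{base}_part{i:03d}{ext}" for i in range(1, count + 1)]
```

For example, `chunk_output_paths("speech.mp3", 2)` yields `speech_part001.mp3` and `speech_part002.mp3`, matching the `Files: ...` summary printed by the CLI.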

ttsfm/client.py CHANGED

@@ -1,481 +1,530 @@
-"""
-Main TTS client implementation.
-This module provides the primary TTSClient class for synchronous
-text-to-speech generation with OpenAI-compatible API.
-"""
-import json
-import time
-import uuid
-import logging
-from typing import Optional, Dict, Any, Union, List
-from urllib.parse import urljoin
-import requests
-from requests.adapters import HTTPAdapter
-from urllib3.util.retry import Retry
-from .models import (
-    TTSRequest, TTSResponse, Voice, AudioFormat,
-    get_content_type, get_format_from_content_type
-)
-from .exceptions import (
-    TTSException, APIException, NetworkException, ValidationException,
-    create_exception_from_response
-)
-from .utils import (
-    get_realistic_headers, sanitize_text, validate_url, build_url,
-    exponential_backoff, estimate_audio_duration, format_file_size,
-    validate_text_length, split_text_by_length
-)
-logger = logging.getLogger(__name__)
-class TTSClient:
-    """
-    Synchronous TTS client for text-to-speech generation.
-    This client provides a simple interface for generating speech from text
-    using OpenAI-compatible TTS services.
-    Attributes:
-        base_url: Base URL for the TTS service
-        api_key: API key for authentication (if required)
-        timeout: Request timeout in seconds
-        max_retries: Maximum number of retry attempts
-        verify_ssl: Whether to verify SSL certificates
-    """
-    def __init__(
-        self,
-        base_url: str = "https://www.openai.fm",
-        api_key: Optional[str] = None,
-        timeout: float = 30.0,
-        max_retries: int = 3,
-        verify_ssl: bool = True,
-        preferred_format: Optional[AudioFormat] = None,
-        **kwargs
-    ):
-        """
-        Initialize the TTS client.
-        Args:
-            base_url: Base URL for the TTS service
-            api_key: API key for authentication
-            timeout: Request timeout in seconds
-            max_retries: Maximum retry attempts
-            verify_ssl: Whether to verify SSL certificates
-            preferred_format: Preferred audio format (affects header selection)
-            **kwargs: Additional configuration options
-        """
-        self.base_url = base_url.rstrip('/')
-        self.api_key = api_key
-        self.timeout = timeout
-        self.max_retries = max_retries
-        self.verify_ssl = verify_ssl
-        self.preferred_format = preferred_format or AudioFormat.WAV
-        # Validate base URL
-        if not validate_url(self.base_url):
-            raise ValidationException(f"Invalid base URL: {self.base_url}")
-        # Setup HTTP session with retry strategy
-        self.session = requests.Session()
-        # Configure retry strategy
-        retry_strategy = Retry(
-            total=max_retries,
-            status_forcelist=[429, 500, 502, 503, 504],
-            allowed_methods=["HEAD", "GET", "POST"],  # Updated parameter name
-            backoff_factor=1
-        )
-        adapter = HTTPAdapter(max_retries=retry_strategy)
-        self.session.mount("http://", adapter)
-        self.session.mount("https://", adapter)
-        # Set default headers
-        self.session.headers.update(get_realistic_headers())
-        if self.api_key:
-            self.session.headers["Authorization"] = f"Bearer {self.api_key}"
-        logger.info(f"Initialized TTS client with base URL: {self.base_url}")
-    def _get_headers_for_format(self, requested_format: AudioFormat) -> Dict[str, str]:
-        """
-        Get appropriate headers to get the desired format from openai.fm.
-        Based on testing, openai.fm returns:
-        - MP3: When using simple/minimal headers
-        - WAV: When using full Chrome security headers
-        Args:
-            requested_format: The desired audio format
-        Returns:
-            Dict[str, str]: HTTP headers optimized for the requested format
-        """
-        from .models import get_supported_format
-        # Map requested format to supported format
-        target_format = get_supported_format(requested_format)
-        if target_format == AudioFormat.MP3:
-            # Use minimal headers to get MP3 response
-            return {
-                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36',
-                'Accept': 'audio/*,*/*;q=0.9'
-            }
-        else:
-            # Use full realistic headers to get WAV response
-            # This works for WAV, OPUS, AAC, FLAC, PCM formats
-            return get_realistic_headers()
-    def generate_speech(
-        self,
-        text: str,
-        voice: Union[Voice, str] = Voice.ALLOY,
-        response_format: Union[AudioFormat, str] = AudioFormat.MP3,
-        instructions: Optional[str] = None,
-        max_length: int = 4096,
-        validate_length: bool = True,
-        **kwargs
-    ) -> TTSResponse:
-        """
-        Generate speech from text.
-        Args:
-            text: Text to convert to speech
-            voice: Voice to use for generation
-            response_format: Audio format for output
-            instructions: Optional instructions for voice modulation
-            max_length: Maximum allowed text length in characters (default: 4096)
-            validate_length: Whether to validate text length (default: True)
-            **kwargs: Additional parameters
-        Returns:
-            TTSResponse: Generated audio response
-        Raises:
-            TTSException: If generation fails
-            ValueError: If text exceeds max_length and validate_length is True
-        """
-        # Create and validate request
-        request = TTSRequest(
-            input=sanitize_text(text),
-            voice=voice,
-            response_format=response_format,
-            instructions=instructions,
-            max_length=max_length,
-            validate_length=validate_length,
-            **kwargs
-        )
-        return self._make_request(request)
-    def generate_speech_from_request(self, request: TTSRequest) -> TTSResponse:
-        """
-        Generate speech from a TTSRequest object.
-        Args:
-            request: TTS request object
-        Returns:
-            TTSResponse: Generated audio response
-        """
-        return self._make_request(request)
-    def generate_speech_batch(
-        self,
-        text: str,
-        voice: Union[Voice, str] = Voice.ALLOY,
-        response_format: Union[AudioFormat, str] = AudioFormat.MP3,
-        instructions: Optional[str] = None,
-        max_length: int = 4096,
-        preserve_words: bool = True,
-        **kwargs
-    ) -> List[TTSResponse]:
-        """
-        Generate speech from long text by splitting it into chunks.
-        This method automatically splits text that exceeds max_length into
-        smaller chunks and generates speech for each chunk separately.
-        Args:
-            text: Text to convert to speech
-            voice: Voice to use for generation
-            response_format: Audio format for output
-            instructions: Optional instructions for voice modulation
-            max_length: Maximum length per chunk (default: 4096)
-            preserve_words: Whether to avoid splitting words (default: True)
-            **kwargs: Additional parameters
-        Returns:
-            List[TTSResponse]: List of generated audio responses
-        Raises:
-            TTSException: If generation fails for any chunk
-        """
-        # Sanitize text first
-        clean_text = sanitize_text(text)
-        # Split text into chunks
-        chunks = split_text_by_length(clean_text, max_length, preserve_words)
-        if not chunks:
-            raise ValueError("No valid text chunks found after processing")
-        responses = []
-        for i, chunk in enumerate(chunks):
-            logger.info(f"Processing chunk {i+1}/{len(chunks)} ({len(chunk)} characters)")
-            # Create request for this chunk (disable length validation since we already split)
-            request = TTSRequest(
-                input=chunk,
-                voice=voice,
-                response_format=response_format,
-                instructions=instructions,
-                max_length=max_length,
-                validate_length=False,  # We already split the text
-                **kwargs
-            )
-            response = self._make_request(request)
-            responses.append(response)
-        return responses
-    def _make_request(self, request: TTSRequest) -> TTSResponse:
-        """
-        Make the actual HTTP request to the openai.fm TTS service.
-        Args:
-            request: TTS request object
-        Returns:
-            TTSResponse: Generated audio response
-        Raises:
-            TTSException: If request fails
-        """
-        url = build_url(self.base_url, "api/generate")
-        # Prepare form data for openai.fm API
-        form_data = {
-            'input': request.input,
-            'voice': request.voice.value,
-            'generation': str(uuid.uuid4()),
-            'response_format': request.response_format.value if hasattr(request.response_format, 'value') else str(request.response_format)
-        }
-        # Add prompt/instructions if provided
-        if request.instructions:
-            form_data['prompt'] = request.instructions
-        else:
-            # Default prompt for better quality
-            form_data['prompt'] = (
-                "Affect/personality: Natural and clear\n\n"
-                "Tone: Friendly and professional, creating a pleasant listening experience.\n\n"
-                "Pronunciation: Clear, articulate, and steady, ensuring each word is easily understood "
-                "while maintaining a natural, conversational flow.\n\n"
-                "Pause: Brief, purposeful pauses between sentences to allow time for the listener "
-                "to process the information.\n\n"
-                "Emotion: Warm and engaging, conveying the intended message effectively."
-            )
-        # Get optimized headers for the requested format
-        # Convert string format to AudioFormat enum if needed
-        requested_format = request.response_format
-        if isinstance(requested_format, str):
-            try:
-                requested_format = AudioFormat(requested_format.lower())
-            except ValueError:
-                requested_format = AudioFormat.WAV  # Default to WAV for unknown formats
-        format_headers = self._get_headers_for_format(requested_format)
-        logger.info(f"Generating speech for text: '{request.input[:50]}...' with voice: {request.voice}")
-        logger.debug(f"Using headers optimized for {requested_format.value} format")
-        # Make request with retries
-        for attempt in range(self.max_retries + 1):
-            try:
-                # Add random delay for rate limiting (except first attempt)
-                if attempt > 0:
-                    delay = exponential_backoff(attempt - 1)
-                    logger.info(f"Retrying request after {delay:.2f}s (attempt {attempt + 1})")
-                    time.sleep(delay)
-                # Use multipart form data as required by openai.fm
-                response = self.session.post(
-                    url,
-                    data=form_data,
-                    headers=format_headers,
-                    timeout=self.timeout,
-                    verify=self.verify_ssl
-                )
-                # Handle different response types
-                if response.status_code == 200:
-                    return self._process_openai_fm_response(response, request)
-                else:
-                    # Try to parse error response
-                    try:
-                        error_data = response.json()
-                    except (json.JSONDecodeError, ValueError):
-                        error_data = {"error": {"message": response.text or "Unknown error"}}
-                    # Create appropriate exception
-                    exception = create_exception_from_response(
-                        response.status_code,
-                        error_data,
-                        f"TTS request failed with status {response.status_code}"
-                    )
-                    # Don't retry for certain errors
-                    if response.status_code in [400, 401, 403, 404]:
-                        raise exception
-                    # For retryable errors, continue to next attempt
-                    if attempt == self.max_retries:
-                        raise exception
-                    logger.warning(f"Request failed with status {response.status_code}, retrying...")
-                    continue
-            except requests.exceptions.Timeout:
-                if attempt == self.max_retries:
-                    raise NetworkException(
-                        f"Request timed out after {self.timeout}s",
-                        timeout=self.timeout,
-                        retry_count=attempt
-                    )
-                logger.warning(f"Request timed out, retrying...")
-                continue
-            except requests.exceptions.ConnectionError as e:
-                if attempt == self.max_retries:
-                    raise NetworkException(
-                        f"Connection error: {str(e)}",
-                        retry_count=attempt
-                    )
-                logger.warning(f"Connection error, retrying...")
-                continue
-            except requests.exceptions.RequestException as e:
-                if attempt == self.max_retries:
-                    raise NetworkException(
-                        f"Request error: {str(e)}",
-                        retry_count=attempt
-                    )
-                logger.warning(f"Request error, retrying...")
-                continue
-        # This should never be reached, but just in case
-        raise TTSException("Maximum retries exceeded")
-    def _process_openai_fm_response(self, response: requests.Response, request: TTSRequest) -> TTSResponse:
-        """
-        Process a successful response from the openai.fm TTS service.
-        Args:
-            response: HTTP response object
-            request: Original TTS request
-        Returns:
-            TTSResponse: Processed response object
-        """
-        # Get content type from response headers
-        content_type = response.headers.get("content-type", "audio/mpeg")
-        # Get audio data
-        audio_data = response.content
-        if not audio_data:
-            raise APIException("Received empty audio data from openai.fm")
-        # Determine format from content type
-        if "audio/mpeg" in content_type or "audio/mp3" in content_type:
-            actual_format = AudioFormat.MP3
-        elif "audio/wav" in content_type:
-            actual_format = AudioFormat.WAV
-        elif "audio/opus" in content_type:
-            actual_format = AudioFormat.OPUS
-        elif "audio/aac" in content_type:
-            actual_format = AudioFormat.AAC
-        elif "audio/flac" in content_type:
-            actual_format = AudioFormat.FLAC
-        else:
-            # Default to MP3 for openai.fm
-            actual_format = AudioFormat.MP3
-        # Estimate duration based on text length (rough approximation)
-        estimated_duration = estimate_audio_duration(request.input)
-        # Check if returned format differs from requested format
-        requested_format = request.response_format
-        if isinstance(requested_format, str):
-            try:
-                requested_format = AudioFormat(requested_format.lower())
-            except ValueError:
-                requested_format = AudioFormat.WAV  # Default fallback
-        # Import here to avoid circular imports
-        from .models import get_supported_format, maps_to_wav
-        # Check if format differs from request
-        if actual_format != requested_format:
-            if maps_to_wav(requested_format.value) and actual_format.value == "wav":
-                logger.debug(
-                    f"Format '{requested_format.value}' requested, returning WAV format."
-                )
-            else:
-                logger.warning(
-                    f"Requested format '{requested_format.value}' but received '{actual_format.value}' "
-                    f"from service."
-                )
-        # Create response object
-        tts_response = TTSResponse(
-            audio_data=audio_data,
-            content_type=content_type,
-            format=actual_format,
-            size=len(audio_data),
-            duration=estimated_duration,
-            metadata={
-                "response_headers": dict(response.headers),
-                "status_code": response.status_code,
-                "url": str(response.url),
-                "service": "openai.fm",
-                "voice": request.voice.value,
-                "original_text": request.input[:100] + "..." if len(request.input) > 100 else request.input,
-                "requested_format": requested_format.value,
-                "actual_format": actual_format.value
-            }
-        )
-        logger.info(
-            f"Successfully generated {format_file_size(len(audio_data))} "
-            f"of {actual_format.value.upper()} audio from openai.fm using voice '{request.voice.value}'"
-        )
-        return tts_response
-    def close(self):
-        """Close the HTTP session."""
-        if hasattr(self, 'session'):
-            self.session.close()
-    def __enter__(self):
-        """Context manager entry."""
-        return self
-    def __exit__(self, exc_type, exc_val, exc_tb):
-        """Context manager exit."""
-        self.close()

+"""
+Main TTS client implementation.
+This module provides the primary TTSClient class for synchronous
+text-to-speech generation with an OpenAI-compatible API.
+"""
+import json
+import time
+import uuid
+import logging
+from typing import Optional, Dict, Any, Union, List
+from urllib.parse import urljoin
+import requests
+from requests.adapters import HTTPAdapter
+from urllib3.util.retry import Retry
+from .models import (
+    TTSRequest, TTSResponse, Voice, AudioFormat,
+    get_content_type, get_format_from_content_type
+)
+from .exceptions import (
+    TTSException, APIException, NetworkException, ValidationException,
+    create_exception_from_response
+)
+from .utils import (
+    get_realistic_headers, sanitize_text, validate_url, build_url,
+    exponential_backoff, estimate_audio_duration, format_file_size,
+    validate_text_length, split_text_by_length
+)
+logger = logging.getLogger(__name__)
+class TTSClient:
+    """
+    Synchronous TTS client for text-to-speech generation.
+    This client provides a simple interface for generating speech from text
+    using OpenAI-compatible TTS services.
+    Attributes:
+        base_url: Base URL for the TTS service
+        api_key: API key for authentication (if required)
+        timeout: Request timeout in seconds
+        max_retries: Maximum number of retry attempts
+        verify_ssl: Whether to verify SSL certificates
+    """
+    def __init__(
+        self,
+        base_url: str = "https://www.openai.fm",
+        api_key: Optional[str] = None,
+        timeout: float = 30.0,
+        max_retries: int = 3,
+        verify_ssl: bool = True,
+        preferred_format: Optional[AudioFormat] = None,
+        **kwargs
+    ):
+        """
+        Initialize the TTS client.
+        Args:
+            base_url: Base URL for the TTS service
+            api_key: API key for authentication
+            timeout: Request timeout in seconds
+            max_retries: Maximum retry attempts
+            verify_ssl: Whether to verify SSL certificates
+            preferred_format: Preferred audio format (affects header selection)
+            **kwargs: Additional configuration options
+        """
+        self.base_url = base_url.rstrip('/')
+        self.api_key = api_key
+        self.timeout = timeout
+        self.max_retries = max_retries
+        self.verify_ssl = verify_ssl
+        self.preferred_format = preferred_format or AudioFormat.WAV
+        # Validate base URL
+        if not validate_url(self.base_url):
+            raise ValidationException(f"Invalid base URL: {self.base_url}")
+        # Setup HTTP session with retry strategy
+        self.session = requests.Session()
+        # Configure retry strategy
+        retry_strategy = Retry(
+            total=max_retries,
+            status_forcelist=[429, 500, 502, 503, 504],
+            allowed_methods=["HEAD", "GET", "POST"],  # replaces the deprecated method_whitelist (urllib3 >= 1.26)
+            backoff_factor=1
+        )
+        adapter = HTTPAdapter(
+            max_retries=retry_strategy,
+            pool_connections=10,
+            pool_maxsize=10
+        )
+        self.session.mount("http://", adapter)
+        self.session.mount("https://", adapter)
+        # Set default headers
+        self.session.headers.update(get_realistic_headers())
+        if self.api_key:
+            self.session.headers["Authorization"] = f"Bearer {self.api_key}"
+        logger.info(f"Initialized TTS client with base URL: {self.base_url}")
+    def _get_headers_for_format(self, requested_format: AudioFormat) -> Dict[str, str]:
+        """
+        Return the request headers that elicit the desired format from openai.fm.
+        Based on testing, openai.fm returns:
+        - MP3: When using no headers or very minimal headers
+        - WAV: When using more complex headers with specific Accept values
+        Args:
+            requested_format: The desired audio format
+        Returns:
+            Dict[str, str]: HTTP headers optimized for the requested format
+        """
+        from .models import get_supported_format
+        # Map requested format to supported format
+        target_format = get_supported_format(requested_format)
+        if target_format == AudioFormat.MP3:
+            # Use minimal headers to reliably get MP3 response
+            # Testing shows that no headers or very basic headers work best for MP3
+            return {
+                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
+            }
+        else:
+            # Use more complex headers to get WAV response
+            # This works for WAV, OPUS, AAC, FLAC, PCM formats
+            return {
+                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36',
+                'Accept': 'audio/*,*/*;q=0.9'
+            }
+    def generate_speech(
+        self,
+        text: str,
+        voice: Union[Voice, str] = Voice.ALLOY,
+        response_format: Union[AudioFormat, str] = AudioFormat.MP3,
+        instructions: Optional[str] = None,
+        max_length: int = 4096,
+        validate_length: bool = True,
+        **kwargs
+    ) -> TTSResponse:
+        """
+        Generate speech from text.
+        Args:
+            text: Text to convert to speech
+            voice: Voice to use for generation
+            response_format: Audio format for output
+            instructions: Optional instructions for voice modulation
+            max_length: Maximum allowed text length in characters (default: 4096)
+            validate_length: Whether to validate text length (default: True)
+            **kwargs: Additional parameters
+        Returns:
+            TTSResponse: Generated audio response
+        Raises:
+            TTSException: If generation fails
+            ValueError: If text exceeds max_length and validate_length is True
+        """
+        # Create and validate request
+        request = TTSRequest(
+            input=sanitize_text(text),
+            voice=voice,
+            response_format=response_format,
+            instructions=instructions,
+            max_length=max_length,
+            validate_length=validate_length,
+            **kwargs
+        )
+        return self._make_request(request)
+    def generate_speech_from_request(self, request: TTSRequest) -> TTSResponse:
+        """
+        Generate speech from a TTSRequest object.
+        Args:
+            request: TTS request object
+        Returns:
+            TTSResponse: Generated audio response
+        """
+        return self._make_request(request)
+    def generate_speech_batch(
+        self,
+        text: str,
+        voice: Union[Voice, str] = Voice.ALLOY,
+        response_format: Union[AudioFormat, str] = AudioFormat.MP3,
+        instructions: Optional[str] = None,
+        max_length: int = 4096,
+        preserve_words: bool = True,
+        **kwargs
+    ) -> List[TTSResponse]:
+        """
+        Generate speech from long text by splitting it into chunks.
+        This method automatically splits text that exceeds max_length into
+        smaller chunks and generates speech for each chunk separately.
+        Args:
+            text: Text to convert to speech
+            voice: Voice to use for generation
+            response_format: Audio format for output
+            instructions: Optional instructions for voice modulation
+            max_length: Maximum length per chunk (default: 4096)
+            preserve_words: Whether to avoid splitting words (default: True)
+            **kwargs: Additional parameters
+        Returns:
+            List[TTSResponse]: List of generated audio responses
+        Raises:
+            TTSException: If generation fails for any chunk
+            ValueError: If no text chunks remain after sanitizing and splitting
+        """
+        # Sanitize text first
+        clean_text = sanitize_text(text)
+        # Split text into chunks
+        chunks = split_text_by_length(clean_text, max_length, preserve_words)
+        if not chunks:
+            raise ValueError("No valid text chunks found after processing")
+        responses = []
+        for i, chunk in enumerate(chunks):
+            logger.info(f"Processing chunk {i+1}/{len(chunks)} ({len(chunk)} characters)")
+            # Create request for this chunk (disable length validation since we already split)
+            request = TTSRequest(
+                input=chunk,
+                voice=voice,
+                response_format=response_format,
+                instructions=instructions,
+                max_length=max_length,
+                validate_length=False,  # We already split the text
+                **kwargs
+            )
+            response = self._make_request(request)
+            responses.append(response)
+        return responses
+    def generate_speech_long_text(
+        self,
+        text: str,
+        voice: Union[Voice, str] = Voice.ALLOY,
+        response_format: Union[AudioFormat, str] = AudioFormat.MP3,
+        instructions: Optional[str] = None,
+        max_length: int = 4096,
+        preserve_words: bool = True,
+        **kwargs
+    ) -> List[TTSResponse]:
+        """
+        Generate speech from long text by splitting it into chunks.
+        This is an alias for generate_speech_batch for consistency with AsyncTTSClient.
+        Automatically splits text that exceeds max_length into smaller chunks
+        and generates speech for each chunk separately.
+        Args:
+            text: Text to convert to speech
+            voice: Voice to use for generation
+            response_format: Audio format for output
+            instructions: Optional instructions for voice modulation
+            max_length: Maximum length per chunk (default: 4096)
+            preserve_words: Whether to avoid splitting words (default: True)
+            **kwargs: Additional parameters
+        Returns:
+            List[TTSResponse]: List of generated audio responses
+        Raises:
+            TTSException: If generation fails for any chunk
+            ValueError: If no text chunks remain after sanitizing and splitting
+        """
+        return self.generate_speech_batch(
+            text=text,
+            voice=voice,
+            response_format=response_format,
+            instructions=instructions,
+            max_length=max_length,
+            preserve_words=preserve_words,
+            **kwargs
+        )
+    def _make_request(self, request: TTSRequest) -> TTSResponse:
+        """
+        Make the actual HTTP request to the openai.fm TTS service.
+        Args:
+            request: TTS request object
+        Returns:
+            TTSResponse: Generated audio response
+        Raises:
+            TTSException: If request fails
+        """
+        url = build_url(self.base_url, "api/generate")
+        # Prepare form data for openai.fm API
+        form_data = {
+            'input': request.input,
+            'voice': request.voice.value,
+            'generation': str(uuid.uuid4()),
+            'response_format': request.response_format.value if hasattr(request.response_format, 'value') else str(request.response_format)
+        }
+        # Add prompt/instructions if provided
+        if request.instructions:
+            form_data['prompt'] = request.instructions
+        else:
+            # Default prompt for better quality
+            form_data['prompt'] = (
+                "Affect/personality: Natural and clear\n\n"
+                "Tone: Friendly and professional, creating a pleasant listening experience.\n\n"
+                "Pronunciation: Clear, articulate, and steady, ensuring each word is easily understood "
+                "while maintaining a natural, conversational flow.\n\n"
+                "Pause: Brief, purposeful pauses between sentences to allow time for the listener "
+                "to process the information.\n\n"
+                "Emotion: Warm and engaging, conveying the intended message effectively."
+            )
+        # Get optimized headers for the requested format
+        # Convert string format to AudioFormat enum if needed
+        requested_format = request.response_format
+        if isinstance(requested_format, str):
+            try:
+                requested_format = AudioFormat(requested_format.lower())
+            except ValueError:
+                requested_format = AudioFormat.WAV  # Default to WAV for unknown formats
+        format_headers = self._get_headers_for_format(requested_format)
+        logger.info(f"Generating speech for text: '{request.input[:50]}...' with voice: {request.voice}")
+        logger.debug(f"Using headers optimized for {requested_format.value} format")
+        # Make request with retries
+        for attempt in range(self.max_retries + 1):
+            try:
+                # Back off before each retry (no delay on the first attempt)
+                if attempt > 0:
+                    delay = exponential_backoff(attempt - 1)
+                    logger.info(f"Retrying request after {delay:.2f}s (attempt {attempt + 1})")
+                    time.sleep(delay)
+                # Send the form fields; a dict passed via data= is sent as
+                # application/x-www-form-urlencoded, not multipart
+                response = self.session.post(
+                    url,
+                    data=form_data,
+                    headers=format_headers,
+                    timeout=self.timeout,
+                    verify=self.verify_ssl
+                )
+                # Handle different response types
+                if response.status_code == 200:
+                    return self._process_openai_fm_response(response, request)
+                else:
+                    # Try to parse error response
+                    try:
+                        error_data = response.json()
+                    except (json.JSONDecodeError, ValueError):
+                        error_data = {"error": {"message": response.text or "Unknown error"}}
+                    # Create appropriate exception
+                    exception = create_exception_from_response(
+                        response.status_code,
+                        error_data,
+                        f"TTS request failed with status {response.status_code}"
+                    )
+                    # Don't retry for certain errors
+                    if response.status_code in [400, 401, 403, 404]:
+                        raise exception
+                    # For retryable errors, continue to next attempt
+                    if attempt == self.max_retries:
+                        raise exception
+                    logger.warning(f"Request failed with status {response.status_code}, retrying...")
+                    continue
+            except requests.exceptions.Timeout:
+                if attempt == self.max_retries:
+                    raise NetworkException(
+                        f"Request timed out after {self.timeout}s",
+                        timeout=self.timeout,
+                        retry_count=attempt
+                    )
+                logger.warning("Request timed out, retrying...")
+                continue
+            except requests.exceptions.ConnectionError as e:
+                if attempt == self.max_retries:
+                    raise NetworkException(
+                        f"Connection error: {str(e)}",
+                        retry_count=attempt
+                    )
+                logger.warning("Connection error, retrying...")
+                continue
+            except requests.exceptions.RequestException as e:
+                if attempt == self.max_retries:
+                    raise NetworkException(
+                        f"Request error: {str(e)}",
+                        retry_count=attempt
+                    )
+                logger.warning("Request error, retrying...")
+                continue
+        # This should never be reached, but just in case
+        raise TTSException("Maximum retries exceeded")
+    def _process_openai_fm_response(self, response: requests.Response, request: TTSRequest) -> TTSResponse:
+        """
+        Process a successful response from the openai.fm TTS service.
+        Args:
+            response: HTTP response object
+            request: Original TTS request
+        Returns:
+            TTSResponse: Processed response object
+        """
+        # Get content type from response headers
+        content_type = response.headers.get("content-type", "audio/mpeg")
+        # Get audio data
+        audio_data = response.content
+        if not audio_data:
+            raise APIException("Received empty audio data from openai.fm")
+        # Determine format from content type
+        if "audio/mpeg" in content_type or "audio/mp3" in content_type:
+            actual_format = AudioFormat.MP3
+        elif "audio/wav" in content_type:
+            actual_format = AudioFormat.WAV
+        elif "audio/opus" in content_type:
+            actual_format = AudioFormat.OPUS
+        elif "audio/aac" in content_type:
+            actual_format = AudioFormat.AAC
+        elif "audio/flac" in content_type:
+            actual_format = AudioFormat.FLAC
+        else:
+            # Default to MP3 for openai.fm
+            actual_format = AudioFormat.MP3
+        # Estimate duration based on text length (rough approximation)
+        estimated_duration = estimate_audio_duration(request.input)
+        # Check if returned format differs from requested format
+        requested_format = request.response_format
+        if isinstance(requested_format, str):
+            try:
+                requested_format = AudioFormat(requested_format.lower())
+            except ValueError:
+                requested_format = AudioFormat.WAV  # Default fallback
+        # Import here to avoid circular imports
+        from .models import get_supported_format, maps_to_wav
+        # Check if format differs from request
+        if actual_format != requested_format:
+            if maps_to_wav(requested_format.value) and actual_format.value == "wav":
+                logger.debug(
+                    f"Format '{requested_format.value}' requested, returning WAV format."
+                )
+            else:
+                logger.warning(
+                    f"Requested format '{requested_format.value}' but received '{actual_format.value}' "
+                    f"from service."
+                )
+        # Create response object
+        tts_response = TTSResponse(
+            audio_data=audio_data,
+            content_type=content_type,
+            format=actual_format,
+            size=len(audio_data),
+            duration=estimated_duration,
+            metadata={
+                "response_headers": dict(response.headers),
+                "status_code": response.status_code,
+                "url": str(response.url),
+                "service": "openai.fm",
+                "voice": request.voice.value,
+                "original_text": request.input[:100] + "..." if len(request.input) > 100 else request.input,
+                "requested_format": requested_format.value,
+                "actual_format": actual_format.value
+            }
+        )
+        logger.info(
+            f"Successfully generated {format_file_size(len(audio_data))} "
+            f"of {actual_format.value.upper()} audio from openai.fm using voice '{request.voice.value}'"
+        )
+        return tts_response
+    def close(self):
+        """Close the HTTP session."""
+        if hasattr(self, 'session'):
+            self.session.close()
+    def __enter__(self):
+        """Context manager entry."""
+        return self
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        """Context manager exit."""
+        self.close()
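
Review note: `_process_openai_fm_response` infers the returned format with a chain of substring checks on the `Content-Type` header and falls back to MP3. A table-driven variant is sketched below as one possible alternative; it is not code from this commit. `detect_format` and `MEDIA_TYPE_TO_FORMAT` are hypothetical names, and `AudioFormat` is redefined locally (mirroring `ttsfm/models.py`) only so the snippet runs on its own.

```python
from enum import Enum


class AudioFormat(str, Enum):
    """Local stand-in mirroring the enum in ttsfm/models.py."""
    MP3 = "mp3"
    WAV = "wav"
    OPUS = "opus"
    AAC = "aac"
    FLAC = "flac"


# Media type -> format table; "audio/mp3" is kept because the diff accepts it too.
MEDIA_TYPE_TO_FORMAT = {
    "audio/mpeg": AudioFormat.MP3,
    "audio/mp3": AudioFormat.MP3,
    "audio/wav": AudioFormat.WAV,
    "audio/opus": AudioFormat.OPUS,
    "audio/aac": AudioFormat.AAC,
    "audio/flac": AudioFormat.FLAC,
}


def detect_format(content_type: str, default: AudioFormat = AudioFormat.MP3) -> AudioFormat:
    """Hypothetical helper: map a Content-Type header value to an AudioFormat."""
    # Drop parameters such as "; charset=binary" and normalize case
    # before the exact-match lookup.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return MEDIA_TYPE_TO_FORMAT.get(media_type, default)


print(detect_format("audio/wav; charset=binary").value)
print(detect_format("application/octet-stream").value)
```

The lookup keeps the MP3 default used in the diff, so an unrecognized header still produces a usable `TTSResponse`; whether that default is the right call for non-audio responses is left open here.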

ttsfm/exceptions.py CHANGED

@@ -1,243 +1,243 @@
-"""
-Exception classes for the TTSFM package.
-This module defines the exception hierarchy used throughout the package
-for consistent error handling and reporting.
-"""
-from typing import Optional, Dict, Any
-class TTSException(Exception):
-    """
-    Base exception class for all TTSFM-related errors.
-    Attributes:
-        message: Human-readable error message
-        code: Error code for programmatic handling
-        details: Additional error details
-    """
-    def __init__(
-        self,
-        message: str,
-        code: Optional[str] = None,
-        details: Optional[Dict[str, Any]] = None
-    ):
-        super().__init__(message)
-        self.message = message
-        self.code = code or self.__class__.__name__
-        self.details = details or {}
-    def __str__(self) -> str:
-        if self.code:
-            return f"[{self.code}] {self.message}"
-        return self.message
-    def __repr__(self) -> str:
-        return f"{self.__class__.__name__}(message='{self.message}', code='{self.code}')"
-class APIException(TTSException):
-    """
-    Exception raised for API-related errors.
-    This includes HTTP errors, invalid responses, and server-side issues.
-    """
-    def __init__(
-        self,
-        message: str,
-        status_code: Optional[int] = None,
-        response_data: Optional[Dict[str, Any]] = None,
-        **kwargs
-    ):
-        super().__init__(message, **kwargs)
-        self.status_code = status_code
-        self.response_data = response_data or {}
-    def __str__(self) -> str:
-        if self.status_code:
-            return f"[HTTP {self.status_code}] {self.message}"
-        return super().__str__()
-class NetworkException(TTSException):
-    """
-    Exception raised for network-related errors.
-    This includes connection timeouts, DNS resolution failures, and other
-    network connectivity issues.
-    """
-    def __init__(
-        self,
-        message: str,
-        timeout: Optional[float] = None,
-        retry_count: int = 0,
-        **kwargs
-    ):
-        super().__init__(message, **kwargs)
-        self.timeout = timeout
-        self.retry_count = retry_count
-class ValidationException(TTSException):
-    """
-    Exception raised for input validation errors.
-    This includes invalid parameters, missing required fields, and
-    data format issues.
-    """
-    def __init__(
-        self,
-        message: str,
-        field: Optional[str] = None,
-        value: Optional[Any] = None,
-        **kwargs
-    ):
-        super().__init__(message, **kwargs)
-        self.field = field
-        self.value = value
-    def __str__(self) -> str:
-        if self.field:
-            return f"Validation error for '{self.field}': {self.message}"
-        return f"Validation error: {self.message}"
-class RateLimitException(APIException):
-    """
-    Exception raised when API rate limits are exceeded.
-    Attributes:
-        retry_after: Seconds to wait before retrying (if provided by server)
-        limit: Rate limit that was exceeded
-        remaining: Remaining requests in current window
-    """
-    def __init__(
-        self,
-        message: str = "Rate limit exceeded",
-        retry_after: Optional[int] = None,
-        limit: Optional[int] = None,
-        remaining: Optional[int] = None,
-        **kwargs
-    ):
-        super().__init__(message, status_code=429, **kwargs)
-        self.retry_after = retry_after
-        self.limit = limit
-        self.remaining = remaining
-    def __str__(self) -> str:
-        msg = super().__str__()
-        if self.retry_after:
-            msg += f" (retry after {self.retry_after}s)"
-        return msg
-class AuthenticationException(APIException):
-    """
-    Exception raised for authentication and authorization errors.
-    This includes invalid API keys, expired tokens, and insufficient
-    permissions.
-    """
-    def __init__(
-        self,
-        message: str = "Authentication failed",
-        **kwargs
-    ):
-        super().__init__(message, status_code=401, **kwargs)
-class ServiceUnavailableException(APIException):
-    """
-    Exception raised when the TTS service is temporarily unavailable.
-    This includes server maintenance, overload conditions, and
-    temporary service outages.
-    """
-    def __init__(
-        self,
-        message: str = "Service temporarily unavailable",
-        retry_after: Optional[int] = None,
-        **kwargs
-    ):
-        super().__init__(message, status_code=503, **kwargs)
-        self.retry_after = retry_after
-class QuotaExceededException(APIException):
-    """
-    Exception raised when usage quotas are exceeded.
-    This includes monthly limits, character limits, and other
-    usage-based restrictions.
-    """
-    def __init__(
-        self,
-        message: str = "Usage quota exceeded",
-        quota_type: Optional[str] = None,
-        limit: Optional[int] = None,
-        used: Optional[int] = None,
-        **kwargs
-    ):
-        super().__init__(message, status_code=402, **kwargs)
-        self.quota_type = quota_type
-        self.limit = limit
-        self.used = used
-class AudioProcessingException(TTSException):
-    """
-    Exception raised for audio processing errors.
-    This includes format conversion issues, audio generation failures,
-    and output processing problems.
-    """
-    def __init__(
-        self,
-        message: str,
-        audio_format: Optional[str] = None,
-        **kwargs
-    ):
-        super().__init__(message, **kwargs)
-        self.audio_format = audio_format
-def create_exception_from_response(
-    status_code: int,
-    response_data: Dict[str, Any],
-    default_message: str = "API request failed"
-) -> APIException:
-    """
-    Create appropriate exception from API response.
-    Args:
-        status_code: HTTP status code
-        response_data: Response data from API
-        default_message: Default message if none in response
-    Returns:
-        APIException: Appropriate exception instance
-    """
-    message = response_data.get("error", {}).get("message", default_message)
-    if status_code == 401:
-        return AuthenticationException(message, response_data=response_data)
-    elif status_code == 402:
-        return QuotaExceededException(message, response_data=response_data)
-    elif status_code == 429:
-        retry_after = response_data.get("retry_after")
-        return RateLimitException(message, retry_after=retry_after, response_data=response_data)
-    elif status_code == 503:
-        retry_after = response_data.get("retry_after")
-        return ServiceUnavailableException(message, retry_after=retry_after, response_data=response_data)
-    else:
-        return APIException(message, status_code=status_code, response_data=response_data)

+"""
+Exception classes for the TTSFM package.
+This module defines the exception hierarchy used throughout the package
+for consistent error handling and reporting.
+"""
+from typing import Optional, Dict, Any
+class TTSException(Exception):
+    """
+    Base exception class for all TTSFM-related errors.
+    Attributes:
+        message: Human-readable error message
+        code: Error code for programmatic handling
+        details: Additional error details
+    """
+    def __init__(
+        self,
+        message: str,
+        code: Optional[str] = None,
+        details: Optional[Dict[str, Any]] = None
+    ):
+        super().__init__(message)
+        self.message = message
+        self.code = code or self.__class__.__name__
+        self.details = details or {}
+    def __str__(self) -> str:
+        if self.code:
+            return f"[{self.code}] {self.message}"
+        return self.message
+    def __repr__(self) -> str:
+        return f"{self.__class__.__name__}(message='{self.message}', code='{self.code}')"
+class APIException(TTSException):
+    """
+    Exception raised for API-related errors.
+    This includes HTTP errors, invalid responses, and server-side issues.
+    """
+    def __init__(
+        self,
+        message: str,
+        status_code: Optional[int] = None,
+        response_data: Optional[Dict[str, Any]] = None,
+        **kwargs
+    ):
+        super().__init__(message, **kwargs)
+        self.status_code = status_code
+        self.response_data = response_data or {}
+    def __str__(self) -> str:
+        if self.status_code:
+            return f"[HTTP {self.status_code}] {self.message}"
+        return super().__str__()
+class NetworkException(TTSException):
+    """
+    Exception raised for network-related errors.
+    This includes connection timeouts, DNS resolution failures, and other
+    network connectivity issues.
+    """
+    def __init__(
+        self,
+        message: str,
+        timeout: Optional[float] = None,
+        retry_count: int = 0,
+        **kwargs
+    ):
+        super().__init__(message, **kwargs)
+        self.timeout = timeout
+        self.retry_count = retry_count
+class ValidationException(TTSException):
+    """
+    Exception raised for input validation errors.
+    This includes invalid parameters, missing required fields, and
+    data format issues.
+    """
+    def __init__(
+        self,
+        message: str,
+        field: Optional[str] = None,
+        value: Optional[Any] = None,
+        **kwargs
+    ):
+        super().__init__(message, **kwargs)
+        self.field = field
+        self.value = value
+    def __str__(self) -> str:
+        if self.field:
+            return f"Validation error for '{self.field}': {self.message}"
+        return f"Validation error: {self.message}"
+class RateLimitException(APIException):
+    """
+    Exception raised when API rate limits are exceeded.
+    Attributes:
+        retry_after: Seconds to wait before retrying (if provided by server)
+        limit: Rate limit that was exceeded
+        remaining: Remaining requests in current window
+    """
+    def __init__(
+        self,
+        message: str = "Rate limit exceeded",
+        retry_after: Optional[int] = None,
+        limit: Optional[int] = None,
+        remaining: Optional[int] = None,
+        **kwargs
+    ):
+        super().__init__(message, status_code=429, **kwargs)
+        self.retry_after = retry_after
+        self.limit = limit
+        self.remaining = remaining
+    def __str__(self) -> str:
+        msg = super().__str__()
+        if self.retry_after:
+            msg += f" (retry after {self.retry_after}s)"
+        return msg
+class AuthenticationException(APIException):
+    """
+    Exception raised for authentication and authorization errors.
+    This includes invalid API keys, expired tokens, and insufficient
+    permissions.
+    """
+    def __init__(
+        self,
+        message: str = "Authentication failed",
+        **kwargs
+    ):
+        super().__init__(message, status_code=401, **kwargs)
+class ServiceUnavailableException(APIException):
+    """
+    Exception raised when the TTS service is temporarily unavailable.
+    This includes server maintenance, overload conditions, and
+    temporary service outages.
+    """
+    def __init__(
+        self,
+        message: str = "Service temporarily unavailable",
+        retry_after: Optional[int] = None,
+        **kwargs
+    ):
+        super().__init__(message, status_code=503, **kwargs)
+        self.retry_after = retry_after
+class QuotaExceededException(APIException):
+    """
+    Exception raised when usage quotas are exceeded.
+    This includes monthly limits, character limits, and other
+    usage-based restrictions.
+    """
+    def __init__(
+        self,
+        message: str = "Usage quota exceeded",
+        quota_type: Optional[str] = None,
+        limit: Optional[int] = None,
+        used: Optional[int] = None,
+        **kwargs
+    ):
+        super().__init__(message, status_code=402, **kwargs)
+        self.quota_type = quota_type
+        self.limit = limit
+        self.used = used
+class AudioProcessingException(TTSException):
+    """
+    Exception raised for audio processing errors.
+    This includes format conversion issues, audio generation failures,
+    and output processing problems.
+    """
+    def __init__(
+        self,
+        message: str,
+        audio_format: Optional[str] = None,
+        **kwargs
+    ):
+        super().__init__(message, **kwargs)
+        self.audio_format = audio_format
+def create_exception_from_response(
+    status_code: int,
+    response_data: Dict[str, Any],
+    default_message: str = "API request failed"
+) -> APIException:
+    """
+    Create appropriate exception from API response.
+    Args:
+        status_code: HTTP status code
+        response_data: Response data from API
+        default_message: Default message if none in response
+    Returns:
+        APIException: Appropriate exception instance
+    """
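+    # Guard against a non-dict "error" field (for example a plain string)
+    error_info = response_data.get("error")
+    message = error_info.get("message", default_message) if isinstance(error_info, dict) else default_message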
+    if status_code == 401:
+        return AuthenticationException(message, response_data=response_data)
+    elif status_code == 402:
+        return QuotaExceededException(message, response_data=response_data)
+    elif status_code == 429:
+        retry_after = response_data.get("retry_after")
+        return RateLimitException(message, retry_after=retry_after, response_data=response_data)
+    elif status_code == 503:
+        retry_after = response_data.get("retry_after")
+        return ServiceUnavailableException(message, retry_after=retry_after, response_data=response_data)
+    else:
+        return APIException(message, status_code=status_code, response_data=response_data)
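
Review note: the factory above picks an exception subclass from the HTTP status code, and callers can then branch on the subclass rather than on the raw code. The sketch below illustrates that pattern with trimmed-down stand-ins for two of the classes; it is an assumption-laden miniature, not the package's code. `extract_message` and `create_exception` are hypothetical names, and treating a plain-string `error` field as the message is a choice made for this example only.

```python
from typing import Any, Dict, Optional


class APIException(Exception):
    """Trimmed stand-in for the class of the same name in ttsfm/exceptions.py."""

    def __init__(self, message: str, status_code: Optional[int] = None,
                 response_data: Optional[Dict[str, Any]] = None):
        super().__init__(message)
        self.message = message
        self.status_code = status_code
        self.response_data = response_data or {}


class RateLimitException(APIException):
    """Trimmed stand-in: always carries status 429 plus an optional retry hint."""

    def __init__(self, message: str = "Rate limit exceeded",
                 retry_after: Optional[int] = None, **kwargs: Any):
        super().__init__(message, status_code=429, **kwargs)
        self.retry_after = retry_after


def extract_message(response_data: Dict[str, Any],
                    default_message: str = "API request failed") -> str:
    """Hypothetical helper: read error.message, tolerating a non-dict 'error'."""
    error = response_data.get("error")
    if isinstance(error, dict):
        return error.get("message", default_message)
    if isinstance(error, str) and error:
        return error
    return default_message


def create_exception(status_code: int, response_data: Dict[str, Any]) -> APIException:
    """Hypothetical miniature of the factory: only the 429 branch is modeled."""
    message = extract_message(response_data)
    if status_code == 429:
        return RateLimitException(message,
                                  retry_after=response_data.get("retry_after"),
                                  response_data=response_data)
    return APIException(message, status_code=status_code, response_data=response_data)


try:
    raise create_exception(429, {"error": {"message": "slow down"}, "retry_after": 5})
except RateLimitException as exc:
    # The subclass exposes the retry hint; generic handlers can still catch APIException.
    print(exc.status_code, exc.retry_after, exc.message)
```

Because `RateLimitException` derives from `APIException`, the more specific `except` clause has to come first when both are handled in the same `try` block.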

ttsfm/models.py CHANGED

@@ -1,283 +1,283 @@
-"""
-Data models and types for the TTSFM package.
-This module defines the core data structures used throughout the package,
-including request/response models, enums, and error types.
-"""
-from enum import Enum
-from typing import Optional, Dict, Any, Union
-from dataclasses import dataclass
-from datetime import datetime
-class Voice(str, Enum):
-    """Available voice options for TTS generation."""
-    ALLOY = "alloy"
-    ASH = "ash"
-    BALLAD = "ballad"
-    CORAL = "coral"
-    ECHO = "echo"
-    FABLE = "fable"
-    NOVA = "nova"
-    ONYX = "onyx"
-    SAGE = "sage"
-    SHIMMER = "shimmer"
-    VERSE = "verse"
-class AudioFormat(str, Enum):
-    """Supported audio output formats."""
-    MP3 = "mp3"
-    WAV = "wav"
-    OPUS = "opus"
-    AAC = "aac"
-    FLAC = "flac"
-    PCM = "pcm"
-@dataclass
-class TTSRequest:
-    """
-    Request model for TTS generation.
-    Attributes:
-        input: Text to convert to speech
-        voice: Voice to use for generation
-        response_format: Audio format for output
-        instructions: Optional instructions for voice modulation
-        model: Model to use (for OpenAI compatibility, usually ignored)
-        speed: Speech speed (for OpenAI compatibility, usually ignored)
-        max_length: Maximum allowed text length (default: 4096 characters)
-        validate_length: Whether to validate text length (default: True)
-    """
-    input: str
-    voice: Union[Voice, str] = Voice.ALLOY
-    response_format: Union[AudioFormat, str] = AudioFormat.MP3
-    instructions: Optional[str] = None
-    model: Optional[str] = None
-    speed: Optional[float] = None
-    max_length: int = 4096
-    validate_length: bool = True
-    def __post_init__(self):
-        """Validate and normalize fields after initialization."""
-        # Ensure voice is a valid Voice enum
-        if isinstance(self.voice, str):
-            try:
-                self.voice = Voice(self.voice.lower())
-            except ValueError:
-                raise ValueError(f"Invalid voice: {self.voice}. Must be one of {list(Voice)}")
-        # Ensure response_format is a valid AudioFormat enum
-        if isinstance(self.response_format, str):
-            try:
-                self.response_format = AudioFormat(self.response_format.lower())
-            except ValueError:
-                raise ValueError(f"Invalid format: {self.response_format}. Must be one of {list(AudioFormat)}")
-        # Validate input text
-        if not self.input or not self.input.strip():
-            raise ValueError("Input text cannot be empty")
-        # Validate text length if enabled
-        if self.validate_length:
-            text_length = len(self.input)
-            if text_length > self.max_length:
-                raise ValueError(
-                    f"Input text is too long ({text_length} characters). "
-                    f"Maximum allowed length is {self.max_length} characters. "
-                    f"Consider splitting your text into smaller chunks or disable "
-                    f"length validation with validate_length=False."
-                )
-        # Validate max_length parameter
-        if self.max_length <= 0:
-            raise ValueError("max_length must be a positive integer")
-        # Validate speed if provided
-        if self.speed is not None and (self.speed < 0.25 or self.speed > 4.0):
-            raise ValueError("Speed must be between 0.25 and 4.0")
-    def to_dict(self) -> Dict[str, Any]:
-        """Convert request to dictionary for API calls."""
-        data = {
-            "input": self.input,
-            "voice": self.voice.value if isinstance(self.voice, Voice) else self.voice,
-            "response_format": self.response_format.value if isinstance(self.response_format, AudioFormat) else self.response_format
-        }
-        if self.instructions:
-            data["instructions"] = self.instructions
-        if self.model:
-            data["model"] = self.model
-        if self.speed is not None:
-            data["speed"] = self.speed
-        return data
-@dataclass
-class TTSResponse:
-    """
-    Response model for TTS generation.
-    Attributes:
-        audio_data: Generated audio as bytes
-        content_type: MIME type of the audio data
-        format: Audio format used
-        size: Size of audio data in bytes
-        duration: Estimated duration in seconds (if available)
-        metadata: Additional response metadata
-    """
-    audio_data: bytes
-    content_type: str
-    format: AudioFormat
-    size: int
-    duration: Optional[float] = None
-    metadata: Optional[Dict[str, Any]] = None
-    def __post_init__(self):
-        """Calculate derived fields after initialization."""
-        if self.size is None:
-            self.size = len(self.audio_data)
-    def save_to_file(self, filename: str) -> str:
-        """
-        Save audio data to a file.
-        Args:
-            filename: Target filename (extension will be added if missing)
-        Returns:
-            str: Final filename used
-        """
-        import os
-        # Use the actual returned format for the extension, not any requested format
-        expected_extension = f".{self.format.value}"
-        # Check if filename already has the correct extension
-        if filename.endswith(expected_extension):
-            final_filename = filename
-        else:
-            # Remove any existing extension and add the correct one
-            base_name = filename
-            # Remove common audio extensions if present
-            for ext in ['.mp3', '.wav', '.opus', '.aac', '.flac', '.pcm']:
-                if base_name.endswith(ext):
-                    base_name = base_name[:-len(ext)]
-                    break
-            final_filename = f"{base_name}{expected_extension}"
-        # Create directory if it doesn't exist
-        os.makedirs(os.path.dirname(final_filename) if os.path.dirname(final_filename) else ".", exist_ok=True)
-        # Write audio data
-        with open(final_filename, "wb") as f:
-            f.write(self.audio_data)
-        return final_filename
-@dataclass
-class TTSError:
-    """
-    Error information from TTS API.
-    Attributes:
-        code: Error code
-        message: Human-readable error message
-        type: Error type/category
-        details: Additional error details
-        timestamp: When the error occurred
-    """
-    code: str
-    message: str
-    type: Optional[str] = None
-    details: Optional[Dict[str, Any]] = None
-    timestamp: Optional[datetime] = None
-    def __post_init__(self):
-        """Set timestamp if not provided."""
-        if self.timestamp is None:
-            self.timestamp = datetime.now()
-@dataclass
-class APIError(TTSError):
-    """API-specific error information."""
-    status_code: int = 500
-    headers: Optional[Dict[str, str]] = None
-@dataclass
-class NetworkError(TTSError):
-    """Network-related error information."""
-    timeout: Optional[float] = None
-    retry_count: int = 0
-@dataclass
-class ValidationError(TTSError):
-    """Validation error information."""
-    field: Optional[str] = None
-    value: Optional[Any] = None
-# Content type mappings for audio formats
-CONTENT_TYPE_MAP = {
-    AudioFormat.MP3: "audio/mpeg",
-    AudioFormat.OPUS: "audio/opus",
-    AudioFormat.AAC: "audio/aac",
-    AudioFormat.FLAC: "audio/flac",
-    AudioFormat.WAV: "audio/wav",
-    AudioFormat.PCM: "audio/pcm"
-}
-# Reverse mapping for content type to format
-FORMAT_FROM_CONTENT_TYPE = {v: k for k, v in CONTENT_TYPE_MAP.items()}
-def get_content_type(format: Union[AudioFormat, str]) -> str:
-    """Get MIME content type for audio format."""
-    if isinstance(format, str):
-        format = AudioFormat(format.lower())
-    return CONTENT_TYPE_MAP.get(format, "audio/mpeg")
-def get_format_from_content_type(content_type: str) -> AudioFormat:
-    """Get audio format from MIME content type."""
-    return FORMAT_FROM_CONTENT_TYPE.get(content_type, AudioFormat.MP3)
-def get_supported_format(requested_format: AudioFormat) -> AudioFormat:
-    """
-    Map requested format to supported format.
-    Args:
-        requested_format: The requested audio format
-    Returns:
-        AudioFormat: MP3 or WAV (the supported formats)
-    """
-    if requested_format == AudioFormat.MP3:
-        return AudioFormat.MP3
-    else:
-        # All other formats (WAV, OPUS, AAC, FLAC, PCM) return WAV
-        return AudioFormat.WAV
-def maps_to_wav(format_value: str) -> bool:
-    """
-    Check if a format maps to WAV.
-    Args:
-        format_value: Format string to check
-    Returns:
-        bool: True if the format maps to WAV
-    """
-    return format_value.lower() in ['wav', 'opus', 'aac', 'flac', 'pcm']

+"""
+Data models and types for the TTSFM package.
+This module defines the core data structures used throughout the package,
+including request/response models, enums, and error types.
+"""
+from enum import Enum
+from typing import Optional, Dict, Any, Union
+from dataclasses import dataclass
+from datetime import datetime
+class Voice(str, Enum):
+    """Available voice options for TTS generation."""
+    ALLOY = "alloy"
+    ASH = "ash"
+    BALLAD = "ballad"
+    CORAL = "coral"
+    ECHO = "echo"
+    FABLE = "fable"
+    NOVA = "nova"
+    ONYX = "onyx"
+    SAGE = "sage"
+    SHIMMER = "shimmer"
+    VERSE = "verse"
+class AudioFormat(str, Enum):
+    """Supported audio output formats."""
+    MP3 = "mp3"
+    WAV = "wav"
+    OPUS = "opus"
+    AAC = "aac"
+    FLAC = "flac"
+    PCM = "pcm"
+@dataclass
+class TTSRequest:
+    """
+    Request model for TTS generation.
+    Attributes:
+        input: Text to convert to speech
+        voice: Voice to use for generation
+        response_format: Audio format for output
+        instructions: Optional instructions for voice modulation
+        model: Model to use (for OpenAI compatibility, usually ignored)
+        speed: Speech speed (for OpenAI compatibility, usually ignored)
+        max_length: Maximum allowed text length (default: 4096 characters)
+        validate_length: Whether to validate text length (default: True)
+    """
+    input: str
+    voice: Union[Voice, str] = Voice.ALLOY
+    response_format: Union[AudioFormat, str] = AudioFormat.MP3
+    instructions: Optional[str] = None
+    model: Optional[str] = None
+    speed: Optional[float] = None
+    max_length: int = 4096
+    validate_length: bool = True
+    def __post_init__(self):
+        """Validate and normalize fields after initialization."""
+        # Ensure voice is a valid Voice enum
+        if isinstance(self.voice, str):
+            try:
+                self.voice = Voice(self.voice.lower())
+            except ValueError:
+                raise ValueError(f"Invalid voice: {self.voice}. Must be one of {list(Voice)}")
+        # Ensure response_format is a valid AudioFormat enum
+        if isinstance(self.response_format, str):
+            try:
+                self.response_format = AudioFormat(self.response_format.lower())
+            except ValueError:
+                raise ValueError(f"Invalid format: {self.response_format}. Must be one of {list(AudioFormat)}")
+        # Validate input text
+        if not self.input or not self.input.strip():
+            raise ValueError("Input text cannot be empty")
+        # Validate max_length first so a non-positive limit is reported as such
+        if self.max_length <= 0:
+            raise ValueError("max_length must be a positive integer")
+        # Validate text length if enabled
+        if self.validate_length:
+            text_length = len(self.input)
+            if text_length > self.max_length:
+                raise ValueError(
+                    f"Input text is too long ({text_length} characters). "
+                    f"Maximum allowed length is {self.max_length} characters. "
+                    f"Consider splitting your text into smaller chunks or disable "
+                    f"length validation with validate_length=False."
+                )
+        # Validate speed if provided
+        if self.speed is not None and (self.speed < 0.25 or self.speed > 4.0):
+            raise ValueError("Speed must be between 0.25 and 4.0")
+    def to_dict(self) -> Dict[str, Any]:
+        """Convert request to dictionary for API calls."""
+        data = {
+            "input": self.input,
+            "voice": self.voice.value if isinstance(self.voice, Voice) else self.voice,
+            "response_format": self.response_format.value if isinstance(self.response_format, AudioFormat) else self.response_format
+        }
+        if self.instructions:
+            data["instructions"] = self.instructions
+        if self.model:
+            data["model"] = self.model
+        if self.speed is not None:
+            data["speed"] = self.speed
+        return data
+@dataclass
+class TTSResponse:
+    """
+    Response model for TTS generation.
+    Attributes:
+        audio_data: Generated audio as bytes
+        content_type: MIME type of the audio data
+        format: Audio format used
+        size: Size of audio data in bytes
+        duration: Estimated duration in seconds (if available)
+        metadata: Additional response metadata
+    """
+    audio_data: bytes
+    content_type: str
+    format: AudioFormat
+    size: int
+    duration: Optional[float] = None
+    metadata: Optional[Dict[str, Any]] = None
+    def __post_init__(self):
+        """Calculate derived fields after initialization."""
+        if self.size is None:
+            self.size = len(self.audio_data)
+    def save_to_file(self, filename: str) -> str:
+        """
+        Save audio data to a file.
+        Args:
+            filename: Target filename (extension will be added if missing)
+        Returns:
+            str: Final filename used
+        """
+        import os
+        # Use the actual returned format for the extension, not any requested format
+        expected_extension = f".{self.format.value}"
+        # Check if filename already has the correct extension
+        if filename.endswith(expected_extension):
+            final_filename = filename
+        else:
+            # Remove any existing extension and add the correct one
+            base_name = filename
+            # Remove common audio extensions if present
+            for ext in ['.mp3', '.wav', '.opus', '.aac', '.flac', '.pcm']:
+                if base_name.endswith(ext):
+                    base_name = base_name[:-len(ext)]
+                    break
+            final_filename = f"{base_name}{expected_extension}"
+        # Create directory if it doesn't exist
+        os.makedirs(os.path.dirname(final_filename) or ".", exist_ok=True)
+        # Write audio data
+        with open(final_filename, "wb") as f:
+            f.write(self.audio_data)
+        return final_filename
+@dataclass
+class TTSError:
+    """
+    Error information from TTS API.
+    Attributes:
+        code: Error code
+        message: Human-readable error message
+        type: Error type/category
+        details: Additional error details
+        timestamp: When the error occurred
+    """
+    code: str
+    message: str
+    type: Optional[str] = None
+    details: Optional[Dict[str, Any]] = None
+    timestamp: Optional[datetime] = None
+    def __post_init__(self):
+        """Set timestamp if not provided."""
+        if self.timestamp is None:
+            self.timestamp = datetime.now()
+@dataclass
+class APIError(TTSError):
+    """API-specific error information."""
+    status_code: int = 500
+    headers: Optional[Dict[str, str]] = None
+@dataclass
+class NetworkError(TTSError):
+    """Network-related error information."""
+    timeout: Optional[float] = None
+    retry_count: int = 0
+@dataclass
+class ValidationError(TTSError):
+    """Validation error information."""
+    field: Optional[str] = None
+    value: Optional[Any] = None
+# Content type mappings for audio formats
+CONTENT_TYPE_MAP = {
+    AudioFormat.MP3: "audio/mpeg",
+    AudioFormat.OPUS: "audio/opus",
+    AudioFormat.AAC: "audio/aac",
+    AudioFormat.FLAC: "audio/flac",
+    AudioFormat.WAV: "audio/wav",
+    AudioFormat.PCM: "audio/pcm"
+}
+# Reverse mapping for content type to format
+FORMAT_FROM_CONTENT_TYPE = {v: k for k, v in CONTENT_TYPE_MAP.items()}
+def get_content_type(format: Union[AudioFormat, str]) -> str:
+    """Get MIME content type for audio format."""
+    if isinstance(format, str):
+        format = AudioFormat(format.lower())
+    return CONTENT_TYPE_MAP.get(format, "audio/mpeg")
+def get_format_from_content_type(content_type: str) -> AudioFormat:
+    """Get audio format from MIME content type."""
+    return FORMAT_FROM_CONTENT_TYPE.get(content_type, AudioFormat.MP3)
+def get_supported_format(requested_format: AudioFormat) -> AudioFormat:
+    """
+    Map requested format to supported format.
+    Args:
+        requested_format: The requested audio format
+    Returns:
+        AudioFormat: MP3 or WAV (the supported formats)
+    """
+    if requested_format == AudioFormat.MP3:
+        return AudioFormat.MP3
+    else:
+        # All other formats (WAV, OPUS, AAC, FLAC, PCM) return WAV
+        return AudioFormat.WAV
+def maps_to_wav(format_value: str) -> bool:
+    """
+    Check if a format maps to WAV.
+    Args:
+        format_value: Format string to check
+    Returns:
+        bool: True if the format maps to WAV
+    """
+    return format_value.lower() in ['wav', 'opus', 'aac', 'flac', 'pcm']
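
To see what the mapping helpers above do in practice, here is a minimal standalone sketch. The `AudioFormat` enum below is a stand-in whose member values are assumed (the real definition sits outside this hunk), and `format_from_header` is a hypothetical helper, not part of the package. It shows one way to tolerate parameters in a Content-Type header, which the exact-match lookup in `get_format_from_content_type` would otherwise send to the MP3 fallback.

```python
from enum import Enum


class AudioFormat(str, Enum):
    """Stand-in for the package's AudioFormat; member values are assumed."""
    MP3 = "mp3"
    OPUS = "opus"
    AAC = "aac"
    FLAC = "flac"
    WAV = "wav"
    PCM = "pcm"


# Same mapping as in the diff above.
CONTENT_TYPE_MAP = {
    AudioFormat.MP3: "audio/mpeg",
    AudioFormat.OPUS: "audio/opus",
    AudioFormat.AAC: "audio/aac",
    AudioFormat.FLAC: "audio/flac",
    AudioFormat.WAV: "audio/wav",
    AudioFormat.PCM: "audio/pcm",
}
FORMAT_FROM_CONTENT_TYPE = {v: k for k, v in CONTENT_TYPE_MAP.items()}


def get_supported_format(requested_format: AudioFormat) -> AudioFormat:
    """Same rule as the diff: MP3 stays MP3, every other format becomes WAV."""
    if requested_format == AudioFormat.MP3:
        return AudioFormat.MP3
    return AudioFormat.WAV


def format_from_header(content_type: str) -> AudioFormat:
    """Hypothetical helper: strip parameters and case before the lookup."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return FORMAT_FROM_CONTENT_TYPE.get(media_type, AudioFormat.MP3)


if __name__ == "__main__":
    for fmt in AudioFormat:
        print(f"{fmt.value:>4} -> {get_supported_format(fmt).value}")
    print(format_from_header("audio/wav; rate=24000").value)
```

Keeping the fallback at MP3 mirrors the default used by the functions in the diff; a stricter caller might prefer to raise on an unknown media type instead.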

ttsfm/utils.py CHANGED

@@ -1,421 +1,466 @@
-"""
-Utility functions for the TTSFM package.
-This module provides common utility functions used throughout the package,
-including HTTP helpers, validation utilities, and configuration management.
-"""
-import os
-import re
-import time
-import random
-import logging
-from typing import Dict, Any, Optional, Union, List
-from urllib.parse import urljoin, urlparse
-# Configure logging
-logger = logging.getLogger(__name__)
-def get_user_agent() -> str:
-    """
-    Generate a realistic User-Agent string.
-    Returns:
-        str: User-Agent string for HTTP requests
-    """
-    try:
-        from fake_useragent import UserAgent
-        ua = UserAgent()
-        return ua.random
-    except ImportError:
-        # Fallback if fake_useragent is not available
-        return "TTSFM-Client/3.0.0 (Python)"
-def get_realistic_headers() -> Dict[str, str]:
-    """
-    Generate realistic HTTP headers for requests.
-    Returns:
-        Dict[str, str]: HTTP headers dictionary
-    """
-    user_agent = get_user_agent()
-    headers = {
-        "Accept": "application/json, audio/*",
-        "Accept-Encoding": "gzip, deflate, br",
-        "Accept-Language": random.choice(["en-US,en;q=0.9", "en-GB,en;q=0.8", "en-CA,en;q=0.7"]),
-        "Cache-Control": "no-cache",
-        "DNT": "1",
-        "Pragma": "no-cache",
-        "User-Agent": user_agent,
-        "X-Requested-With": "XMLHttpRequest",
-    }
-    # Add browser-specific headers for Chromium-based browsers
-    if any(browser in user_agent.lower() for browser in ['chrome', 'edge', 'chromium']):
-        version_match = re.search(r'(?:Chrome|Edge|Chromium)/(\d+)', user_agent)
-        major_version = version_match.group(1) if version_match else "121"
-        brands = []
-        if 'google chrome' in user_agent.lower():
-            brands.extend([
-                f'"Google Chrome";v="{major_version}"',
-                f'"Chromium";v="{major_version}"',
-                '"Not A(Brand";v="99"'
-            ])
-        elif 'microsoft edge' in user_agent.lower():
-            brands.extend([
-                f'"Microsoft Edge";v="{major_version}"',
-                f'"Chromium";v="{major_version}"',
-                '"Not A(Brand";v="99"'
-            ])
-        else:
-            brands.extend([
-                f'"Chromium";v="{major_version}"',
-                '"Not A(Brand";v="8"'
-            ])
-        headers.update({
-            "Sec-Ch-Ua": ", ".join(brands),
-            "Sec-Ch-Ua-Mobile": "?0",
-            "Sec-Ch-Ua-Platform": random.choice(['"Windows"', '"macOS"', '"Linux"']),
-            "Sec-Fetch-Dest": "empty",
-            "Sec-Fetch-Mode": "cors",
-            "Sec-Fetch-Site": "same-origin"
-        })
-    # Randomly add some optional headers
-    if random.random() < 0.5:
-        headers["Upgrade-Insecure-Requests"] = "1"
-    return headers
-def validate_text_length(text: str, max_length: int = 4096, raise_error: bool = True) -> bool:
-    """
-    Validate text length against maximum allowed characters.
-    Args:
-        text: Text to validate
-        max_length: Maximum allowed length in characters
-        raise_error: Whether to raise an exception if validation fails
-    Returns:
-        bool: True if text is within limits, False otherwise
-    Raises:
-        ValueError: If text exceeds max_length and raise_error is True
-    """
-    if not text:
-        return True
-    text_length = len(text)
-    if text_length > max_length:
-        if raise_error:
-            raise ValueError(
-                f"Text is too long ({text_length} characters). "
-                f"Maximum allowed length is {max_length} characters. "
-                f"TTS models typically support up to 4096 characters per request."
-            )
-        return False
-    return True
-def split_text_by_length(text: str, max_length: int = 4096, preserve_words: bool = True) -> List[str]:
-    """
-    Split text into chunks that don't exceed the maximum length.
-    Args:
-        text: Text to split
-        max_length: Maximum length per chunk
-        preserve_words: Whether to avoid splitting words
-    Returns:
-        List[str]: List of text chunks
-    """
-    if not text:
-        return []
-    if len(text) <= max_length:
-        return [text]
-    chunks = []
-    if preserve_words:
-        # Split by sentences first, then by words if needed
-        sentences = re.split(r'[.!?]+', text)
-        current_chunk = ""
-        for sentence in sentences:
-            sentence = sentence.strip()
-            if not sentence:
-                continue
-            # Add sentence ending punctuation back
-            if not sentence.endswith(('.', '!', '?')):
-                sentence += '.'
-            # Check if adding this sentence would exceed the limit
-            test_chunk = current_chunk + (" " if current_chunk else "") + sentence
-            if len(test_chunk) <= max_length:
-                current_chunk = test_chunk
-            else:
-                # Save current chunk if it has content
-                if current_chunk:
-                    chunks.append(current_chunk.strip())
-                # If single sentence is too long, split by words
-                if len(sentence) > max_length:
-                    word_chunks = _split_by_words(sentence, max_length)
-                    chunks.extend(word_chunks)
-                    current_chunk = ""
-                else:
-                    current_chunk = sentence
-        # Add remaining chunk
-        if current_chunk:
-            chunks.append(current_chunk.strip())
-    else:
-        # Simple character-based splitting
-        for i in range(0, len(text), max_length):
-            chunks.append(text[i:i + max_length])
-    return [chunk for chunk in chunks if chunk.strip()]
-def _split_by_words(text: str, max_length: int) -> List[str]:
-    """
-    Split text by words when sentences are too long.
-    Args:
-        text: Text to split
-        max_length: Maximum length per chunk
-    Returns:
-        List[str]: List of word-based chunks
-    """
-    words = text.split()
-    chunks = []
-    current_chunk = ""
-    for word in words:
-        test_chunk = current_chunk + (" " if current_chunk else "") + word
-        if len(test_chunk) <= max_length:
-            current_chunk = test_chunk
-        else:
-            if current_chunk:
-                chunks.append(current_chunk)
-            # If single word is too long, split it
-            if len(word) > max_length:
-                for i in range(0, len(word), max_length):
-                    chunks.append(word[i:i + max_length])
-                current_chunk = ""
-            else:
-                current_chunk = word
-    if current_chunk:
-        chunks.append(current_chunk)
-    return chunks
-def sanitize_text(text: str) -> str:
-    """
-    Sanitize input text for TTS processing.
-    Args:
-        text: Input text to sanitize
-    Returns:
-        str: Sanitized text
-    """
-    if not text:
-        return ""
-    # Remove HTML tags
-    text = re.sub(r'<[^>]+>', '', text)
-    # Remove script tags and content
-    text = re.sub(r'<script.*?</script>', '', text, flags=re.DOTALL | re.IGNORECASE)
-    # Remove potentially dangerous characters
-    text = re.sub(r'[<>"\']', '', text)
-    # Normalize whitespace
-    text = re.sub(r'\s+', ' ', text)
-    return text.strip()
-def validate_url(url: str) -> bool:
-    """
-    Validate if a URL is properly formatted.
-    Args:
-        url: URL to validate
-    Returns:
-        bool: True if URL is valid, False otherwise
-    """
-    try:
-        result = urlparse(url)
-        return all([result.scheme, result.netloc])
-    except Exception:
-        return False
-def build_url(base_url: str, path: str) -> str:
-    """
-    Build a complete URL from base URL and path.
-    Args:
-        base_url: Base URL
-        path: Path to append
-    Returns:
-        str: Complete URL
-    """
-    # Ensure base_url ends with /
-    if not base_url.endswith('/'):
-        base_url += '/'
-    # Ensure path doesn't start with /
-    if path.startswith('/'):
-        path = path[1:]
-    return urljoin(base_url, path)
-def get_random_delay(min_delay: float = 1.0, max_delay: float = 5.0) -> float:
-    """
-    Get a random delay with jitter for rate limiting.
-    Args:
-        min_delay: Minimum delay in seconds
-        max_delay: Maximum delay in seconds
-    Returns:
-        float: Random delay in seconds
-    """
-    base_delay = random.uniform(min_delay, max_delay)
-    jitter = random.uniform(0.1, 0.5)
-    return base_delay + jitter
-def exponential_backoff(attempt: int, base_delay: float = 1.0, max_delay: float = 60.0) -> float:
-    """
-    Calculate exponential backoff delay.
-    Args:
-        attempt: Attempt number (0-based)
-        base_delay: Base delay in seconds
-        max_delay: Maximum delay in seconds
-    Returns:
-        float: Delay in seconds
-    """
-    delay = base_delay * (2 ** attempt)
-    jitter = random.uniform(0.1, 0.3) * delay
-    return min(delay + jitter, max_delay)
-def load_config_from_env(prefix: str = "TTSFM_") -> Dict[str, Any]:
-    """
-    Load configuration from environment variables.
-    Args:
-        prefix: Prefix for environment variables
-    Returns:
-        Dict[str, Any]: Configuration dictionary
-    """
-    config = {}
-    for key, value in os.environ.items():
-        if key.startswith(prefix):
-            config_key = key[len(prefix):].lower()
-            # Try to convert to appropriate type
-            if value.lower() in ('true', 'false'):
-                config[config_key] = value.lower() == 'true'
-            elif value.isdigit():
-                config[config_key] = int(value)
-            elif '.' in value and value.replace('.', '').isdigit():
-                config[config_key] = float(value)
-            else:
-                config[config_key] = value
-    return config
-def setup_logging(level: Union[str, int] = logging.INFO, format_string: Optional[str] = None) -> None:
-    """
-    Setup logging configuration for the package.
-    Args:
-        level: Logging level
-        format_string: Custom format string
-    """
-    if format_string is None:
-        format_string = '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
-    logging.basicConfig(
-        level=level,
-        format=format_string,
-        handlers=[logging.StreamHandler()]
-    )
-def estimate_audio_duration(text: str, words_per_minute: float = 150.0) -> float:
-    """
-    Estimate audio duration based on text length.
-    Args:
-        text: Input text
-        words_per_minute: Average speaking rate
-    Returns:
-        float: Estimated duration in seconds
-    """
-    if not text:
-        return 0.0
-    # Count words (simple whitespace split)
-    word_count = len(text.split())
-    # Calculate duration in seconds
-    duration = (word_count / words_per_minute) * 60.0
-    # Add some buffer for pauses and processing
-    return duration * 1.1
-def format_file_size(size_bytes: int) -> str:
-    """
-    Format file size in human-readable format.
-    Args:
-        size_bytes: Size in bytes
-    Returns:
-        str: Formatted size string
-    """
-    if size_bytes == 0:
-        return "0 B"
-    size_names = ["B", "KB", "MB", "GB"]
-    i = 0
-    while size_bytes >= 1024 and i < len(size_names) - 1:
-        size_bytes /= 1024.0
-        i += 1
-    return f"{size_bytes:.1f} {size_names[i]}"

+"""
+Utility functions for the TTSFM package.
+This module provides common utility functions used throughout the package,
+including HTTP helpers, validation utilities, and configuration management.
+"""
+import os
+import re
+import time
+import random
+import logging
+from typing import Dict, Any, Optional, Union, List
+from urllib.parse import urljoin, urlparse
+# Configure logging
+logger = logging.getLogger(__name__)
+def get_user_agent() -> str:
+    """
+    Generate a realistic User-Agent string.
+    Returns:
+        str: User-Agent string for HTTP requests
+    """
+    try:
+        from fake_useragent import UserAgent
+        ua = UserAgent()
+        return ua.random
+    except ImportError:
+        # Fallback if fake_useragent is not available
+        return "TTSFM-Client/3.0.0 (Python)"
+def get_realistic_headers() -> Dict[str, str]:
+    """
+    Generate realistic HTTP headers for requests.
+    Returns:
+        Dict[str, str]: HTTP headers dictionary
+    """
+    user_agent = get_user_agent()
+    headers = {
+        "Accept": "application/json, audio/*",
+        "Accept-Encoding": "gzip, deflate, br",
+        "Accept-Language": random.choice(["en-US,en;q=0.9", "en-GB,en;q=0.8", "en-CA,en;q=0.7"]),
+        "Cache-Control": "no-cache",
+        "DNT": "1",
+        "Pragma": "no-cache",
+        "User-Agent": user_agent,
+        "X-Requested-With": "XMLHttpRequest",
+    }
+    # Add browser-specific headers for Chromium-based browsers
+    if any(browser in user_agent.lower() for browser in ['chrome', 'edge', 'chromium']):
+        version_match = re.search(r'(?:Chrome|Edge|Chromium)/(\d+)', user_agent)
+        major_version = version_match.group(1) if version_match else "121"
+        brands = []
+        if 'google chrome' in user_agent.lower():
+            brands.extend([
+                f'"Google Chrome";v="{major_version}"',
+                f'"Chromium";v="{major_version}"',
+                '"Not A(Brand";v="99"'
+            ])
+        elif 'microsoft edge' in user_agent.lower():
+            brands.extend([
+                f'"Microsoft Edge";v="{major_version}"',
+                f'"Chromium";v="{major_version}"',
+                '"Not A(Brand";v="99"'
+            ])
+        else:
+            brands.extend([
+                f'"Chromium";v="{major_version}"',
+                '"Not A(Brand";v="8"'
+            ])
+        headers.update({
+            "Sec-Ch-Ua": ", ".join(brands),
+            "Sec-Ch-Ua-Mobile": "?0",
+            "Sec-Ch-Ua-Platform": random.choice(['"Windows"', '"macOS"', '"Linux"']),
+            "Sec-Fetch-Dest": "empty",
+            "Sec-Fetch-Mode": "cors",
+            "Sec-Fetch-Site": "same-origin"
+        })
+    # Randomly add some optional headers
+    if random.random() < 0.5:
+        headers["Upgrade-Insecure-Requests"] = "1"
+    return headers
+def validate_text_length(text: str, max_length: int = 4096, raise_error: bool = True) -> bool:
+    """
+    Validate text length against maximum allowed characters.
+    Args:
+        text: Text to validate
+        max_length: Maximum allowed length in characters
+        raise_error: Whether to raise an exception if validation fails
+    Returns:
+        bool: True if text is within limits, False otherwise
+    Raises:
+        ValueError: If text exceeds max_length and raise_error is True
+    """
+    if not text:
+        return True
+    text_length = len(text)
+    if text_length > max_length:
+        if raise_error:
+            raise ValueError(
+                f"Text is too long ({text_length} characters). "
+                f"Maximum allowed length is {max_length} characters. "
+                f"TTS models typically support up to 4096 characters per request."
+            )
+        return False
+    return True
+def split_text_by_length(text: str, max_length: int = 4096, preserve_words: bool = True) -> List[str]:
+    """
+    Split text into chunks that don't exceed the maximum length.
+    Args:
+        text: Text to split
+        max_length: Maximum length per chunk
+        preserve_words: Whether to avoid splitting words
+    Returns:
+        List[str]: List of text chunks
+    """
+    if not text:
+        return []
+    if len(text) <= max_length:
+        return [text]
+    chunks = []
+    if preserve_words:
+        # Split by sentences first, then by words if needed
+        sentences = re.split(r'[.!?]+', text)
+        current_chunk = ""
+        for sentence in sentences:
+            sentence = sentence.strip()
+            if not sentence:
+                continue
+            # Add sentence ending punctuation back
+            if not sentence.endswith(('.', '!', '?')):
+                sentence += '.'
+            # Check if adding this sentence would exceed the limit
+            test_chunk = current_chunk + (" " if current_chunk else "") + sentence
+            if len(test_chunk) <= max_length:
+                current_chunk = test_chunk
+            else:
+                # Save current chunk if it has content
+                if current_chunk:
+                    chunks.append(current_chunk.strip())
+                # If single sentence is too long, split by words
+                if len(sentence) > max_length:
+                    word_chunks = _split_by_words(sentence, max_length)
+                    chunks.extend(word_chunks)
+                    current_chunk = ""
+                else:
+                    current_chunk = sentence
+        # Add remaining chunk
+        if current_chunk:
+            chunks.append(current_chunk.strip())
+    else:
+        # Simple character-based splitting
+        for i in range(0, len(text), max_length):
+            chunks.append(text[i:i + max_length])
+    return [chunk for chunk in chunks if chunk.strip()]
+def _split_by_words(text: str, max_length: int) -> List[str]:
+    """
+    Split text by words when sentences are too long.
+    Args:
+        text: Text to split
+        max_length: Maximum length per chunk
+    Returns:
+        List[str]: List of word-based chunks
+    """
+    words = text.split()
+    chunks = []
+    current_chunk = ""
+    for word in words:
+        test_chunk = current_chunk + (" " if current_chunk else "") + word
+        if len(test_chunk) <= max_length:
+            current_chunk = test_chunk
+        else:
+            if current_chunk:
+                chunks.append(current_chunk)
+            # If single word is too long, split it
+            if len(word) > max_length:
+                for i in range(0, len(word), max_length):
+                    chunks.append(word[i:i + max_length])
+                current_chunk = ""
+            else:
+                current_chunk = word
+    if current_chunk:
+        chunks.append(current_chunk)
+    return chunks
+def sanitize_text(text: str) -> str:
+    """
+    Sanitize input text for TTS processing.
+    Removes HTML markup and potentially problematic characters to ensure
+    clean text input for text-to-speech generation. Tags and entities are
+    stripped with a linear character scan rather than complex regex
+    patterns, which avoids ReDoS.
+    Args:
+        text: Input text to sanitize
+    Returns:
+        str: Sanitized text safe for TTS processing
+    Raises:
+        ValueError: If input text is too long (>50000 characters)
+    """
+    if not text:
+        return ""
+    # Prevent ReDoS attacks by limiting input length
+    if len(text) > 50000:
+        raise ValueError("Input text too long for sanitization (max 50000 characters)")
+    # Use a simple character-by-character approach to remove HTML-like content
+    # This avoids complex regex patterns that can cause ReDoS
+    result = []
+    i = 0
+    while i < len(text):
+        if text[i] == '<':
+            # Find the end of the tag
+            j = i + 1
+            while j < len(text) and text[j] != '>':
+                j += 1
+            if j < len(text):
+                # Skip the entire tag
+                i = j + 1
+            else:
+                # No closing >, treat as regular character
+                result.append(text[i])
+                i += 1
+        elif text[i] == '&':
+            # Handle HTML entities: scan a short window for the closing ';'
+            j = i + 1
+            while j < len(text) and j < i + 10 and text[j] not in ' \t\n\r<>&;':
+                j += 1
+            if j < len(text) and text[j] == ';':
+                # Skip the entity
+                i = j + 1
+            else:
+                # Not an entity: replace the bare & with a space for TTS
+                result.append(' ')
+                i += 1
+        else:
+            # Regular character
+            char = text[i]
+            # Normalize curly quotes and backticks to a straight double quote for TTS
+            if char in '\u201c\u201d\u2018\u2019`':
+                result.append('"')
+            elif char in '<>':
+                # Skip these characters
+                pass
+            else:
+                result.append(char)
+            i += 1
+    # Join and normalize whitespace using a safe regex
+    sanitized = ''.join(result)
+    sanitized = re.sub(r'[ \t\n\r\f\v]+', ' ', sanitized)
+    return sanitized.strip()
+def validate_url(url: str) -> bool:
+    """
+    Validate if a URL is properly formatted.
+    Args:
+        url: URL to validate
+    Returns:
+        bool: True if URL is valid, False otherwise
+    """
+    try:
+        result = urlparse(url)
+        return all([result.scheme, result.netloc])
+    except Exception:
+        return False
+def build_url(base_url: str, path: str) -> str:
+    """
+    Build a complete URL from base URL and path.
+    Args:
+        base_url: Base URL
+        path: Path to append
+    Returns:
+        str: Complete URL
+    """
+    # Ensure base_url ends with /
+    if not base_url.endswith('/'):
+        base_url += '/'
+    # Ensure path doesn't start with /
+    if path.startswith('/'):
+        path = path[1:]
+    return urljoin(base_url, path)
+def get_random_delay(min_delay: float = 1.0, max_delay: float = 5.0) -> float:
+    """
+    Get a random delay with jitter for rate limiting.
+    Args:
+        min_delay: Minimum delay in seconds
+        max_delay: Maximum delay in seconds
+    Returns:
+        float: Random delay in seconds
+    """
+    base_delay = random.uniform(min_delay, max_delay)
+    jitter = random.uniform(0.1, 0.5)
+    return base_delay + jitter
+def exponential_backoff(attempt: int, base_delay: float = 1.0, max_delay: float = 60.0) -> float:
+    """
+    Calculate exponential backoff delay.
+    Args:
+        attempt: Attempt number (0-based)
+        base_delay: Base delay in seconds
+        max_delay: Maximum delay in seconds
+    Returns:
+        float: Delay in seconds
+    """
+    delay = base_delay * (2 ** attempt)
+    jitter = random.uniform(0.1, 0.3) * delay
+    return min(delay + jitter, max_delay)
+def load_config_from_env(prefix: str = "TTSFM_") -> Dict[str, Any]:
+    """
+    Load configuration from environment variables.
+    Args:
+        prefix: Prefix for environment variables
+    Returns:
+        Dict[str, Any]: Configuration dictionary
+    """
+    config = {}
+    for key, value in os.environ.items():
+        if key.startswith(prefix):
+            config_key = key[len(prefix):].lower()
+            # Try to convert to appropriate type
+            if value.lower() in ('true', 'false'):
+                config[config_key] = value.lower() == 'true'
+            elif value.isdigit():
+                config[config_key] = int(value)
+            elif value.count('.') == 1 and value.replace('.', '', 1).isdigit():
+                config[config_key] = float(value)
+            else:
+                config[config_key] = value
+    return config
+def setup_logging(level: Union[str, int] = logging.INFO, format_string: Optional[str] = None) -> None:
+    """
+    Setup logging configuration for the package.
+    Args:
+        level: Logging level
+        format_string: Custom format string
+    """
+    if format_string is None:
+        format_string = '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+    logging.basicConfig(
+        level=level,
+        format=format_string,
+        handlers=[logging.StreamHandler()]
+    )
+def estimate_audio_duration(text: str, words_per_minute: float = 150.0) -> float:
+    """
+    Estimate audio duration based on text length.
+    Args:
+        text: Input text
+        words_per_minute: Average speaking rate
+    Returns:
+        float: Estimated duration in seconds
+    """
+    if not text:
+        return 0.0
+    # Count words (simple whitespace split)
+    word_count = len(text.split())
+    # Calculate duration in seconds
+    duration = (word_count / words_per_minute) * 60.0
+    # Add some buffer for pauses and processing
+    return duration * 1.1
+def format_file_size(size_bytes: int) -> str:
+    """
+    Format file size in human-readable format.
+    Args:
+        size_bytes: Size in bytes
+    Returns:
+        str: Formatted size string
+    """
+    if size_bytes == 0:
+        return "0 B"
+    size_names = ["B", "KB", "MB", "GB"]
+    i = 0
+    while size_bytes >= 1024 and i < len(size_names) - 1:
+        size_bytes /= 1024.0
+        i += 1
+    return f"{size_bytes:.1f} {size_names[i]}"
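
As an illustration of how `exponential_backoff` might be used, the sketch below wraps an operation in a retry loop. The function body is copied from the diff above; `call_with_retries`, its parameters, and the choice to retry only on `ConnectionError` are assumptions made for this example, not part of the package.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def exponential_backoff(attempt: int, base_delay: float = 1.0, max_delay: float = 60.0) -> float:
    """Same formula as the diff: doubling delay plus 10-30% jitter, capped."""
    delay = base_delay * (2 ** attempt)
    jitter = random.uniform(0.1, 0.3) * delay
    return min(delay + jitter, max_delay)


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical wrapper: retry `operation` on ConnectionError with backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            # Re-raise once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            sleep(exponential_backoff(attempt))
    # Only reachable when the loop never ran.
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the wrapper testable: a test can hand in `list.append` to record the delays instead of actually waiting.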

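The type coercion in `load_config_from_env` can be tried in isolation with the sketch below. `coerce_env_value` and `load_config` are hypothetical names introduced here. The float check is deliberately stricter than a plain "contains a dot" test, so a value such as `1.2.3` stays a string instead of reaching `float()`.

```python
from typing import Any, Dict, Mapping


def coerce_env_value(value: str) -> Any:
    """Hypothetical helper: turn an environment string into bool, int, float, or str."""
    lowered = value.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    if value.isdigit():
        return int(value)
    # Exactly one dot and digits elsewhere, e.g. "2.5"; "1.2.3" falls through.
    if value.count(".") == 1 and value.replace(".", "", 1).isdigit():
        return float(value)
    return value


def load_config(environ: Mapping[str, str], prefix: str = "TTSFM_") -> Dict[str, Any]:
    """Collect prefixed variables, lower-casing the key after the prefix."""
    return {
        key[len(prefix):].lower(): coerce_env_value(value)
        for key, value in environ.items()
        if key.startswith(prefix)
    }


if __name__ == "__main__":
    sample = {"TTSFM_API_KEY": "abc", "TTSFM_TIMEOUT": "2.5", "PATH": "/usr/bin"}
    print(load_config(sample))
```

Taking the environment as a mapping argument, rather than reading `os.environ` directly, is a choice made for this sketch so it can be exercised with a plain dictionary. Negative numbers are left as strings here, as they are in the original logic.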
uv.lock ADDED

The diff for this file is too large to render.