# Alternative OmniAvatar Model Download Guide

## 🎯 Why You're Getting Only Audio Output

Your app is working correctly but running in **TTS-only mode** because the OmniAvatar-14B models are missing. The app gracefully falls back to audio-only generation when the video models aren't available.

## 🚀 Solutions to Enable Video Generation

### Option 1: Use Git to Download Models (If you have Git LFS)

    # Create model directories
    mkdir pretrained_models\Wan2.1-T2V-14B
    mkdir pretrained_models\OmniAvatar-14B
    mkdir pretrained_models\wav2vec2-base-960h

    # Clone models (requires Git LFS; `git lfs clone` is deprecated in favor of plain `git clone`)
    git lfs install
    git clone https://huggingface.co/Wan-AI/Wan2.1-T2V-14B pretrained_models/Wan2.1-T2V-14B
    git clone https://huggingface.co/OmniAvatar/OmniAvatar-14B pretrained_models/OmniAvatar-14B
    git clone https://huggingface.co/facebook/wav2vec2-base-960h pretrained_models/wav2vec2-base-960h

### Option 2: Install Python and Run Setup Script

1. **Install Python** (if not already done):
   - Download from: https://python.org/downloads/
   - Or install it from the Microsoft Store
   - Make sure to check "Add to PATH" during installation

2. **Run the setup script**: `python setup_omniavatar.py`

### Option 3: Manual Download from HuggingFace

Visit these URLs and download the files manually:

- https://huggingface.co/Wan-AI/Wan2.1-T2V-14B
- https://huggingface.co/OmniAvatar/OmniAvatar-14B
- https://huggingface.co/facebook/wav2vec2-base-960h

Place the downloaded files in the matching folders:

- pretrained_models/Wan2.1-T2V-14B/
- pretrained_models/OmniAvatar-14B/
- pretrained_models/wav2vec2-base-960h/

### Option 4: Use Windows Subsystem for Linux (WSL)

If you have WSL installed:

```bash
wsl
cd /mnt/c/path/to/your/project
python setup_omniavatar.py
```

## 📊 Model Requirements

Total download size: ~30.4 GB

- Wan2.1-T2V-14B: ~28 GB (base text-to-video model)
- OmniAvatar-14B: ~2 GB (avatar animation weights)
- wav2vec2-base-960h: ~360 MB (audio encoder)

## 🔍 Verify Installation

After downloading, restart your app and check:

- The app should show "full functionality enabled" in the logs
- API responses should return video URLs instead of just audio
- The Gradio interface should show a video output component

## 💡 Current Status

Your setup is working perfectly for TTS! Once the OmniAvatar models are downloaded, you'll get:

- ✅ Audio-driven avatar videos
- ✅ Adaptive body animation
- ✅ Lip-sync accuracy
- ✅ 480p video output
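
Before restarting the app, it can help to confirm that the three model folders listed in this guide actually exist and are not empty. The snippet below is a minimal sketch of such a check. The helper name `find_missing_models` is hypothetical and is not part of `setup_omniavatar.py`, and the root folder is assumed to be `pretrained_models` relative to the directory the app is started from. It checks only that each folder is present and non-empty. It does not confirm that the downloads are complete, and the app itself may detect the models differently.

```python
from pathlib import Path

# Folder names taken from this guide. The root folder name is an assumption:
# "pretrained_models" relative to the directory the app is started from.
EXPECTED_MODEL_DIRS = (
    "Wan2.1-T2V-14B",
    "OmniAvatar-14B",
    "wav2vec2-base-960h",
)


def find_missing_models(root="pretrained_models"):
    """Return the expected model folders that are absent or empty under root.

    Hypothetical helper for illustration; not part of setup_omniavatar.py.
    """
    root_path = Path(root)
    missing = []
    for name in EXPECTED_MODEL_DIRS:
        model_dir = root_path / name
        # A folder counts as present only if it exists and has at least one entry.
        if not model_dir.is_dir() or not any(model_dir.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_models()
    if missing:
        print("Missing or empty model folders:", ", ".join(missing))
    else:
        print("All three model folders are present and non-empty.")
```

Run it from the project root. If it reports a missing folder, repeat the download option you used for that model before restarting the app.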