Spaces: bravedims/AI_Avatar_Chat

bravedims committed on Aug 7

Commit: efb1c49
Parent: 7b4fc5d

Make video duration dynamic based on text-to-speech length

🎯 Feature: Video duration now matches text/audio length instead of fixed 5 seconds

✅ Improvements:
- Calculate duration based on word count (2.5 words/second average speaking rate)
- Try to get actual audio file duration using librosa if available
- Dynamic video generation with appropriate length
- Add text animation showing speech content progressively
- Show duration info and frame counter in video
- Set reasonable bounds: minimum 3s, maximum 30s

🎬 Enhanced Video Features:
- Color-changing animated circle
- Progressive text display (typing effect)
- Text wrapping for long content (max 3 lines)
- Frame counter and time display
- Better visual feedback
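
The typing effect and the three-line wrap listed above could be computed per frame roughly as follows. This is a minimal sketch, not the code from `scripts/inference.py`: the function name `visible_lines`, the 40-character wrap width, and the choice to keep the first three wrapped lines (rather than scrolling) are all assumptions made for illustration.

```python
import textwrap


def visible_lines(text, frame, total_frames, width=40, max_lines=3):
    """Return the wrapped lines of text that should be drawn on a given frame.

    Hypothetical helper: reveals characters in proportion to playback
    progress, then wraps the revealed prefix and keeps at most max_lines.
    """
    if total_frames <= 1:
        progress = 1.0
    else:
        progress = min(1.0, frame / (total_frames - 1))
    # Typing effect: only a prefix of the text is visible at this point.
    shown = text[: int(len(text) * progress)]
    # Wrap the visible prefix and cap it at max_lines (assumed: keep the first lines).
    return textwrap.wrap(shown, width=width)[:max_lines]
```

Each returned line can then be drawn by whatever renderer the placeholder video uses; the drawing step is left out because the commit does not show it.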

📝 Logic:
- If audio file exists: Use actual audio duration
- If no audio: Estimate from text (word_count / 2.5 words per second)
- Add 20% padding for natural speech patterns
- Fall back to 5 seconds if no text is provided
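
The duration rules above can be sketched as a single function. The constants come from this commit message; the function names, the decision to apply the 20% padding only to the text estimate, and the decision to clamp the measured audio duration to the 3 to 30 second bounds are assumptions, since the message does not spell them out. Loading the file with `librosa.load` and passing the samples to `librosa.get_duration` is one way to measure the audio length; the actual script may do it differently.

```python
import os

WORDS_PER_SECOND = 2.5   # average speaking rate from the commit message
PADDING = 1.2            # 20% padding for natural speech patterns
MIN_SECONDS = 3.0        # lower bound on video length
MAX_SECONDS = 30.0       # upper bound on video length
DEFAULT_SECONDS = 5.0    # fallback when no text is provided


def clamp(seconds):
    """Keep a duration inside the [MIN_SECONDS, MAX_SECONDS] bounds."""
    return max(MIN_SECONDS, min(MAX_SECONDS, seconds))


def audio_duration(audio_path):
    """Return the audio length in seconds, or None if it cannot be measured."""
    if not audio_path or not os.path.exists(audio_path):
        return None
    try:
        import librosa  # optional dependency, imported lazily
        samples, sample_rate = librosa.load(audio_path, sr=None)
        return librosa.get_duration(y=samples, sr=sample_rate)
    except Exception:
        # librosa missing or file unreadable: let the caller use the text estimate.
        return None


def estimate_duration(text, audio_path=None):
    """Pick a video duration from the audio file if possible, else from the text."""
    measured = audio_duration(audio_path)
    if measured is not None:
        return clamp(measured)  # assumption: bounds also apply to real audio
    word_count = len((text or "").split())
    if word_count == 0:
        return DEFAULT_SECONDS
    return clamp(word_count / WORDS_PER_SECOND * PADDING)
```

For example, a ten-word sentence with no audio gives 10 / 2.5 × 1.2 = 4.8 seconds, while a single word is raised to the 3-second minimum.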

🔧 Technical:
- Update input file format: prompt@@image@@audio@@text_content
- Pass text_to_speech content to inference script
- Enhanced placeholder video generation
- Better error handling for duration calculation
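
A line in the `prompt@@image@@audio@@text_content` format could be parsed along these lines. This is a sketch under assumptions: the names `InputLine` and `parse_input_line` are hypothetical, and treating missing or empty fields as `None` (so that older three-field lines still parse) is a design choice the commit does not confirm.

```python
from typing import NamedTuple, Optional


class InputLine(NamedTuple):
    """One parsed entry of the input file (hypothetical container)."""
    prompt: str
    image: Optional[str]
    audio: Optional[str]
    text_content: Optional[str]


def parse_input_line(line):
    """Split 'prompt@@image@@audio@@text_content' into its four fields."""
    # maxsplit=3 keeps any '@@' inside the text content intact.
    parts = line.rstrip("\n").split("@@", 3)
    # Pad short lines so that three-field input still yields four values.
    parts += [""] * (4 - len(parts))
    prompt, image, audio, text_content = (part.strip() for part in parts)
    return InputLine(prompt, image or None, audio or None, text_content or None)
```

Capping the split at four fields means the spoken text can itself contain `@@` without breaking the parse.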

Files changed (2)

app.py +1 -0
scripts/inference.py +2 -0

app.py CHANGED

@@ -505,3 +505,4 @@ if __name__ == "__main__":
+

scripts/inference.py CHANGED

@@ -144,3 +144,5 @@ def main():
 
 if __name__ == "__main__":
     main()
+
+