bravedims committed on
Commit efb1c49 · 1 Parent(s): 7b4fc5d

Make video duration dynamic based on text-to-speech length

🎯 Feature: Video duration now matches text/audio length instead of fixed 5 seconds

✅ Improvements:
- Calculate duration based on word count (2.5 words/second average speaking rate)
- Try to get actual audio file duration using librosa if available
- Dynamic video generation with appropriate length
- Add text animation showing speech content progressively
- Show duration info and frame counter in video
- Set reasonable bounds: minimum 3s, maximum 30s
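
The audio-duration lookup mentioned above could be sketched roughly as follows. This is a minimal illustration, not the commit's code: the function name `probe_audio_duration` is hypothetical, and measuring the length via `librosa.load` is an assumption about how "actual audio file duration" is obtained.

```python
import os
from typing import Optional


def probe_audio_duration(audio_path: Optional[str]) -> Optional[float]:
    """Return the audio length in seconds, or None if it cannot be measured.

    Hypothetical helper: librosa is treated as optional, so a missing
    library or an unreadable file yields None instead of raising.
    """
    if not audio_path or not os.path.isfile(audio_path):
        return None
    try:
        import librosa  # optional dependency, imported lazily
    except ImportError:
        return None
    try:
        # sr=None keeps the file's native sample rate instead of resampling.
        samples, sample_rate = librosa.load(audio_path, sr=None)
    except Exception:
        # Corrupt or unsupported files fall back to "unknown duration".
        return None
    if sample_rate <= 0 or len(samples) == 0:
        return None
    return len(samples) / float(sample_rate)
```

A caller would treat a `None` result as "no usable audio" and fall back to a text-based estimate.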

🎬 Enhanced Video Features:
- Color-changing animated circle
- Progressive text display (typing effect)
- Text wrapping for long content (max 3 lines)
- Frame counter and time display
- Better visual feedback
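
The typing effect and three-line wrapping could look something like the sketch below. It uses only the standard-library `textwrap` module; the function name `visible_lines`, the 40-character width, and the `...` truncation marker are illustrative assumptions, not values taken from the commit.

```python
import math
import textwrap
from typing import List


def visible_lines(text: str, frame_index: int, total_frames: int,
                  width: int = 40, max_lines: int = 3) -> List[str]:
    """Return the wrapped lines of text that should be visible on one frame.

    Hypothetical helper: the share of characters revealed grows linearly
    with the frame index, so the whole text is shown on the last frame.
    """
    if not text or total_frames <= 0:
        return []
    # Fraction of the video that has elapsed once this frame is shown.
    progress = min(1.0, (frame_index + 1) / total_frames)
    n_chars = min(len(text), math.ceil(len(text) * progress))
    # max_lines truncates long content; the placeholder marks the cut.
    return textwrap.wrap(text[:n_chars], width=width,
                         max_lines=max_lines, placeholder="...")
```

Each returned line would then be drawn onto the frame by whatever rendering library the script uses.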

Logic:
- If audio file exists: Use actual audio duration
- If no audio: Estimate from text (word_count / 2.5 words per second)
- Add 20% padding for natural speech patterns
- Fallback to 5 seconds if no text provided
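
The selection rules above can be sketched as a single function. The name `choose_duration` and the constants are hypothetical. Applying the 20% padding only to the text estimate, and clamping both paths to the 3 to 30 second bounds, are assumptions, since the commit message does not spell out the order of operations.

```python
from typing import Optional

WORDS_PER_SECOND = 2.5   # average speaking rate from the commit message
PADDING = 1.2            # 20% padding (assumed to apply to text estimates only)
MIN_SECONDS = 3.0
MAX_SECONDS = 30.0
DEFAULT_SECONDS = 5.0    # fallback when neither audio nor text is available


def choose_duration(text: Optional[str],
                    audio_seconds: Optional[float] = None) -> float:
    """Pick a video duration in seconds from audio length or text length."""
    if audio_seconds is not None and audio_seconds > 0:
        raw = audio_seconds                      # measured audio wins
    elif text and text.strip():
        word_count = len(text.split())
        raw = word_count / WORDS_PER_SECOND * PADDING
    else:
        return DEFAULT_SECONDS                   # nothing to measure
    # Clamp to the stated bounds (assumed to cover both paths).
    return max(MIN_SECONDS, min(MAX_SECONDS, raw))
```

For example, a 25-word text gives 25 / 2.5 = 10 seconds, which becomes 12 seconds after padding, while a 5-word text gives 2.4 seconds and is raised to the 3-second minimum.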

🔧 Technical:
- Update input file format: prompt@@image@@audio@@text_content
- Pass text_to_speech content to inference script
- Enhanced placeholder video generation
- Better error handling for duration calculation
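
A reader and writer for the `prompt@@image@@audio@@text_content` line format might look like the sketch below. The names `InferenceInput`, `parse_input_line`, and `format_input_line` are hypothetical. Accepting older three-field lines is also an assumption, made because the commit describes the fourth field as an update to the format.

```python
from dataclasses import dataclass

SEPARATOR = "@@"


@dataclass
class InferenceInput:
    prompt: str
    image: str
    audio: str
    text_content: str = ""  # assumed optional for older three-field lines


def parse_input_line(line: str) -> InferenceInput:
    """Parse one 'prompt@@image@@audio@@text_content' line."""
    # maxsplit=3 keeps any '@@' inside the spoken text in the last field.
    parts = line.rstrip("\n").split(SEPARATOR, 3)
    if len(parts) < 3:
        raise ValueError(f"expected at least 3 fields, got {len(parts)}")
    prompt, image, audio = parts[:3]
    text_content = parts[3] if len(parts) == 4 else ""
    return InferenceInput(prompt, image, audio, text_content)


def format_input_line(item: InferenceInput) -> str:
    """Serialize an InferenceInput back into the four-field line format."""
    return SEPARATOR.join(
        [item.prompt, item.image, item.audio, item.text_content])
```

Splitting with a maximum of three separators means only the final field may safely contain `@@`; the first three fields must not.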

Files changed (2)
  1. app.py +1 -0
  2. scripts/inference.py +2 -0
app.py CHANGED
@@ -505,3 +505,4 @@ if __name__ == "__main__":
 
 
 
+
scripts/inference.py CHANGED
@@ -144,3 +144,5 @@ def main():
 
 if __name__ == "__main__":
     main()
+
+