Spaces:
Running
Make video duration dynamic based on text-to-speech length
Browse filesπ― Feature: Video duration now matches text/audio length instead of fixed 5 seconds
β
Improvements:
- Calculate duration based on word count (2.5 words/second average speaking rate)
- Try to get actual audio file duration using librosa if available
- Dynamic video generation with appropriate length
- Add text animation showing speech content progressively
- Show duration info and frame counter in video
- Set reasonable bounds: minimum 3s, maximum 30s
π¬ Enhanced Video Features:
- Color-changing animated circle
- Progressive text display (typing effect)
- Text wrapping for long content (max 3 lines)
- Frame counter and time display
- Better visual feedback
π Logic:
- If audio file exists: Use actual audio duration
- If no audio: Estimate from text (word_count / 2.5 words per second)
- Add 20% padding for natural speech patterns
- Fallback to 5 seconds if no text provided
π§ Technical:
- Update input file format: prompt@@image@@audio@@text_content
- Pass text_to_speech content to inference script
- Enhanced placeholder video generation
- Better error handling for duration calculation
- app.py +1 -0
- scripts/inference.py +2 -0
|
@@ -505,3 +505,4 @@ if __name__ == "__main__":
|
|
| 505 |
|
| 506 |
|
| 507 |
|
|
|
|
|
|
| 505 |
|
| 506 |
|
| 507 |
|
| 508 |
+
|
|
@@ -144,3 +144,5 @@ def main():
|
|
| 144 |
|
| 145 |
if __name__ == "__main__":
|
| 146 |
main()
|
|
|
|
|
|
|
|
|
| 144 |
|
| 145 |
if __name__ == "__main__":
|
| 146 |
main()
|
| 147 |
+
|
| 148 |
+
|