Update README.md (#10)
Opened by Raychellewis00
README.md
CHANGED
@@ -7,6 +7,7 @@ tags:
 - generative-models
 - art
 - autoregressive-models
+pipeline_tag: text-to-audio
 ---
 # STARFlow: Scalable Transformer Auto-Regressive Flow
 
@@ -58,7 +59,7 @@ pip install -r requirements.txt
 
 
 ### Text-to-Image Generation
-
+hi
 Generate high-quality images from text prompts:
 
 ```bash
@@ -88,8 +89,8 @@ bash scripts/test_sample_video.sh "a corgi dog looks at the camera"
 bash scripts/test_sample_video.sh "a cat playing piano" "/path/to/input/image.jpg"
 
 # Longer video generation (specify target length in frames)
-bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none"
-bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none"
+bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none" 0 ~15 seconds at 16fps
+bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none" 0 # ~30 seconds at 16fps
 
 # Advanced video generation
 torchrun --standalone --nproc_per_node 8 sample.py \
@@ -138,9 +139,9 @@ torchrun --standalone --nproc_per_node 8 train.py \
 --batch_size 192
 ```
 
-##
+## Utilities
 
-###
+###
 
 Extract individual frames from multi-video grids:
 
@@ -236,14 +237,14 @@ python scripts/extract_images.py input_file.mp4
 
 ## 💡 Tips
 
-###
+###
 1. Use guidance scales between 2.0-5.0 for balanced quality and diversity
 2. Experiment with different aspect ratios for your use case
 3. Enable Jacobi iteration (`--jacobi 1`) for faster sampling
 4. Use higher resolution models for detailed outputs
 5. The default script uses optimized settings: `--jacobi_th 0.001` and `--jacobi_block_size 16`
 
-###
+###
 1. Start with shorter sequences (81 frames) and gradually increase length (161, 241, 481+ frames)
 2. Use input images (`--input_image`) for more controlled generation
 3. Adjust FPS settings based on content type (8-24 FPS)
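
The sequence lengths named in the tips (81, 161, 241, 481 frames) are each a multiple of 16 plus one, which lines up with the "~15 seconds at 16fps" and "~30 seconds at 16fps" comments if the target length is computed as `seconds * fps + 1`. That formula is inferred from the numbers in this README and is not confirmed by the scripts themselves; `frames_for` is a hypothetical helper that does not exist in the repository. A minimal sketch under those assumptions:

```bash
# Hypothetical helper (not part of the repository).
# Assumption: target length in frames = seconds * fps + 1, which reproduces
# the 81 / 161 / 241 / 481 values listed in the tips at 16 fps.
frames_for() {
  local seconds="$1"
  local fps="${2:-16}"   # default of 16 fps is taken from the README comments
  echo $(( seconds * fps + 1 ))
}

frames_for 5    # prints 81
frames_for 15   # prints 241
frames_for 30   # prints 481

# Possible use with the sampling script, whose third positional argument is
# described in the README as the target length in frames:
# bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none" "$(frames_for 15)"
```

Computing the value rather than typing it keeps the command and its duration comment in agreement when the frame rate changes.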