Files changed (1)
  1. README.md +8 -7
README.md CHANGED
@@ -7,6 +7,7 @@ tags:
  - generative-models
  - art
  - autoregressive-models
+ pipeline_tag: text-to-audio
  ---
  # STARFlow: Scalable Transformer Auto-Regressive Flow

@@ -58,7 +59,7 @@ pip install -r requirements.txt


  ### Text-to-Image Generation
-
+ hi
  Generate high-quality images from text prompts:

  ```bash
@@ -88,8 +89,8 @@ bash scripts/test_sample_video.sh "a corgi dog looks at the camera"
  bash scripts/test_sample_video.sh "a cat playing piano" "/path/to/input/image.jpg"

  # Longer video generation (specify target length in frames)
- bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none" 241 # ~15 seconds at 16fps
- bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none" 481 # ~30 seconds at 16fps
+ bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none" 0 ~15 seconds at 16fps
+ bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none" 0 # ~30 seconds at 16fps

  # Advanced video generation
  torchrun --standalone --nproc_per_node 8 sample.py \
@@ -138,9 +139,9 @@ torchrun --standalone --nproc_per_node 8 train.py \
  --batch_size 192
  ```

- ## 🔧 Utilities
+ ## Utilities

- ### Video Processing
+ ###

  Extract individual frames from multi-video grids:

@@ -236,14 +237,14 @@ python scripts/extract_images.py input_file.mp4

  ## 💡 Tips

- ### Image Generation
+ ###
  1. Use guidance scales between 2.0-5.0 for balanced quality and diversity
  2. Experiment with different aspect ratios for your use case
  3. Enable Jacobi iteration (`--jacobi 1`) for faster sampling
  4. Use higher resolution models for detailed outputs
  5. The default script uses optimized settings: `--jacobi_th 0.001` and `--jacobi_block_size 16`

- ### Video Generation
+ ###
  1. Start with shorter sequences (81 frames) and gradually increase length (161, 241, 481+ frames)
  2. Use input images (`--input_image`) for more controlled generation
  3. Adjust FPS settings based on content type (8-24 FPS)