apple/starflow

@@ -7,6 +7,7 @@ tags:
 - generative-models
 - art
 - autoregressive-models
 ---
 # STARFlow: Scalable Transformer Auto-Regressive Flow
@@ -58,7 +59,7 @@ pip install -r requirements.txt
 ### Text-to-Image Generation
 Generate high-quality images from text prompts:
 ```bash
@@ -88,8 +89,8 @@ bash scripts/test_sample_video.sh "a corgi dog looks at the camera"
 bash scripts/test_sample_video.sh "a cat playing piano" "/path/to/input/image.jpg"
 # Longer video generation (specify target length in frames)
-bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none" 241  # ~15 seconds at 16fps
-bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none" 481  # ~30 seconds at 16fps
 # Advanced video generation
 torchrun --standalone --nproc_per_node 8 sample.py \
@@ -138,9 +139,9 @@ torchrun --standalone --nproc_per_node 8 train.py \
     --batch_size 192
 ```
-## 🔧 Utilities
-### Video Processing
 Extract individual frames from multi-video grids:
@@ -236,14 +237,14 @@ python scripts/extract_images.py input_file.mp4
 ## 💡 Tips
-### Image Generation
 1. Use guidance scales between 2.0-5.0 for balanced quality and diversity
 2. Experiment with different aspect ratios for your use case
 3. Enable Jacobi iteration (`--jacobi 1`) for faster sampling
 4. Use higher resolution models for detailed outputs
 5. The default script uses optimized settings: `--jacobi_th 0.001` and `--jacobi_block_size 16`
-### Video Generation
 1. Start with shorter sequences (81 frames) and gradually increase length (161, 241, 481+ frames)
 2. Use input images (`--input_image`) for more controlled generation
 3. Adjust FPS settings based on content type (8-24 FPS)

 - generative-models
 - art
 - autoregressive-models
+pipeline_tag: text-to-audio
 ---
 # STARFlow: Scalable Transformer Auto-Regressive Flow
 ### Text-to-Image Generation
+hi
 Generate high-quality images from text prompts:
 ```bash
 bash scripts/test_sample_video.sh "a cat playing piano" "/path/to/input/image.jpg"
 # Longer video generation (specify target length in frames)
+bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none" 0 ~15 seconds at 16fps
+bash scripts/test_sample_video.sh "a corgi dog looks at the camera" "none" 0 # ~30 seconds at 16fps
 # Advanced video generation
 torchrun --standalone --nproc_per_node 8 sample.py \
     --batch_size 192
 ```
+## Utilities
+###
 Extract individual frames from multi-video grids:
 ## 💡 Tips
+###
 1. Use guidance scales between 2.0-5.0 for balanced quality and diversity
 2. Experiment with different aspect ratios for your use case
 3. Enable Jacobi iteration (`--jacobi 1`) for faster sampling
 4. Use higher resolution models for detailed outputs
 5. The default script uses optimized settings: `--jacobi_th 0.001` and `--jacobi_block_size 16`
+###
 1. Start with shorter sequences (81 frames) and gradually increase length (161, 241, 481+ frames)
 2. Use input images (`--input_image`) for more controlled generation
 3. Adjust FPS settings based on content type (8-24 FPS)
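
The frame counts quoted in the video examples and tips (81, 161, 241, 481) are consistent with a clip duration at 16 fps plus one initial frame: 241 frames is about 15 seconds and 481 frames about 30 seconds. A minimal sketch of that arithmetic follows, assuming the `seconds * fps + 1` relationship inferred from those numbers. The helper name `target_frames` is hypothetical and is not part of the STARFlow scripts.

```python
def target_frames(seconds: float, fps: int = 16) -> int:
    """Return a frame count for a clip of the given duration.

    Assumption: target lengths follow seconds * fps + 1, inferred from the
    241-frame (~15 s) and 481-frame (~30 s) examples at 16 fps.
    """
    if seconds < 0 or fps <= 0:
        raise ValueError("seconds must be non-negative and fps must be positive")
    return round(seconds * fps) + 1


if __name__ == "__main__":
    # Print the frame count for a few common durations at the default 16 fps.
    for secs in (5, 10, 15, 30):
        print(f"{secs:>2} s -> {target_frames(secs)} frames")
```

The resulting value is what the longer-video examples pass as the third argument to `scripts/test_sample_video.sh`.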