---
tags:
- text-to-image
- lora
- diffusers
- template:diffusion-lora
widget:
- text: >-
    flat color, no lineart, blending, negative space, artist:[john kafka|ponsuke
    kaikai|hara id 21|yoneyama mai|fuzichoco], 1girl, hoshimachi suisei,
    virtual youtuber, blue hair, side ponytail, cowboy shot, black shirt, star
    print, off shoulder, outdoors, starry sky, wariza, looking up, half-closed
    eyes, black sky, live2d animation, upper body, high quality cinematic video
    of a woman sitting under the starry night sky. The Camera is steady, This is
    a cowboy shot. The animation is smooth and fluid.
  parameters:
    negative_prompt: >-
      bad quality
      video,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走
  output:
    url: images/ComfyUI_00455_.webp
- text: >-
    flat color, no lineart, blending, negative space, artist:[john kafka|ponsuke
    kaikai|hara id 21|yoneyama mai|fuzichoco], 1girl, sakura miko, pink hair,
    cowboy shot, white shirt, floral print, off shoulder, outdoors, cherry
    blossom, tree shade, wariza, looking up, falling petals, half-closed eyes,
    white sky, clouds, live2d animation, upper body, high quality cinematic
    video of a woman sitting under a sakura tree. Dreamy and lonely, the camera
    close-ups on the face of the woman as she turns towards the viewer. The
    Camera is steady, This is a cowboy shot. The animation is smooth and fluid.
  parameters:
    negative_prompt: >-
      bad quality
      video,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走
  output:
    url: images/ComfyUI_00469_.webp
base_model: Wan-AI/Wan2.1-T2V-1.3B-Diffusers
instance_prompt: flat color, no lineart
license: apache-2.0
---
# Flat Color - Style
<Gallery />
## Model description
Flat Color - Style is a LoRA trained on images with flat colors, no visible lineart, and little to no indication of depth.
Reprinted from CivitAI: https://civitai.com/models/1132089?modelVersionId=1525407
Text-to-video previews were generated with the [ComfyUI_examples/wan/#text-to-video](https://comfyanonymous.github.io/ComfyUI_examples/wan/#text-to-video) workflow.
The LoRA was loaded with the `LoraLoaderModelOnly` node on top of the fp16 1.3B checkpoint `wan2.1_t2v_1.3B_fp16.safetensors`.
## Trigger words
Use both `flat color` and `no lineart` in the prompt to trigger the style.
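As a minimal sketch, the trigger phrases can be prepended to a prompt before it is passed to whichever pipeline or workflow is in use. The helper name `with_trigger_words` is hypothetical and not part of any library; it only performs string handling.

```python
# Trigger phrases listed in this model card.
TRIGGER_WORDS = ("flat color", "no lineart")


def with_trigger_words(prompt: str, triggers=TRIGGER_WORDS) -> str:
    """Prepend any trigger phrase that the prompt does not already contain."""
    prompt = prompt.strip()
    lowered = prompt.lower()
    # Keep only the triggers that are missing, preserving their order.
    missing = [t for t in triggers if t not in lowered]
    parts = missing + ([prompt] if prompt else [])
    return ", ".join(parts)


print(with_trigger_words("1girl, sakura miko, pink hair"))
```

A prompt that already contains both phrases is returned unchanged, so the helper is safe to apply more than once.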
## Download model
Weights for this model are available in Safetensors format.
[Download](/motimalu/wan-flat-color-1.3b-v2/tree/main) them in the Files & versions tab.
## Training Config
Trained with [diffusion-pipe](https://github.com/tdrussell/diffusion-pipe)
### dataset.toml
```
# Resolution settings.
resolutions = [512]
# Aspect ratio bucketing settings
enable_ar_bucket = true
min_ar = 0.5
max_ar = 2.0
num_ar_buckets = 7
# Frame buckets (1 is for images)
frame_buckets = [1]
[[directory]] # IMAGES
# Path to the directory containing images and their corresponding caption files.
path = '/mnt/d/huanvideo/training_data/images'
num_repeats = 5
resolutions = [720]
frame_buckets = [1] # Use 1 frame for images.
[[directory]] # VIDEOS
# Path to the directory containing videos and their corresponding caption files.
path = '/mnt/d/huanvideo/training_data/videos'
num_repeats = 5
resolutions = [512] # Video resolution for this directory (overrides the top-level setting).
frame_buckets = [6, 28, 31, 32, 36, 42, 43, 48, 50, 53]
```
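To make the aspect ratio settings above more concrete, the following sketch computes bucket values for `min_ar = 0.5`, `max_ar = 2.0`, and `num_ar_buckets = 7`. It assumes the buckets are evenly spaced in log space between the two limits and that each sample goes to the nearest bucket in log space; both function names are hypothetical, and the code is an illustration rather than diffusion-pipe's actual implementation.

```python
import math


def log_spaced_ar_buckets(min_ar: float, max_ar: float, num_buckets: int) -> list[float]:
    """Return aspect ratios (width / height) evenly spaced in log space."""
    if num_buckets < 2:
        return [min_ar]
    step = (math.log(max_ar) - math.log(min_ar)) / (num_buckets - 1)
    return [math.exp(math.log(min_ar) + i * step) for i in range(num_buckets)]


def nearest_bucket(width: int, height: int, buckets: list[float]) -> float:
    """Pick the bucket closest to the sample's aspect ratio (assumed rule)."""
    ar = width / height
    return min(buckets, key=lambda b: abs(math.log(b) - math.log(ar)))


# Values taken from dataset.toml above.
buckets = log_spaced_ar_buckets(0.5, 2.0, 7)
print([round(b, 3) for b in buckets])
print(round(nearest_bucket(512, 512, buckets), 3))
```

Under these assumptions the middle bucket is the square ratio 1.0, with three portrait and three landscape buckets on either side.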
### config.toml
```
# Output directory for training runs.
output_dir = '/mnt/d/wan/training_output'
# Dataset config file.
dataset = 'dataset.toml'
# Training settings
epochs = 50
micro_batch_size_per_gpu = 1
pipeline_stages = 1
gradient_accumulation_steps = 4
gradient_clipping = 1.0
warmup_steps = 100
# eval settings
eval_every_n_epochs = 5
eval_before_first_step = true
eval_micro_batch_size_per_gpu = 1
eval_gradient_accumulation_steps = 1
# misc settings
save_every_n_epochs = 5
checkpoint_every_n_minutes = 30
activation_checkpointing = true
partition_method = 'parameters'
save_dtype = 'bfloat16'
caching_batch_size = 1
steps_per_print = 1
video_clip_mode = 'single_middle'
[model]
type = 'wan'
ckpt_path = '../Wan2.1-T2V-1.3B'
dtype = 'bfloat16'
# You can use fp8 for the transformer when training LoRA.
transformer_dtype = 'float8'
timestep_sample_method = 'logit_normal'
[adapter]
type = 'lora'
rank = 32
dtype = 'bfloat16'
[optimizer]
type = 'adamw_optimi'
lr = 5e-5
betas = [0.9, 0.99]
weight_decay = 0.02
eps = 1e-8
```
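The batch settings above combine into an effective batch size per optimizer step. The sketch below shows the arithmetic, assuming a single GPU and a hypothetical dataset of 100 samples (the card does not state the real dataset size). The function names are illustrative, and bucketing may change the exact step count in practice.

```python
import math


def effective_batch_size(micro_batch_size_per_gpu: int,
                         gradient_accumulation_steps: int,
                         num_gpus: int = 1) -> int:
    """Samples contributing to one optimizer step."""
    return micro_batch_size_per_gpu * gradient_accumulation_steps * num_gpus


def optimizer_steps(num_samples: int, num_repeats: int,
                    epochs: int, batch_size: int) -> int:
    """Approximate total optimizer steps, ignoring per-bucket rounding."""
    steps_per_epoch = math.ceil(num_samples * num_repeats / batch_size)
    return steps_per_epoch * epochs


# micro_batch_size_per_gpu = 1 and gradient_accumulation_steps = 4 from config.toml.
batch = effective_batch_size(1, 4)
print(batch)

# num_repeats = 5 and epochs = 50 from the configs; 100 samples is an assumption.
print(optimizer_steps(num_samples=100, num_repeats=5, epochs=50, batch_size=batch))
```

With these numbers, `warmup_steps = 100` would cover a little under one epoch of the hypothetical 100-sample dataset.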