---
tags:
- text-to-image
- lora
- diffusers
- template:diffusion-lora
widget:
- text: >-
    flat color, no lineart, blending, negative space, artist:[john kafka|ponsuke
    kaikai|hara id 21|yoneyama mai|fuzichoco],  1girl, hoshimachi suisei,
    virtual youtuber, blue hair, side ponytail, cowboy shot, black shirt, star
    print, off shoulder, outdoors, starry sky, wariza, looking up, half-closed
    eyes, black sky,  live2d animation, upper body, high quality cinematic video
    of a woman sitting under the starry night sky. The Camera is steady, This is
    a cowboy shot. The animation is smooth and fluid.
  parameters:
    negative_prompt: >-
      bad quality
      video,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走
  output:
    url: images/ComfyUI_00455_.webp
- text: >-
    flat color, no lineart, blending, negative space, artist:[john kafka|ponsuke
    kaikai|hara id 21|yoneyama mai|fuzichoco],  1girl, sakura miko, pink hair,
    cowboy shot, white shirt, floral print, off shoulder, outdoors, cherry
    blossom, tree shade, wariza, looking up, falling petals, half-closed eyes,
    white sky, clouds,  live2d animation, upper body, high quality cinematic
    video of a woman sitting under a sakura tree. Dreamy and lonely, the camera
    close-ups on the face of the woman as she turns towards the viewer. The
    Camera is steady, This is a cowboy shot. The animation is smooth and fluid.
  parameters:
    negative_prompt: >-
      bad quality
      video,色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走
  output:
    url: images/ComfyUI_00469_.webp
base_model: Wan-AI/Wan2.1-T2V-1.3B-Diffusers
instance_prompt: flat color, no lineart
license: apache-2.0
---
# Flat Color - Style

<Gallery />

## Model description 

Flat Color - Style is a LoRA trained on images without visible lineart, with flat colors and little to no indication of depth.

Reprinted from CivitAI: https://civitai.com/models/1132089?modelVersionId=1525407

Text to Video previews generated with [ComfyUI_examples/wan/#text-to-video](https://comfyanonymous.github.io/ComfyUI_examples/wan/#text-to-video)

The LoRA is loaded with the LoraLoaderModelOnly node, using the fp16 1.3B checkpoint `wan2.1_t2v_1.3B_fp16.safetensors`.
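
If you prefer the Diffusers pipeline over ComfyUI, a minimal sketch could look like the one below. The generation parameters and prompt are illustrative assumptions, not values taken from this card:

```python
# Sketch: loading this LoRA on top of the Wan2.1 1.3B Diffusers pipeline.
# Sampling parameters and the prompt below are assumptions for illustration.
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-1.3B-Diffusers", torch_dtype=torch.bfloat16
)
# With a single LoRA .safetensors in the repo, no explicit weight_name should be needed.
pipe.load_lora_weights("motimalu/wan-flat-color-1.3b-v2")
pipe.to("cuda")

prompt = "flat color, no lineart, 1girl, looking up at a starry night sky"
frames = pipe(
    prompt=prompt,
    negative_prompt="bad quality video",
    num_frames=81,       # assumed frame count
    guidance_scale=5.0,  # assumed CFG scale
).frames[0]
export_to_video(frames, "flat_color_preview.mp4", fps=16)
```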

## Trigger words

You should use `flat color` and `no lineart` to trigger the image generation.


## Download model

Weights for this model are available in Safetensors format.

[Download](/motimalu/wan-flat-color-1.3b-v2/tree/main) them in the Files & versions tab.
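
To fetch the weights programmatically instead of through the web UI, one option (not part of the original card) is `huggingface_hub`:

```python
# Sketch: download the repo contents locally with huggingface_hub.
# snapshot_download pulls every file, including the LoRA .safetensors and preview images.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="motimalu/wan-flat-color-1.3b-v2")
print(local_dir)  # local path containing the downloaded files
```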


## Training Config

Trained with [diffusion-pipe](https://github.com/tdrussell/diffusion-pipe)

### dataset.toml
```
# Resolution settings.
resolutions = [512]

# Aspect ratio bucketing settings
enable_ar_bucket = true
min_ar = 0.5
max_ar = 2.0
num_ar_buckets = 7

# Frame buckets (1 is for images)
frame_buckets = [1]

[[directory]] # IMAGES
# Path to the directory containing images and their corresponding caption files.
path = '/mnt/d/huanvideo/training_data/images'
num_repeats = 5
resolutions = [720]
frame_buckets = [1] # Use 1 frame for images.

[[directory]] # VIDEOS
# Path to the directory containing videos and their corresponding caption files.
path = '/mnt/d/huanvideo/training_data/videos'
num_repeats = 5
resolutions = [512] # Video resolution bucket.
frame_buckets = [6, 28, 31, 32, 36, 42, 43, 48, 50, 53]
```
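
As the comments in `dataset.toml` note, each image or video is expected to sit next to a caption file with the same base name. A quick sanity check over the directories from the config above might look like this (an illustrative sketch, not part of diffusion-pipe itself):

```python
# Sketch: verify every image/video in a training directory has a matching .txt caption.
# The directory paths mirror dataset.toml above; adjust them to your setup.
from pathlib import Path

MEDIA_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".mp4", ".webm"}

def missing_captions(directory: str) -> list[Path]:
    root = Path(directory)
    media = [p for p in root.iterdir() if p.suffix.lower() in MEDIA_EXTS]
    return [p for p in media if not p.with_suffix(".txt").exists()]

for d in ["/mnt/d/huanvideo/training_data/images",
          "/mnt/d/huanvideo/training_data/videos"]:
    print(d, "missing captions:", missing_captions(d))
```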

### config.toml

```
# Output path and dataset config file.
output_dir = '/mnt/d/wan/training_output'
dataset = 'dataset.toml'

# Training settings
epochs = 50
micro_batch_size_per_gpu = 1
pipeline_stages = 1
gradient_accumulation_steps = 4
gradient_clipping = 1.0
warmup_steps = 100

# eval settings
eval_every_n_epochs = 5
eval_before_first_step = true
eval_micro_batch_size_per_gpu = 1
eval_gradient_accumulation_steps = 1

# misc settings
save_every_n_epochs = 5
checkpoint_every_n_minutes = 30
activation_checkpointing = true
partition_method = 'parameters'
save_dtype = 'bfloat16'
caching_batch_size = 1
steps_per_print = 1
video_clip_mode = 'single_middle'

[model]
type = 'wan'
ckpt_path = '../Wan2.1-T2V-1.3B'
dtype = 'bfloat16'
# You can use fp8 for the transformer when training LoRA.
transformer_dtype = 'float8'
timestep_sample_method = 'logit_normal'

[adapter]
type = 'lora'
rank = 32
dtype = 'bfloat16'

[optimizer]
type = 'adamw_optimi'
lr = 5e-5
betas = [0.9, 0.99]
weight_decay = 0.02
eps = 1e-8
```
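
For reference, the effective batch size implied by this config is `micro_batch_size_per_gpu × gradient_accumulation_steps × num_gpus`. A tiny sketch of that arithmetic follows; the GPU count and dataset size are assumptions, not values from this card:

```python
# Sketch: effective batch size and rough optimizer steps per epoch for the config above.
# num_gpus and num_examples are assumptions for illustration only.
micro_batch_size_per_gpu = 1
gradient_accumulation_steps = 4
num_gpus = 1        # assumed single-GPU run
num_examples = 500  # assumed example count after num_repeats expansion

effective_batch = micro_batch_size_per_gpu * gradient_accumulation_steps * num_gpus
steps_per_epoch = num_examples // effective_batch
print(f"effective batch size: {effective_batch}")          # 4
print(f"optimizer steps per epoch: ~{steps_per_epoch}")    # ~125 under these assumptions
```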