---
library_name: diffusers
tags:
- modular_diffusers
---
# Modular ChronoEdit
Modular implementation of [`nvidia/ChronoEdit-14B-Diffusers`](https://hf.co/nvidia/ChronoEdit-14B-Diffusers).
## Code
<details>
<summary>Unfold</summary>
```py
"""
Mimicked from https://huggingface.co/spaces/nvidia/ChronoEdit/blob/main/app.py
"""
from diffusers.modular_pipelines import WanModularPipeline, ModularPipelineBlocks
from diffusers.utils import load_image
from diffusers import UniPCMultistepScheduler
import torch
from PIL import Image
repo_id = "diffusers-internal-dev/chronoedit-modular"
blocks = ModularPipelineBlocks.from_pretrained(repo_id, trust_remote_code=True)
pipe = WanModularPipeline(blocks, repo_id)
pipe.load_components(
    trust_remote_code=True,
    device_map="cuda",
    torch_dtype={"default": torch.bfloat16, "image_encoder": torch.float32},
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config, flow_shift=2.0)
pipe.load_lora_weights("nvidia/ChronoEdit-14B-Diffusers", weight_name="lora/chronoedit_distill_lora.safetensors")
pipe.fuse_lora(lora_scale=1.0)
image = load_image("https://huggingface.co/spaces/nvidia/ChronoEdit/resolve/main/examples/3.png")
prompt = "Transform the image so that inside the floral teacup of steaming tea, a small, cute mouse is sitting and taking a bath; the mouse should look relaxed and cheerful, with a tiny white bath towel draped over its head as if enjoying a spa moment, while the steam rises gently around it, blending seamlessly with the warm and cozy atmosphere."
# image is resized within the pipeline unlike https://huggingface.co/spaces/nvidia/ChronoEdit/blob/main/app.py#L151
# refer to `ChronoEditImageInputStep`.
out = pipe(
    image=image,
    prompt=prompt,  # todo: enhance prompt
    num_inference_steps=8,  # todo: implement temporal reasoning
    num_frames=5,  # https://huggingface.co/spaces/nvidia/ChronoEdit/blob/main/app.py#L152
    output_type="np",
    generator=torch.manual_seed(0),
)
frames = out.values["videos"][0]
Image.fromarray((frames[-1] * 255).clip(0, 255).astype("uint8")).save("demo.png")
```
</details>
The code is also available in [example.py](./example.py).
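The pipeline returns the full frame stack (`num_frames=5` above), while the snippet only saves the last frame. If you want to inspect all the intermediate frames, here is a minimal sketch, assuming `frames` is the `(num_frames, H, W, 3)` float array in `[0, 1]` produced by the pipeline call; the helper name `frames_to_gif` is ours, not part of `diffusers`:

```python
import numpy as np
from PIL import Image

def frames_to_gif(frames, path, duration_ms=200):
    """Save a (num_frames, H, W, 3) float array in [0, 1] as an animated GIF."""
    pil_frames = [
        Image.fromarray((frame * 255).clip(0, 255).astype("uint8"))
        for frame in frames
    ]
    pil_frames[0].save(
        path,
        save_all=True,
        append_images=pil_frames[1:],
        duration=duration_ms,  # per-frame display time in milliseconds
        loop=0,  # loop forever
    )
```

For example, `frames_to_gif(frames, "demo.gif")` after the pipeline call writes all five frames as one animation.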
> [!TIP]
> Make sure `diffusers` is installed from source: `pip install git+https://github.com/huggingface/diffusers`.
## Results
<table>
<tr>
<td><img src="https://huggingface.co/spaces/nvidia/ChronoEdit/resolve/main/examples/3.png" alt="First Image"></td>
<td><img src="./demo.png" alt="Edited Image"></td>
</tr>
<caption><i>Transform the image so that inside the floral teacup of steaming tea, a small, cute mouse is sitting and taking a bath; the mouse should look relaxed and cheerful, with a tiny white bath towel draped over its head as if enjoying a spa moment, while the steam rises gently around it, blending seamlessly with the warm and cozy atmosphere</i>.</caption>
</table>
## Notes
1. This implementation doesn't include ChronoEdit's temporal reasoning step.
2. It doesn't use a separate prompt-enhancer model.