bubbliiiing committed · Commit f3d0c96 · 1 Parent(s): c3461c0
Update Readme
- README.md +88 -17
- README_en.md +93 -18
    	
        README.md
    CHANGED
    
    | @@ -143,6 +143,39 @@ Details for Linux: | |
| 143 |  | 
| 144 | 
             
            We need about 60GB of free disk space. Please check!
         | 
| 145 |  | 
| 146 | 
             
            #### b. Weight placement
         | 
| 147 | 
             
            Place the [weights](#model-zoo) at the specified paths:
         | 
| 148 |  | 
| @@ -161,8 +194,7 @@ EasyAnimateV5: | |
| 161 |  | 
| 162 | 
             
            ### EasyAnimateV5-12b-zh-InP
         | 
| 163 |  | 
| 164 | 
            -
             | 
| 165 | 
            -
             | 
| 166 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 167 | 
             
              <tr>
         | 
| 168 | 
             
                  <td>
         | 
| @@ -181,8 +213,6 @@ Resolution-1024 | |
| 181 | 
             
            </table>
         | 
| 182 |  | 
| 183 |  | 
| 184 | 
            -
            Resolution-768
         | 
| 185 | 
            -
             | 
| 186 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 187 | 
             
              <tr>
         | 
| 188 | 
             
                  <td>
         | 
| @@ -200,8 +230,6 @@ Resolution-768 | |
| 200 | 
             
              </tr>
         | 
| 201 | 
             
            </table>
         | 
| 202 |  | 
| 203 | 
            -
            Resolution-512
         | 
| 204 | 
            -
             | 
| 205 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 206 | 
             
              <tr>
         | 
| 207 | 
             
                  <td>
         | 
| @@ -219,6 +247,41 @@ Resolution-512 | |
| 219 | 
             
              </tr>
         | 
| 220 | 
             
            </table>
         | 
| 221 |  | 
| 222 | 
             
            ### EasyAnimateV5-12b-zh-Control
         | 
| 223 |  | 
| 224 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| @@ -364,6 +427,13 @@ sh scripts/train.sh | |
| 364 | 
             
            # Model zoo
         | 
| 365 | 
             
            EasyAnimateV5:
         | 
| 366 |  | 
| 367 | 
             
            | Name | Type | Storage Space | Hugging Face | Model Scope | Description |
         | 
| 368 | 
             
            |--|--|--|--|--|--|
         | 
| 369 | 
             
            | EasyAnimateV5-12b-zh-InP | EasyAnimateV5 | 34 GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV5-12b-zh-InP) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV5-12b-zh-InP) | Official image-to-video weights. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 49 frames at 8 frames per second, and supports bilingual prediction in Chinese and English. |
         | 
| @@ -373,29 +443,29 @@ EasyAnimateV5: | |
| 373 | 
             
            <details>
         | 
| 374 | 
             
              <summary>(Obsolete) EasyAnimateV4:</summary>
         | 
| 375 |  | 
| 376 | 
            -
            | Name | Type | Storage Space |  | 
| 377 | 
             
            |--|--|--|--|--|--|
         | 
| 378 | 
            -
            | EasyAnimateV4-XL-2-InP.tar.gz | EasyAnimateV4 | Before extraction: 8.9 GB / After extraction: 14.0 GB | [ | 
| 379 | 
             
            </details>
         | 
| 380 |  | 
| 381 | 
             
            <details>
         | 
| 382 | 
             
              <summary>(Obsolete) EasyAnimateV3:</summary>
         | 
| 383 |  | 
| 384 | 
            -
            | Name | Type | Storage Space |  | 
| 385 | 
             
            |--|--|--|--|--|--|
         | 
| 386 | 
            -
            | EasyAnimateV3-XL-2-InP-512x512.tar | EasyAnimateV3 | 18.2GB | 
| 387 | 
            -
            | EasyAnimateV3-XL-2-InP-768x768.tar | EasyAnimateV3 | 18.2GB | [ | 
| 388 | 
            -
            | EasyAnimateV3-XL-2-InP-960x960.tar | EasyAnimateV3 | 18.2GB | [ | 
| 389 | 
             
            </details>
         | 
| 390 |  | 
| 391 | 
             
            <details>
         | 
| 392 | 
             
              <summary>(Obsolete) EasyAnimateV2:</summary>
         | 
| 393 |  | 
| 394 | 
            -
            | Name | Type | Storage Space | Download | Hugging Face | Description |
         | 
| 395 | 
            -
             | 
| 396 | 
            -
            | EasyAnimateV2-XL-2-512x512.tar | EasyAnimateV2 | 16.2GB | [ | 
| 397 | 
            -
            | EasyAnimateV2-XL-2-768x768.tar | EasyAnimateV2 | 16.2GB | [ | 
| 398 | 
            -
            | easyanimatev2_minimalism_lora.safetensors | Lora of Pixart | 485.1MB | [Download](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/easyanimate/Personalized_Model/easyanimatev2_minimalism_lora.safetensors)| - | LoRA weights trained on images of a specific style. The images can be [downloaded here](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/webui/Minimalism.zip). |
         | 
| 399 | 
             
            </details>
         | 
| 400 |  | 
| 401 | 
             
            <details>
         | 
| @@ -426,6 +496,7 @@ EasyAnimateV5: | |
| 426 |  | 
| 427 | 
             
            # References
         | 
| 428 | 
             
            - CogVideo: https://github.com/THUDM/CogVideo/
         | 
|  | |
| 429 | 
             
            - magvit: https://github.com/google-research/magvit
         | 
| 430 | 
             
            - PixArt: https://github.com/PixArt-alpha/PixArt-alpha
         | 
| 431 | 
             
            - Open-Sora-Plan: https://github.com/PKU-YuanGroup/Open-Sora-Plan
         | 
|  | |
| 143 |  | 
| 144 | 
             
            We need about 60GB of free disk space. Please check!
         | 
| 145 |  | 
| 146 | 
            +
            The video sizes that EasyAnimateV5-12B can generate depend on the available GPU memory:
         | 
| 147 | 
            +
            | GPU memory |384x672x72|384x672x49|576x1008x25|576x1008x49|768x1344x25|768x1344x49|
         | 
| 148 | 
            +
            |----------|----------|----------|----------|----------|----------|----------|
         | 
| 149 | 
            +
            | 16GB | 🧡 | 🧡 | ❌ | ❌ | ❌ | ❌ | 
         | 
| 150 | 
            +
            | 24GB | 🧡 | 🧡 | 🧡 | 🧡 | ❌ | ❌ | 
         | 
| 151 | 
            +
            | 40GB | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | 
         | 
| 152 | 
            +
            | 80GB | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 
         | 
| 153 | 
            +
             | 
| 154 | 
            +
            ✅ means it can run with "model_cpu_offload", 🧡 means it can run with "model_cpu_offload_and_qfloat8", ⭕️ means it can run with "sequential_cpu_offload", and ❌ means it cannot run. Note that running with sequential_cpu_offload is slower.
         | 
| 155 | 
            +
             | 
| 156 | 
            +
            GPUs that do not support torch.bfloat16, such as the 2080ti and V100, can only run after weight_dtype in app.py and the predict files is changed to torch.float16.
         | 
| 157 | 
            +
             | 
| 158 | 
            +
            Generation times for EasyAnimateV5-12B over 25 steps on different GPUs:
         | 
| 159 | 
            +
            | GPU |384x672x72|384x672x49|576x1008x25|576x1008x49|768x1344x25|768x1344x49|
         | 
| 160 | 
            +
            |----------|----------|----------|----------|----------|----------|----------|
         | 
| 161 | 
            +
            | A10 24GB | ~120s (4.8s/it) | ~240s (9.6s/it) | ~320s (12.7s/it) | ~750s (29.8s/it) | ❌ | ❌ |
         | 
| 162 | 
            +
            | A100 80GB | ~45s (1.75s/it) | ~90s (3.7s/it) | ~120s (4.7s/it) | ~300s (11.4s/it) | ~265s (10.6s/it) | ~710s (28.3s/it) |
         | 
| 163 | 
            +
             | 
| 164 | 
            +
            (⭕️) means it can run with low_gpu_memory_mode=True, but more slowly; ❌ means it cannot run.
         | 
| 165 | 
            +
             | 
| 166 | 
            +
            <details>
         | 
| 167 | 
            +
              <summary>(Obsolete) EasyAnimateV3:</summary>
         | 
| 168 | 
            +
             | 
| 169 | 
            +
            The video sizes that EasyAnimateV3 can generate depend on the available GPU memory:
         | 
| 170 | 
            +
            | GPU memory | 384x672x72 | 384x672x144 | 576x1008x72 | 576x1008x144 | 720x1280x72 | 720x1280x144 |
         | 
| 171 | 
            +
            |----------|----------|----------|----------|----------|----------|----------|
         | 
| 172 | 
            +
            | 12GB | ⭕️ | ⭕️ | ⭕️ | ⭕️ | ❌ | ❌ |
         | 
| 173 | 
            +
            | 16GB | ✅ | ✅ | ⭕️ | ⭕️ | ⭕️ | ❌ |
         | 
| 174 | 
            +
            | 24GB | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
         | 
| 175 | 
            +
            | 40GB | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
         | 
| 176 | 
            +
            | 80GB | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
         | 
| 177 | 
            +
            </details>
         | 
| 178 | 
            +
             | 
| 179 | 
             
            #### b. Weight placement
         | 
| 180 | 
             
            Place the [weights](#model-zoo) at the specified paths:
         | 
| 181 |  | 
|  | |
| 194 |  | 
| 195 | 
             
            ### EasyAnimateV5-12b-zh-InP
         | 
| 196 |  | 
| 197 | 
            +
            #### I2V
         | 
|  | |
| 198 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 199 | 
             
              <tr>
         | 
| 200 | 
             
                  <td>
         | 
|  | |
| 213 | 
             
            </table>
         | 
| 214 |  | 
| 215 |  | 
|  | |
|  | |
| 216 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 217 | 
             
              <tr>
         | 
| 218 | 
             
                  <td>
         | 
|  | |
| 230 | 
             
              </tr>
         | 
| 231 | 
             
            </table>
         | 
| 232 |  | 
|  | |
|  | |
| 233 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 234 | 
             
              <tr>
         | 
| 235 | 
             
                  <td>
         | 
|  | |
| 247 | 
             
              </tr>
         | 
| 248 | 
             
            </table>
         | 
| 249 |  | 
| 250 | 
            +
            #### T2V
         | 
| 251 | 
            +
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 252 | 
            +
              <tr>
         | 
| 253 | 
            +
                  <td>
         | 
| 254 | 
            +
                      <video src="https://github.com/user-attachments/assets/eccb0797-4feb-48e9-91d3-5769ce30142b" width="100%" controls autoplay loop></video>
         | 
| 255 | 
            +
                  </td>
         | 
| 256 | 
            +
                  <td>
         | 
| 257 | 
            +
                      <video src="https://github.com/user-attachments/assets/76b3db64-9c7a-4d38-8854-dba940240ceb" width="100%" controls autoplay loop></video>
         | 
| 258 | 
            +
                  </td>
         | 
| 259 | 
            +
                   <td>
         | 
| 260 | 
            +
                      <video src="https://github.com/user-attachments/assets/0b8fab66-8de7-44ff-bd43-8f701bad6bb7" width="100%" controls autoplay loop></video>
         | 
| 261 | 
            +
                 </td>
         | 
| 262 | 
            +
                  <td>
         | 
| 263 | 
            +
                      <video src="https://github.com/user-attachments/assets/9fbddf5f-7fcd-4cc6-9d7c-3bdf1d4ce59e" width="100%" controls autoplay loop></video>
         | 
| 264 | 
            +
                 </td>
         | 
| 265 | 
            +
              </tr>
         | 
| 266 | 
            +
            </table>
         | 
| 267 | 
            +
             | 
| 268 | 
            +
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 269 | 
            +
              <tr>
         | 
| 270 | 
            +
                  <td>
         | 
| 271 | 
            +
                      <video src="https://github.com/user-attachments/assets/19c1742b-e417-45ac-97d6-8bf3a80d8e13" width="100%" controls autoplay loop></video>
         | 
| 272 | 
            +
                  </td>
         | 
| 273 | 
            +
                  <td>
         | 
| 274 | 
            +
                      <video src="https://github.com/user-attachments/assets/641e56c8-a3d9-489d-a3a6-42c50a9aeca1" width="100%" controls autoplay loop></video>
         | 
| 275 | 
            +
                  </td>
         | 
| 276 | 
            +
                   <td>
         | 
| 277 | 
            +
                      <video src="https://github.com/user-attachments/assets/2b16be76-518b-44c6-a69b-5c49d76df365" width="100%" controls autoplay loop></video>
         | 
| 278 | 
            +
                 </td>
         | 
| 279 | 
            +
                  <td>
         | 
| 280 | 
            +
                      <video src="https://github.com/user-attachments/assets/e7d9c0fc-136f-405c-9fab-629389e196be" width="100%" controls autoplay loop></video>
         | 
| 281 | 
            +
                 </td>
         | 
| 282 | 
            +
              </tr>
         | 
| 283 | 
            +
            </table>
         | 
| 284 | 
            +
             | 
| 285 | 
             
            ### EasyAnimateV5-12b-zh-Control
         | 
| 286 |  | 
| 287 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
|  | |
| 427 | 
             
            # Model zoo
         | 
| 428 | 
             
            EasyAnimateV5:
         | 
| 429 |  | 
| 430 | 
            +
            7B:
         | 
| 431 | 
            +
            | Name | Type | Storage Space | Hugging Face | Model Scope | Description |
         | 
| 432 | 
            +
            |--|--|--|--|--|--|
         | 
| 433 | 
            +
            | EasyAnimateV5-7b-zh-InP | EasyAnimateV5 | 22 GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV5-7b-zh-InP) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV5-7b-zh-InP) | Official 7B image-to-video weights. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 49 frames at 8 frames per second, and supports bilingual prediction in Chinese and English. |
         | 
| 434 | 
            +
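            | EasyAnimateV5-7b-zh | EasyAnimateV5 | 22 GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV5-7b-zh) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV5-12b-zh) | Official 7B text-to-video weights. Can be used for fine-tuning on downstream tasks. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 49 frames at 8 frames per second, and supports bilingual prediction in Chinese and English. |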
         | 
| 435 | 
            +
             | 
| 436 | 
            +
            12B:
         | 
| 437 | 
             
            | Name | Type | Storage Space | Hugging Face | Model Scope | Description |
         | 
| 438 | 
             
            |--|--|--|--|--|--|
         | 
| 439 | 
             
            | EasyAnimateV5-12b-zh-InP | EasyAnimateV5 | 34 GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV5-12b-zh-InP) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV5-12b-zh-InP) | Official image-to-video weights. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 49 frames at 8 frames per second, and supports bilingual prediction in Chinese and English. |
         | 
|  | |
| 443 | 
             
            <details>
         | 
| 444 | 
             
              <summary>(Obsolete) EasyAnimateV4:</summary>
         | 
| 445 |  | 
| 446 | 
            +
            | Name | Type | Storage Space | Hugging Face | Model Scope | Description |
         | 
| 447 | 
             
            |--|--|--|--|--|--|
         | 
| 448 | 
            +
            | EasyAnimateV4-XL-2-InP.tar.gz | EasyAnimateV4 | Before extraction: 8.9 GB / After extraction: 14.0 GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV4-XL-2-InP) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV4-XL-2-InP) | Official image-to-video weights. Supports video prediction at multiple resolutions (512, 768, 1024, 1280), trained with 144 frames at 24 frames per second. |
         | 
| 449 | 
             
            </details>
         | 
| 450 |  | 
| 451 | 
             
            <details>
         | 
| 452 | 
             
              <summary>(Obsolete) EasyAnimateV3:</summary>
         | 
| 453 |  | 
| 454 | 
            +
            | Name | Type | Storage Space | Hugging Face | Model Scope | Description |
         | 
| 455 | 
             
            |--|--|--|--|--|--|
         | 
| 456 | 
            +
            | EasyAnimateV3-XL-2-InP-512x512.tar | EasyAnimateV3 | 18.2GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV3-XL-2-InP-512x512) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV3-XL-2-InP-512x512) | Official image-to-video weights at 512x512 resolution, trained with 144 frames at 24 frames per second. |
         | 
| 457 | 
            +
            | EasyAnimateV3-XL-2-InP-768x768.tar | EasyAnimateV3 | 18.2GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV3-XL-2-InP-768x768) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV3-XL-2-InP-768x768) | Official image-to-video weights at 768x768 resolution, trained with 144 frames at 24 frames per second. |
         | 
| 458 | 
            +
            | EasyAnimateV3-XL-2-InP-960x960.tar | EasyAnimateV3 | 18.2GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV3-XL-2-InP-960x960) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV3-XL-2-InP-960x960) | Official image-to-video weights at 960x960 (720P) resolution, trained with 144 frames at 24 frames per second. |
         | 
| 459 | 
             
            </details>
         | 
| 460 |  | 
| 461 | 
             
            <details>
         | 
| 462 | 
             
              <summary>(Obsolete) EasyAnimateV2:</summary>
         | 
| 463 |  | 
| 464 | 
            +
            | Name | Type | Storage Space | Download | Hugging Face | Model Scope | Description |
         | 
| 465 | 
            +
            |--|--|--|--|--|--|--|
         | 
| 466 | 
            +
            | EasyAnimateV2-XL-2-512x512.tar | EasyAnimateV2 | 16.2GB | - | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV2-XL-2-512x512) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV2-XL-2-512x512) | Official weights at 512x512 resolution, trained with 144 frames at 24 frames per second. |
         | 
| 467 | 
            +
            | EasyAnimateV2-XL-2-768x768.tar | EasyAnimateV2 | 16.2GB | - | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV2-XL-2-768x768) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV2-XL-2-768x768) | Official weights at 768x768 resolution, trained with 144 frames at 24 frames per second. |
         | 
| 468 | 
            +
            | easyanimatev2_minimalism_lora.safetensors | Lora of Pixart | 485.1MB | [Download](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/easyanimate/Personalized_Model/easyanimatev2_minimalism_lora.safetensors)| - | - | LoRA weights trained on images of a specific style. The images can be [downloaded here](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/webui/Minimalism.zip). |
         | 
| 469 | 
             
            </details>
         | 
| 470 |  | 
| 471 | 
             
            <details>
         | 
|  | |
| 496 |  | 
| 497 | 
             
            # References
         | 
| 498 | 
             
            - CogVideo: https://github.com/THUDM/CogVideo/
         | 
| 499 | 
            +
            - Flux: https://github.com/black-forest-labs/flux
         | 
| 500 | 
             
            - magvit: https://github.com/google-research/magvit
         | 
| 501 | 
             
            - PixArt: https://github.com/PixArt-alpha/PixArt-alpha
         | 
| 502 | 
             
            - Open-Sora-Plan: https://github.com/PKU-YuanGroup/Open-Sora-Plan
         | 
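The GPU memory table and the bfloat16 note added in this commit can be read as a simple lookup. The sketch below is illustrative only and is not part of the EasyAnimate codebase: the function names `pick_memory_mode` and `pick_weight_dtype` are hypothetical, and rounding a GPU down to the nearest table row (for example, treating a 48GB card as the 40GB row) is an assumption the README does not state. The mode strings and table entries are copied from the README text.

```python
# Mode names exactly as they appear in the README legend.
OFFLOAD = "model_cpu_offload"
OFFLOAD_QFLOAT8 = "model_cpu_offload_and_qfloat8"

# Columns of the EasyAnimateV5-12B memory table, in the table's own order.
SIZES = [
    "384x672x72", "384x672x49",
    "576x1008x25", "576x1008x49",
    "768x1344x25", "768x1344x49",
]

# Rows of the table keyed by GPU memory in GB; None stands for the ❌ cells.
SUPPORT = {
    16: [OFFLOAD_QFLOAT8] * 2 + [None] * 4,
    24: [OFFLOAD_QFLOAT8] * 4 + [None] * 2,
    40: [OFFLOAD] * 4 + [None] * 2,
    80: [OFFLOAD] * 6,
}


def pick_memory_mode(gpu_memory_gb, size):
    """Return the mode the table lists for this GPU and size, or None.

    Assumption (not from the README): a GPU is matched to the largest
    table row that does not exceed its memory.
    """
    rows = [gb for gb in SUPPORT if gb <= gpu_memory_gb]
    if not rows:
        return None  # less memory than the smallest row in the table
    return SUPPORT[max(rows)][SIZES.index(size)]


def pick_weight_dtype(supports_bfloat16):
    """Return the dtype name to use for weight_dtype, per the README note."""
    return "torch.bfloat16" if supports_bfloat16 else "torch.float16"


if __name__ == "__main__":
    print(pick_memory_mode(24, "576x1008x49"))
    print(pick_weight_dtype(supports_bfloat16=False))
```

For a V100 or 2080ti, `pick_weight_dtype(False)` gives the value the README says to set in app.py and the predict files.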
    	
        README_en.md
    CHANGED
    
    | @@ -112,6 +112,41 @@ The detailed of Linux: | |
| 112 | 
             
            - GPU:Nvidia-V100 16G & Nvidia-A10 24G & Nvidia-A100 40G & Nvidia-A100 80G
         | 
| 113 |  | 
| 114 | 
             
            We need about 60GB available on disk (for saving weights), please check!
         | 
| 115 |  | 
| 116 | 
             
            #### b. Weights
         | 
| 117 | 
             
            Place the [weights](#model-zoo) at the specified paths:
         | 
| @@ -131,8 +166,7 @@ The results displayed are all based on image. | |
| 131 |  | 
| 132 | 
             
            ### EasyAnimateV5-12b-zh-InP
         | 
| 133 |  | 
| 134 | 
            -
             | 
| 135 | 
            -
             | 
| 136 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 137 | 
             
              <tr>
         | 
| 138 | 
             
                  <td>
         | 
| @@ -151,8 +185,6 @@ Resolution-1024 | |
| 151 | 
             
            </table>
         | 
| 152 |  | 
| 153 |  | 
| 154 | 
            -
            Resolution-768
         | 
| 155 | 
            -
             | 
| 156 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 157 | 
             
              <tr>
         | 
| 158 | 
             
                  <td>
         | 
| @@ -170,8 +202,6 @@ Resolution-768 | |
| 170 | 
             
              </tr>
         | 
| 171 | 
             
            </table>
         | 
| 172 |  | 
| 173 | 
            -
            Resolution-512
         | 
| 174 | 
            -
             | 
| 175 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 176 | 
             
              <tr>
         | 
| 177 | 
             
                  <td>
         | 
| @@ -189,6 +219,41 @@ Resolution-512 | |
| 189 | 
             
              </tr>
         | 
| 190 | 
             
            </table>
         | 
| 191 |  | 
| 192 | 
             
            ### EasyAnimateV5-12b-zh-Control
         | 
| 193 |  | 
| 194 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| @@ -335,6 +400,13 @@ For details on setting some parameters, please refer to [Readme Train](scripts/R | |
| 335 |  | 
| 336 | 
             
            EasyAnimateV5:
         | 
| 337 |  | 
| 338 | 
             
            | Name | Type | Storage Space | Hugging Face | Model Scope | Description |
         | 
| 339 | 
             
            |--|--|--|--|--|--|
         | 
| 340 | 
             
            | EasyAnimateV5-12b-zh-InP | EasyAnimateV5 | 34 GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV5-12b-zh-InP) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV5-12b-zh-InP) | Official image-to-video weights. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 49 frames at 8 frames per second, and supports bilingual prediction in Chinese and English. |
         | 
| @@ -344,28 +416,29 @@ EasyAnimateV5: | |
| 344 | 
             
            <details>
         | 
| 345 | 
             
              <summary>(Obsolete) EasyAnimateV4:</summary>
         | 
| 346 |  | 
| 347 | 
            -
            | Name | Type | Storage Space |  | 
| 348 | 
             
            |--|--|--|--|--|--|
         | 
| 349 | 
            -
            | EasyAnimateV4-XL-2-InP.tar.gz | EasyAnimateV4 | Before extraction: 8.9 GB / After extraction: 14.0 GB | | 
| 350 | 
             
            </details>
         | 
| 351 |  | 
| 352 | 
             
            <details>
         | 
| 353 | 
             
              <summary>(Obsolete) EasyAnimateV3:</summary>
         | 
| 354 |  | 
| 355 | 
            -
            | Name | Type | Storage Space |  | 
| 356 | 
             
            |--|--|--|--|--|--|
         | 
| 357 | 
            -
            | EasyAnimateV3-XL-2-InP-512x512.tar | EasyAnimateV3 | 18.2GB | [ | 
| 358 | 
            -
            | EasyAnimateV3-XL-2-InP-768x768.tar | EasyAnimateV3 | 18.2GB | [ | 
| 359 | 
            -
            | EasyAnimateV3-XL-2-InP-960x960.tar | EasyAnimateV3 | 18.2GB | [ | 
| 360 | 
             
            </details>
         | 
| 361 |  | 
| 362 | 
             
            <details>
         | 
| 363 | 
             
              <summary>(Obsolete) EasyAnimateV2:</summary>
         | 
| 364 | 
            -
             | 
| 365 | 
            -
             | 
| 366 | 
            -
             | 
| 367 | 
            -
            | EasyAnimateV2-XL-2- | 
| 368 | 
            -
            |  | 
|  | |
| 369 | 
             
            </details>
         | 
| 370 |  | 
| 371 | 
             
            <details>
         | 
| @@ -397,6 +470,8 @@ EasyAnimateV5: | |
| 397 |  | 
| 398 |  | 
| 399 | 
             
            # Reference
         | 
|  | |
|  | |
| 400 | 
             
            - magvit: https://github.com/google-research/magvit
         | 
| 401 | 
             
            - PixArt: https://github.com/PixArt-alpha/PixArt-alpha
         | 
| 402 | 
             
            - Open-Sora-Plan: https://github.com/PKU-YuanGroup/Open-Sora-Plan
         | 
| @@ -406,4 +481,4 @@ EasyAnimateV5: | |
| 406 | 
             
            - HunYuan DiT: https://github.com/tencent/HunyuanDiT
         | 
| 407 |  | 
| 408 | 
             
            # License
         | 
| 409 | 
            -
            This project is licensed under the [Apache License (Version 2.0)](https://github.com/modelscope/modelscope/blob/master/LICENSE).
         | 
|  | |
| 112 | 
             
            - GPU:Nvidia-V100 16G & Nvidia-A10 24G & Nvidia-A100 40G & Nvidia-A100 80G
         | 
| 113 |  | 
| 114 | 
             
            We need about 60GB available on disk (for saving weights), please check!
         | 
| 115 | 
            +
            The video sizes that EasyAnimateV5-12B can generate depend on the available GPU memory:
         | 
| 116 | 
            +
             | 
| 117 | 
            +
            | GPU memory | 384x672x72 | 384x672x49 | 576x1008x25 | 576x1008x49 | 768x1344x25 | 768x1344x49 |
         | 
| 118 | 
            +
            |------------|------------|------------|------------|------------|------------|------------|
         | 
| 119 | 
            +
            | 16GB       | 🧡         | 🧡         | ❌         | ❌         | ❌         | ❌         |
         | 
| 120 | 
            +
            | 24GB       | 🧡         | 🧡         | 🧡         | 🧡         | ❌         | ❌         |
         | 
| 121 | 
            +
            | 40GB       | ✅         | ✅         | ✅         | ✅         | ❌         | ❌         |
         | 
| 122 | 
            +
            | 80GB       | ✅         | ✅         | ✅         | ✅         | ✅         | ✅         |
         | 
| 123 | 
            +
             | 
| 124 | 
            +
            ✅ indicates it can run under "model_cpu_offload", 🧡 represents it can run under "model_cpu_offload_and_qfloat8", ⭕️ indicates it can run under "sequential_cpu_offload", ❌ means it can't run. Please note that running with sequential_cpu_offload will be slower.
         | 
| 125 | 
            +
             | 
| 126 | 
            +
            GPUs that do not support torch.bfloat16, such as the 2080ti and V100, can only run after weight_dtype in app.py and the predict files is changed to torch.float16.
         | 
| 127 | 
            +
             | 
| 128 | 
            +
            The generation time for EasyAnimateV5-12B using different GPUs over 25 steps is as follows:
         | 
| 129 | 
            +
             | 
| 130 | 
            +
            | GPU       | 384x672x72       | 384x672x49       | 576x1008x25      | 576x1008x49      | 768x1344x25      | 768x1344x49     |
         | 
| 131 | 
            +
            |-----------|------------------|------------------|------------------|------------------|------------------|-----------------|
         | 
| 132 | 
            +
            | A10 24GB  | ~120s (4.8s/it)  | ~240s (9.6s/it)  | ~320s (12.7s/it) | ~750s (29.8s/it) | ❌               | ❌              |
         | 
| 133 | 
            +
            | A100 80GB | ~45s (1.75s/it)  | ~90s (3.7s/it)   | ~120s (4.7s/it)  | ~300s (11.4s/it) | ~265s (10.6s/it) | ~710s (28.3s/it) |
         | 
| 134 | 
            +
             | 
| 135 | 
            +
            (⭕️) indicates it can run with low_gpu_memory_mode=True, but at a slower speed, and ❌ means it can't run.
         | 
| 136 | 
            +
             | 
| 137 | 
            +
            <details>
         | 
| 138 | 
            +
              <summary>(Obsolete) EasyAnimateV3:</summary>
         | 
| 139 | 
            +
              
         | 
| 140 | 
            +
            The video sizes that EasyAnimateV3 can generate depend on the available GPU memory:
         | 
| 141 | 
            +
             | 
| 142 | 
            +
            | GPU memory | 384x672x72 | 384x672x144 | 576x1008x72 | 576x1008x144 | 720x1280x72 | 720x1280x144 |
         | 
| 143 | 
            +
            |------------|------------|-------------|-------------|--------------|-------------|--------------|
         | 
| 144 | 
            +
            | 12GB       | ⭕️         | ⭕️          | ⭕️          | ⭕️           | ❌          | ❌           |
         | 
| 145 | 
            +
            | 16GB       | ✅         | ✅          | ⭕️          | ⭕️           | ⭕️          | ❌           |
         | 
| 146 | 
            +
            | 24GB       | ✅         | ✅          | ✅          | ✅           | ✅          | ❌           |
         | 
| 147 | 
            +
            | 40GB       | ✅         | ✅          | ✅          | ✅           | ✅          | ✅           |
         | 
| 148 | 
            +
            | 80GB       | ✅         | ✅          | ✅          | ✅           | ✅          | ✅           |
         | 
| 149 | 
            +
            </details>
         | 
| 150 |  | 
| 151 | 
             
            #### b. Weights
         | 
| 152 | 
             
            Place the [weights](#model-zoo) at the specified paths:
         | 
|  | |
| 166 |  | 
| 167 | 
             
            ### EasyAnimateV5-12b-zh-InP
         | 
| 168 |  | 
| 169 | 
            +
            #### I2V
         | 
|  | |
| 170 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 171 | 
             
              <tr>
         | 
| 172 | 
             
                  <td>
         | 
|  | |
| 185 | 
             
            </table>
         | 
| 186 |  | 
| 187 |  | 
|  | |
|  | |
| 188 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 189 | 
             
              <tr>
         | 
| 190 | 
             
                  <td>
         | 
|  | |
| 202 | 
             
              </tr>
         | 
| 203 | 
             
            </table>
         | 
| 204 |  | 
|  | |
|  | |
| 205 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 206 | 
             
              <tr>
         | 
| 207 | 
             
                  <td>
         | 
|  | |
| 219 | 
             
              </tr>
         | 
| 220 | 
             
            </table>
         | 
| 221 |  | 
| 222 | 
            +
            #### T2V
         | 
| 223 | 
            +
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 224 | 
            +
              <tr>
         | 
| 225 | 
            +
                  <td>
         | 
| 226 | 
            +
                      <video src="https://github.com/user-attachments/assets/eccb0797-4feb-48e9-91d3-5769ce30142b" width="100%" controls autoplay loop></video>
         | 
| 227 | 
            +
                  </td>
         | 
| 228 | 
            +
                  <td>
         | 
| 229 | 
            +
                      <video src="https://github.com/user-attachments/assets/76b3db64-9c7a-4d38-8854-dba940240ceb" width="100%" controls autoplay loop></video>
         | 
| 230 | 
            +
                  </td>
         | 
| 231 | 
            +
                   <td>
         | 
| 232 | 
            +
                      <video src="https://github.com/user-attachments/assets/0b8fab66-8de7-44ff-bd43-8f701bad6bb7" width="100%" controls autoplay loop></video>
         | 
| 233 | 
            +
                 </td>
         | 
| 234 | 
            +
                  <td>
         | 
| 235 | 
            +
                      <video src="https://github.com/user-attachments/assets/9fbddf5f-7fcd-4cc6-9d7c-3bdf1d4ce59e" width="100%" controls autoplay loop></video>
         | 
| 236 | 
            +
                 </td>
         | 
| 237 | 
            +
              </tr>
         | 
| 238 | 
            +
            </table>
         | 
| 239 | 
            +
             | 
| 240 | 
            +
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
| 241 | 
            +
              <tr>
         | 
| 242 | 
            +
                  <td>
         | 
| 243 | 
            +
                      <video src="https://github.com/user-attachments/assets/19c1742b-e417-45ac-97d6-8bf3a80d8e13" width="100%" controls autoplay loop></video>
         | 
| 244 | 
            +
                  </td>
         | 
| 245 | 
            +
                  <td>
         | 
| 246 | 
            +
                      <video src="https://github.com/user-attachments/assets/641e56c8-a3d9-489d-a3a6-42c50a9aeca1" width="100%" controls autoplay loop></video>
         | 
| 247 | 
            +
                  </td>
         | 
| 248 | 
            +
                   <td>
         | 
| 249 | 
            +
                      <video src="https://github.com/user-attachments/assets/2b16be76-518b-44c6-a69b-5c49d76df365" width="100%" controls autoplay loop></video>
         | 
| 250 | 
            +
                 </td>
         | 
| 251 | 
            +
                  <td>
         | 
| 252 | 
            +
                      <video src="https://github.com/user-attachments/assets/e7d9c0fc-136f-405c-9fab-629389e196be" width="100%" controls autoplay loop></video>
         | 
| 253 | 
            +
                 </td>
         | 
| 254 | 
            +
              </tr>
         | 
| 255 | 
            +
            </table>
         | 
| 256 | 
            +
             | 
| 257 | 
             
            ### EasyAnimateV5-12b-zh-Control
         | 
| 258 |  | 
| 259 | 
             
            <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
         | 
|  | |
| 400 |  | 
| 401 | 
             
            EasyAnimateV5:
         | 
| 402 |  | 
| 403 | 
            +
            7B:
         | 
| 404 | 
            +
            | Name | Type | Storage Space | Hugging Face | Model Scope | Description |
         | 
| 405 | 
            +
            |--|--|--|--|--|--|
         | 
| 406 | 
            +
            | EasyAnimateV5-7b-zh-InP | EasyAnimateV5 | 22 GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV5-7b-zh-InP) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV5-7b-zh-InP) | Official 7B image-to-video weights. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 49 frames at 8 frames per second, and supports bilingual prediction in Chinese and English. |
         | 
| 407 | 
            +
            | EasyAnimateV5-7b-zh | EasyAnimateV5 | 22 GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV5-7b-zh) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV5-7b-zh) | Official 7B text-to-video weights. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 49 frames at 8 frames per second, and supports bilingual prediction in Chinese and English. |
         | 
| 408 | 
            +
             | 
| 409 | 
            +
            12B:
         | 
| 410 | 
             
            | Name | Type | Storage Space | Hugging Face | Model Scope | Description |
         | 
| 411 | 
             
            |--|--|--|--|--|--|
         | 
| 412 | 
             
            | EasyAnimateV5-12b-zh-InP | EasyAnimateV5 | 34 GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV5-12b-zh-InP) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV5-12b-zh-InP) | Official image-to-video weights. Supports video prediction at multiple resolutions (512, 768, 1024), trained with 49 frames at 8 frames per second, and supports bilingual prediction in Chinese and English. |
         | 
|  | |
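The storage sizes and Hugging Face links in the EasyAnimateV5 tables can be turned into a small pre-download check. The following is a minimal sketch, not part of the official tooling: the helper names (`hf_repo_id`, `has_room`, `download`) and the `target_dir` argument are hypothetical, sizes are treated as decimal gigabytes (an assumption), and the download step relies on the third-party `huggingface_hub` package.

```python
import shutil

# Storage sizes in GB, copied from the EasyAnimateV5 tables above.
MODEL_SIZES_GB = {
    "EasyAnimateV5-7b-zh-InP": 22,
    "EasyAnimateV5-7b-zh": 22,
    "EasyAnimateV5-12b-zh-InP": 34,
}


def hf_repo_id(name: str) -> str:
    """Build the Hugging Face repo ID used by the table links (alibaba-pai/<name>)."""
    return f"alibaba-pai/{name}"


def has_room(name: str, free_bytes: int) -> bool:
    """Return True if free_bytes covers the listed size (assumes 1 GB = 10**9 bytes)."""
    return free_bytes >= MODEL_SIZES_GB[name] * 10**9


def download(name: str, target_dir: str) -> str:
    """Download one checkpoint into target_dir (a path chosen by the caller)."""
    # Imported here so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    # Checks free space on the current working directory's filesystem.
    free = shutil.disk_usage(".").free
    if not has_room(name, free):
        raise RuntimeError(f"Not enough free disk space for {name}")
    return snapshot_download(repo_id=hf_repo_id(name), local_dir=target_dir)
```

For example, `download("EasyAnimateV5-12b-zh-InP", "<your weights directory>")` would fetch the 12B image-to-video weights, where the directory should follow the layout described in the Weights section.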
| 416 | 
             
            <details>
         | 
| 417 | 
             
              <summary>(Obsolete) EasyAnimateV4:</summary>
         | 
| 418 |  | 
| 419 | 
            +
            | Name | Type | Storage Space | Hugging Face | Model Scope | Description |
         | 
| 420 | 
             
            |--|--|--|--|--|--|
         | 
| 421 | 
            +
            | EasyAnimateV4-XL-2-InP.tar.gz | EasyAnimateV4 | Before extraction: 8.9 GB / After extraction: 14.0 GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV4-XL-2-InP) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV4-XL-2-InP) | Official image-to-video weights. Supports video prediction at multiple resolutions (512, 768, 1024, 1280), trained with 144 frames at 24 frames per second. |
         | 
| 422 | 
             
            </details>
         | 
| 423 |  | 
| 424 | 
             
            <details>
         | 
| 425 | 
             
              <summary>(Obsolete) EasyAnimateV3:</summary>
         | 
| 426 |  | 
| 427 | 
            +
            | Name | Type | Storage Space | Hugging Face | Model Scope | Description |
         | 
| 428 | 
             
            |--|--|--|--|--|--|
         | 
| 429 | 
            +
            | EasyAnimateV3-XL-2-InP-512x512.tar | EasyAnimateV3 | 18.2GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV3-XL-2-InP-512x512) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV3-XL-2-InP-512x512) | Official EasyAnimateV3 weights for text-to-video and image-to-video at 512x512 resolution. Trained with 144 frames at 24 fps. |
         | 
| 430 | 
            +
            | EasyAnimateV3-XL-2-InP-768x768.tar | EasyAnimateV3 | 18.2GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV3-XL-2-InP-768x768) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV3-XL-2-InP-768x768) | Official EasyAnimateV3 weights for text-to-video and image-to-video at 768x768 resolution. Trained with 144 frames at 24 fps. |
         | 
| 431 | 
            +
            | EasyAnimateV3-XL-2-InP-960x960.tar | EasyAnimateV3 | 18.2GB | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV3-XL-2-InP-960x960) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV3-XL-2-InP-960x960) | Official EasyAnimateV3 weights for text-to-video and image-to-video at 960x960 resolution. Trained with 144 frames at 24 fps. |
         | 
| 432 | 
             
            </details>
         | 
| 433 |  | 
| 434 | 
             
            <details>
         | 
| 435 | 
             
              <summary>(Obsolete) EasyAnimateV2:</summary>
         | 
| 436 | 
            +
             | 
| 437 | 
            +
            | Name | Type | Storage Space | Url | Hugging Face | Model Scope | Description |
         | 
| 438 | 
            +
            |--|--|--|--|--|--|--|
         | 
| 439 | 
            +
            | EasyAnimateV2-XL-2-512x512.tar | EasyAnimateV2 | 16.2GB | - | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV2-XL-2-512x512) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV2-XL-2-512x512) | Official EasyAnimateV2 weights for 512x512 resolution. Trained with 144 frames at 24 fps. |
         | 
| 440 | 
            +
            | EasyAnimateV2-XL-2-768x768.tar | EasyAnimateV2 | 16.2GB | - | [🤗Link](https://huggingface.co/alibaba-pai/EasyAnimateV2-XL-2-768x768) | [😄Link](https://modelscope.cn/models/PAI/EasyAnimateV2-XL-2-768x768) | Official EasyAnimateV2 weights for 768x768 resolution. Trained with 144 frames at 24 fps. |
         | 
| 441 | 
            +
            | easyanimatev2_minimalism_lora.safetensors | LoRA of PixArt | 485.1MB | [Download](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/easyanimate/Personalized_Model/easyanimatev2_minimalism_lora.safetensors) | - | - | A LoRA trained on a specific type of images. The images can be downloaded from [this URL](https://pai-aigc-photog.oss-cn-hangzhou.aliyuncs.com/easyanimate/asset/v2/Minimalism.zip). |
         | 
| 442 | 
             
            </details>
         | 
| 443 |  | 
| 444 | 
             
            <details>
         | 
|  | |
| 470 |  | 
| 471 |  | 
| 472 | 
             
            # Reference
         | 
| 473 | 
            +
            - CogVideo: https://github.com/THUDM/CogVideo/
         | 
| 474 | 
            +
            - Flux: https://github.com/black-forest-labs/flux
         | 
| 475 | 
             
            - magvit: https://github.com/google-research/magvit
         | 
| 476 | 
             
            - PixArt: https://github.com/PixArt-alpha/PixArt-alpha
         | 
| 477 | 
             
            - Open-Sora-Plan: https://github.com/PKU-YuanGroup/Open-Sora-Plan
         | 
|  | |
| 481 | 
             
            - HunYuan DiT: https://github.com/tencent/HunyuanDiT
         | 
| 482 |  | 
| 483 | 
             
            # License
         | 
| 484 | 
            +
            This project is licensed under the [Apache License (Version 2.0)](https://github.com/modelscope/modelscope/blob/master/LICENSE).
         | 
