# AnimateDiff ControlNet SDXL Example
This document provides a step-by-step guide to setting up and running the `animatediff_controlnet_sdxl.py` script from the Hugging Face repository. The script leverages the `diffusers-sdxl-controlnet` library to generate animated images using ControlNet and SDXL models.
## Prerequisites
Before running the script, ensure you have the necessary dependencies installed. You can install them using the following commands:
### System Dependencies
```bash
sudo apt-get update && sudo apt-get install git-lfs cbm ffmpeg
```
### Python Dependencies
```bash
pip install git+https://huggingface.co/svjack/diffusers-sdxl-controlnet
pip install transformers peft sentencepiece moviepy==1.0.3 controlnet_aux
```
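Before going further, a quick sanity check that the core packages import and a GPU is visible (the pipeline below runs on `cuda`):
```python
# Verify the core imports resolve and a CUDA device is available.
import torch
import diffusers
import controlnet_aux

print("torch:", torch.__version__)
print("diffusers:", diffusers.__version__)
print("CUDA available:", torch.cuda.is_available())
```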
### Clone the Repository
```bash
git clone https://huggingface.co/svjack/diffusers-sdxl-controlnet
cp diffusers-sdxl-controlnet/girl-pose.gif .
cp diffusers-sdxl-controlnet/girl_beach.mp4 .
```
## Script Modifications
The script needs one small modification before it will import cleanly: comment out the references to two LoRA attention processors that are no longer available in recent diffusers releases:
```python
# In animatediff_controlnet_sdxl.py, comment out these two names:
#     LoRAAttnProcessor2_0,
#     LoRAXFormersAttnProcessor,
```
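If you prefer to apply the edit programmatically, the sketch below comments out both names. It assumes each name appears on its own line in the script's import list, so inspect the file afterwards to confirm the result:
```python
# Comment out the two removed LoRA processor names in the community script.
# Assumes each name sits on its own line in the import list.
from pathlib import Path

script = Path("diffusers-sdxl-controlnet/examples/community/animatediff_controlnet_sdxl.py")
src = script.read_text()
for name in ("LoRAAttnProcessor2_0,", "LoRAXFormersAttnProcessor,"):
    src = src.replace(name, "# " + name)
script.write_text(src)
```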
## GIF to Frames Conversion
The script includes a function to convert a GIF into individual frames. This is useful for preparing input data for the animation pipeline.
```python
from PIL import Image, ImageSequence
import os

def gif_to_frames(gif_path, output_folder):
    # Open the GIF file
    gif = Image.open(gif_path)

    # Ensure the output folder exists
    if not os.path.exists(output_folder):
        os.makedirs(output_folder)

    # Iterate through each frame of the GIF
    for i, frame in enumerate(ImageSequence.Iterator(gif)):
        # Copy the frame
        frame_copy = frame.copy()
        # Save the frame to the specified folder
        frame_path = os.path.join(output_folder, f"frame_{i:04d}.png")
        frame_copy.save(frame_path)

    print(f"Successfully extracted {i + 1} frames to {output_folder}")

# Example call
gif_to_frames("girl-pose.gif", "girl_pose_frames")
```
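The generation steps below consume the first 16 frames, so it is worth confirming the source GIF has at least that many; Pillow exposes the count as `n_frames`:
```python
from PIL import Image

# The pipeline below uses the first 16 extracted frames as pose conditioning.
gif = Image.open("girl-pose.gif")
print("frames in girl-pose.gif:", getattr(gif, "n_frames", 1))
```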
### Use this girl pose GIF as the pose source

## Running the Script
To run the script, follow these steps:
1. **Add the Script Path to System Path**:
```python
import sys
sys.path.insert(0, "diffusers-sdxl-controlnet/examples/community/")
from animatediff_controlnet_sdxl import *
from controlnet_aux.processor import Processor
```
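If the import fails, the path handed to `sys.path.insert` usually does not match your working directory. A quick check using only the standard library:
```python
import importlib.util

# Should print a path ending in examples/community/animatediff_controlnet_sdxl.py.
spec = importlib.util.find_spec("animatediff_controlnet_sdxl")
print(spec.origin if spec else "module not found - check the sys.path entry above")
```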
2. **Load Necessary Libraries and Models**:
```python
import torch
from diffusers import AutoPipelineForText2Image, ControlNetModel, DDIMScheduler
from diffusers.models import MotionAdapter
from diffusers.utils import export_to_gif, load_image
from PIL import Image
```
3. **Load the MotionAdapter Model**:
```python
adapter = MotionAdapter.from_pretrained(
    "a-r-r-o-w/animatediff-motion-adapter-sdxl-beta",
    torch_dtype=torch.float16
)
```
4. **Configure the Scheduler and ControlNet**:
```python
model_id = "svjack/GenshinImpact_XL_Base"
scheduler = DDIMScheduler.from_pretrained(
    model_id,
    subfolder="scheduler",
    clip_sample=False,
    timestep_spacing="linspace",
    beta_schedule="linear",
    steps_offset=1,
)
controlnet = ControlNetModel.from_pretrained(
    "thibaud/controlnet-openpose-sdxl-1.0",
    torch_dtype=torch.float16,
).to("cuda")
```
5. **Load the AnimateDiffSDXLControlnetPipeline**:
```python
pipe = AnimateDiffSDXLControlnetPipeline.from_pretrained(
    model_id,
    controlnet=controlnet,
    motion_adapter=adapter,
    scheduler=scheduler,
    torch_dtype=torch.float16,
).to("cuda")
```
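Before committing to a long generation, it can help to confirm the pipeline landed on the GPU in half precision; `device` and `unet` are standard diffusers pipeline attributes, so this should hold here, but treat it as a sanity check rather than part of the recipe:
```python
# Quick sanity check on device placement and precision.
print("device:", pipe.device)          # expected: cuda:0
print("unet dtype:", pipe.unet.dtype)  # expected: torch.float16
```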
6. **Enable Memory Saving Features**:
```python
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
```
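On GPUs where VAE slicing and tiling are not enough, diffusers pipelines also expose model CPU offload. It trades speed for memory, and if you use it you should load the pipeline without the explicit `.to("cuda")` above, since the offload hook manages device placement itself. This is an optional alternative, not used in the runs below:
```python
# Optional, for smaller GPUs: offload submodules to CPU between forward passes.
pipe.enable_model_cpu_offload()
```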
7. **Load Conditioning Frames**:
```python
import os

folder_path = "girl_pose_frames/"
frames = os.listdir(folder_path)
frames = list(filter(lambda x: x.endswith(".png"), frames))
frames.sort()
conditioning_frames = list(map(lambda x: Image.open(os.path.join(folder_path, x)).resize((1024, 1024)), frames))[:16]
```
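The pipeline below is called with `num_frames=16`, so it expects exactly that many conditioning images; a one-line check catches an empty or half-extracted frames folder early:
```python
# Should print 16; fewer means the GIF extraction above produced too few frames.
print("loaded", len(conditioning_frames), "conditioning frames")
```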
8. **Process Conditioning Frames**:
```python
p2 = Processor("openpose")
cn2 = [p2(frame) for frame in conditioning_frames]
```
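Before spending 50 inference steps, it can help to eyeball one of the extracted pose maps; the `controlnet_aux` processor returns PIL images by default, so they can be saved directly:
```python
# Save the first OpenPose map for visual inspection.
cn2[0].save("pose_check_frame0.png")
print("pose map size:", cn2[0].size)
```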
9. **Define Prompts**:
```python
# Alternative prompts kept from the original script for reference:
# prompt = '''
# solo,Xiangling\(genshin impact\),1girl,
# full body professional photograph of a stunning detailed, sharp focus, dramatic
# cinematic lighting, octane render unreal engine (film grain, blurry background
# '''
# prompt = "solo,Xiangling\(genshin impact\),1girl"
prompt = "solo,Xiangling\(genshin impact\),1girl,full body professional photograph of a stunning detailed"
negative_prompt = "bad quality, worst quality, jpeg artifacts, ugly"
```
10. **Generate Output** (using the Genshin Impact character Xiangling):
```python
generator = torch.Generator(device="cpu").manual_seed(0)
output = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=50,
    guidance_scale=20,
    controlnet_conditioning_scale=1.0,
    width=512,
    height=768,
    num_frames=16,
    conditioning_frames=cn2,
    generator=generator,
)
```
11. **Export Frames to GIF**:
```python
frames = output.frames[0]
export_to_gif(frames, "xiangling_animation.gif")
```
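If an MP4 is more convenient than a GIF, `diffusers.utils` also provides `export_to_video`; the output filename here is just an example:
```python
from diffusers.utils import export_to_video

# Same frames, written as an MP4 instead of a GIF.
export_to_video(frames, "xiangling_animation.mp4")
```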
12. **Display the Result**:
```python
from IPython import display
display.Image("xiangling_animation.gif")
```
### Target GIF
<div style="display: flex; justify-content: center; flex-wrap: nowrap;">
    <div style="margin-right: 10px;">
        <img src="xiangling_animation.gif" alt="Image 1" style="width: 512px; height: 768px;">
    </div>
</div>

### Upscaled with the anime upscaler from [APISR](https://github.com/svjack/APISR)
<div style="display: flex; justify-content: center; flex-wrap: nowrap;">
    <div style="margin-left: 10px;">
        <img src="xiangling_animation_frames_4x.gif" alt="Image 2" style="width: 512px; height: 768px;">
    </div>
</div>
### Run from the Command Line
- Save the following as `animatediff_controlnet_sdxl_run_script.py`:
```python
import sys
sys.path.insert(0, "diffusers-sdxl-controlnet/examples/community/")
from animatediff_controlnet_sdxl import *

import argparse
from moviepy.editor import VideoFileClip, ImageSequenceClip
import os
import torch
from diffusers.models import MotionAdapter
from diffusers import DDIMScheduler, AutoPipelineForText2Image, ControlNetModel
from diffusers.utils import export_to_gif
from PIL import Image
from controlnet_aux.processor import Processor

# Initialize the MotionAdapter (the ControlNet is loaded inside initialize_pipeline)
adapter = MotionAdapter.from_pretrained("a-r-r-o-w/animatediff-motion-adapter-sdxl-beta", torch_dtype=torch.float16)

def initialize_pipeline(model_id):
    scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler", clip_sample=False, timestep_spacing="linspace", beta_schedule="linear", steps_offset=1)
    controlnet = ControlNetModel.from_pretrained("thibaud/controlnet-openpose-sdxl-1.0", torch_dtype=torch.float16).to("cuda")

    # Initialize the AnimateDiffSDXLControlnetPipeline
    pipe = AnimateDiffSDXLControlnetPipeline.from_pretrained(
        model_id,
        controlnet=controlnet,
        motion_adapter=adapter,
        scheduler=scheduler,
        torch_dtype=torch.float16,
    ).to("cuda")
    pipe.enable_vae_slicing()
    pipe.enable_vae_tiling()
    return pipe

def split_video_into_frames(input_video_path, num_frames, temp_folder='temp_frames'):
    """
    Resample the video to the given number of frames, keeping the original frame rate.

    :param input_video_path: path to the input video file
    :param num_frames: target number of frames
    :param temp_folder: path to the temporary folder
    """
    clip = VideoFileClip(input_video_path)
    original_duration = clip.duration
    segment_duration = original_duration / num_frames
    if not os.path.exists(temp_folder):
        os.makedirs(temp_folder)
    for i in range(num_frames):
        frame_time = i * segment_duration
        frame_path = os.path.join(temp_folder, f'frame_{i:04d}.png')
        clip.save_frame(frame_path, t=frame_time)
    frame_paths = [os.path.join(temp_folder, f'frame_{i:04d}.png') for i in range(num_frames)]
    final_clip = ImageSequenceClip(frame_paths, fps=clip.fps)
    final_clip.write_videofile("resampled_video.mp4", codec='libx264')
    print(f"The resampled video has been saved to resampled_video.mp4 with {num_frames} frames at the original frame rate.")

def generate_video_with_prompt(input_video_path, prompt, model_id, gif_output_path, seed=0, num_frames=16, keep_imgs=False, temp_folder='temp_frames', num_inference_steps=50, guidance_scale=20, controlnet_conditioning_scale=1.0, width=512, height=768):
    """
    Generate a video guided by a text prompt and a pose source video.

    :param input_video_path: path to the input video file
    :param prompt: text prompt
    :param model_id: model ID
    :param gif_output_path: output path for the GIF
    :param seed: random seed
    :param num_frames: target number of frames
    :param keep_imgs: whether to keep the temporary images
    :param temp_folder: path to the temporary folder
    :param num_inference_steps: number of inference steps
    :param guidance_scale: guidance scale
    :param controlnet_conditioning_scale: ControlNet conditioning scale
    :param width: output width
    :param height: output height
    """
    split_video_into_frames(input_video_path, num_frames, temp_folder)
    folder_path = temp_folder
    frames = os.listdir(folder_path)
    frames = list(filter(lambda x: x.endswith(".png"), frames))
    frames.sort()
    conditioning_frames = list(map(lambda x: Image.open(os.path.join(folder_path, x)).resize((1024, 1024)), frames))[:num_frames]

    p2 = Processor("openpose")
    cn2 = [p2(frame) for frame in conditioning_frames]

    negative_prompt = "bad quality, worst quality, jpeg artifacts, ugly"
    generator = torch.Generator(device="cuda").manual_seed(seed)
    pipe = initialize_pipeline(model_id)
    output = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=num_inference_steps,
        guidance_scale=guidance_scale,
        controlnet_conditioning_scale=controlnet_conditioning_scale,
        width=width,
        height=height,
        num_frames=num_frames,
        conditioning_frames=cn2,
        generator=generator
    )
    frames = output.frames[0]
    export_to_gif(frames, gif_output_path)
    print(f"The generated GIF has been saved to {gif_output_path}")

    if not keep_imgs:
        # Remove the temporary frames folder
        import shutil
        shutil.rmtree(temp_folder)

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Generate a video guided by a text prompt")
    parser.add_argument("input_video", help="path to the input video file")
    parser.add_argument("prompt", help="text prompt")
    parser.add_argument("model_id", help="model ID")
    parser.add_argument("gif_output_path", help="output path for the GIF")
    parser.add_argument("--seed", type=int, default=0, help="random seed")
    parser.add_argument("--num_frames", type=int, default=16, help="target number of frames")
    parser.add_argument("--keep_imgs", action="store_true", help="keep the temporary images")
    parser.add_argument("--temp_folder", default='temp_frames', help="path to the temporary folder")
    parser.add_argument("--num_inference_steps", type=int, default=50, help="number of inference steps")
    parser.add_argument("--guidance_scale", type=float, default=20.0, help="guidance scale")
    parser.add_argument("--controlnet_conditioning_scale", type=float, default=1.0, help="ControlNet conditioning scale")
    parser.add_argument("--width", type=int, default=512, help="output width")
    parser.add_argument("--height", type=int, default=768, help="output height")
    args = parser.parse_args()

    generate_video_with_prompt(args.input_video, args.prompt, args.model_id, args.gif_output_path, args.seed, args.num_frames,
                               args.keep_imgs, args.temp_folder, args.num_inference_steps, args.guidance_scale,
                               args.controlnet_conditioning_scale, args.width, args.height)
```
```bash
python animatediff_controlnet_sdxl_run_script.py girl_beach.mp4 \
"solo,Xiangling\(genshin impact\),1girl,full body professional photograph of a stunning detailed, drink tea use chinese cup" \
"svjack/GenshinImpact_XL_Base" \
xiangling_tea_animation.gif --num_frames 16 --temp_folder temp_frames
```
- Pose: girl_beach.mp4

<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/pYx23VyLNkLk3YxAAqu5i.mp4"></video>
- Output: xiangling_tea_animation.gif


- Upscaled:

<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/uwUDYOPiZbHuq5v6jWADr.mp4"></video>
### Some Other Samples
#### Makise Kurisu in Steins;Gate
```bash
python animatediff_controlnet_sdxl_run_script.py girl_beach.mp4 \
"1girl, Makise Kurisu, masterpiece, white lab coat, red tie, laboratory" \
"cagliostrolab/animagine-xl-3.1" \
Makise_Kurisu_animation_short.gif --num_frames 16 --temp_folder temp_frames --guidance_scale 20 --controlnet_conditioning_scale 0.3
```
- Output: Makise_Kurisu_animation_short.gif


- Upscaled:

<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/v69NuN5UsAokrfBNW_c9P.mp4"></video>
#### Souryuu Asuka Langley in EVA
```bash
python animatediff_controlnet_sdxl_run_script.py girl_beach.mp4 \
"1girl, souryuu asuka langley, masterpiece" \
"cagliostrolab/animagine-xl-3.1" \
asuka_langley_animation_short.gif --num_frames 16 --temp_folder temp_frames --guidance_scale 20 --controlnet_conditioning_scale 0.3 --num_inference_steps 50
```
- Output: asuka_langley_animation_short.gif


- Upscaled:

<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/uusv36dl0NT80fpUeo5pA.mp4"></video>
```bash
python animatediff_controlnet_sdxl_run_script.py girl_beach.mp4 \
"1girl, souryuu asuka langley, masterpiece, neon genesis evangelion, solo, upper body, v, smile, looking at viewer, outdoors, night" \
"cagliostrolab/animagine-xl-3.1" \
asuka_langley_animation_long.gif --num_frames 16 --temp_folder temp_frames --guidance_scale 20 --controlnet_conditioning_scale 0.3
```
- Output: asuka_langley_animation_long.gif


- Upscaled:

<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/T2iREkPkWXWCjzOHmq82-.mp4"></video>
#### Xiangling in Genshin Impact
- produce_gif_script.py
```bash
python produce_gif_script.py xiangling_video_seed.csv "svjack/GenshinImpact_XL_Base" xiangling_gif_dir \
--num_frames 16 --temp_folder temp_frames --seed 0 --controlnet_conditioning_scale 0.3
```


## Conclusion
This script demonstrates how to use the `diffusers-sdxl-controlnet` library to generate animated images with ControlNet and SDXL models. By following the steps outlined above, you can create and visualize your own animated sequences.