---
base_model:
- Wan-AI/Wan2.1-I2V-14B-480P
language:
- en
library_name: diffusers
license: mit
pipeline_tag: image-to-video
---

# ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing


Traditional cartoon/anime production is time-consuming, requiring skilled artists for keyframing, inbetweening, and colorization. ToonComposer streamlines this process with generative AI, turning hours of manual inbetweening and colorization into a single, seamless step. Visit our [project page](https://lg-li.github.io/project/tooncomposer) and read our [paper](https://arxiv.org/abs/2508.10881) for more details. This repository provides the ToonComposer model weights; the code is available at [our GitHub repo](https://github.com/TencentARC/ToonComposer).

## Sample Usage

You can use the `ToonComposer` model with the `diffusers` library. Ensure your environment has the required dependencies, including `flash-attn` as specified in the [official GitHub repository](https://github.com/TencentARC/ToonComposer), for optimal performance.

```python
import os

import torch
import numpy as np  # For creating dummy images if the example paths are not found
from diffusers import DiffusionPipeline
from PIL import Image

# Load the ToonComposer pipeline from the Hugging Face Hub
pipeline = DiffusionPipeline.from_pretrained(
    "TencentARC/ToonComposer",
    torch_dtype=torch.float16,  # Use torch.bfloat16 on newer GPUs if preferred
    trust_remote_code=True,     # Required to load the custom pipeline code
)
pipeline.to("cuda")  # Move the model to GPU for faster inference ("cpu" also works, but slowly)

# --- Prepare your input data ---

# 1. Initial colored keyframe (reference image)
# This image sets the base visual style and the first frame.
# Replace 'path/to/your/initial_colored_keyframe.png' with your actual image file.
try:
    initial_colored_keyframe = Image.open("path/to/your/initial_colored_keyframe.png").convert("RGB")
except FileNotFoundError:
    print("Warning: initial colored keyframe not found; using a dummy white image for demonstration.")
    # Dummy white image matching the target resolution (e.g., 1088x608 from the model config)
    initial_colored_keyframe = Image.fromarray(np.full((608, 1088, 3), 255, dtype=np.uint8))

# 2. Sketch keyframe (for motion control)
# Typically a black-and-white line drawing that guides motion at a specific frame.
# Replace 'path/to/your/sketch_at_frame_X.png' with your actual sketch image path.
# ToonComposer supports multiple sketches at different time steps; this example
# uses one (a multi-sketch variant is sketched after this example).
try:
    sketch_keyframe = Image.open("path/to/your/sketch_at_frame_X.png").convert("RGB")
except FileNotFoundError:
    print("Warning: sketch keyframe not found; using a dummy black image for demonstration.")
    # Dummy black image matching the target resolution
    sketch_keyframe = Image.fromarray(np.full((608, 1088, 3), 0, dtype=np.uint8))

# Text prompt: describe the desired motion or scene
prompt = "a joyful character bouncing a ball"

# Video generation parameters (adjust as needed)
# Refer to the model's config.json or the official GitHub repository for recommended values.
num_frames = 33       # Example frame count from the model's config.json
height = 608          # Example resolution from the model's config.json
width = 1088
guidance_scale = 7.5  # Common value for text- and image-conditioned video diffusion

# --- Generate the video frames ---
# The exact arguments of this custom pipeline may vary; we infer a plausible API from
# common diffusion-model practice and the paper's description: `image` takes the initial
# colored keyframe, and `sketches` takes a list of (frame_index, sketch image) pairs.
video_frames = pipeline(
    prompt=prompt,
    image=initial_colored_keyframe,
    sketches=[(5, sketch_keyframe)],  # Example: apply `sketch_keyframe` at frame index 5
    num_frames=num_frames,
    height=height,
    width=width,
    guidance_scale=guidance_scale,
).frames  # The output is expected to be a list of PIL Images.

# --- Save the generated video frames ---
output_dir = "./tooncomposer_output"
os.makedirs(output_dir, exist_ok=True)
for i, frame in enumerate(video_frames):
    frame.save(f"{output_dir}/frame_{i:04d}.png")
print(f"Generated {len(video_frames)} frames in '{output_dir}'.")

# Optional: compile the frames into a GIF (requires `imageio` and `imageio-ffmpeg`)
# import imageio
# try:
#     imageio.mimsave(f"{output_dir}/output_video.gif", video_frames, fps=10)
#     print(f"Saved GIF to '{output_dir}/output_video.gif'.")
# except Exception as e:
#     print(f"Could not save GIF (ensure imageio and imageio-ffmpeg are installed): {e}")
```
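
Because ToonComposer accepts sketches at multiple time steps, you can pin down motion at several points in a clip. Continuing the example above, here is a minimal sketch of such a call, assuming the same hypothetical `(frame_index, image)` pair format for the `sketches` argument; the actual interface may differ, so check the [official GitHub repository](https://github.com/TencentARC/ToonComposer):

```python
# Continues from the example above (reuses `pipeline`, `prompt`, etc.).
# Hypothetical multi-sketch call: the (frame_index, image) pair format is the
# same inferred API as above, not a confirmed signature.
sketch_mid = Image.open("path/to/your/sketch_mid.png").convert("RGB")
sketch_end = Image.open("path/to/your/sketch_end.png").convert("RGB")

video_frames = pipeline(
    prompt=prompt,
    image=initial_colored_keyframe,
    sketches=[(16, sketch_mid), (32, sketch_end)],  # mid-clip and final frame (last of 33)
    num_frames=num_frames,
    height=height,
    width=width,
    guidance_scale=guidance_scale,
).frames
```

Placing one sketch mid-clip and one at the final frame mirrors the post-keyframing workflow described in the paper: sparse sketch keyframes anchor the motion while the model fills in the inbetweens and colorization.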

## Citation

If you find ToonComposer useful, please consider citing:

```
@article{li2025tooncomposer,
  title={ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing},
  author={Li, Lingen and Wang, Guangzhi and Zhang, Zhaoyang and Li, Yaowei and Li, Xiaoyu and Dou, Qi and Gu, Jinwei and Xue, Tianfan and Shan, Ying},
  journal={arXiv preprint arXiv:2508.10881},
  year={2025}
}
```