FramePack Mask to Sign Image Generation

This repository contains the steps and scripts needed to generate empty-sign images (a character holding a blank sign) with an image-to-video model. The model combines LoRA (Low-Rank Adaptation) weights with pre-trained components to produce the image from a sign mask and a text prompt.

Prerequisites

Before proceeding, ensure that you have the following installed on your system:

  • Ubuntu (or a compatible Linux distribution)
  • Python 3.x
  • pip (Python package manager)
  • Git
  • Git LFS (Git Large File Storage)
  • FFmpeg
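
Before installing anything, it can help to see which of the prerequisites listed above are already on the PATH. The sketch below is only a convenience check, not part of the repository: `missing_tools` is a hypothetical helper name, and the executable names (`python3`, `pip`, `git`, `git-lfs`, `ffmpeg`) are assumptions about how the tools are installed on your system (for example, pip may be available as `pip3` instead).

```shell
# Hypothetical helper: print the name of every command that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed executable names for the prerequisites; adjust to your system.
missing=$(missing_tools python3 pip git git-lfs ffmpeg)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names print on one line.
    echo "Missing tools:" $missing
else
    echo "All prerequisites found."
fi
```

Anything reported as missing can then be installed with the package manager before moving on to the installation steps.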

Installation

  1. Update and Install Dependencies

    sudo apt-get update && sudo apt-get install cbm git-lfs ffmpeg
    
  2. Clone the Repository

    git clone https://huggingface.co/svjack/FramePack_mask_to_sign_Lora
    cd FramePack_mask_to_sign_Lora
    
  3. Install Python Dependencies

    pip install torch torchvision
    pip install -r requirements.txt
    pip install ascii-magic matplotlib tensorboard huggingface_hub datasets
    pip install moviepy==1.0.3
    pip install sageattention==1.0.6
    
  4. Download Model Weights

    git clone https://huggingface.co/lllyasviel/FramePackI2V_HY
    git clone https://huggingface.co/hunyuanvideo-community/HunyuanVideo
    git clone https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged
    git clone https://huggingface.co/Comfy-Org/sigclip_vision_384
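
After cloning, a quick check that the weight files referenced by the generation commands are actually present can save a failed run. The following is a minimal sketch: `missing_weights` is a hypothetical helper, and the file list is simply copied from the `--dit`, `--vae`, `--text_encoder1`, `--text_encoder2`, and `--image_encoder` arguments used in the Usage section. It assumes the four repositories were cloned into the current directory.

```shell
# Hypothetical helper: print every expected weight file that is absent
# under the given base directory (defaults to the current directory).
missing_weights() {
    base=${1:-.}
    for f in \
        FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
        HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
        HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
        HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
        sigclip_vision_384/sigclip_vision_patch14_384.safetensors
    do
        [ -f "$base/$f" ] || printf '%s\n' "$f"
    done
}

# Prints nothing when all five files exist.
missing_weights .
```

If a file exists but is only a few hundred bytes, it is probably a Git LFS pointer rather than the real weights; running `git lfs pull` inside that clone normally fetches the actual file.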
    

Usage

To generate an image of a character holding the sign, run the fpack_generate_video.py script with the appropriate parameters. The examples below show typical invocations.

  • 1
  • Mask

[Image: input mask (JPEG)]

#A blacksmith manga artisan proudly holds the empty sign in their forge, the glowing metal textures contrasting with the dark, soot-stained walls.

python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path mask.jpg \
    --prompt "A blacksmith manga artisan proudly holds the empty sign in their forge, the glowing metal textures contrasting with the dark, soot-stained walls." \
    --video_size 1088 1920 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --video_sections 1 --output_type latent_images --one_frame_inference zero_post \
    --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_sign_output/framepack-sign-lora-000004.safetensors
  • Output

[Image: generated output (JPEG)]

  • 2
  • Mask

[Image: input mask (JPEG)]

#A historical manga samurai holds the empty sign with quiet dignity amidst a shower of cherry blossoms, his tattered kimono sleeves fluttering in the wind as ink wash-style mountains loom in the background.
python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path mask.jpg \
    --prompt "A historical manga samurai holds the empty sign with quiet dignity amidst a shower of cherry blossoms, his tattered kimono sleeves fluttering in the wind as ink wash-style mountains loom in the background." \
    --video_size 1088 1920 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --video_sections 1 --output_type latent_images --one_frame_inference zero_post \
    --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_sign_output/framepack-sign-lora-000004.safetensors

  • Output

[Image: generated output (JPEG)]

  • 3
  • Mask

[Image: input mask (JPEG)]

Demo 1

#In a gritty seinen manga panel, a grizzled detective holds the empty sign under the flickering neon lights of a rainy alleyway, the heavy crosshatching shadows obscuring half his scarred face while cigarette smoke curls around him.
python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path mask.jpg \
    --prompt "In a gritty seinen manga panel, a grizzled detective holds the empty sign under the flickering neon lights of a rainy alleyway, the heavy crosshatching shadows obscuring half his scarred face while cigarette smoke curls around him." \
    --video_size 1088 1920 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --video_sections 1 --output_type latent_images --one_frame_inference zero_post \
    --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_sign_output/framepack-sign-lora-000004.safetensors
  • Output

[Image: generated output (JPEG)]
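
The single-image commands above share all of their model and sampling flags; only the prompt, and optionally the mask, size, and LoRA checkpoint, change between runs. A small wrapper can cut the repetition. The sketch below is an assumption-laden convenience, not something shipped with the repository: `generate_sign` and the `DRY_RUN` variable are hypothetical names, every flag is copied verbatim from the commands above, and the two size values are passed through in the same order the examples use. Variables inside the function are global because plain POSIX shell has no `local`.

```shell
# Hypothetical wrapper around the single-image command shown above.
# Usage: generate_sign MASK PROMPT [LORA] [SIZE_A SIZE_B]
generate_sign() {
    mask=$1
    prompt=$2
    lora=${3:-framepack_sign_output/framepack-sign-lora-000004.safetensors}
    size_a=${4:-1088}
    size_b=${5:-1920}
    te_dir=HunyuanVideo_repackaged/split_files/text_encoders

    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is non-empty,
    # so the command is printed instead of executed.
    ${DRY_RUN:+echo} python fpack_generate_video.py \
        --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
        --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
        --text_encoder1 "$te_dir/llava_llama3_fp16.safetensors" \
        --text_encoder2 "$te_dir/clip_l.safetensors" \
        --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
        --image_path "$mask" \
        --prompt "$prompt" \
        --video_size "$size_a" "$size_b" --fps 30 --infer_steps 25 \
        --attn_mode sdpa --fp8_scaled \
        --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
        --save_path save --video_sections 1 --output_type latent_images \
        --one_frame_inference zero_post \
        --seed 1234 --lora_multiplier 1.0 --lora_weight "$lora"
}

# Dry run: print the command that would be executed (shortened example prompt).
DRY_RUN=1 generate_sign mask.jpg "A historical manga samurai holds the empty sign."
```

Leaving `DRY_RUN` unset (or empty) runs the real command. Note that the dry-run output drops the quotes around the prompt, so it is meant for inspection rather than copy-pasting.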

Demo 2: Kaedehara Kazuha

  • Image
python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path mask.jpg \
    --prompt "a young male anime character with white hair tied by a crimson ribbon, holds an empty sign, his red pupils etched with black patterns like secrets whispered in the dark. He walks alone through a maple forest, each step rustling the carpet of fallen leaves. The wind toys with his pale strands, the red ribbon flickering like flame against the gold-and-scarlet canopy. Sunlight filters through the branches, dappling his figure in amber as he catches a drifting leaf, tracing its veins with quiet amusement. Distant mountains blaze with autumn, yet he moves through the fiery woods—neither stranger nor native, just a silhouette woven into the season’s tapestry." \
    --video_size 768 1024 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --video_sections 1 --output_type latent_images --one_frame_inference zero_post \
    --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_sign_output/framepack-sign-lora-000006.safetensors

[Image: generated output (PNG)]

  • Video
python fpack_generate_video.py \
    --dit FramePackI2V_HY/diffusion_pytorch_model-00001-of-00003.safetensors \
    --vae HunyuanVideo/vae/diffusion_pytorch_model.safetensors \
    --text_encoder1 HunyuanVideo_repackaged/split_files/text_encoders/llava_llama3_fp16.safetensors \
    --text_encoder2 HunyuanVideo_repackaged/split_files/text_encoders/clip_l.safetensors \
    --image_encoder sigclip_vision_384/sigclip_vision_patch14_384.safetensors \
    --image_path mask.jpg \
    --prompt "a young male anime character with white hair tied by a crimson ribbon, holds an empty sign, his red pupils etched with black patterns like secrets whispered in the dark. He walks alone through a maple forest, each step rustling the carpet of fallen leaves. The wind toys with his pale strands, the red ribbon flickering like flame against the gold-and-scarlet canopy. Sunlight filters through the branches, dappling his figure in amber as he catches a drifting leaf, tracing its veins with quiet amusement. Distant mountains blaze with autumn, yet he moves through the fiery woods—neither stranger nor native, just a silhouette woven into the season’s tapestry." \
    --video_size 480 832 --video_seconds 3 --fps 30 --infer_steps 25 \
    --attn_mode sdpa --fp8_scaled \
    --vae_chunk_size 32 --vae_spatial_tile_sample_min_size 128 \
    --save_path save --output_type video \
    --seed 1234 --lora_multiplier 1.0 --lora_weight framepack_sign_output/framepack-sign-lora-000006.safetensors
