File size: 8,851 Bytes
8816b1e
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
---
license: apache-2.0
language:
- en
base_model:
- Wan-AI/Wan2.1-I2V-14B-480P
- Wan-AI/Wan2.1-I2V-14B-480P-Diffusers
pipeline_tag: image-to-video
tags:
- text-to-image
- lora
- diffusers
- template:diffusion-lora

widget:
- text: >-
    "The video begins with a anime young character with long hair. 5en3m venom transformation. Transform into a venom character transformation. Venom is depicted with his iconic black symbiote body, large white eyes with black pupils, sharp teeth, and a menacing expression. The transformation is smooth and seamless, blending the human figure with the monstrous ."
  output:
    url: example_videos/9_epoch40.mp4
- text: >-
    The video begins with a woman wearing black clothes. 5en3m venom transformation. Transform into a venom character transformation. Venom is depicted with his iconic black symbiote body, large white eyes with black pupils, sharp teeth, and a menacing expression. The transformation is smooth and seamless, blending the human figure with the monstrous .
  output:
    url: example_videos/8_epoch40.mp4
- text: >-
    The video begins with a man wearing a suit. 5en3m venom transformation. Transform into a venom character transformation. Venom is depicted with his iconic black symbiote body, large white eyes with black pupils, sharp teeth, and a menacing expression.The transformation is smooth and seamless, blending the human figure with the monstrous .
  output:
    url: example_videos/10_epoch40.mp4
---

<div style="background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin-bottom: 20px;">
  <h1 style="color: #24292e; margin-top: 0;">Transform to Venom Effect LoRA for Wan2.1 14B I2V 480p</h1>
  
  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Overview</h2>
    <p>This LoRA is trained on the Wan2.1 14B I2V 480p model and allows you to transform any object to venom in an image. The effect works on a wide variety of objects, from animals to vehicles to people!</p>
  </div>

  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Features</h2>
    <ul style="margin-bottom: 0;">
      <li>Transform any image into a video of it being squished</li>
      <li>Trained on the Wan2.1 14B 480p I2V base model</li>
      <li>Consistent results across different object types</li>
      <li>Simple prompt structure that's easy to adapt</li>
    </ul>
  </div>

</div>

<Gallery />

# Model File and Inference Workflow

## 📥 Download Links:

- [transform2venom.safetensors](./transform2venom.safetensors) - LoRA Model File
- [wan_img2video_lora_workflow.json](./workflow/wan_img2video_lora_workflow.json) - Wan I2V with LoRA Workflow for ComfyUI

## Using with Diffusers
```py
pip install git+https://github.com/huggingface/diffusers.git
```

```py
import torch
from diffusers.utils import export_to_video, load_image
from diffusers import AutoencoderKLWan, WanImageToVideoPipeline
from transformers import CLIPVisionModel
import numpy as np

model_id = "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers"
image_encoder = CLIPVisionModel.from_pretrained(model_id, subfolder="image_encoder", torch_dtype=torch.float32)
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)

# Note: Choose Unipcm scheduler to generate higher quality videos for Wan
flow_shift = 3.0  # 5.0 for 720P, 3.0 for 480P
scheduler = UniPCMultistepScheduler(
    prediction_type="flow_prediction",
    use_flow_sigmas=True,
    num_train_timesteps=1000,
    flow_shift=flow_shift,
    scheduler=scheduler,
)
pipe = WanImageToVideoPipeline.from_pretrained(model_id, vae=vae, image_encoder=image_encoder, torch_dtype=torch.bfloat16)
pipe.to("cuda")

pipe.load_lora_weights("passenger12138/Transform2Venom")

pipe.enable_model_cpu_offload() #for low-vram environments

prompt = "The video begins with a man wearing a suit. 5en3m venom transformation. Transform into a venom character transformation. Venom is depicted with his iconic black symbiote body, large white eyes with black pupils, sharp teeth, and a menacing expression. The transformation is smooth and seamless, blending the human figure with the monstrous ."

image = load_image('./test_i2vlora_imgs/1.png')

max_area = 480 * 832
aspect_ratio = image.height / image.width
mod_value = pipe.vae_scale_factor_spatial * pipe.transformer.config.patch_size[1]
height = round(np.sqrt(max_area * aspect_ratio)) // mod_value * mod_value
width = round(np.sqrt(max_area / aspect_ratio)) // mod_value * mod_value
image = image.resize((width, height))

output = pipe(
    image=image,
    prompt=prompt,
    height=height,
    width=width,
    num_frames=81,
    guidance_scale=5.0,
    num_inference_steps=28
).frames[0]
export_to_video(output, "output.mp4", fps=16)
```

---
<div style="background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin-bottom: 20px;">
  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Recommended Settings</h2>
    <ul style="margin-bottom: 0;">
      <li><b>LoRA Strength:</b> 1.0</li>
      <li><b>Embedded Guidance Scale:</b> 6.0</li>
      <li><b>Flow Shift:</b> 3.0</li>
    </ul>
  </div>

  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Trigger Words</h2>
    <p>The key trigger phrase is: <code style="background-color: #f0f0f0; padding: 3px 6px; border-radius: 4px;">5en3m venom transformation.</code></p>
  </div>

  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Prompt Template</h2>
    <p>For best results, use this prompt structure:</p>
    <div style="background-color: #f0f0f0; padding: 12px; border-radius: 6px; margin: 10px 0;">
      <i>The video begins with a [object]. 5en3m venom transformation. Transform into a venom character transformation. Venom is depicted with his iconic black symbiote body, large white eyes with black pupils, sharp teeth, and a menacing expression. The transformation is smooth and seamless, blending the human figure with the monstrous .</i>
    </div>
    <p>Simply replace <code style="background-color: #f0f0f0; padding: 3px 6px; border-radius: 4px;">[object]</code> with whatever you want to see transform to venom!</p>
  </div>

  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">ComfyUI Workflow</h2>
    <p>This LoRA works with a modified version of <a href="https://github.com/kijai/ComfyUI-WanVideoWrapper/blob/main/example_workflows/wanvideo_480p_I2V_example_02.json" style="color: #0366d6; text-decoration: none;">Kijai's Wan Video Wrapper workflow</a>. The main modification is adding a Wan LoRA node connected to the base model.</p>

  </div>
</div>

<div style="background-color: #f8f9fa; padding: 20px; border-radius: 10px; margin-bottom: 20px;">
  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Model Information</h2>
    <p>The model weights are available in Safetensors format. See the Downloads section above.</p>
  </div>

  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Training Details</h2>
    <ul style="margin-bottom: 0;">
      <li><b>Base Model:</b> Wan2.1 14B I2V 480p</li>
      <li><b>Training Data:</b> 1.5 minutes of video (40 short clips of things being squished)</li>
      <li><b>Epochs:</b> 40</li>
    </ul>
  </div>

  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Additional Information</h2>
    <p>Training was done using <a href="https://github.com/tdrussell/diffusion-pipe" style="color: #0366d6; text-decoration: none;">Diffusion Pipe for Training</a></p>
  </div>

  <div style="background-color: white; padding: 15px; border-radius: 8px; margin: 15px 0; box-shadow: 0 2px 4px rgba(0,0,0,0.1);">
    <h2 style="color: #24292e; margin-top: 0;">Acknowledgments</h2>
    <p style="margin-bottom: 0;">Special thanks to Kijai for the ComfyUI Wan Video Wrapper and tdrussell for the training scripts and RemadeAI some case!</p>
  </div>
</div>