---
license: apache-2.0
language:
- en
- zh
tags:
- image-to-video
- lora
- replicate
- text-to-video
- video
- video-generation
base_model: "Wan-AI/Wan2.1-T2V-14B-Diffusers"
pipeline_tag: text-to-video
instance_prompt: woodward
---

# Thinking Out Loud

<Gallery />

## About this LoRA

This is a [LoRA](https://replicate.com/docs/guides/working-with-loras) for the Wan 14B text-to-video model.

It can be used with diffusers or ComfyUI, and can be loaded into any of the Wan 14B models.
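
For ComfyUI, the typical approach is to download the `.safetensors` file from this repository into your `ComfyUI/models/loras` directory and attach it to the Wan model with a LoRA loader node; the exact node names depend on your workflow.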

It was trained on [Replicate](https://replicate.com/) with 500 steps at a learning rate of 5e-05 and LoRA rank of 32.

## Trigger word

You should use `woodward` to trigger the video generation.
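
The trigger word matches the `instance_prompt` the LoRA was trained with, so include it verbatim somewhere in your prompt, e.g. `woodward, <your scene description>` (the scene description is an illustrative placeholder, not a prompt from the training set).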

## Use this LoRA

Replicate has a collection of Wan models that are optimised for speed and cost. They can also be used with this LoRA:

- https://replicate.com/collections/wan-video
- https://replicate.com/fofr/wan-with-lora

### Run this LoRA with an API using Replicate

```py
import replicate

input = {
    "prompt": "woodward",
    # model size goes in the input dict, not as a client keyword argument;
    # "model" as the input name is assumed from the original snippet
    "model": "14B",
    "lora_url": "https://huggingface.co/hanani/Thinking-Out-Loud/resolve/main/wan-14b-t2v-woodward-lora.safetensors"
}

output = replicate.run(
    "fofr/wan-with-lora:latest",
    input=input
)

# each item is a file-like output object; save every returned video
for index, item in enumerate(output):
    with open(f"output_{index}.mp4", "wb") as file:
        file.write(item.read())
```
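
The `replicate` Python client reads your API token from the `REPLICATE_API_TOKEN` environment variable, so set that before running the snippet.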

### Using with Diffusers

```py
import torch
from diffusers import AutoencoderKLWan, WanPipeline
from diffusers.utils import export_to_video

# Load the base model; the Wan VAE is kept in float32 for numerical stability
model_id = "Wan-AI/Wan2.1-T2V-14B-Diffusers"
vae = AutoencoderKLWan.from_pretrained(model_id, subfolder="vae", torch_dtype=torch.float32)
pipe = WanPipeline.from_pretrained(model_id, vae=vae, torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Load and apply the LoRA weights
pipe.load_lora_weights(
    "hanani/Thinking-Out-Loud",
    weight_name="wan-14b-t2v-woodward-lora.safetensors",
)

# Generate video frames
prompt = "woodward"
negative_prompt = "blurry, low quality, low resolution"

frames = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=30,
    guidance_scale=5.0,
    width=832,
    height=480,
    num_frames=33,  # Wan expects 4k+1 frames; 33 is the closest valid count to 32
).frames[0]

# Save as video (fps is a property of the exported file, not the pipeline call)
video_path = "output.mp4"
export_to_video(frames, video_path, fps=16)
print(f"Video saved to: {video_path}")
```
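
If the 14B model does not fit in GPU memory, `pipe.enable_model_cpu_offload()` (a standard diffusers helper) trades speed for a much smaller VRAM footprint by keeping submodules on the GPU only while they are in use.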

## Training details

- Steps: 500
- Learning rate: 5e-05
- LoRA rank: 32

## Contribute your own examples

You can use the [community tab](https://huggingface.co/hanani/Thinking-Out-Loud/discussions) to add videos that show off what you've made with this LoRA.