AMD support
It's asking for nvcc when installing from this repo.
ComfyUI produces only a black screen for me.
Has anyone managed to run this on an AMD card?
Manually add --no-deps to the install command (pip install -r requirements.txt --no-deps) and make sure you have a ROCm-specific torch install. I have it running on my RX 7900 XTX, but 25 steps take ~24 minutes at 832*480, with a non-trivial VAE decode on top of that. I need to get sage-attn installed again, as I did when I played with Hunyuan.
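One way to confirm that the installed torch really is a ROCm build is to look at its version string, which carries a local tag such as "+rocm6.3" on ROCm wheels. The sketch below is only an illustration: the helper names rocm_version_from_torch and describe_installed_torch are made up for this example, and the second one assumes torch is importable.

```python
import re


def rocm_version_from_torch(version_string):
    """Return the ROCm version encoded in a torch version string, or None.

    ROCm wheels carry a local version tag such as "+rocm6.3".
    (Hypothetical helper name, for illustration only.)
    """
    match = re.search(r"\+rocm(\d+(?:\.\d+)*)", version_string)
    return match.group(1) if match else None


def describe_installed_torch():
    """Summarise the installed torch build. Requires torch to be installed."""
    import torch  # imported lazily so the parser above works without torch

    return {
        "version": torch.__version__,
        "rocm_tag": rocm_version_from_torch(torch.__version__),
        # torch.version.hip is a string on ROCm builds and None otherwise.
        "hip": getattr(torch.version, "hip", None),
        # ROCm GPUs are exposed through the torch.cuda API.
        "gpu_visible": torch.cuda.is_available(),
    }
```

If rocm_tag and hip both come back as None, pip has probably pulled in a CUDA or CPU wheel, and the ROCm-specific torch needs to be reinstalled.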
When you talk about "non-trivial VAE decode" you mean ComfyUI?
Unfortunately, the --no-deps option didn't help:
Wan2.1# pip3.12 install -r requirements.txt --no-deps --break-system-packages
Requirement already satisfied: torch>=2.4.0 in /usr/local/lib/python3.12/dist-packages (from -r requirements.txt (line 1)) (2.7.0.dev20250119+rocm6.3)
Requirement already satisfied: torchvision>=0.19.0 in /usr/local/lib/python3.12/dist-packages (from -r requirements.txt (line 2)) (0.22.0.dev20250119+rocm6.3)
Requirement already satisfied: opencv-python>=4.9.0.80 in /usr/local/lib/python3.12/dist-packages (from -r requirements.txt (line 3)) (4.11.0.86)
Requirement already satisfied: diffusers>=0.31.0 in /usr/local/lib/python3.12/dist-packages (from -r requirements.txt (line 4)) (0.32.2)
Collecting transformers>=4.49.0 (from -r requirements.txt (line 5))
Downloading transformers-4.49.0-py3-none-any.whl.metadata (44 kB)
ββββββββββββββββββββββββββββββββββββββββ 44.0/44.0 kB 321.4 kB/s eta 0:00:00
Requirement already satisfied: tokenizers>=0.20.3 in /usr/local/lib/python3.12/dist-packages (from -r requirements.txt (line 6)) (0.21.0)
Requirement already satisfied: accelerate>=1.1.1 in /usr/local/lib/python3.12/dist-packages (from -r requirements.txt (line 7)) (1.3.0)
Requirement already satisfied: tqdm in /usr/local/lib/python3.12/dist-packages (from -r requirements.txt (line 8)) (4.67.1)
Collecting imageio (from -r requirements.txt (line 9))
Downloading imageio-2.37.0-py3-none-any.whl.metadata (5.2 kB)
Collecting easydict (from -r requirements.txt (line 10))
Downloading easydict-1.13-py3-none-any.whl.metadata (4.2 kB)
Collecting ftfy (from -r requirements.txt (line 11))
Downloading ftfy-6.3.1-py3-none-any.whl.metadata (7.3 kB)
Collecting dashscope (from -r requirements.txt (line 12))
Downloading dashscope-1.22.2-py3-none-any.whl.metadata (6.8 kB)
Requirement already satisfied: imageio-ffmpeg in /usr/local/lib/python3.12/dist-packages (from -r requirements.txt (line 13)) (0.6.0)
Collecting flash_attn (from -r requirements.txt (line 14))
Downloading flash_attn-2.7.4.post1.tar.gz (6.0 MB)
ββββββββββββββββββββββββββββββββββββββββ 6.0/6.0 MB 1.1 MB/s eta 0:00:00
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [21 lines of output]
/tmp/pip-install-ly5kyotc/flash-attn_a10f07a52fe1407e8b9f7bc672ef64d3/setup.py:106: UserWarning: flash_attn was requested, but nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
warnings.warn(
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/tmp/pip-install-ly5kyotc/flash-attn_a10f07a52fe1407e8b9f7bc672ef64d3/setup.py", line 198, in <module>
CUDAExtension(
File "/usr/local/lib/python3.12/dist-packages/torch/utils/cpp_extension.py", line 1139, in CUDAExtension
library_dirs += library_paths(device_type="cuda")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/torch/utils/cpp_extension.py", line 1274, in library_paths
if (not os.path.exists(_join_cuda_home(lib_dir)) and
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/dist-packages/torch/utils/cpp_extension.py", line 2535, in _join_cuda_home
raise OSError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
torch.__version__ = 2.7.0.dev20250119+rocm6.3
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
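The log shows why --no-deps makes no difference here: flash_attn is listed directly in requirements.txt (line 14), and --no-deps only skips the dependencies of the listed packages, not the listed packages themselves. A possible workaround is to install from a copy of the file with that entry filtered out. The following is a rough sketch of that filtering step. The function name drop_requirements and the output filename are hypothetical, and the thread does not confirm that the repo's own scripts run without flash_attn.

```python
import re


def drop_requirements(lines, blocked=("flash_attn",)):
    """Return requirement lines with the blocked packages removed.

    Hypothetical helper: it only handles plain "name<specifier>" lines.
    """
    def norm(name):
        # pip treats "-", "_" and "." as equivalent and ignores case.
        return re.sub(r"[-_.]+", "-", name).lower()

    blocked_names = {norm(b) for b in blocked}
    kept = []
    for line in lines:
        stripped = line.strip()
        # Keep blank lines and comments untouched.
        if not stripped or stripped.startswith("#"):
            kept.append(line)
            continue
        # The package name ends at the first specifier, extra, marker or space.
        name = re.split(r"[<>=!~\[;\s]", stripped, maxsplit=1)[0]
        if norm(name) not in blocked_names:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Assumed filenames: writes a filtered copy next to the original file.
    with open("requirements.txt") as src:
        filtered = drop_requirements(src.read().splitlines())
    with open("requirements-no-flash-attn.txt", "w") as dst:
        dst.write("\n".join(filtered) + "\n")
```

The filtered file can then be passed to pip install -r in place of the original. Whether generation works afterwards depends on whether the code falls back to another attention implementation, which is not established in this thread.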
When you talk about "non-trivial VAE decode" you mean ComfyUI?
Yep.
As for your error: my apologies, I actually installed and ran the Kijai workflows, which provide a different requirements.txt that does work on my AMD GPU: https://github.com/kijai/ComfyUI-WanVideoWrapper
Also, if it helps, I am using Python 3.10 and Ubuntu 24.04, with the latest version of ComfyUI.
In case it is useful to you or others: I fixed my VAE issues by not pre-empting OOMs and changing the tile size to 256x256. A 720*480 video now decodes in 16s.
With TeaCache, SLG, and torch.compile(), I'm able to generate an 81-frame video from an input image in just under 20 minutes, with 30 steps, using the 480P 14B FP8_e5m2 model. This is with PyTorch's native flash attention (via SDP) on PyTorch 2.6+rocm6.2.4.
I also tried Flash Attention 2, but it increased the sampling time by ~50%.