Instructions to use Motif-Technologies/Motif-Video-2B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Diffusers
How to use Motif-Technologies/Motif-Video-2B with Diffusers:
pip install -U diffusers transformers accelerate
import torch
from diffusers import DiffusionPipeline

# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("Motif-Technologies/Motif-Video-2B", dtype=torch.bfloat16, device_map="cuda")

prompt = "A vibrant blue jay perches gracefully on a slender branch, its feathers shimmering in the soft morning light. The bird's keen eyes scan the surroundings, capturing the essence of the tranquil forest. It flutters its wings briefly, showcasing the intricate patterns of blue, white, and black on its plumage. The background reveals a lush canopy of green leaves, with rays of sunlight filtering through, creating a dappled effect on the forest floor. The blue jay then tilts its head, emitting a melodious call that echoes through the serene woodland, adding a touch of magic to the peaceful scene."
image = pipe(prompt).images[0]
- Notebooks
- Google Colab
- Kaggle
Enable Flash Attention by trimming prompt embedding padding
#8
by gkalstn0 - opened
Summary
- For batch_size=1 inference, trim prompt_embeds to the actual token length, removing padding
- Pass attention_mask=None to the transformer so that PyTorch SDPA can dispatch to the Flash Attention backend, which does not accept an arbitrary attention mask
- Trim positive and negative prompts independently, since the guider runs them in separate iterations
- For batch_size>1, keep the original attention_mask path so that variable-length prompts in a batch stay compatible
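The trimming rule in the summary can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `trim_prompt_embeds` is a hypothetical helper name, and it assumes right-padded prompts with an attention mask of shape (batch, seq_len) where 1 marks a real token.

```python
import torch


def trim_prompt_embeds(prompt_embeds, attention_mask):
    """Hypothetical helper (not the PR's code).

    For batch_size == 1, drop right-side padding and return mask=None.
    For batch_size > 1, return the inputs unchanged so the mask path is kept.
    Assumes padding is contiguous on the right.
    """
    if prompt_embeds.shape[0] != 1:
        return prompt_embeds, attention_mask

    # Number of real (non-padding) tokens in the single prompt.
    actual_seq_len = int(attention_mask[0].sum().item())
    return prompt_embeds[:, :actual_seq_len], None


# Usage: a single prompt padded to 512 tokens, of which 117 are real.
embeds = torch.randn(1, 512, 8)
mask = torch.zeros(1, 512, dtype=torch.long)
mask[0, :117] = 1
trimmed, trimmed_mask = trim_prompt_embeds(embeds, mask)
```

Calling the helper once for the positive prompt and once for the negative prompt gives each its own length, which matches the independent trimming described above.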
Changes
- pipeline_motif_video.py: encode_prompt() computes actual_seq_len; __call__() trims the embeddings and drops the mask
- No transformer code changes needed (the existing None guard handles a missing mask)
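For readers unfamiliar with the pattern, a None guard around the attention mask might look like the sketch below. This is an assumption about the general shape of such a guard, not the Motif-Video transformer code: `attend_to_text` is a hypothetical function, and tensors are assumed to be laid out as (batch, heads, seq_len, head_dim).

```python
import torch
import torch.nn.functional as F


def attend_to_text(query, key, value, encoder_attention_mask=None):
    """Hypothetical attention wrapper with a None guard (illustrative only).

    encoder_attention_mask: optional (batch, kv_len) tensor, nonzero = real token.
    """
    attn_mask = None
    if encoder_attention_mask is not None:
        # Boolean mask broadcast over heads and query positions;
        # True means the key position takes part in attention.
        attn_mask = encoder_attention_mask.bool()[:, None, None, :]
    # With attn_mask=None, SDPA is free to pick a backend that
    # does not support explicit masks.
    return F.scaled_dot_product_attention(query, key, value, attn_mask=attn_mask)


# Usage with no mask, as in the trimmed batch_size=1 path.
q = torch.randn(1, 2, 4, 8)
k = torch.randn(1, 2, 5, 8)
v = torch.randn(1, 2, 5, 8)
out_no_mask = attend_to_text(q, k, v)
```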
Test plan
- batch=1 with CFG: trim confirmed (positive 512 -> 117 tokens, negative 512 -> 113 tokens)
- batch>1: mask path preserved (no trim)
- Video output quality verified (720p, 121 frames, 50 steps)
- I2V compatibility: the transformer handles encoder_attention_mask=None safely
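A small numerical check can complement the visual verification. The sketch below is not part of this PR's test plan; it compares masked attention over padded keys with unmasked attention over trimmed keys on CPU, using made-up toy shapes. It only covers the case where padding appears on the key/value side, which is an assumption about how the text tokens are consumed.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy sizes (illustrative only).
batch, heads, q_len, padded_len, actual_len, dim = 1, 2, 6, 16, 5, 8

q = torch.randn(batch, heads, q_len, dim)
k = torch.randn(batch, heads, padded_len, dim)
v = torch.randn(batch, heads, padded_len, dim)

# Key padding mask: True for the first actual_len tokens.
mask = torch.zeros(batch, padded_len, dtype=torch.bool)
mask[:, :actual_len] = True

# Original path: padded keys with an explicit mask.
masked_out = F.scaled_dot_product_attention(
    q, k, v, attn_mask=mask[:, None, None, :]
)

# Trimmed path: padding removed, no mask needed.
trimmed_out = F.scaled_dot_product_attention(
    q, k[:, :, :actual_len], v[:, :, :actual_len]
)

max_abs_diff = (masked_out - trimmed_out).abs().max().item()
print(f"max abs difference: {max_abs_diff:.2e}")
```

The two paths should agree up to floating-point rounding, since masked-out keys receive zero attention weight.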
gkalstn0 changed pull request status to open
gkalstn0 changed pull request status to merged