Where can I find the inference.py mentioned in the CLI Inference section?
Hi,
I noticed that inference.py is mentioned in the CLI Inference section of the documentation, but I couldn’t find this file in the repository.
Could you please point me to its location?
inference.py is outdated; we have since updated our documentation to reflect the latest changes.
Please refer to the latest model card or the Diffusers documentation for usage.
Thank you for your reply. I noticed that your work already supports Diffusers, and I really appreciate your effort.
I actually saw the usage of inference.py on this page:
https://huggingface.co/Motif-Technologies/Motif-Video-2B/blob/main/docs/gguf-sageattention.md#benchmark
I would like to know more about the logic behind use-sage-attention. It seems that this part has not yet been added to the Diffusers pipeline. If possible, could you share more details about how it is implemented?
Thanks again for your help!
That page is also outdated and we will update it later.
Sage Attention does not support attention_mask in its interface, so before the Diffusers integration we had to hot-swap the attention logic to remove the padding tokens before attention.
With the Diffusers integration, the attention backend is handled by dispatch_attention_fn, and you can simply swap the backend with pipeline.transformer.set_attention_backend(...).
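For illustration, here is a minimal sketch of that backend swap. The helper names load_pipeline and use_attention_backend are hypothetical, the backend name "sage" and the torch_dtype argument are assumptions to verify against your installed Diffusers version, and Sage Attention itself must be installed separately.

```python
def load_pipeline(model_id="Motif-Technologies/Motif-Video-2B", device="cuda"):
    """Load the pipeline (downloads weights; needs torch and diffusers installed)."""
    # Imported lazily so the helper below can be used without these packages.
    import torch
    from diffusers import DiffusionPipeline

    # torch_dtype is an assumption; some versions may expect a different keyword.
    pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return pipeline.to(device)


def use_attention_backend(pipeline, backend="sage"):
    """Route the transformer's attention through the named backend."""
    # "sage" is an assumed backend name; the attention backends
    # documentation lists the names your version actually accepts.
    pipeline.transformer.set_attention_backend(backend)
    return pipeline
```

Typical use would be something like pipeline = use_attention_backend(load_pipeline()), followed by the usual pipeline call. Keeping the swap in its own function makes it easy to rerun the same prompt under a different backend when comparing outputs.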
If you observe inconsistent results between SDPA and Sage Attention, consider removing the padding tokens after tokenization.
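As a rough sketch of what removing padding means, the snippet below drops every position whose attention_mask entry is 0. It uses plain Python lists for a single prompt, and the function name strip_padding and the token ids are made up for illustration; the real pipeline works on tensors, so treat this as the idea rather than the actual implementation.

```python
def strip_padding(sequence, attention_mask):
    """Keep only the items whose mask entry is nonzero (i.e. real tokens)."""
    if len(sequence) != len(attention_mask):
        raise ValueError("sequence and attention_mask must have the same length")
    # Works for left or right padding, since it filters by mask, not by position.
    return [item for item, keep in zip(sequence, attention_mask) if keep]


# Hypothetical tokenizer output for one prompt, padded to length 6.
input_ids = [101, 2023, 2003, 102, 0, 0]
attention_mask = [1, 1, 1, 1, 0, 0]

unpadded_ids = strip_padding(input_ids, attention_mask)
print(unpadded_ids)  # [101, 2023, 2003, 102]
```

Note that this only works cleanly for one prompt at a time: once padding is stripped, prompts of different lengths can no longer be stacked into a single batch.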
Documentation on attention backends:
https://huggingface.co/docs/diffusers/optimization/attention_backends
Related Code:
https://github.com/huggingface/diffusers/blob/68a4847768c9a4e5e39307635ff2762ef2ef5d13/src/diffusers/models/transformers/transformer_motif_video.py#L90
https://github.com/huggingface/diffusers/blob/68a4847768c9a4e5e39307635ff2762ef2ef5d13/src/diffusers/models/transformers/transformer_motif_video.py#L187