---
license: apache-2.0
base_model: Wan-AI/Wan2.2-T2V-A14B
pipeline_tag: text-to-video
library_name: diffusers
tags:
- comfyui
- bernini-r
- wan2.2
- text-to-video
- video-editing
- reference-to-video
- fp8
---
# Bernini-R fp8 (e4m3) — for ComfyUI-BerniniR
fp8 (`float8_e4m3fn`, **weight-only**) build of **[ByteDance/Bernini-R](https://huggingface.co/ByteDance/Bernini-R)**,
which uses Wan2.2-T2V-A14B as its base. The bundle is **self-contained** (2 transformers + VAE + UMT5 + tokenizer +
scheduler) and packaged for the **[ComfyUI-BerniniR](https://github.com/neuregex/ComfyUI-BerniniR)**
custom node. It runs the **full pipeline in 24 GB** of VRAM.
The Linear weights are stored in `float8_e4m3fn` and upcast to bf16 on every forward pass;
norms, embeddings, the time and text embedders, and the patch embedding stay in bf16/fp32. These weights are
**bit-identical** to the node's on-the-fly fp8 quantization. This was validated end-to-end on an i2i edit:
same seed, same GPU → **0 pixel difference**. Loading this pre-quantized bundle is also faster,
because no bf16 → fp8 cast is needed at load.
## VRAM (measured: `torch.cuda.max_memory_allocated`, NVIDIA A10 24 GB, fp8 + sequential offload)
| Task | Frames / resolution | Peak VRAM | Fits 24 GB |
|---|---|---|---|
| i2i / t2i (image edit / image) | 1 frame, 848×848 | **~16.7 GB** | ✅ |
| t2v / v2v / rv2v (video / video edit) | **81 frames** (full length), 480p | **~18.8 GB** | ✅ |
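For readers who want to reproduce this kind of measurement, the helper below is a rough sketch built around `torch.cuda.max_memory_allocated`. The function name `measure_peak_vram_gb` is hypothetical, and the conversion to decimal GB (10^9 bytes) is an assumption; the table does not say which unit convention was used.

```python
import torch


def measure_peak_vram_gb(fn, *args, **kwargs):
    """Run fn and return (result, peak GB allocated by PyTorch).

    Returns (result, None) when no CUDA device is available.
    """
    if not torch.cuda.is_available():
        return fn(*args, **kwargs), None
    torch.cuda.synchronize()
    # Reset the peak counter so earlier allocations do not inflate the reading.
    torch.cuda.reset_peak_memory_stats()
    result = fn(*args, **kwargs)
    torch.cuda.synchronize()
    peak_bytes = torch.cuda.max_memory_allocated()
    return result, peak_bytes / 1e9  # assumption: decimal GB


def toy_workload():
    # Stand-in for a pipeline call; allocates one tensor on the best device.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    return torch.ones(1024, 1024, device=device).sum().item()


total, peak_gb = measure_peak_vram_gb(toy_workload)
print(total, peak_gb)
```

Note that `max_memory_allocated` counts memory occupied by tensors through PyTorch's allocator, so it can read lower than the per-process figure shown by `nvidia-smi`.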
The UMT5 text encoder is freed before the experts load, and sequential offload keeps only one ~14 GB expert
resident at a time. That is what makes full-length 480p video fit in 24 GB.
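A back-of-the-envelope calculation shows why both fp8 and offload matter here. The sketch below assumes roughly 14 billion parameters per expert (inferred from the "~14 GB expert" figure at 1 byte per weight) and counts weights only, in decimal GB; activations, the VAE, and other buffers are not modeled.

```python
# Bytes needed to store one weight in each format.
BYTES_PER_PARAM = {"fp8": 1, "bf16": 2}

# Assumption: ~14e9 parameters per expert, inferred from "~14 GB" in fp8.
EXPERT_PARAMS = 14e9
BUDGET_GB = 24.0


def weight_gb(num_params: float, dtype: str) -> float:
    """Weight storage in decimal GB (10^9 bytes) for the given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


one_expert_fp8 = weight_gb(EXPERT_PARAMS, "fp8")
one_expert_bf16 = weight_gb(EXPERT_PARAMS, "bf16")
two_experts_fp8 = 2 * one_expert_fp8

for label, gb in [
    ("one expert, fp8", one_expert_fp8),
    ("one expert, bf16", one_expert_bf16),
    ("two experts, fp8", two_experts_fp8),
]:
    verdict = "fits" if gb < BUDGET_GB else "does not fit"
    print(f"{label}: {gb:.1f} GB of weights, {verdict} in {BUDGET_GB:.0f} GB")
```

Under these assumptions, only the single-expert fp8 case leaves room inside a 24 GB budget, which is consistent with keeping one expert resident at a time.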
## Tasks
`t2v` · `t2i` · `i2i` (image edit) · `v2v` (video edit) · `rv2v` (video edit + reference) ·
`r2v` (reference-to-video). Edits preserve the source content/motion via Bernini's source-id RoPE
(validated qualitatively).
## Usage
Install the **[ComfyUI-BerniniR](https://github.com/neuregex/ComfyUI-BerniniR)** node, then in
`BerniniR · Load Model`:
- set **source = `neuregex/Bernini-R-fp8 (auto)`** with `auto_download = True`, which downloads ~40 GB to
`download_dir` on first run (with a free-space check and progress bar), or
- run `hf download neuregex/Bernini-R-fp8 --local-dir models/bernini/Bernini-R-fp8` and set `source = local`.
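The manual download can also be scripted from Python. The sketch below pairs `huggingface_hub.snapshot_download` with a simple free-space check; the helper names `has_free_space` and `download_bundle` are hypothetical, the 40 GB threshold is taken from the figure above, and this is not how the node implements its own check.

```python
import shutil
from pathlib import Path

REPO_ID = "neuregex/Bernini-R-fp8"
LOCAL_DIR = Path("models/bernini/Bernini-R-fp8")
REQUIRED_BYTES = 40 * 10**9  # ~40 GB, per the download size stated above


def has_free_space(path: Path, required_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    # Walk up to the nearest existing directory so the check also works
    # before the target directory has been created.
    probe = Path(path).resolve()
    while not probe.exists():
        probe = probe.parent
    return shutil.disk_usage(probe).free >= required_bytes


def download_bundle(local_dir: Path = LOCAL_DIR) -> str:
    """Download the fp8 bundle into `local_dir` and return the local path."""
    # Imported lazily so the space check is usable without huggingface_hub.
    from huggingface_hub import snapshot_download

    if not has_free_space(local_dir, REQUIRED_BYTES):
        raise RuntimeError(f"Need about 40 GB free near {local_dir}")
    local_dir.mkdir(parents=True, exist_ok=True)
    return snapshot_download(repo_id=REPO_ID, local_dir=str(local_dir))
```

Calling `download_bundle()` fetches the repository into the same directory as the CLI command, after which the node can be pointed at it with `source = local`.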
For the full **bf16** weights instead, point the node at `ByteDance/Bernini-R-Diffusers`
(`source = ... (full bf16)`), which needs more VRAM (A100-class) or on-the-fly fp8.
## Credits & license
- Algorithm & model: **Bernini: Latent Semantic Planning for Video Diffusion**, ByteDance
([arXiv:2605.22344](https://arxiv.org/abs/2605.22344) · [code](https://github.com/bytedance/Bernini)) — Apache-2.0.
- Base: [Wan2.2-T2V-A14B](https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B).
- fp8 build by **neuregex**. **Apache-2.0**.