---
license: apache-2.0
base_model: Wan-AI/Wan2.2-T2V-A14B
pipeline_tag: text-to-video
library_name: diffusers
tags:
- comfyui
- bernini-r
- wan2.2
- text-to-video
- video-editing
- reference-to-video
- fp8
---
# Bernini-R fp8 (e4m3) for ComfyUI-BerniniR

A weight-only fp8 (`float8_e4m3fn`) build of **[ByteDance/Bernini-R](https://huggingface.co/ByteDance/Bernini-R)**
(which is Wan2.2-T2V-A14B inside). The bundle is **self-contained** (2 transformers + VAE + UMT5 + tokenizer +
scheduler) and packaged for the **[ComfyUI-BerniniR](https://github.com/neuregex/ComfyUI-BerniniR)**
custom node. It runs the **full pipeline in 24 GB** of VRAM.

The Linear weights are stored in `float8_e4m3fn` and upcast to bf16 on every forward pass;
norms, embeddings, the time and text embedders, and the patch embedding stay in bf16/fp32. These weights are
**bit-identical** to the node's on-the-fly fp8 quantization. This was validated end-to-end on an i2i edit:
same seed, same GPU → **0 pixel difference**. Loading this pre-quantized bundle is also faster
(no bf16 → fp8 cast at load).
## VRAM (measured: `torch.cuda.max_memory_allocated`, NVIDIA A10 24 GB, fp8 + sequential offload)

| Task | Frames / resolution | Peak VRAM | Fits 24 GB |
|---|---|---|---|
| i2i / t2i (image edit / image) | 1 frame, 848×848 | **~16.7 GB** | ✅ |
| t2v / v2v / rv2v (video / video edit) | **81 frames** (full length), 480p | **~18.8 GB** | ✅ |

The UMT5 text encoder is freed before the experts load, and offload keeps a single ~14 GB expert
resident. That is what makes full-length 480p video fit in 24 GB.
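
A back-of-the-envelope check shows why one resident fp8 expert fits while a bf16 one would not. The parameter count of roughly 14 billion per expert is an assumption taken from the "A14B" name, not a figure stated in this card. Treating every parameter as one byte is also a simplification, since some layers stay in bf16/fp32.

```python
# Rough weight-memory estimate. PARAMS_PER_EXPERT is an assumption, not a measured value.
PARAMS_PER_EXPERT = 14e9          # assumed ~14B parameters per expert
BYTES_PER_PARAM = {"fp8": 1, "bf16": 2}
CARD_VRAM_GB = 24                 # A10 capacity from the heading above
MEASURED_PEAK_GB = 18.8           # 81-frame 480p row of the table above


def weight_gb(params: float, dtype: str) -> float:
    """Weight storage in GB (10^9 bytes) for a given dtype."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


fp8_expert_gb = weight_gb(PARAMS_PER_EXPERT, "fp8")
bf16_expert_gb = weight_gb(PARAMS_PER_EXPERT, "bf16")
# What remains of the measured peak for activations, VAE, and buffers.
non_weight_gb = MEASURED_PEAK_GB - fp8_expert_gb

print(f"fp8 expert:  {fp8_expert_gb:.1f} GB")
print(f"bf16 expert: {bf16_expert_gb:.1f} GB (exceeds {CARD_VRAM_GB} GB)")
print(f"non-weight share of measured peak: {non_weight_gb:.1f} GB")
```

Under these assumptions a single bf16 expert alone would exceed the card, which is consistent with the note that the full bf16 weights need A100-class VRAM.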
## Tasks

`t2v` · `t2i` · `i2i` (image edit) · `v2v` (video edit) · `rv2v` (video edit + reference) ·
`r2v` (reference-to-video). Edits preserve the source content and motion via Bernini's source-id RoPE
(validated qualitatively).
## Usage

Install the **[ComfyUI-BerniniR](https://github.com/neuregex/ComfyUI-BerniniR)** node, then in
`BerniniR · Load Model` do one of the following:

- Set **source = `neuregex/Bernini-R-fp8 (auto)`** with `auto_download = True`. This downloads ~40 GB to
  `download_dir` on first run (with a free-space check and progress bar).
- Run `hf download neuregex/Bernini-R-fp8 --local-dir models/bernini/Bernini-R-fp8` and use `source = local`.

For the full **bf16** weights instead, point the node at `ByteDance/Bernini-R-Diffusers`
(`source = ... (full bf16)`), which needs more VRAM (A100-class) or on-the-fly fp8.
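
For scripted setups, the manual download can also be done from Python. The sketch below is one possible approach, not part of the node: `has_free_space` and `download_bundle` are hypothetical helper names, and the 40 GB threshold simply reuses the approximate size quoted above. It relies on `huggingface_hub.snapshot_download`, which must be installed separately.

```python
import shutil
from pathlib import Path

REPO_ID = "neuregex/Bernini-R-fp8"
REQUIRED_GB = 40  # approximate bundle size quoted in this card


def has_free_space(path, required_gb: float) -> bool:
    """Return True if the filesystem holding `path` has at least `required_gb` GB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1e9


def download_bundle(local_dir: str = "models/bernini/Bernini-R-fp8") -> str:
    """Hypothetical helper: check disk space, then fetch the whole repository."""
    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    if not has_free_space(target, REQUIRED_GB):
        raise RuntimeError(f"Need about {REQUIRED_GB} GB free in {target}")
    # Imported lazily so the space check works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=str(target))
```

After `download_bundle()` returns, point the node at the folder with `source = local`.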
## Credits & license

- Algorithm & model: **Bernini: Latent Semantic Planning for Video Diffusion**, ByteDance
  ([arXiv:2605.22344](https://arxiv.org/abs/2605.22344) · [code](https://github.com/bytedance/Bernini)), Apache-2.0.
- Base: [Wan2.2-T2V-A14B](https://huggingface.co/Wan-AI/Wan2.2-T2V-A14B).
- fp8 build by **neuregex**. **Apache-2.0**.