---
license: other
license_name: circlestone-labs-non-commercial-license
license_link: https://huggingface.co/circlestone-labs/Anima/blob/main/LICENSE.md
base_model:
- circlestone-labs/Anima
base_model_relation: quantized
tags:
- comfyui
- diffusion-single-file
pipeline_tag: text-to-image
---

# INT8-Tensorwise Quantized Model of ANIMA

## !! You need a custom node to run this model on ComfyUI. See below !!

## Generation Speed

- Tested on:
  - RTX 5090 (400 W), ComfyUI with torch 2.10.0+cu130
  - RTX 3090 (280 W), ComfyUI with torch 2.9.1+cu130
  - RTX 3060 (PCIe 4.0 x4), ComfyUI with torch 2.9.1+cu130
- Generation settings: 832x1216, CFG 4.0, 30 steps, `er_sde` sampler, `simple` scheduler
- Torch Compile + Sage Attention (both from KJNodes)
  - *The RTX 3090 runs FP8 without Torch Compile, because official Triton uses `fp8e4nv`, which is not supported on the RTX 3000 series (triton-windows does support it).*
  - *INT8Rowwise runs without Torch Compile (it does not support compilation).*
- Timings are from the second run

| GPU  | BF16               | FP8                  | INT8                   | INT8Rowwise        | INT8 vs BF16 (%) |
|------|--------------------|----------------------|------------------------|--------------------|------------------|
| 5090 | 6.30 it/s (5.04s)  | 7.20 it/s (4.86s)    | **8.46 it/s (4.24s)**  | 6.20 it/s (5.36s)  | **+18.8%**       |
| 3090 | 1.70 it/s (19.36s) | *1.55 it/s (20.26s)* | **2.58 it/s (12.62s)** | 1.79 it/s (18.04s) | **+53.3%**       |
| 3060 | 1.06 it/s (29.47s) | 1.07 it/s (28.91s)   | **1.33 it/s (23.43s)** | 1.06 it/s (28.91s) | **+25.7%**       |
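
The "INT8 vs BF16 (%)" column appears to be derived from the total wall-clock times rather than the it/s figures (the two differ because total time includes per-run overhead outside the sampling loop). A minimal sketch of that calculation:

```python
def speedup_percent(baseline_seconds: float, quantized_seconds: float) -> float:
    """Relative speedup of the quantized run over the baseline, in percent,
    based on total wall-clock time."""
    return (baseline_seconds / quantized_seconds - 1.0) * 100.0

# RTX 5090: BF16 took 5.04s, INT8 took 4.24s
print(round(speedup_percent(5.04, 4.24), 1))  # matches the table's ~+18.8% column
```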
|
| | ## Sample |
| |
|
| | |quant|sample| |
| | |-----|------| |
| | |BF16 || |
| | |FP8 || |
| | |INT8 || |
| | |INT8Rowwise || |
| |
|
| | ## How to use |
| |
|
| | 1. Cloning [ComfyUI-Flux2-INT8](https://github.com/BobJohnson24/ComfyUI-Flux2-INT8) to `custom_nodes` directory. |
| | 3. Use `Load Diffusion Model INT8 (W8A8)` node to model loading and set `on_the_fly_qunatization` to **False** (default). |
| |  |
| |
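
"W8A8" means both weights and activations run in int8: the weights are quantized offline (this checkpoint), the activations are quantized on the fly, the matmul accumulates in integer arithmetic, and the result is rescaled by the product of the two scales. A minimal per-tensor sketch of the idea in plain Python (an illustration of the numeric scheme, not the node's actual implementation):

```python
def quantize_tensorwise(values):
    """Map floats to int8 codes using a single per-tensor scale."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def w8a8_dot(weights, activations):
    """int8 x int8 dot product, dequantized with both scales."""
    qw, sw = quantize_tensorwise(weights)
    qa, sa = quantize_tensorwise(activations)
    acc = sum(w * a for w, a in zip(qw, qa))  # integer accumulation
    return acc * sw * sa

w = [0.5, -1.0, 0.25]
a = [2.0, 1.0, -4.0]
print(w8a8_dot(w, a))  # close to the exact dot product of -1.0
```

The rowwise variant differs only in that each weight row gets its own scale instead of one scale per tensor, which trades a little extra metadata for lower quantization error.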
|
| |
|
| | ## Quantized layers |
| |
|

### INT8Tensorwise
```json
{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0", "adaln_modulation", ".mlp.layer2"] },
    { "policy": "int8_tensorwise", "match": ["q_proj", "k_proj", "v_proj", "output_proj", ".mlp"] }
  ]
}
```

### INT8Rowwise
```json
{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0.", "adaln_modulation", ".0.mlp", ".1.mlp", ".2.mlp", ".3.mlp"] },
    { "policy": "int8_rowwise", "match": ["q_proj", "k_proj", "v_proj", "output_proj", ".mlp"] }
  ]
}
```
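
The `rules` lists read as ordered substring filters: a layer whose name hits a `keep` pattern stays in BF16 (the first block and the AdaLN modulation layers are sensitive to quantization), and only the remaining attention projections and MLP layers are quantized. A hypothetical sketch of how such rules could be resolved (the real matching logic lives in the custom node):

```python
def resolve_policy(layer_name, rules, default="keep"):
    """Return the policy of the first rule whose match substrings hit the name."""
    for rule in rules:
        if any(pattern in layer_name for pattern in rule["match"]):
            return rule["policy"]
    return default

rules = [
    {"policy": "keep", "match": ["blocks.0.", "adaln_modulation"]},
    {"policy": "int8_tensorwise", "match": ["q_proj", "k_proj", "v_proj", "output_proj", ".mlp"]},
]

print(resolve_policy("net.blocks.0.q_proj", rules))   # keep: the first rule wins
print(resolve_policy("net.blocks.12.q_proj", rules))  # int8_tensorwise
```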