# FP8 Quantized model of ANIMA

!! I have changed the models recently. Please redownload if the hash is different. !!

There are two models: FP8 and NVFP4Mixed.

- FP8 (2.4GB): (recommended) maximizes generation speed while preserving quality as much as possible.
- NVFP4Mixed (2.0GB): (marginal quality loss) a mixture of FP8 and NVFP4.

To use torch.compile, use the TorchCompileModelAdvanced node from KJNodes, set the mode to max-autotune-no-cudagraphs, and make sure dynamic is set to false.
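
For reference, outside ComfyUI the same node settings correspond to a plain `torch.compile` call. A minimal sketch, where `load_anima_model` is a hypothetical placeholder for your own model-loading code:

```python
import torch

model = load_anima_model()  # hypothetical loader; stands in for your own loading code

# Same settings as the TorchCompileModelAdvanced node described above:
# max-autotune without CUDA graphs, and static shapes.
compiled = torch.compile(
    model,
    mode="max-autotune-no-cudagraphs",
    dynamic=False,
)
```

With `dynamic=False`, the compiler specializes on the first input shapes and recompiles if they change, which is fine when generating at a fixed resolution.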

## Generation speed

Tested on:

- RTX 5090 (400W), ComfyUI with the --fast option, torch 2.10.0+cu130
- Generation settings: 832x1216, 30 steps, CFG 4.0, er_sde sampler, simple scheduler
| quant    | none                    | sage + torch.compile    |
|----------|-------------------------|-------------------------|
| bf16     | 7.13s / 4.21it/s        | 5.16s / 5.81it/s (+38%) |
| fp8      | 6.66s / 4.50it/s (+11%) | 4.52s / 6.64it/s (+58%) |
| nvfp4mix | 6.37s / 4.71it/s (+12%) | 4.99s / 6.01it/s (+43%) |

## Sample

| quant      | sample              |
|------------|---------------------|
| bf16       | anima-preview-bf16  |
| fp8        | anima-preview-fp8   |
| nvfp4mixed | anima-preview-nvfp4 |

## Quantized layers

### fp8

```json
{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0", "blocks.1."] },
    { "policy": "float8_e4m3fn", "match": ["q_proj", "k_proj", "v_proj", "o_proj", "output_proj", ".mlp"] },
    { "policy": "nvfp4", "match": [] }
  ]
}
```

### nvfp4mixed

```json
{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0."] },
    { "policy": "float8_e4m3fn", "match": [
            "blocks.1.k_proj", "blocks.1.q_proj", "blocks.1.output_proj",
            "blocks.27.k_proj", "blocks.27.q_proj", "blocks.27.output_proj",
            "v_proj", "adaln_modulation", ".mlp"
    ] },
    { "policy": "nvfp4", "match": ["k_proj", "q_proj", "output_proj"] }
  ]
}
```
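
To make the rule lists above concrete, here is a minimal sketch of one plausible reading of the `comfy_quant` format; it is an assumption, not ComfyUI's actual implementation. Rules are assumed to be checked in order with substring matching, the first hit wins, and a layer that matches no rule (including an empty `match` list, read here as matching nothing) keeps its original precision:

```python
# Assumed semantics for comfy_quant rule lists -- NOT verified against
# ComfyUI's implementation: rules are checked in order, matching is by
# substring, the first hit wins, and an empty "match" list matches nothing.
def resolve_policy(param_name: str, rules: list[dict]) -> str:
    for rule in rules:
        if any(pattern in param_name for pattern in rule["match"]):
            return rule["policy"]
    return "keep"  # assumed default: unmatched layers stay at original precision

fp8_rules = [
    {"policy": "keep", "match": ["blocks.0", "blocks.1."]},
    {"policy": "float8_e4m3fn",
     "match": ["q_proj", "k_proj", "v_proj", "o_proj", "output_proj", ".mlp"]},
    {"policy": "nvfp4", "match": []},
]

print(resolve_policy("net.blocks.0.attn.q_proj.weight", fp8_rules))  # keep
print(resolve_policy("net.blocks.5.attn.q_proj.weight", fp8_rules))  # float8_e4m3fn
```

Under this reading, the fp8 model keeps the first two blocks untouched and quantizes the attention projections and MLPs everywhere else, while nvfp4mixed keeps only block 0 untouched, holds v_proj, adaln_modulation, the MLPs, and the q/k/output projections of blocks 1 and 27 at FP8, and drops the remaining q/k/output projections to NVFP4.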