Anima-Base-FP8

This repository provides FP8-quantized versions of the Anima-Base model.

Quantizing the weights to FP8 reduces VRAM usage while largely preserving generation quality, which makes the model easier to run on consumer-grade GPUs with limited VRAM.

Quantization Tool

This model was quantized using the following open-source tool:

Quantized Models

Two quantized variants are provided: FP8 and FP8-balanced.

  • FP8 (2.4GB): (recommended) maximizes generation speed while preserving quality as much as possible.
  • FP8-balanced (2.7GB): (personal preference) keeps the first two and last two blocks (0, 1, 26, 27) intact and quantizes only the self-attention and MLP layers. As a result, its output quality is very close to that of the original BF16 model.

Quant sample

  • bf16: Anima_Base_v1_00001_
  • fp8: Anima_Base_v1_fp8_00001_
  • fp8-balanced: Anima_Base_v1_fp8_balanced_00001_

Quantized layers

fp8

{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0", "blocks.1."] },
    { "policy": "float8_e4m3fn", "match": ["q_proj", "k_proj", "v_proj", "o_proj", "output_proj", ".mlp"] },
    { "policy": "nvfp4", "match": [] }
  ]
}
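
For readers unfamiliar with the float8_e4m3fn policy named above, here is a minimal pure-Python sketch of what rounding a single value to that format looks like (4 exponent bits, 3 mantissa bits, largest finite value 448). It is an illustration only, not the quantization tool's implementation: the function name round_to_e4m3fn is hypothetical, saturating out-of-range values is an assumption, and real quantizers usually also apply a per-tensor or per-block scale before rounding.

```python
import math

E4M3_MAX = 448.0      # largest finite float8_e4m3fn value
MIN_NORMAL_EXP = -6   # exponent of the smallest normal value (2**-6)
MANTISSA_BITS = 3     # explicit mantissa bits in e4m3


def round_to_e4m3fn(x: float) -> float:
    """Round x to the nearest value representable in float8_e4m3fn (hypothetical helper)."""
    if x == 0.0 or math.isnan(x):
        return x
    sign = -1.0 if x < 0 else 1.0
    # Assumption: out-of-range magnitudes saturate to the maximum finite value.
    a = min(abs(x), E4M3_MAX)
    # frexp gives a = m * 2**ex with 0.5 <= m < 1, so floor(log2(a)) == ex - 1.
    _, ex = math.frexp(a)
    # Subnormal values all share the minimum exponent.
    exp = max(ex - 1, MIN_NORMAL_EXP)
    # Spacing between neighbouring representable values at this exponent.
    step = 2.0 ** (exp - MANTISSA_BITS)
    # Python's round() resolves ties to even, matching round-to-nearest-even.
    q = round(a / step) * step
    return sign * min(q, E4M3_MAX)


if __name__ == "__main__":
    for value in (0.3, 1.06, -0.3, 1000.0):
        print(value, "->", round_to_e4m3fn(value))
```

With only 3 mantissa bits, each power-of-two range holds 8 representable values, which is why keeping the most sensitive blocks in their original precision can matter for quality.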

fp8-balanced

{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0.", "blocks.1.", "blocks.26.", "blocks.27."] },
    { "policy": "float8_e4m3fn", "match": ["self_attn.", ".mlp"] },
    { "policy": "nvfp4", "match": [] }
  ]
}
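
The two rule sets above can be compared with a small sketch that resolves a policy for a given tensor name. This is not the tool's actual matching logic. It assumes that patterns are plain substring matches, that the first matching rule wins, that tensors outside "block_names" or without any match stay unquantized, and the tensor names used in the demo are hypothetical.

```python
FP8_RULES = {
    "format": "comfy_quant",
    "block_names": ["net.blocks."],
    "rules": [
        {"policy": "keep", "match": ["blocks.0", "blocks.1."]},
        {"policy": "float8_e4m3fn",
         "match": ["q_proj", "k_proj", "v_proj", "o_proj", "output_proj", ".mlp"]},
        {"policy": "nvfp4", "match": []},
    ],
}

FP8_BALANCED_RULES = {
    "format": "comfy_quant",
    "block_names": ["net.blocks."],
    "rules": [
        {"policy": "keep",
         "match": ["blocks.0.", "blocks.1.", "blocks.26.", "blocks.27."]},
        {"policy": "float8_e4m3fn", "match": ["self_attn.", ".mlp"]},
        {"policy": "nvfp4", "match": []},
    ],
}


def resolve_policy(tensor_name: str, config: dict, default: str = "keep") -> str:
    """Return the policy of the first rule whose pattern occurs in tensor_name.

    Assumptions (not confirmed by the source): substring matching,
    first-match-wins ordering, and `default` for anything unmatched.
    """
    if not any(prefix in tensor_name for prefix in config["block_names"]):
        return default
    for rule in config["rules"]:
        if any(pattern in tensor_name for pattern in rule["match"]):
            return rule["policy"]
    return default


if __name__ == "__main__":
    # Hypothetical tensor names, chosen only to exercise the rules.
    names = [
        "net.blocks.0.self_attn.q_proj.weight",
        "net.blocks.10.self_attn.q_proj.weight",
        "net.blocks.5.cross_attn.q_proj.weight",
        "net.blocks.27.mlp.layer1.weight",
    ]
    for name in names:
        print(name,
              "| fp8:", resolve_policy(name, FP8_RULES),
              "| fp8-balanced:", resolve_policy(name, FP8_BALANCED_RULES))
```

Under these assumptions, the trailing dot in patterns such as "blocks.1." is what stops blocks 10 through 19 from being caught by the keep rule, and the empty nvfp4 match list means that policy never applies in either variant.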

Acknowledgments
