---
license: other
license_name: circlestone-labs-non-commercial-license
license_link: https://huggingface.co/circlestone-labs/Anima/blob/main/LICENSE.md
base_model:
- circlestone-labs/Anima
pipeline_tag: text-to-image
base_model_relation: quantized
---

# Anima-Base-FP8

![Anima_Upscale_00014_](https://cdn-uploads.huggingface.co/production/uploads/6a1af8de8e92da1d819b0197/WmrRmb6DEIJra2PVFo6JO.png)

This repository provides **FP8-quantized versions** of the [Anima-Base](https://huggingface.co/circlestone-labs/Anima) model.

Quantizing the weights to FP8 reduces VRAM usage compared with the original BF16 checkpoint while keeping generation quality high, which makes the model easier to run on consumer-grade GPUs with limited VRAM.

## Quantization Tool

This model was quantized using the following open-source tool:
* **Quantizer**: [comfy-dit-quantizer](https://github.com/bedovyy/comfy-dit-quantizer)

## Quantized Models

Two variants are provided: FP8 and FP8-balanced.

- **FP8** (2.4 GB, ***recommended***): maximizes generation speed while preserving quality as much as possible.
- **FP8-balanced** (2.7 GB, ***author's personal preference***): keeps the first two and last two blocks (0, 1, 26, 27) unquantized and quantizes only the self-attention and MLP layers. As a result, its output stays very close to that of the original BF16 model.

| Variant | Sample |
|---------|--------|
| **bf16**       |![Anima_Base_v1_00001_](https://cdn-uploads.huggingface.co/production/uploads/6a1af8de8e92da1d819b0197/RcoUaA1m8gM5a8h4-zAZg.png)|
| **fp8**        |![Anima_Base_v1_fp8_00001_](https://cdn-uploads.huggingface.co/production/uploads/6a1af8de8e92da1d819b0197/4qRM9E4dcog7S2BwVAo7x.png)|
| **fp8-balanced** |![Anima_Base_v1_fp8_balanced_00001_](https://cdn-uploads.huggingface.co/production/uploads/6a1af8de8e92da1d819b0197/hBvFppfU_IyR0MLLAERFY.png)|
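
To check which tensors in a downloaded file are actually stored in FP8, the header of a `.safetensors` file can be read without loading any weights. The following is a minimal sketch using only the standard library. It relies on the safetensors layout (an 8-byte little-endian header length followed by a JSON header) and assumes the quantized weights carry the standard `F8_E4M3` dtype tag. The file name in the usage comment is hypothetical.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header size as an unsigned little-endian integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


def dtype_counts(path):
    """Count tensors per dtype tag (for example "BF16" or "F8_E4M3")."""
    header = read_safetensors_header(path)
    return Counter(
        info["dtype"]
        for name, info in header.items()
        if name != "__metadata__"  # optional free-form metadata entry, not a tensor
    )


# Usage (hypothetical file name, replace with the file you downloaded):
# print(dtype_counts("anima-base-fp8.safetensors"))
```

Comparing the counts for the FP8 and FP8-balanced files gives a quick view of how many tensors each variant leaves in higher precision.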


## Quantized Layers

### fp8
```json
{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0", "blocks.1."] },
    { "policy": "float8_e4m3fn", "match": ["q_proj", "k_proj", "v_proj", "o_proj", "output_proj", ".mlp"] },
    { "policy": "nvfp4", "match": [] }
  ]
}
```

### fp8-balanced
```json
{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0.", "blocks.1.", "blocks.26.", "blocks.27."] },
    { "policy": "float8_e4m3fn", "match": ["self_attn.", ".mlp"] },
    { "policy": "nvfp4", "match": [] }
  ]
}
```
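
The sketch below shows one plausible way to read these rule lists. It is not the documented behavior of comfy-dit-quantizer. The assumptions are: `match` entries are plain substrings, rules are checked in order and the first match wins, `block_names` acts as a prefix filter, and tensors that match nothing are kept in their original precision. The function `policy_for` and the layer names are hypothetical and exist only for illustration.

```python
import json

# The fp8-balanced config from above, embedded so the example is self-contained.
CONFIG = json.loads("""
{
  "format": "comfy_quant",
  "block_names": ["net.blocks."],
  "rules": [
    { "policy": "keep", "match": ["blocks.0.", "blocks.1.", "blocks.26.", "blocks.27."] },
    { "policy": "float8_e4m3fn", "match": ["self_attn.", ".mlp"] },
    { "policy": "nvfp4", "match": [] }
  ]
}
""")


def policy_for(layer_name, config):
    """Pick a policy for one tensor name (assumed semantics, see lead-in)."""
    # Assumption: tensors outside the listed block prefixes are never quantized.
    if not any(layer_name.startswith(prefix) for prefix in config["block_names"]):
        return "keep"
    # Assumption: substring matching, first matching rule wins.
    for rule in config["rules"]:
        if any(pattern in layer_name for pattern in rule["match"]):
            return rule["policy"]
    # Assumption: anything left unmatched stays in its original precision.
    return "keep"


# Hypothetical tensor names, chosen only to exercise each rule.
for name in [
    "net.blocks.0.self_attn.q_proj.weight",
    "net.blocks.10.self_attn.q_proj.weight",
    "net.blocks.12.mlp.layer1.weight",
    "net.blocks.27.mlp.layer1.weight",
    "net.final_layer.linear.weight",
]:
    print(name, "->", policy_for(name, CONFIG))
```

Under these assumptions, the trailing dot in `"blocks.1."` is what prevents blocks 10 to 19 from matching the `keep` rule, and the empty `match` list means the `nvfp4` rule never applies.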

## Acknowledgments

* **Inspired by**: [Bedovyy](https://huggingface.co/Bedovyy) and his [Anima-FP8](https://huggingface.co/Bedovyy/Anima-FP8) repository.