---
license: other
license_name: circlestone-labs-non-commercial-license
license_link: https://huggingface.co/circlestone-labs/Anima/blob/main/LICENSE.md
base_model:
- circlestone-labs/Anima
base_model_relation: quantized
---
# GGUF models of ANIMA
## How to use
- **Update ComfyUI to v0.14.1 or above.** *(No custom script is needed.)*
- Download the GGUF file you want (Q5_K_M or higher is recommended).
- Use the [ComfyUI-GGUF](https://github.com/city96/ComfyUI-GGUF) custom node to load the model.
## Generation speed
Tested on:
- RTX 5090 (400W), ComfyUI with the `--fast` option and the `Patch Sage Attention KJ` node (AUTO).
- 832x1216, CFG 5.0, 50 steps
| Quant | it/s | Time (s) | Speed vs BF16 (%) |
| ------ | ---- | --------- | ---------------- |
| BF16 | **4.65** | **11.70** | 0.00% |
| Q8_0 | *4.46* | *12.07* | -4.09% |
| Q6_K | 3.60 | 14.91 | -22.58% |
| Q5_K_S | 3.35 | 15.94 | -28.03% |
| Q5_K_M | 3.41 | 15.67 | -26.67% |
| Q5_1 | 3.42 | 15.24 | -26.45% |
| Q5_0 | 3.40 | 15.73 | -26.88% |
| Q4_K_S | 3.55 | 15.12 | -23.66% |
| Q4_K_M | 3.59 | 14.98 | -22.80% |
| Q4_1 | 4.01 | 13.46 | -13.76% |
| Q4_0 | 3.97 | 13.50 | -14.62% |
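
The last column appears to be the relative change in it/s against the BF16 baseline. Below is a minimal sketch of that calculation, assuming this formula is the one used. The helper name `speed_vs_baseline` is hypothetical, and the inputs are copied from the table.

```python
def speed_vs_baseline(its: float, baseline_its: float) -> float:
    """Relative change in iterations per second versus a baseline, in percent."""
    return (its / baseline_its - 1.0) * 100.0


# it/s values copied from the table above.
BF16_ITS = 4.65
measured = {"Q8_0": 4.46, "Q6_K": 3.60, "Q4_K_M": 3.59}

for quant, its in measured.items():
    print(f"{quant}: {speed_vs_baseline(its, BF16_ITS):+.2f}%")
```

Because the it/s values in the table are rounded to two decimals, a recomputed percentage can differ slightly from the published column (Q5_K_S is one such row).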
## Sample
### anima-preview3-base
![26-04-09-Anima_00007_](https://cdn-uploads.huggingface.co/production/uploads/63fbf6951b4b1bd4e706fed1/BSEQSYZniROqpfqiivPmB.webp)
### anima-preview2
![26-03-12-Anima_00031_](https://cdn-uploads.huggingface.co/production/uploads/63fbf6951b4b1bd4e706fed1/XXtcZh7Au0RO8JC1A0R6A.webp)
### anima-preview
![Anima_GGUF_Comparison](https://cdn-uploads.huggingface.co/production/uploads/63fbf6951b4b1bd4e706fed1/mTWE0WdRsuvoJEwRMeEWr.webp)
## How to reproduce
1. Convert the BF16 model to FP32.
```python
import sys

import safetensors.torch
import torch


def convert_to_fp32(input_path, output_path):
    """Load a safetensors checkpoint, cast every tensor to FP32, and save it."""
    state_dict = safetensors.torch.load_file(input_path)
    new_state_dict = {}
    for key, tensor in state_dict.items():
        print(f"{key} ({tensor.dtype}) -> torch.float32")
        new_state_dict[key] = tensor.to(torch.float32)
    safetensors.torch.save_file(new_state_dict, output_path)
    print(f"output_path: {output_path}")


if __name__ == "__main__":
    # Expect exactly two arguments: the source file and the target file.
    if len(sys.argv) != 3:
        sys.exit(f"usage: {sys.argv[0]} SOURCE TARGET")
    input_path, output_path = sys.argv[1:3]
    convert_to_fp32(input_path, output_path)
```
2. Read [this manual](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/README.md).
3. Make an F32 GGUF using [convert.py](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/convert.py#L258).
4. Run `llama-quantize`.
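
Steps 3 and 4 can be sketched as two command lines. This is only an illustration. The `--src`/`--dst` flags for `convert.py` and the `llama-quantize INPUT OUTPUT TYPE` argument order are assumptions based on the ComfyUI-GGUF tools manual linked in step 2, and all file names are hypothetical. The sketch also does not cover any build or patch steps for `llama-quantize` that the manual describes. Check the manual before running anything.

```python
import shlex


def build_commands(fp32_model: str, f32_gguf: str, quant_gguf: str, quant_type: str):
    """Return the argv lists for steps 3 and 4 (nothing is executed here)."""
    # Step 3: convert the FP32 safetensors file to an F32 GGUF (flags are assumed).
    convert = ["python", "convert.py", "--src", fp32_model, "--dst", f32_gguf]
    # Step 4: quantize the F32 GGUF to the requested type.
    quantize = ["llama-quantize", f32_gguf, quant_gguf, quant_type]
    return [convert, quantize]


# Hypothetical file names, for illustration only.
for argv in build_commands("anima-fp32.safetensors", "anima-F32.gguf",
                           "anima-Q5_K_M.gguf", "Q5_K_M"):
    print(shlex.join(argv))
```

To actually run the commands, each list can be passed to `subprocess.run(argv, check=True)`.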