---
license: other
license_name: circlestone-labs-non-commercial-license
license_link: https://huggingface.co/circlestone-labs/Anima/blob/main/LICENSE.md
base_model:
- circlestone-labs/Anima
base_model_relation: quantized
---

# GGUF models of ANIMA

## How to use

- **Update ComfyUI to v0.14.1 or later.** *(No custom script is needed.)*
- Download the GGUF file you want. (At least Q5_K_M is recommended.)
- Load the model with the [ComfyUI-GGUF](https://github.com/city96/ComfyUI-GGUF) custom node.

## Generation speed

Tested with:
- RTX 5090 (400 W), ComfyUI with the `--fast` option and the `Patch Sage Attention KJ` node (AUTO).
- 832x1216, CFG 5.0, 50 steps

| | Quant | it/s | Time (s) | Speed vs BF16 (%) | |
| | ------ | ---- | --------- | ---------------- | |
| | BF16 | **4.65** | **11.70** | 0.00% | |
| | Q8_0 | *4.46* | *12.07* | -4.09% | |
| | Q6_K | 3.60 | 14.91 | -22.58% | |
| | Q5_K_S | 3.35 | 15.94 | -28.03% | |
| | Q5_K_M | 3.41 | 15.67 | -26.67% | |
| | Q5_1 | 3.42 | 15.24 | -26.45% | |
| | Q5_0 | 3.40 | 15.73 | -26.88% | |
| | Q4_K_S | 3.55 | 15.12 | -23.66% | |
| | Q4_K_M | 3.59 | 14.98 | -22.80% | |
| | Q4_1 | 4.01 | 13.46 | -13.76% | |
| | Q4_0 | 3.97 | 13.50 | -14.62% | |
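
The last column appears to be the relative change in it/s against the BF16 row, i.e. `(it/s / BF16 it/s - 1) * 100`. The sketch below recomputes it from the it/s values in the table; the function name `speed_vs_bf16` is only illustrative. Because the table's it/s values are already rounded, a recomputed figure can differ in the second decimal (Q5_K_S comes out slightly different from the listed -28.03%).

```python
# it/s values copied from the table above.
ITS = {
    "BF16": 4.65,
    "Q8_0": 4.46,
    "Q6_K": 3.60,
    "Q5_K_S": 3.35,
    "Q5_K_M": 3.41,
    "Q5_1": 3.42,
    "Q5_0": 3.40,
    "Q4_K_S": 3.55,
    "Q4_K_M": 3.59,
    "Q4_1": 4.01,
    "Q4_0": 3.97,
}


def speed_vs_bf16(its, baseline=ITS["BF16"]):
    """Relative change in throughput versus the BF16 baseline, in percent."""
    return (its / baseline - 1.0) * 100.0


# Print one line per quant with the recomputed percentage.
for quant, its in ITS.items():
    print(f"{quant:7s} {speed_vs_bf16(its):+.2f}%")
```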

## Sample

### anima-preview3-base



### anima-preview2



### anima-preview



## How to reproduce

1. Convert the BF16 model to FP32.

```python
import sys

import safetensors.torch
import torch


def convert_to_fp32(input_path, output_path):
    """Load a safetensors checkpoint and save a copy with every tensor cast to FP32."""
    state_dict = safetensors.torch.load_file(input_path)

    new_state_dict = {}
    for key, tensor in state_dict.items():
        print(f"{key} ({tensor.dtype}) -> torch.float32")
        new_state_dict[key] = tensor.to(torch.float32)

    safetensors.torch.save_file(new_state_dict, output_path)
    print(f"output_path: {output_path}")


if __name__ == "__main__":
    # Exit with a usage message unless exactly two paths are given.
    if len(sys.argv) != 3:
        sys.exit(f"usage: {sys.argv[0]} SOURCE TARGET")
    input_path, output_path = sys.argv[1:3]

    convert_to_fp32(input_path, output_path)
```

2. Read [this manual](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/README.md).
3. Make an F32 GGUF using [convert.py](https://github.com/city96/ComfyUI-GGUF/blob/main/tools/convert.py#L258).
4. Run `llama-quantize`.
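
As a rough sketch of steps 3 and 4, the helper below only builds and prints the two command lines; it does not run anything. The `--src`/`--dst` flags for `convert.py` and the `INPUT OUTPUT TYPE` argument order for `llama-quantize` are assumptions based on the ComfyUI-GGUF tools manual and should be checked against it. All file names are hypothetical. The link in step 3 points at a specific line of `convert.py`; whatever adjustment is intended there is not captured here.

```python
import shlex


def build_commands(fp32_safetensors, f32_gguf, quant_gguf, quant_type="Q5_K_M"):
    """Return (convert_cmd, quantize_cmd) as argument lists, without running them."""
    # Assumed flags: --src for the input checkpoint, --dst for the output GGUF.
    convert_cmd = ["python", "convert.py", "--src", fp32_safetensors, "--dst", f32_gguf]
    # Assumed argument order: input GGUF, output GGUF, quantization type.
    quantize_cmd = ["llama-quantize", f32_gguf, quant_gguf, quant_type]
    return convert_cmd, quantize_cmd


# Hypothetical file names, for illustration only.
for cmd in build_commands("anima-fp32.safetensors", "anima-F32.gguf", "anima-Q5_K_M.gguf"):
    print(shlex.join(cmd))
```

Building the commands as lists keeps paths with spaces intact if they are later passed to `subprocess.run`.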
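
Before running the GGUF conversion, it can be useful to confirm that step 1 really produced an all-FP32 file. A minimal sketch, assuming the standard safetensors layout (an 8-byte little-endian header length followed by a JSON header): it reads only the header, so it needs neither `torch` nor the `safetensors` package. The function names are illustrative.

```python
import json
import struct
from collections import Counter


def safetensors_dtypes(path):
    """Count tensor dtypes by reading only the JSON header of a .safetensors file."""
    with open(path, "rb") as f:
        # The file starts with an 8-byte little-endian unsigned header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return Counter(
        info["dtype"] for name, info in header.items() if name != "__metadata__"
    )


def is_all_fp32(path):
    """True if the file contains at least one tensor and all tensors are F32."""
    counts = safetensors_dtypes(path)
    return bool(counts) and set(counts) == {"F32"}


# Usage (hypothetical file name):
#   print(safetensors_dtypes("anima-fp32.safetensors"))
#   print(is_all_fp32("anima-fp32.safetensors"))
```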