---
license: apache-2.0
base_model:
- lodestones/Chroma1-Base
base_model_relation: quantized
language:
- en
pipeline_tag: text-to-image
library_name: diffusers
---
For more information (including how to compress models yourself), check out https://huggingface.co/DFloat11 and https://github.com/LeanModels/DFloat11

This is my first time using DF11 to compress a model outside the Flux architecture. Compressing Flux-based models is more straightforward than compressing other architectures: the compression code requires a `pattern_dict` as input, but the original [example code](https://github.com/LeanModels/DFloat11/tree/master/examples/compress_flux1) only provides one for Flux, so I had to learn the notation myself and adapt it to other models. Fortunately, Chroma is just a pruned version of Flux, so deriving the correct `pattern_dict` was relatively simple this time. Do let me know if you run into any problems.

This is the `pattern_dict` I used for compression:
```python
# Keys are regular expressions (raw strings, so that `\.` and `\d` are not
# treated as string escapes); values list the layers selected in each block.
pattern_dict = {
    r"transformer_blocks\.\d+": (
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
        "attn.add_k_proj",
        "attn.add_v_proj",
        "attn.add_q_proj",
        "attn.to_out.0",
        "attn.to_add_out",
        "ff.net.0.proj",
        "ff.net.2",
        "ff_context.net.0.proj",
        "ff_context.net.2",
    ),
    r"single_transformer_blocks\.\d+": (
        "proj_mlp",
        "proj_out",
        "attn.to_q",
        "attn.to_k",
        "attn.to_v",
    ),
}
```
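
To make the notation concrete, here is a minimal sketch of how a `pattern_dict` of this shape could be expanded into full layer names. It assumes that each key is a regular expression that must match a block's whole name (`re.fullmatch`) and that each tuple entry is a sub-module path relative to the matched block. That is an assumption about the notation, not a description of DFloat11's actual implementation. The helper `expand_patterns` and the example block names are hypothetical.

```python
import re

# Abbreviated copy of the pattern_dict above (raw strings keep the regex escapes intact).
pattern_dict = {
    r"transformer_blocks\.\d+": ("attn.to_q", "ff.net.0.proj"),
    r"single_transformer_blocks\.\d+": ("proj_mlp", "proj_out"),
}


def expand_patterns(block_name, patterns=pattern_dict):
    """Return the full names of the layers selected inside `block_name`.

    Hypothetical helper: assumes keys are regexes that must match the
    whole block name, and values are paths relative to that block.
    """
    selected = []
    for pattern, sub_modules in patterns.items():
        if re.fullmatch(pattern, block_name):
            selected.extend(f"{block_name}.{sub}" for sub in sub_modules)
    return selected


print(expand_patterns("transformer_blocks.0"))
print(expand_patterns("single_transformer_blocks.37"))
print(expand_patterns("norm_out"))  # not covered by any pattern
```

Under full matching, the first key does not also select `single_transformer_blocks.N`, even though that name contains `transformer_blocks.N` as a substring; with a substring search such as `re.search` it would.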

### How to Use

#### `diffusers`

1. Install the DFloat11 pip package *(installs the CUDA kernel automatically; requires a CUDA-compatible GPU and PyTorch installed)*:

    ```bash
    # Quoting keeps shells such as zsh from treating the brackets as a glob.
    pip install "dfloat11[cuda12]"
    # or if you have CUDA version 11:
    # pip install "dfloat11[cuda11]"
    ```
2. To use the DFloat11 model, run the following example code in Python:
    ```python
    import torch
    from diffusers import ChromaPipeline, ChromaTransformer2DModel
    from dfloat11 import DFloat11Model
    from transformers.modeling_utils import no_init_weights

    # Build the transformer from its config only, skipping weight initialization;
    # the DF11 weights are loaded into it below.
    with no_init_weights():
        transformer = ChromaTransformer2DModel.from_config(
            ChromaTransformer2DModel.load_config(
                "lodestones/Chroma1-Base",
                subfolder="transformer"
            ),
            torch_dtype=torch.bfloat16
        ).to(torch.bfloat16)

    # Load the rest of the pipeline and plug in the empty transformer.
    pipe = ChromaPipeline.from_pretrained(
        "lodestones/Chroma1-Base",
        transformer=transformer,
        torch_dtype=torch.bfloat16
    )

    # Load the DF11-compressed weights into the transformer.
    DFloat11Model.from_pretrained("mingyi456/Chroma1-Base-DF11", device='cpu', bfloat16_model=pipe.transformer)

    # Keep idle components on the CPU to reduce peak VRAM usage.
    pipe.enable_model_cpu_offload()
    prompt = "A high-fashion close-up portrait of a blonde woman in clear sunglasses. The image uses a bold teal and red color split for dramatic lighting. The background is a simple teal-green. The photo is sharp and well-composed, and is designed for viewing with anaglyph 3D glasses for optimal effect. It looks professionally done."
    negative_prompt = "low quality, ugly, unfinished, out of focus, deformed, disfigure, blurry, smudged, restricted palette, flat colors"
    image = pipe(
        prompt,
        negative_prompt=negative_prompt,
        generator=torch.Generator("cpu").manual_seed(0)
    ).images[0]
    image.save("Chroma1-Base.png")
    ```
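
Before running the example above, it can help to confirm that the packages it imports are installed, and to check which CUDA version PyTorch was built against as a hint for choosing between the `cuda12` and `cuda11` extras. The following is a small sketch that uses only the standard library plus an optional `torch` import. The helper names `missing_packages` and `suggested_extra` are made up for illustration, and the mapping from CUDA version to extra is an assumption based on the install commands above.

```python
import importlib.util

# Top-level packages imported by the usage example above.
REQUIRED = ("torch", "diffusers", "transformers", "dfloat11")


def missing_packages(names=REQUIRED):
    """Return the names in `names` that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def suggested_extra():
    """Guess a dfloat11 extra from the CUDA version PyTorch was built with.

    Returns None when PyTorch is absent, is a CPU-only build, or reports a
    CUDA major version other than 11 or 12 (assumed mapping, see above).
    """
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    cuda_version = torch.version.cuda  # None for CPU-only builds
    if cuda_version is None:
        return None
    major = cuda_version.split(".")[0]
    return {"12": "cuda12", "11": "cuda11"}.get(major)


if __name__ == "__main__":
    print("Missing packages:", missing_packages() or "none")
    print("Suggested extra:", suggested_extra())
```

If `missing_packages()` returns a non-empty list, install those packages first; if `suggested_extra()` returns `None`, check your CUDA setup manually.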
#### ComfyUI
~~Follow the instructions (have not tested myself) here: https://github.com/LeanModels/ComfyUI-DFloat11~~

This model does not currently work with ComfyUI out of the box, because the custom node only supports Flux models. It should be possible to modify the node's code to load this model, but doing so requires a second `pattern_dict` whose form is completely different from the one used to compress the model. If you are interested in running this model in ComfyUI, please contact the developer to request support.