---
language:
- en
license: apache-2.0
pipeline_tag: image-to-image
tags:
- pruna-ai
- safetensors
---

# Model Card for PrunaAI/flux2-klein-4b-optimized-smashed

This model was created using the [pruna](https://github.com/PrunaAI/pruna) library. Pruna is a model optimization framework built for developers, enabling you to deliver more efficient models with minimal implementation overhead.

## Usage

First, install the pruna library:

```bash
pip install pruna
```

You can [use the library_name library to load the model](https://huggingface.co/PrunaAI/flux2-klein-4b-optimized-smashed?library=library_name), but this may not apply all optimizations by default.

To ensure that all optimizations are applied, load the model with the pruna library:

```python
from pruna import PrunaModel

loaded_model = PrunaModel.from_pretrained(
    "PrunaAI/flux2-klein-4b-optimized-smashed"
)
# we can then run inference using the methods supported by the base model
```

For more information, see [the Pruna documentation](https://docs.pruna.ai/en/stable/).

## Smash Configuration

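The compression configuration of the model is stored in the `smash_config.json` file, which describes the optimization methods that were applied to the model. For this model, the enabled methods are `fora`, `torch_compile`, and `torchao` (with `torchao_quant_type` set to `fp8wo`).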

```json
{
    "awq": false,
    "c_generate": false,
    "c_translate": false,
    "c_whisper": false,
    "deepcache": false,
    "diffusers_int8": false,
    "fastercache": false,
    "flash_attn3": false,
    "fora": true,
    "gptq": false,
    "half": false,
    "hqq": false,
    "hqq_diffusers": false,
    "hyper": false,
    "ifw": false,
    "img2img_denoise": false,
    "ipex_llm": false,
    "llm_int8": false,
    "moe_kernel_tuner": false,
    "pab": false,
    "padding_pruning": false,
    "qkv_diffusers": false,
    "quanto": false,
    "realesrgan_upscale": false,
    "reduce_noe": false,
    "ring_attn": false,
    "sage_attn": false,
    "stable_fast": false,
    "text_to_image_distillation_inplace_perp": false,
    "text_to_image_distillation_lora": false,
    "text_to_image_distillation_perp": false,
    "text_to_image_inplace_perp": false,
    "text_to_image_lora": false,
    "text_to_image_perp": false,
    "text_to_text_inplace_perp": false,
    "text_to_text_lora": false,
    "text_to_text_perp": false,
    "torch_compile": true,
    "torch_dynamic": false,
    "torch_structured": false,
    "torch_unstructured": false,
    "torchao": true,
    "whisper_s2t": false,
    "x_fast": false,
    "zipar": false,
    "fora_backbone_calls_per_step": 2,
    "fora_interval": 3,
    "fora_start_step": 4,
    "torch_compile_backend": "inductor",
    "torch_compile_dynamic": null,
    "torch_compile_fullgraph": false,
    "torch_compile_make_portable": false,
    "torch_compile_max_kv_cache_size": 400,
    "torch_compile_mode": "default",
    "torch_compile_seqlen_manual_cuda_graph": 100,
    "torch_compile_target": "model",
    "torchao_excluded_modules": "none",
    "torchao_quant_type": "fp8wo",
    "torchao_target_modules": {
        "include": [
            "*single_transformer_blocks.*"
        ],
        "exclude": [
            "pe_embedder",
            "*norm*",
            "*embed*"
        ]
    },
    "batch_size": 1,
    "device": "cuda",
    "device_map": null,
    "save_fns": [
        "save_before_apply",
        "save_before_apply"
    ],
    "save_artifacts_fns": [],
    "load_fns": [
        "diffusers"
    ],
    "load_artifacts_fns": [],
    "reapply_after_load": {
        "torchao": true,
        "fora": true,
        "torch_compile": true
    }
}
```
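
Because most entries in the configuration are `false`, it can help to filter it down to what is actually switched on. The snippet below is a minimal sketch in plain Python and does not use the pruna API. The helper names `enabled_algorithms` and `settings_for` are hypothetical, and grouping settings by an `<algorithm>_` key prefix is an assumption inferred from the key names above, not a documented convention. The embedded JSON is a short excerpt of the configuration; in practice you would read the full `smash_config.json` from the downloaded repository.

```python
import json

# Excerpt of the configuration shown above, embedded so the sketch is
# self-contained. Replace with the contents of smash_config.json in practice.
CONFIG_EXCERPT = """
{
    "deepcache": false,
    "fora": true,
    "torch_compile": true,
    "torchao": true,
    "fora_interval": 3,
    "fora_start_step": 4,
    "torchao_quant_type": "fp8wo",
    "batch_size": 1
}
"""


def enabled_algorithms(config: dict) -> list[str]:
    """Return the top-level keys whose value is exactly True, sorted by name."""
    return sorted(key for key, value in config.items() if value is True)


def settings_for(config: dict, algorithm: str) -> dict:
    """Collect keys that start with '<algorithm>_' (assumed naming convention)."""
    prefix = algorithm + "_"
    return {key: value for key, value in config.items() if key.startswith(prefix)}


config = json.loads(CONFIG_EXCERPT)
for name in enabled_algorithms(config):
    # Print each enabled method next to the settings that share its prefix.
    print(name, settings_for(config, name))
```

The `value is True` check is deliberate: it skips integer settings such as `batch_size` and nested objects such as `reapply_after_load`, so only boolean switches are reported.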

## 🌍 Join the Pruna AI community!

[![Twitter](https://img.shields.io/twitter/follow/PrunaAI?style=social)](https://twitter.com/PrunaAI)
[![GitHub](https://img.shields.io/github/followers/PrunaAI?label=Follow%20%40PrunaAI&style=social)](https://github.com/PrunaAI)
[![LinkedIn](https://img.shields.io/badge/LinkedIn-Connect-blue)](https://www.linkedin.com/company/93832878/admin/feed/posts/?feedType=following)
[![Discord](https://img.shields.io/badge/Discord-Join%20Us-blue?style=social&logo=discord)](https://discord.gg/JFQmtFKCjd)
[![Reddit](https://img.shields.io/reddit/subreddit-subscribers/PrunaAI?style=social)](https://www.reddit.com/r/PrunaAI/)