---
language:
- en
license: apache-2.0
pipeline_tag: image-to-image
tags:
- pruna-ai
- safetensors
---
# Model Card for PrunaAI/flux2-klein-4b-optimized-smashed

This model was created using the [pruna](https://github.com/PrunaAI/pruna) library. Pruna is a model optimization framework built for developers, enabling you to deliver more efficient models with minimal implementation overhead.

## Usage

First, install the pruna library:

```bash
pip install pruna
```

You can [use the library_name library to load the model](https://huggingface.co/PrunaAI/flux2-klein-4b-optimized-smashed?library=library_name), but this might not apply all optimizations by default.

To ensure that all optimizations are applied, load the model with the pruna library:

```python
from pruna import PrunaModel

loaded_model = PrunaModel.from_pretrained(
    "PrunaAI/flux2-klein-4b-optimized-smashed"
)
# We can then run inference using the methods supported by the base model.
```

For more information, see [the Pruna documentation](https://docs.pruna.ai/en/stable/).
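
The loading snippet above stops before inference. As a rough sketch, an image-to-image call could look like the following. It assumes that the loaded model can be called like a Diffusers image-editing pipeline (with `image` and `prompt` arguments and an `.images` list on the result), that `diffusers` is installed alongside pruna, and that a CUDA GPU is available. `edit_image` is a hypothetical helper name, not part of pruna.

```python
def edit_image(repo_id, image_url, prompt):
    """Load a smashed pipeline and run a single image-to-image call."""
    # The heavy dependencies are imported lazily, so they are only required
    # when inference is actually run.
    from diffusers.utils import load_image
    from pruna import PrunaModel

    pipe = PrunaModel.from_pretrained(repo_id)
    input_image = load_image(image_url)

    # Assumption: the wrapped pipeline accepts `image` and `prompt` and
    # returns an object exposing an `.images` list.
    return pipe(image=input_image, prompt=prompt).images[0]
```

For example, `edit_image("PrunaAI/flux2-klein-4b-optimized-smashed", "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png", "Turn this cat into a dog")` should return a PIL image if these assumptions hold. Since `torch_compile` is enabled for this model, the first call may be noticeably slower than later ones while compilation runs.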
## Smash Configuration

The compression configuration of the model is stored in the `smash_config.json` file, which describes the optimization methods that were applied to the model.

```json
{
    "awq": false,
    "c_generate": false,
    "c_translate": false,
    "c_whisper": false,
    "deepcache": false,
    "diffusers_int8": false,
    "fastercache": false,
    "flash_attn3": false,
    "fora": true,
    "gptq": false,
    "half": false,
    "hqq": false,
    "hqq_diffusers": false,
    "hyper": false,
    "ifw": false,
    "img2img_denoise": false,
    "ipex_llm": false,
    "llm_int8": false,
    "moe_kernel_tuner": false,
    "pab": false,
    "padding_pruning": false,
    "qkv_diffusers": false,
    "quanto": false,
    "realesrgan_upscale": false,
    "reduce_noe": false,
    "ring_attn": false,
    "sage_attn": false,
    "stable_fast": false,
    "text_to_image_distillation_inplace_perp": false,
    "text_to_image_distillation_lora": false,
    "text_to_image_distillation_perp": false,
    "text_to_image_inplace_perp": false,
    "text_to_image_lora": false,
    "text_to_image_perp": false,
    "text_to_text_inplace_perp": false,
    "text_to_text_lora": false,
    "text_to_text_perp": false,
    "torch_compile": true,
    "torch_dynamic": false,
    "torch_structured": false,
    "torch_unstructured": false,
    "torchao": true,
    "whisper_s2t": false,
    "x_fast": false,
    "zipar": false,
    "fora_backbone_calls_per_step": 2,
    "fora_interval": 3,
    "fora_start_step": 4,
    "torch_compile_backend": "inductor",
    "torch_compile_dynamic": null,
    "torch_compile_fullgraph": false,
    "torch_compile_make_portable": false,
    "torch_compile_max_kv_cache_size": 400,
    "torch_compile_mode": "default",
    "torch_compile_seqlen_manual_cuda_graph": 100,
    "torch_compile_target": "model",
    "torchao_excluded_modules": "none",
    "torchao_quant_type": "fp8wo",
    "torchao_target_modules": {
        "include": [
            "*single_transformer_blocks.*"
        ],
        "exclude": [
            "pe_embedder",
            "*norm*",
            "*embed*"
        ]
    },
    "batch_size": 1,
    "device": "cuda",
    "device_map": null,
    "save_fns": [
        "save_before_apply",
        "save_before_apply"
    ],
    "save_artifacts_fns": [],
    "load_fns": [
        "diffusers"
    ],
    "load_artifacts_fns": [],
    "reapply_after_load": {
        "torchao": true,
        "fora": true,
        "torch_compile": true
    }
}
```
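
Most entries in this file are switches that are off, so the enabled methods are easy to miss. The sketch below groups each enabled algorithm with its hyperparameters. It assumes that every hyperparameter key starts with the name of its algorithm followed by an underscore; this is a convention inferred from the key names above, not a documented guarantee. The excerpt inlines only a few keys from the configuration, and `enabled_algorithms` is a hypothetical helper, not part of pruna.

```python
import json

# Excerpt of the smash_config.json shown above (most disabled flags omitted).
SMASH_CONFIG_EXCERPT = """
{
    "deepcache": false,
    "fora": true,
    "torch_compile": true,
    "torchao": true,
    "fora_backbone_calls_per_step": 2,
    "fora_interval": 3,
    "fora_start_step": 4,
    "torch_compile_backend": "inductor",
    "torch_compile_mode": "default",
    "torchao_quant_type": "fp8wo",
    "batch_size": 1,
    "device": "cuda"
}
"""


def enabled_algorithms(config):
    """Map each enabled algorithm to the hyperparameters sharing its prefix."""
    # Treat every key whose value is exactly `true` as an algorithm switch.
    # Limitation: a boolean hyperparameter set to `true` would be misread
    # as an algorithm; none occurs in the configuration shown above.
    names = [key for key, value in config.items() if value is True]

    summary = {}
    for name in names:
        prefix = name + "_"
        summary[name] = {
            key[len(prefix):]: value
            for key, value in config.items()
            if key.startswith(prefix)
        }
    return summary


config = json.loads(SMASH_CONFIG_EXCERPT)
for algorithm, params in enabled_algorithms(config).items():
    print(algorithm, params)
```

Run on the excerpt, this lists `fora`, `torch_compile`, and `torchao`, which are the same three methods named under `reapply_after_load`.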

## 🌍 Join the Pruna AI community!

[Twitter](https://twitter.com/PrunaAI)
[GitHub](https://github.com/PrunaAI)
[LinkedIn](https://www.linkedin.com/company/93832878/admin/feed/posts/?feedType=following)
[Discord](https://discord.gg/JFQmtFKCjd)
[Reddit](https://www.reddit.com/r/PrunaAI/)