---
library_name: transformers
tags:
- pruna-ai
- safetensors
---

# Model Card for pruna-test/test-save-tiny-random-llama4-smashed
|
|
This model was created using the [pruna](https://github.com/PrunaAI/pruna) library. Pruna is a model optimization framework built for developers, enabling you to deliver more efficient models with minimal implementation overhead.
|
|
## Usage
|
|
First things first, you need to install the pruna library:
|
|
```bash
pip install pruna
```
|
|
You can [use the transformers library to load the model](https://huggingface.co/pruna-test/test-save-tiny-random-llama4-smashed?library=transformers), but this might not apply all optimizations by default.
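
For reference, the plain transformers path might look like the sketch below. It assumes the checkpoint loads as a standard causal language model via `AutoModelForCausalLM` and ships with a tokenizer; the exact Auto class may differ depending on how your transformers version maps this architecture.

```python
# Minimal sketch of loading the checkpoint with plain transformers.
# Assumption: the checkpoint resolves to a causal language model; adjust the
# Auto class if your transformers version maps this architecture differently.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pruna-test/test-save-tiny-random-llama4-smashed"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```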
|
|
To ensure that all optimizations are applied, load the model with the pruna library using the following code:
|
|
```python
from pruna import PrunaModel

loaded_model = PrunaModel.from_pretrained(
    "pruna-test/test-save-tiny-random-llama4-smashed"
)
# We can then run inference using the methods supported by the base model.
```
|
|
|
|
For inference, you can use the inference methods of the original model, as shown in [the original model card](https://huggingface.co/hf-internal-testing/tiny-random-llama4?library=transformers).
Alternatively, you can visit [the Pruna documentation](https://docs.pruna.ai/en/stable/) for more information.
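
As a rough example, a text-generation call on the Pruna-loaded model could look like the following sketch. It assumes the base model exposes the usual causal-LM `generate` API (which `PrunaModel` forwards to the wrapped model) and that a compatible tokenizer is included with the checkpoint; adapt it to whatever inference methods the original model actually supports.

```python
# Illustrative text-generation call; the generate/tokenizer usage is an
# assumption for a causal-LM checkpoint, not a guarantee of this model card.
from transformers import AutoTokenizer
from pruna import PrunaModel

model_id = "pruna-test/test-save-tiny-random-llama4-smashed"
tokenizer = AutoTokenizer.from_pretrained(model_id)
loaded_model = PrunaModel.from_pretrained(model_id)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = loaded_model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```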
|
|
## Smash Configuration
|
|
The compression configuration of the model is stored in the `smash_config.json` file, which describes the optimization methods that were applied to the model.
|
|
```json
{
    "awq": false,
    "c_generate": false,
    "c_translate": false,
    "c_whisper": false,
    "deepcache": false,
    "diffusers_int8": false,
    "fastercache": false,
    "flash_attn3": false,
    "fora": false,
    "gptq": false,
    "half": false,
    "hqq": false,
    "hqq_diffusers": false,
    "ifw": false,
    "llm_int8": false,
    "pab": false,
    "qkv_diffusers": false,
    "quanto": false,
    "stable_fast": false,
    "torch_compile": false,
    "torch_dynamic": false,
    "torch_structured": false,
    "torch_unstructured": false,
    "torchao": false,
    "whisper_s2t": false,
    "batch_size": 1,
    "device": "cpu",
    "device_map": null,
    "save_fns": [],
    "load_fns": [
        "transformers"
    ],
    "reapply_after_load": {}
}
```
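
If you want to inspect this configuration programmatically, one option is to fetch the file from the Hub. The snippet below is a minimal sketch using `hf_hub_download` and standard JSON parsing; only the `smash_config.json` filename comes from this card, the rest is generic handling.

```python
# Download and inspect the stored smash configuration from the Hub.
import json
from huggingface_hub import hf_hub_download

config_path = hf_hub_download(
    repo_id="pruna-test/test-save-tiny-random-llama4-smashed",
    filename="smash_config.json",
)
with open(config_path) as f:
    smash_config = json.load(f)

# List the optimization methods that are enabled (boolean flags set to true).
enabled = [name for name, value in smash_config.items() if value is True]
print(enabled)
```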
|
|
## 🌍 Join the Pruna AI community!
|
|
[Twitter](https://twitter.com/PrunaAI)
[GitHub](https://github.com/PrunaAI)
[LinkedIn](https://www.linkedin.com/company/93832878/admin/feed/posts/?feedType=following)
[Discord](https://discord.gg/JFQmtFKCjd)
[Reddit](https://www.reddit.com/r/PrunaAI/)