---
license: llama3.2
---

# Llama 3.2 GGUF (Q4_K_M Quantized)

This repository hosts GGUF-format quantized versions of Llama 3.2 models at multiple parameter sizes.

These files are intended for use with SciTools Understand and SciTools Onboard, as well as other tools and runtimes that support the GGUF format (for example, llama.cpp-based applications).
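
GGUF is a simple binary container: per the llama.cpp GGUF specification, every file begins with the 4-byte ASCII magic `GGUF` followed by a little-endian `uint32` format version. A minimal sketch of a header sanity check (the file names below are placeholders, not files from this repository):

```python
import struct

def is_gguf(path: str) -> bool:
    """Check the 4-byte magic and read the format version of a GGUF file."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != b"GGUF":
        return False
    version = struct.unpack("<I", header[4:8])[0]  # little-endian uint32
    return version >= 1

# Demo with a synthetic header; a real check would point at a downloaded model.
with open("demo.gguf", "wb") as f:
    f.write(b"GGUF" + struct.pack("<I", 3))  # magic + version 3
print(is_gguf("demo.gguf"))  # True
```

This kind of check is a quick way to confirm a download completed correctly before handing the file to a runtime.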

---

## Model Details

- Base models: Llama 3.2 (various parameter sizes)
- Format: GGUF
- Intended use: Local inference, code understanding, general-purpose chat
- Languages: Multilingual (as supported by Llama 3.2)

### Available Variants

This repository includes multiple Llama 3.2 parameter sizes, each quantized independently. Refer to the file names for exact parameter counts.
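
As a rough guide to which variant fits your hardware: a Q4_K_M file averages on the order of 4.5–5 bits per weight (the exact figure varies with the tensor mix, and real files also carry metadata and a few higher-precision tensors). A back-of-the-envelope estimate, using an assumed 4.85 bits per weight:

```python
def q4_k_m_size_gib(params_billions: float, bits_per_weight: float = 4.85) -> float:
    """Rough file-size estimate for a Q4_K_M quantized model.

    bits_per_weight is an assumption (~4.85 is commonly cited for Q4_K_M
    in llama.cpp); treat the result as an order-of-magnitude guide only.
    """
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / (1024 ** 3)  # bits -> bytes -> GiB

for size in (1.0, 3.0):  # e.g. the 1B and 3B Llama 3.2 text models
    print(f"{size:.0f}B ~= {q4_k_m_size_gib(size):.2f} GiB")
```

Actual file sizes on disk are the authoritative reference; this sketch only helps decide which variant to try first.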

---

## Quantization Process

- Quantization was performed by **Unsloth** and **TensorBlock**.
- No further modifications, rebalancing, or fine-tuning were applied.
- The quantization parameters and defaults were not altered from the original sources.

The goal is to provide faithful, reproducible GGUF variants that behave as closely as possible to their upstream counterparts.

---

## What We Did Not Do

To be explicit:

- No additional fine-tuning
- No instruction rebalancing
- No safety, alignment, or prompt modifications
- No merging or model surgery

If a model behaves a certain way, that behavior comes from Llama 3.2 combined with quantization, not from any downstream changes here.

---

## Intended Use

These models are suitable for:

- SciTools Understand and SciTools Onboard
- Local AI workflows
- Code comprehension and exploration
- Interactive chat and analysis
- Integration into developer tools that support GGUF

They are not intended for:

- Safety-critical or regulated decision-making
- Use cases requiring guaranteed factual accuracy
- Production deployment without independent evaluation

---

## Limitations

- Output quality varies by parameter size and task.
- Like all large language models, Llama 3.2 may produce hallucinations or incorrect information.

Evaluate carefully for your specific workload.

---
|
| | ## License & Attribution |
| |
|
| | - Original models: Meta (Llama 3.2) |
| | - Quantization: Unsloth and TensorBlock |
| | - Format: GGUF (llama.cpp ecosystem) |
| |
|
| | Please refer to the original Llama 3.2 license and usage terms. This repository redistributes quantized artifacts only and does not change the underlying licensing conditions. |
| |
|
| | --- |
| |
|
| | ## Acknowledgements |
| |
|
| | Thanks to Meta for releasing the Llama 3.2 models, and to Unsloth and TensorBlock for providing high-quality, reproducible quantization that enables efficient local inference across a wide range of tools. |
| |
|