---
license: apache-2.0
base_model: google/gemma-3-1b-it
tags:
- gemma
- gemma3
- instruction-tuned
- fine-tuned
- safety
- gguf
- axion
---
# AdvRahul/Axion-Lite-1B-Q5_K_M-GGUF
**Axion-Lite-1B** is a safety-enhanced, quantized version of Google's powerful `gemma-3-1b-it` model. This model has been specifically fine-tuned to improve its safety alignment, making it more robust and reliable for a wide range of applications.
The model is provided in the GGUF format, which allows it to run efficiently on CPUs and other hardware with limited resources.
## 🚀 Model Details
* **Model Creator:** AdvRahul
* **Base Model:** [google/gemma-3-1b-it](https://huggingface.co/google/gemma-3-1b-it)
* **Fine-tuning Focus:** Enhanced Safety & Harmlessness through red-teaming.
* **Quantization:** `Q5_K_M` via GGUF. This quantization offers an excellent balance between model size, inference speed, and performance preservation.
* **Architecture:** Gemma 3
* **License:** Gemma 3 Terms of Use.
---
## 💻 How to Use
This model is in GGUF format and is designed to be used with frameworks like `llama.cpp` and its Python bindings.
### Using `llama-cpp-python`
First, install the necessary library. Ensure you have a version that supports Gemma 3 models.
```bash
pip install llama-cpp-python
```
Then, you can use the following Python script to run the model:
```python
from llama_cpp import Llama
# Download the model from the Hugging Face Hub before running this
# Or let llama-cpp-python download it for you
llm = Llama.from_pretrained(
    repo_id="AdvRahul/Axion-Lite-1B-Q5_K_M-GGUF",
    filename="Axion-Lite-1B-Q5_K_M.gguf",
    verbose=False,
)
prompt = "What are the key principles of responsible AI development?"
# The Gemma 3 instruction-tuned model uses a specific chat template.
# For simple prompts, you can start with <start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model
chat_prompt = f"<start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model"
output = llm(chat_prompt, max_tokens=256, stop=["<end_of_turn>"], echo=False)
print(output['choices'][0]['text'])
```
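For multi-turn conversations, the same template simply repeats per turn: each message is wrapped in `<start_of_turn>…<end_of_turn>`, with the assistant's role named `model`, and the prompt ends with an open model turn. A small helper along these lines (a hypothetical sketch, not part of `llama-cpp-python`) can build the prompt string from a message list:

```python
def build_gemma_prompt(messages):
    """Build a Gemma 3 chat prompt from a list of {"role", "content"} dicts.

    Gemma uses the roles "user" and "model"; each turn is wrapped in
    <start_of_turn>...<end_of_turn>, and the prompt ends with an open
    model turn so generation continues from there.
    """
    parts = []
    for msg in messages:
        role = "model" if msg["role"] == "assistant" else "user"
        parts.append(f"<start_of_turn>{role}\n{msg['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")
    return "".join(parts)

prompt = build_gemma_prompt([
    {"role": "user", "content": "What is the capital of India?"},
    {"role": "assistant", "content": "New Delhi."},
    {"role": "user", "content": "And its population?"},
])
```

The resulting string can be passed to `llm(...)` exactly like the single-turn prompt above, with the same `stop=["<end_of_turn>"]` setting.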
### Using `llama.cpp` (CLI)
You can also run this model directly from the command line after cloning and building the `llama.cpp` repository.
```bash
# Clone and build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
# Run inference
./main -m /path/to/your/models/Axion-Lite-1B-Q5_K_M.gguf -p "<start_of_turn>user\nWhat is the capital of India?<end_of_turn>\n<start_of_turn>model" -n 128
```
-----
## 📝 Model Description
### Fine-Tuning for Safety
**Axion-Lite-1B** originates from `google/gemma-3-1b-it`. The primary goal of this project was to enhance the model's safety alignment. The base model underwent **extensive red-team testing with advanced protocols** to significantly reduce the likelihood of generating harmful, unethical, biased, or unsafe content. This makes Axion-Lite-1B a more suitable choice for applications that require a higher degree of content safety and reliability.
### Quantization
The model is quantized to `Q5_K_M`, a method that provides a high-quality balance between perplexity (model accuracy) and file size. This makes it ideal for deployment in resource-constrained environments, such as on local machines, edge devices, or cost-effective cloud instances, without a significant drop in performance.
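As a rough sanity check on what "resource-constrained" means here, the on-disk size of a quantized model can be ballparked from parameter count and average bits per weight. The figure of roughly 5.7 bits per weight for `Q5_K_M` is an approximation (the exact average depends on the llama.cpp version and which layers stay unquantized):

```python
def estimate_gguf_size_gb(n_params, bits_per_weight):
    """Rough on-disk size estimate: parameters * bits, converted to gigabytes.

    Ignores GGUF metadata and the small overhead of layers kept at higher
    precision, so treat the result as a ballpark figure only.
    """
    return n_params * bits_per_weight / 8 / 1e9

# ~1B parameters at roughly 5.7 bits per weight (approximate Q5_K_M average)
size_gb = estimate_gguf_size_gb(1_000_000_000, 5.7)  # ~0.71 GB
```

A figure well under 1 GB is what makes the model practical on laptops and edge devices, compared with roughly 2 GB for the same model in 16-bit weights.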
-----
## ℹ️ Base Model Information (Gemma 3)
<details>
<summary>Click to expand details on the base model</summary>
Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models handle text input and generate text output, with open weights for both pre-trained variants and instruction-tuned variants. The `1B` model was trained on 2 trillion tokens of data.
### Training Data
The base model was trained on a dataset of text data that includes a wide variety of sources:
* **Web Documents:** A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary in over 140 languages.
* **Code:** Exposing the model to code helps it learn the syntax and patterns of programming languages.
* **Mathematics:** Training on mathematical text helps the model learn logical reasoning and symbolic representation.
### Data Preprocessing
The training data for the base model underwent rigorous cleaning and filtering, including:
* **CSAM Filtering:** Exclusion of Child Sexual Abuse Material.
* **Sensitive Data Filtering:** Automated techniques were used to filter out certain personal information and other sensitive data.
* **Content Quality Filtering:** Filtering based on content quality and safety in line with Google's policies.
</details>
-----
## ⚠️ Ethical Considerations and Limitations
While this model has been fine-tuned to enhance its safety, no language model is perfectly safe. It inherits the limitations of its base model, `gemma-3-1b-it`, and the data it was trained on.
* **Potential for Bias:** The model may still generate content that reflects societal biases present in the training data.
* **Factual Inaccuracy:** The model can "hallucinate" or generate incorrect or outdated information. It should not be used as a sole source of truth.
* **Not a Substitute for Human Judgment:** The outputs should be reviewed and validated, especially in sensitive or high-stakes applications.
Developers implementing this model should build additional safety mitigations and content moderation tools as part of a **defense-in-depth** strategy, tailored to their specific use case.
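As an entirely hypothetical sketch of one such layer, a post-generation gate can sit between the model and the user. A real deployment would use a dedicated moderation model or service rather than a keyword list; this only illustrates where the check fits:

```python
# Placeholder list for illustration only; production systems should use a
# proper moderation classifier or service, not keyword matching.
BLOCKED_TERMS = {"example_blocked_term"}

def moderate(text):
    """Return (allowed, text): a trivial keyword gate as one defense-in-depth layer.

    One layer among several: model-side safety tuning, input/output
    classifiers, and human review for high-stakes applications.
    """
    lowered = text.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return False, "[response withheld by content filter]"
    return True, text

ok, out = moderate("What are the key principles of responsible AI development?")
```

The same gate can be applied to user inputs before they reach the model, so that unsafe requests and unsafe completions are both caught.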
## Citing the Base Model
If you use this model, please consider citing the original Gemma 3 work:
```bibtex
@article{gemma_2025,
  title={Gemma 3},
  url={https://goo.gle/Gemma3Report},
  publisher={Kaggle},
  author={Gemma Team},
  year={2025}
}
```