---
license: apache-2.0
base_model: google/gemma-3-1b-it
tags:
- gemma
- gemma3
- instruction-tuned
- fine-tuned
- safety
- gguf
- axion
---

# AdvRahul/Axion-Lite-1B-Q5_K_M-GGUF

**Axion-Lite-1B** is a safety-enhanced, quantized version of Google's powerful `gemma-3-1b-it` model. This model has been specifically fine-tuned to improve its safety alignment, making it more robust and reliable for a wide range of applications.

The model is provided in the GGUF format, which allows it to run efficiently on CPUs and other hardware with limited resources.

## 🚀 Model Details

* **Model Creator:** AdvRahul
* **Base Model:** [google/gemma-3-1b-it](https://huggingface.co/google/gemma-3-1b-it)
* **Fine-tuning Focus:** Enhanced Safety & Harmlessness through red-teaming.
* **Quantization:** `Q5_K_M` via GGUF. This quantization offers an excellent balance between model size, inference speed, and performance preservation.
* **Architecture:** Gemma 3
* **License:** Gemma 3 Terms of Use.

---

## 💻 How to Use

This model is in GGUF format and is designed to be used with frameworks like `llama.cpp` and its Python bindings.

### Using `llama-cpp-python`

First, install the necessary library. Ensure you have a version that supports Gemma 3 models.

```bash
pip install llama-cpp-python
```

Then, you can use the following Python script to run the model:

```python
from llama_cpp import Llama

# Download the model from the Hugging Face Hub before running this
# Or let llama-cpp-python download it for you
llm = Llama.from_pretrained(
    repo_id="AdvRahul/Axion-Lite-1B-Q5_K_M-GGUF",
    filename="Axion-Lite-1B-Q5_K_M.gguf",
    verbose=False
)

prompt = "What are the key principles of responsible AI development?"

# The Gemma 3 instruction-tuned model uses a specific chat template.
# For simple prompts, you can start with <start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model
chat_prompt = f"<start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model"

output = llm(chat_prompt, max_tokens=256, stop=["<end_of_turn>"], echo=False)

print(output['choices'][0]['text'])
```
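For multi-turn conversations, the same turn markers can be chained across messages. The helper below is an illustrative sketch (the function name and message format are assumptions, not part of `llama-cpp-python`) showing how Gemma 3's template composes, including its use of the role name `model` instead of `assistant`:

```python
def build_gemma_prompt(messages):
    """Format a list of {"role", "content"} dicts into a Gemma 3 chat prompt.

    Gemma 3 wraps each turn in <start_of_turn>/<end_of_turn> markers and
    labels the assistant's turns "model" rather than "assistant". The final
    "<start_of_turn>model\n" invites the model to generate its reply.
    """
    parts = []
    for msg in messages:
        role = "model" if msg["role"] == "assistant" else msg["role"]
        parts.append(f"<start_of_turn>{role}\n{msg['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = build_gemma_prompt([
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help?"},
    {"role": "user", "content": "Summarize GGUF in one line."},
])
```

Alternatively, `llama-cpp-python` exposes `llm.create_chat_completion(messages=...)`, which applies the chat template stored in the GGUF metadata so you do not have to build the prompt string by hand.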

### Using `llama.cpp` (CLI)

You can also run this model directly from the command line after cloning and building the `llama.cpp` repository.

```bash
# Clone and build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Run inference (recent llama.cpp builds name this binary `llama-cli`)
./main -m /path/to/your/models/Axion-Lite-1B-Q5_K_M.gguf -p "<start_of_turn>user\nWhat is the capital of India?<end_of_turn>\n<start_of_turn>model" -n 128
```

-----

## 📝 Model Description

### Fine-Tuning for Safety

**Axion-Lite-1B** originates from `google/gemma-3-1b-it`. The primary goal of this project was to enhance the model's safety alignment. The base model underwent **extensive red-team testing with advanced protocols** to significantly reduce the likelihood of generating harmful, unethical, biased, or unsafe content. This makes Axion-Lite-1B a more suitable choice for applications that require a higher degree of content safety and reliability.

### Quantization

The model is quantized to `Q5_K_M`, a method that provides a high-quality balance between perplexity (model accuracy) and file size. This makes it ideal for deployment in resource-constrained environments, such as on local machines, edge devices, or cost-effective cloud instances, without a significant drop in performance.
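As a rough sanity check on the size/precision trade-off, a GGUF file's size can be estimated from the parameter count and the effective bits per weight (about 5.5 for Q5_K_M is an approximation used here, not an exact figure; actual files also carry metadata and mixed-precision tensors):

```python
def estimate_gguf_size_gb(n_params, bits_per_weight=5.5):
    """Ballpark GGUF file size: parameters * bits / 8, in gigabytes.

    Ignores metadata and the fact that some tensors (e.g. embeddings)
    may be stored at different precisions, so treat as an estimate only.
    """
    return n_params * bits_per_weight / 8 / 1e9


# A 1B-parameter model at ~5.5 bits/weight comes out to roughly 0.69 GB,
# versus ~2 GB at full 16-bit precision.
print(f"{estimate_gguf_size_gb(1e9):.2f} GB")
```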

-----

## ℹ️ Base Model Information (Gemma 3)

<details>
<summary>Click to expand details on the base model</summary>

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. Gemma 3 models handle text input and generate text output, with open weights for both pre-trained variants and instruction-tuned variants. The `1B` model was trained on 2 trillion tokens of data.

### Training Data

The base model was trained on a dataset of text data that includes a wide variety of sources:

  * **Web Documents:** A diverse collection of web text ensures the model is exposed to a broad range of linguistic styles, topics, and vocabulary in over 140 languages.
  * **Code:** Exposing the model to code helps it learn the syntax and patterns of programming languages.
  * **Mathematics:** Training on mathematical text helps the model learn logical reasoning and symbolic representation.

### Data Preprocessing

The training data for the base model underwent rigorous cleaning and filtering, including:

  * **CSAM Filtering:** Exclusion of Child Sexual Abuse Material.
  * **Sensitive Data Filtering:** Automated techniques were used to filter out certain personal information and other sensitive data.
  * **Content Quality Filtering:** Filtering based on content quality and safety in line with Google's policies.

</details>

-----

## ⚠️ Ethical Considerations and Limitations

While this model has been fine-tuned to enhance its safety, no language model is perfectly safe. It inherits the limitations of its base model, `gemma-3-1b-it`, and the data it was trained on.

  * **Potential for Bias:** The model may still generate content that reflects societal biases present in the training data.
  * **Factual Inaccuracy:** The model can "hallucinate" or generate incorrect or outdated information. It should not be used as a sole source of truth.
  * **Not a Substitute for Human Judgment:** The outputs should be reviewed and validated, especially in sensitive or high-stakes applications.

Developers implementing this model should build additional safety mitigations and content moderation tools as part of a **defense-in-depth** strategy, tailored to their specific use case.
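One small layer of such a strategy is screening model outputs before they reach users. The sketch below is a deliberately simplistic blocklist filter (the patterns and function name are illustrative assumptions, not a shipped moderation tool); a production deployment should rely on a dedicated moderation model or service, with this kind of filter at most as a last-resort backstop:

```python
import re

# Illustrative blocklist only; a real deployment would use a moderation
# model or API rather than hand-written patterns.
BLOCKED_PATTERNS = [
    re.compile(r"\bcredit card number\b", re.IGNORECASE),
    re.compile(r"\bsocial security number\b", re.IGNORECASE),
]


def moderate(text: str) -> str:
    """Return the text unchanged if it passes, else a refusal placeholder."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(text):
            return "[Response withheld by content filter]"
    return text


print(moderate("The capital of India is New Delhi."))
```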

## Citing the Base Model

If you use this model, please consider citing the original Gemma 3 work:

```bibtex
@article{gemma_2025,
    title={Gemma 3},
    url={https://goo.gle/Gemma3Report},
    publisher={Kaggle},
    author={Gemma Team},
    year={2025}
}
```
