---
license: gemma
language: en
pipeline_tag: text-generation
datasets:
- ranggafermata/fermata_data
base_model:
- google/gemma-2-2b-it
---

# Fermata – Fine-tuned Gemma AI Assistant

**Fermata** is a fine-tuned version of Google's [`gemma-2-2b-it`](https://huggingface.co/google/gemma-2-2b-it), trained to act as a personalized AI assistant that responds with character, helpfulness, and consistency. It is designed to follow instructions, hold conversations, and adapt to specific behavioral traits or personas.

---

## Model Details

- **Base Model**: [`google/gemma-2-2b-it`](https://huggingface.co/google/gemma-2-2b-it)
- **Fine-tuned by**: [@ranggafermata](https://huggingface.co/ranggafermata)
- **Framework**: 🤗 Transformers + PEFT + LoRA (Unsloth)
- **Precision**: 4-bit quantized (NF4) during training, merged back to full F32 weights
- **Model Size**: ~2.61B parameters

---

## Training Details

- **LoRA Configuration**:
  - `r`: 16
  - `alpha`: 16
  - `dropout`: 0.05
  - Target modules: attention & MLP projection layers
- **Epochs**: 12
- **Dataset**: Custom instruction-response pairs built to teach Fermata its identity and assistant behavior
- **Tooling**: [Unsloth](https://github.com/unslothai/unsloth), 🤗 PEFT, `trl`'s `SFTTrainer`

---

## Files Included

- ✅ `model-00001-of-00003.safetensors` to `model-00003-of-00003.safetensors`
- ✅ `config.json`, `tokenizer.model`, `tokenizer.json`
- ✅ `generation_config.json`, `chat_template.jinja`
- ❌ Adapter weights are not included (they were merged into the base model)

---

## Example Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("ranggafermata/Fermata", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("ranggafermata/Fermata")

prompt = "### Human:\nWho are you?\n\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
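For reference, the LoRA settings listed under Training Details map onto a 🤗 PEFT `LoraConfig` roughly as sketched below. The specific projection-layer names (`q_proj`, `k_proj`, etc.) are an assumption based on Gemma's architecture; the card only says "attention & MLP projection layers".

```python
from peft import LoraConfig

# Sketch of the training-time adapter config; module names are assumed,
# not confirmed by the model card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",       # MLP projections
    ],
    task_type="CAUSAL_LM",
)
```

Note that this config is only relevant if you want to reproduce or continue fine-tuning: the published weights already have the adapter merged in.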
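For multi-turn use, the prompt string can be assembled with a small helper. The `### Human:` / `### Assistant:` format below is taken from the usage example above and is assumed (not confirmed) to match the training format; `build_prompt` is a hypothetical helper, not part of this repository.

```python
def build_prompt(turns):
    """Format (human, assistant) pairs into the '### Human:' /
    '### Assistant:' style shown in the usage example.

    Pass None as the assistant reply for the final turn so the prompt
    ends with an open '### Assistant:' cue for generation.
    """
    parts = []
    for human, assistant in turns:
        parts.append(f"### Human:\n{human}\n\n### Assistant:")
        if assistant is not None:
            parts.append(f"\n{assistant}\n\n")
    return "".join(parts)


prompt = build_prompt([("Who are you?", None)])
# Identical to the single-turn prompt in the example above.
print(prompt)
```

Since the repository also ships a `chat_template.jinja`, `tokenizer.apply_chat_template` may be the more robust option; the helper above is just a transparent fallback.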