---
license: gemma
language: en
pipeline_tag: text-generation
datasets:
- ranggafermata/fermata_data
base_model:
- google/gemma-2-2b-it
---

# Fermata – Fine-tuned Gemma AI Assistant

**Fermata** is a fine-tuned version of Google's [`gemma-2-2b-it`](https://huggingface.co/google/gemma-2-2b-it), trained to act as a personalized AI assistant that responds with character, helpfulness, and consistency. It is designed to follow instructions, hold conversations, and adapt to specific behavioral traits or personas.

---

## Model Details

- **Base Model**: [`google/gemma-2-2b-it`](https://huggingface.co/google/gemma-2-2b-it)
- **Fine-tuned by**: [@ranggafermata](https://huggingface.co/ranggafermata)
- **Framework**: 🤗 Transformers + PEFT + LoRA (Unsloth)
- **Precision**: 4-bit quantized (NF4) during training, merged back to full F32 weights
- **Model Size**: ~2.61B parameters

---

## Training Details

- **LoRA Configuration**:
  - `r`: 16
  - `alpha`: 16
  - `dropout`: 0.05
  - Target modules: attention & MLP projection layers
- **Epochs**: 12
- **Dataset**: Custom instruction-response pairs built to teach Fermata its identity and assistant behavior
- **Tooling**: [Unsloth](https://github.com/unslothai/unsloth), 🤗 PEFT, `trl`'s `SFTTrainer`

---

## Files Included

- ✅ `model-00001-of-00003.safetensors` to `model-00003-of-00003.safetensors`
- ✅ `config.json`, `tokenizer.model`, `tokenizer.json`
- ✅ `generation_config.json`, `chat_template.jinja`
- ❌ Adapter weights are not included (they were merged into the base model)

---

## Example Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("ranggafermata/Fermata", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("ranggafermata/Fermata")

prompt = "### Human:\nWho are you?\n\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
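For reference, the LoRA settings listed under Training Details map onto a 🤗 PEFT `LoraConfig` roughly as sketched below. The specific projection-layer names (`q_proj`, `k_proj`, etc.) are an assumption based on Gemma's architecture; the card only says "attention & MLP projection layers".

```python
from peft import LoraConfig

# Sketch of the training-time adapter config; module names are assumed,
# not confirmed by the model card.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",       # MLP projections
    ],
    task_type="CAUSAL_LM",
)
```

Note that this config is only relevant if you want to reproduce or continue fine-tuning: the published weights already have the adapter merged in.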
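For multi-turn use, the prompt string can be assembled with a small helper. The `### Human:` / `### Assistant:` format below is taken from the usage example above and is assumed (not confirmed) to match the training format; `build_prompt` is a hypothetical helper, not part of this repository.

```python
def build_prompt(turns):
    """Format (human, assistant) pairs into the '### Human:' /
    '### Assistant:' style shown in the usage example.

    Pass None as the assistant reply for the final turn so the prompt
    ends with an open '### Assistant:' cue for generation.
    """
    parts = []
    for human, assistant in turns:
        parts.append(f"### Human:\n{human}\n\n### Assistant:")
        if assistant is not None:
            parts.append(f"\n{assistant}\n\n")
    return "".join(parts)


prompt = build_prompt([("Who are you?", None)])
# Identical to the single-turn prompt in the example above.
print(prompt)
```

Since the repository also ships a `chat_template.jinja`, `tokenizer.apply_chat_template` may be the more robust option; the helper above is just a transparent fallback.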