---
license: gemma
language: en
pipeline_tag: text-generation
datasets:
- ranggafermata/fermata_data
base_model:
- google/gemma-2-2b-it
---

# Fermata: Fine-tuned Gemma AI Assistant

**Fermata** is a fine-tuned version of Google's [`gemma-2-2b-it`](https://huggingface.co/google/gemma-2-2b-it), trained to act as a personalized AI assistant that responds with character, helpfulness, and consistency. It is designed to follow instructions, engage in conversation, and adapt to specific behavioral traits or personas.

---

## Model Details

- **Base Model**: [`google/gemma-2-2b-it`](https://huggingface.co/google/gemma-2-2b-it)
- **Fine-tuned by**: [@ranggafermata](https://huggingface.co/ranggafermata)
- **Framework**: 🤗 Transformers + PEFT + LoRA (via Unsloth)
- **Precision**: 4-bit NF4 quantization during training, merged to full-precision FP32 weights
- **Model Size**: ~2.61B parameters

---

## Training Details

- **LoRA Configuration**:
  - `r`: 16
  - `alpha`: 16
  - `dropout`: 0.05
  - Target modules: attention & MLP projection layers
- **Epochs**: 12
- **Dataset**: Custom instruction-response pairs built to teach Fermata its identity and assistant behavior
- **Tooling**: [Unsloth](https://github.com/unslothai/unsloth), 🤗 PEFT, `trl`'s `SFTTrainer`
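
The LoRA setup above can be sketched as a PEFT `LoraConfig`. This is a reconstruction, not the exact training script: the specific `target_modules` names are an assumption based on the attention and MLP projection layers commonly targeted in Gemma-style architectures.

```python
from peft import LoraConfig

# Sketch of the LoRA configuration described above.
# target_modules names are assumed; verify against the actual training script.
lora_config = LoraConfig(
    r=16,                 # LoRA rank
    lora_alpha=16,        # scaling factor
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",      # attention projections
        "gate_proj", "up_proj", "down_proj",          # MLP projections
    ],
    task_type="CAUSAL_LM",
)
```

After training, the adapter weights were merged back into the base model (e.g. via `merge_and_unload()`), which is why no separate adapter files ship with this repository.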

---

## Files Included

- ✅ `model-00001-of-00003.safetensors` through `model-00003-of-00003.safetensors`
- ✅ `config.json`, `tokenizer.model`, `tokenizer.json`
- ✅ `generation_config.json`, `chat_template.jinja`
- ❌ Adapter weights removed (merged into the base model)

---

## Example Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the merged full-precision weights and tokenizer
model = AutoModelForCausalLM.from_pretrained("ranggafermata/Fermata", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("ranggafermata/Fermata")

# Prompt in the instruction format used during fine-tuning
prompt = "### Human:\nWho are you?\n\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
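
Because `chat_template.jinja` ships with the model, prompts can also be built with the tokenizer's chat template instead of the raw `### Human:` format. This is a sketch; the exact roles and formatting it produces depend on the shipped template file.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("ranggafermata/Fermata", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("ranggafermata/Fermata")

# Let the bundled chat template handle the prompt formatting
messages = [{"role": "user", "content": "Who are you?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn marker
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```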