---
license: mit
datasets:
- shvn22k/brainrot-dataset
language:
- en
new_version: unsloth/gemma-3-270m-it
tags:
- artificial
- nlp
- gemma
- agent
- gen-z
---
|
|
|
|
|
# Brainrot Gemma |
|
|
|
|
|
Brainrot Gemma is a fine-tuned variant of **Gemma 3 270M**, optimized to generate chaotic internet slang, meme-speak, and hyper-casual dialogue patterns. The project explores stylistic fine-tuning on small language models and demonstrates how lightweight LoRA training can produce strong personality-driven behavior even with limited computational resources.
|
|
|
|
|
## Overview |
|
|
|
|
|
This model is trained using **Unsloth** with LoRA adapters on top of the Gemma 3 270M base model. |
|
|
The dataset consists of paired `source` and `target` examples representing conversational brainrot style. |
|
|
All training, formatting, and merging steps follow the standard SFT (Supervised Fine-Tuning) pipeline, as sketched below.
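A minimal sketch of this setup with Unsloth. The base model name, sequence length, and LoRA rank come from the Training Details section below; the target modules and remaining values are illustrative assumptions, not the exact configuration used:

```python
from unsloth import FastLanguageModel

# Load the 4-bit quantized Gemma 3 270M base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-270m-unsloth-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters (rank 16, per Training Details below);
# alpha, dropout, and target modules are assumptions
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing="unsloth",
)
```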
|
|
|
|
|
The final model can be exported in HuggingFace format or converted into GGUF for use with local inference frameworks such as **Ollama** or **llama.cpp**. |
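A sketch of that export step using Unsloth's built-in helpers; the output paths and the `q8_0` quantization method are assumptions rather than the exact settings used:

```python
# Merge the LoRA adapters into the base weights (HuggingFace format)
model.save_pretrained_merged("brainrot-gemma-merged", tokenizer,
                             save_method="merged_16bit")

# Convert to GGUF for llama.cpp / Ollama
model.save_pretrained_gguf("brainrot-gemma-gguf", tokenizer,
                           quantization_method="q8_0")
```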
|
|
|
|
|
## Features |
|
|
|
|
|
* Fine-tuned on a custom brainrot conversation dataset
|
|
* Built on top of **Gemma 3 270M**, a compact and efficient model |
|
|
* LoRA-based training for fast experimentation |
|
|
* Supports HuggingFace Transformers inference |
|
|
* Can be merged and exported to **GGUF** for local deployment |
|
|
* Retains the structure and safety features of the base model while adapting tone and style |
|
|
|
|
|
## Training Details |
|
|
|
|
|
* Framework: Unsloth + Transformers |
|
|
* Base model: `unsloth/gemma-3-270m-unsloth-bnb-4bit` |
|
|
* Sequence length: 2048 |
|
|
* Optimization: LoRA (Rank 16) |
|
|
* Final training loss: ~4.0 |
|
|
* Hardware: Colab T4 GPU (training), local CPU/GPU for export |
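A sketch of the SFT step with TRL's `SFTTrainer`, continuing from the Unsloth setup above and assuming the dataset has been rendered into a `text` column (see the Dataset section); every hyperparameter other than the model and tokenizer is an illustrative assumption:

```python
from trl import SFTTrainer, SFTConfig

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,  # the 3000-example subset described below
    args=SFTConfig(
        output_dir="outputs",
        dataset_text_field="text",
        per_device_train_batch_size=2,   # assumption
        gradient_accumulation_steps=4,   # assumption
        learning_rate=2e-4,              # assumption
        num_train_epochs=1,              # assumption
        logging_steps=10,
    ),
)
trainer.train()
```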
|
|
|
|
|
### Dataset |
|
|
|
|
|
The dataset includes: |
|
|
|
|
|
* `train` |
|
|
* `validation` |
|
|
* `test` |
|
|
|
|
|
The final training set merges and subsamples these splits into a 3000-example subset formatted into ChatML-style conversations. |
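A sketch of that merge-and-subsample step with the `datasets` library; the shuffle seed is an assumption:

```python
from datasets import load_dataset, concatenate_datasets

ds = load_dataset("shvn22k/brainrot-dataset")

# Merge all three splits, then subsample 3000 examples
merged = concatenate_datasets([ds["train"], ds["validation"], ds["test"]])
subset = merged.shuffle(seed=42).select(range(3000))  # seed is an assumption
```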
|
|
|
|
|
Example data structure: |
|
|
|
|
|
```json
{
  "conversations": [
    {"role": "user", "content": "..."},
    {"role": "assistant", "content": "..."}
  ]
}
```
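One way to map the raw `source`/`target` pairs into this structure and render them with the tokenizer's chat template, continuing from the loading sketch above (field names follow the dataset description):

```python
def to_conversation(example):
    # Pair each source prompt with its brainrot-style target reply
    return {"conversations": [
        {"role": "user", "content": example["source"]},
        {"role": "assistant", "content": example["target"]},
    ]}

formatted = subset.map(to_conversation)

# Render one example as a ChatML-style training string
text = tokenizer.apply_chat_template(formatted[0]["conversations"], tokenize=False)
```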
|
|
|
|
|
## Usage (HuggingFace Format) |
|
|
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# "brainrot-gemma" is the local path or Hub id of the merged model
tokenizer = AutoTokenizer.from_pretrained("brainrot-gemma")
model = AutoModelForCausalLM.from_pretrained("brainrot-gemma")

prompt = "explain quantum mechanics in brainrot style"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
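Since the model is trained on ChatML-style conversations, routing prompts through the chat template usually matches the training format more closely than raw text; a sketch:

```python
messages = [{"role": "user", "content": "explain quantum mechanics in brainrot style"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True,
                                          return_tensors="pt")

outputs = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```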
|
|
|
|
|
## Usage (Ollama / GGUF) |
|
|
|
|
|
After exporting the merged model to GGUF, create a `Modelfile` that points at the weights:
|
|
|
|
|
``` |
|
|
FROM ./brainrot-gemma.gguf |
|
|
``` |
|
|
|
|
|
Build: |
|
|
|
|
|
``` |
|
|
ollama create brainrot-gemma -f Modelfile |
|
|
``` |
|
|
|
|
|
Run: |
|
|
|
|
|
``` |
|
|
ollama run brainrot-gemma |
|
|
``` |
|
|
|
|
|
## Repository Structure |
|
|
|
|
|
```
brainrot-gemma/
│
├── adapter_config.json
├── adapter_model.safetensors
├── tokenizer.json
├── tokenizer.model
├── tokenizer_config.json
├── special_tokens_map.json
└── chat_template.jinja
```
|
|
|
|
|
(Merged or GGUF versions may contain different files.) |
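Since this listing is a LoRA adapter checkpoint rather than merged weights, one way to use it directly is to attach the adapter to the base model with PEFT; `"brainrot-gemma"` below stands in for the actual adapter path or repo id:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model, then apply the LoRA adapter on top
base = AutoModelForCausalLM.from_pretrained("unsloth/gemma-3-270m-unsloth-bnb-4bit")
model = PeftModel.from_pretrained(base, "brainrot-gemma")
tokenizer = AutoTokenizer.from_pretrained("brainrot-gemma")
```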
|
|
|
|
|
## Intended Use |
|
|
|
|
|
Brainrot Gemma is designed for: |
|
|
|
|
|
* stylistic experimentation |
|
|
* meme-style text generation |
|
|
* informal dialogue agents |
|
|
* research into fine-tuning behavior on small LLMs
|
|
|
|
|
It is **not** intended for tasks requiring factual accuracy, safety-critical applications, or formal communication. |
|
|
|
|
|
## License |
|
|
|
|
|
Model usage follows the licensing terms of: |
|
|
|
|
|
* Google's Gemma 3
|
|
* Unsloth |
|
|
* The dataset author |
|
|
* Any additional dependencies used during training |
|
|
|
|
|
Check the included license files for details. |