# AURORA — rule-governor (LoRA on Gemma 2B)
AURORA is a text-generation policy (Gemma-2B + LoRA) trained to choose ACCEPT or REJECT in response to a text description of a simulated “civilization” state (active rule, resources, stability, belief mismatch, etc.). It is a governance policy for a research simulator, not a general chat model.
- Hugging Face model: lokiii005/Aurora
- Code & environment: inland-taipen/Aurora
## Model details
| Field | Value |
|---|---|
| Developed by | SuperSymmetry |
| Model type | Causal LM + PEFT/LoRA adapter |
| Base model | unsloth/gemma-2-2b-it (Unsloth + 4-bit training stack) |
| Language | English (state prompts) |
| License | Comply with Gemma / Google terms; adapter subject to this repo’s terms (edit if you add a LICENSE) |
| Finetuned from | unsloth/gemma-2-2b-it via Unsloth + TRL SFT |
## Links
- Repository: github.com/inland-taipen/Aurora
- Live demo (Gradio): Edit: add Space URL when published
## Training
- Data: Synthetic rollouts from the in-repo grid environment; random ACCEPT/REJECT exploration, formatted as (state, action, reward) tuples for SFT.
- Method: Supervised fine-tuning (SFT) with TRL + Unsloth on GPU (e.g. a Google Colab T4).
- Episodes (example run): ~30 (increase to 50+ to retrain with more data).
- See `training/train_aurora.py` for LoRA rank, `max_seq_length`, and training steps.
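As a sketch of how those (state, action, reward) rollouts could be flattened into SFT examples — the field names and prompt template below are illustrative assumptions, not the exact ones in `training/train_aurora.py`:

```python
# Illustrative only: the real prompt template lives in training/train_aurora.py.
def format_example(state: dict, action: str, reward: float) -> dict:
    """Turn one (state, action, reward) step into an SFT prompt/completion pair."""
    prompt = (
        "AURORA state:\n"
        f"- active rule: {state['rule']}\n"
        f"- resources: {state['resources']}\n"
        f"- stability: {state['stability']}\n"
        f"- belief mismatch: {state['belief_mismatch']}\n"
        "Decision (ACCEPT or REJECT):"
    )
    # Keep the reward alongside the pair so low-reward steps can be filtered out
    # before SFT, leaving the model to imitate the better decisions.
    return {"prompt": prompt, "completion": f" {action}", "reward": reward}

dataset = [
    format_example(
        {"rule": "tax_increase", "resources": 82, "stability": 0.6, "belief_mismatch": 0.3},
        "REJECT",
        1.0,
    )
]
```

Filtering by reward before training is one simple way to turn random-exploration rollouts into a supervised signal.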
## Intended use
- In scope: Given a state prompt in the same AURORA format used in training, output ACCEPT or REJECT for a proposed rule change in the simulation.
- Out of scope: Real-world control, high-stakes decisions, unconstrained open chat. Research / demo only.
## Limitations and risks
- Trained on synthetic env rollouts, not a broad human dataset; not safety-certified.
- Gemma usage-policy limitations apply. Parse ACCEPT/REJECT in code; `generate` may append extra text beyond the decision token.
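Because `generate` can emit text beyond the decision, a minimal defensive parser helps — the regex and conservative fallback below are an assumption, not the repo's exact logic:

```python
import re

def parse_decision(generated_text: str, default: str = "REJECT") -> str:
    """Extract the first ACCEPT/REJECT token from raw model output.

    Falls back to a conservative default when neither token appears,
    so malformed generations never silently pass a rule change.
    """
    match = re.search(r"\b(ACCEPT|REJECT)\b", generated_text.upper())
    return match.group(1) if match else default
```

For example, `parse_decision("Decision: ACCEPT because stability is high")` yields `"ACCEPT"`, and an output with no decision token yields the default `"REJECT"`.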
## How to load
Use the `base_model_name_or_path` from `adapter_config.json` (Files tab) if it differs from the ID below. Typical pattern:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_id = "unsloth/gemma-2-2b-it"  # match base_model_name_or_path in adapter_config.json
adapter_id = "lokiii005/Aurora"

tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_id)
# Prompt format: see demo/app.py and training/train_aurora.py
```
## Evaluation
Reproducible scripts live in the GitHub repo (`scripts/`, `training/train_aurora.py`). There is no fixed leaderboard; report your own numbers if you re-run.
## Environmental impact (rough)
Estimate GPU hours × hardware with the ML CO2 impact calculator if needed for a report.
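A back-of-envelope version of that estimate, assuming round numbers for T4 power draw, datacenter PUE, and grid carbon intensity (use the calculator's values for a real report):

```python
def co2_estimate_kg(gpu_hours: float, gpu_watts: float = 70.0,
                    pue: float = 1.1, grid_kg_per_kwh: float = 0.4) -> float:
    """Rough training-emissions estimate: energy (kWh) times grid carbon intensity.

    Defaults are assumptions: ~70 W for a T4 under load, PUE 1.1,
    0.4 kg CO2e per kWh for the grid mix.
    """
    kwh = gpu_hours * (gpu_watts / 1000.0) * pue
    return kwh * grid_kg_per_kwh

# e.g. a ~2 GPU-hour Colab T4 run
print(round(co2_estimate_kg(2.0), 3))
```

At this scale the result is a small fraction of a kilogram of CO2e, which is why a one-line estimate usually suffices for a report.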
## Citation
Cite this model and the AURORA repository.
Contact: anika.24bcs104092sst.scaler.com
Framework note: PEFT; see adapter_config.json in this model repo for exact library versions.