AURORA — rule-governor (LoRA on Gemma 2B)

AURORA is a text-generation policy (Gemma-2B + LoRA) trained to choose ACCEPT or REJECT in response to a text description of a simulated “civilization” state (active rule, resources, stability, belief mismatch, etc.). It is a governance policy for a research simulator, not a general chat model.

Model details

  • Developed by: SuperSymmetry
  • Model type: Causal LM + PEFT/LoRA adapter
  • Base model: unsloth/gemma-2-2b-it (Unsloth + 4-bit training stack)
  • Language: English (state prompts)
  • License: the base model is subject to the Gemma / Google terms; the adapter is subject to this repo's terms
  • Finetuned from: unsloth/gemma-2-2b-it via Unsloth + TRL SFT

Training

  • Data: Synthetic rollouts from the in-repo grid environment; random ACCEPT/REJECT exploration, formatted as (state, action, reward) for SFT.
  • Method: Supervised fine-tuning (SFT) with TRL / Unsloth on a GPU (e.g., a Google Colab T4).
  • Episodes (example run): ~30 (increase to 50+ to retrain with more data).
  • See code: training/train_aurora.py for LoRA rank, max_seq_length, and training steps.
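
The (state, action, reward) rollouts described above are flattened into plain-text SFT examples. The sketch below shows the general idea; the state field names and prompt template here are illustrative assumptions, not the repo's actual format, which lives in training/train_aurora.py:

```python
# Illustrative sketch only: render one env rollout step as an SFT text
# example. Field names and the prompt template are assumptions; see
# training/train_aurora.py for the real format.

def format_sft_example(state: dict, action: str, reward: float) -> str:
    """Render one (state, action, reward) step as prompt + target action."""
    assert action in ("ACCEPT", "REJECT")
    prompt = (
        "Civilization state:\n"
        f"- active rule: {state['active_rule']}\n"
        f"- resources: {state['resources']}\n"
        f"- stability: {state['stability']}\n"
        f"- belief mismatch: {state['belief_mismatch']}\n"
        "Decision (ACCEPT or REJECT):"
    )
    # The SFT target is just the action token; the reward can be used to
    # filter rollouts (e.g., keep only high-reward decisions).
    return f"{prompt} {action}"

example = format_sft_example(
    {"active_rule": "tax_increase", "resources": 42,
     "stability": 0.7, "belief_mismatch": 0.1},
    action="ACCEPT",
    reward=1.0,
)
print(example.endswith("ACCEPT"))  # True
```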

Intended use

  • In scope: Given a state prompt in the same format used during training, output ACCEPT or REJECT for a proposed rule change in the simulation.
  • Out of scope: Real-world control, high-stakes decisions, unconstrained open chat. Research / demo only.

Limitations and risks

  • Trained on synthetic env rollouts, not a broad human dataset; not safety-certified.
  • Gemma usage-policy limitations apply. Parse the ACCEPT/REJECT decision in code, since generate may add extra text around it.
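
Because generation can wrap the decision in extra text, a small parser is useful. A minimal sketch (the exact output format depends on your prompt; this simply takes the first ACCEPT/REJECT token found and falls back to a conservative REJECT):

```python
import re

def parse_decision(text: str, default: str = "REJECT") -> str:
    """Extract the first ACCEPT/REJECT token from model output.

    Falls back to `default` (a conservative REJECT) when the model
    emits neither token.
    """
    match = re.search(r"\b(ACCEPT|REJECT)\b", text.upper())
    return match.group(1) if match else default

print(parse_decision("Decision: ACCEPT. Stability is high."))  # ACCEPT
print(parse_decision("I would reject this rule."))             # REJECT
print(parse_decision("unsure"))                                # REJECT
```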

How to load

If the adapter was trained from a different base, use the base_model_name_or_path recorded in adapter_config.json (Files tab) instead. Typical loading pattern:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_id = "unsloth/gemma-2-2b-it"  # match adapter_config
adapter_id = "lokiii005/Aurora"

tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_id)
# Prompt format: see demo/app.py and training/train_aurora.py
```

Evaluation

Reproducible scripts live in the GitHub repo (scripts/, training/train_aurora.py). There is no fixed leaderboard; report your own numbers if you re-run.
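
If you do re-run, one simple metric is agreement between the policy's decision and the reward-maximizing action on held-out rollouts. A minimal sketch (the rollout schema and field names here are assumptions, not the repo's actual format):

```python
# Hypothetical evaluation sketch: fraction of held-out rollouts where the
# policy's ACCEPT/REJECT decision matches the higher-reward action.
# The rollout schema below is an assumption, not the repo's actual format.

def accuracy(rollouts, decide):
    """rollouts: list of {'state': dict, 'best_action': 'ACCEPT'|'REJECT'};
    decide: callable state -> 'ACCEPT' or 'REJECT' (e.g., wraps the model)."""
    hits = sum(decide(r["state"]) == r["best_action"] for r in rollouts)
    return hits / len(rollouts)

# Trivial usage with a constant policy:
rollouts = [
    {"state": {"stability": 0.9}, "best_action": "ACCEPT"},
    {"state": {"stability": 0.2}, "best_action": "REJECT"},
]
print(accuracy(rollouts, lambda s: "ACCEPT"))  # 0.5
```

In practice, decide would build the state prompt, call model.generate, and parse the decision from the output text.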

Environmental impact (rough)

If needed for a report, estimate emissions from GPU hours × hardware type with the ML CO2 Impact calculator.

Citation

Cite this model and the AURORA repository.

Contact: anika.24bcs104092sst.scaler.com

Framework note: PEFT; see adapter_config.json in this model repo for exact library versions.
