Aelora-Qwen3-4B — Fine-Tuned on a Fully Synthetic World

A LoRA fine-tune of Qwen3-4B on a constructed knowledge domain that does not exist in any pre-training corpus: the planet Aelora, with its language (Velari), its base-8 mathematics (Thex-Kron), and its custom logic system (Vel-Rith).

Why a synthetic world? Benchmarks leak into pre-training. Because Aelora was invented for this project and cannot appear in any pre-training corpus, a correct answer is evidence of what the model learned during fine-tuning rather than retrieval of prior knowledge.

🧪 Full code, datasets, RAG comparison & writeup: github.com/Aslam-13/Fine_tuning_RAG
📓 Training notebook (Kaggle): kaggle.com/code/syed13/finetunerag
🔁 Earlier checkpoint (Level 2 only): Aslam-13/velari-level2-qwen3-4b
🔁 Datasets: huggingface.co/datasets/Aslam-13/Fine-tune-RAG

What this model knows

Domain	Coverage
Velari — vocabulary	30-word lexicon (nouns, verbs, pronouns, adjectives)
Velari — grammar	Plurals (`-an`), past (`ta-`), negation (`ne`), possession (`-os`), comparatives (`vor-`), superlatives (`krath-`), imperatives, SVO sentences
Thex-Kron math	Base-8 numerals (`nul, ek, doi, tri, kal, fen, sai, hep, ek-nul …`), addition with carry, multiplication, word problems
Vel-Rith logic	Element generation & destruction rules (e.g. `krel + pael → fia`, `krel destroys zorak`), single-rule and chained reasoning
World/lore	Aeloran society, regions, currency (Zolts), governance (Fen-Renan voting)

It will not know anything about Aelora outside the training set, and should be treated as a research/demo artifact — not a general-purpose assistant.
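
To make the Thex-Kron row of the table concrete, here is a minimal Python sketch of base-8 numerals and addition with carry. The digit words come from the table. The rule that multi-digit numerals are digit words joined by hyphens is an assumption inferred from `ek-nul` (octal 10, decimal 8), and the function names are hypothetical; the training data may use other conventions.

```python
# Digit words from the coverage table; the list index is the digit's value.
DIGITS = ["nul", "ek", "doi", "tri", "kal", "fen", "sai", "hep"]


def to_thex_kron(n: int) -> str:
    """Render a non-negative integer as hyphen-joined base-8 digit words."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if n == 0:
        return DIGITS[0]
    words = []
    while n > 0:
        n, digit = divmod(n, 8)
        words.append(DIGITS[digit])
    return "-".join(reversed(words))


def from_thex_kron(numeral: str) -> int:
    """Parse hyphen-joined digit words back into an integer."""
    value = 0
    for word in numeral.split("-"):
        value = value * 8 + DIGITS.index(word)
    return value


def add_with_carry(a: str, b: str) -> str:
    """Add two numerals digit by digit, carrying whenever a column reaches 8."""
    xs = [DIGITS.index(w) for w in reversed(a.split("-"))]
    ys = [DIGITS.index(w) for w in reversed(b.split("-"))]
    out, carry = [], 0
    for i in range(max(len(xs), len(ys))):
        x = xs[i] if i < len(xs) else 0
        y = ys[i] if i < len(ys) else 0
        carry, digit = divmod(x + y + carry, 8)
        out.append(DIGITS[digit])
    if carry:
        out.append(DIGITS[carry])
    return "-".join(reversed(out))


print(to_thex_kron(8))                  # ek-nul
print(add_with_carry("hep", "ek"))      # ek-nul  (7 + 1 carries into the next column)
print(add_with_carry("ek-hep", "ek"))   # doi-nul (octal 17 + 1 = octal 20)
```

A checker like this can be used to verify the model's arithmetic answers independently of the model itself.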

Training details


Setting	Value
Base model	`unsloth/qwen3-4b-unsloth-bnb-4bit`
Method	LoRA (4-bit quant) via Unsloth + TRL `SFTTrainer`
Trainable params	66,060,288 of 4,088,528,384 (1.62%)
Dataset size	376 examples (JSONL, instruction/input/output)
Epochs	12
Effective batch size	4 (batch 2 × grad_accum 2)
Total steps	1,128
Final training loss	0.3397
Training time	226 min on a single Kaggle T4
License	Apache 2.0
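
The figures in the table can be cross-checked with a few lines of arithmetic. The sketch below recomputes the step count and the trainable-parameter percentage from the table's own numbers. The last part is more speculative: it assumes the publicly documented Qwen3-4B layer shapes and LoRA adapters on all seven attention and MLP projections, neither of which is stated in this card, and asks which LoRA rank would produce the reported trainable-parameter count.

```python
import math

# Figures copied from the training-details table.
num_examples = 376
per_device_batch = 2
grad_accum = 2
epochs = 12
trainable_params = 66_060_288
total_params = 4_088_528_384

effective_batch = per_device_batch * grad_accum
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs
trainable_pct = round(100 * trainable_params / total_params, 2)

# ASSUMPTION (not from this card): Qwen3-4B dimensions and the set of
# LoRA target modules. Each entry is (in_features, out_features).
hidden, layers = 2560, 36
q_out, kv_out, mlp = 4096, 1024, 9728
shapes = {
    "q_proj": (hidden, q_out),
    "k_proj": (hidden, kv_out),
    "v_proj": (hidden, kv_out),
    "o_proj": (q_out, hidden),
    "gate_proj": (hidden, mlp),
    "up_proj": (hidden, mlp),
    "down_proj": (mlp, hidden),
}
# A rank-r adapter on a linear layer adds r * (in_features + out_features) weights.
params_per_rank = layers * sum(i + o for i, o in shapes.values())
implied_rank = trainable_params / params_per_rank

print(effective_batch, steps_per_epoch, total_steps)  # 4 94 1128
print(trainable_pct)                                  # 1.62
print(implied_rank)                                   # 32.0
```

Under those assumptions the reported count is consistent with a rank of 32 on all seven projections; the card itself does not state the rank, so treat this as an inference rather than a documented setting.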

Evaluation

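The model was evaluated on a 148-question held-out test suite covering all domains, and compared head-to-head against several RAG configurations on a 50-question capstone subset. Scores are keyword-overlap scores between 0.0 and 1.0 (see Limitations):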

System	Avg score	Perfect (1.0)	Zero (0.0)
L2 Semantic-chunk RAG	0.280	12	34
L2 Proposition-chunk RAG	0.278	11	33
Gemini 2.5 Flash + RAG	0.506	20	20
This model (FT-only)	0.649	30	15
This model + RAG	0.620	27	15

Finding: on this domain the fine-tuned model alone (0.649) outperforms every RAG-augmented variant, including itself with RAG (0.620) and Gemini 2.5 Flash + RAG (0.506). RAG helped on lore/logic gaps but hurt on math/grammar, where the model had already memorized the rules. Full breakdown in the GitHub repo.
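
For readers unfamiliar with keyword-overlap scoring, the sketch below shows one simple way such a metric and the table's three summary columns could be computed. It is an illustration only: the function names, the substring-matching rule, and the sample answers are hypothetical, and the actual scorer lives in the GitHub repo.

```python
from statistics import mean


def keyword_overlap(answer: str, expected_keywords: list[str]) -> float:
    """Fraction of expected keywords found in the answer (case-insensitive substring match)."""
    if not expected_keywords:
        raise ValueError("need at least one expected keyword")
    text = answer.lower()
    hits = sum(1 for kw in expected_keywords if kw.lower() in text)
    return hits / len(expected_keywords)


def summarize(scores: list[float]) -> dict:
    """Aggregate per-question scores into average, perfect count, and zero count."""
    return {
        "avg": round(mean(scores), 3),
        "perfect": sum(s == 1.0 for s in scores),
        "zero": sum(s == 0.0 for s in scores),
    }


# Made-up answers for illustration; not taken from the test suite.
keywords = ["not", "see", "fire"]
scores = [
    keyword_overlap("I did not see fire.", keywords),  # all three keywords present
    keyword_overlap("I saw water.", keywords),         # none present
    keyword_overlap("I see fire.", keywords),          # two of three present
]
print(summarize(scores))  # {'avg': 0.556, 'perfect': 1, 'zero': 1}
```

Substring matching of this kind rewards answers that mention the right words in the wrong order and penalizes correct paraphrases, which is one reason the metric is a rough signal.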

Quick start

Option A — with Unsloth (fastest, requires GPU)

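from unsloth import FastLanguageModel

# Load the fine-tuned checkpoint in 4-bit so it fits on a small GPU.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Aslam-13/aelora-qwen3-4b",
    max_seq_length = 2048,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode

prompt = "Translate to English: 'vel ne ta-vex zorak'"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=128)
# out[0] contains the prompt followed by the completion; decode only the new tokens.
new_tokens = out[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))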

Option B — with plain `transformers` (no Unsloth required)

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Aslam-13/aelora-qwen3-4b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" places the weights on the available GPU(s), or on CPU if none is found.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "Translate to English: 'vel ne ta-vex zorak'"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
# out[0] contains the prompt followed by the completion; decode only the new tokens.
new_tokens = out[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))

Expected output (approximate): "I did not see fire." — vel (I) + ne (not) + ta- (past tense prefix) + vex (see) + zorak (fire). Exact phrasing may vary.
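
The morpheme breakdown above can be mirrored in a few lines of Python. This is a toy glosser, not part of the model or the repo: the lexicon holds only the five morphemes glossed in the expected-output line, and the `gloss` helper and its `PAST+` label are hypothetical names chosen for illustration.

```python
# Lexicon limited to the morphemes glossed in the expected-output line.
LEXICON = {"vel": "I", "ne": "not", "vex": "see", "zorak": "fire"}
PAST_PREFIX = "ta-"


def gloss(sentence: str) -> list[tuple[str, str]]:
    """Pair each Velari token with an English gloss, marking the past-tense prefix."""
    pairs = []
    for token in sentence.split():
        if token.startswith(PAST_PREFIX):
            stem = token[len(PAST_PREFIX):]
            pairs.append((token, "PAST+" + LEXICON.get(stem, "?")))
        else:
            pairs.append((token, LEXICON.get(token, "?")))
    return pairs


print(gloss("vel ne ta-vex zorak"))
# [('vel', 'I'), ('ne', 'not'), ('ta-vex', 'PAST+see'), ('zorak', 'fire')]
```

Unknown tokens are glossed as `?`, so the same helper can flag words that fall outside this tiny lexicon.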

Limitations

Trained on a single small synthetic domain — does not generalize to real-world tasks.
Multi-step base-8 word problems and chained logic (>2 rules) are the weakest areas.
Keyword-overlap eval is crude; no human/LLM-judge evaluation was run.
No hyperparameter sweep — LoRA rank/alpha picked by convention.

Citation

@misc{aelora_qwen3_4b_2026,
  author = {Aslam-13},
  title  = {Aelora-Qwen3-4B: Fine-tuning Qwen3-4B on a fully synthetic constructed-knowledge domain},
  year   = {2026},
  url    = {https://huggingface.co/Aslam-13/aelora-qwen3-4b}
}

Trained with Unsloth for 2× faster LoRA training.
