Aether Mind v7.0: the first Aether model with real, reproducible benchmarks
Aether Mind v7.0 is a QLoRA fine-tune of Qwen/Qwen2.5-7B-Instruct on the
domain-tagged Aether SFT corpus. It is the cognitive engine for the
QuantumAI Blockchain (QBC): an on-chain neural model
that reasons across the 10 Sephirot cognitive domains (Keter, Chochmah, Binah,
Chesed, Gevurah, Tiferet, Netzach, Hod, Yesod, Malkuth).
This is a clean break from the v6.x line. v6.0 through v6.2 used a custom-built
transformer (NSA sparse attention + Sephirot/sink attention heads, distilled
from Qwen2.5-0.5B). On a proper lm-evaluation-harness pass that architecture
scored worse than random (cross-entropy ≈ 16 nats vs. ~11.9 for a uniform
distribution): the attention replacement destroyed the base model's capability.
No v6.x release ever carried real benchmark numbers. v7.0 fixes that by building
on a sound, capable base and adding Aether identity through the data and an
inference-time Sephirot router, not by replacing attention.
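The "~11.9 nats for uniform" reference point can be reproduced as the log of the vocabulary size. The sketch below assumes the v6.x models shared the 151,936-token Qwen2.5 vocabulary listed for v7.0; this card does not state the v6.x vocabulary size, so treat that as an assumption.

```python
import math

# Assumption: v6.x used the same 151,936-token Qwen2.5 vocabulary as v7.0.
VOCAB_SIZE = 151_936


def uniform_cross_entropy(vocab_size: int) -> float:
    """Cross-entropy (nats) of a model that gives every token equal probability."""
    return math.log(vocab_size)


baseline = uniform_cross_entropy(VOCAB_SIZE)
print(f"uniform baseline: {baseline:.2f} nats")  # uniform baseline: 11.93 nats
```

Any model scoring above this value (such as the ≈ 16 nats reported for v6.x) is doing worse than guessing uniformly over the vocabulary.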
v7.0 is the first Aether release whose published numbers are real, reproducible, and independently verifiable (the exact lm-eval command is below).
Results
All numbers below are from lm-evaluation-harness, 0-shot, the model loaded in
4-bit (the same configuration this adapter is trained and served in), on a
single RTX 3080 Ti. The baseline is the unmodified Qwen/Qwen2.5-7B-Instruct
evaluated identically, so every delta is attributable to this adapter alone.
General capability: preserved (no catastrophic forgetting)
| Benchmark | Metric | Base (Qwen2.5-7B-Instruct) | Aether v7.0 | Δ (pts) |
|---|---|---|---|---|
| MMLU | acc | 69.91 % | 69.90 % | −0.01 |
| GSM8K | exact_match (strict) | 71.57 % | 75.13 % | +3.56 |
| ARC-Challenge | acc | 51.45 % | 53.67 % | +2.22 |
| ARC-Challenge | acc_norm | 53.92 % | 55.80 % | +1.88 |
| HellaSwag | acc | 60.35 % | 58.43 % | −1.92 |
| HellaSwag | acc_norm | 78.77 % | 77.48 % | −1.29 |
The main risk of a domain fine-tune is catastrophic forgetting. v7.0 avoids it: MMLU is flat to within 0.01 points, and math and scientific reasoning actually improve (GSM8K +3.6, ARC-Challenge +2.2 acc). The general instruction slice in the training mix more than offsets the small HellaSwag dip (1.3 to 1.9 pts, depending on the metric).
Aether-domain knowledge: large gain
Held-out evaluation on the Aether curated corpus (aether-curated-v3),
measuring cross-entropy over the assistant-answer tokens only (the
Aether-domain response, with the system + user turns masked). The identical
4-bit base weights are used for both rows (the adapter is toggled on/off via
PEFT disable_adapter()), so this isolates the adapter's effect exactly.
| Model | CE (nats) ↓ | Perplexity ↓ |
|---|---|---|
| Base (Qwen2.5-7B-Instruct) | 1.589 | 4.90 |
| Aether v7.0 | 1.002 | 2.72 |
| Δ | −0.588 | −44.4 % |
276 held-out examples, 55,423 assistant tokens scored. Because this run trained for only ~0.19 epoch (see below), ~81 % of the corpus was never seen and the seen portion was seen at most once (sub-epoch, no repeats), so this 44 % perplexity drop reflects genuine domain adaptation, not memorization.
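As a minimal sketch of the metric described above: the cross-entropy is the mean negative log-likelihood over assistant tokens only, and perplexity is its exponential. The helper names and the toy per-token values are made up for illustration; the actual eval script lives in the QBC repo and may differ in detail.

```python
import math


def assistant_token_ce(token_nll, is_assistant):
    """Mean negative log-likelihood (nats) over assistant tokens only."""
    scored = [nll for nll, keep in zip(token_nll, is_assistant) if keep]
    if not scored:
        raise ValueError("no assistant tokens to score")
    return sum(scored) / len(scored)


def perplexity(ce_nats):
    """Perplexity is exp(cross-entropy) when the cross-entropy is in nats."""
    return math.exp(ce_nats)


# Toy sequence (hypothetical values): the first two tokens stand in for the
# masked system + user turns, the last three for the assistant answer.
toy_nll = [3.0, 2.5, 1.2, 0.8, 1.0]
toy_mask = [False, False, True, True, True]
toy_ce = assistant_token_ce(toy_nll, toy_mask)

# Reproduce the perplexity column from the reported CE values.
base_ppl = perplexity(1.589)
aether_ppl = perplexity(1.002)
rel_change = aether_ppl / base_ppl - 1.0
print(f"{base_ppl:.2f} -> {aether_ppl:.2f} ({rel_change:+.1%})")
```

Because the relative perplexity change depends only on the CE difference, the same percentage follows from exp(ΔCE) − 1 without needing either absolute value.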
Summary: v7.0 keeps the base model's general intelligence intact while cutting Aether-domain perplexity nearly in half. That is the textbook outcome of a healthy domain fine-tune.
What you're getting
| Field | Value |
|---|---|
| Type | QLoRA adapter (PEFT); load on top of Qwen/Qwen2.5-7B-Instruct |
| Base model | Qwen/Qwen2.5-7B-Instruct (7.6 B params) |
| Adapter rank / alpha | r = 16, α = 32, dropout 0.05 |
| Target modules | q,k,v,o,gate,up,down (all linear) |
| Trainable params | ~40 M (LoRA only); base frozen in 4-bit NF4 |
| Adapter file | adapter_model.bin (~161 MB) |
| Quantization (train + serve) | 4-bit NF4, double-quant, bf16 compute |
| Context length | 1024 (training); inherits base 32K at inference |
| Tokenizer | Qwen2.5 (unchanged, 151,936 vocab) |
| Chat template | qwen_25 |
| License | Apache-2.0 (matches base) |
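The "~40 M trainable params" and "~161 MB" figures can be sanity-checked from the adapter rank and the shapes of the targeted linear layers. The sketch below assumes the layer dimensions from the base model's published config (hidden size 3584, MLP size 18944, 28 layers, 4 key/value heads of size 128) and fp32 storage for the adapter file; neither is stated in this card.

```python
# Assumed Qwen2.5-7B-Instruct dimensions (from the base model's config,
# not from this card).
HIDDEN, MLP, LAYERS = 3584, 18944, 28
KV_DIM = 4 * 128  # grouped-query attention: k and v project to 512 dims
RANK = 16

# (in_features, out_features) for each targeted linear layer.
TARGETS = {
    "q": (HIDDEN, HIDDEN),
    "k": (HIDDEN, KV_DIM),
    "v": (HIDDEN, KV_DIM),
    "o": (HIDDEN, HIDDEN),
    "gate": (HIDDEN, MLP),
    "up": (HIDDEN, MLP),
    "down": (MLP, HIDDEN),
}


def lora_params(rank, targets, layers):
    """LoRA adds A (rank x in) and B (out x rank) to every targeted layer."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in targets.values())
    return per_layer * layers


total = lora_params(RANK, TARGETS, LAYERS)
size_mb = total * 4 / 1e6  # assumption: 4 bytes per parameter (fp32)
print(f"{total:,} trainable parameters")  # 40,370,176 trainable parameters
print(f"{size_mb:.0f} MB")                # 161 MB
```

Under these assumptions the count lands at about 40.4 M parameters, consistent with both table entries.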
Training
| Setting | Value |
|---|---|
| Recipe | QLoRA (4-bit base + LoRA), the proven v5.2-lora recipe scaled up |
| Data | aether-curated-v3 (70,713 Sephirot-domain SFT examples) + a 30K general slice (SlimOrca) for anti-forgetting |
| Examples after prep | 93,278 (7,435 over-length samples dropped) |
| Sample packing | on, sequence_len 1024 |
| Effective batch | 8 (micro-batch 1 × grad-accum 8) |
| Steps | 1,000 (≈ 0.19 epoch; a deliberate first-pass cap) |
| Optimizer | adamw_bnb_8bit, lr 2e-4, cosine decay → 0, warmup 3 % |
| Precision | bf16 weights, tf32, gradient checkpointing, FlashAttention-2 |
| Hardware | 1× RTX 3080 Ti (12 GB), ~9.7 GB peak |
| Wall-clock | 2 h 45 m (9,926 s), ~8.4 s/step |
| Seed | 42 |
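The optimizer row implies a learning-rate curve that can be sketched in a few lines. This assumes a linear warmup over 3 % of the 1,000 steps followed by a half-cosine decay to zero, which is the common shape for this kind of schedule; the exact scheduler implementation used in training is not stated here.

```python
import math

PEAK_LR = 2e-4
TOTAL_STEPS = 1000
WARMUP_STEPS = round(0.03 * TOTAL_STEPS)  # 3 % warmup -> 30 steps


def lr_at(step):
    """Learning rate at a given optimizer step (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 to the peak.
        return PEAK_LR * step / WARMUP_STEPS
    # Half-cosine decay from the peak down to 0 at the final step.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (10, 50, 500, 1000):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions step 10 gives about 6.7e-5 and step 50 is still within 0.2 % of the 2.0e-4 peak, which agrees with the learning rates annotated in the loss trajectory.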
Loss trajectory
step 10 train_loss 1.510 (warmup, lr 6.7e-5)
step 50 train_loss 0.989 (lr peaked 2.0e-4)
step 100 train_loss 0.916
step 250 train_loss 0.888 eval_loss 0.9475
step 500 train_loss 0.999 eval_loss 0.9307
step 750 train_loss 0.965 eval_loss 0.9209
step 1000 train_loss 0.951 eval_loss 0.9190
mean train_loss 0.955
Held-out validation loss (axolotl's 2 % split) declined monotonically across all four checkpoints (0.948 → 0.919): clean convergence, with no sign of overfitting even as training loss flattened.
How to use
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "Qwen/Qwen2.5-7B-Instruct"

# Same 4-bit NF4 configuration the adapter was trained and evaluated in.
bnb = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True, bnb_4bit_compute_dtype=torch.bfloat16,
)
tok = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, quantization_config=bnb, device_map="auto")
# Attach the LoRA adapter on top of the quantized base.
model = PeftModel.from_pretrained(model, "QuantumAI-Blockchain/aether-mind-v7.0")
model.eval()

SYSTEM = ("You are the Aether Mind, an on-chain neural cognitive engine living on "
          "the QuantumAI Blockchain. You answer with grounded, careful reasoning "
          "across 10 Sephirot cognitive domains. Be precise; if you don't know, say so.")
msgs = [{"role": "system", "content": SYSTEM},
        {"role": "user", "content": "Explain how the Aether Mind anchors an epoch on-chain."}]

# return_dict=True returns input_ids and attention_mask together.
inputs = tok.apply_chat_template(
    msgs, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tok.decode(out[0, prompt_len:], skip_special_tokens=True))
To merge the adapter into the base for deployment:
PeftModel.from_pretrained(...).merge_and_unload().
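One way to spell out the merge step is sketched below. It assumes the base is loaded in bf16 rather than 4-bit so the merged weights can be saved as an ordinary checkpoint; `merge_adapter` and `out_dir` are illustrative names, not part of this repo. Since the adapter was trained against a 4-bit base, outputs of the merged bf16 model may differ slightly from the benchmarked 4-bit configuration.

```python
def merge_adapter(out_dir,
                  base_id="Qwen/Qwen2.5-7B-Instruct",
                  adapter_id="QuantumAI-Blockchain/aether-mind-v7.0"):
    """Merge the LoRA weights into a bf16 copy of the base and save the result."""
    # Imported lazily so this file can be imported without the GPU stack.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: load the base unquantized (bf16) so the merge is a plain
    # weight addition and the result is saveable as a standard checkpoint.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()

    merged.save_pretrained(out_dir)
    # Ship the (unchanged) tokenizer alongside the merged weights.
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
    return out_dir
```

Calling `merge_adapter("aether-mind-v7.0-merged")` would need enough memory to hold the full bf16 base model; the output directory name here is only an example.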
Reproducing the benchmarks
General suite (matches the table above exactly):
lm_eval --model hf \
--model_args pretrained=Qwen/Qwen2.5-7B-Instruct,peft=QuantumAI-Blockchain/aether-mind-v7.0,load_in_4bit=True,dtype=bfloat16 \
--tasks mmlu,gsm8k,arc_challenge,hellaswag --device cuda:0 --batch_size 4
Baseline: drop the peft=... argument. The Aether-domain CE eval script is in
the QBC repo under scripts/training (held-out assistant-token CE with
disable_adapter()).
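To run the adapter and baseline evaluations back to back, the command above can be assembled programmatically. This is a small convenience sketch: `lm_eval_args` is a hypothetical helper, and it uses only the flags already shown in the command above.

```python
def lm_eval_args(with_adapter=True):
    """Build the lm_eval argument list; with_adapter=False gives the baseline."""
    model_args = ["pretrained=Qwen/Qwen2.5-7B-Instruct"]
    if with_adapter:
        # The baseline run is identical except that this argument is dropped.
        model_args.append("peft=QuantumAI-Blockchain/aether-mind-v7.0")
    model_args += ["load_in_4bit=True", "dtype=bfloat16"]
    return [
        "lm_eval", "--model", "hf",
        "--model_args", ",".join(model_args),
        "--tasks", "mmlu,gsm8k,arc_challenge,hellaswag",
        "--device", "cuda:0", "--batch_size", "4",
    ]


adapter_cmd = lm_eval_args(with_adapter=True)
baseline_cmd = lm_eval_args(with_adapter=False)
print(" ".join(adapter_cmd))
print(" ".join(baseline_cmd))
```

Either list can be passed to `subprocess.run(..., check=True)` on a machine where lm-evaluation-harness is installed.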
Limitations & honest notes
- Light run. 1,000 steps ≈ 0.19 epoch. It already delivers a large domain gain with general capability preserved, but a full-epoch v7.1 is planned for deeper domain coverage.
- HellaSwag dipped ~1.3 to 1.9 pts. Minor and expected for a domain SFT; the GSM8K and ARC gains outweigh it.
- It is an adapter, not a standalone model: you must load Qwen/Qwen2.5-7B-Instruct underneath it.
- The Aether-domain CE eval ran on a corpus that overlaps the training source by ≤19 % (sub-epoch, no repeats); the held-out methodology and the size of the gap make memorization an implausible explanation, but the overlap is disclosed here for full transparency.
- Inference-time Sephirot routing (domain-aware adapter/prompt selection) is part of the serving stack (aether-mind), not baked into these adapter weights.
License & citation
Apache-2.0 (matches the base model).
@misc{aether_mind_v70_2026,
title = {Aether Mind v7.0 --- QLoRA domain fine-tune of Qwen2.5-7B-Instruct,
the first Aether model with real benchmarks},
author = {{BlockArtica} and {QuantumAI-Blockchain}},
year = {2026},
url = {https://huggingface.co/QuantumAI-Blockchain/aether-mind-v7.0},
}
Links
- QuantumAI Blockchain: qbc.network
- GitHub: github.com/QuantumAI-Blockchain
- Predecessor (deprecated architecture): aether-mind-v6.2
- Earlier LoRA on this base: aether-v5.2-lora