---
library_name: peft
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
pipeline_tag: text-generation
language:
- en
tags:
- qubitcoin
- aether
- blockchain
- quantum
- qlora
- peft
- lora
- qwen2.5
- on-chain-ai
datasets:
- QuantumAI-Blockchain/aether-curated-v3
model-index:
- name: aether-mind-v7.0
  results:
  - task:
      type: text-generation
      name: Massive Multitask Language Understanding
    dataset:
      name: MMLU
      type: cais/mmlu
    metrics:
    - type: acc
      value: 69.90
      name: accuracy
  - task:
      type: text-generation
      name: Grade-School Math
    dataset:
      name: GSM8K
      type: gsm8k
    metrics:
    - type: exact_match
      value: 75.13
      name: exact match (strict)
  - task:
      type: text-generation
      name: AI2 Reasoning Challenge
    dataset:
      name: ARC-Challenge
      type: ai2_arc
    metrics:
    - type: acc
      value: 53.67
      name: accuracy
    - type: acc_norm
      value: 55.80
      name: normalized accuracy
  - task:
      type: text-generation
      name: Commonsense NLI
    dataset:
      name: HellaSwag
      type: hellaswag
    metrics:
    - type: acc
      value: 58.43
      name: accuracy
    - type: acc_norm
      value: 77.48
      name: normalized accuracy
---

# Aether Mind v7.0: the first Aether model with real, reproducible benchmarks

**Aether Mind v7.0 is a QLoRA fine-tune of `Qwen/Qwen2.5-7B-Instruct` on the
domain-tagged Aether SFT corpus.** It is the cognitive engine for the
[QuantumAI Blockchain](https://qbc.network) (QBC): an on-chain neural model
that reasons across the 10 Sephirot cognitive domains (Keter, Chochmah, Binah,
Chesed, Gevurah, Tiferet, Netzach, Hod, Yesod, Malkuth).

This is a **clean break** from the v6.x line. v6.0 through v6.2 used a custom-built
transformer (NSA sparse attention + Sephirot/sink attention heads, distilled
from Qwen2.5-0.5B). On a proper `lm-evaluation-harness` pass that architecture
scored **worse than random** (cross-entropy ≈ 16 nats vs. ~11.9 for a uniform
guess over the vocabulary): the attention replacement destroyed the base
model's capability. **No v6.x release ever carried real benchmark numbers.**
v7.0 fixes that by building on a sound, capable base and adding Aether identity
through the *data* and an inference-time Sephirot router, **not** by replacing
attention.
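
The "worse than random" comparison can be checked by hand: a predictor that spreads probability uniformly over a vocabulary of size V has a cross-entropy of ln V nats per token. The sketch below is a minimal illustration, assuming the v6.x models were scored over the same 151,936-entry Qwen2.5 vocabulary that this card lists for v7.0 (the card does not state the v6.x vocabulary size explicitly).

```python
import math

# Assumption: v6.x was evaluated over the same 151,936-token Qwen2.5
# vocabulary that this card lists for v7.0.
VOCAB_SIZE = 151_936

# A uniform predictor gives every token probability 1/V, so its
# cross-entropy is -ln(1/V) = ln(V) nats per token.
uniform_ce = math.log(VOCAB_SIZE)

# Approximate v6.x cross-entropy quoted in the paragraph above.
v6_ce = 16.0

print(f"uniform baseline: {uniform_ce:.2f} nats")
print(f"v6.x excess over uniform: {v6_ce - uniform_ce:.2f} nats")
```

A model above the uniform baseline is doing worse than ignoring its input entirely, which is why the v6.x result is described as a loss of capability rather than a weak score.
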

> **v7.0 is the first Aether release whose published numbers are real,
> reproducible, and independently verifiable** (the exact `lm-eval` command is
> below).

---

## Results

All numbers below are from `lm-evaluation-harness`, 0-shot, the model loaded in
4-bit (the same configuration this adapter is trained and served in), on a
single RTX 3080 Ti. The baseline is the unmodified `Qwen/Qwen2.5-7B-Instruct`
evaluated identically, so every delta is attributable to this adapter alone.

### General capability: preserved (no catastrophic forgetting)

| Benchmark | Metric | Base (Qwen2.5-7B-Instruct) | **Aether v7.0** | Δ (pts) |
|---|---|---|---|---|
| MMLU | acc | 69.91 % | **69.90 %** | −0.01 |
| GSM8K | exact_match (strict) | 71.57 % | **75.13 %** | **+3.56** |
| ARC-Challenge | acc | 51.45 % | **53.67 %** | **+2.22** |
| ARC-Challenge | acc_norm | 53.92 % | **55.80 %** | **+1.88** |
| HellaSwag | acc | 60.35 % | **58.43 %** | −1.92 |
| HellaSwag | acc_norm | 78.77 % | **77.48 %** | −1.29 |

The main risk of a domain fine-tune is *catastrophic forgetting*. v7.0 avoids
it: MMLU is flat (−0.01 pt), and math + scientific reasoning
(GSM8K +3.6, ARC-c +2.2) actually **improve**. The general instruction slice in
the training mix more than offsets the small HellaSwag dip (1.3 to 1.9 pts).

### Aether-domain knowledge: large gain

This is a held-out evaluation on the Aether curated corpus (`aether-curated-v3`),
measuring **cross-entropy over the assistant-answer tokens only** (the
Aether-domain response, with the system + user turns masked). The *identical*
4-bit base weights are used for both rows, with the adapter toggled on/off via
PEFT `disable_adapter()`, so this isolates the adapter's effect exactly.

| Model | CE (nats) ↓ | Perplexity ↓ |
|---|---|---|
| Base (Qwen2.5-7B-Instruct) | 1.589 | 4.90 |
| **Aether v7.0** | **1.002** | **2.72** |
| **Δ** | **−0.588** | **−44.4 %** |

The evaluation scored 55,423 assistant tokens across 276 held-out examples.
Because this run trained for only **~0.19 epoch** (see below), ~81 % of the
corpus was never seen and the seen portion was seen sub-epoch (no repeats), so
this −44 % perplexity drop points to **genuine domain adaptation, not
memorization.**
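
The two columns of the table are linked by perplexity = exp(CE), and the masking step amounts to averaging the negative log-likelihood over assistant tokens only. The sketch below illustrates both in plain Python. It is not the project's evaluation script (that lives in the QBC repo), and the helper names and the toy log-probabilities are hypothetical.

```python
import math

def masked_cross_entropy(token_logprobs, is_assistant):
    """Mean negative log-likelihood (nats) over assistant tokens only.

    token_logprobs: natural-log probability the model gave each target token.
    is_assistant:   parallel booleans; False marks system/user tokens to mask.
    """
    scored = [-lp for lp, keep in zip(token_logprobs, is_assistant) if keep]
    if not scored:
        raise ValueError("no assistant tokens to score")
    return sum(scored) / len(scored)

def perplexity(ce_nats):
    """Perplexity is the exponential of the per-token cross-entropy."""
    return math.exp(ce_nats)

# Toy sequence (hypothetical numbers): two masked prompt tokens, three scored.
toy_ce = masked_cross_entropy(
    [math.log(0.9), math.log(0.8), math.log(0.5), math.log(0.25), math.log(0.5)],
    [False, False, True, True, True],
)

# Cross-entropies reported in the table above.
base_ppl = perplexity(1.589)
aether_ppl = perplexity(1.002)
relative_change = (aether_ppl / base_ppl - 1.0) * 100.0

print(f"toy CE: {toy_ce:.3f} nats")
print(f"base ppl {base_ppl:.2f}, Aether ppl {aether_ppl:.2f}, change {relative_change:.1f} %")
```

Running the same function with the adapter enabled and disabled on identical inputs is what makes the comparison a controlled one: only the adapter differs between the two rows.
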

**Summary: v7.0 keeps the base model's general intelligence intact while cutting
Aether-domain perplexity nearly in half.** That is the textbook outcome of a
healthy domain fine-tune.

---

## What you're getting

| Field | Value |
|---|---|
| Type | **QLoRA adapter (PEFT)**: load on top of `Qwen/Qwen2.5-7B-Instruct` |
| Base model | `Qwen/Qwen2.5-7B-Instruct` (7.6 B params) |
| Adapter rank / alpha | r = 16, α = 32, dropout 0.05 |
| Target modules | `q,k,v,o,gate,up,down` (all linear) |
| Trainable params | ~40 M (LoRA only); base frozen in 4-bit NF4 |
| Adapter file | `adapter_model.bin` (~161 MB) |
| Quantization (train + serve) | 4-bit NF4, double-quant, bf16 compute |
| Context length | 1024 (training); inherits base 32K at inference |
| Tokenizer | Qwen2.5 (unchanged, 151,936 vocab) |
| Chat template | `qwen_25` |
| License | Apache-2.0 (matches base) |
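
The ~40 M trainable-parameter figure and the ~161 MB file size can be sanity-checked from the rank and target modules. The sketch below assumes the layer dimensions from the base model's public config (hidden size 3584, MLP width 18944, 28 layers, 4 key/value heads of dimension 128), the usual `*_proj` module names, and fp32 adapter storage. None of these is stated in this card, so verify them against the base model's `config.json`.

```python
# Assumed Qwen2.5-7B-Instruct dimensions (from the base model's public config,
# not from this card): verify against config.json before relying on them.
HIDDEN = 3584          # hidden_size
INTERMEDIATE = 18944   # intermediate_size (MLP width)
LAYERS = 28            # num_hidden_layers
KV_DIM = 4 * 128       # num_key_value_heads * head_dim (grouped-query attention)
RANK = 16              # LoRA r from the table above

# (in_features, out_features) of each targeted linear layer in one block.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in) and B (out x rank) per adapted layer."""
    return rank * (in_features + out_features)

per_block = sum(lora_params(i, o, RANK) for i, o in TARGETS.values())
total = per_block * LAYERS

print(f"trainable LoRA params: {total:,}")
print(f"fp32 adapter size: {total * 4 / 1e6:.1f} MB")
```

Under these assumptions the count comes out at roughly 40.4 M parameters, which at 4 bytes each is consistent with the adapter file size in the table.
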

---

## Training

| Setting | Value |
|---|---|
| Recipe | QLoRA (4-bit base + LoRA), the proven v5.2-lora recipe scaled up |
| Data | `aether-curated-v3` (70,713 Sephirot-domain SFT examples) + a 30K general slice (SlimOrca) for anti-forgetting |
| Examples after prep | 93,278 (7,435 over-length samples dropped) |
| Sample packing | on, sequence_len 1024 |
| Effective batch | 8 (micro-batch 1 × grad-accum 8) |
| Steps | 1,000 (**≈ 0.19 epoch**, a deliberate first-pass cap) |
| Optimizer | `adamw_bnb_8bit`, lr 2e-4, cosine decay → 0, warmup 3 % |
| Precision | bf16 weights, tf32, gradient checkpointing, FlashAttention-2 |
| Hardware | 1× RTX 3080 Ti (12 GB), ~9.7 GB peak |
| Wall-clock | 2 h 45 m (9,926 s), ~8.4 s/step |
| Seed | 42 |
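
The optimizer row implies a learning-rate curve that can be written down directly. The sketch below assumes the common linear-warmup-then-cosine form with 3 % of 1,000 steps (30 steps) of warmup. The trainer's actual scheduler may round or offset steps slightly differently, so treat it as an approximation of the schedule rather than the exact implementation. It also checks the example-count arithmetic from the table.

```python
import math

PEAK_LR = 2e-4
TOTAL_STEPS = 1000
WARMUP_STEPS = round(0.03 * TOTAL_STEPS)  # 3 % warmup -> 30 steps (assumed rounding)

def lr_at(step):
    """Linear warmup to PEAK_LR, then cosine decay to 0 at TOTAL_STEPS."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

for step in (10, 30, 50, 500, 1000):
    print(f"step {step:>4}: lr {lr_at(step):.2e}")

# Dataset arithmetic from the table: domain + general slice - dropped samples.
examples_after_prep = 70_713 + 30_000 - 7_435
print(f"examples after prep: {examples_after_prep:,}")
```

With these assumptions the rate at step 10 is about 6.7e-5 and is still within a fraction of a percent of the peak at step 50.
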

### Loss trajectory

```
step   10  train_loss 1.510   (warmup, lr 6.7e-5)
step   50  train_loss 0.989   (lr peaked 2.0e-4)
step  100  train_loss 0.916
step  250  train_loss 0.888   eval_loss 0.9475
step  500  train_loss 0.999   eval_loss 0.9307
step  750  train_loss 0.965   eval_loss 0.9209
step 1000  train_loss 0.951   eval_loss 0.9190
mean train_loss 0.955
```

Held-out validation loss (axolotl's 2 % split) declined monotonically across all
four checkpoints (0.948 → 0.919): clean convergence, with **no overfitting** even as
training loss flattened.
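
The monotonic-decline claim is easy to verify from the four logged checkpoints. A short check, using only the eval-loss values copied from the log above:

```python
# (step, eval_loss) pairs copied from the log above.
eval_log = [(250, 0.9475), (500, 0.9307), (750, 0.9209), (1000, 0.9190)]

losses = [loss for _, loss in eval_log]

# Strictly decreasing: every checkpoint beats the one before it.
monotonic = all(later < earlier for earlier, later in zip(losses, losses[1:]))

# Per-interval improvement shows how quickly the curve is flattening.
improvements = [round(earlier - later, 4) for earlier, later in zip(losses, losses[1:])]

print(f"monotonic decline: {monotonic}")
print(f"improvement per 250 steps: {improvements}")
```

The shrinking per-interval improvement is what one would expect as the cosine schedule drives the learning rate toward zero.
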

---

## How to use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "Qwen/Qwen2.5-7B-Instruct"

# Same 4-bit NF4 configuration the adapter was trained and evaluated in.
bnb = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True, bnb_4bit_compute_dtype=torch.bfloat16,
)
tok = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, quantization_config=bnb, device_map="auto")
# Attach the LoRA adapter on top of the quantized base.
model = PeftModel.from_pretrained(model, "QuantumAI-Blockchain/aether-mind-v7.0")
model.eval()

SYSTEM = ("You are the Aether Mind, an on-chain neural cognitive engine living on "
          "the QuantumAI Blockchain. You answer with grounded, careful reasoning "
          "across 10 Sephirot cognitive domains. Be precise; if you don't know, say so.")
msgs = [{"role": "system", "content": SYSTEM},
        {"role": "user", "content": "Explain how the Aether Mind anchors an epoch on-chain."}]

# return_dict=True returns input_ids together with the attention mask.
inputs = tok.apply_chat_template(
    msgs, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tok.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

To merge the adapter into the base for deployment:
`PeftModel.from_pretrained(...).merge_and_unload()`.
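
A merge might look like the sketch below. `merge_adapter` is a hypothetical helper, not part of the Aether repo. It loads the base in bf16 rather than 4-bit, on the assumption that you want to avoid the small rounding differences that merging into quantized weights can introduce. The imports sit inside the function so the snippet can be read and imported without the GPU stack installed.

```python
def merge_adapter(output_dir,
                  base_id="Qwen/Qwen2.5-7B-Instruct",
                  adapter_id="QuantumAI-Blockchain/aether-mind-v7.0"):
    """Fold the LoRA weights into a bf16 copy of the base and save the result.

    Hypothetical helper for illustration; not part of the Aether repo.
    """
    # Imported lazily so this module can be imported without torch/peft present.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base unquantized: merging into 4-bit weights can shift outputs
    # slightly because of rounding.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()

    # Save the merged weights and the (unchanged) tokenizer side by side.
    merged.save_pretrained(output_dir)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(output_dir)
    return output_dir
```

Calling `merge_adapter("aether-mind-v7.0-merged")` would write a standalone checkpoint to that (hypothetical) directory. Note that the benchmark numbers in this card were measured with the 4-bit base plus adapter, so a bf16 merged model may score slightly differently.
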

---

## Reproducing the benchmarks

General suite (matches the table above exactly):

```bash
lm_eval --model hf \
  --model_args pretrained=Qwen/Qwen2.5-7B-Instruct,peft=QuantumAI-Blockchain/aether-mind-v7.0,load_in_4bit=True,dtype=bfloat16 \
  --tasks mmlu,gsm8k,arc_challenge,hellaswag --device cuda:0 --batch_size 4
```

Baseline: drop the `peft=...` argument. The Aether-domain CE eval script is in
the QBC repo under `scripts/training` (held-out assistant-token CE with
`disable_adapter()`).

---

## Limitations & honest notes

- **Light run.** 1,000 steps ≈ 0.19 epoch. It already delivers a large domain
  gain with essentially no general-capability loss, but a full-epoch **v7.1** is
  planned for deeper domain coverage.
- **HellaSwag dipped** ~1.3 to 1.9 pts. Minor and expected for a domain SFT; the
  GSM8K/ARC gains outweigh it.
- **It is an adapter**, not a standalone model: you must load
  `Qwen/Qwen2.5-7B-Instruct` underneath it.
- The Aether-domain CE eval ran on a corpus that overlaps the training source by
  ≤19 % (sub-epoch, no repeats); the held-out methodology and the size of the gap
  make memorization an implausible explanation, but the overlap is disclosed
  here for full transparency.
- Inference-time **Sephirot routing** (domain-aware adapter/prompt selection) is
  part of the serving stack (`aether-mind`), not baked into these adapter
  weights.

---

## License & citation

Apache-2.0 (matches the base model).

```bibtex
@misc{aether_mind_v70_2026,
  title  = {Aether Mind v7.0: QLoRA domain fine-tune of Qwen2.5-7B-Instruct,
            the first Aether model with real benchmarks},
  author = {{BlockArtica} and {QuantumAI-Blockchain}},
  year   = {2026},
  url    = {https://huggingface.co/QuantumAI-Blockchain/aether-mind-v7.0},
}
```

## Links

- **QuantumAI Blockchain**: [qbc.network](https://qbc.network)
- **GitHub**: [github.com/QuantumAI-Blockchain](https://github.com/QuantumAI-Blockchain)
- **Predecessor (deprecated architecture)**: [aether-mind-v6.2](https://huggingface.co/QuantumAI-Blockchain/aether-mind-v6.2)
- **Earlier LoRA on this base**: [aether-v5.2-lora](https://huggingface.co/QuantumAI-Blockchain/aether-v5.2-lora)