---
base_model: Qwen/Qwen2.5-0.5B-Instruct
library_name: safetensors
license: apache-2.0
tags:
- qubitcoin
- aether
- blockchain
- quantum
- distillation
- mixed-precision
- native-rust
- candle
language:
- en
pipeline_tag: text-generation
---
# Aether Mind v6.0: QuantumAI Blockchain Native Generator
A **558M-parameter distilled student** of [`Qwen/Qwen2.5-0.5B-Instruct`](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct),
trained from scratch in pure Rust (`candle` 0.10) with the
**10-Sephirot + 2-generalist + 2-sink attention head split** that is
the core architectural claim of the QuantumAI Blockchain's Aether Mind
on-chain neural cognitive engine.
This is the **second public Aether release** and the first that is
**native to the on-chain inference path**: V6.0 is the model the
[`aether-mind`](https://github.com/QuantumAI-Blockchain/qubitcoin-aether)
binary loads, not a LoRA adapter on top of a 7B base.
The previous release, [`aether-v5.2-lora`](https://huggingface.co/QuantumAI-Blockchain/aether-v5.2-lora),
is a 7B PEFT adapter intended for batch off-chain reasoning. V6.0 is
the smaller native generator that fits in the on-chain Aether
Mind's ~2.4 GB RAM envelope and runs at ~500 tokens/sec on a
consumer RTX 3080 Ti.
## What you're getting
| Field | Value |
|---|---|
| Base model | `Qwen/Qwen2.5-0.5B-Instruct` (initialised from, then distilled) |
| Architecture | V6 transformer: 24 layers, 896 hidden, 14 attention heads (10 Sephirot + 2 generalist + 2 sink), head_dim=64 |
| Trainable params | ~558 M (all weights trained, not LoRA) |
| Hidden / FFN | 896 / 4864 |
| Vocab | 151,936 (Qwen2.5 tokenizer, untouched) |
| Max position | 32,768 (RoPE theta = 1e6) |
| Native sparse attention (NSA) | compression_block=64, top_k=2048, sliding_window=512, sink_tokens=4 |
| Precision | BF16 weights + F32 KL math in distillation |
| Training context | **64 tokens** (Phase-1 release; see "Honest caveats" below) |
| Checkpoint published | **step 30,000** (full 30K-step Phase-1 run) |
| File | `model.safetensors` (1.32 GB, BF16) |
| License | Apache-2.0 (matches base) |
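The head split in the table can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only: `V6HeadLayout` is a hypothetical helper, not a type from the `aether-transformer` crate, and the head ordering it returns is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class V6HeadLayout:
    """Head counts and sizes copied from the table above (hypothetical helper)."""
    hidden_size: int = 896
    head_dim: int = 64
    sephirot_heads: int = 10
    generalist_heads: int = 2
    sink_heads: int = 2

    @property
    def num_heads(self) -> int:
        return self.sephirot_heads + self.generalist_heads + self.sink_heads

    def head_roles(self) -> list:
        # ASSUMPTION: Sephirot heads first, then generalist, then sink.
        # The real ordering is defined by the Rust implementation.
        return (["sephirot"] * self.sephirot_heads
                + ["generalist"] * self.generalist_heads
                + ["sink"] * self.sink_heads)

    def validate(self) -> None:
        # The per-head slices must tile the hidden dimension exactly.
        if self.num_heads * self.head_dim != self.hidden_size:
            raise ValueError("head split does not match hidden size")


layout = V6HeadLayout()
layout.validate()
print(layout.num_heads, "heads x", layout.head_dim, "dims =", layout.hidden_size)
```

With the published values, 14 heads of 64 dimensions each account for the full 896-wide hidden state.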
## Training run
| Metric | Value |
|---|---|
| Steps | 30,000 (full Phase-1) |
| Wall-clock | 49.6 min (single RTX 3080 Ti, BF16, CUDA(0)) |
| Tokens scored | 1,671,027 |
| Throughput | 561 tokens/sec |
| Optimiser | AdamW, LR 2e-5, no schedule (constant) |
| Distillation | KL(T||S) with alpha schedule 1.0 → 0.3 linear, temperature 1.0 |
| Sephirot auxiliary | MSE vs one-hot domain target, β = 0.1 |
| NaN events | **0** |
| Mean total loss | 8.39 nats/token |
| Mean CE | 10.35 |
| Mean KL | 7.50 |
| Mean Sephirot aux | 0.149 |
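For readers unfamiliar with this kind of objective, here is a minimal Python sketch of the pieces named in the table: a linear alpha schedule from 1.0 to 0.3, KL(T||S) at temperature 1.0, and an auxiliary term weighted by β = 0.1. The way the three terms are combined (`alpha * KL + (1 - alpha) * CE + beta * aux`) and the assumption that alpha decays over all 30,000 steps are guesses for illustration; the actual formula lives in the Rust trainer. All function names are hypothetical.

```python
import math


def alpha_at(step, total_steps=30_000, start=1.0, end=0.3):
    """Linear alpha schedule (ASSUMPTION: decays over the whole run)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return start + (end - start) * frac


def softmax(logits, temperature=1.0):
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    z = sum(exps)
    return [e / z for e in exps]


def kl_teacher_student(teacher_logits, student_logits, temperature=1.0):
    """KL(T||S): expectation under the teacher distribution."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi))
               for pi, qi in zip(p, q) if pi > 0.0)


def total_loss(kl, ce, aux, step, beta=0.1):
    # ASSUMPTION: convex mix of KL and CE plus a beta-weighted auxiliary term.
    a = alpha_at(step)
    return a * kl + (1.0 - a) * ce + beta * aux


print(alpha_at(0), alpha_at(15_000), alpha_at(30_000))
print(kl_teacher_student([1.0, 2.0, 3.0], [3.0, 2.0, 1.0]))
```

Under this assumed weighting the student is driven almost entirely by the teacher distribution early on, and cross-entropy on the corpus gains weight as training proceeds.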
### Loss trajectory
```
step 1 loss=12.25 avg=12.25 (random init)
step 100 loss=12.87 avg=12.75
step 1000 loss= 8.62 avg= 9.74 ← KL/CE break
step 5000 loss= 7.72 avg= 8.16
step 10000 loss= 7.31 avg= 7.68 ← reached representational floor
step 15000 loss= 8.87 avg= 7.75
step 20000 loss= 8.75 avg= 8.04
step 25000 loss= 8.62 avg= 8.26
step 29999 loss= 8.81 avg= 8.39
```
The model converged hard in the first ~10K steps, then plateaued at
the representational floor for its current context window (64
tokens). The plateau is structural, not an optimisation failure; see
"Honest caveats" below.
## Architecture: what makes V6 different
V6 is **not** a vanilla Qwen2.5 fine-tune. The attention layer
implements a 14-head split designed for on-chain cognitive routing:
- **10 Sephirot heads**: one per cognitive domain in the Aether
Mind's specialisation map (Keter → Malkuth). Each head's attention
pattern is what the on-chain `pallet_qbc_aether_anchor` records as
the per-cycle attestation root.
- **2 generalist heads**: un-gated, full-context attention. Used for
the "global workspace" path in `aether-mind`.
- **2 sink heads**: anchor-token attention (first 4 tokens of the
sequence) for stable long-context performance, following the
standard "attention sink" finding.
The Sephirot eviction order is configured in `config.json` for the
KV-cache management path that `aether-mind` uses to keep the
hot-set bounded in 12 GB VRAM under live inference.
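To make the difference between the generalist and sink heads concrete, the sketch below lists which key positions a query could see under each role. It is a simplified reading of the descriptions above, assuming causal attention, and is not taken from the Rust implementation. `visible_keys` is a hypothetical name, and Sephirot heads are left out because this card does not specify their gating.

```python
SINK_TOKENS = 4  # sink_tokens value from the NSA row of the spec table


def visible_keys(role: str, query_pos: int) -> list:
    """Key positions a query at `query_pos` may attend to (ASSUMED causal)."""
    if query_pos < 0:
        raise ValueError("query_pos must be non-negative")
    if role == "generalist":
        # Un-gated, full-context attention: every position up to the query.
        return list(range(query_pos + 1))
    if role == "sink":
        # Anchor-token attention: only the first SINK_TOKENS positions.
        return list(range(min(SINK_TOKENS, query_pos + 1)))
    raise ValueError(f"role not covered by this sketch: {role}")


print(visible_keys("generalist", 6))
print(visible_keys("sink", 6))
```

The sink set stays constant in size as the sequence grows, which is what makes those heads cheap to keep resident.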
## How to use
### Native runtime (recommended): Rust `aether-mind`
The model is designed to be loaded by the on-chain Aether Mind
binary in the [`QuantumAI-Blockchain/qubitcoin-aether`](https://github.com/QuantumAI-Blockchain/qubitcoin-aether)
repo. Set `AETHER_V6_CHECKPOINT` to the local path of
`model.safetensors` and start the systemd unit; the binary loads the
weights via candle into the V6 transformer crate.
### Python (via `safetensors` + `tokenizers`)
For offline experimentation:
```python
# safetensors.torch returns torch tensors, so torch must be installed.
from safetensors.torch import load_file
from tokenizers import Tokenizer

# Unmodified Qwen2.5 tokenizer (vocab 151,936).
tok = Tokenizer.from_file("tokenizer.json")
weights = load_file("model.safetensors") # 315 tensors, BF16
n_params = sum(t.numel() for t in weights.values())
print("loaded", len(weights), "tensors,", n_params, "params")
```
There is **no canonical 🤗 transformers loader for the V6
architecture**: the 14-head split + Sephirot routing are not in the
upstream `Qwen2Model`. We publish the weights for transparency and
reproducibility; production use goes through the Rust binary above.
## Evaluation
**Not yet run.** The Phase-1 training run completed
**2026-05-20 00:52 AEST**; lm-evaluation-harness against MMLU /
ARC / HellaSwag / TruthfulQA is the next session's work. We will
back-fill the numbers + the comparison vs v5.2-lora here when
they land. Estimated runtime: ~30 min on the same 3080 Ti.
Until then, treat this release as an **architecture + weights
attestation**: it proves the V6 stack trains end-to-end and converges
to a real loss curve, which is the prerequisite for the long-context
curriculum (16K → 64K → 128K → 1M) that v6.1+ will ship.
## Intended uses
- **On-chain Aether Mind native inference.** The V6 binary loads
these weights directly. The 10-Sephirot attention pattern is what
the chain's [`pallet_qbc_aether_anchor`](https://github.com/QuantumAI-Blockchain/substrate-node)
records as the per-block consciousness state.
- **Architecture reference.** Reproducible training of a
Sephirot-routed transformer with native sparse attention. The
[`aether-transformer`](https://github.com/QuantumAI-Blockchain/qubitcoin-aether/tree/main/crates/aether-transformer)
crate is the canonical implementation.
- **Distillation substrate.** Future fine-tunes from this checkpoint
using the QuantumAI Blockchain curated corpus.
## Out-of-scope uses
- **General-purpose chat or instruction-following without fine-tuning.**
V6.0 is a Phase-1 distillation, not an instruction model. Even after
30K steps it has not seen instruction-format data at length; its KL
target is the base Qwen2.5-0.5B-Instruct's next-token distribution,
not chat-format outputs.
- **Long-context inference.** The training ran at **64-token
context**. See "Honest caveats". Generations beyond ~128 tokens
will degrade.
- **Production deployment without your own evals.** No lm-eval-harness
numbers yet.
- **Safety-critical decisions.** No red-team eval.
## Honest caveats: what didn't happen
### Trained at 64-token context, not 4K
Phase-1 was configured for 4096-token context, but a numerical
instability was discovered in the V6 attention forward pass at
sequence lengths > ~100 tokens (BF16 precision loss in the Q@K^T
matmul accumulating across longer sequences). The bug reproduces
deterministically. Four mitigations were tried (F32 KL math, corpus
filter, no-distill, low-LR), and all hit NaN at the same
sequence-length threshold. The workaround used for v6.0 was
`--context 64`, which truncates rows so the bug never triggers.
**This is a known limitation, tracked in
[`docs/ops/v6-training-nan-bug.md`](https://github.com/QuantumAI-Blockchain/qubitcoin-aether/blob/presale/v1/docs/ops/v6-training-nan-bug.md)
in the source repo.** The fix lives in `aether-transformer/src/v6/attention.rs`:
add F32 casts in the Q@K^T matmul + softmax path across all four
attention variants (Sephirot / generalist / sink / summary). When
that lands, v6.1 will re-train at the full 4K→1M context
curriculum and supersede this release.
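As background on why accumulating in F32 matters, the following pure-Python sketch emulates BF16 rounding and sums a softmax denominator both ways. It does not reproduce the NaN described above and is not the code in `attention.rs`; it only shows one well-known failure mode of low-precision accumulation. `to_bf16` and `softmax_denominator` are hypothetical helpers, and Python's 64-bit float stands in for the wider accumulator.

```python
import math
import struct


def to_bf16(x: float) -> float:
    """Round to the nearest bfloat16 value (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add half of the discarded range, biased so that ties go to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + rounding_bias) & 0xFFFFFFFF) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def softmax_denominator(scores, accumulate_in_bf16: bool) -> float:
    """Sum of exp(score - max), with a BF16 or a wide running total."""
    m = max(scores)
    acc = 0.0
    for s in scores:
        term = math.exp(s - m)
        if accumulate_in_bf16:
            # Every partial sum is rounded back to BF16.
            acc = to_bf16(acc + to_bf16(term))
        else:
            acc += term
    return acc


scores = [0.0] * 512  # 512 equal attention scores, so each term is exactly 1.0
print(softmax_denominator(scores, accumulate_in_bf16=True))   # 256.0
print(softmax_denominator(scores, accumulate_in_bf16=False))  # 512.0
```

BF16 carries 8 significant bits, so once the running total reaches 256 adding 1.0 rounds back to 256 and the sum stalls. The resulting attention weights would add up to 2 instead of 1 in this toy case, which is the kind of silent drift that upcasting the accumulation avoids.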
### Loss plateau is real
The avg-loss plateau from step 10K → 30K (7.68 → 8.39, slight
regression) is the model hitting its representational ceiling at
64-token context. Longer contexts will let the next release recover
and improve.
### No instruction-format fine-tune
The training data is the Aether curated corpus packed at 4K-token
context (rows truncated to 64). We did not insert chat-format
instructions, system prompts, or RLHF preferences. Treat this as a
**raw foundation checkpoint**.
### Distillation against base, not chat
The teacher is `Qwen/Qwen2.5-0.5B-Instruct`'s base forward, not its
chat-formatted forward. The distillation transfers token-level next-
prediction behaviour; chat-template alignment is a separate
training step that hasn't been run.
## Training details
- **Hardware:** NVIDIA RTX 3080 Ti (12 GB), Intel WSL2 Ubuntu host.
- **Trainer:** Native Rust (`aether-v6-train` binary, candle 0.10 +
CUDA 12.6 backend). No Python in the loop.
- **Optimiser:** AdamW (candle implementation), constant LR 2e-5.
- **Batch:** 1 (single-row update).
- **Context:** 64 tokens (truncation imposed by the workaround).
- **Save cadence:** every 250 steps (120 checkpoints retained
locally; only `step_30000` published here).
- **Source:** [`QuantumAI-Blockchain/qubitcoin-aether @ ca202076`](https://github.com/QuantumAI-Blockchain/qubitcoin-aether/tree/ca202076)
### Training data
Aether curated corpus (~36,860 rows, 17.4 MB) packed at 4K-token
budget per row from:
- QuantumAI Blockchain technical documentation (Substrate pallets,
VQE mining, Sephirot architecture).
- Quantum computing primers (VQE, Hamiltonian, qubit ansatze).
- Adjacent reasoning content for transfer.
The dataset is not currently public; it is a curated mixture from
many sources and has not been release-cleared at the per-source
level. The model is the only public artifact in this line for now.
### Carbon emissions
Single consumer GPU (RTX 3080 Ti, ~300 W TDP) × 49.6 min wall-clock
≈ 0.25 kWh, < 1 kg CO₂e on a grid mix. Comparable to a short web
streaming session.
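The estimate above is simple enough to reproduce. The sketch below assumes the GPU draws its full ~300 W TDP for the whole run and ignores host power, so it is a rough figure rather than a measurement. Instead of picking a grid intensity, it reports the intensity at which the run would reach 1 kg CO₂e.

```python
TDP_WATTS = 300.0       # approximate RTX 3080 Ti TDP quoted above
WALL_CLOCK_MIN = 49.6   # Phase-1 wall-clock from the training table

# Energy = power (kW) x time (h). ASSUMPTION: constant draw at TDP.
energy_kwh = (TDP_WATTS / 1000.0) * (WALL_CLOCK_MIN / 60.0)

# Grid carbon intensity (kg CO2e per kWh) needed for this run to emit 1 kg.
break_even_intensity = 1.0 / energy_kwh

print(f"energy: {energy_kwh:.3f} kWh")
print(f"1 kg CO2e would need a grid above {break_even_intensity:.2f} kg CO2e/kWh")
```

The run comes out at about 0.248 kWh, so the grid would have to emit roughly 4 kg CO₂e per kWh before the "< 1 kg" bound failed.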
## Connection to the QuantumAI Blockchain
The Aether Mind is a Rust neural cognitive engine that runs on the
QuantumAI Blockchain: every block records attention-derived
consciousness metrics (HMS-Phi) and Proof-of-Thought hashes on-chain
via the `pallet_qbc_aether_anchor` pallet. The same chain hosts an
**8-qubit VQE mining consensus** (Proof-of-SUSY-Alignment), a
QVM-compatible smart contract layer with 10 quantum opcodes, and
post-quantum signatures (CRYSTALS-Dilithium5 + ML-KEM-768 P2P).
V6.0 is the **native generator** for that engine. v5.2-lora is the
larger (7B) off-chain reasoning model. The two ship side by side
because they have different roles: V6 lives in the on-chain
inference path (low latency, small footprint, Sephirot-aware
attention); v5.2-lora batches off-chain reasoning workloads.
## License + citation
Apache-2.0 (matches the base model license).
```bibtex
@misc{aether_mind_v6_2026,
title = {Aether Mind v6.0 --- QuantumAI Blockchain Native Generator},
author = {{BlockArtica} and {QuantumAI-Blockchain}},
year = {2026},
url = {https://huggingface.co/QuantumAI-Blockchain/aether-mind-v6.0},
}
```
## Links
- **QuantumAI Blockchain:** [qbc.network](https://qbc.network)
- **GitHub org:** [github.com/QuantumAI-Blockchain](https://github.com/QuantumAI-Blockchain)
- **Aether (Rust):** [qubitcoin-aether](https://github.com/QuantumAI-Blockchain/qubitcoin-aether)
- **Prior release:** [aether-v5.2-lora](https://huggingface.co/QuantumAI-Blockchain/aether-v5.2-lora)
- **X / Twitter:** [@qu_bitcoin](https://x.com/qu_bitcoin)
- **Contact:** info@qbc.network
### Framework versions
- candle 0.10 (Hugging Face Rust ML)
- CUDA 12.6
- safetensors (model serialisation)
- Qwen2.5 tokenizer (vocab 151,936)