HDC-Brain v14.1: Base (Pretrain Checkpoint)
A 299M-parameter hyperdimensional language model pretrained on 3B tokens of FineWeb-Edu. This is the pretrain checkpoint; for instruction-following use hdc-brain-v14.1-finetune-v3.
Paper: HDC-Brain: A 300M Hyperdimensional Language Model with Bipolar Codebook (Hasjanov, 2026), Zenodo DOI 10.5281/zenodo.19653726. Code: https://github.com/OlegPhenomenon/hdc-brain
What is this
HDC-Brain replaces components of a standard transformer with four HDC-native primitives:
- STE bipolar codebook: token embeddings are sign-constrained ±1 vectors (1 bit per parameter at inference). A 32K × 4096 codebook takes 16 MB at 1 bit per entry vs 512 MB in float32.
- Multi-head binding attention: 3 learned binding vectors per head instead of QKV projections. 12,288 params/layer vs 67M in a transformer of equivalent width (5461× reduction).
- Thought loops: iterative reasoning with K=3 passes through a shared block stack.
- Parallel-scan HDC memory: learned mass/decay recurrence with O(D) state, replacing the KV-cache.
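The storage and parameter figures in the list above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not code from the repository: it assumes D = 4096 (the codebook width), a vocabulary of 32,000 rows (the value passed to create_model in Usage), and that the 67M baseline counts four D × D projections (Q, K, V, output) without biases. All variable names are illustrative.

```python
D = 4096          # model width, taken from the 32K x 4096 codebook
VOCAB = 32_000    # assumption: same value passed to create_model in Usage
MIB = 1024 ** 2


def codebook_bytes(vocab, dim, bits_per_entry):
    """Storage for a vocab x dim table at the given precision."""
    return vocab * dim * bits_per_entry // 8


bipolar_mib = codebook_bytes(VOCAB, D, 1) / MIB    # 1 bit per entry
float32_mib = codebook_bytes(VOCAB, D, 32) / MIB   # 32 bits per entry

# Binding attention: 3 binding vectors whose head slices together span D.
binding_params = 3 * D
# Assumed baseline: Q, K, V and output projections, each D x D, no biases.
qkvo_params = 4 * D * D
reduction = qkvo_params / binding_params

print(f"bipolar codebook: {bipolar_mib:.3f} MiB")
print(f"float32 codebook: {float32_mib:.1f} MiB")
print(f"binding params/layer: {binding_params:,}")
print(f"QKV+O params/layer:   {qkvo_params:,}")
print(f"reduction: {reduction:.0f}x")
```

With exactly 32,000 rows the two codebook sizes come out at 15.625 MiB and 500 MiB. The rounded 16 MB and 512 MB figures correspond to 32 × 1024 = 32,768 rows.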
Key numbers
| Metric | Value |
|---|---|
| Parameters | 299,290,629 |
| Pretrain data | 3B tokens FineWeb-Edu |
| Training time | 88 h on single RTX 3090 |
| Validation loss | 5.434 bits/token ≈ 1.25 bits/byte |
| Gap to SmolLM-360M | +0.44 bits/byte (behind) |
| Gap to GPT-2-medium | −0.13 bits/byte (ahead) |
| Codebook storage | 16 MB (vs 512 MB float32) |
Baselines measured on the same FineWeb-Edu sample; see paper §5.
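Two derived quantities follow directly from the table: the average number of bytes per token implied by the bits/token and bits/byte figures, and the tokens-per-parameter ratio of the pretraining run. The sketch below only rearranges the reported numbers; because 1.25 bits/byte is itself rounded, the bytes-per-token value is approximate.

```python
bits_per_token = 5.434        # validation loss from the table
bits_per_byte = 1.25          # same loss, normalised per byte (rounded)
params = 299_290_629          # parameter count from the table
train_tokens = 3_000_000_000  # 3B tokens of FineWeb-Edu

# bits/byte = bits/token divided by average bytes/token, so:
bytes_per_token = bits_per_token / bits_per_byte

# Training tokens seen per model parameter.
tokens_per_param = train_tokens / params

print(f"implied bytes per token: {bytes_per_token:.2f}")
print(f"tokens per parameter:    {tokens_per_param:.2f}")
```

The second number is the roughly 10:1 ratio that the Limitations section compares with Chinchilla's 20:1.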
Usage
This model does not follow instructions: it is the raw pretrain checkpoint. For instruction-following, use the finetune-v3 variant.
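import sys
import torch

# Make the model code importable; the directory comes from github.com/OlegPhenomenon/hdc-brain
sys.path.insert(0, "hdc-brain-v14.1")
from hdc_brain_v14_1 import create_model

# Load on CPU; weights_only=True restricts unpickling to tensors and plain containers
ckpt = torch.load("best_hdc_brain_v14_1.pt", map_location="cpu", weights_only=True)
model, _ = create_model(32000, ckpt["config"])
model.load_state_dict(ckpt["model"])
model.eval()  # switch to inference mode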
Full inference script: chat.py. Pass --clean to load this checkpoint instead of the default finetune.
Tokenizer
32K English BPE (SentencePiece). The tokenizer ships with the code repo: bpe_en_32k.model.
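A minimal sketch of loading the tokenizer with the sentencepiece Python package. It assumes bpe_en_32k.model has been copied into the working directory (its location inside the repo is not stated here); load_tokenizer and roundtrip are illustrative helper names, not functions from the repository.

```python
def load_tokenizer(model_path="bpe_en_32k.model"):
    """Load the SentencePiece model; the default path is an assumption."""
    import sentencepiece as spm  # third-party: pip install sentencepiece
    return spm.SentencePieceProcessor(model_file=model_path)


def roundtrip(sp, text):
    """Encode text to token ids, then decode the ids back to a string."""
    ids = sp.encode(text, out_type=int)
    return ids, sp.decode(ids)


# Usage, once the model file is available locally:
#   sp = load_tokenizer()
#   ids, text = roundtrip(sp, "Hyperdimensional computing")
#   print(sp.get_piece_size(), ids, text)
```

The import sits inside load_tokenizer so the helpers can be defined without sentencepiece installed.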
Limitations
Feasibility study, not a frontier model:
- Single run, no hyperparameter sweep, no seed averaging
- Undertrained relative to Chinchilla (10:1 tokens:params vs 20:1)
- Compute advantage of the bipolar codebook requires custom XNOR/POPCNT kernels, which are not implemented here; only the storage advantage is realised
- Codebook is random bipolar with STE; semantic initialisation (FastText → sign) is deferred to future work
Full discussion in paper §6.
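For context on the XNOR/POPCNT point above, the following pure-Python sketch shows the identity such kernels rely on: for two ±1 vectors packed one bit per element, the dot product equals 2 × popcount(XNOR(a, b)) − D. It is an illustration of the general technique, not the model's code, and the function names are hypothetical. A real kernel would operate on machine words rather than Python integers.

```python
def pack_bipolar(vec):
    """Pack +1/-1 values into an int; bit i is set when vec[i] == +1."""
    word = 0
    for i, v in enumerate(vec):
        if v == 1:
            word |= 1 << i
    return word


def bipolar_dot(a_bits, b_bits, dim):
    """Dot product of two packed bipolar vectors via XNOR and popcount."""
    mask = (1 << dim) - 1
    # XNOR marks positions where the signs agree; the mask drops high bits.
    agree = bin(~(a_bits ^ b_bits) & mask).count("1")
    # Agreeing positions contribute +1, disagreeing positions -1.
    return 2 * agree - dim


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
packed = bipolar_dot(pack_bipolar(a), pack_bipolar(b), len(a))
reference = sum(x * y for x, y in zip(a, b))
print(packed, reference)
```

Both values agree, which is what lets a 1-bit codebook replace floating-point multiply-accumulates with bitwise operations once suitable kernels exist.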
Citation
@misc{hasjanov2026hdcbrain,
author = {Oleg Hasjanov},
title = {HDC-Brain: A 300M Hyperdimensional Language Model with Bipolar Codebook},
publisher = {Zenodo},
year = {2026},
doi = {10.5281/zenodo.19653726},
url = {https://doi.org/10.5281/zenodo.19653726}
}
License
Weights: CC BY-NC 4.0, free for research, academic, and personal non-commercial use. Commercial use requires a separate license. Contact: oleg.phenomenon@gmail.com.
The code at https://github.com/OlegPhenomenon/hdc-brain is released under Apache 2.0 and is unrestricted.