HDC-Brain v14.1 — Base (Pretrain Checkpoint)

A 299M-parameter hyperdimensional language model pretrained on 3B tokens of FineWeb-Edu. This is the pretrain checkpoint; for instruction-following use hdc-brain-v14.1-finetune-v3.

Paper: HDC-Brain: A 300M Hyperdimensional Language Model with Bipolar Codebook (Hasjanov, 2026) — Zenodo DOI 10.5281/zenodo.19653726. Code: https://github.com/OlegPhenomenon/hdc-brain

What is this

HDC-Brain replaces standard transformer components with four HDC-native primitives:

STE bipolar codebook — token embeddings are sign-constrained ±1 vectors (1 bit per parameter at inference). 32K × 4096 = 16 MB vs 512 MB float32.
Multi-head binding attention — 3 learned binding vectors per head instead of QKV projections. 12,288 params/layer vs 67M in a transformer of equivalent width (5461× reduction).
Thought loops — iterative K=3 pass reasoning through a shared block stack.
Parallel-scan HDC memory — learned mass/decay recurrence with O(D) state, replacing KV-cache.
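
The storage and parameter figures in the list above can be checked with a little arithmetic. The sketch below is one reading of those numbers, not values read from the checkpoint: it assumes "32K" means 32 × 1024 codebook rows, a width of D = 4096, and that the 67M baseline counts four D × D attention projections (query, key, value and output).

```python
# Back-of-envelope check of the codebook and attention figures quoted above.
# Assumptions (not read from the checkpoint): vocab = 32 * 1024, D = 4096,
# and the 67M baseline counts four D x D projections (Q, K, V, output).
VOCAB = 32 * 1024
D = 4096
MIB = 1024 * 1024

codebook_entries = VOCAB * D
bipolar_mib = codebook_entries / 8 / MIB   # 1 bit per entry
float32_mib = codebook_entries * 4 / MIB   # 4 bytes per entry

binding_params = 3 * D                     # 3 binding vectors, head widths summing to D
attention_params = 4 * D * D               # Q, K, V and output projections
reduction = attention_params / binding_params

print(f"bipolar codebook: {bipolar_mib:.0f} MiB, float32: {float32_mib:.0f} MiB")
print(f"binding: {binding_params:,} params, attention: {attention_params:,} params")
print(f"reduction: {reduction:.0f}x")
```

With the 32,000-entry vocabulary passed to create_model in the usage snippet, the same arithmetic gives roughly 15.6 MiB for the bipolar codebook and 500 MiB for float32.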

Key numbers


Parameters	299,290,629
Pretrain data	3B tokens FineWeb-Edu
Training time	88 h on single RTX 3090
Validation loss	5.434 bits/token ≈ 1.25 bits/byte
Gap to SmolLM-360M	+0.44 bits/byte (behind)
Gap to GPT-2-medium	−0.13 bits/byte (ahead)
Codebook storage	16 MB (vs 512 MB float32)

Baselines measured on the same FineWeb-Edu sample; see paper §5.
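
The two loss units in the table are related by the average number of bytes per token. The sketch below backs that ratio out of the reported figures and derives the baseline scores implied by the gaps. It assumes each gap is HDC-Brain's bits/byte minus the baseline's; the derived values are arithmetic on the table, not independent measurements.

```python
import math

# Figures taken from the table above.
bits_per_token = 5.434
bits_per_byte = 1.25

# Implied average bytes per token on the validation sample.
bytes_per_token = bits_per_token / bits_per_byte

def nats_to_bits(loss_nats: float) -> float:
    """Convert a cross-entropy in nats (natural log) to bits."""
    return loss_nats / math.log(2)

def to_bits_per_byte(loss_bits_per_token: float, avg_bytes_per_token: float) -> float:
    """Convert a per-token loss in bits to a per-byte loss."""
    return loss_bits_per_token / avg_bytes_per_token

# Assumption: gap = HDC-Brain minus baseline, so baseline = HDC-Brain minus gap.
smollm_360m = bits_per_byte - 0.44
gpt2_medium = bits_per_byte - (-0.13)

print(f"bytes/token = {bytes_per_token:.2f}")
print(f"SmolLM-360M = {smollm_360m:.2f} bits/byte")
print(f"GPT-2-medium = {gpt2_medium:.2f} bits/byte")
```

Bits per byte is the more comparable unit across models because it does not depend on each model's tokenizer.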

Usage

This model does not follow instructions — it is the raw pretrain checkpoint. For instruction-following, use the finetune-v3 variant.

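import sys

import torch

# Make the model definition importable (from github.com/OlegPhenomenon/hdc-brain).
sys.path.insert(0, "hdc-brain-v14.1")
from hdc_brain_v14_1 import create_model

# Load on CPU; weights_only=True restricts unpickling to tensors and plain containers.
ckpt = torch.load("best_hdc_brain_v14_1.pt", map_location="cpu", weights_only=True)
model, _ = create_model(32000, ckpt["config"])  # 32000 = tokenizer vocabulary size
model.load_state_dict(ckpt["model"])
model.eval()  # switch to inference mode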

Full inference script: chat.py — pass --clean to load this checkpoint instead of the default finetune.

Tokenizer

32K English BPE (SentencePiece). The tokenizer file, bpe_en_32k.model, ships with the code repo.

Limitations

Feasibility study, not a frontier model:

Single run, no hyperparameter sweep, no seed averaging
Undertrained relative to Chinchilla (10:1 tokens:params vs 20:1)
Compute advantage of the bipolar codebook requires custom XNOR/POPCNT kernels — not implemented here; only storage advantage is realised
Codebook is random bipolar with STE; semantic initialisation (FastText → sign) is deferred to future work

Full discussion in paper §6.
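
To illustrate the kernel named in the list above: the dot product of two ±1 vectors of length D equals D minus twice the number of positions where they differ, so it can be computed on packed bits with XOR (or equivalently XNOR) and a population count. The following is a minimal pure-Python sketch of that identity, not code from the repository; the helper names pack_bipolar and bipolar_dot are hypothetical.

```python
import random

def pack_bipolar(vec):
    """Pack a sequence of +1/-1 values into an int, one bit per element (+1 -> 1)."""
    bits = 0
    for i, v in enumerate(vec):
        if v == 1:
            bits |= 1 << i
    return bits

def bipolar_dot(a_bits, b_bits, dim):
    """Dot product of two packed bipolar vectors via XOR and popcount."""
    differing = bin(a_bits ^ b_bits).count("1")  # positions where the signs differ
    # Matching positions contribute +1, differing positions contribute -1.
    return dim - 2 * differing

random.seed(0)
D = 4096
a = [random.choice((-1, 1)) for _ in range(D)]
b = [random.choice((-1, 1)) for _ in range(D)]

reference = sum(x * y for x, y in zip(a, b))            # plain integer dot product
packed = bipolar_dot(pack_bipolar(a), pack_bipolar(b), D)
assert packed == reference
print(packed)
```

A real kernel would apply the same identity to 64-bit words with a hardware popcount instruction; as noted above, that part is not implemented in this release.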

Citation

@misc{hasjanov2026hdcbrain,
  author    = {Oleg Hasjanov},
  title     = {HDC-Brain: A 300M Hyperdimensional Language Model with Bipolar Codebook},
  publisher = {Zenodo},
  year      = {2026},
  doi       = {10.5281/zenodo.19653726},
  url       = {https://doi.org/10.5281/zenodo.19653726}
}

License

Weights: CC BY-NC 4.0 — free for research, academic, and personal non-commercial use. Commercial use requires a separate license. Contact: oleg.phenomenon@gmail.com.

The code at https://github.com/OlegPhenomenon/hdc-brain is released under Apache 2.0 and is unrestricted.
