Hallamonic 1B

TinyLlama-1.1B with all 22 LlamaAttention layers replaced by HarmonicBlock, a hierarchical state-space module. This removes TinyLlama's RoPE positional limit (max_position_embeddings=2048): the unmodified model degrades catastrophically past 2K tokens (roughly 10 to 11 bpt at 8,192 tokens in the results below), while Hallamonic's loss stays essentially flat.
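To illustrate why a state-space layer has no built-in position limit, here is a toy diagonal linear recurrence in plain Python. This is a generic textbook-style sketch, not the HarmonicBlock implementation: the function name `ssm_scan` and the parameters `a`, `b`, `c` are hypothetical, and the real module (hierarchical, d_state=128) lives in the Harmonic repo. The point is only that the state has a fixed size and the update never refers to an absolute position, so the same code runs at 16 or 8,192 steps.

```python
# Toy diagonal linear state-space recurrence (illustration only,
# NOT the actual HarmonicBlock from the Harmonic repo).
from typing import List, Sequence


def ssm_scan(
    xs: Sequence[float],
    a: Sequence[float],
    b: Sequence[float],
    c: Sequence[float],
) -> List[float]:
    """Run h_t = a * h_{t-1} + b * x_t (elementwise), y_t = sum(c * h_t)."""
    state = [0.0] * len(a)  # fixed-size state, independent of sequence length
    ys = []
    for x in xs:
        state = [a_i * h_i + b_i * x for a_i, h_i, b_i in zip(a, state, b)]
        ys.append(sum(c_i * h_i for c_i, h_i in zip(c, state)))
    return ys


# Hypothetical parameters: two state channels with different decay rates.
a = [0.9, 0.5]
b = [1.0, 1.0]
c = [0.5, 0.5]

short_out = ssm_scan([1.0] * 16, a, b, c)
long_out = ssm_scan([1.0] * 8192, a, b, c)
print(len(short_out), len(long_out))
```

Nothing in the loop depends on the step index, which is the property that lets a recurrent layer of this kind run past a length it was never trained on. Whether quality holds at that length is an empirical question, which the Result section addresses.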

This is the final checkpoint (Phase 2, full fine-tune, step 5000) from the Harmonic paper: arxiv.org/abs/2606.24650. See Omibranch/Harmonic for code and the small-scale (7M-112M param) Harmonic results this architecture is based on.

Result

Evaluated on three independent held-out benchmarks, none overlapping the fineweb-edu training data:

Dataset	Seq len	Hallamonic (bpt)	TinyLlama (bpt)	Δ (TinyLlama − Hallamonic)
WikiText-103	1,024	0.43	2.84	+2.41
WikiText-103	8,192	0.48	10.56	+10.08
Lambada (clean)	1,024	0.44	4.29	+3.86
Lambada (clean)	8,192	0.45	9.89	+9.44
fineweb-edu held-out	1,024	0.36	3.37	+3.00
fineweb-edu held-out	8,192	0.36	10.82	+10.45

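TinyLlama's loss explodes past its 2K RoPE limit on every dataset. Going by the rounded table values, Hallamonic's loss at 8K is within 0.05 bpt of its 1K value across all three (0.05 on WikiText-103, 0.01 on Lambada, 0.00 on fineweb-edu held-out).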
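The table can be checked with a few lines of Python. This sketch copies the published numbers and assumes Δ means the TinyLlama score minus the Hallamonic score. Recomputing from the two-decimal values can differ from the published Δ column by 0.01, presumably because that column was computed before rounding.

```python
# Values copied from the results table: (dataset, seq_len, hallamonic, tinyllama, published_delta)
rows = [
    ("WikiText-103", 1024, 0.43, 2.84, 2.41),
    ("WikiText-103", 8192, 0.48, 10.56, 10.08),
    ("Lambada (clean)", 1024, 0.44, 4.29, 3.86),
    ("Lambada (clean)", 8192, 0.45, 9.89, 9.44),
    ("fineweb-edu held-out", 1024, 0.36, 3.37, 3.00),
    ("fineweb-edu held-out", 8192, 0.36, 10.82, 10.45),
]

# Gap between the two models, recomputed from the rounded scores.
recomputed = {(name, seq): round(tiny - hall, 2) for name, seq, hall, tiny, _ in rows}

# Largest disagreement between the recomputed and published delta columns.
max_mismatch = max(abs(recomputed[(name, seq)] - pub) for name, seq, _, _, pub in rows)

# Hallamonic's own drift from 1,024 to 8,192 tokens, per dataset.
hall = {(name, seq): h for name, seq, h, _, _ in rows}
drift = {name: round(hall[(name, 8192)] - hall[(name, 1024)], 2) for name, _ in hall}

for name, value in drift.items():
    print(f"{name}: {value:+.2f} bpt from 1K to 8K")
print(f"max delta mismatch vs published column: {max_mismatch:.2f}")
```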

Training

Base: TinyLlama/TinyLlama-1.1B-Chat-v1.0 (892M params frozen: FFN + embeddings)
New: 141M params (HarmonicBlock, d_state=128, compress ratio K=4)
Phase 1 (SSM warmup): 10K steps, seq=512, batch=4, FFN frozen, lr 3e-4
Phase 2 (full fine-tune): 5K steps, seq=1024, batch=8, grad-accum=4, lr 3e-5
Data: fineweb-edu (sample-10BT)
Cost: ~$15 on a single H100 (Modal), under 3 hours total
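For a sense of scale, the schedule above can be turned into a rough token budget. The sketch below makes several assumptions that the card does not state: "steps" are optimizer steps, "batch" counts sequences per micro-batch on the single GPU, every sequence is packed to the full length, and Phase 1 uses no gradient accumulation because none is listed. The `Phase` class is a hypothetical helper, not part of the Harmonic repo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One training phase, using the figures listed in the Training section."""
    name: str
    steps: int
    seq_len: int
    batch: int
    grad_accum: int = 1  # assumption: 1 when the card lists no accumulation

    def tokens(self) -> int:
        # Tokens seen = optimizer steps x sequences per step x tokens per sequence.
        return self.steps * self.batch * self.grad_accum * self.seq_len


phase1 = Phase("SSM warmup", steps=10_000, seq_len=512, batch=4)
phase2 = Phase("full fine-tune", steps=5_000, seq_len=1024, batch=8, grad_accum=4)

total_tokens = phase1.tokens() + phase2.tokens()
total_params_m = 892 + 141  # reused TinyLlama params + new HarmonicBlock params, in millions

for p in (phase1, phase2):
    print(f"{p.name}: {p.tokens() / 1e6:.2f}M tokens")
print(f"total: {total_tokens / 1e6:.2f}M tokens, {total_params_m}M parameters")
```

Under these assumptions the whole run sees about 184M tokens, a small fraction of the sample-10BT subset, which is consistent with the short training time reported above.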

Usage

This is a custom architecture, not a stock transformers model class. Load it with the code in the Harmonic repo:

# Shell: fetch the loader code
git clone https://github.com/Omibranch/Harmonic
cd Harmonic/hallamonic

# Python: run from inside Harmonic/hallamonic so that `model` is importable
from huggingface_hub import snapshot_download
from model import load_hallamonic

device = "cuda"

# Download the checkpoint files from the Hub and build the model
ckpt = snapshot_download("Omibranch/harmonic-checkpoints-phase2-final")
model, tokenizer = load_hallamonic(ckpt, device=device)
model.eval()

# Sample a continuation of a short prompt
input_ids = tokenizer.encode("The theory of relativity states that", return_tensors="pt").to(device)
out = model.generate(input_ids, max_new_tokens=150, do_sample=True, temperature=0.8, top_p=0.9)
print(tokenizer.decode(out[0], skip_special_tokens=True))
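To score text rather than generate it, the per-token log-probabilities of a held-out sequence can be converted to the metric used in the Result section. The sketch below assumes that "bpt" means bits per token and that the log-probabilities are natural logs, as PyTorch's log-softmax produces. How to get those log-probabilities from the Hallamonic model is not shown, since this card does not document the forward API. `bits_per_token` is a hypothetical helper name.

```python
import math
from typing import Sequence


def bits_per_token(token_logprobs: Sequence[float]) -> float:
    """Convert natural-log probabilities of the observed tokens to bits per token.

    Mean cross-entropy in nats is -mean(log p); dividing by ln(2) converts nats to bits.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nats = -sum(token_logprobs) / len(token_logprobs)
    return mean_nats / math.log(2)


# Sanity check: a model that assigns probability 1/4 to every observed token
# is exactly as uncertain as a fair 4-way choice.
uniform_four = [math.log(0.25)] * 1024
print(bits_per_token(uniform_four))
```

Because the metric is a per-token average, scores at 1,024 and 8,192 tokens are directly comparable as long as both use the same tokenizer.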

Limitations

Single training run, no multi-seed replication. Evaluated on English text only. Not benchmarked for instruction-following or chat quality — this is a base language model demonstrating an architectural property (no positional limit at long context), not a general-purpose assistant.

License

Dual-licensed: free for noncommercial use (research, study, evaluation) under the PolyForm Noncommercial License 1.0.0. Commercial use requires a separate license — see LICENSE-COMMERCIAL.md.

Citation

@software{harmonic2026,
  title     = {Harmonic: Hierarchical State Space Models},
  author    = {Omibranch and {Harmonic Labs}},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.20381713},
  url       = {https://github.com/Omibranch/Harmonic}
}

Paper: Harmonic: Hierarchical State Space Models for Efficient Long-Context Language Modeling (arXiv:2606.24650)