Hallamonic 1B

TinyLlama-1.1B with all 22 LlamaAttention layers replaced by HarmonicBlock, a hierarchical state-space module. This removes TinyLlama's RoPE positional limit (max_position_embeddings=2048): the unmodified model degrades catastrophically past 2K tokens; Hallamonic does not.
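To make the "no positional limit" property concrete, here is a minimal sketch of a generic diagonal linear state-space recurrence. It is not HarmonicBlock (whose hierarchical structure lives in the Harmonic repo); the function name `ssm_scan` and the parameters `a`, `b`, `c` are hypothetical and chosen only for illustration. The point it shows is that the update depends only on the previous state and the current input, so the same code runs on 16 or 8,192 inputs with a fixed-size state and no position index.

```python
# Toy diagonal linear state-space recurrence (illustrative only, NOT HarmonicBlock).
#   h_t = a * h_{t-1} + b * x_t
#   y_t = sum(c * h_t)
# No position index or positional embedding table appears in the update.

def ssm_scan(xs, a, b, c):
    """Run the recurrence over a scalar input sequence and return the outputs."""
    h = [0.0] * len(a)  # fixed-size state, independent of len(xs)
    ys = []
    for x in xs:
        h = [ai * hi + bi * x for ai, hi, bi in zip(a, h, b)]
        ys.append(sum(ci * hi for ci, hi in zip(c, h)))
    return ys

# Hypothetical parameters for a 2-dimensional state.
a = [0.5, 0.9]
b = [1.0, 1.0]
c = [1.0, -1.0]

short_out = ssm_scan([1.0] * 16, a, b, c)
long_out = ssm_scan([1.0] * 8192, a, b, c)
print(len(short_out), len(long_out))
```

The state here has two entries regardless of sequence length; in the model card's terms, d_state plays that role.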

This is the final checkpoint (Phase 2, full fine-tune, step 5000) from the Harmonic paper: arxiv.org/abs/2606.24650. See Omibranch/Harmonic for code and the small-scale (7M-112M param) Harmonic results this architecture is based on.

Result

Evaluated on three independent held-out benchmarks, none overlapping the fineweb-edu training data:

Dataset               Seq len  Hallamonic (bpt)  TinyLlama (bpt)  Δ (TinyLlama − Hallamonic)
WikiText-103            1,024              0.43             2.84                       +2.41
WikiText-103            8,192              0.48            10.56                      +10.08
Lambada (clean)         1,024              0.44             4.29                       +3.86
Lambada (clean)         8,192              0.45             9.89                       +9.44
fineweb-edu held-out    1,024              0.36             3.37                       +3.00
fineweb-edu held-out    8,192              0.36            10.82                      +10.45

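At 8,192 tokens, well past its 2K RoPE limit, TinyLlama's loss rises to roughly 10-11 bpt on every dataset. Hallamonic's loss at 8K stays within 0.05 bpt of its 1K value on all three (differences of 0.05, 0.01, and 0.00 in the rounded table values).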
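The comparison can be recomputed directly from the table. The sketch below copies the rounded table values and derives the TinyLlama-minus-Hallamonic gap and Hallamonic's 1K-to-8K drift; the helper names `gap` and `drift` are illustrative. Because the inputs are already rounded to two decimals, a recomputed gap can differ from the table's Δ column by 0.01, presumably because the table was built from unrounded values.

```python
# (dataset, seq_len) -> (Hallamonic bpt, TinyLlama bpt), copied from the results table.
results = {
    ("WikiText-103", 1024): (0.43, 2.84),
    ("WikiText-103", 8192): (0.48, 10.56),
    ("Lambada (clean)", 1024): (0.44, 4.29),
    ("Lambada (clean)", 8192): (0.45, 9.89),
    ("fineweb-edu held-out", 1024): (0.36, 3.37),
    ("fineweb-edu held-out", 8192): (0.36, 10.82),
}

def gap(dataset, seq_len):
    """TinyLlama bpt minus Hallamonic bpt for one table row."""
    hallamonic, tinyllama = results[(dataset, seq_len)]
    return round(tinyllama - hallamonic, 2)

def drift(dataset):
    """Hallamonic bpt at 8,192 tokens minus Hallamonic bpt at 1,024 tokens."""
    return round(results[(dataset, 8192)][0] - results[(dataset, 1024)][0], 2)

datasets = sorted({name for name, _ in results})
for name in datasets:
    print(f"{name}: gap@1K={gap(name, 1024)}, gap@8K={gap(name, 8192)}, drift={drift(name)}")

max_drift = max(drift(name) for name in datasets)
```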

Training

  • Base: TinyLlama/TinyLlama-1.1B-Chat-v1.0 (892M params frozen: FFN + embeddings)
  • New: 141M params (HarmonicBlock, d_state=128, compress ratio K=4)
  • Phase 1 (SSM warmup): 10K steps, seq=512, batch=4, FFN frozen, lr 3e-4
  • Phase 2 (full fine-tune): 5K steps, seq=1024, batch=8, grad-accum=4, lr 3e-5
  • Data: fineweb-edu (sample-10BT)
  • Cost: ~$15 on a single H100 (Modal), under 3 hours total
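As a rough worked example of the training budget, the sketch below multiplies out the figures in the list above. It rests on assumptions the card does not state: that "steps" are optimizer steps, that "batch" counts sequences per micro-batch, that every sequence is packed to the full length, and that Phase 1 uses no gradient accumulation. The `Phase` class is a hypothetical helper, not part of the Harmonic code.

```python
from dataclasses import dataclass

@dataclass
class Phase:
    name: str
    steps: int
    seq_len: int
    batch: int
    grad_accum: int = 1  # assumption: 1 when the card gives no value

    def tokens_per_step(self):
        # Tokens consumed per optimizer step under the stated assumptions.
        return self.seq_len * self.batch * self.grad_accum

    def total_tokens(self):
        return self.steps * self.tokens_per_step()

phase1 = Phase("SSM warmup", steps=10_000, seq_len=512, batch=4)
phase2 = Phase("full fine-tune", steps=5_000, seq_len=1024, batch=8, grad_accum=4)

for phase in (phase1, phase2):
    print(f"{phase.name}: {phase.tokens_per_step():,} tokens/step, "
          f"{phase.total_tokens():,} tokens total")

total_tokens = phase1.total_tokens() + phase2.total_tokens()
total_params_m = 892 + 141  # retained TinyLlama params + new HarmonicBlock params, in millions
print(f"total: {total_tokens:,} tokens, {total_params_m}M parameters")
```

Under these assumptions the two phases together see about 184M tokens, a small fraction of the sample-10BT subset.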

Usage

This is a custom architecture, not a stock transformers model class. Load it with the code in the Harmonic repo:

git clone https://github.com/Omibranch/Harmonic
cd Harmonic/hallamonic

Then, in Python, from that directory:

from huggingface_hub import snapshot_download
from model import load_hallamonic  # imported from the repo's hallamonic/ directory

# Download the checkpoint files from the Hugging Face Hub.
ckpt = snapshot_download("Omibranch/harmonic-checkpoints-phase2-final")
model, tokenizer = load_hallamonic(ckpt, device="cuda")
model.eval()

# Tokenize a prompt and sample a continuation.
input_ids = tokenizer.encode("The theory of relativity states that", return_tensors="pt").to("cuda")
out = model.generate(input_ids, max_new_tokens=150, do_sample=True, temperature=0.8, top_p=0.9)
print(tokenizer.decode(out[0], skip_special_tokens=True))

Limitations

Single training run, no multi-seed replication. Evaluated on English text only. Not benchmarked for instruction-following or chat quality: this is a base language model demonstrating an architectural property (no positional limit at long context), not a general-purpose assistant.

License

Dual-licensed: free for noncommercial use (research, study, evaluation) under the PolyForm Noncommercial License 1.0.0. Commercial use requires a separate license; see LICENSE-COMMERCIAL.md.

Citation

@software{harmonic2026,
  title     = {Harmonic: Hierarchical State Space Models},
  author    = {Omibranch and {Harmonic Labs}},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.20381713},
  url       = {https://github.com/Omibranch/Harmonic}
}