SynapNet-Edge — Checkpoints

Hybrid SSM + sparse-attention + episodic-memory architecture with Component-Aware Joint Quantization (CAJQ) and Budget-Aware Episodic Eviction (BAEE), designed for long-context inference on consumer hardware.

📦 Code: https://github.com/vineetha00/SynapNet-Edge
🧪 Base architecture: https://github.com/vineetha00/SynapNet_Exp · 🤗 https://huggingface.co/Vineetha00/synapnet
📄 Paper: arXiv preprint (link coming soon)


Checkpoints in this repo

| File | Params | Size | Stage | Eval NIAH-single (ctx=1024) |
|---|---|---|---|---|
| synapnet_edge_8m7.pt | 8.7M | 33 MB | Full 2-stage curriculum pretrain (ctx 512 → 1024) | 0.618 ± 0.107 (FP16, 3 seeds) |
| synapnet_edge_130m.pt | 120.9M | 461 MB | 1,000-step pretrain, under-converged at this compute budget | Not converged; released for deployment profiling only |
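
The listed file sizes are consistent with 4 bytes per parameter reported in MiB. That storage format is an inference from the numbers above, not something this card states, and `checkpoint_size_mib` is a hypothetical helper used only for the check:

```python
def checkpoint_size_mib(n_params, bytes_per_param=4):
    """Approximate size of the raw weights in MiB (2**20 bytes).

    Assumption: every parameter is stored at `bytes_per_param` bytes and
    optimizer state or other metadata in the file is negligible.
    """
    return n_params * bytes_per_param / 2**20


for name, n_params in [
    ("synapnet_edge_8m7.pt", 8.7e6),
    ("synapnet_edge_130m.pt", 120.9e6),
]:
    print(f"{name}: ~{checkpoint_size_mib(n_params):.0f} MiB")
```

If the estimate and the downloaded file size disagree noticeably, the checkpoint probably carries extra state beyond the model weights.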

Architecture (8.7M reference)

  • dim=192, depth=6, heads=6, episodic_slots=32
  • vocab_size=4096, num_classes=64, max_len=8192
  • k_frac=0.25 (sparse-attention top-K), episodic_write_frac=0.05
  • ScaleBridge enabled (FP16 interface between mixed-precision pathways)
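
As a rough illustration of what a top-K fraction such as k_frac=0.25 means, the sketch below attends over only the highest-scoring quarter of the keys for a single query. It is a generic top-K attention written in plain Python, not the repository's implementation; the function name and the rounding rule (round up, keep at least one key) are assumptions.

```python
import math


def topk_sparse_attention(query, keys, values, k_frac=0.25):
    """Single-query attention restricted to the top-k scoring keys."""
    n = len(keys)
    k = max(1, math.ceil(k_frac * n))  # assumption: round up, never zero keys
    scale = 1.0 / math.sqrt(len(query))

    # Scaled dot-product score of the query against every key.
    scores = [scale * sum(q * x for q, x in zip(query, key)) for key in keys]

    # Indices of the k largest scores; all other keys are ignored entirely.
    keep = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]

    # Numerically stable softmax over the kept scores only.
    top = max(scores[i] for i in keep)
    weights = {i: math.exp(scores[i] - top) for i in keep}
    total = sum(weights.values())

    out_dim = len(values[0])
    output = [
        sum(weights[i] / total * values[i][d] for i in keep)
        for d in range(out_dim)
    ]
    return output, sorted(keep)


# With 4 keys and k_frac=0.25, exactly one key is kept per query.
out, kept = topk_sparse_attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.0]],
    values=[[1.0], [2.0], [3.0], [4.0]],
)
```

At ctx=1024 the same rule would keep 256 keys per query, which is where the compute saving over dense attention would come from under this reading.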

Architecture (130M variant)

  • dim=640, depth=10, heads=10, episodic_slots=32
  • Same vocab, classes, max_len as 8.7M
  • Under-trained: 1,000 steps × batch 2 was insufficient for convergence at this scale. Use for latency / storage / memory profiling, not accuracy claims.

Loading

import torch
from huggingface_hub import hf_hub_download
from synapnet_edge.models.synapnet_edge_model import SynapNetEdge, SynapNetEdgeConfig

ckpt_path = hf_hub_download(
    repo_id="Vineetha00/synapnet-edge",
    filename="synapnet_edge_8m7.pt",
)
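# Load on CPU first; the checkpoint stores the config and the weights under separate keys.
ckpt = torch.load(ckpt_path, map_location="cpu")

cfg = SynapNetEdgeConfig(**ckpt["model_cfg"])
model = SynapNetEdge(cfg)
model.load_state_dict(ckpt["model_state"])
model.eval()  # inference mode: disables dropout and other training-only behavior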

To install the architecture code:

pip install git+https://github.com/vineetha00/SynapNet-Edge.git

Training data

Synthetic long-context curriculum (no external downloads):

  • NIAH-single (needle-in-a-haystack)
  • NIAH-multi-key (4 keys, retrieve value by queried key)
  • Variable tracking (3-hop chain)
  • Frequency aggregation (most-common class over 16 marked items)

Two-stage curriculum: ctx=512 (4 epochs equivalent) → ctx=1024 (2 epochs equivalent).
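
To make the task format concrete, here is a minimal sketch of how a NIAH-single example could be generated with the vocabulary and class counts listed above. The token layout (value tokens in the lowest ids, one reserved marker id, filler above that) and the function name are assumptions for illustration; the repository's generator may differ.

```python
import random


def make_niah_single(ctx_len=1024, vocab_size=4096, num_classes=64, seed=0):
    """Build one needle-in-a-haystack sequence and its class label.

    Hypothetical layout: ids [0, num_classes) are value tokens,
    id num_classes is the needle marker, and higher ids are filler.
    """
    rng = random.Random(seed)
    marker = num_classes
    filler_start = num_classes + 1

    # Haystack: filler tokens only, so marker and value ids never appear by accident.
    tokens = [rng.randrange(filler_start, vocab_size) for _ in range(ctx_len)]

    # Needle: the marker followed by the value token the model must retrieve.
    label = rng.randrange(num_classes)
    pos = rng.randrange(ctx_len - 1)
    tokens[pos] = marker
    tokens[pos + 1] = label
    return tokens, label


tokens, label = make_niah_single(ctx_len=1024)
```

The two curriculum stages would then correspond to calling the generator with ctx_len=512 and ctx_len=1024.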

Final post-pretrain per-task accuracy (8.7M, ctx=1024):

  • NIAH-single: 57%
  • NIAH-multi-key: 13%
  • Variable tracking: 74%
  • Frequency aggregation: 47%

(For comparison, random chance on the 64-class task is 1/64 ≈ 1.6%.)


Quantization (apply after loading FP16)

The architecture supports Component-Aware Joint Quantization (CAJQ) at inference time:

from synapnet_edge.quantization.cajq import apply_cajq, CAJQConfig
from synapnet_edge.training.calibration import build_calib_loader

calib_loader = build_calib_loader(n_samples=128, seq_len=1024)
model = apply_cajq(
    model,
    CAJQConfig(device="mps"),  # "mps" is PyTorch's Apple-silicon GPU backend; change to match your hardware
    calib_loader=calib_loader,
    mode="ptq",   # or "qat" for QAT fine-tune
)
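
For readers new to post-training quantization, the sketch below shows the general role a calibration set plays: it fixes a scale from observed value ranges, and values are then rounded onto an integer grid. This is plain symmetric uniform quantization, not CAJQ, and all function names are hypothetical.

```python
def calibrate_scale(calib_batches, bits=8):
    """Pick a symmetric scale so the largest calibration value maps to qmax."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(x) for batch in calib_batches for x in batch)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values, scale, bits=8):
    """Round to the nearest integer code and clamp to the representable range."""
    qmax = 2 ** (bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


calib_batches = [[0.5, -1.27], [0.3, 1.0]]
scale = calibrate_scale(calib_batches)

# 5.0 lies outside the calibrated range, so it saturates at the largest code.
codes = quantize([1.27, -1.27, 0.0, 5.0], scale)
restored = dequantize(codes, scale)
```

Values inside the calibrated range come back within half a quantization step; values outside it are clipped, which is why the calibration data should resemble the inference inputs.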

Across 3 seeds with 200 QAT steps each, CAJQ-QAT matches or exceeds FP16 in mean NIAH-single accuracy at every evaluated context length:

| Variant | Eff. bits | ctx 1024 | ctx 2048 | ctx 4096 |
|---|---|---|---|---|
| FP16 | 16.0 | 0.618 ± 0.107 | 0.507 ± 0.115 | 0.438 ± 0.036 |
| CAJQ-QAT (ours) | 13.8 | 0.674 ± 0.012 | 0.590 ± 0.043 | 0.521 ± 0.055 |

Compression: 4.4× on targeted SSM + attention parameters (0.60 MB vs 2.66 MB FP16-equivalent); 1.13× whole-model storage reduction at this configuration.
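
The two compression figures can be reproduced with simple arithmetic. The targeted ratio follows directly from the sizes quoted above; the whole-model ratio is recovered here under the assumption that the baseline is all 8.7M parameters at 2 bytes each, with MB meaning 10^6 bytes. That baseline is one reading that matches the reported number, not a statement from the repository.

```python
targeted_fp16_mb = 2.66  # from the card: SSM + attention weights at FP16
targeted_cajq_mb = 0.60  # from the card: the same weights after CAJQ

targeted_ratio = targeted_fp16_mb / targeted_cajq_mb

# Assumption: whole-model baseline = 8.7M parameters x 2 bytes (FP16).
whole_fp16_mb = 8.7e6 * 2 / 1e6
whole_cajq_mb = whole_fp16_mb - (targeted_fp16_mb - targeted_cajq_mb)
whole_ratio = whole_fp16_mb / whole_cajq_mb

print(f"targeted: {targeted_ratio:.1f}x, whole model: {whole_ratio:.2f}x")
# targeted: 4.4x, whole model: 1.13x
```

Under this reading the targeted weights are only about 15% of the FP16 model, which is why a 4.4× reduction on them translates into a modest whole-model saving.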


Streaming inference with BAEE

from synapnet_edge import BAEEMemoryManager

manager = BAEEMemoryManager(dim=192, n_layers=6, budget_mb=256.0)
logits, debug = model.forward_streaming(
    input_ids, chunk_size=512, baee_manager=manager,
)

Under 90% forced eviction with the target needle in the early portion of an 8K stream, BAEE retains the target 71% ± 8% of the time, versus 0% for FIFO and LRU. Head-to-head comparisons against H2O, Scissorhands, SnapKV, PyramidKV, and Locret-style policies are in the GitHub repo.
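
The 0% figure for FIFO follows from how recency-based policies work: an early entry is always among the first to go. The toy comparison below contrasts FIFO with a generic importance-scored policy. It is not BAEE, whose scoring rule is not described in this card; the `salience` field and both function names are hypothetical.

```python
from collections import deque


def fifo_evict(entries, keep):
    """Keep only the most recent `keep` entries; the oldest are dropped first."""
    return list(deque(entries, maxlen=keep))


def score_evict(entries, keep, score):
    """Keep the `keep` highest-scoring entries, preserving stream order."""
    ranked = sorted(
        range(len(entries)), key=lambda i: score(entries[i]), reverse=True
    )[:keep]
    return [entries[i] for i in sorted(ranked)]


# 100 chunks with the needle early in the stream (chunk 5).
stream = [{"chunk": i, "salience": 0.1} for i in range(100)]
stream[5]["salience"] = 0.9

keep = 10  # 90% forced eviction
fifo_kept = fifo_evict(stream, keep)
scored_kept = score_evict(stream, keep, score=lambda e: e["salience"])

print("FIFO keeps needle:  ", any(e["chunk"] == 5 for e in fifo_kept))
print("Scored keeps needle:", any(e["chunk"] == 5 for e in scored_kept))
```

In this toy setup the scored policy always keeps the needle because its salience is given; a real policy has to estimate importance online, which is consistent with the reported retention being well below 100%.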


License

MIT — see LICENSE.

Citation

@article{synapnet_edge_2026,
  title={SynapNet-Edge: Component-Aware Quantization and Budget-Aware Eviction for Hybrid Long-Context Models on Consumer Hardware},
  author={Vallish Kumar, Vineetha},
  year={2026},
}