Dual-System V2 Sidecar — Gemma 4 E2B-IT

A trained geometric sidecar checkpoint for the Dual-System Architecture.

Architecture

The sidecar (162M params) adds structured reasoning on top of the frozen Gemma 4 E2B-IT backbone:

| Component | Description | Params |
|---|---|---|
| GeometricProcessor | 4-layer causal transformer with KV caching, producing additive geo_logits | ~148M |
| LatentPlanner | VAE with LaDiR-style diffusion ELBO for the planning latent z₀ | ~14M |
| EBM Critic | Energy-based model scoring geometric sequence quality | ~0.5M |
| Alpha Gate | Learned sigmoid gate (α=0.537) blending sidecar corrections | 1 |

Forward pass: final_logits = base_logits + α · geo_logits

Training Details

  • Backbone: google/gemma-4-E2B-it (frozen, ~2.6B params)
  • Sidecar: 162,291,818 trainable parameters
  • Training data: HuggingFaceH4/ultrachat_200k (streaming)
  • Hardware: NVIDIA H100 80GB HBM3
  • Checkpoint: Epoch 2, Step 6250
  • SVD initialization: GeometricProcessor input projection initialized with top-512 PCA components of backbone embeddings
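
The SVD initialization in the last bullet can be sketched as follows. This is an illustration only, assuming the PCA components are obtained from a thin SVD of the mean-centered embedding matrix; the function name `pca_projection` is hypothetical, and a small random matrix stands in for the backbone embeddings (the checkpoint itself uses the top 512 components).

```python
import numpy as np


def pca_projection(embeddings: np.ndarray, k: int) -> np.ndarray:
    """Return a (hidden_dim, k) matrix whose columns are the top-k principal directions."""
    # PCA operates on mean-centered data.
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Thin SVD: rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T


# Toy stand-in for a (vocab_size, hidden_dim) embedding table.
rng = np.random.default_rng(0)
toy_embeddings = rng.normal(size=(1000, 64))

proj = pca_projection(toy_embeddings, k=8)  # k=512 in the checkpoint described above
reduced = toy_embeddings @ proj             # embeddings expressed in the top-k PCA basis
```

Whether `proj` or its transpose is copied into the input projection depends on the layer's weight layout, which the source does not specify.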

Usage

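# Install dependencies (the leading "!" is notebook syntax; drop it in a regular shell)
!pip install transformers accelerate huggingface_hub torch

# Clone the repo, which provides the dual_system_v2 module imported below;
# make sure the cloned code is on your Python path before importing
!git clone https://github.com/Bender1011001/dual-system-architecture.git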

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from dual_system_v2 import DualSystemV2, SidecarConfig
from huggingface_hub import hf_hub_download

# Download sidecar
sidecar_path = hf_hub_download(
    repo_id="Bender1011001/gemma4-dualsystem-sidecar",
    filename="sidecar_epoch2.pt"
)

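# Load the checkpoint and rebuild the config stored in it, so the sidecar is
# constructed with the same hyperparameters as the saved weights.
# Note: weights_only=False unpickles arbitrary Python objects; only load checkpoints you trust.
ckpt = torch.load(sidecar_path, map_location="cuda", weights_only=False)
config = SidecarConfig(**ckpt["config"])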

# Load backbone
backbone = AutoModelForCausalLM.from_pretrained(
    ckpt["backbone"], torch_dtype=torch.bfloat16, device_map="cuda"
)
for p in backbone.parameters():
    p.requires_grad = False

# Build and load sidecar
model = DualSystemV2(backbone=backbone, config=config).cuda().eval()
model.geo_processor.load_state_dict(ckpt["geo_state"])
model.ebm_critic.load_state_dict(ckpt["ebm_state"])
model.latent_planner.load_state_dict(ckpt["planner_state"])

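# Run a single forward pass on a prompt (one step, not a full generation loop)
tokenizer = AutoTokenizer.from_pretrained(ckpt["backbone"])
input_ids = tokenizer("Hello", return_tensors="pt").input_ids.cuda()
with torch.no_grad():
    result = model(input_ids=input_ids)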
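
The call above performs one forward pass. To produce text, a decoding loop has to call the model repeatedly and append one token at a time. The sketch below shows greedy decoding in plain Python under a stated assumption: `next_logits_fn` is a hypothetical callable that maps the token ids so far to the final logits for the next token. How to build it from the return value of `DualSystemV2` is not specified in this card, so a toy function stands in for the model here.

```python
from typing import Callable, List, Optional, Sequence


def greedy_decode(
    next_logits_fn: Callable[[List[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    max_new_tokens: int,
    eos_id: Optional[int] = None,
) -> List[int]:
    """Append the argmax token at each step until max_new_tokens or eos_id is reached."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_logits_fn(ids)
        # Index of the largest logit is the greedy choice.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids


def toy_logits(ids: List[int]) -> List[float]:
    """Toy stand-in for the model: always prefers (last token + 1) mod 5."""
    vocab_size = 5
    target = (ids[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


out = greedy_decode(toy_logits, prompt_ids=[0], max_new_tokens=10, eos_id=4)
```

With a real model, `next_logits_fn` would wrap the forward pass and return the last position's blended logits; reusing the GeometricProcessor's KV cache across steps would avoid recomputing the whole prefix each time.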

Citation

@misc{dual-system-2026,
  title={Dual-System Architecture: Geometric Sidecar Modules for Language Model Enhancement},
  author={Bender1011001},
  year={2026},
  url={https://github.com/Bender1011001/dual-system-architecture}
}