Dual-System V2 Sidecar — Gemma 4 E2B-IT

A trained geometric sidecar checkpoint for the Dual-System Architecture.

Architecture

The sidecar (162M params) adds structured reasoning on top of the frozen Gemma 4 E2B-IT backbone:

Component	Description	Params
GeometricProcessor	4-layer causal transformer with KV caching producing additive `geo_logits`	~148M
LatentPlanner	VAE with LaDiR-style diffusion ELBO for planning latent z₀	~14M
EBM Critic	Energy-based model scoring geometric sequence quality	~0.5M
Alpha Gate	Learned sigmoid gate (α=0.537) blending sidecar corrections	1

Forward pass: final_logits = base_logits + α · geo_logits
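
The blending rule above can be sketched in a few lines of framework-free Python. This is a minimal illustration, not the repository's implementation: the helper names (`sigmoid`, `blend_logits`), the toy logit values, and the assumption that α is the sigmoid of a single raw learned scalar are all hypothetical.

```python
import math

def sigmoid(x: float) -> float:
    """Standard logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def blend_logits(base_logits, geo_logits, alpha):
    """Compute final_logits = base_logits + alpha * geo_logits element-wise."""
    if len(base_logits) != len(geo_logits):
        raise ValueError("base_logits and geo_logits must cover the same vocabulary")
    return [b + alpha * g for b, g in zip(base_logits, geo_logits)]

# Assumption: the gate stores one raw scalar and applies a sigmoid to it.
# This raw value is chosen so that the gate output matches the reported 0.537.
raw_gate = math.log(0.537 / (1.0 - 0.537))
alpha = sigmoid(raw_gate)

# Toy three-token vocabulary (values are illustrative only).
base_logits = [2.0, 1.0, 0.0]
geo_logits = [-1.0, 1.0, 0.0]
final_logits = blend_logits(base_logits, geo_logits, alpha)
```

With `alpha = 0.0` the function returns the backbone logits unchanged, so under this reading the gate controls how far the sidecar can move the output away from the frozen model.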

Training Details

Backbone: google/gemma-4-E2B-it (frozen, ~2.6B params)
Sidecar: 162,291,818 trainable parameters
Training data: HuggingFaceH4/ultrachat_200k (streaming)
Hardware: NVIDIA H100 80GB HBM3
Checkpoint: Epoch 2, Step 6250
SVD initialization: GeometricProcessor input projection initialized with top-512 PCA components of backbone embeddings
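
The SVD initialization listed above can be illustrated as follows. This is a sketch under stated assumptions rather than the training code: `pca_init_projection` is a hypothetical helper, the toy matrix stands in for the backbone's embedding table, and the small sizes (64-dimensional embeddings, 8 components) replace the real hidden size and the 512 components to keep the example fast.

```python
import torch

def pca_init_projection(embeddings: torch.Tensor, k: int) -> torch.Tensor:
    """Return a (k, hidden_dim) matrix whose rows are the top-k principal directions."""
    emb = embeddings.float()
    # Center the rows so that the SVD of the matrix yields PCA directions.
    centered = emb - emb.mean(dim=0, keepdim=True)
    # Rows of vh are orthonormal right singular vectors, ordered by singular value.
    _, _, vh = torch.linalg.svd(centered, full_matrices=False)
    return vh[:k]

torch.manual_seed(0)
toy_embeddings = torch.randn(1000, 64)  # stand-in for a (vocab_size, hidden_dim) table
proj_weight = pca_init_projection(toy_embeddings, k=8)

# nn.Linear stores its weight as (out_features, in_features), matching (k, hidden_dim).
input_proj = torch.nn.Linear(64, 8, bias=False)
with torch.no_grad():
    input_proj.weight.copy_(proj_weight)
```

Initializing this way means the projection starts out as an orthonormal map onto the directions of highest variance in the embedding space, instead of a random matrix.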

Usage

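# Install dependencies (notebook syntax; drop the leading "!" in a regular shell)
!pip install transformers accelerate huggingface_hub torch

# Clone the repo that provides the dual_system_v2 module
!git clone https://github.com/Bender1011001/dual-system-architecture.git

import sys
# Assumption: dual_system_v2 sits at the top level of the cloned repo; adjust the path if not.
sys.path.insert(0, "dual-system-architecture")

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from dual_system_v2 import DualSystemV2, SidecarConfig
from huggingface_hub import hf_hub_download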

# Download sidecar
sidecar_path = hf_hub_download(
    repo_id="Bender1011001/gemma4-dualsystem-sidecar",
    filename="sidecar_epoch2.pt"
)

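# Load the checkpoint and rebuild the config it was trained with, so the
# module shapes match the saved weights.
# Note: weights_only=False unpickles arbitrary Python objects, so only load
# checkpoints from a source you trust.
ckpt = torch.load(sidecar_path, map_location="cuda", weights_only=False)
config = SidecarConfig(**ckpt["config"])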

# Load backbone
backbone = AutoModelForCausalLM.from_pretrained(
    ckpt["backbone"], torch_dtype=torch.bfloat16, device_map="cuda"
)
for p in backbone.parameters():
    p.requires_grad = False

# Build and load sidecar
model = DualSystemV2(backbone=backbone, config=config).cuda().eval()
model.geo_processor.load_state_dict(ckpt["geo_state"])
model.ebm_critic.load_state_dict(ckpt["ebm_state"])
model.latent_planner.load_state_dict(ckpt["planner_state"])

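# Run a single forward pass (inference only, so gradients are disabled)
tokenizer = AutoTokenizer.from_pretrained(ckpt["backbone"])
input_ids = tokenizer("Hello", return_tensors="pt").input_ids.cuda()
with torch.no_grad():
    result = model(input_ids=input_ids)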
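
The call above is a single forward pass, not a full generation loop. As a hedged sketch of how next-token logits could be turned into text, here is a generic greedy decoding loop in plain Python. The function `greedy_decode` and the `next_token_logits` callable are hypothetical. How to extract the last-position logits from `result` depends on the return type of `DualSystemV2`, which this card does not document.

```python
from typing import Callable, List, Optional

def greedy_decode(
    next_token_logits: Callable[[List[int]], List[float]],
    prompt_ids: List[int],
    max_new_tokens: int = 16,
    eos_id: Optional[int] = None,
) -> List[int]:
    """Repeatedly append the highest-scoring next token to the sequence."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)  # one score per vocabulary entry
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break  # stop once the end-of-sequence token is produced
    return ids

# Toy stand-in for the model: always prefers (last token + 1) mod 5.
def toy_logits(ids: List[int]) -> List[float]:
    target = (ids[-1] + 1) % 5
    return [1.0 if i == target else 0.0 for i in range(5)]

generated = greedy_decode(toy_logits, [0], max_new_tokens=6, eos_id=4)
```

In practice `next_token_logits` would wrap the model call and return the blended logits for the final position. This sketch recomputes the full sequence at every step and does not use the KV cache that GeometricProcessor provides.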

Citation

@misc{dual-system-2026,
  title={Dual-System Architecture: Geometric Sidecar Modules for Language Model Enhancement},
  author={Bender1011001},
  year={2026},
  url={https://github.com/Bender1011001/dual-system-architecture}
}
