CreditScope Circuit Tracing Models

Sparse Autoencoders (SAEs) and MoE Transcoders trained on activations from Qwen3.5-35B-A3B-FP8, for mechanistic interpretability and circuit tracing.

Models

SAEs (Sparse Autoencoders)

  • Architecture: JumpReLU, d_model=2048 → 16384 features (8x expansion); see the sketch after this list
  • Layers: 0, 5, 10, 15, 20, 25, 30, 35, 39
  • Training: 500 diverse prompts, ~5000 tokens, 2000-15000 steps per model
  • Files: sae_l{N}.pt
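
For reference, here is a minimal PyTorch sketch of the JumpReLU SAE architecture described above. Class and parameter names are illustrative, not the checkpoints' actual schema; real JumpReLU training also needs a straight-through estimator for the threshold gradient, which is omitted here.

import torch
import torch.nn as nn

class JumpReLUSAE(nn.Module):
    """Minimal JumpReLU SAE: a feature fires only above its learned threshold."""
    def __init__(self, d_model=2048, n_features=16384):
        super().__init__()
        self.W_enc = nn.Parameter(torch.randn(d_model, n_features) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(n_features))
        self.W_dec = nn.Parameter(torch.randn(n_features, d_model) * 0.01)
        self.b_dec = nn.Parameter(torch.zeros(d_model))
        self.log_threshold = nn.Parameter(torch.zeros(n_features))  # per-feature jump threshold

    def encode(self, x):
        pre = (x - self.b_dec) @ self.W_enc + self.b_enc
        # JumpReLU: keep the pre-activation only where it exceeds the threshold
        return pre * (pre > self.log_threshold.exp())

    def decode(self, f):
        return f @ self.W_dec + self.b_dec

    def forward(self, x):
        f = self.encode(x)
        return self.decode(f), f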

Transcoders (MoE Transcoders)

  • Architecture: ReLU encoder/decoder, d_model=2048 → 16384 features
  • Layers: 0, 5, 10, 15, 20, 25, 30, 35, 39
  • Training: Maps the pre-MoE residual stream to the post-MoE output, i.e. learns the MoE block's residual contribution (see the sketch after this list)
  • Files: tc_l{N}.pt
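
A matching sketch of the transcoder: unlike an SAE, which reconstructs its own input, the transcoder reads the pre-MoE residual stream and is trained to reproduce the MoE block's output. Names are again illustrative:

import torch
import torch.nn as nn

class MoETranscoderSketch(nn.Module):
    """Illustrative ReLU transcoder approximating the MoE block's output."""
    def __init__(self, d_model=2048, n_features=16384):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model)

    def forward(self, resid_pre_moe):
        f = torch.relu(self.encoder(resid_pre_moe))   # sparse features over the pre-MoE residual
        # Target is the MoE output, not the input -- what separates a transcoder from an SAE
        return self.decoder(f), f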

Usage

from circuit_tracer.saes.sparse_autoencoder import SparseAutoencoder
from circuit_tracer.transcoders.moe_transcoder import MoETranscoder

# Load the layer-0 SAE checkpoint
sae = SparseAutoencoder.load("checkpoints/sae_l0.pt")

# Load the layer-0 transcoder checkpoint
tc = MoETranscoder.load("checkpoints/tc_l0.pt")
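
Hypothetical downstream usage. This assumes the loaded modules expose encode()/decode() methods like the sketches above, which may not match circuit_tracer's real interface; check the package source.

import torch

resid = torch.randn(1, 16, 2048)     # stand-in residual-stream activations (batch, seq, d_model)
features = sae.encode(resid)         # assumed API: sparse feature activations, (1, 16, 16384)
recon = sae.decode(features)         # assumed API: reconstructed residual stream
print(f"reconstruction MSE: {(recon - resid).pow(2).mean().item():.4f}")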

Training Details

  • Base model: Qwen/Qwen3.5-35B-A3B-FP8
  • Activation collection: Direct model forward hooks on 500 diverse prompts
  • SAE optimizer: Adam, lr=3e-4, cosine annealing
  • TC optimizer: Adam, lr=1e-3, cosine annealing
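
A hypothetical training loop matching these hyperparameters, using the JumpReLUSAE sketch above with random noise standing in for real activations (real training adds a sparsity penalty to the loss):

import torch

num_steps = 2000
sae = JumpReLUSAE()                  # from the sketch above
optimizer = torch.optim.Adam(sae.parameters(), lr=3e-4)    # lr=1e-3 for transcoders
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_steps)

for step in range(num_steps):
    acts = torch.randn(64, 2048)     # stand-in batch of residual-stream activations
    recon, features = sae(acts)
    loss = (recon - acts).pow(2).mean()   # reconstruction MSE; sparsity term omitted
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()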