# CreditScope Circuit Tracing Models
Sparse Autoencoders (SAEs) and MoE Transcoders trained on Qwen3.5-35B-A3B-FP8 for mechanistic interpretability / circuit tracing.
## Models

### SAEs (Sparse Autoencoders)
- Architecture: JumpReLU, d_model=2048 → 16384 features (8x expansion); see the sketch below
- Layers: 0, 5, 10, 15, 20, 25, 30, 35, 39
- Training: 500 diverse prompts, ~5000 tokens, 2000-15000 steps per model
- Files: `sae_l{N}.pt`
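
For reference, here is a minimal sketch of a JumpReLU SAE forward pass at these dimensions. The class name, initialisation, and threshold parameterisation are illustrative assumptions, not the repository's actual `SparseAutoencoder` implementation.

```python
import torch
import torch.nn as nn

class JumpReLUSAESketch(nn.Module):
    """Illustrative JumpReLU SAE: d_model=2048 -> 16384 features (8x expansion)."""

    def __init__(self, d_model: int = 2048, n_features: int = 16384):
        super().__init__()
        self.W_enc = nn.Parameter(torch.randn(d_model, n_features) * 0.01)
        self.W_dec = nn.Parameter(torch.randn(n_features, d_model) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(n_features))
        self.b_dec = nn.Parameter(torch.zeros(d_model))
        # Per-feature JumpReLU threshold, stored in log space so it stays positive.
        self.log_threshold = nn.Parameter(torch.full((n_features,), -2.0))

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        pre = (x - self.b_dec) @ self.W_enc + self.b_enc
        # JumpReLU gate: keep the pre-activation only where it exceeds the learned threshold.
        return pre * (pre > self.log_threshold.exp())

    def decode(self, f: torch.Tensor) -> torch.Tensor:
        return f @ self.W_dec + self.b_dec

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decode(self.encode(x))
```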
### Transcoders (MoE Transcoders)
- Architecture: ReLU encoder/decoder, d_model=2048 → 16384 features
- Layers: 0, 5, 10, 15, 20, 25, 30, 35, 39
- Training: Maps the pre-MoE residual stream to the post-MoE output (i.e. learns the MoE block's contribution to the residual stream); see the sketch below
- Files: `tc_l{N}.pt`
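
A corresponding sketch of the transcoder's input/output contract: it reads the residual stream just before a MoE block and predicts that block's output. Again, the class name is illustrative rather than the actual `MoETranscoder` interface.

```python
import torch
import torch.nn as nn

class TranscoderSketch(nn.Module):
    """Illustrative ReLU transcoder: pre-MoE residual (2048) -> 16384 features -> MoE output (2048)."""

    def __init__(self, d_model: int = 2048, n_features: int = 16384):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model)

    def forward(self, resid_pre_moe: torch.Tensor) -> torch.Tensor:
        features = torch.relu(self.encoder(resid_pre_moe))
        # The decoder output approximates what the MoE block adds to the residual stream.
        return self.decoder(features)
```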
## Usage

```python
from circuit_tracer.saes.sparse_autoencoder import SparseAutoencoder
from circuit_tracer.transcoders.moe_transcoder import MoETranscoder

# Load an SAE checkpoint
sae = SparseAutoencoder.load("checkpoints/sae_l0.pt")

# Load a transcoder checkpoint
tc = MoETranscoder.load("checkpoints/tc_l0.pt")
```
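
Once loaded, an SAE can be applied to residual-stream activations from the matching layer. The `encode`/`decode` method names below are assumptions about the class interface, and the activation tensor is a synthetic stand-in.

```python
import torch

# Synthetic stand-in for layer-0 residual activations: (batch, seq, d_model=2048)
acts = torch.randn(1, 128, 2048)

features = sae.encode(acts)       # (1, 128, 16384) sparse feature activations
recon = sae.decode(features)      # (1, 128, 2048) reconstructed residual stream

mse = (acts - recon).pow(2).mean()
active = (features != 0).float().sum(-1).mean()
print(f"reconstruction MSE: {mse.item():.4f}, active features per token: {active.item():.1f}")
```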
## Training Details
- Base model: Qwen/Qwen3.5-35B-A3B-FP8
- Activation collection: Direct model forward hooks on 500 diverse prompts
- SAE optimizer: Adam, lr=3e-4, cosine annealing
- TC optimizer: Adam, lr=1e-3, cosine annealing (a training-step sketch follows)
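
A rough sketch of a single SAE training step with these optimizer settings. The sparsity term uses an L1 penalty as a stand-in (JumpReLU training typically uses an L0 penalty with a straight-through estimator, which this card does not detail), and the sparsity coefficient and `T_max` are assumptions.

```python
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR

sae = JumpReLUSAESketch()                     # sketch class from the SAE section above
opt = Adam(sae.parameters(), lr=3e-4)         # lr=3e-4 as in the card; lr=1e-3 for transcoders
sched = CosineAnnealingLR(opt, T_max=10_000)  # T_max is an assumption (card: 2000-15000 steps per model)

def train_step(acts: torch.Tensor, sparsity_coeff: float = 1e-3) -> float:
    features = sae.encode(acts)
    recon = sae.decode(features)
    recon_loss = (recon - acts).pow(2).mean()
    sparsity_loss = features.abs().mean()     # L1 stand-in for the sparsity penalty
    loss = recon_loss + sparsity_coeff * sparsity_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    sched.step()
    return loss.item()
```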