Affine Transform: EleutherAI/deep-ignorance-pretraining-stage-unfiltered@global_step38144 → EleutherAI/deep-ignorance-unfiltered@main

Learned affine transformations, one per selected layer, that map hidden-state activations from the source checkpoint to the corresponding activations of the target model.

Usage

import json

import torch.nn as nn
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Download the transform weights and their metadata
repo_id = "EleutherAI/affine-checkpoint-transfer"
weights_path = hf_hub_download(repo_id=repo_id, filename="affine_transforms.safetensors")
metadata_path = hf_hub_download(repo_id=repo_id, filename="metadata.json")

# Load the metadata (layer indices and hidden dimension)
with open(metadata_path) as f:
    metadata = json.load(f)

# Build one nn.Linear per layer from the stored weight and bias tensors
weights = load_file(weights_path)
affine_transforms = {}
for layer_idx in metadata["layer_indices"]:
    linear = nn.Linear(metadata["hidden_dim"], metadata["hidden_dim"], bias=True)
    linear.weight.data = weights[f"layer_{layer_idx}.weight"]
    linear.bias.data = weights[f"layer_{layer_idx}.bias"]
    affine_transforms[layer_idx] = linear
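
Once loaded, each entry of affine_transforms can be applied to hidden states captured from the source checkpoint at the matching layer. The following is a minimal sketch under stated assumptions: activations are taken to have shape (batch, seq_len, hidden_dim), the helper name apply_affine is hypothetical, and a small randomly initialized nn.Linear stands in for a real transform so the snippet runs without downloading anything. How the hidden states are extracted from the model, and which hidden-state index corresponds to each layer number, is not specified in this card.

```python
import torch
import torch.nn as nn


def apply_affine(linear: nn.Linear, hidden: torch.Tensor) -> torch.Tensor:
    """Map source-checkpoint hidden states toward the target model's space."""
    with torch.no_grad():
        # nn.Linear acts on the last dimension, so any leading
        # (batch, seq_len) dimensions are preserved.
        return linear(hidden.to(linear.weight.dtype))


# Stand-in for one entry of `affine_transforms` (hypothetical small size;
# the transforms in this repository use hidden_dim = 4096).
hidden_dim = 8
linear = nn.Linear(hidden_dim, hidden_dim, bias=True)

# Stand-in for source activations of shape (batch, seq_len, hidden_dim).
source_hidden = torch.randn(2, 5, hidden_dim)
mapped = apply_affine(linear, source_hidden)
print(mapped.shape)
```

With the real transforms, the call would look like apply_affine(affine_transforms[15], source_hidden), where source_hidden holds layer-15 activations from the source checkpoint.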

MSE Metrics

Layer   MSE
5       0.081037
10      0.289330
15      0.684350
20      0.978894
25      1.569308
30      4.231404

Mean MSE: 1.305720
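
The reported mean is consistent with an unweighted average of the six per-layer values. The short check below recomputes it from the numbers in the table and also shows how much of the total comes from the last layer; the variable names are illustrative only.

```python
# Per-layer MSE values copied from the table above.
layer_mse = {
    5: 0.081037,
    10: 0.289330,
    15: 0.684350,
    20: 0.978894,
    25: 1.569308,
    30: 4.231404,
}

# Unweighted mean across layers.
mean_mse = sum(layer_mse.values()) / len(layer_mse)

# Layer with the largest error and its share of the summed MSE.
worst_layer = max(layer_mse, key=layer_mse.get)
worst_share = layer_mse[worst_layer] / sum(layer_mse.values())

print(f"mean={mean_mse:.4f} worst_layer={worst_layer} share={worst_share:.2f}")
```

Layer 30 accounts for a little over half of the summed error, so the mean is dominated by the deepest layer.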

Training Details

  • Source Model: EleutherAI/deep-ignorance-pretraining-stage-unfiltered@global_step38144
  • Target Model: EleutherAI/deep-ignorance-unfiltered@main
  • Hidden Dimension: 4096
  • Ridge Alpha: 0.01
  • Layers: [5, 10, 15, 20, 25, 30]
  • Training Examples: 100000
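
The card states only that the transforms were fit with a ridge penalty (alpha = 0.01) on 100000 examples; it does not give the exact fitting procedure. As an illustration of how such a map could be obtained, the sketch below fits an affine map by closed-form ridge regression on synthetic data. The function name fit_ridge_affine, the choice to leave the bias unpenalized by centering the data, and the way alpha enters the normal equations are all assumptions, not a description of the code used to produce these weights.

```python
import torch


def fit_ridge_affine(X: torch.Tensor, Y: torch.Tensor, alpha: float = 0.01):
    """Fit Y ≈ X @ W + b by ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(dim=0), Y.mean(dim=0)
    Xc, Yc = X - x_mean, Y - y_mean
    d = X.shape[1]
    # Normal equations: (Xc^T Xc + alpha * I) W = Xc^T Yc
    gram = Xc.T @ Xc + alpha * torch.eye(d, dtype=X.dtype)
    W = torch.linalg.solve(gram, Xc.T @ Yc)
    # Recover the intercept from the column means.
    b = y_mean - x_mean @ W
    return W, b


# Synthetic stand-in data (hypothetical sizes; the real setting uses
# hidden_dim = 4096 and 100000 examples).
torch.manual_seed(0)
n, d = 1000, 8
X = torch.randn(n, d, dtype=torch.float64)
W_true = torch.randn(d, d, dtype=torch.float64)
b_true = torch.randn(d, dtype=torch.float64)
Y = X @ W_true + b_true + 0.01 * torch.randn(n, d, dtype=torch.float64)

W, b = fit_ridge_affine(X, Y, alpha=0.01)
mse = torch.mean((X @ W + b - Y) ** 2).item()
print(f"train MSE: {mse:.6f}")
```

In this convention W multiplies on the right (X @ W), so loading it into an nn.Linear, which computes x @ weight.T + bias, would require storing W.T as the weight.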