Activation Flow (nflows) for GPT-2 residuals

Files:

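flow.pt: checkpoint dict with keys "model" (the state_dict), "D" (input dimension), and "args" (training arguments), written by train_activation_flow.py
train_activation_flow.py: training script; provides build_flow, which reconstructs the model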

Quick load:

import importlib.util

import torch
from huggingface_hub import hf_hub_download

# Import train_activation_flow.py as a module (assumes it is in the current directory).
spec = importlib.util.spec_from_file_location("taf", "train_activation_flow.py")
m = importlib.util.module_from_spec(spec)
spec.loader.exec_module(m)

# Download the checkpoint and load it on CPU.
fp = hf_hub_download(repo_id="Ionel2023/gpt2-activation-flow-l11-wikitext2", filename="flow.pt")
ckpt = torch.load(fp, map_location="cpu")
args = ckpt["args"]

# Rebuild the architecture from the saved training args, then restore the weights.
flow = m.build_flow(
  D=ckpt["D"],
  arch=args["flow_arch"],
  hidden_features=args["hidden_features"],
  num_transforms=args["num_transforms"],
  num_layers_per_transform=args["num_layers_per_transform"],
  use_lu=not args.get("no_lu", False),
  use_actnorm=not args.get("no_actnorm", False),
  affine_scale=float(args.get("affine_scale", 0.97)),
)
flow.load_state_dict(ckpt["model"])
flow.eval()  # inference mode
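
The argument mapping in the Quick load snippet can also be factored into a small helper, which keeps the defaults in one place and can be checked without torch installed. This is only a sketch: build_flow_kwargs is a hypothetical name that does not exist in train_activation_flow.py, and the example checkpoint values below are made up for illustration.

```python
def build_flow_kwargs(ckpt):
    """Map a loaded checkpoint dict to keyword arguments for build_flow.

    Hypothetical helper: it mirrors the Quick load mapping and nothing more.
    """
    args = ckpt["args"]
    return {
        "D": ckpt["D"],
        "arch": args["flow_arch"],
        "hidden_features": args["hidden_features"],
        "num_transforms": args["num_transforms"],
        "num_layers_per_transform": args["num_layers_per_transform"],
        # The saved flags are negative ("no_*"); build_flow takes positive ones.
        "use_lu": not args.get("no_lu", False),
        "use_actnorm": not args.get("no_actnorm", False),
        "affine_scale": float(args.get("affine_scale", 0.97)),
    }


# Made-up checkpoint contents, for illustration only (not the real flow.pt values).
example_ckpt = {
    "D": 768,
    "args": {
        "flow_arch": "example_arch",
        "hidden_features": 512,
        "num_transforms": 4,
        "num_layers_per_transform": 2,
        "no_lu": True,
    },
}
example_kwargs = build_flow_kwargs(example_ckpt)
```

With the real checkpoint loaded as ckpt, the call would then be m.build_flow(**build_flow_kwargs(ckpt)).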
