# bergson MAGIC checkpoint: GPT-2 fine-tuned on wikitext-2
GPT-2 (124M) fine-tuned on the `train` split of Salesforce/wikitext (config `wikitext-2-raw-v1`), chunked into 512-token blocks, via the bergson MAGIC pipeline. This is the exact checkpoint used to generate the attribution scores published at EleutherAI/bergson-magic-scores-gpt-2.
## Loading
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("EleutherAI/bergson-magic-gpt-2")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/bergson-magic-gpt-2")
```
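As a quick smoke test (continuing from the snippet above; the prompt text is arbitrary and just illustrative), you can compute the model's language-modeling loss on a short WikiText-style sentence:

```python
import torch

text = "The game received generally favorable reviews from critics."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # GPT-2 shifts the labels internally, so passing input_ids as labels
    # gives the standard next-token cross-entropy loss.
    loss = model(**inputs, labels=inputs["input_ids"]).loss
print(f"LM loss on the snippet: {loss.item():.3f}")
```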
## YAML used to produce this checkpoint
```yaml
run_path: runs/gpt2_wikitext
model: gpt2
overwrite: true
data:
  dataset: Salesforce/wikitext
  subset: wikitext-2-raw-v1
  split: "train"
  chunk_length: 512
query:
  dataset: Salesforce/wikitext
  subset: wikitext-2-raw-v1
  split: "test[3:4]"
  chunk_length: 0
distributed:
  nproc_per_node: 4
  nnode: 4
batch_size: 256
num_epochs: 2
lr_schedule:
  lr_scheduler_type: polynomial
  lr: 0.0008
  lr_start: 1e-6
  lr_end: 0.00008
  warmup_steps: 0.25
subset_strategy: random
wandb_project: magic
```
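For reference, `data.chunk_length: 512` means the tokenized train split is packed into fixed 512-token blocks. Below is a minimal sketch of that kind of chunking; the actual packing logic lives inside bergson and may differ in details (e.g. how remainders and document boundaries are handled):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
train = load_dataset("Salesforce/wikitext", "wikitext-2-raw-v1", split="train")

def chunk(batch, chunk_length=512):
    # Tokenize the batch, concatenate all token ids, and slice them into
    # fixed-length blocks, dropping the trailing remainder.
    ids = [tok for seq in tokenizer(batch["text"])["input_ids"] for tok in seq]
    usable = (len(ids) // chunk_length) * chunk_length
    return {"input_ids": [ids[i : i + chunk_length] for i in range(0, usable, chunk_length)]}

chunked = train.map(chunk, batched=True, remove_columns=train.column_names)
print(len(chunked), "blocks of 512 tokens")
```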
Saved as `examples/magic/gpt2_wikitext.yaml` in the bergson repo. Run with:

```bash
bergson magic examples/magic/gpt2_wikitext.yaml
```
The `bergson magic` step trains the model on the train split with its own training loop (it has to, because MAGIC's attribution scores are the gradients of the query loss with respect to per-example training weights, computed by back-propagating through training). The final trained weights land in the `hf_model/` subdirectory of the run path; that is what was uploaded here.
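Schematically (notation mine, not taken from the bergson docs): if $w = (w_1, \dots, w_N)$ are per-example weights on the training losses and $\theta^{*}(w)$ denotes the parameters produced by running the training loop with those weights, then the score MAGIC assigns to training example $i$ for a query $q$ is

$$
\tau_i \;=\; \left. \frac{\partial \, \mathcal{L}_q\!\left(\theta^{*}(w)\right)}{\partial w_i} \right|_{w = \mathbf{1}},
$$

i.e. the sensitivity of the query loss to up- or down-weighting example $i$, evaluated at the uniform weighting actually used during training.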