# FTN v2_boundary Small
Small FTN v2_boundary checkpoint trained on TinyStories with GPT-2 tokenization.
This repository contains a custom FTN checkpoint plus the exact `modeling_ftn.py` implementation needed to load it.
## Training summary
- Variant: v2_boundary
- Fusion mode: add
- Layers: 4
- Hidden size: 256
- FFN dim: 1024
- Local window: 32
- Global kernel: 256
- Max positions: 256
- Best validation loss: 2.427107
- Best epoch: 10
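The same hyperparameters are stored in `config.json`. A minimal sketch for cross-checking them against the summary above (this assumes the file holds a flat JSON object whose keys match what `FTNConfig` expects, as the loading code below implies):

```python
import json

# config.json holds the FTN architecture config (assumed to be a flat JSON object).
with open("config.json") as f:
    cfg = json.load(f)

# Print every stored hyperparameter to compare with the training summary.
for key, value in sorted(cfg.items()):
    print(f"{key}: {value}")
```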
## Files
- `best_checkpoint.pt`: full training checkpoint with config and weights
- `modeling_ftn.py`: model implementation
- `config.json`: FTN architecture config
- `metrics_summary.json`: final metrics and diagnostics summary
- `metrics_history.csv`: per-epoch history
- `ftn_diagnostics.json` / `ftn_diagnostics.csv`: branch and spectral diagnostics
- `samples.json` / `samples.txt`: saved sample generations
- GPT-2 tokenizer files for decoding inputs and outputs
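A quick way to inspect the metrics artifacts (a sketch assuming `metrics_summary.json` parses as a JSON object and `metrics_history.csv` has a header row; the exact fields depend on what the training loop logged):

```python
import csv
import json

# Final metrics and diagnostics summary.
with open("metrics_summary.json") as f:
    summary = json.load(f)
print(summary)

# Per-epoch history; column names are whatever the training run recorded.
with open("metrics_history.csv", newline="") as f:
    for row in csv.DictReader(f):
        print(row)
```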
## Load the model
```python
import torch
from transformers import GPT2TokenizerFast

from modeling_ftn import FTNConfig, FTNForCausalLM

repo_dir = "."

# best_checkpoint.pt bundles both the architecture config and the weights.
checkpoint = torch.load(f"{repo_dir}/best_checkpoint.pt", map_location="cpu")
config = FTNConfig(**checkpoint["config"])

model = FTNForCausalLM(config)
model.load_state_dict(checkpoint["state_dict"])
model.eval()

# The repo ships the GPT-2 tokenizer files used during training.
tokenizer = GPT2TokenizerFast.from_pretrained(repo_dir)
```
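For a quick smoke test, the snippet below continues from the loading code with a greedy decoding loop. It is a sketch that assumes `FTNForCausalLM` follows the usual causal-LM convention of producing logits of shape `[batch, seq, vocab]` for an `input_ids` tensor; if the class exposes a Transformers-style `generate()`, prefer that instead.

```python
prompt = "Once upon a time"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(64):
        if input_ids.shape[1] >= 256:  # max positions from the training summary
            break
        outputs = model(input_ids)
        # Assumes an HF-style output object with .logits; if the model returns
        # a raw tensor instead, the fallback below uses it directly.
        logits = outputs.logits if hasattr(outputs, "logits") else outputs
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_id], dim=-1)

print(tokenizer.decode(input_ids[0]))
```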
## Notes
- This is a custom FTN architecture, not a stock Transformers `AutoModel` class.
- The checkpoint was trained in the FTN research repo at `E:/FTN` and is published here with the exact loading code.