BurnyCoder/modular-multiplication-transformer

Text Classification

mechanistic-interpretability

modular-arithmetic

TransformerLens

BurnyCoder committed on Feb 11

Commit df0959a · verified · 1 Parent(s): 438d6bd

Upload README.md with huggingface_hub

Files changed (1)

README.md +1 -1

@@ -16,7 +16,7 @@ pipeline_tag: text-classification
 # Modular Multiplication Transformer
-A 1-layer, 4-head transformer trained on **(a x b) mod 113** that exhibits **grokking** — delayed generalization after memorization. This checkpoint includes full training history (400 checkpoints across 40,000 epochs).
+A 1-layer, 4-head transformer trained on **(a x b) mod 113** that exhibits **grokking** (delayed generalization after memorization). This checkpoint includes full training history (400 checkpoints across 40,000 epochs).
 ## Model Architecture