Text Classification
Transformers
PyTorch
English
mechanistic-interpretability
grokking
modular-arithmetic
transformer
TransformerLens
toy-model
Instructions to use BurnyCoder/modular-multiplication-transformer with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use BurnyCoder/modular-multiplication-transformer with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="BurnyCoder/modular-multiplication-transformer")

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("BurnyCoder/modular-multiplication-transformer", dtype="auto")
```
- Notebooks
- Google Colab
- Kaggle
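As a point of reference, the task the model was trained on is easy to state in plain Python. The sketch below enumerates the full (a x b) mod 113 dataset; the function name and structure are illustrative, not the repository's actual training code:

```python
# Illustration of the modular multiplication task from the model card:
# predicting (a * b) mod 113 for every pair (a, b). This is a generic
# sketch, not the repository's training pipeline.
P = 113  # the modulus stated in the model card

def modmul_dataset(p: int):
    """Enumerate every (a, b) pair with its label (a * b) mod p."""
    return [((a, b), (a * b) % p) for a in range(p) for b in range(p)]

data = modmul_dataset(P)
print(len(data))   # 113 * 113 = 12769 examples
print(data[0])     # ((0, 0), 0)
```

With all 12,769 pairs enumerable, grokking experiments typically split this set into fixed train/test fractions rather than sampling fresh data each epoch.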
Upload README.md with huggingface_hub
README.md CHANGED

```diff
@@ -16,7 +16,7 @@ pipeline_tag: text-classification

 # Modular Multiplication Transformer

-A 1-layer, 4-head transformer trained on **(a x b) mod 113** that exhibits **grokking**
+A 1-layer, 4-head transformer trained on **(a x b) mod 113** that exhibits **grokking** (delayed generalization after memorization). This checkpoint includes full training history (400 checkpoints across 40,000 epochs).

 ## Model Architecture
```
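Grokking, as the updated README describes it, is a delayed jump in test accuracy long after training accuracy saturates. A generic way to locate that transition in logged accuracy curves is sketched below; the curves here are hypothetical illustrations, not the repository's actual training logs:

```python
# Hypothetical accuracy curves showing the grokking pattern: train
# accuracy saturates early while test accuracy stays low, then jumps
# much later. These numbers are illustrative only.
train_acc = [0.50, 0.99, 1.00, 1.00, 1.00, 1.00]
test_acc  = [0.01, 0.02, 0.03, 0.05, 0.90, 1.00]

def grokking_step(train, test, threshold=0.9):
    """First logged step where test accuracy crosses `threshold`
    strictly after train accuracy has already crossed it."""
    memorized = next(i for i, a in enumerate(train) if a >= threshold)
    return next(i for i, a in enumerate(test) if i > memorized and a >= threshold)

print(grokking_step(train_acc, test_acc))  # 4: test catches up 3 steps after train
```

With 400 checkpoints logged across 40,000 epochs, a scan like this over the saved history is one way to pinpoint where the model's generalization emerges.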