---
language: en
license: mit
tags:
- propagation-logic
- mechanism-first
- abstract-reasoning
- derivation-traces
- boundary-conditions
datasets:
- ApplePiesFromScratch/dta-benchmark
metrics:
- dta
---
# MechanismBase: P / G → Q
A 10.5M-parameter transformer trained on derivation traces, not natural language.
## What this is
Standard language models learn statistical patterns over text.
This model was trained on the **procedure** P / G → Q: explicit derivation
traces showing closure analysis, fixed point detection, cycle structure
identification, and forced boundary condition derivation.
**The claim:** given any carrier V and gradient family Γ, the model can derive
forced boundary conditions: what logic system the carrier implies, what
fixed points exist, what cycle structure is forced.
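
The four analyses named above can be illustrated on a small finite carrier. The sketch below is not the repo's engine. It assumes each gradient is a unary map on the carrier, given as a plain dict, and it uses the standard mathematical meanings of closure, fixed point, involution, and cycle. The names `analyze_carrier` and `cycle_lengths`, and the example maps `not`, `id`, and `rotate`, are hypothetical.

```python
from typing import Dict, Hashable, List

Elem = Hashable
Gradient = Dict[Elem, Elem]  # assumption: a gradient is a unary map on the carrier


def cycle_lengths(carrier: List[Elem], g: Gradient) -> List[int]:
    """Sorted lengths of the cycles of g, assuming g maps the carrier into itself."""
    on_cycle = set()
    lengths = []
    for x in carrier:
        if x in on_cycle:
            continue
        # Walk forward from x; x lies on a cycle iff we return to it
        # within len(carrier) steps.
        y, steps = g[x], 1
        while y != x and steps < len(carrier):
            y, steps = g[y], steps + 1
        if y == x:
            z = x
            for _ in range(steps):  # mark every member of this cycle
                on_cycle.add(z)
                z = g[z]
            lengths.append(steps)
    return sorted(lengths)


def analyze_carrier(
    carrier: List[Elem], gradients: Dict[str, Gradient]
) -> Dict[str, Dict[str, object]]:
    """Report closure, fixed points, involution, and cycle lengths per gradient."""
    members = set(carrier)
    report = {}
    for name, g in gradients.items():
        # Closed: g is defined on every element and never leaves the carrier.
        closed = all(x in g and g[x] in members for x in carrier)
        report[name] = {
            "closed": closed,
            "fixed_points": [x for x in carrier if g.get(x) == x],
            "involution": closed and all(g[g[x]] == x for x in carrier),
            "cycles": cycle_lengths(carrier, g) if closed else [],
        }
    return report


# Hypothetical examples, not taken from the repo's domain definitions.
boolean = analyze_carrier(
    ["T", "F"],
    {"not": {"T": "F", "F": "T"}, "id": {"T": "T", "F": "F"}},
)
rotation = analyze_carrier(
    ["red", "green", "blue"],
    {"rotate": {"red": "green", "green": "blue", "blue": "red"}},
)
print(boolean)
print(rotation)
```

In this sketch, `not` on the two-element carrier has no fixed points and a single 2-cycle, while `rotate` on three colors forms one 3-cycle and is not an involution.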
## Theory
Propagation Logic v13, SSRN Abstract ID: 6439258 (James Pugmire)
The single primitive operator: `P / G → Q`
A loaded pattern P propagates through gradient field G in context C to
produce updated pattern Q. All of classical logic, fuzzy logic, arithmetic,
calculus, and grammar fall out of different (V, Γ) choices.
## Model
- Architecture: Transformer decoder (custom, mechanism-aligned)
- Parameters: 10.5M
- Training tokens: ~1M (derivation traces)
- Training epochs: 5
## Benchmark: DTA (Derivation Trace Accuracy)
The correct benchmark for this model is not BLiMP or MMLU.
It is DTA, which measures how accurately the model predicts forced boundary
conditions on novel carriers.
See: `ApplePiesFromScratch/dta-benchmark`
| Model | DTA-Overall | DTA-Closure | DTA-FixedPts | DTA-Involution | DTA-Cycle |
|-------|-------------|-------------|--------------|----------------|-----------|
| MechanismBase (10M) | 77.5% | 80.0% | 90.0% | 100.0% | 40.0% |
| GPT-3.5-turbo (175B)| 55.0% | 70.0% | 10.0% | 50.0% | 90.0% |
| GPT-4 (1.8T) | 87.5% |100.0% | 70.0% | 90.0% | 90.0% |
| Random baseline | 25.0% | 50.0% | 25.0% | 50.0% | 25.0% |
| Engine (oracle) |100.0% |100.0% |100.0% | 100.0% |100.0% |
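
How DTA-Overall is aggregated is not stated here. For the three model rows, the reported overall equals the unweighted mean of the four category scores, so the sketch below assumes that rule. It is an inference from the table, not a documented definition: the Random baseline row reports 25.0% overall while its category columns average 37.5%. The helper name `macro_average` is hypothetical.

```python
from typing import Dict, List, Tuple

# Category scores in table order: Closure, FixedPts, Involution, Cycle,
# followed by the DTA-Overall value reported in the table.
reported: Dict[str, Tuple[List[float], float]] = {
    "MechanismBase (10M)": ([80.0, 90.0, 100.0, 40.0], 77.5),
    "GPT-3.5-turbo (175B)": ([70.0, 10.0, 50.0, 90.0], 55.0),
    "GPT-4 (1.8T)": ([100.0, 70.0, 90.0, 90.0], 87.5),
}


def macro_average(scores: List[float]) -> float:
    """Unweighted mean across categories (assumed aggregation rule)."""
    return sum(scores) / len(scores)


for name, (scores, overall) in reported.items():
    print(f"{name}: mean of categories = {macro_average(scores):.1f}%, reported = {overall:.1f}%")
```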
## Usage
```python
# The model requires the pl/ library and engine.py from the repo
# Clone: github.com/ApplePiesFromScratch/propagation-logic
import torch
from tokenizers import Tokenizer

from model import MechanismBase, SmallConfig  # model.py from the cloned repo

config = SmallConfig()
model = MechanismBase(config)
# Load weights from Hub (see full usage in repo).
# Without that step the model presumably runs with randomly initialized weights.
model.eval()  # assumes MechanismBase is a torch.nn.Module

tokenizer = Tokenizer.from_file("mechanism_tokenizer/tokenizer.json")

# Give the model a partial derivation trace
partial = """DOMAIN: color_domain
CARRIER: ['red', 'green', 'blue']
GRADIENTS: ['complement', 'id']
THETA: 1.0
---
"""
ids = torch.tensor(tokenizer.encode(partial).ids).unsqueeze(0)  # shape: (1, seq_len)
with torch.no_grad():  # inference only, no gradients needed
    output = model.generate(ids, max_new_tokens=200, temperature=0.3)
print(tokenizer.decode(output[0].tolist()))
```
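
The partial trace above follows a fixed header layout (`DOMAIN`, `CARRIER`, `GRADIENTS`, `THETA`, then a `---` separator). A small helper can build that header for other carriers. This is a minimal sketch that assumes the layout shown in the example is the whole prompt format. The function name `build_prompt` is hypothetical and not part of the repo.

```python
from typing import Sequence


def build_prompt(
    domain: str,
    carrier: Sequence[str],
    gradients: Sequence[str],
    theta: float = 1.0,
) -> str:
    """Format a partial derivation trace header like the one in the usage example."""
    lines = [
        f"DOMAIN: {domain}",
        f"CARRIER: {list(carrier)!r}",      # Python list repr, e.g. ['red', 'green']
        f"GRADIENTS: {list(gradients)!r}",
        f"THETA: {float(theta)}",
        "---",                              # separator before the model's continuation
    ]
    return "\n".join(lines) + "\n"


prompt = build_prompt("color_domain", ["red", "green", "blue"], ["complement", "id"])
print(prompt)
```

The resulting string can be passed to `tokenizer.encode` in place of the hand-written `partial` string.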
## Training
```bash
python generate_data.py    # generates derivation trace corpus
python tokenizer_train.py  # BPE tokenizer on corpus
python train.py            # SmallConfig, ~30 min on RTX 4060 Ti
```
## Repository
GitHub: [ApplePiesFromScratch/propagation-logic](https://github.com/ApplePiesFromScratch/propagation-logic)