| --- |
| language: en |
| license: mit |
| tags: |
| - propagation-logic |
| - mechanism-first |
| - abstract-reasoning |
| - derivation-traces |
| - boundary-conditions |
| datasets: |
| - ApplePiesFromScratch/dta-benchmark |
| metrics: |
| - dta |
| --- |
| |
| # MechanismBase: P / G → Q |
|
|
| A 10.5M-parameter transformer decoder trained on derivation traces rather than natural language. |
|
|
| ## What this is |
|
|
| Standard language models learn statistical patterns over text. |
| This model was instead trained on the **procedure** `P / G → Q`: explicit derivation |
| traces showing closure analysis, fixed-point detection, cycle-structure |
| identification, and forced boundary-condition derivation. |
|
|
| **The claim:** given any carrier V and gradient family Ξ, the model can derive |
| the forced boundary conditions: which logic system the carrier implies, which |
| fixed points exist, and which cycle structure is forced. |
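
To make the terms in the claim concrete, here is a minimal sketch of the four analyses on a small finite carrier. It assumes each gradient can be modelled as a plain function from V to V, which is an interpretation made here for illustration. The helper names (`is_closed`, `fixed_points`, `is_involution`, `cycle_lengths`) are hypothetical and are not taken from the repo's `engine.py`.

```python
from typing import Callable, Hashable, List, Sequence

# Assumption: a gradient is modelled as an ordinary function V -> V.
Gradient = Callable[[Hashable], Hashable]


def is_closed(carrier: Sequence[Hashable], g: Gradient) -> bool:
    """True if g maps every carrier element back into the carrier."""
    elems = set(carrier)
    return all(g(x) in elems for x in elems)


def fixed_points(carrier: Sequence[Hashable], g: Gradient) -> List[Hashable]:
    """Elements x of the carrier with g(x) == x, in carrier order."""
    return [x for x in carrier if g(x) == x]


def is_involution(carrier: Sequence[Hashable], g: Gradient) -> bool:
    """True if applying g twice returns every element to itself."""
    return all(g(g(x)) == x for x in carrier)


def cycle_lengths(carrier: Sequence[Hashable], g: Gradient) -> List[int]:
    """Sorted cycle lengths of g, assuming g permutes the carrier."""
    seen, lengths = set(), []
    for start in carrier:
        if start in seen:
            continue
        length, x = 0, start
        # Follow the orbit of `start` until it returns to a visited element.
        while x not in seen:
            seen.add(x)
            x = g(x)
            length += 1
        lengths.append(length)
    return sorted(lengths)


# Two-valued carrier with a negation-like gradient.
V = [0, 1]
negate = lambda x: 1 - x

print(is_closed(V, negate))      # closure check
print(fixed_points(V, negate))   # negation fixes nothing on {0, 1}
print(is_involution(V, negate))  # negate(negate(x)) == x
print(cycle_lengths(V, negate))  # a single cycle swapping 0 and 1
```

On this carrier the negation is closed, has no fixed points, is an involution, and forms one 2-cycle, while the identity gradient would fix every element.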
|
|
| ## Theory |
|
|
| Propagation Logic v13, SSRN Abstract ID: 6439258 (James Pugmire) |
|
|
| The single primitive operator: `P / G → Q` |
|
|
| A loaded pattern P propagates through gradient field G in context C to |
| produce updated pattern Q. In this framework, classical logic, fuzzy logic, |
| arithmetic, calculus, and grammar all arise from different choices of (V, Ξ). |
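
As a small illustration of how the choice of carrier changes what is forced, the sketch below applies the same negation-like gradient `x -> 1 - x` to a two-valued carrier and to a three-valued one. Reading the two-valued case as classical and the three-valued case as a coarse fuzzy carrier is an interpretation made here for illustration, not a statement taken from the paper. The names `negate` and `fixed_points` are hypothetical.

```python
from fractions import Fraction


def negate(x):
    """Negation-like gradient: reflect a truth value around 1/2."""
    return 1 - x


def fixed_points(carrier, gradient):
    """Elements of the carrier that the gradient leaves unchanged."""
    return [x for x in carrier if gradient(x) == x]


# Exact fractions avoid floating-point comparison issues.
two_valued = [Fraction(0), Fraction(1)]
three_valued = [Fraction(0), Fraction(1, 2), Fraction(1)]

print(fixed_points(two_valued, negate))    # no fixed point on {0, 1}
print(fixed_points(three_valued, negate))  # 1/2 is fixed on {0, 1/2, 1}
```

The gradient is identical in both cases; only the carrier differs, and with it the set of fixed points.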
|
|
| ## Model |
|
|
| - Architecture: Transformer decoder (custom, mechanism-aligned) |
| - Parameters: 10.5M |
| - Training tokens: ~1M (derivation traces) |
| - Training epochs: 5 |
|
|
| ## Benchmark: DTA (Derivation Trace Accuracy) |
|
|
| The appropriate benchmark for this model is not BLiMP or MMLU. |
| It is DTA: how accurately does the model predict forced boundary conditions |
| on novel carriers? |
|
|
| See: `ApplePiesFromScratch/dta-benchmark` |
|
| Model                | DTA-Overall | DTA-Closure | DTA-FixedPts | DTA-Involution | DTA-Cycle |
|----------------------|-------------|-------------|--------------|----------------|-----------|
| MechanismBase (10M)  | 77.5%       | 80.0%       | 90.0%        | 100.0%         | 40.0%     |
| GPT-3.5-turbo (175B) | 55.0%       | 70.0%       | 10.0%        | 50.0%          | 90.0%     |
| GPT-4 (1.8T)         | 87.5%       | 100.0%      | 70.0%        | 90.0%          | 90.0%     |
| Random baseline      | 25.0%       | 50.0%       | 25.0%        | 50.0%          | 25.0%     |
| Engine (oracle)      | 100.0%      | 100.0%      | 100.0%       | 100.0%         | 100.0%    |
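
For the model and oracle rows, the DTA-Overall figure equals the unweighted mean of the four category scores. A short sketch of that aggregation follows. Treating the overall score as an unweighted mean is an assumption inferred from those rows, not a definition taken from the benchmark repo. Under this assumption the random-baseline row would average 37.5% rather than the listed 25.0%, so that row may be computed differently.

```python
from statistics import mean

# Per-category scores (percent) copied from the table, in the order
# Closure, FixedPts, Involution, Cycle.
category_scores = {
    "MechanismBase (10M)": [80.0, 90.0, 100.0, 40.0],
    "GPT-3.5-turbo (175B)": [70.0, 10.0, 50.0, 90.0],
    "GPT-4 (1.8T)": [100.0, 70.0, 90.0, 90.0],
}


def overall(scores):
    """Unweighted mean of the category scores (assumed aggregation)."""
    return mean(scores)


for name, scores in category_scores.items():
    print(f"{name}: {overall(scores):.1f}%")
```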
|
|
|
|
| ## Usage |
|
|
| ```python |
| # The model requires the pl/ library and engine.py from the repo |
| # Clone: github.com/ApplePiesFromScratch/propagation-logic |
| |
| from model import MechanismBase, SmallConfig |
| from tokenizers import Tokenizer |
| import torch |
| |
| config = SmallConfig() |
| model = MechanismBase(config) |
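| # Load the trained weights from the Hub before generating (see full usage in repo); |
| # without them the model is randomly initialized and its output is meaningless. |
| model.eval()  # inference mode; assumes MechanismBase is a torch.nn.Module |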
| |
| tokenizer = Tokenizer.from_file("mechanism_tokenizer/tokenizer.json") |
| |
| # Give the model a partial derivation trace |
| partial = """DOMAIN: color_domain |
| CARRIER: ['red', 'green', 'blue'] |
| GRADIENTS: ['complement', 'id'] |
| THETA: 1.0 |
| --- |
| """ |
| |
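| # Encode the prompt and add a batch dimension: shape (1, seq_len) |
| ids = torch.tensor(tokenizer.encode(partial).ids).unsqueeze(0) |
| # Sample a continuation of the trace; gradients are not needed at inference |
| with torch.no_grad(): |
|     output = model.generate(ids, max_new_tokens=200, temperature=0.3) |
| print(tokenizer.decode(output[0].tolist())) |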
| ``` |
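
The prompt header in the example above follows a fixed layout, so it can be assembled programmatically. The helper below is a hypothetical convenience (`build_prompt` is not part of the repo). It assumes the model expects exactly the field order and list formatting shown in the example.

```python
def build_prompt(domain, carrier, gradients, theta):
    """Format a partial derivation trace header ending in the '---' separator."""
    return (
        f"DOMAIN: {domain}\n"
        f"CARRIER: {list(carrier)!r}\n"      # Python list repr, e.g. ['a', 'b']
        f"GRADIENTS: {list(gradients)!r}\n"
        f"THETA: {theta}\n"
        "---\n"
    )


partial = build_prompt(
    domain="color_domain",
    carrier=["red", "green", "blue"],
    gradients=["complement", "id"],
    theta=1.0,
)
print(partial)
```

The resulting string matches the hand-written `partial` prompt and can be passed to the tokenizer in the same way.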
|
|
| ## Training |
|
|
| ```bash |
| python generate_data.py # generates derivation trace corpus |
| python tokenizer_train.py # BPE tokenizer on corpus |
| python train.py # SmallConfig, ~30 min on RTX 4060 Ti |
| ``` |
|
|
| ## Repository |
|
|
| GitHub: [ApplePiesFromScratch/propagation-logic](https://github.com/ApplePiesFromScratch/propagation-logic) |
|
|