CharlesCNorton committed · a2cd2fc
Parent(s): 084c69c
Clarify proof-of-concept status: circuit validation complete, LLM integration in progress
Renamed passthrough training files to reflect their scaffolding role:
- train.py → train_passthrough.py
- train_router.py → train_passthrough_router.py
- trained_router.pt → trained_passthrough_router.pt
These demonstrate routing works with pre-formatted inputs, but the real
challenge is learning to extract operands from LLM hidden states.
Updated README:
- Stage 1 (Circuit Validation): COMPLETE - 100% on all ops
- Stage 2 (LLM Baseline): COMPLETE - SmolLM2 at 11.90%
- Stage 3 (LLM Integration): IN PROGRESS - the actual hard part
Honest assessment: passthrough training is trivial (copies labels).
Real test is parsing "47 + 86" from hidden states, not [bits, op_onehot].
README.md
CHANGED

@@ -503,58 +503,59 @@ The experimental condition adds:
 2. Neural interface layers can learn to use discrete computational substrates
 3. Small language models can achieve perfect arithmetic via architectural augmentation rather than scale

-####
-**
-
 ```
-TARGET: 100% FITNESS ACHIEVED
-
-Per-operation:
-add: 1.0000
-sub: 1.0000
-mul: 1.0000
-gt: 1.0000
-lt: 1.0000
-eq: 1.0000
-
-CONCLUSION: Router successfully learned operation dispatch.
-With correct bit encoding, 100% is achievable.
-======================================================================
 ```

-1. Frozen threshold circuits achieve 100% on all operations when given correct bit inputs
-2. A 1,862-parameter router learns operation dispatch in one epoch
-3. The remaining challenge for full LLM integration is learning bit encoding from hidden states
-4. This validates the core thesis: discrete computational substrates can provide exact arithmetic

 #### Proof of Concept Scope

-This proof of concept validated the core mechanism:
-
 - **8-bit operands** (0-255)
 - **Six operations**: ADD, SUB, MUL, GT, LT, EQ
 - **Pure ALU profile** (no memory access)
-- **Ground truth bits** (bit encoding from hidden states is the next step)

 ### Extension Roadmap

@@ -589,9 +590,9 @@ The following extensions are planned after proof-of-concept validation:
 | `llm_integration/baseline.py` | SmolLM2-360M arithmetic baseline evaluation (11.90% fitness) |
 | `llm_integration/fitness.py` | Shared fitness function for randomized arithmetic tests |
 | `llm_integration/circuits.py` | Frozen threshold circuit wrapper with STE gradients |
-| `llm_integration/model.py` |
-| `llm_integration/
-| `llm_integration/

 ### Build Tool Usage
 2. Neural interface layers can learn to use discrete computational substrates
 3. Small language models can achieve perfect arithmetic via architectural augmentation rather than scale

+#### Progress
+
+**Stage 1: Circuit Validation – COMPLETE**
+
+The frozen threshold circuits achieve 100% accuracy when given correctly formatted bit inputs:
+
+| Test | Result |
+|------|--------|
+| DirectCircuitModel (ground truth bits) | 100.00% on 10,000 random cases |
+| All operations (ADD, SUB, MUL, GT, LT, EQ) | 100.00% each |
+
+This confirms the circuits compute correctly. However, this was already established by `eval.py`.
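To make "ground truth bits in, exact result out" concrete, here is an illustrative stand-in in plain Python: a ripple-carry adder over MSB-first 8-bit vectors. The repository's frozen circuits realize the same function with threshold gates; `add_circuit`, `to_bits`, and the MSB-first layout are assumptions for illustration, not the project's code.

```python
import random

def add_circuit(a_bits, b_bits):
    """Ripple-carry addition over MSB-first bit vectors.

    Stand-in for the frozen ADD circuit: given exact operand bits,
    the output bits are exact by construction -- no learning involved.
    """
    carry = 0
    out = []
    for a, b in zip(reversed(a_bits), reversed(b_bits)):
        out.append(a ^ b ^ carry)                # sum bit
        carry = (a & b) | (carry & (a ^ b))      # carry out
    out.append(carry)  # extra bit so 8-bit sums cannot overflow
    return list(reversed(out))

def to_bits(n, width=8):
    # MSB-first bit vector of an unsigned integer.
    return [(n >> i) & 1 for i in reversed(range(width))]

# Spot-check on random 8-bit cases, mirroring the 10,000-case evaluation.
for _ in range(1000):
    x, y = random.randrange(256), random.randrange(256)
    result = int("".join(map(str, add_circuit(to_bits(x), to_bits(y)))), 2)
    assert result == x + y
```

The point of the sketch: once the bits are correct, accuracy is 100% by construction, which is why Stage 1 passes and all difficulty moves to producing those bits.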
+
+**Stage 2: LLM Baseline – COMPLETE**
+
+SmolLM2-360M-Instruct baseline on randomized 8-bit arithmetic:
+
+| Operation | Accuracy |
+|-----------|----------|
+| Addition | 35.92% |
+| Subtraction | 17.72% |
+| Multiplication | 1.25% |
+| Comparisons | 0.28–14.37% |
+| **Overall** | **11.90%** |
+
+Head-to-head on 50 random cases: SmolLM2 got 7/50 (14%), circuits got 50/50 (100%).
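An exact-match fitness over randomized 8-bit cases can be sketched as follows. This is illustrative only; the `fitness` name, its signature, and the scoring layout are assumptions, not the contents of `llm_integration/fitness.py`.

```python
import random

# The six operations in the proof-of-concept scope.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "gt":  lambda a, b: int(a > b),
    "lt":  lambda a, b: int(a < b),
    "eq":  lambda a, b: int(a == b),
}

def fitness(model_fn, n_cases=100, seed=0):
    """Fraction of exact-match answers over random 8-bit operand pairs.

    model_fn(op_name, a, b) -> predicted integer answer.
    Returns (overall fraction, per-op [correct, seen] counts).
    """
    rng = random.Random(seed)
    per_op = {name: [0, 0] for name in OPS}
    for _ in range(n_cases):
        name = rng.choice(list(OPS))
        a, b = rng.randrange(256), rng.randrange(256)
        per_op[name][1] += 1
        if model_fn(name, a, b) == OPS[name](a, b):
            per_op[name][0] += 1
    overall = sum(c for c, _ in per_op.values()) / n_cases
    return overall, per_op
```

Randomizing operands on every evaluation is what keeps the baseline honest: the model cannot memorize a fixed test set.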
+
+**Stage 3: LLM Integration – IN PROGRESS**
+
+The actual challenge: train an interface that extracts operands and operations from LLM hidden states (not from pre-formatted bit inputs).

 ```
+"What is 47 + 86?"
+        ↓
+[LLM hidden states]
+        ↓
+BitExtractor (must LEARN: "47" → 00101111, "86" → 01010110)
+OpRouter (must LEARN: "+" → add operation)
+        ↓
+[Frozen threshold circuits]
+        ↓
+[Result bits] → "133"
 ```
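The exact encodings in the diagram can be written down directly. A minimal sketch of the two fixed mappings bracketing the learned interface (function names are illustrative, not from the repository):

```python
def to_bits(n, width=8):
    # MSB-first bit vector: the target the extractor must learn to
    # produce from hidden states ("47" -> 00101111).
    return [(n >> i) & 1 for i in reversed(range(width))]

def from_bits(bits):
    # Decode the circuit's output bits back to an integer ("133").
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value

assert to_bits(47) == [0, 0, 1, 0, 1, 1, 1, 1]   # 00101111
assert to_bits(86) == [0, 1, 0, 1, 0, 1, 1, 0]   # 01010110
assert from_bits(to_bits(47 + 86, width=9)) == 133
```

Both mappings are deterministic; only producing `to_bits`-style targets from hidden states requires training.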
+
+The `train_passthrough_*.py` files demonstrate that routing works when given labels, but this is trivial; the real test is learning to parse from natural language.
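Why the passthrough setup is trivial is visible in its input layout. A sketch of the `[bits, op_onehot]` format mentioned in the commit message; the exact field ordering and the `passthrough_features` name are assumptions:

```python
def passthrough_features(a, b, op_index, n_ops=6):
    """Pre-formatted training input: operand bits plus a one-hot op code.

    Nothing has to be parsed from text -- the operands and operation are
    handed over explicitly, which is why routing on this format is easy.
    """
    to_bits = lambda n: [(n >> i) & 1 for i in reversed(range(8))]
    op_onehot = [int(i == op_index) for i in range(n_ops)]
    return to_bits(a) + to_bits(b) + op_onehot

# "47 + 86" arrives already decomposed: no extraction problem remains.
x = passthrough_features(47, 86, op_index=0)  # op 0 = add (assumed order)
assert len(x) == 8 + 8 + 6
```

Learning from "What is 47 + 86?" means recovering all 22 of these values from hidden states instead of receiving them.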

 #### Proof of Concept Scope

 - **8-bit operands** (0-255)
 - **Six operations**: ADD, SUB, MUL, GT, LT, EQ
 - **Pure ALU profile** (no memory access)

+**Current status**: Circuit validation complete. LLM hidden state extraction in development.

 ### Extension Roadmap

 | `llm_integration/baseline.py` | SmolLM2-360M arithmetic baseline evaluation (11.90% fitness) |
 | `llm_integration/fitness.py` | Shared fitness function for randomized arithmetic tests |
 | `llm_integration/circuits.py` | Frozen threshold circuit wrapper with STE gradients |
+| `llm_integration/model.py` | Interface layer definitions (BitEncoder, OpRouter, BitDecoder) |
+| `llm_integration/train_passthrough.py` | Scaffolding: trains with pre-formatted bit inputs |
+| `llm_integration/train_passthrough_router.py` | Scaffolding: router-only with ground truth bits |

 ### Build Tool Usage

llm_integration/{train.py → train_passthrough.py}
RENAMED
File without changes

llm_integration/{train_router.py → train_passthrough_router.py}
RENAMED
File without changes

llm_integration/{trained_router.pt → trained_passthrough_router.pt}
RENAMED
File without changes