Create BIRTH.md
BIRTH.md
ADDED
@@ -0,0 +1,27 @@
# DiffReaper 3 Birth Log

**Model:** DiffReaper 3 (1.5B Parameter dLLM)
**Date:** 2026-01-27
**Host:** Vast.ai (1x RTX 4090 - France)
**Training Time:** ~1 hour
**Steps:** 10,000

## Architecture Spec
- **Layers:** 24
- **Hidden Dim:** 2048
- **Heads:** 16
- **Objective:** Mercury-style Discrete Diffusion
- **Payload:** 2.85 GB weight file (`pytorch_model.bin`)

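As a rough sanity check on the spec above, the sketch below estimates the parameter count and checkpoint size of a standard transformer stack with these dimensions. The 4x MLP expansion, the vocabulary size of 50,257, tied embeddings, and 2 bytes per weight (fp16/bf16) are assumptions, not values from this log, and the function name is hypothetical.

```python
def estimate_transformer_params(layers, hidden, vocab_size, mlp_ratio=4):
    """Rough parameter count for a standard transformer stack (biases and norms ignored)."""
    attention = 4 * hidden * hidden          # Q, K, V and output projections
    mlp = 2 * mlp_ratio * hidden * hidden    # up- and down-projection
    embeddings = vocab_size * hidden         # token embedding table (assumed tied with the output head)
    return layers * (attention + mlp) + embeddings


# Values from the spec: 24 layers, hidden dim 2048.
# ASSUMPTION: vocab_size=50_257 (GPT-2 style) and the 4x MLP ratio are not stated in the log.
params = estimate_transformer_params(layers=24, hidden=2048, vocab_size=50_257)
size_gb_fp16 = params * 2 / 1e9              # ASSUMPTION: 2 bytes per weight

print(f"~{params / 1e9:.2f}B parameters, ~{size_gb_fp16:.2f} GB at 2 bytes/weight")
```

Under these assumptions the estimate comes out at roughly 1.31B parameters and 2.62 GB, in the same range as the reported 1.5B and 2.85 GB. The gap could come from a larger vocabulary, untied embeddings, or extra diffusion-specific layers, none of which the log specifies. The head count does not change the total, since it only splits the hidden dimension.
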
## Training History
- Attempt 1: Failed (Zombie SSH port on first host).
- Attempt 2: Failed (Gated dataset access issues).
- Attempt 3: Switched to "Raw Logic Stream" (bypassing the datasets lib).
- Step 4150: Loss hit a major milestone: **0.3503**.
- Step 9500: Reaped logical structure: **0.0038**.

## First Live Denoise Test
**Prompt:** `The reaper of code looks upon the logic and says: def process_data(x):`
**Result:** `The re arm of feel looks upon theSh and says: arm feel seas feel sufferxumer feel,,,,,,,,,,,,,,,,,,,,`

**Conclusion:** The model shows clear diffusion artifacting. It successfully parallel-unmasked the block, though the mixed Shakespeare/Python training has created a "Tortured Poet" logic gate.
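
The "parallel-unmasked the block" step can be illustrated with a toy sketch of confidence-based unmasking, a common sampling scheme for masked discrete diffusion. This is not DiffReaper's actual sampler, which the log does not describe: `toy_denoiser`, `parallel_unmask`, the `<mask>` token, and the tiny vocabulary are all hypothetical, and the stand-in model returns random tokens rather than learned predictions.

```python
import random

MASK = "<mask>"  # hypothetical mask token, not taken from the log


def toy_denoiser(tokens, vocab, rng):
    """Stand-in for a trained model: proposes (token, confidence) for every masked slot."""
    return {
        i: (rng.choice(vocab), rng.random())
        for i, tok in enumerate(tokens)
        if tok == MASK
    }


def parallel_unmask(prompt, block_len, vocab, steps=4, seed=0):
    """Iteratively fill a masked block, committing the most confident proposals each step."""
    rng = random.Random(seed)
    tokens = list(prompt) + [MASK] * block_len
    for step in range(steps):
        proposals = toy_denoiser(tokens, vocab, rng)
        if not proposals:
            break
        # Commit an even share of the remaining masks; the final step commits all of them.
        remaining_steps = steps - step
        keep = -(-len(proposals) // remaining_steps)  # ceiling division
        ranked = sorted(proposals.items(), key=lambda kv: kv[1][1], reverse=True)
        for i, (tok, _conf) in ranked[:keep]:
            tokens[i] = tok
    return tokens


prompt = "def process_data ( x ) :".split()
out = parallel_unmask(prompt, block_len=8, vocab=["return", "x", "+", "1", "\n"])
print(out)
```

Every masked position receives a proposal in the same pass, which is what distinguishes this from left-to-right decoding. With a weak or undertrained denoiser, the committed tokens can be locally confident but globally incoherent, which is one plausible reading of the repeated-token output above.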