# DiffReaper 3 Birth Log
**Model:** DiffReaper 3 (1.5B Parameter dLLM)
**Date:** 2026-01-27
**Host:** Vast.ai (1x RTX 4090 - France)
**Training Time:** ~1 hour
**Steps:** 10,000
## Architecture Spec
- **Layers:** 24
- **Hidden Dim:** 2048
- **Heads:** 16
- **Objective:** Mercury-style Discrete Diffusion
- **Payload:** 2.85 GB weight file (`pytorch_model.bin`)
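
As a rough sanity check on the spec above, the following sketch estimates how many weights the 24-layer stack would hold and how many parameters the 2.85 GB file could contain. It assumes a standard transformer block (four attention projections plus a 4x-wide MLP, biases and norms ignored), decimal gigabytes, and 16-bit or 32-bit storage; none of these are stated in this log, and the vocabulary size is unknown, so embeddings are left out.

```python
# Spec values taken from the list above.
LAYERS = 24
HIDDEN_DIM = 2048
HEADS = 16
WEIGHT_FILE_BYTES = 2.85e9  # assumption: "2.85 GB" means decimal gigabytes


def block_weight_params(hidden_dim: int, mlp_ratio: int = 4) -> int:
    """Weight count for one standard transformer block (assumed layout)."""
    attention = 4 * hidden_dim * hidden_dim        # Q, K, V and output projections
    mlp = 2 * mlp_ratio * hidden_dim * hidden_dim  # up- and down-projection
    return attention + mlp


head_dim = HIDDEN_DIM // HEADS
stack_params = LAYERS * block_weight_params(HIDDEN_DIM)

# Parameter count implied by the file size under two storage assumptions.
params_if_16bit = WEIGHT_FILE_BYTES / 2
params_if_32bit = WEIGHT_FILE_BYTES / 4

print(f"head dim:            {head_dim}")
print(f"stack weights:       {stack_params / 1e9:.2f}B")
print(f"file size at 16-bit: {params_if_16bit / 1e9:.2f}B params")
print(f"file size at 32-bit: {params_if_32bit / 1e9:.2f}B params")
```

Under these assumptions the layer stack alone comes to about 1.21B weights, and the file size at 16-bit precision corresponds to roughly 1.43B parameters, which is in the neighborhood of the stated 1.5B once embeddings are counted.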
## Training History
- Attempt 1: Failed (Zombie SSH port on first host).
- Attempt 2: Failed (Gated dataset access issues).
- Attempt 3: Switched to a "Raw Logic Stream" (bypassing the `datasets` library).
- Step 4150: Loss reached a major milestone: **0.3503**.
- Step 9500: Reaped logical structure, with loss down to **0.0038**.
## First Live Denoise Test
**Prompt:** `The reaper of code looks upon the logic and says: def process_data(x):`
**Result:** `The re arm of feel looks upon theSh and says: arm feel seas feel sufferxumer feel,,,,,,,,,,,,,,,,,,,,`
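**Conclusion:** The output shows clear diffusion artifacts: repeated tokens (`arm`, `feel`), a trailing run of commas where the function body should be, and prompt words that were themselves overwritten. The model did unmask the whole block in parallel, though the mixed Shakespeare/Python training has created a "Tortured Poet" logic gate.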
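
The parallel unmasking mentioned in the conclusion can be illustrated with a generic confidence-based sampler. This is a minimal sketch of one common approach, not DiffReaper's actual sampler: `toy_predict` is a hypothetical stand-in for the denoiser that returns random tokens and confidences, and `MASK`, `parallel_unmask`, and the toy vocabulary are names invented for the example.

```python
import random

MASK = "<mask>"  # hypothetical mask token for this sketch


def toy_predict(tokens, rng):
    """Stand-in denoiser: propose (token, confidence) for every masked slot."""
    vocab = ["return", "x", "+", "1", "for", "in"]
    return {
        i: (rng.choice(vocab), rng.random())
        for i, tok in enumerate(tokens)
        if tok == MASK
    }


def parallel_unmask(prompt, block_len, steps, predict=toy_predict, seed=0):
    """Fill a fully masked block over a few steps, several tokens per step."""
    rng = random.Random(seed)
    tokens = list(prompt) + [MASK] * block_len
    for step in range(steps):
        preds = predict(tokens, rng)  # predictions for all masked slots at once
        if not preds:
            break
        # Commit an even share of the remaining masks; the last step takes all.
        n_commit = -(-len(preds) // (steps - step))  # ceiling division
        most_confident = sorted(
            preds, key=lambda i: preds[i][1], reverse=True
        )[:n_commit]
        for i in most_confident:
            tokens[i] = preds[i][0]
    return tokens


prompt = ["def", "process_data", "(", "x", ")", ":"]
result = parallel_unmask(prompt, block_len=8, steps=4)
print(" ".join(result))
```

In this sketch the prompt tokens are held fixed and only the appended block is denoised, whereas the logged result above shows prompt words changing as well, so the real sampler evidently treats the prompt differently.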