# DiffReaper 3 Birth Log

- **Model:** DiffReaper 3 (1.5B-parameter dLLM)
- **Date:** 2026-01-27
- **Host:** Vast.ai (1x RTX 4090, France)
- **Training Time:** ~1 hour
- **Steps:** 10,000
## Architecture Spec

- **Layers:** 24
- **Hidden Dim:** 2048
- **Heads:** 16
- **Objective:** Mercury-style Discrete Diffusion
- **Payload:** 2.85 GB weight file (`pytorch_model.bin`)
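As a sanity check on the spec above, the weight count can be estimated from the layer count and hidden size. This is a minimal sketch, assuming a standard transformer block (four attention projections plus a 4x MLP) with biases and norms ignored; `estimate_params`, `ASSUMED_VOCAB`, the MLP ratio, and the 2-bytes-per-weight storage are all assumptions, since the log states none of them.

```python
def estimate_params(layers, hidden, vocab_size, mlp_ratio=4, tied_embeddings=True):
    """Rough weight count for a standard transformer stack (biases and norms ignored)."""
    attention = 4 * hidden * hidden          # Q, K, V and output projections
    mlp = 2 * mlp_ratio * hidden * hidden    # up- and down-projection
    blocks = layers * (attention + mlp)
    embeddings = vocab_size * hidden
    if not tied_embeddings:
        embeddings *= 2                      # separate output head
    return blocks + embeddings


# Layers and hidden dim come from the spec; the vocabulary size is an assumed value.
ASSUMED_VOCAB = 50_257

for tied in (True, False):
    n = estimate_params(24, 2048, ASSUMED_VOCAB, tied_embeddings=tied)
    # 2 bytes per weight corresponds to fp16/bf16 storage (also an assumption).
    print(f"tied={tied}: {n / 1e9:.2f}B params, {n * 2 / 1e9:.2f} GB at 2 bytes/param")
```

Under these assumptions the estimate comes to roughly 1.3B to 1.4B weights, or about 2.6 to 2.8 GB at 16-bit precision, which is in the same range as the 2.85 GB payload. The head count (16, so 128 dimensions per head) does not change the total.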
## Training History

- Attempt 1: Failed (zombie SSH port on the first host).
- Attempt 2: Failed (gated dataset access issues).
- Attempt 3: Switched to a "Raw Logic Stream" that bypasses the `datasets` library.
- Step 4150: Loss reached a major milestone: **0.3503**.
- Step 9500: Reaped logical structure; loss at **0.0038**.
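The log does not say how the loss values are computed. One common objective for masked (absorbing-state) discrete diffusion is a cross-entropy over masked positions weighted by 1/t, and the sketch below shows that swapped-in technique on toy data. It is not the DiffReaper training code and makes no claim about Mercury's internals; `MASK`, `corrupt`, `masked_diffusion_loss`, and `uniform_model` are hypothetical names.

```python
import math
import random

MASK = "<mask>"  # hypothetical mask token


def corrupt(tokens, t, rng):
    """Forward process: replace each token with MASK independently with probability t."""
    return [MASK if rng.random() < t else tok for tok in tokens]


def masked_diffusion_loss(tokens, predict_probs, rng):
    """One Monte Carlo sample of the 1/t-weighted masked cross-entropy, per token."""
    t = rng.uniform(0.05, 1.0)       # noise level; the lower bound keeps 1/t bounded
    noisy = corrupt(tokens, t, rng)
    probs = predict_probs(noisy)     # one {token: probability} dict per position
    nll = 0.0
    for target, seen, dist in zip(tokens, noisy, probs):
        if seen == MASK:             # only masked positions contribute
            nll -= math.log(dist[target])
    return nll / (t * len(tokens))


def uniform_model(noisy, vocab=("def", "x", ":", "return")):
    """Stand-in for the network: predicts a uniform distribution at every position."""
    return [{tok: 1.0 / len(vocab) for tok in vocab} for _ in noisy]


rng = random.Random(0)
sequence = ["def", "x", ":", "return", "x"]
losses = [masked_diffusion_loss(sequence, uniform_model, rng) for _ in range(2000)]
# For the uniform stand-in the expected loss is log(4), since E[masked count] = t * len.
print(sum(losses) / len(losses), math.log(4))
```

A trained model would replace `uniform_model`; a loss near zero under this kind of objective means the masked tokens are being predicted with near certainty on the training stream.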
## First Live Denoise Test

**Prompt:** `The reaper of code looks upon the logic and says: def process_data(x):`

**Result:** `The re arm of feel looks upon theSh and says: arm feel seas feel sufferxumer feel,,,,,,,,,,,,,,,,,,,,`

**Conclusion:** The model shows clear diffusion artifacts: several prompt tokens were overwritten and the block trails off into repeated commas. It successfully parallel-unmasked the block, though the mixed Shakespeare/Python training has created a "Tortured Poet" logic gate.
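For readers unfamiliar with parallel unmasking, here is a minimal sketch of one generic scheme: start from a fully masked block and, at each step, commit the most confident predictions for a share of the remaining masked positions. The sampler actually used for the test is not described in this log, so `parallel_unmask`, `toy_denoiser`, the `MASK` token, and the even per-step quota are all assumptions, and the denoiser is a random stand-in rather than a model.

```python
import random

MASK = "<mask>"  # hypothetical mask token


def toy_denoiser(tokens, rng):
    """Stand-in for the model: returns a (token, confidence) guess for every position."""
    vocab = ["def", "return", "x", ":", "(", ")"]
    return [(rng.choice(vocab), rng.random()) for _ in tokens]


def parallel_unmask(prompt, block_len, steps, denoiser, rng):
    """Fill a fully masked block in a fixed number of steps, most confident guesses first."""
    seq = list(prompt) + [MASK] * block_len
    for step in range(steps):
        masked = [i for i, tok in enumerate(seq) if tok == MASK]
        if not masked:
            break
        guesses = denoiser(seq, rng)
        # Commit an even share of the remaining masked positions on each step.
        quota = -(-len(masked) // (steps - step))  # ceiling division
        best = sorted(masked, key=lambda i: guesses[i][1], reverse=True)[:quota]
        for i in best:
            seq[i] = guesses[i][0]
    return seq


prompt = ["def", "process_data", "(", "x", ")", ":"]
rng = random.Random(0)
out = parallel_unmask(prompt, block_len=8, steps=4, denoiser=toy_denoiser, rng=rng)
print(out)
```

This sketch holds the prompt positions fixed and only writes into masked slots, unlike the logged result, where several prompt tokens also changed. Committed tokens are never revisited here, so an early low-quality guess stays in the output.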