DiffReaper 3 Birth Log
- Model: DiffReaper 3 (1.5B-parameter dLLM)
- Date: 2026-01-27
- Host: Vast.ai (1x RTX 4090, France)
- Training Time: ~1 hour
- Steps: 10,000
Architecture Spec
- Layers: 24
- Hidden Dim: 2048
- Heads: 16
- Objective: Mercury-style Discrete Diffusion
- Payload: 2.85 GB weight file (pytorch_model.bin)
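As a sanity check on the spec above, the parameter count and file size can be estimated from the layer count and hidden dimension alone. This is a rough sketch, not the actual model definition: the vocabulary size (50,257), the 4x feed-forward width, the embedding tying option, and 2 bytes per weight are all assumptions, since the log does not state them. Biases and normalization weights are ignored.

```python
def estimate_params(layers: int, hidden: int, vocab_size: int,
                    ffn_mult: int = 4, tied_embeddings: bool = True) -> int:
    """Rough parameter count for a standard transformer stack.

    Ignores biases and normalization weights, which are small by comparison.
    """
    attn = 4 * hidden * hidden             # Q, K, V and output projections
    ffn = 2 * ffn_mult * hidden * hidden   # up and down projections
    embed = vocab_size * hidden            # token embedding table
    head = 0 if tied_embeddings else embed # separate output head if untied
    return layers * (attn + ffn) + embed + head


# ASSUMPTION: the log does not give a vocabulary size; 50,257 is a placeholder.
ASSUMED_VOCAB = 50_257
BYTES_PER_WEIGHT = 2  # ASSUMPTION: 16-bit weights

for tied in (True, False):
    n = estimate_params(24, 2048, ASSUMED_VOCAB, tied_embeddings=tied)
    size_gb = n * BYTES_PER_WEIGHT / 1e9
    print(f"tied={tied}: {n / 1e9:.2f}B params, ~{size_gb:.2f} GB")
```

Under these assumptions the estimate lands between roughly 1.3B and 1.4B parameters, in the same range as the stated 1.5B and 2.85 GB. The gap may come from a larger vocabulary, extra diffusion-specific layers, or checkpoint overhead; the log does not say.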
Training History
- Attempt 1: Failed (Zombie SSH port on first host).
- Attempt 2: Failed (Gated dataset access issues).
- Attempt 3: Switched to a "Raw Logic Stream" (bypassing the datasets library).
- Step 4150: Loss hit a major milestone: 0.3503.
- Step 9500: Reaped logical structure: loss at 0.0038.
First Live Denoise Test
Prompt: The reaper of code looks upon the logic and says: def process_data(x):
Result: The re arm of feel looks upon theSh and says: arm feel seas feel sufferxumer feel,,,,,,,,,,,,,,,,,,,,
Conclusion: The model shows clear diffusion artifacts. It successfully unmasked the whole block in parallel, but the output is not yet coherent: the mixed Shakespeare/Python training has created a "Tortured Poet" logic gate.
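For readers unfamiliar with parallel unmasking, the idea can be sketched as a loop that predicts every masked position at once and commits only the most confident guesses each step. This is a generic confidence-based schedule of the kind used in masked discrete diffusion, not the sampler used for DiffReaper 3 or by Mercury. The names `parallel_unmask`, `toy_predict`, and `MASK` are hypothetical, and `toy_predict` is a stand-in for a real model forward pass.

```python
from typing import Callable, List, Tuple

MASK = "<mask>"  # hypothetical mask token

# A predictor takes the current sequence and a position, and returns
# (proposed_token, confidence). In a real dLLM this is one forward pass
# that scores all positions at once.
Predictor = Callable[[List[str], int], Tuple[str, float]]


def parallel_unmask(tokens: List[str], predict: Predictor, steps: int) -> List[str]:
    """Fill masked positions over `steps` rounds, most confident first."""
    tokens = list(tokens)
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        # Propose a token for every masked position in parallel.
        proposals = {i: predict(tokens, i) for i in masked}
        # Commit an even share of the remaining masks (ceil division),
        # so the final round always clears whatever is left.
        remaining_steps = steps - step
        k = -(-len(masked) // remaining_steps)
        best = sorted(masked, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            tokens[i] = proposals[i][0]
    return tokens


# Toy stand-in for a model: it "knows" the target and is more confident
# about earlier positions. Purely illustrative.
TARGET = ["def", "process_data", "(", "x", ")", ":"]


def toy_predict(tokens: List[str], i: int) -> Tuple[str, float]:
    return TARGET[i], 1.0 / (1 + i)


result = parallel_unmask([MASK] * len(TARGET), toy_predict, steps=3)
print(" ".join(result))
```

With a real model the proposals change between rounds, because each committed token gives the next forward pass more context. Repeated tokens and trailing comma runs like those in the result above are a typical symptom when those per-position predictions are made with too little context.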