
DiffReaper 3 Birth Log

  • Model: DiffReaper 3 (1.5B Parameter dLLM)
  • Date: 2026-01-27
  • Host: Vast.ai (1x RTX 4090, France)
  • Training Time: ~1 hour
  • Steps: 10,000

Architecture Spec

  • Layers: 24
  • Hidden Dim: 2048
  • Heads: 16
  • Objective: Mercury-style Discrete Diffusion
  • Payload: 2.85 GB weight file (pytorch_model.bin)
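
The spec above can be sanity-checked with some quick arithmetic. This is only a rough sketch: it assumes a standard transformer block (four attention projections plus a 4x-wide MLP, with biases and norms ignored) and 16-bit weights, neither of which the log confirms. The vocabulary size is not stated, so embeddings are left out.

```python
# Rough size check for the architecture spec.
# ASSUMPTIONS (not stated in the log): standard transformer block layout,
# 4x MLP expansion, 16-bit weights. Embedding size is unknown and omitted.
LAYERS = 24
HIDDEN = 2048
HEADS = 16

head_dim = HIDDEN // HEADS                  # width of each attention head
attn_params = 4 * HIDDEN * HIDDEN           # Q, K, V and output projections
mlp_params = 2 * HIDDEN * (4 * HIDDEN)      # up and down projections
block_params = LAYERS * (attn_params + mlp_params)


def weight_file_gb(n_params: int, bytes_per_param: int = 2) -> float:
    """Size in decimal GB of a checkpoint storing n_params weights."""
    return n_params * bytes_per_param / 1e9


print(f"head_dim = {head_dim}")
print(f"transformer blocks ~ {block_params / 1e9:.2f}B parameters")
print(f"1.5B params at 16-bit ~ {weight_file_gb(1_500_000_000):.2f} GB")
```

Under these assumptions the 24 blocks account for roughly 1.2B parameters, and a 2.85 GB file corresponds to about 1.4B 16-bit weights, which is in the neighborhood of the stated 1.5B once embeddings are included.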

Training History

  • Attempt 1: Failed (Zombie SSH port on first host).
  • Attempt 2: Failed (Gated dataset access issues).
  • Attempt 3: Switched to "Raw Logic Stream" (bypassing the datasets library).
  • Step 4150: Loss hit a major milestone: 0.3503.
  • Step 9500: Reaped logical structure: loss 0.0038.
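
The log does not say what the "Raw Logic Stream" consisted of. One plausible reading is a plain-text reader that feeds fixed-length token blocks to the trainer without a dataset library. The sketch below illustrates that idea only; `stream_blocks`, the byte-level tokenization, and the sample corpus are all hypothetical and not taken from the DiffReaper code.

```python
from typing import Iterable, Iterator, List


def stream_blocks(chunks: Iterable[str], block_size: int) -> Iterator[List[int]]:
    """Yield fixed-length blocks of byte-level token ids from raw text chunks."""
    buffer: List[int] = []
    for chunk in chunks:
        # ASSUMPTION: byte-level tokens (ids 0..255); the real tokenizer is unknown.
        buffer.extend(chunk.encode("utf-8"))
        while len(buffer) >= block_size:
            yield buffer[:block_size]
            buffer = buffer[block_size:]
    # Any leftover shorter than block_size is dropped.


# Hypothetical mixed corpus; in practice `chunks` could be lines of an open file.
corpus = ["To be, or not to be\n", "def process_data(x):\n", "    return x\n"]
blocks = list(stream_blocks(corpus, block_size=16))
print(len(blocks), "blocks of", len(blocks[0]), "tokens")
```

Because the function accepts any iterable of strings, it can consume an open file handle line by line and never needs the whole corpus in memory.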

First Live Denoise Test

Prompt: The reaper of code looks upon the logic and says: def process_data(x):

Result: The re arm of feel looks upon theSh and says: arm feel seas feel sufferxumer feel,,,,,,,,,,,,,,,,,,,,

Conclusion: The model shows clear diffusion artifacts: repeated tokens and a trailing run of commas. It did unmask the whole block in parallel, but the mixed Shakespeare/Python training has created a "Tortured Poet" logic gate.
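
For readers unfamiliar with parallel unmasking, here is a toy sketch of the general idea: start from a block of mask tokens, ask the denoiser for a guess at every position at once, and commit the most confident guesses over a few steps. This is a generic confidence-ordered schedule, not the DiffReaper or Mercury sampler; `parallel_unmask`, `toy_predict`, and the tiny vocabulary are hypothetical stand-ins.

```python
import random
from typing import Callable, List, Tuple

MASK = "<mask>"
rng = random.Random(0)  # fixed seed so the toy run is repeatable


def parallel_unmask(
    tokens: List[str],
    predict: Callable[[List[str]], List[Tuple[str, float]]],
    steps: int,
) -> List[str]:
    """Fill masked positions over several steps, most confident first."""
    tokens = list(tokens)
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        guesses = predict(tokens)  # one (token, confidence) pair per position
        # Commit an even share of the remaining masks at each remaining step.
        quota = -(-len(masked) // (steps - step))  # ceiling division
        masked.sort(key=lambda i: guesses[i][1], reverse=True)
        for i in masked[:quota]:
            tokens[i] = guesses[i][0]
    return tokens


def toy_predict(tokens: List[str]) -> List[Tuple[str, float]]:
    """Stand-in for the denoiser: random tokens with random confidence."""
    vocab = ["arm", "feel", "seas", ","]
    return [(rng.choice(vocab), rng.random()) for _ in tokens]


prompt = ["The", "reaper", "says", ":"]
block = prompt + [MASK] * 8
result = parallel_unmask(block, toy_predict, steps=4)
print(" ".join(result))
```

Only masked positions are ever overwritten, so the prompt tokens pass through unchanged. With a random stand-in predictor the output is noise, which is roughly what an undertrained denoiser produces as well.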