darwinkernelpanic/DiffReaper-5

+---
+language:
+- en
+license: openrail
+library_name: diffusers
+tags:
+- diffusion-llm
+- parallel-generation
+- custom-transformer
+- cropmark
+datasets:
+- OpenAssistant/oasst1
+metrics:
+- cosine_similarity
+---
+# 🪐 DiffReaper-5 (Cropmark v2)
+DiffReaper-5 is a **Conditioned Diffusion Large Language Model (DLLM)** designed for high-throughput, parallel conversational text generation. Unlike standard autoregressive (GPT-style) models, which emit one token at a time, DiffReaper-5 operates in a continuous embedding space and denoises an entire response sequence in parallel.
+## 🔬 Model Details
+- **Architecture:** Custom 12-layer Mercury-inspired Transformer.
+- **Task:** Conditioned Text Diffusion (Prompt-Response).
+- **Latent Space:** 1024-dimensional continuous embeddings.
+- **Training Objective:** Cosine Similarity Regression (Directional Loss).
+- **Sampling:** 10-step iterative parallel denoising.
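+The training objective and the sampling loop listed above can be illustrated with a small plain-Python sketch. It is not the repository's code: `make_toy_denoiser`, its fixed `rate`, and the 4-dimensional vectors are assumptions chosen for readability (the model itself uses 1024-dimensional embeddings and a learned 12-layer Transformer), and the loss is assumed to be the mean of `1 - cosine similarity` over sequence positions.
```python
import math
import random

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-8)

def directional_loss(pred, target):
    """Mean (1 - cosine similarity) over all sequence positions (assumed form)."""
    per_pos = [1.0 - cosine_similarity(p, t) for p, t in zip(pred, target)]
    return sum(per_pos) / len(per_pos)

def make_toy_denoiser(target, rate=0.5):
    """Hypothetical stand-in for the Transformer: moves every position
    a fixed fraction of the way toward a known target sequence."""
    def denoise(x, prompt, step):
        # This toy ignores the prompt and the step index; a real model
        # would condition on both.
        return [[xi + rate * (ti - xi) for xi, ti in zip(xv, tv)]
                for xv, tv in zip(x, target)]
    return denoise

def sample(denoise, prompt, seq_len, dim, num_steps=10, seed=0):
    """Start from Gaussian noise and refine all positions together."""
    rng = random.Random(seed)
    x = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(seq_len)]
    for step in range(num_steps):
        x = denoise(x, prompt, step)  # one call updates the whole sequence
    return x

# Toy target: three positions, four dimensions each.
target = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
out = sample(make_toy_denoiser(target), prompt=None, seq_len=3, dim=4)
final_loss = directional_loss(out, target)
```
+The point of the sketch is the control flow: the number of denoiser calls is fixed by `num_steps`, not by the length of the response, and every position is updated in each call.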
+## 🚀 Autonomous Training State
+This model is currently in **Autonomous Growth Mode**. It is training on an RTX 3090 cluster with the following parameters:
+- **Conditioning:** Hard-prompt conditioning (32 tokens).
+- **Generation Window:** 32 tokens (parallel).
+- **Optimizer:** AdamW with a learning rate of 1e-4.
+- **Sync:** Auto-checkpointing every 2,500 steps to this repository.
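+As a rough illustration of the optimizer and checkpoint cadence above, the following sketch applies AdamW to a single scalar parameter and records the steps at which a checkpoint would be due. Only the learning rate (1e-4) and the 2,500-step interval come from this card; `beta1`, `beta2`, `eps`, and `weight_decay` are assumed values (the usual PyTorch `AdamW` defaults), and the quadratic toy loss is purely hypothetical.
```python
import math

LEARNING_RATE = 1e-4      # from the parameter list above
CHECKPOINT_EVERY = 2500   # from the parameter list above

def adamw_step(param, grad, m, v, t, lr=LEARNING_RATE,
               beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar parameter.
    beta1, beta2, eps and weight_decay are assumptions, not card values."""
    param = param * (1.0 - lr * weight_decay)      # decoupled weight decay
    m = beta1 * m + (1.0 - beta1) * grad           # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad    # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)                 # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

def should_checkpoint(step):
    """True on every multiple of CHECKPOINT_EVERY (steps count from 1)."""
    return step > 0 and step % CHECKPOINT_EVERY == 0

# Toy run: minimise f(w) = w^2, whose gradient is 2w.
w, m, v = 1.0, 0.0, 0.0
checkpoints = []
for step in range(1, 5001):
    w, m, v = adamw_step(w, 2.0 * w, m, v, step)
    if should_checkpoint(step):
        checkpoints.append(step)  # a real run would save and upload here
```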
+## 🛠️ Intended Use
+DiffReaper-5 is intended for research into **Non-Autoregressive Generation**. Its primary strengths are:
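+1. **Speed:** All 32 response tokens are produced in a fixed number of denoising steps (10) rather than one forward pass per token, so decoding does not depend on an autoregressive KV cache.
+2. **Coherence:** The whole response is denoised jointly, so the objective targets global sequence structure rather than next-token probability.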
+## 📈 Diagnostic: Cropmark
+The model's progress is monitored via the **Cropmark Diagnostic**.
+- **Cropmark** tests the model's ability to reconstruct a target response (e.g., "I am good, how are you?") from pure Gaussian noise, given a fixed prompt.
+- Results are logged in `checkpoint_log.txt` and uploaded periodically.
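+This card does not specify how a Cropmark run is scored, so the sketch below is only one plausible reading: decode each generated embedding to its nearest vocabulary token by cosine similarity, then report the token accuracy and the mean cosine similarity against the expected response. The function names, the 2-dimensional toy vocabulary, and the report fields are all hypothetical.
```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / (norms + 1e-8)

def nearest_token(vec, vocab):
    """Return the token whose embedding is most similar to vec."""
    return max(vocab, key=lambda tok: cosine(vec, vocab[tok]))

def cropmark_score(generated, expected_tokens, vocab):
    """Compare denoised embeddings with the expected response (assumed scoring)."""
    decoded = [nearest_token(v, vocab) for v in generated]
    sims = [cosine(v, vocab[t]) for v, t in zip(generated, expected_tokens)]
    matches = sum(d == e for d, e in zip(decoded, expected_tokens))
    return {
        "decoded": decoded,
        "mean_cosine": sum(sims) / len(sims),
        "token_accuracy": matches / len(expected_tokens),
    }

# Hypothetical toy vocabulary and a slightly noisy "generated" sequence.
vocab = {"I": [1.0, 0.0], "am": [0.0, 1.0], "good": [-1.0, 0.0]}
generated = [[0.9, 0.1], [0.2, 0.8], [-0.7, 0.1]]
report = cropmark_score(generated, ["I", "am", "good"], vocab)
```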
+---
+*Created by Darwin (Oscar) & Clawd.*