---
language:
- en
license: openrail
library_name: diffusers
tags:
- diffusion-llm
- parallel-generation
- custom-transformer
- cropmark
datasets:
- OpenAssistant/oasst1
metrics:
- cosine_similarity
---
# DiffReaper-5 (Cropmark v2)

DiffReaper-5 is a conditioned Diffusion Large Language Model (DLLM) designed for high-throughput, parallel conversational text generation. Unlike standard autoregressive models (GPT-style), which emit one token at a time, DiffReaper-5 operates in a continuous latent embedding space, denoising the entire response sequence in parallel.
## Model Details
- Architecture: Custom 12-layer Mercury-inspired Transformer.
- Task: Conditioned Text Diffusion (Prompt-Response).
- Latent Space: 1024-dimensional continuous embeddings.
- Training Objective: Cosine Similarity Regression (Directional Loss).
- Sampling: 10-step iterative parallel denoising.
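The repository does not ship sampling code, but the listed dimensions pin down the shape of the procedure: start from Gaussian noise over the full 32-token window of 1024-dim latents and refine it over 10 parallel denoising steps. A minimal numpy sketch, where `denoise_step` is a hypothetical stand-in for the actual transformer's denoising pass:

```python
import numpy as np

SEQ_LEN, LATENT_DIM, STEPS = 32, 1024, 10  # values from the model card

def denoise_step(latents, prompt_cond, t):
    """Hypothetical stand-in for the transformer's conditioned denoising
    pass. A real model would predict clean latents from (latents, prompt, t);
    here we simply interpolate toward the conditioning signal."""
    return latents + (prompt_cond - latents) / (STEPS - t)

def generate(prompt_cond, rng):
    # Begin with pure Gaussian noise over the whole 32-token window.
    latents = rng.standard_normal((SEQ_LEN, LATENT_DIM))
    for t in range(STEPS):
        # Every token position is updated simultaneously: no KV cache,
        # no left-to-right dependency.
        latents = denoise_step(latents, prompt_cond, t)
    return latents

rng = np.random.default_rng(0)
target = rng.standard_normal((SEQ_LEN, LATENT_DIM))
out = generate(target, rng)
```

The key contrast with autoregressive decoding is that the loop runs over denoising steps (10 here), not over token positions (32), so wall-clock cost is independent of sequence length up to the window size.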
## Autonomous Training State
This model is currently in Autonomous Growth Mode. It is training on an RTX 3090 cluster with the following parameters:
- Conditioning: Hard-prompt conditioning (32 tokens).
- Generation Window: 32 tokens (parallel).
- Optimizer: AdamW with a learning rate of 1e-4.
- Sync: Auto-checkpointing every 2,500 steps to this repository.
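The training objective named above, cosine similarity regression, cares only about the direction of each predicted latent, not its magnitude. A plausible formulation (an assumption; the exact loss is not published) is one minus the mean per-token cosine similarity between predicted and target embeddings:

```python
import numpy as np

def directional_loss(pred, target, eps=1e-8):
    """Cosine-similarity regression: 1 - mean per-token cosine similarity
    between predicted and target embeddings. Magnitude is ignored; only
    the direction of each 1024-dim latent matters."""
    pred_n = pred / (np.linalg.norm(pred, axis=-1, keepdims=True) + eps)
    targ_n = target / (np.linalg.norm(target, axis=-1, keepdims=True) + eps)
    cos = np.sum(pred_n * targ_n, axis=-1)  # shape: (seq_len,)
    return 1.0 - cos.mean()  # 0 when aligned, up to 2 when anti-aligned

x = np.random.default_rng(1).standard_normal((32, 1024))
```

Under this formulation the loss is 0 for perfectly aligned predictions and 2 for diametrically opposed ones, which matches the card's `cosine_similarity` metric.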
## Intended Use
DiffReaper-5 is intended for research into Non-Autoregressive Generation. Its primary strengths are:
- Speed: Parallel token generation eliminates the KV-cache bottleneck.
- Coherence: Focuses on global sequence structure rather than next-token probability.
## Diagnostic: Cropmark
The model's progress is monitored via the Cropmark Diagnostic.
- Cropmark tests the model's ability to manifest a response (e.g., "I am good, how are you?") from pure Gaussian noise given a fixed prompt.
- Results are logged in `checkpoint_log.txt` and uploaded periodically.
Created by Darwin (Oscar) & Clawd.