---
language:
  - en
license: openrail
library_name: diffusers
tags:
  - diffusion-llm
  - parallel-generation
  - custom-transformer
  - cropmark
datasets:
  - OpenAssistant/oasst1
metrics:
  - cosine_similarity
---

πŸͺ DiffReaper-5 (Cropmark v2)

DiffReaper-5 is a Conditioned Diffusion Large Language Model (DLLM) designed for high-throughput, parallel conversational text generation. Unlike standard autoregressive (GPT-style) models, which emit tokens one at a time, DiffReaper-5 operates in a continuous latent embedding space, denoising an entire response sequence in parallel.

## πŸ”¬ Model Details

- **Architecture:** Custom 12-layer Mercury-inspired Transformer.
- **Task:** Conditioned Text Diffusion (Prompt-Response).
- **Latent Space:** 1024-dimensional continuous embeddings.
- **Training Objective:** Cosine Similarity Regression (Directional Loss).
- **Sampling:** 10-step iterative parallel denoising.
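The sampling procedure above can be sketched as a short loop. This is a minimal illustration only: the `model` interface, the linear noise schedule, and the blending rule are assumptions for the sketch, not DiffReaper-5's actual API.

```python
import torch

def parallel_denoise(model, prompt_emb, seq_len=32, dim=1024, steps=10):
    """Iteratively denoise a full response sequence in parallel.

    `model` is assumed to map (noisy_latents, prompt_emb, step) to
    predicted clean latents; the linear schedule below is illustrative.
    """
    x = torch.randn(seq_len, dim)  # start from pure Gaussian noise
    for t in reversed(range(steps)):
        pred = model(x, prompt_emb, t)      # predict clean latents for all positions at once
        alpha = t / steps                   # assumed linear noise schedule
        x = alpha * x + (1 - alpha) * pred  # blend current latents toward the prediction
    return x  # final latents; token decoding happens elsewhere

# Toy stand-in denoiser: always predicts all-zero latents.
toy = lambda x, cond, t: torch.zeros_like(x)
out = parallel_denoise(toy, torch.zeros(32, 1024))
```

Note that every position in the 32-token window is updated at each of the 10 steps, which is what makes the generation parallel rather than autoregressive.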

## πŸš€ Autonomous Training State

This model is currently in Autonomous Growth Mode. It is training on an RTX 3090 cluster with the following parameters:

- **Conditioning:** Hard-prompt conditioning (32 tokens).
- **Generation Window:** 32 tokens (parallel).
- **Optimizer:** AdamW with a learning rate of 1e-4.
- **Sync:** Auto-checkpointing every 2,500 steps to this repository.
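A minimal sketch of the training setup described above, combining the AdamW configuration with the cosine-similarity ("directional") objective from the Model Details. The stand-in `Linear` network and the random latents are placeholders, not the real 12-layer transformer or data pipeline.

```python
import torch
import torch.nn.functional as F

# Stand-in network; the real model is a custom 12-layer transformer.
model = torch.nn.Linear(1024, 1024)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

CHECKPOINT_EVERY = 2500  # auto-checkpoint cadence from this README

def directional_loss(pred, target):
    """1 - mean cosine similarity: penalizes directional mismatch
    between predicted and target latent vectors."""
    return 1.0 - F.cosine_similarity(pred, target, dim=-1).mean()

# One illustrative optimization step on random latents.
latents = torch.randn(4, 1024)
target = torch.randn(4, 1024)
loss = directional_loss(model(latents), target)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Because the loss depends only on the angle between predicted and target embeddings, it regresses the direction of each latent rather than its magnitude.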

πŸ› οΈ Intended Use

DiffReaper-5 is intended for research into Non-Autoregressive Generation. Its primary strengths are:

1. **Speed:** Parallel token generation eliminates the KV-cache bottleneck.
2. **Coherence:** Focuses on global sequence structure rather than next-token probability.

## πŸ“ˆ Diagnostic: Cropmark

The model's progress is monitored via the Cropmark Diagnostic.

- Cropmark tests the model's ability to manifest a response (e.g., "I am good, how are you?") from pure Gaussian noise given a fixed prompt.
- Results are logged in `checkpoint_log.txt` and uploaded periodically.
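One plausible way to score such a diagnostic is mean cosine similarity between the generated latents and a reference encoding of the expected response. This is a hypothetical scoring function for illustration; the actual Cropmark metric and the format of `checkpoint_log.txt` are not specified here.

```python
import torch
import torch.nn.functional as F

def cropmark_score(generated, reference):
    """Mean per-position cosine similarity between generated latents
    and the reference response latents (hypothetical scoring rule)."""
    return F.cosine_similarity(generated, reference, dim=-1).mean().item()

# Sanity check: identical latents should score a perfect 1.0.
ref = torch.randn(32, 1024)
score = cropmark_score(ref, ref)
```

A rising score across checkpoints would indicate the model is learning to recover the target response from noise.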

Created by Darwin (Oscar) & Clawd.