---
language:
- id
- en
license: mit
library_name: pytorch
tags:
- diffusion
- llm
- aam
- graph-conditioned
- sentence-arrangement
- evoformer
- anchored-decoding
- flow-matching
- dual-memory
- matryoshka
- swiglu
- rope
- mcts
- thinking-toggle
---
| |
# AAM Diffusion LLM v2.0: Upgraded from Losion

## Overview
AphantasicAbstractionModel (AAM) is a **specialized sentence composer**, not a general-purpose LLM. It takes structured graph data (evidence, anomalies, reasoning chains) as input and produces coherent, evidence-backed narrative output through iterative denoising.
**AAM = 1 Mind + 1 Body**
- **Mind** = the RSVS Knowledge Graph (structural, relational memory)
- **Body** = this Diffusion LLM (generates natural language FROM the graph)
|
## v2.0 Upgrade from Losion

This version incorporates 14 modules extracted from the [Losion](https://github.com/Wolfvin/Losion) architecture:
### Tier 1: Critical Upgrades

| Module | Description | Impact |
|--------|-------------|--------|
| **Anchored Diffusion Decoder** | 2-3 step refinement instead of 50+ steps from pure noise | 10-20x speedup |
| **Flow Matching Decoder** | Velocity-based alternative to DDPM/DDIM | Faster, better inference |
| **Evoformer Feedback** | 4-level bidirectional feedback (layer/token/decoder/prediction) | Quality leap |
| **Dual Memory System** | Working memory + long-term memory for coherent generation | Persistent context |
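The Flow Matching Decoder's idea can be pictured as integrating a learned velocity field from noise toward data with a handful of Euler steps. A minimal sketch, where `velocity_fn` is a toy stand-in for the trained diffusion transformer, not the repo's actual network:

```python
import torch

def flow_matching_sample(velocity_fn, shape, n_steps=3):
    """Integrate a velocity field from noise (t=0) toward data (t=1)
    with a few Euler steps -- the core of a flow-matching decoder."""
    torch.manual_seed(0)                      # deterministic toy run
    x = torch.randn(shape)                    # start from Gaussian noise
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((shape[0],), i * dt)   # current time, per sample
        x = x + dt * velocity_fn(x, t)        # Euler step: x += v(x, t) * dt
    return x

# Toy velocity field pulling every sample toward a fixed target embedding;
# a trained model would predict v(x, t) instead.
target = torch.ones(1, 8)
out = flow_matching_sample(lambda x, t: target - x, (4, 8), n_steps=3)
```

With only 2-3 steps, the quality hinges entirely on how straight the learned velocity field's trajectories are, which is what flow-matching training optimizes for.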
### Tier 2: Training & Reasoning

| Module | Description |
|--------|-------------|
| **MCTS Reasoning Engine** | AlphaZero-style tree search for narrative arrangement |
| **Thinking Toggle** | Adaptive compute: simple queries take 2 steps, complex queries take 5+ |
| **Matryoshka Elastic** | One training run → multiple deployment sizes |
| **GRPO Training** | Group Relative Policy Optimization (no value function) |
| **DAPO Training** | Decoupled clip + dynamic sampling + token-level loss |
| **Curriculum Learning** | 4-phase: single-evidence → multi-evidence → reasoning → RL |
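GRPO's key move is replacing the learned value function with a group-relative baseline: each sampled completion's reward is standardized against the other completions in its group. A minimal sketch of the advantage computation (the reward values are made up for illustration):

```python
def grpo_advantages(rewards):
    """Group Relative Policy Optimization advantage: each completion's
    reward is standardized against the group mean -- no critic network."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    if std == 0:                       # identical rewards: no learning signal
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Four candidate narratives sampled for the same graph, scored by a reward model:
adv = grpo_advantages([0.2, 0.8, 0.5, 0.5])
```

These advantages then plug into a clipped policy-gradient objective in place of critic-estimated advantages, which is where DAPO's decoupled clipping and token-level loss apply.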
### Tier 3: Architecture Improvements

| Module | Description |
|--------|-------------|
| **SwiGLU FFN** | Replaced GELU with SwiGLU (proven in LLaMA/Mistral) |
| **RoPE** | Rotary Position Embedding for length generalization |
| **Speculative Decoder** | Draft model (graph encoder) + verify (diffusion model) |
| **Quantization** | BitNet 1-bit + FP8 weight-only quantization stubs |
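The SwiGLU FFN follows the standard LLaMA-style formulation: gate the up-projection with a SiLU-activated parallel projection. A self-contained PyTorch sketch (the layer names are illustrative, not this repo's):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLUFFN(nn.Module):
    """SwiGLU feed-forward block: out = W_down( SiLU(W_gate x) * W_up x )."""
    def __init__(self, d_model, d_hidden):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_hidden, bias=False)
        self.w_up = nn.Linear(d_model, d_hidden, bias=False)
        self.w_down = nn.Linear(d_hidden, d_model, bias=False)

    def forward(self, x):
        # Element-wise gating of the up-projection by the SiLU-activated gate
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))

ffn = SwiGLUFFN(d_model=128, d_hidden=342)   # ~ (8/3) * d_model, the usual SwiGLU sizing
y = ffn(torch.randn(2, 16, 128))             # (batch, seq, d_model) in and out
```

Because SwiGLU uses three weight matrices instead of GELU's two, the hidden width is conventionally shrunk to ~8/3 of `d_model` to keep the parameter count comparable.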
## Architecture

```
INPUT: Graph Conditioning (RSVS Knowledge Graph)
        ↓
Graph Encoder (+ Dual Memory) → cross-attention keys/values
        ↓
Diffusion Transformer (SwiGLU + RoPE + Matryoshka)
  ├─ N × TransformerBlock: AdaLN + Self-Attn + Cross-Attn + SwiGLU FFN
  └─ Evoformer Feedback: Layer + Token + Decoder + Prediction recycling
        ↓
OUTPUT PIPELINE:
  ├─ Anchored Diffusion Decoder (2-3 steps, default)
  ├─ Flow Matching Decoder (2-3 steps, alternative)
  └─ Legacy DDPM/DDIM (backward compatible)
        ↓
INFERENCE CONTROLLER:
  ├─ Thinking Toggle (adaptive compute)
  ├─ MCTS Reasoning (complex queries)
  └─ Matryoshka (select submodel size)
```
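The inference controller's Thinking Toggle can be pictured as a complexity gate in front of the decoders. The heuristic below is purely illustrative: the complexity measure and thresholds are assumptions, not this repo's actual policy.

```python
def choose_refinement_steps(n_evidence_nodes, n_anomalies, thinking=True):
    """Thinking Toggle sketch: easy graphs take the fast 2-step anchored
    path; complex or anomalous graphs buy extra denoising steps.
    (Hypothetical complexity score and thresholds, for illustration only.)"""
    if not thinking:
        return 2                                  # toggle off: always fast path
    complexity = n_evidence_nodes + 2 * n_anomalies
    if complexity <= 5:
        return 2                                  # simple query: minimum refinement
    return min(2 + complexity // 5, 8)            # scale up, but cap the compute
```

The same gate could also route genuinely hard queries to the MCTS reasoning engine instead of just adding steps.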
## Training Pipeline

1. **Phase 1**: Single-evidence simple narratives (25% of budget)
2. **Phase 2**: Multi-evidence narratives (30% of budget)
3. **Phase 3**: Complex reasoning + anomaly resolution (30% of budget)
4. **Phase 4**: GRPO/DAPO RL fine-tuning (15% of budget)
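The phase budgets above translate directly into step counts once a total budget is fixed. A small helper, assuming a step-based budget (the phase names are illustrative):

```python
def curriculum_schedule(total_steps):
    """Split the training budget across the 4 curriculum phases
    using the 25/30/30/15 split listed above."""
    fractions = [
        ("phase1_single_evidence", 0.25),
        ("phase2_multi_evidence", 0.30),
        ("phase3_reasoning", 0.30),
        ("phase4_rl_finetune", 0.15),
    ]
    return {name: round(total_steps * f) for name, f in fractions}

schedule = curriculum_schedule(100_000)
```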
## Model Details

| Attribute | Value |
|-----------|-------|
| Parameters | ~5.5M (demo) |
| d_model | 128 |
| n_layers | 4 |
| n_heads | 4 |
| Vocab size | 2000 |
| Diffusion steps | 200 (train) / 20 (inference) |
| Anchored refinement | 2-3 steps |
## Usage

```python
from diffusion_llm import AamDiffusionModel, AamDiffusionConfig

# Load config and model weights
config = AamDiffusionConfig.from_json("config.json")
model = AamDiffusionModel.load("pytorch_model.bin")

# Encode the RSVS graph into conditioning vectors
# (evidence_ids and confidence come from your knowledge graph)
graph_cond = model.graph_encoder(
    evidence_ids=evidence_ids,
    evidence_confidence=confidence,
)

# Generate with anchored decoding (2-3 refinement steps)
result = model.sample(graph_cond, method="anchored", n_steps=3)
tokens = model.embeddings_to_tokens(result)
```