Wolfvin / aam-diffusion-v1

	---
	language:
	- id
	- en
	license: mit
	library_name: pytorch
	tags:
	- diffusion
	- text-generation
	- aam
	- aphantasic-abstraction-model
	- sentence-arrangement
	- graph-conditioned
	---

	# AAM Diffusion LLM v1.0

	> "AAM = 1 Pikiran + 1 Tubuh" (1 Mind + 1 Body)

	The dedicated "body" of the Aphantasic Abstraction Model (AAM): a small diffusion LLM trained specifically to arrange sentences from structured graph data.

	## What is this?

	This is NOT a general-purpose LLM. This is a SPECIALIZED sentence composer that:
	- Takes graph-structured conditioning as input (evidence, anomalies, reasoning chains, confidence scores)
	- Produces coherent natural language narratives through iterative denoising
	- Is designed not to hallucinate: it is trained to narrate only what the graph conditioning contains

	## Architecture

	```
	Graph Conditioning Encoder → Diffusion Transformer → Noise Scheduler
	(Mind input)                 (The Body)              (Iterative refinement)
	```

	### Key Components
	- Graph Conditioning Encoder: Encodes evidence nodes, compositions, anomalies, reasoning chains with confidence and temporal embeddings
	- Diffusion Transformer: Core denoising network with adaptive layer norm, self-attention, and cross-attention to graph conditioning
	- Noise Scheduler: Cosine noise schedule with DDPM/DDIM sampling support
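As a rough illustration of the noise scheduler component, the sketch below computes a cosine schedule in the commonly used form (cumulative signal level `alpha_bar` following a squared cosine, with betas clipped near the end). The function names, the offset `s = 0.008`, and the clip value `0.999` are assumptions chosen for illustration; the actual implementation in this repository may differ.

```python
import math
from typing import List


def cosine_alpha_bar(t: int, num_steps: int, s: float = 0.008) -> float:
    """Cumulative signal level alpha_bar(t), normalized so alpha_bar(0) == 1."""
    f_t = math.cos(((t / num_steps + s) / (1 + s)) * math.pi / 2) ** 2
    f_0 = math.cos((s / (1 + s)) * math.pi / 2) ** 2
    return f_t / f_0


def cosine_betas(num_steps: int, max_beta: float = 0.999) -> List[float]:
    """Per-step noise variances derived from consecutive alpha_bar ratios."""
    betas = []
    for t in range(num_steps):
        ratio = cosine_alpha_bar(t + 1, num_steps) / cosine_alpha_bar(t, num_steps)
        betas.append(min(1.0 - ratio, max_beta))  # clip to avoid a singular last step
    return betas


# 200 steps is an assumed value for the training timestep count.
betas = cosine_betas(200)

# Cumulative product of (1 - beta) gives the signal level after each step.
alpha_bars = []
running = 1.0
for beta in betas:
    running *= 1.0 - beta
    alpha_bars.append(running)
```

The signal level starts close to 1 (almost clean data) and decays smoothly toward 0 (almost pure noise), which is the property that distinguishes the cosine schedule from a linear one.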

	## Model Details

	| Parameter | Value |
	|-----------|-------|
	| Architecture | Diffusion Transformer |
	| d_model | 256 |
	| n_layers | 4 |
	| n_heads | 4 |
	| d_ff | 1024 |
	| Parameters | 73.4M |
	| Vocab size | 576 |
	| Max sequence length | 128 |
	| Diffusion timesteps (train) | 200 |
	| Diffusion timesteps (inference) | 20 |
	| Noise schedule | cosine |
	| Prediction type | epsilon |
	| Sampling method | ddim |
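The table lists 200 training timesteps, 20 inference timesteps, epsilon prediction, and DDIM sampling. A minimal sketch of how these settings usually fit together is shown below: inference visits an evenly strided subset of the training timesteps, and each deterministic DDIM update (eta = 0) first recovers an estimate of the clean sample from the predicted noise. The helper names and the `alpha_bar` values in the demo are hypothetical and are not taken from this repository.

```python
import math
import random
from typing import List


def ddim_timesteps(train_steps: int, infer_steps: int) -> List[int]:
    """Evenly strided subset of the training timesteps, highest first."""
    stride = train_steps // infer_steps
    return list(range(train_steps - 1, -1, -stride))


def ddim_step(x_t: List[float], eps_pred: List[float],
              ab_t: float, ab_prev: float) -> List[float]:
    """One deterministic DDIM update for an epsilon-prediction model."""
    x_prev = []
    for x, eps in zip(x_t, eps_pred):
        # Estimate the clean sample from the noisy one and the predicted noise.
        x0_pred = (x - math.sqrt(1.0 - ab_t) * eps) / math.sqrt(ab_t)
        # Re-noise the estimate to the (lower) noise level of the previous step.
        x_prev.append(math.sqrt(ab_prev) * x0_pred + math.sqrt(1.0 - ab_prev) * eps)
    return x_prev


steps = ddim_timesteps(200, 20)

# Demo: with a perfect noise prediction, stepping to alpha_bar = 1 recovers x0.
rng = random.Random(0)
x0 = [0.5, -1.0, 2.0]
eps = [rng.gauss(0.0, 1.0) for _ in x0]
ab_t = 0.3  # illustrative signal level, not a value from the model
x_t = [math.sqrt(ab_t) * a + math.sqrt(1.0 - ab_t) * e for a, e in zip(x0, eps)]
recovered = ddim_step(x_t, eps, ab_t, ab_prev=1.0)
```

In a real sampler the predicted noise would come from the Diffusion Transformer, conditioned on the graph encoding, and the update would operate on tensors rather than Python lists.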

	## Usage

	```python
	from diffusion_llm import AamDiffusionModel, AamTokenizer, AamGenerator, AamDiffusionConfig

	# Load model
	config = AamDiffusionConfig.from_json("config.json")
	model = AamDiffusionModel.load("model.pt")
	tokenizer = AamTokenizer.load("tokenizer.json")

	# Create generator
	generator = AamGenerator(model, tokenizer, config)

	# Generate narrative from graph conditioning
	result = generator.generate(
	    trigger="Siapa yang mencuri Snow Plum Pill?",  # "Who stole the Snow Plum Pill?"
	    evidence_nodes=["Hefei", "Diancang Five Swords", "Ju Jangmok"],
	    anomalies=["Tidak ada konsumsi pil baru di pasar gelap"],  # "No new pill consumption on the black market"
	    reasoning_steps=["Cross-reference tanggal kejadian"],  # "Cross-reference the date of the incident"
	    source_trust=0.85,
	)
	print(result.narrative)
	```

	## Philosophy

	AAM = 1 Mind + 1 Body

	- Mind = RSVS Knowledge Graph (structural memory, perfect recall, relational understanding)
	- Body = This Diffusion LLM (sentence arranger, graph-conditioned, anti-hallucination)

	Rather than renting a general-purpose LLM (GPT, Claude) as the "body", AAM uses this model, which is trained specifically for it:
	- It is trained to generate only information present in the graph conditioning
	- It arranges sentences based on structured evidence
	- It uses diffusion (non-sequential generation) instead of autoregressive generation
	- It is small (73.4M) but specialized

	## Training

	Trained on synthetic Graph→Narrative pairs with:
	- Indonesian and English narrative templates
	- Evidence nodes, anomalies, reasoning chains
	- Confidence score distributions
	- Source trust scores
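To make the idea of a synthetic Graph→Narrative pair concrete, here is a minimal sketch that fills a narrative template from graph fields. The template wording, the `make_pair` helper, and the dictionary layout are all hypothetical; the actual templates and data format used for training are not described in this card.

```python
from typing import Dict, List, Tuple

# Hypothetical templates, one per language; real training templates may differ.
TEMPLATES = {
    "en": 'Regarding "{trigger}": the evidence points to {evidence}. '
          "Anomaly: {anomaly}. Source trust: {trust:.2f}.",
    "id": 'Terkait "{trigger}": bukti mengarah ke {evidence}. '
          "Anomali: {anomaly}. Kepercayaan sumber: {trust:.2f}.",
}


def make_pair(trigger: str, evidence_nodes: List[str], anomalies: List[str],
              source_trust: float, lang: str = "en") -> Tuple[Dict, str]:
    """Build one (graph, narrative) training pair from a template."""
    graph = {
        "trigger": trigger,
        "evidence_nodes": evidence_nodes,
        "anomalies": anomalies,
        "source_trust": source_trust,
    }
    narrative = TEMPLATES[lang].format(
        trigger=trigger,
        evidence=", ".join(evidence_nodes),
        anomaly="; ".join(anomalies),
        trust=source_trust,
    )
    return graph, narrative


graph, narrative = make_pair(
    trigger="Siapa yang mencuri Snow Plum Pill?",
    evidence_nodes=["Hefei", "Diancang Five Swords", "Ju Jangmok"],
    anomalies=["Tidak ada konsumsi pil baru di pasar gelap"],
    source_trust=0.85,
    lang="id",
)
print(narrative)
```

Because the narrative is assembled only from graph fields, every fact in the target text is also present in the conditioning, which matches the stated goal of narrating only what the graph knows.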

	## License

	MIT