---
language:
- id
- en
license: mit
library_name: pytorch
tags:
- diffusion
- text-generation
- aam
- aphantasic-abstraction-model
- sentence-arrangement
- graph-conditioned
---

# AAM Diffusion LLM v1.0

> **"AAM = 1 Pikiran + 1 Tubuh" (1 Mind + 1 Body)**

This model is the dedicated "body" of the Aphantasic Abstraction Model (AAM): a small diffusion LLM trained specifically to arrange sentences from structured graph data.

## What is this?

This is NOT a general-purpose LLM. It is a SPECIALIZED sentence composer that:
- Takes **graph-structured conditioning** as input (evidence, anomalies, reasoning chains, confidence scores)
- Produces **coherent natural language narratives** through iterative denoising
- **Is designed not to hallucinate**: it is trained to narrate only what the graph knows

## Architecture

```
Graph Conditioning Encoder → Diffusion Transformer → Noise Scheduler
         (Mind input)           (The Body)          (Iterative refinement)
```

### Key Components
- **Graph Conditioning Encoder**: Encodes evidence nodes, compositions, anomalies, and reasoning chains, together with confidence and temporal embeddings
- **Diffusion Transformer**: Core denoising network with adaptive layer norm, self-attention, and cross-attention to graph conditioning
- **Noise Scheduler**: Cosine noise schedule with DDPM/DDIM sampling support
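
The cosine noise schedule named above can be sketched as follows. This is a minimal illustration in plain Python, assuming the widely used cosine formulation with a small offset `s = 0.008` and a clipping value of `0.999`; the function names, the offset, and the clipping value are assumptions and are not taken from this repository.

```python
import math

def cosine_alpha_bar(t: int, num_steps: int, s: float = 0.008) -> float:
    """Cumulative signal level at step t, normalised so that alpha_bar(0) == 1."""
    def f(u: float) -> float:
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)

def cosine_betas(num_steps: int, max_beta: float = 0.999) -> list[float]:
    """Per-step noise variances derived from consecutive alpha_bar ratios."""
    betas = []
    for t in range(num_steps):
        ratio = cosine_alpha_bar(t + 1, num_steps) / cosine_alpha_bar(t, num_steps)
        # Clip so the final steps never destroy the signal completely.
        betas.append(min(1.0 - ratio, max_beta))
    return betas

# 200 training timesteps, matching the value listed in the model details.
betas = cosine_betas(200)
```

The schedule adds very little noise in the first steps and saturates at the clipping value in the last one, which is the usual motivation for choosing it over a linear schedule.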

## Model Details

| Parameter | Value |
|-----------|-------|
| Architecture | Diffusion Transformer |
| d_model | 256 |
| n_layers | 4 |
| n_heads | 4 |
| d_ff | 1024 |
| Parameters | 73.4M |
| Vocab size | 576 |
| Max sequence length | 128 |
| Diffusion timesteps (train) | 200 |
| Diffusion timesteps (inference) | 20 |
| Noise schedule | cosine |
| Prediction type | epsilon |
| Sampling method | ddim |

## Usage

```python
from diffusion_llm import AamDiffusionModel, AamTokenizer, AamGenerator, AamDiffusionConfig

# Load model
config = AamDiffusionConfig.from_json("config.json")
model = AamDiffusionModel.load("model.pt")
tokenizer = AamTokenizer.load("tokenizer.json")

# Create generator
generator = AamGenerator(model, tokenizer, config)

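# Generate narrative from graph conditioning
# (the inputs below are Indonesian; English glosses are given in the comments)
result = generator.generate(
    trigger="Siapa yang mencuri Snow Plum Pill?",  # "Who stole the Snow Plum Pill?"
    evidence_nodes=["Hefei", "Diancang Five Swords", "Ju Jangmok"],
    anomalies=["Tidak ada konsumsi pil baru di pasar gelap"],  # "No new pill consumption on the black market"
    reasoning_steps=["Cross-reference tanggal kejadian"],  # "Cross-reference the incident dates"
    source_trust=0.85,
)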
print(result.narrative)
```

## Philosophy

**AAM = 1 Mind + 1 Body**

- **Mind** = RSVS Knowledge Graph (structural memory, perfect recall, relational understanding)
- **Body** = This Diffusion LLM (sentence arranger, graph-conditioned, anti-hallucination)

Unlike a rented general-purpose LLM (GPT, Claude) used as the "body", this model is trained specifically for AAM:
- It is trained to narrate only information present in the graph conditioning
- It arranges sentences based on structured evidence
- It uses diffusion (non-sequential generation) instead of autoregressive generation
- It is small (73.4M parameters) but specialized

## Training

The model was trained on synthetic Graph→Narrative pairs that include:
- Indonesian and English narrative templates
- Evidence nodes, anomalies, reasoning chains
- Confidence score distributions
- Source trust scores
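
To make the shape of such a pair concrete, here is a hypothetical container whose fields mirror the arguments of `generator.generate` in the usage example. The class name, the `language` field, the validation rules, and the sample narrative are all assumptions made for illustration; the actual training data format is not documented here.

```python
from dataclasses import dataclass, field

@dataclass
class GraphNarrativePair:
    """Hypothetical container for one synthetic Graph→Narrative training pair."""
    trigger: str
    evidence_nodes: list[str]
    narrative: str  # target text the model should learn to produce
    anomalies: list[str] = field(default_factory=list)
    reasoning_steps: list[str] = field(default_factory=list)
    source_trust: float = 1.0
    language: str = "id"  # "id" (Indonesian) or "en" (English)

    def __post_init__(self) -> None:
        if not 0.0 <= self.source_trust <= 1.0:
            raise ValueError("source_trust must lie in [0, 1]")
        if self.language not in ("id", "en"):
            raise ValueError("language must be 'id' or 'en'")

    def unmentioned_evidence(self) -> list[str]:
        """Evidence nodes whose names never appear in the target narrative."""
        return [n for n in self.evidence_nodes if n not in self.narrative]

# Illustrative sample only; the narrative text is invented for this sketch.
pair = GraphNarrativePair(
    trigger="Who stole the Snow Plum Pill?",
    evidence_nodes=["Hefei", "Ju Jangmok"],
    narrative="The trail in Hefei points to Ju Jangmok.",
    anomalies=["No new pill consumption on the black market"],
    source_trust=0.85,
    language="en",
)
```

A check like `unmentioned_evidence` is one simple way a synthetic data generator could confirm that each target narrative stays tied to its graph input.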

## License

MIT