---
language:
- id
- en
license: mit
library_name: pytorch
tags:
- diffusion
- text-generation
- aam
- aphantasic-abstraction-model
- sentence-arrangement
- graph-conditioned
---

# AAM Diffusion LLM v1.0

> **"AAM = 1 Pikiran + 1 Tubuh" (1 Mind + 1 Body)**

The dedicated "body" of the Aphantasic Abstraction Model (AAM): a small diffusion LLM trained specifically to arrange sentences from structured graph data.

## What is this?

This is NOT a general-purpose LLM. It is a SPECIALIZED sentence composer that:
- Takes **graph-structured conditioning** as input (evidence, anomalies, reasoning chains, confidence scores)
- Produces **coherent natural language narratives** through iterative denoising
- **Cannot hallucinate**: it can only narrate what the graph knows
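
To make the input side concrete, here is a minimal sketch of how graph conditioning could be held in a plain data structure. The `GraphConditioning` class, its field names, and the `[0, 1]` range for trust are assumptions for illustration; they are not the library's actual types.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class GraphConditioning:
    """Hypothetical container for graph-derived inputs (not the library's API)."""
    trigger: str
    evidence_nodes: List[str] = field(default_factory=list)
    anomalies: List[str] = field(default_factory=list)
    reasoning_steps: List[str] = field(default_factory=list)
    source_trust: float = 1.0

    def __post_init__(self) -> None:
        # Assumption: trust is a probability-like score in [0, 1].
        if not 0.0 <= self.source_trust <= 1.0:
            raise ValueError("source_trust must be in [0, 1]")

    def known_terms(self) -> Set[str]:
        """Every string the narrative is allowed to draw on."""
        return set(self.evidence_nodes) | set(self.anomalies) | set(self.reasoning_steps)


cond = GraphConditioning(
    trigger="Who stole the Snow Plum Pill?",
    evidence_nodes=["Hefei", "Diancang Five Swords"],
    anomalies=["No new pill consumption on the black market"],
    source_trust=0.85,
)
print(sorted(cond.known_terms()))
```

Keeping all conditioning in one object makes it easy to check, after generation, that a narrative only mentions terms the graph supplied.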

## Architecture

```
Graph Conditioning Encoder → Diffusion Transformer → Noise Scheduler
(Mind input)                 (The Body)              (Iterative refinement)
```

### Key Components
- **Graph Conditioning Encoder**: Encodes evidence nodes, compositions, anomalies, and reasoning chains, together with confidence and temporal embeddings
- **Diffusion Transformer**: Core denoising network with adaptive layer norm, self-attention, and cross-attention to the graph conditioning
- **Noise Scheduler**: Cosine noise schedule with DDPM/DDIM sampling support
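
As an illustration of the noise scheduler, the sketch below computes a cosine schedule in plain Python. It assumes the commonly used formulation with offset `s = 0.008` and betas clipped at `0.999`; the function names are hypothetical and the model's actual constants are not stated in this card.

```python
import math
from typing import List


def cosine_alpha_bar(num_steps: int, s: float = 0.008) -> List[float]:
    """Cumulative signal level alpha_bar[t] for t = 0..num_steps."""
    def f(t: int) -> float:
        return math.cos(((t / num_steps + s) / (1 + s)) * math.pi / 2) ** 2

    f0 = f(0)
    return [f(t) / f0 for t in range(num_steps + 1)]


def betas_from_alpha_bar(alpha_bar: List[float], max_beta: float = 0.999) -> List[float]:
    """Per-step noise variance: beta_t = 1 - alpha_bar[t] / alpha_bar[t-1], clipped."""
    return [
        min(1.0 - alpha_bar[t] / alpha_bar[t - 1], max_beta)
        for t in range(1, len(alpha_bar))
    ]


# 200 steps matches the training timestep count listed for this model.
alpha_bar = cosine_alpha_bar(200)
betas = betas_from_alpha_bar(alpha_bar)
print(f"first beta: {betas[0]:.6f}, last beta: {betas[-1]:.3f}")
```

With this schedule the signal level falls slowly at first and quickly near the end, so early timesteps still carry most of the clean sequence.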

## Model Details

| Parameter | Value |
|-----------|-------|
| Architecture | Diffusion Transformer |
| d_model | 256 |
| n_layers | 4 |
| n_heads | 4 |
| d_ff | 1024 |
| Parameters | 73.4M |
| Vocab size | 576 |
| Max sequence length | 128 |
| Diffusion timesteps (train) | 200 |
| Diffusion timesteps (inference) | 20 |
| Noise schedule | cosine |
| Prediction type | epsilon |
| Sampling method | ddim |
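
The table lists 200 training timesteps but only 20 at inference, with DDIM sampling and epsilon prediction. The sketch below shows one common way to do this: pick an evenly spaced subset of timesteps and apply the deterministic DDIM update (eta = 0). The function names are hypothetical and the even spacing is an assumption; the model's own sampler may differ.

```python
import math
from typing import List


def ddim_timesteps(train_steps: int, inference_steps: int) -> List[int]:
    """Evenly spaced subset of training timesteps, noisiest first."""
    stride = train_steps // inference_steps
    return list(range(train_steps - 1, -1, -stride))


def ddim_step(x_t: List[float], eps: List[float], ab_t: float, ab_prev: float) -> List[float]:
    """One deterministic DDIM update for an epsilon-predicting model.

    ab_t and ab_prev are the cumulative signal levels (alpha_bar) at the
    current and the next, less noisy, timestep.
    """
    out = []
    for x, e in zip(x_t, eps):
        # Estimate the clean value implied by the predicted noise.
        x0 = (x - math.sqrt(1.0 - ab_t) * e) / math.sqrt(ab_t)
        # Re-noise that estimate to the lower noise level.
        out.append(math.sqrt(ab_prev) * x0 + math.sqrt(1.0 - ab_prev) * e)
    return out


timesteps = ddim_timesteps(200, 20)
print(timesteps[:3], "...", timesteps[-1])
```

Because each update is deterministic, the same starting noise and conditioning lead to the same output, which helps when comparing narratives across runs.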

## Usage

```python
from diffusion_llm import AamDiffusionModel, AamTokenizer, AamGenerator, AamDiffusionConfig

# Load the config, model weights, and tokenizer
config = AamDiffusionConfig.from_json("config.json")
model = AamDiffusionModel.load("model.pt")
tokenizer = AamTokenizer.load("tokenizer.json")

# Create the generator
generator = AamGenerator(model, tokenizer, config)

# Generate a narrative from graph conditioning (inputs here are Indonesian)
result = generator.generate(
    # "Who stole the Snow Plum Pill?"
    trigger="Siapa yang mencuri Snow Plum Pill?",
    evidence_nodes=["Hefei", "Diancang Five Swords", "Ju Jangmok"],
    # "No new pill consumption on the black market"
    anomalies=["Tidak ada konsumsi pil baru di pasar gelap"],
    # "Cross-reference the incident dates"
    reasoning_steps=["Cross-reference tanggal kejadian"],
    source_trust=0.85,
)
print(result.narrative)
```

## Philosophy

**AAM = 1 Mind + 1 Body**

- **Mind** = RSVS Knowledge Graph (structural memory, perfect recall, relational understanding)
- **Body** = This Diffusion LLM (sentence arranger, graph-conditioned, anti-hallucination)

Unlike a rented general-purpose LLM (such as GPT or Claude) used as the "body", this model is trained specifically for AAM:
- It cannot generate information not present in the graph conditioning
- It arranges sentences based on structured evidence
- It uses diffusion (non-sequential generation) instead of autoregressive generation
- It is small (73.4M parameters) but specialized

## Training

The model was trained on synthetic Graph→Narrative pairs that cover:
- Indonesian and English narrative templates
- Evidence nodes, anomalies, reasoning chains
- Confidence score distributions
- Source trust scores
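
To illustrate what a template-based Graph→Narrative pair could look like, here is a minimal sketch. The templates, the `make_pair` helper, and the dictionary layout are all hypothetical; the actual training templates and data format are not published in this card.

```python
import random
from typing import Dict

# Hypothetical templates, one per language (illustration only).
TEMPLATES = {
    "en": "Regarding '{trigger}': the evidence points to {evidence}. "
          "Anomaly noted: {anomaly}. Source trust: {trust:.2f}.",
    "id": "Mengenai '{trigger}': bukti mengarah ke {evidence}. "
          "Anomali: {anomaly}. Kepercayaan sumber: {trust:.2f}.",
}


def make_pair(graph: Dict, lang: str, rng: random.Random) -> Dict:
    """Build one (graph, narrative) pair by filling a template from the graph."""
    evidence = list(graph["evidence_nodes"])
    rng.shuffle(evidence)  # vary the order so one fixed layout is not memorized
    narrative = TEMPLATES[lang].format(
        trigger=graph["trigger"],
        evidence=", ".join(evidence),
        anomaly=graph["anomalies"][0],
        trust=graph["source_trust"],
    )
    return {"graph": graph, "lang": lang, "narrative": narrative}


example_graph = {
    "trigger": "Who stole the Snow Plum Pill?",
    "evidence_nodes": ["Hefei", "Diancang Five Swords", "Ju Jangmok"],
    "anomalies": ["No new pill consumption on the black market"],
    "source_trust": 0.85,
}
pair = make_pair(example_graph, "en", random.Random(0))
print(pair["narrative"])
```

Since every narrative is filled only from graph fields, each target sentence stays grounded in its conditioning, which matches the stated goal of narrating only what the graph knows.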

## License

MIT