# Dataset Overview
The `/training/` folder contains all datasets required to fine‑tune an LLM on the Glyphic Language. Each file uses the JSON Lines format: one JSON object per line.

## Files

### 1. glyph_to_text.jsonl
Maps individual glyphs to:

- primary meaning
- synonyms
- roles
- categories
- example usage

Used for dictionary grounding.
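
As a rough illustration of what one entry might look like, the sketch below parses a single line and checks that the fields listed above are present. The key names (`glyph`, `primary_meaning`, and so on) and the placeholder glyph `<glyph-A>` are assumptions made for this example, not the confirmed schema.

```python
import json

# Field names below are illustrative assumptions, not the confirmed schema.
REQUIRED_FIELDS = (
    "glyph",
    "primary_meaning",
    "synonyms",
    "roles",
    "categories",
    "example_usage",
)


def parse_glyph_entry(line: str) -> dict:
    """Parse one JSONL line and check that every expected field is present."""
    entry = json.loads(line)
    missing = [name for name in REQUIRED_FIELDS if name not in entry]
    if missing:
        raise ValueError(f"entry is missing fields: {missing}")
    return entry


# Placeholder record: the glyph and its values are made up for illustration.
sample_line = json.dumps({
    "glyph": "<glyph-A>",
    "primary_meaning": "fire",
    "synonyms": ["flame", "heat"],
    "roles": ["object"],
    "categories": ["element"],
    "example_usage": "<glyph-A> <glyph-B>",
})

entry = parse_glyph_entry(sample_line)
print(entry["primary_meaning"], entry["synonyms"])
```

Failing fast on a missing field is one way to keep malformed dictionary entries out of a training run.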

### 2. text_to_glyph.jsonl
Maps natural language descriptions to canonical glyph sequences.

Used for translation training.
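
One way such records could feed a fine‑tuning pipeline is to turn each line into a prompt/completion pair. This is a minimal sketch: the keys `text` and `glyphs`, the prompt wording, and the space-separated glyph output are all assumptions, and the glyph placeholders are invented.

```python
import json


def to_training_pair(line: str) -> dict:
    """Turn one text_to_glyph record into a prompt/completion pair."""
    record = json.loads(line)
    # "text" and "glyphs" are assumed key names, not the confirmed schema.
    return {
        "prompt": f"Translate to Glyphic: {record['text']}",
        "completion": " ".join(record["glyphs"]),
    }


# Placeholder record for illustration only.
sample_line = json.dumps({
    "text": "a fire burns at night",
    "glyphs": ["<glyph-A>", "<glyph-B>", "<glyph-C>"],
})

pair = to_training_pair(sample_line)
print(pair["prompt"])
print(pair["completion"])
```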

### 3. structured_meaning.jsonl
Maps glyph sequences to structured meaning dictionaries.

Used for scene interpretation and semantic grounding.
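
The sketch below shows how a file like this might be read line by line. The record layout (a `glyphs` list plus a `meaning` dictionary with keys such as `actor` and `action`) is hypothetical, and an in-memory `io.StringIO` stands in for the real file.

```python
import io
import json
from typing import Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-blank line of a JSONL stream."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for open("structured_meaning.jsonl", encoding="utf-8").
# The keys and values are assumptions made for this example.
fake_file = io.StringIO(
    '{"glyphs": ["<glyph-A>", "<glyph-B>"], '
    '"meaning": {"actor": "traveler", "action": "rest"}}\n'
    "\n"
    '{"glyphs": ["<glyph-C>"], "meaning": {"object": "river"}}\n'
)

records = list(iter_jsonl(fake_file))
for record in records:
    print(record["glyphs"], "->", record["meaning"])
```

Because `iter_jsonl` accepts any iterable of lines, the same helper would work on an open file handle for any of the three datasets.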

---

## Dataset Philosophy
All datasets follow these principles:

- deterministic
- reversible
- dictionary‑driven
- syntax‑aligned
- context‑aware
- LLM‑friendly

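Each dataset is designed to teach a specific layer of the language: `glyph_to_text.jsonl` covers vocabulary, `text_to_glyph.jsonl` covers translation, and `structured_meaning.jsonl` covers scene-level semantics.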
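
The "deterministic" and "reversible" principles could be checked mechanically. The sketch below takes one possible reading of them: every text maps to exactly one glyph sequence, and every glyph sequence maps back to exactly one text. Both the interpretation and the sample pairs are assumptions, not rules stated by the dataset.

```python
from collections import defaultdict


def find_ambiguous(pairs):
    """Return (texts with several glyph sequences, sequences with several texts)."""
    forward = defaultdict(set)   # text -> glyph sequences seen for it
    backward = defaultdict(set)  # glyph sequence -> texts seen for it
    for text, glyphs in pairs:
        key = tuple(glyphs)
        forward[text].add(key)
        backward[key].add(text)
    non_deterministic = {t: s for t, s in forward.items() if len(s) > 1}
    non_reversible = {g: s for g, s in backward.items() if len(s) > 1}
    return non_deterministic, non_reversible


# Placeholder pairs; "river" and "stream" deliberately share a sequence.
pairs = [
    ("fire at night", ["<glyph-A>", "<glyph-B>"]),
    ("river", ["<glyph-C>"]),
    ("stream", ["<glyph-C>"]),
]

non_deterministic, non_reversible = find_ambiguous(pairs)
print("non-deterministic:", non_deterministic)
print("non-reversible:", non_reversible)
```

Here the shared sequence for "river" and "stream" is reported as non-reversible. Whether synonyms like these should count as a violation depends on how the dataset defines reversibility.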