# Dataset Overview
The /training/ folder contains all datasets required to fine‑tune an LLM for the Glyphic Language.
## Files
### 1. glyph_to_text.jsonl
Maps individual glyphs to:
- primary meaning
- synonyms
- roles
- categories
- example usage
Used for dictionary grounding.
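As a rough illustration, a single line of this file could be parsed and checked as follows. The key names (`glyph`, `primary_meaning`, `synonyms`, `roles`, `categories`, `example_usage`) and the sample values are assumptions made for this sketch; the actual schema may use different names.

```python
import json

# Hypothetical key names derived from the list above; the real schema may differ.
REQUIRED_FIELDS = ("primary_meaning", "synonyms", "roles", "categories", "example_usage")


def parse_glyph_entry(line: str) -> dict:
    """Parse one JSONL line and verify that every expected field is present."""
    entry = json.loads(line)
    missing = [field for field in REQUIRED_FIELDS if field not in entry]
    if missing:
        raise ValueError(f"entry is missing fields: {missing}")
    return entry


# Invented placeholder record, for illustration only (not taken from the dataset).
sample_entry_line = json.dumps({
    "glyph": "G1",
    "primary_meaning": "placeholder meaning",
    "synonyms": ["placeholder synonym"],
    "roles": ["placeholder role"],
    "categories": ["placeholder category"],
    "example_usage": "placeholder usage",
})

glyph_entry = parse_glyph_entry(sample_entry_line)
```

A check like this can be run over every line before training to catch incomplete dictionary entries early.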
### 2. text_to_glyph.jsonl
Maps natural language descriptions to canonical glyph sequences.
Used for translation training.
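A minimal sketch of turning these records into (input, target) pairs for translation training. It assumes each line carries a `text` key and a `glyphs` key; both names, and the sample lines, are hypothetical.

```python
import json
from typing import Iterable, Iterator, Tuple


def to_training_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (description, glyph sequence) pairs, skipping blank lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # "text" and "glyphs" are assumed key names, not confirmed by the dataset.
        yield record["text"], record["glyphs"]


# Invented placeholder lines, for illustration only.
sample_pair_lines = [
    json.dumps({"text": "placeholder description one", "glyphs": "G1 G2"}),
    "",
    json.dumps({"text": "placeholder description two", "glyphs": "G3"}),
]

training_pairs = list(to_training_pairs(sample_pair_lines))
```

In practice `lines` would be an open file handle on `text_to_glyph.jsonl`, since iterating a file yields one line at a time.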
### 3. structured_meaning.jsonl
Maps glyph sequences to structured meaning dictionaries.
Used for scene interpretation and semantic grounding.
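The sketch below shows one way such a record might be loaded into a typed structure. The key names `glyphs` and `meaning`, and the keys inside the meaning dictionary, are assumptions chosen for illustration.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class StructuredMeaning:
    glyphs: str               # the glyph sequence, as stored in the record
    meaning: Dict[str, Any]   # the structured meaning dictionary


def parse_structured_meaning(line: str) -> StructuredMeaning:
    """Parse one JSONL line into a StructuredMeaning, checking the meaning type."""
    record = json.loads(line)
    meaning = record["meaning"]  # assumed key name
    if not isinstance(meaning, dict):
        raise TypeError("expected 'meaning' to be a JSON object")
    return StructuredMeaning(glyphs=record["glyphs"], meaning=meaning)


# Invented placeholder record, for illustration only.
sample_meaning_line = json.dumps({
    "glyphs": "G1 G2",
    "meaning": {"actor": "placeholder actor", "action": "placeholder action"},
})

parsed_meaning = parse_structured_meaning(sample_meaning_line)
```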
---
## Dataset Philosophy
All datasets follow these principles:
- deterministic
- reversible
- dictionary‑driven
- syntax‑aligned
- context‑aware
- LLM‑friendly
Each dataset is designed to teach a specific layer of the language.
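One possible reading of "deterministic" and "reversible" is that each description maps to exactly one glyph sequence and each glyph sequence maps back to exactly one description. Under that assumption, which the datasets themselves do not spell out, a consistency check over (text, glyphs) pairs might look like this:

```python
from typing import Dict, Iterable, List, Tuple


def find_conflicts(pairs: Iterable[Tuple[str, str]]) -> List[str]:
    """Report pairs that break a one-to-one text <-> glyph mapping."""
    forward: Dict[str, str] = {}   # text -> first glyph sequence seen
    backward: Dict[str, str] = {}  # glyph sequence -> first text seen
    conflicts: List[str] = []
    for text, glyphs in pairs:
        # setdefault stores the first mapping and returns it on later lookups.
        if forward.setdefault(text, glyphs) != glyphs:
            conflicts.append(f"text {text!r} maps to more than one glyph sequence")
        if backward.setdefault(glyphs, text) != text:
            conflicts.append(f"glyphs {glyphs!r} map back to more than one text")
    return conflicts
```

An empty result means the pairs satisfy this interpretation; any reported conflict points at records worth reviewing by hand.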