# Dataset Overview
The /training/ folder contains all datasets required to fine‑tune an LLM for the Glyphic Language.
## Files
### 1. glyph_to_text.jsonl
Maps individual glyphs to:
- primary meaning
- synonyms
- roles
- categories
- example usage
Used for dictionary grounding.
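
As a minimal sketch of how such a file might be read, the snippet below builds a glyph lookup from JSONL lines. The field names (`glyph`, `primary_meaning`, `synonyms`, `roles`, `categories`, `example`) and the placeholder glyphs `G1` and `G2` are assumptions for illustration; the actual schema may differ.

```python
import json

# Hypothetical records: the field names and the placeholder glyphs "G1"/"G2"
# are assumptions for illustration, not the dataset's actual schema.
SAMPLE_JSONL = """
{"glyph": "G1", "primary_meaning": "water", "synonyms": ["liquid"], "roles": ["object"], "categories": ["element"], "example": "G1 G2"}
{"glyph": "G2", "primary_meaning": "flow", "synonyms": ["move"], "roles": ["action"], "categories": ["motion"], "example": "G1 G2"}
"""

def load_glyph_dictionary(lines):
    """Build a glyph -> record lookup from an iterable of JSONL lines."""
    lookup = {}
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines
            continue
        record = json.loads(line)
        lookup[record["glyph"]] = record
    return lookup

glyph_dict = load_glyph_dictionary(SAMPLE_JSONL.splitlines())
print(glyph_dict["G1"]["primary_meaning"])
```

The same function accepts an open file handle, since iterating a text file yields one line at a time.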
### 2. text_to_glyph.jsonl
Maps natural language descriptions to canonical glyph sequences.
Used for translation training.
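
One way a record could be turned into a fine-tuning example is sketched below. The field names `text` and `glyphs`, the instruction string, and the placeholder sequence `G1 G2` are all hypothetical; adapt them to the real schema and to the prompt format your trainer expects.

```python
# Hypothetical record: "text" and "glyphs" are assumed field names,
# and "G1 G2" is a placeholder glyph sequence.
record = {"text": "water flows", "glyphs": "G1 G2"}

def to_training_pair(record, instruction="Translate to Glyphic:"):
    """Turn one text_to_glyph record into a prompt/completion pair."""
    return {
        "prompt": f"{instruction} {record['text']}",
        "completion": record["glyphs"],
    }

pair = to_training_pair(record)
print(pair)
```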
### 3. structured_meaning.jsonl
Maps glyph sequences to structured meaning dictionaries.
Used for scene interpretation and semantic grounding.
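
A record of this kind could be sanity-checked before training, as in the sketch below. The keys `glyphs` and `meaning`, and the slot names `actor` and `action`, are assumptions made for illustration and are not taken from the dataset.

```python
# Hypothetical record: the keys "glyphs" and "meaning", and the slot names
# inside "meaning", are assumptions for illustration only.
record = {
    "glyphs": "G1 G2",
    "meaning": {"actor": "water", "action": "flow"},
}

def check_record(record, required_slots=("actor", "action")):
    """Return a list of problems found in one structured_meaning record."""
    problems = []
    glyphs = record.get("glyphs")
    if not isinstance(glyphs, str) or not glyphs.strip():
        problems.append("missing glyph sequence")
    meaning = record.get("meaning")
    if not isinstance(meaning, dict):
        problems.append("meaning is not a dictionary")
    else:
        for slot in required_slots:
            if slot not in meaning:
                problems.append(f"missing slot: {slot}")
    return problems

problems = check_record(record)
print(problems)
```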

---
## Dataset Philosophy
All datasets follow these principles:
- deterministic
- reversible
- dictionary‑driven
- syntax‑aligned
- context‑aware
- LLM‑friendly
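Each dataset is designed to teach a specific layer of the language: glyph_to_text.jsonl covers the dictionary, text_to_glyph.jsonl covers translation, and structured_meaning.jsonl covers semantic interpretation.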
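
The "deterministic" and "reversible" principles can be spot-checked on translation pairs. The sketch below reads them as "each text maps to exactly one glyph sequence, and each glyph sequence maps back to exactly one text"; that reading, the helper name `find_conflicts`, and the sample pairs are assumptions rather than a definition from the project.

```python
from collections import defaultdict

def find_conflicts(pairs):
    """Return every source that maps to more than one distinct target."""
    targets = defaultdict(set)
    for source, target in pairs:
        targets[source].add(target)
    return {s: t for s, t in targets.items() if len(t) > 1}

# Hypothetical (text, glyph sequence) pairs with placeholder glyphs.
pairs = [
    ("water flows", "G1 G2"),
    ("fire burns", "G3 G4"),
    ("water flows", "G1 G5"),  # same text, different glyphs
]

# Forward: one text should yield one glyph sequence (deterministic).
forward_conflicts = find_conflicts(pairs)
# Backward: one glyph sequence should yield one text (reversible).
backward_conflicts = find_conflicts((glyphs, text) for text, glyphs in pairs)

print(forward_conflicts)
print(backward_conflicts)
```

An empty result in both directions means the sampled pairs are consistent under this reading; any non-empty entry points at records worth reviewing.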