File size: 796 Bytes
ed6bec6
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
# Dataset Overview
The /training/ folder contains all datasets required to fine‑tune an LLM for the Glyphic Language.

## Files

### 1. glyph_to_text.jsonl
Maps individual glyphs to:

- primary meaning
- synonyms
- roles
- categories
- example usage

Used for dictionary grounding.

### 2. text_to_glyph.jsonl
Maps natural language descriptions to canonical glyph sequences.

Used for translation training.

### 3. structured_meaning.jsonl
Maps glyph sequences to structured meaning dictionaries.

Used for scene interpretation and semantic grounding.

---

## Dataset Philosophy
All datasets follow these principles:

- deterministic
- reversible
- dictionary‑driven
- syntax‑aligned
- context‑aware
- LLM‑friendly

Each dataset is designed to teach a specific layer of the language.