# Dataset Format Specification
All datasets in this folder use JSON Lines (.jsonl) format.
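Each line of a file is one standalone training example. The per-file examples below are pretty-printed across several lines for readability; in the files themselves each record occupies a single line. Every record has this top-level shape: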
{"input": "...", "output": "..."}
---
# 1. glyph_to_text.jsonl
## Format
{
  "input": "🌱",
  "output": {
    "id": "glyph.object.nature.sprout",
    "primary": "sprout",
    "synomic": ["growth", "seedling", "new life"],
    "roles": ["object", "symbol"]
  }
}
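
A shape check for these records might look like the following sketch. The function name `check_glyph_to_text` is hypothetical, and treating all four `output` fields as required is an assumption drawn from the single example above, not a stated rule.

```python
def check_glyph_to_text(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record looks well-formed."""
    problems = []
    glyph = record.get("input")
    if not isinstance(glyph, str) or not glyph:
        problems.append("input must be a non-empty string")
    output = record.get("output")
    if not isinstance(output, dict):
        return problems + ["output must be an object"]
    # Assumption: "id" and "primary" are required strings.
    for key in ("id", "primary"):
        if not isinstance(output.get(key), str):
            problems.append(f"output.{key} must be a string")
    # Assumption: "synomic" and "roles" are required lists of strings.
    for key in ("synomic", "roles"):
        value = output.get(key)
        if not (isinstance(value, list) and all(isinstance(v, str) for v in value)):
            problems.append(f"output.{key} must be a list of strings")
    return problems


good = {
    "input": "\U0001F331",  # sprout glyph
    "output": {
        "id": "glyph.object.nature.sprout",
        "primary": "sprout",
        "synomic": ["growth", "seedling", "new life"],
        "roles": ["object", "symbol"],
    },
}
problems = check_glyph_to_text(good)
```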
---
# 2. text_to_glyph.jsonl
## Format
{
"input": "a new beginning, growth, seedling",
"output": "๐ฑ"
}
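
When writing records like this one, it may help to serialize with `ensure_ascii=False` so that glyphs stay readable in the file instead of being emitted as `\uXXXX` escapes. This is a minimal sketch; the helper name `to_jsonl_line` is hypothetical.

```python
import json


def to_jsonl_line(text: str, glyph: str) -> str:
    """Serialize one text_to_glyph record as a single JSON Lines entry."""
    record = {"input": text, "output": glyph}
    # json.dumps emits no newlines by default, so the result fits on one line.
    return json.dumps(record, ensure_ascii=False)


line = to_jsonl_line("a new beginning, growth, seedling", "\U0001F331")
```

Either form parses back to the same record; the choice only affects how the file reads in an editor.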
---
# 3. structured_meaning.jsonl
## Format
{
  "input": "👤✍️📄🌙",
  "output": {
    "actor": {"id": "glyph.actor.person"},
    "action": {"id": "glyph.action.write"},
    "object": {"id": "glyph.object.document.page"},
    "context": {
      "time": [{"id": "context.time.night"}]
    }
  }
}
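
Because the `output` here nests objects and lists, a small recursive walk is one way to pull out every referenced `id`, for example before checking them against a dictionary. The function name `collect_ids` is hypothetical.

```python
def collect_ids(node) -> list[str]:
    """Collect every string value stored under an "id" key, in document order."""
    ids = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "id" and isinstance(value, str):
                ids.append(value)
            else:
                ids.extend(collect_ids(value))
    elif isinstance(node, list):
        for item in node:
            ids.extend(collect_ids(item))
    return ids


output = {
    "actor": {"id": "glyph.actor.person"},
    "action": {"id": "glyph.action.write"},
    "object": {"id": "glyph.object.document.page"},
    "context": {"time": [{"id": "context.time.night"}]},
}
ids = collect_ids(output)
```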
---
# Validation Rules
- Every glyph must exist in the dictionary.
- Every glyph sequence must obey the syntax rules.
- Every structured-meaning output must be in canonical form.
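
The first rule can be sketched as a greedy longest-match split of a glyph sequence against the dictionary. Everything here is an assumption for illustration: the dictionary is modelled as a plain set of glyph strings, the sample entries are hypothetical, and `split_glyphs` is not a name defined by this specification. The syntax and canonical-form rules are not sketched, since their definitions are not given here.

```python
def split_glyphs(sequence: str, dictionary: set[str]) -> list[str]:
    """Greedy longest-match split; raises ValueError on an unknown glyph."""
    longest = max(map(len, dictionary), default=0)
    glyphs, pos = [], 0
    while pos < len(sequence):
        # Try the longest candidate first so multi-code-point glyphs
        # (for example a base character plus a variation selector) match whole.
        for size in range(min(longest, len(sequence) - pos), 0, -1):
            candidate = sequence[pos:pos + size]
            if candidate in dictionary:
                glyphs.append(candidate)
                pos += size
                break
        else:
            raise ValueError(f"unknown glyph at offset {pos}: {sequence[pos]!r}")
    return glyphs


# Hypothetical sample dictionary: person, writing hand (with variation
# selector U+FE0F), page, crescent moon.
sample_dictionary = {"\U0001F464", "\u270D\uFE0F", "\U0001F4C4", "\U0001F319"}
glyphs = split_glyphs("\U0001F464\u270D\uFE0F\U0001F4C4\U0001F319", sample_dictionary)
```

Matching the longest entry first matters because some glyphs span more than one code point; a per-character check would split them incorrectly.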