# Dataset Format Specification All datasets in this folder use JSON Lines (.jsonl) format. Each line is a standalone training example: {"input": "...", "output": "..."} --- # 1. glyph_to_text.jsonl ## Format { "input": "🌱", "output": { "id": "glyph.object.nature.sprout", "primary": "sprout", "synomic": ["growth", "seedling", "new life"], "roles": ["object", "symbol"] } --- # 2. text_to_glyph.jsonl ## Format { "input": "a new beginning, growth, seedling", "output": "🌱" } --- # 3. structured_meaning.jsonl ## Format { "input": "👤✍️📄🌙", "output": { "actor": {"id": "glyph.actor.person"}, "action": {"id": "glyph.action.write"}, "object": {"id": "glyph.object.document.page"}, "context": { "time": [{"id": "context.time.night"}] } --- # Validation Rules - All glyphs must exist in the dictionary. - All sequences must obey syntax rules. - All structured meaning must be canonical.