PeacebinfLow committed on
Commit 9d92b07 · verified · 1 Parent(s): 89b38f6

Update static/schema_help.md

  1. static/schema_help.md +19 -24
static/schema_help.md CHANGED
@@ -1,31 +1,26 @@
- # Schema Help

- This dataset ships as two configs:

- ## 1) nodes
- `data/raw/nodes.csv`

- Represents immutable Prompt Evolution Tree (PET) states:
- - node identity
- - prompt/template content
- - parent lineage
- - metadata tags
- - timestamps
- - provenance links

- ## 2) runs
- `data/raw/runs.csv`

- Represents execution records (Gemini runs):
- - run identity
- - node_id link
- - model + token usage
- - outputs + result_json
- - score + status
- - timestamps + triggering source

- ## Why configs matter
- Hugging Face’s CSV dataset builder requires:
- **all CSV files in the same config/split must share identical columns**.

- Nodes and runs intentionally have different schemas, so they MUST be split into separate configs.
+ MindsEye Dataset Configs (Important)

+ If you have:

+ nodes.csv (Prompt Evolution Nodes — immutable cognitive states.)

+ runs.csv (Execution Records, one per reasoning run.)

+ They SHOULD NOT be in the same dataset builder config unless they share identical columns.

+ Correct options:
+ Option A (recommended): Two configs

+ config: nodes -> data/raw/nodes.csv

+ config: runs -> data/raw/runs.csv
+
+ Option B: Two splits
+
+ split: nodes -> nodes.csv
+
+ split: runs -> runs.csv
+
+ If HF says:
+ “All the data files must have the same columns…”
+ it means it thinks multiple CSVs belong to the same table.
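
The rule this help text describes (CSV files grouped into one config must share identical columns) can be checked locally before uploading. The sketch below is only an illustration using the Python standard library: the helper names `read_header` and `group_by_columns` and the sample column names are hypothetical, since the commit does not list the real columns of nodes.csv or runs.csv. It groups files by header row; more than one group suggests the files need separate configs.

```python
import csv
import io
from collections import defaultdict


def read_header(csv_text):
    """Return the column names from the first row of CSV text as a tuple."""
    reader = csv.reader(io.StringIO(csv_text))
    return tuple(next(reader, ()))


def group_by_columns(files):
    """Group file names by header row; each group could share one config."""
    groups = defaultdict(list)
    for name, text in files.items():
        groups[read_header(text)].append(name)
    return dict(groups)


# Hypothetical miniature files: the real column names are not given in the source.
sample_files = {
    "data/raw/nodes.csv": "node_id,parent_id,prompt\nn1,,hello\n",
    "data/raw/runs.csv": "run_id,node_id,model,score\nr1,n1,gemini,0.9\n",
}

groups = group_by_columns(sample_files)

# More than one distinct header means the files cannot share a single config.
needs_separate_configs = len(groups) > 1
print(needs_separate_configs)
```

With the sample headers above, the two files fall into different groups, which matches Option A (one config per file). As far as I know, splits inside a single config are also expected to share the same features, so Option B may still trigger the same "same columns" error when the schemas differ; treat that as something to verify rather than a guarantee.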