| # Data Format | |
| You can pass SpanFinder any formats of data, as long as you implement a dataset reader inherited from SpanReader. We also provide a Concrete dataset reader. Besides them, SpanFinder comes with its own JSON data format, which enables richer features for training and modeling. | |
| The minimal example of the JSON is | |
| ```JSON | |
| { | |
| "meta": { | |
| "fully_annotated": true | |
| }, | |
| "tokens": ["Bob", "attacks", "the", "building", "."], | |
| "annotations": [ | |
| { | |
| "span": [1, 1], | |
| "label": "Attack", | |
| "children": [ | |
| { | |
| "span": [0, 0], | |
| "label": "Assailant", | |
| "children": [] | |
| }, | |
| { | |
| "span": [2, 3], | |
| "label": "Victim", | |
| "children": [] | |
| } | |
| ] | |
| }, | |
| { | |
| "span": [3, 3], | |
| "label": "Buildings", | |
| "children": [ | |
| { | |
| "span": [3, 3], | |
| "label": "Building", | |
| "children": [] | |
| } | |
| ] | |
| } | |
| ] | |
| } | |
| ``` | |
| You can have nested spans with unlimited depth. | |
| ## Meta-info for Semantic Role Labeling (SRL) | |
| ```JSON | |
| { | |
| "ontology": { | |
| "event": ["Violence-Attack"], | |
| "argument": ["Agent", "Patient"], | |
| "link": [[0, 0], [0, 1]] | |
| }, | |
| "ontology_mapping": { | |
| "event": { | |
| "Attack": ["Violence-Attack", 0.8] | |
| }, | |
| "argument": { | |
| "Assault": ["Agent", 0.95], | |
| "Victim": ["patient", 0.9] | |
| } | |
| } | |
| } | |
| ``` | |
| TODO: Guanghui needs to doc this. | |