Genipapo API Guide

This guide explains how to use the Genipapo Parser API, which takes Brazilian Portuguese text in CoNLL-U format and returns it with the dependency columns (HEAD and DEPREL) filled in.

All examples in this guide are taken from the Porttinari Base corpus, part of the Poetisa project.

Endpoints

1. Process a File

Use the /api/process endpoint to upload a .conllu file. The endpoint accepts one query parameter, response_format, which selects how the result is returned: file or json.

1.1 Example: Returning a File

When response_format is set to file, the processed content is returned as a downloadable .conllu file. Use curl's --output option to choose the name of the saved file.

curl -X POST -H "Content-Type: multipart/form-data" \
-F "file=@example.conllu" \
"https://genipapo-parser.azurewebsites.net/api/process?response_format=file" \
--output processed_example.conllu
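
The same upload can be prepared from Python. The following is a minimal sketch using only the standard library; the helper name build_upload_request is hypothetical, and the set of accepted response_format values (file and json) is taken from this guide rather than from the server's own specification. It only builds the request and does not send it.

```python
import uuid
import urllib.request
from urllib.parse import urlencode

# Host taken from the curl examples in this guide.
BASE_URL = "https://genipapo-parser.azurewebsites.net"


def build_upload_request(filename, file_bytes, response_format):
    """Build (but do not send) a multipart POST to /api/process.

    Hypothetical helper: the form field name "file" mirrors the
    -F "file=@..." option of the curl example.
    """
    if response_format not in ("file", "json"):
        raise ValueError("response_format must be 'file' or 'json'")

    # A random boundary separates the parts of the multipart body.
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (f'Content-Disposition: form-data; name="file"; '
         f'filename="{filename}"\r\n').encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        file_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])

    query = urlencode({"response_format": response_format})
    return urllib.request.Request(
        f"{BASE_URL}/api/process?{query}",
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

To send the request, pass the returned object to urllib.request.urlopen and read the response body; with response_format set to file, the bytes read can be written directly to a .conllu file.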

1.2 Example: Returning JSON

When response_format is set to json, the processed content is returned in JSON format.

curl -X POST -H "Content-Type: multipart/form-data" \
-F "file=@example.conllu" \
"https://genipapo-parser.azurewebsites.net/api/process?response_format=json"

Example JSON Response:

{
    "status": "success",
    "warnings": [],
    "processed_content": "# sent_id = FOLHA_DOC000123_SENT016\n# text = O Capitão América também bajulou o tucano.\n1\tO\to\tDET\t_\tDefinite=Def|Gender=Masc|Number=Sing|PronType=Art\t2\tdet\t_\t_\n2\tCapitão\tCapitão\tPROPN\t_\t_\t5\tnsubj\t_\t_\n3\tAmérica\tAmérica\tPROPN\t_\t_\t2\tflat:name\t_\t_\n4\ttambém\ttambém\tADV\t_\t_\t5\tadvmod\t_\t_\n5\tbajulou\tbajular\tVERB\t_\tMood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin\t0\troot\t_\t_\n6\to\to\tDET\t_\tDefinite=Def|Gender=Masc|Number=Sing|PronType=Art\t7\tdet\t_\t_\n7\ttucano\ttucano\tNOUN\t_\tGender=Masc|Number=Sing\t5\tobj\t_\tSpaceAfter=No\n8\t.\t.\tPUNCT\t_\t_\t5\tpunct\t_\tSpaceAfter=No\n"
}
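
A response of this shape can be unpacked as sketched below. This is a minimal example that assumes the three fields shown above (status, warnings, processed_content); the helper name is hypothetical, and because this guide does not document what an unsuccessful response looks like, any status other than "success" is simply treated as an error.

```python
import json


def unpack_response(body_text):
    """Return (processed_content, warnings) from a JSON response body.

    Hypothetical helper; the field names follow the example response
    in this guide.
    """
    payload = json.loads(body_text)
    if payload.get("status") != "success":
        # Assumption: anything other than "success" means failure.
        raise RuntimeError(f"unexpected status: {payload.get('status')!r}")
    return payload["processed_content"], payload.get("warnings", [])
```

json.loads turns the \n and \t escapes back into real line breaks and tabs, so the returned string is valid CoNLL-U text that can be written to a file as is.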

2. Process Raw Content

Use the /api/process/json endpoint to send raw CoNLL-U content as JSON. Put the CoNLL-U text in the content field of the JSON body. JSON strings cannot contain literal line breaks or tabs, so write line breaks as \n and tabs as \t and keep the whole string on one line.

curl -X POST -H "Content-Type: application/json" \
-d '{"content": "# sent_id = FOLHA_DOC000123_SENT016\n# text = O Capitão América também bajulou o tucano.\n1\tO\to\tDET\t_\tDefinite=Def|Gender=Masc|Number=Sing|PronType=Art\t_\t_\t_\t_\n2\tCapitão\tCapitão\tPROPN\t_\t_\t_\t_\t_\t_\n3\tAmérica\tAmérica\tPROPN\t_\t_\t_\t_\t_\t_\n4\ttambém\ttambém\tADV\t_\t_\t_\t_\t_\t_\n5\tbajulou\tbajular\tVERB\t_\tMood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin\t_\t_\t_\t_\n6\to\to\tDET\t_\tDefinite=Def|Gender=Masc|Number=Sing|PronType=Art\t_\t_\t_\t_\n7\ttucano\ttucano\tNOUN\t_\tGender=Masc|Number=Sing\t_\t_\t_\tSpaceAfter=No\n8\t.\t.\tPUNCT\t_\t_\t_\t_\t_\tSpaceAfter=No\n"}' \
"https://genipapo-parser.azurewebsites.net/api/process/json"
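
Escaping tabs and line breaks by hand is error-prone, so it can be easier to let a JSON library build the body. The sketch below uses Python's standard library; the helper name build_json_request is hypothetical, and the endpoint URL is passed in by the caller rather than assumed. It only builds the request and does not send it.

```python
import json
import urllib.request


def build_json_request(conllu_text, url):
    """Build (but do not send) a POST carrying CoNLL-U text as JSON.

    Hypothetical helper: json.dumps escapes tabs, line breaks, and
    non-ASCII characters, so the body is always valid JSON.
    """
    body = json.dumps({"content": conllu_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

A typical use is to read a .conllu file as UTF-8 text, pass its contents together with the /api/process/json URL to this function, and hand the result to urllib.request.urlopen.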

Example JSON Response:

{
    "status": "success",
    "warnings": [],
    "processed_content": "# sent_id = FOLHA_DOC000123_SENT016\n# text = O Capitão América também bajulou o tucano.\n1\tO\to\tDET\t_\tDefinite=Def|Gender=Masc|Number=Sing|PronType=Art\t2\tdet\t_\t_\n2\tCapitão\tCapitão\tPROPN\t_\t_\t5\tnsubj\t_\t_\n3\tAmérica\tAmérica\tPROPN\t_\t_\t2\tflat:name\t_\t_\n4\ttambém\ttambém\tADV\t_\t_\t5\tadvmod\t_\t_\n5\tbajulou\tbajular\tVERB\t_\tMood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin\t0\troot\t_\t_\n6\to\to\tDET\t_\tDefinite=Def|Gender=Masc|Number=Sing|PronType=Art\t7\tdet\t_\t_\n7\ttucano\ttucano\tNOUN\t_\tGender=Masc|Number=Sing\t5\tobj\t_\tSpaceAfter=No\n8\t.\t.\tPUNCT\t_\t_\t5\tpunct\t_\tSpaceAfter=No\n"
}

Example with Input and Output

Original Input

# sent_id = FOLHA_DOC000123_SENT016
# text = O Capitão América também bajulou o tucano.
1   O       o       DET     _       Definite=Def|Gender=Masc|Number=Sing|PronType=Art   _   _   _   _
2   Capitão Capitão PROPN   _       _                                               _   _   _   _
3   América América PROPN   _       _                                               _   _   _   _
4   também  também  ADV     _       _                                               _   _   _   _
5   bajulou bajular VERB    _       Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin _   _   _   _
6   o       o       DET     _       Definite=Def|Gender=Masc|Number=Sing|PronType=Art   _   _   _   _
7   tucano  tucano  NOUN    _       Gender=Masc|Number=Sing                            _   _   _   SpaceAfter=No
8   .       .       PUNCT   _       _                                               _   _   _   SpaceAfter=No

Processed Output

# sent_id = FOLHA_DOC000123_SENT016
# text = O Capitão América também bajulou o tucano.
1   O       o       DET     _       Definite=Def|Gender=Masc|Number=Sing|PronType=Art   2   det     _   _
2   Capitão Capitão PROPN   _       _                                                   5   nsubj   _   _
3   América América PROPN   _       _                                                   2   flat:name _   _
4   também  também  ADV     _       _                                                   5   advmod  _   _
5   bajulou bajular VERB    _       Mood=Ind|Number=Sing|Person=3|Tense=Past|VerbForm=Fin 0   root    _   _
6   o       o       DET     _       Definite=Def|Gender=Masc|Number=Sing|PronType=Art   7   det     _   _
7   tucano  tucano  NOUN    _       Gender=Masc|Number=Sing                              5   obj     _   SpaceAfter=No
8   .       .       PUNCT   _       _                                                   5   punct   _   SpaceAfter=No
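
Comparing the two listings, the parser fills in columns 7 and 8 (HEAD and DEPREL in CoNLL-U terms) and leaves the other columns as they were. A minimal sketch of reading those columns from processed content follows; the function names are hypothetical, and the code assumes tab-separated token lines with exactly ten columns, as in the JSON responses above (the listings in this section are aligned with spaces for display only).

```python
# The ten CoNLL-U columns, in order.
FIELDS = ("id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc")


def read_tokens(conllu_text):
    """Return one dict per token line, skipping comments and blank lines."""
    tokens = []
    for line in conllu_text.split("\n"):
        if not line.strip() or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) != len(FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}: {line!r}")
        tokens.append(dict(zip(FIELDS, columns)))
    return tokens


def dependency_edges(tokens):
    """Return (token id, head id, relation) triples; head "0" marks the root."""
    return [(t["id"], t["head"], t["deprel"]) for t in tokens]
```

For the sentence above, the triple for token 5 (bajulou) has head "0" and relation "root", and the triple for token 2 (Capitão) points to token 5 with relation "nsubj".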

Contact

For further assistance, please contact us.