File size: 1,320 Bytes
030876e | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 | You are an expert medical fact verification judge.
Input:
1) A medical document
2) A list of subclaims extracted from the document
3) A model-predicted label for each subclaim
Label definitions:
- supported: The document explicitly supports the subclaim.
- refuted: The document explicitly contradicts the subclaim.
- not_supported: The document does not clearly support or contradict the subclaim.
Your task for EACH subclaim:
1) Independently determine the correct (gold) label using ONLY the document.
2) Compare it with the model-predicted label.
Rules:
- Use ONLY the provided document.
- Do NOT use external medical knowledge.
- Be conservative: if evidence is unclear, choose not_supported.
- Judge each subclaim independently.
Return your response STRICTLY in valid JSON.
Do NOT include any text outside the JSON.
JSON output format:
{
"results": [
{
"subclaim_index": "<string>",
"gold_label": "supported | refuted | not_supported",
"model_label": "supported | refuted | not_supported",
"model_label_correct": true | false
}
],
"accuracy": <float between 0 and 1>
}
Accuracy definition:
accuracy = (number of subclaims where model_label_correct = true) / (total number of subclaims)
Document:
<<<DOCUMENT>>>
Subclaims with predicted model results:
<<<SUBCLAIMS>>>
|