Fix CSV download button for Hugging Face Spaces - use DownloadButton for direct file download
- exports/verification_results_2025-12-10.csv +18 -0
- src/core/chaplain_models.py +745 -0
- src/core/classification_flow_manager.py +310 -0
- src/core/content_generator.py +346 -0
- src/core/error_pattern_analyzer.py +283 -0
- src/core/interaction_logger.py +258 -0
- src/core/tagging_service.py +528 -0
- src/core/verification_csv_exporter.py +213 -2
- src/interface/chaplain_feedback_ui.py +450 -0
- src/interface/simplified_gradio_app.py +129 -19
- tests/chaplain_feedback/__init__.py +2 -0
- tests/chaplain_feedback/conftest.py +145 -0
- tests/chaplain_feedback/test_properties_classification_flow.py +297 -0
- tests/chaplain_feedback/test_properties_content_generator.py +399 -0
- tests/chaplain_feedback/test_properties_csv_export.py +290 -0
- tests/chaplain_feedback/test_properties_data_models.py +250 -0
- tests/chaplain_feedback/test_properties_error_pattern_analyzer.py +194 -0
- tests/chaplain_feedback/test_properties_interaction_logging.py +705 -0
- tests/chaplain_feedback/test_properties_tagging_service.py +223 -0
exports/verification_results_2025-12-10.csv
ADDED
@@ -0,0 +1,18 @@
+VERIFICATION SUMMARY
+Total Messages,11
+Correct,10
+Incorrect,1
+Accuracy %,90.9
+
+Patient Message,Classifier Said,You Said,Notes,Date
+I don't see any point in living anymore. Everything feels hopeless and I can't go on.,RED,RED,,2025-12-10 18:47:25
+I've been thinking about ending it all. I have a plan and I'm ready.,RED,RED,,2025-12-10 18:47:26
+My family would be better off without me. I'm just a burden to everyone.,RED,RED,,2025-12-10 18:47:27
+I've already said goodbye to my loved ones. I'm going to do it tonight.,RED,RED,,2025-12-10 18:47:27
+Nothing matters anymore. I'm tired of fighting. I just want the pain to stop.,RED,YELLOW,івкрврврарчарвар,2025-12-10 18:47:36
+I've been researching ways to hurt myself. I think I'm ready to try.,RED,RED,,2025-12-10 18:47:38
+Everyone would be happier if I was gone. I'm a failure and a disappointment.,RED,RED,,2025-12-10 18:47:38
+I can't take this anymore. The pain is unbearable and I see no way out.,RED,RED,,2025-12-10 18:47:39
+I've written a note. I'm going to do it. Please don't try to stop me.,RED,RED,,2025-12-10 18:47:39
+I'm standing on the edge right now. I'm about to jump.,RED,RED,,2025-12-10 18:47:40
+I have the pills in my hand. This is it. Goodbye.,RED,RED,,2025-12-10 18:47:41
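The summary block at the top of this export is derivable from the per-message rows. A minimal sketch of that derivation (a hypothetical helper for illustration, not the actual `verification_csv_exporter.py` code):

```python
def summarize(results):
    """Compute the VERIFICATION SUMMARY fields from (classifier, chaplain) label pairs."""
    total = len(results)
    correct = sum(1 for predicted, actual in results if predicted == actual)
    accuracy = round(100 * correct / total, 1) if total else 0.0
    return {"Total Messages": total, "Correct": correct,
            "Incorrect": total - correct, "Accuracy %": accuracy}

# The 11 rows in this export: rows 1-4 agree, row 5 disagrees (RED vs YELLOW), rows 6-11 agree.
pairs = [("RED", "RED")] * 4 + [("RED", "YELLOW")] + [("RED", "RED")] * 6
print(summarize(pairs))
# → {'Total Messages': 11, 'Correct': 10, 'Incorrect': 1, 'Accuracy %': 90.9}
```

This reproduces the 90.9% figure in the summary block (10 of 11 rounded to one decimal place).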
src/core/chaplain_models.py
ADDED
@@ -0,0 +1,745 @@
+# chaplain_models.py
+"""
+Data models for Chaplain Feedback & Tagging System.
+
+Defines core data structures for classification flows, tagging records,
+distress indicators, and interaction logging.
+"""
+
+from dataclasses import dataclass, field
+from typing import List, Optional, Dict, Any
+from datetime import datetime
+
+
+# =============================================================================
+# INDICATOR DEFINITIONS - Based on Spiritual Distress Definitions Document
+# =============================================================================
+
+# Mapping of all indicators from the definitions document with their categories,
+# subcategories, severity (red/yellow), and definition references.
+# RED (#ea9999): Severe distress - requires immediate attention
+# YELLOW (#ffe599): Potential distress - requires clarification
+
+INDICATOR_DEFINITIONS: Dict[str, Dict[str, Any]] = {
+    # Section II.A - Emotional expressions
+    "crying": {
+        "category": "Emotional",
+        "subcategory": "Crying",
+        "severity": "red",
+        "definition_reference": "II.A",
+        "description": "Crying as expression of spiritual distress"
+    },
+    "dysomnias": {
+        "category": "Emotional",
+        "subcategory": "Dysomnias/Difficulty sleeping",
+        "severity": "yellow",
+        "definition_reference": "II.A",
+        "description": "Sleep disturbances related to spiritual distress"
+    },
+    "fatigue": {
+        "category": "Emotional",
+        "subcategory": "Fatigue, emotional exhaustion",
+        "severity": "yellow",
+        "definition_reference": "II.A",
+        "description": "Fatigue and emotional exhaustion"
+    },
+    "anxiety": {
+        "category": "Emotional",
+        "subcategory": "Anxiety",
+        "severity": "yellow",
+        "definition_reference": "II.A",
+        "description": "Anxiety as expression of spiritual distress"
+    },
+    "fear": {
+        "category": "Emotional",
+        "subcategory": "Fear",
+        "severity": "yellow",
+        "definition_reference": "II.A",
+        "description": "Fear as expression of spiritual distress"
+    },
+    "anger": {
+        "category": "Emotional",
+        "subcategory": "Anger",
+        "severity": "red",
+        "definition_reference": "II.A",
+        "description": "Anger as expression of spiritual distress"
+    },
+    "depressive_symptoms": {
+        "category": "Emotional",
+        "subcategory": "Depressive symptoms",
+        "severity": "yellow",
+        "definition_reference": "II.A",
+        "description": "Depressive symptoms"
+    },
+
+    # Section II.B - Decreased engagement
+    "decreased_engagement": {
+        "category": "Engagement",
+        "subcategory": "Decreased engagement with hobbies",
+        "severity": "yellow",
+        "definition_reference": "II.B",
+        "description": "Decreased engagement with hobbies, creative expression, and personal interests"
+    },
+
+    # Section II.C - Disinterest in nature
+    "disinterest_nature": {
+        "category": "Engagement",
+        "subcategory": "Disinterest in nature",
+        "severity": "yellow",
+        "definition_reference": "II.C",
+        "description": "Disinterest in nature due to spiritual, emotional and physical limitations"
+    },
+
+    # Section II.D - Excessive guilt
+    "excessive_guilt": {
+        "category": "Guilt",
+        "subcategory": "Excessive guilt",
+        "severity": "red",
+        "definition_reference": "II.D",
+        "description": "Excessive guilt - existential, religious, or relational"
+    },
+
+    # Section II.E - Anger behaviors of spiritual nature
+    "anger_spiritual": {
+        "category": "Anger",
+        "subcategory": "Anger behaviors of a spiritual nature",
+        "severity": "red",
+        "definition_reference": "II.E",
+        "description": "Anger toward power greater than self"
+    },
+
+    # Section II.F - Grief types
+    "anticipatory_grieving": {
+        "category": "Grief",
+        "subcategory": "Anticipatory grieving",
+        "severity": "red",
+        "definition_reference": "II.F",
+        "description": "Emotional response to anticipated death"
+    },
+    "disenfranchised_grief": {
+        "category": "Grief",
+        "subcategory": "Disenfranchised grief",
+        "severity": "red",
+        "definition_reference": "II.F",
+        "description": "Grief unacknowledged or unsupported by society"
+    },
+    "life_review_grieving": {
+        "category": "Grief",
+        "subcategory": "Grieving in the setting of life review",
+        "severity": "yellow",
+        "definition_reference": "II.F",
+        "description": "Grieving during life review process"
+    },
+    "maladaptive_grieving": {
+        "category": "Grief",
+        "subcategory": "Maladaptive grieving",
+        "severity": "red",
+        "definition_reference": "II.F",
+        "description": "Prolonged grief disorder"
+    },
+    "complicated_grief": {
+        "category": "Grief",
+        "subcategory": "Complicated grief",
+        "severity": "red",
+        "definition_reference": "II.F",
+        "description": "Persistent, intense grief disrupting daily life"
+    },
+    "loss_loved_one": {
+        "category": "Grief",
+        "subcategory": "Loss of a loved one",
+        "severity": "red",
+        "definition_reference": "II.F",
+        "description": "Loss of family member or friend"
+    },
+
+    # Section II.G - Expressions of Spiritual Distress
+    "expresses_alienation": {
+        "category": "Expressions",
+        "subcategory": "Expresses alienation",
+        "severity": "yellow",
+        "definition_reference": "II.G",
+        "description": "Feeling separation, isolation, disconnection"
+    },
+    "concern_beliefs": {
+        "category": "Expressions",
+        "subcategory": "Expresses concern about beliefs",
+        "severity": "yellow",
+        "definition_reference": "II.G",
+        "description": "Questions or struggles with spiritual/religious beliefs"
+    },
+    "concern_future": {
+        "category": "Expressions",
+        "subcategory": "Expresses concern about the future",
+        "severity": "red",
+        "definition_reference": "II.G",
+        "description": "Anxious, fearful, or uncertain about what lies ahead"
+    },
+    "concern_values": {
+        "category": "Expressions",
+        "subcategory": "Expresses concern about values system",
+        "severity": "yellow",
+        "definition_reference": "II.G",
+        "description": "Conflicted about moral or ethical principles"
+    },
+    "concern_family": {
+        "category": "Expressions",
+        "subcategory": "Expresses concerns about family",
+        "severity": "yellow",
+        "definition_reference": "II.G",
+        "description": "Distressed about family well-being or relationships"
+    },
+    "feeling_emptiness": {
+        "category": "Expressions",
+        "subcategory": "Expresses feeling of emptiness",
+        "severity": "red",
+        "definition_reference": "II.G",
+        "description": "Deep inner void or lack of meaning"
+    },
+    "feeling_unloved": {
+        "category": "Expressions",
+        "subcategory": "Expresses feeling unloved",
+        "severity": "red",
+        "definition_reference": "II.G",
+        "description": "Feels unworthy of love or disconnected from caring relationships"
+    },
+    "feeling_worthless": {
+        "category": "Expressions",
+        "subcategory": "Expresses feeling worthless",
+        "severity": "red",
+        "definition_reference": "II.G",
+        "description": "Perceives themselves as having little or no value"
+    },
+    "insufficient_courage": {
+        "category": "Expressions",
+        "subcategory": "Expresses insufficient courage",
+        "severity": "yellow",
+        "definition_reference": "II.G",
+        "description": "Fear or lack of strength to face suffering"
+    },
+    "loss_confidence": {
+        "category": "Expressions",
+        "subcategory": "Expresses loss of confidence",
+        "severity": "yellow",
+        "definition_reference": "II.G",
+        "description": "Diminished trust in themselves or abilities"
+    },
+    "loss_control": {
+        "category": "Expressions",
+        "subcategory": "Expresses loss of control",
+        "severity": "yellow",
+        "definition_reference": "II.G",
+        "description": "Feels powerless over life circumstances"
+    },
+    "loss_hope": {
+        "category": "Expressions",
+        "subcategory": "Expresses loss of hope",
+        "severity": "red",
+        "definition_reference": "II.G",
+        "description": "Feels despair or believes future holds no possibility"
+    },
+    "loss_serenity": {
+        "category": "Expressions",
+        "subcategory": "Expresses loss of serenity",
+        "severity": "yellow",
+        "definition_reference": "II.G",
+        "description": "Inner turmoil, anxiety, or restlessness"
+    },
+    "need_forgiveness": {
+        "category": "Expressions",
+        "subcategory": "Expresses need for forgiveness",
+        "severity": "red",
+        "definition_reference": "II.G",
+        "description": "Feels guilt or remorse and desires reconciliation"
+    },
+    "expresses_regret": {
+        "category": "Expressions",
+        "subcategory": "Expresses regret",
+        "severity": "yellow",
+        "definition_reference": "II.G",
+        "description": "Sorrow over past actions or missed opportunities"
+    },
+    "expresses_suffering": {
+        "category": "Expressions",
+        "subcategory": "Expresses suffering",
+        "severity": "red",
+        "definition_reference": "II.G",
+        "description": "Deep physical, emotional, or spiritual pain"
+    },
+    "concern_medical_treatment": {
+        "category": "Medical",
+        "subcategory": "Expresses concern about medical treatment",
+        "severity": "red",
+        "definition_reference": "II.G",
+        "description": "Concern about treatment or medical team"
+    },
+    "unfinished_business": {
+        "category": "Expressions",
+        "subcategory": "Expresses feeling of having unfinished business",
+        "severity": "red",
+        "definition_reference": "II.G",
+        "description": "Important matters remain unresolved"
+    },
+    "desire_share_spiritual": {
+        "category": "Spiritual",
+        "subcategory": "Expresses desire to share intense spiritual experiences",
+        "severity": "yellow",
+        "definition_reference": "II.G",
+        "description": "Wants to share intense spiritual/religious experiences"
+    },
+    "inability_transcendence": {
+        "category": "Spiritual",
+        "subcategory": "Inability to experience transcendence",
+        "severity": "red",
+        "definition_reference": "II.G",
+        "description": "Cannot experience supportive forces larger than oneself"
+    },
+    "impaired_introspection": {
+        "category": "Spiritual",
+        "subcategory": "Impaired ability for introspection",
+        "severity": "yellow",
+        "definition_reference": "II.G",
+        "description": "Impaired ability for self-reflection"
+    },
+
+    # Section II.H - Existential questioning
+    "questioning_identity": {
+        "category": "Existential",
+        "subcategory": "Questioning one's identity",
+        "severity": "yellow",
+        "definition_reference": "II.H",
+        "description": "Confused about identity when illness takes away roles"
+    },
+    "questioning_meaning_life": {
+        "category": "Existential",
+        "subcategory": "Questioning the meaning of life",
+        "severity": "red",
+        "definition_reference": "II.H",
+        "description": "Grapples with fundamental questions about existence"
+    },
+    "questioning_meaning_suffering": {
+        "category": "Existential",
+        "subcategory": "Questioning the meaning of suffering",
+        "severity": "red",
+        "definition_reference": "II.H",
+        "description": "Struggles to understand if pain has purpose"
+    },
+    "questioning_dignity": {
+        "category": "Existential",
+        "subcategory": "Questioning one's own dignity",
+        "severity": "red",
+        "definition_reference": "II.H",
+        "description": "Questions inherent worth and value as person"
+    },
+
+    # Section II.I - Social isolation
+    "social_isolation": {
+        "category": "Social",
+        "subcategory": "Social isolation expressions",
+        "severity": "yellow",
+        "definition_reference": "II.I",
+        "description": "Avoids interaction, estrangement, loneliness"
+    },
+
+    # Section II.J - Changes in spiritual/religious practices
+    "altered_religious_ritual": {
+        "category": "Spiritual",
+        "subcategory": "Altered religious ritual",
+        "severity": "yellow",
+        "definition_reference": "II.J.a",
+        "description": "Disruption to religious practices"
+    },
+    "altered_spiritual_practice": {
+        "category": "Spiritual",
+        "subcategory": "Altered spiritual practice",
+        "severity": "yellow",
+        "definition_reference": "II.J.b",
+        "description": "Disruption to personal spiritual activities"
+    },
+
+    # Section II.K - Cultural conflict
+    "cultural_conflict": {
+        "category": "Cultural",
+        "subcategory": "Cultural conflict",
+        "severity": "yellow",
+        "definition_reference": "II.K",
+        "description": "Clash between cultural beliefs and healthcare culture"
+    },
+
+    # Section II.L - Sociocultural deprivation
+    "sociocultural_deprivation": {
+        "category": "Cultural",
+        "subcategory": "Sociocultural deprivation",
+        "severity": "yellow",
+        "definition_reference": "II.L",
+        "description": "Separated from cultural community"
+    },
+
+    # Section II.M - Difficulty accepting aging
+    "difficulty_accepting_aging": {
+        "category": "Aging",
+        "subcategory": "Difficulty accepting aging",
+        "severity": "yellow",
+        "definition_reference": "II.M",
+        "description": "Grief over lost abilities, resistance to mortality"
+    },
+
+    # Section II.N - Inadequate environmental control
+    "inadequate_environmental_control": {
+        "category": "Environment",
+        "subcategory": "Inadequate environmental control",
+        "severity": "yellow",
+        "definition_reference": "II.N",
+        "description": "Unable to shape surroundings for spiritual needs"
+    },
+
+    # Section II.O - Loss of independence
+    "loss_independence": {
+        "category": "Independence",
+        "subcategory": "Loss of independence",
+        "severity": "yellow",
+        "definition_reference": "II.O",
+        "description": "Dependency threatens personal and spiritual agency"
+    },
+
+    # Section II.P - Uncontrolled pain
+    "uncontrolled_pain": {
+        "category": "Medical",
+        "subcategory": "Uncontrolled pain",
+        "severity": "red",
+        "definition_reference": "II.P",
+        "description": "Persistent physical pain causing existential distress"
+    },
+
+    # Section II.Q - Spiritual pain
+    "spiritual_pain": {
+        "category": "Spiritual",
+        "subcategory": "Spiritual pain",
+        "severity": "red",
+        "definition_reference": "II.Q",
+        "description": "Soul-level suffering beyond physical symptoms"
+    },
+}
+
+
+# =============================================================================
+# DATA MODELS
+# =============================================================================
+
+@dataclass
+class DistressIndicator:
+    """
+    Detected distress indicator with category and severity.
+
+    Based on the Spiritual Distress Definitions document with color coding:
+    - RED (#ea9999): Severe distress - requires immediate attention
+    - YELLOW (#ffe599): Potential distress - requires clarification
+    """
+    indicator_text: str
+    category: str  # "Emotional", "Grief", "Existential", "Expressions", "Spiritual", "Medical", "Social", "Cultural"
+    subcategory: str  # Specific indicator name from definitions document
+    severity: str  # "red" or "yellow" - based on color coding in definitions document
+    confidence: float  # 0.0-1.0
+    definition_reference: str = ""  # Section reference (e.g., "II.D", "II.G")
+
+    def __post_init__(self):
+        """Validate severity value."""
+        if self.severity not in ("red", "yellow"):
+            raise ValueError(f"Severity must be 'red' or 'yellow', got '{self.severity}'")
+        if not 0.0 <= self.confidence <= 1.0:
+            raise ValueError(f"Confidence must be between 0.0 and 1.0, got {self.confidence}")
+
+    def to_dict(self) -> dict:
+        """Convert indicator to dictionary for serialization."""
+        return {
+            "indicator_text": self.indicator_text,
+            "category": self.category,
+            "subcategory": self.subcategory,
+            "severity": self.severity,
+            "confidence": self.confidence,
+            "definition_reference": self.definition_reference,
+        }
+
+    @classmethod
+    def from_dict(cls, data: dict) -> "DistressIndicator":
+        """Create indicator from dictionary."""
+        return cls(**data)
+
+    @classmethod
+    def from_definition(cls, indicator_key: str, indicator_text: str, confidence: float) -> "DistressIndicator":
+        """
+        Create indicator from INDICATOR_DEFINITIONS constant.
+
+        Args:
+            indicator_key: Key in INDICATOR_DEFINITIONS (e.g., "excessive_guilt")
+            indicator_text: The actual text that triggered this indicator
+            confidence: Confidence score 0.0-1.0
+
+        Returns:
+            DistressIndicator with category, subcategory, severity from definitions
+
+        Raises:
+            KeyError: If indicator_key not found in INDICATOR_DEFINITIONS
+        """
+        if indicator_key not in INDICATOR_DEFINITIONS:
+            raise KeyError(f"Unknown indicator key: {indicator_key}")
+
+        defn = INDICATOR_DEFINITIONS[indicator_key]
+        return cls(
+            indicator_text=indicator_text,
+            category=defn["category"],
+            subcategory=defn["subcategory"],
+            severity=defn["severity"],
+            confidence=confidence,
+            definition_reference=defn["definition_reference"],
+        )
+
+
+@dataclass
+class FollowUpQuestion:
+    """
+    Generated follow-up question for YELLOW cases.
+
+    Contains 1-2 short, sensitive clarifying questions with purpose explanation.
+    """
+    question_id: str
+    question_text: str
+    purpose: str  # Why this question is being asked
+
+    def to_dict(self) -> dict:
+        """Convert question to dictionary for serialization."""
+        return {
+            "question_id": self.question_id,
+            "question_text": self.question_text,
+            "purpose": self.purpose,
+        }
+
+    @classmethod
+    def from_dict(cls, data: dict) -> "FollowUpQuestion":
+        """Create question from dictionary."""
+        return cls(**data)
+
+
+@dataclass
+class ClassificationFlowResult:
+    """
+    Complete result of classification flow.
+
+    Contains all flow-specific fields for RED/YELLOW/GREEN classifications.
+    """
+    classification: str  # "red", "yellow", "green"
+    confidence: float  # 0.0-1.0
+    indicators: List[DistressIndicator] = field(default_factory=list)
+    explanation: str = ""
+
+    # RED-specific fields
+    permission_check_message: Optional[str] = None
+    referral_message: Optional[str] = None
+    consent_status: Optional[str] = None  # "granted", "declined", None
+
+    # YELLOW-specific fields
+    follow_up_questions: List[FollowUpQuestion] = field(default_factory=list)
+    patient_responses: List[str] = field(default_factory=list)
+    re_evaluation_result: Optional[str] = None  # "red", "green", None
+
+    def __post_init__(self):
+        """Validate classification value."""
+        if self.classification not in ("red", "yellow", "green"):
+            raise ValueError(f"Classification must be 'red', 'yellow', or 'green', got '{self.classification}'")
+        if not 0.0 <= self.confidence <= 1.0:
+            raise ValueError(f"Confidence must be between 0.0 and 1.0, got {self.confidence}")
+
+    def to_dict(self) -> dict:
+        """Convert result to dictionary for serialization."""
+        return {
+            "classification": self.classification,
+            "confidence": self.confidence,
+            "indicators": [i.to_dict() for i in self.indicators],
+            "explanation": self.explanation,
+            "permission_check_message": self.permission_check_message,
+            "referral_message": self.referral_message,
+            "consent_status": self.consent_status,
+            "follow_up_questions": [q.to_dict() for q in self.follow_up_questions],
+            "patient_responses": self.patient_responses,
+            "re_evaluation_result": self.re_evaluation_result,
+        }
+
+    @classmethod
+    def from_dict(cls, data: dict) -> "ClassificationFlowResult":
+        """Create result from dictionary."""
+        data_copy = data.copy()
+
+        # Convert nested indicators
+        indicators_data = data_copy.pop("indicators", [])
+        indicators = [DistressIndicator.from_dict(i) for i in indicators_data]
+
+        # Convert nested follow-up questions
+        questions_data = data_copy.pop("follow_up_questions", [])
+        questions = [FollowUpQuestion.from_dict(q) for q in questions_data]
+
+        result = cls(**data_copy)
+        result.indicators = indicators
+        result.follow_up_questions = questions
+        return result
+
+
+# Tagging category constants
+CLASSIFICATION_SUBCATEGORIES = [
+    "missed_indicators",  # Missed key distress indicators
+    "false_positive",  # Overly sensitive (false-positive flag)
+    "missed_distress",  # Not sensitive enough (missed distress)
+]
+
+QUESTION_ISSUE_TYPES = [
+    "inappropriate",  # Question is inappropriate or intrusive
+    "not_relevant",  # Question is not spiritually relevant
+    "too_leading",  # Question is too leading or assumptive
+    "unclear",  # Question is unclear or confusing
+    "tone_clinical",  # Tone too clinical
+    "tone_religious",  # Tone too religious
+    "tone_casual",  # Tone too casual
+]

+REFERRAL_ISSUE_TYPES = [
+    "incomplete_summary",  # Incorrect or incomplete summary
+    "misrepresentation",  # Misrepresentation of patient message
+    "inappropriate_tone",  # Tone inappropriate for spiritual care team
+]
+
+
+@dataclass
+class TaggingRecord:
+    """
+    Structured tagging feedback from chaplain.
+
+    Supports multi-select for question and referral issues.
+    """
+    record_id: str
+    message_id: str
+
+    # Classification feedback
+    is_classification_correct: bool = True
+    classification_subcategory: Optional[str] = None  # "missed_indicators", "false_positive", "missed_distress"
+    correct_classification: Optional[str] = None  # "red", "yellow", "green"
+
+    # Follow-up question feedback (YELLOW only)
+    question_issues: List[str] = field(default_factory=list)  # Multi-select from QUESTION_ISSUE_TYPES
+    question_comments: Optional[str] = None
+
+    # Referral message feedback (RED only)
+    referral_issues: List[str] = field(default_factory=list)  # Multi-select from REFERRAL_ISSUE_TYPES
|
| 631 |
+
referral_comments: Optional[str] = None
|
| 632 |
+
|
| 633 |
+
# Indicator feedback
|
| 634 |
+
indicator_issues: List[str] = field(default_factory=list) # List of incorrectly identified indicator IDs
|
| 635 |
+
indicator_comments: Optional[str] = None
|
| 636 |
+
|
| 637 |
+
# General
|
| 638 |
+
general_notes: str = ""
|
| 639 |
+
timestamp: datetime = field(default_factory=datetime.now)
|
| 640 |
+
|
| 641 |
+
def __post_init__(self):
|
| 642 |
+
"""Validate tagging values."""
|
| 643 |
+
if self.classification_subcategory and self.classification_subcategory not in CLASSIFICATION_SUBCATEGORIES:
|
| 644 |
+
raise ValueError(f"Invalid classification subcategory: {self.classification_subcategory}")
|
| 645 |
+
if self.correct_classification and self.correct_classification not in ("red", "yellow", "green"):
|
| 646 |
+
raise ValueError(f"Invalid correct_classification: {self.correct_classification}")
|
| 647 |
+
for issue in self.question_issues:
|
| 648 |
+
if issue not in QUESTION_ISSUE_TYPES:
|
| 649 |
+
raise ValueError(f"Invalid question issue type: {issue}")
|
| 650 |
+
for issue in self.referral_issues:
|
| 651 |
+
if issue not in REFERRAL_ISSUE_TYPES:
|
| 652 |
+
raise ValueError(f"Invalid referral issue type: {issue}")
|
| 653 |
+
|
| 654 |
+
def to_dict(self) -> dict:
|
| 655 |
+
"""Convert record to dictionary for serialization."""
|
| 656 |
+
return {
|
| 657 |
+
"record_id": self.record_id,
|
| 658 |
+
"message_id": self.message_id,
|
| 659 |
+
"is_classification_correct": self.is_classification_correct,
|
| 660 |
+
"classification_subcategory": self.classification_subcategory,
|
| 661 |
+
"correct_classification": self.correct_classification,
|
| 662 |
+
"question_issues": self.question_issues,
|
| 663 |
+
"question_comments": self.question_comments,
|
| 664 |
+
"referral_issues": self.referral_issues,
|
| 665 |
+
"referral_comments": self.referral_comments,
|
| 666 |
+
"indicator_issues": self.indicator_issues,
|
| 667 |
+
"indicator_comments": self.indicator_comments,
|
| 668 |
+
"general_notes": self.general_notes,
|
| 669 |
+
"timestamp": self.timestamp.isoformat(),
|
| 670 |
+
}
|
| 671 |
+
|
| 672 |
+
@classmethod
|
| 673 |
+
def from_dict(cls, data: dict) -> "TaggingRecord":
|
| 674 |
+
"""Create record from dictionary."""
|
| 675 |
+
data_copy = data.copy()
|
| 676 |
+
if isinstance(data_copy.get("timestamp"), str):
|
| 677 |
+
data_copy["timestamp"] = datetime.fromisoformat(data_copy["timestamp"])
|
| 678 |
+
return cls(**data_copy)
|
| 679 |
+
|
| 680 |
+
|
| 681 |
+
|
| 682 |
+
# Interaction step types
|
| 683 |
+
INTERACTION_STEP_TYPES = [
|
| 684 |
+
"classification", # Initial classification
|
| 685 |
+
"explanation", # Explanation generation
|
| 686 |
+
"permission_check", # Patient consent request
|
| 687 |
+
"follow_up", # Follow-up questions
|
| 688 |
+
"referral", # Referral message generation
|
| 689 |
+
"feedback", # Chaplain feedback
|
| 690 |
+
]
|
| 691 |
+
|
| 692 |
+
|
| 693 |
+
@dataclass
|
| 694 |
+
class InteractionStepLog:
|
| 695 |
+
"""
|
| 696 |
+
Log entry for a single interaction step.
|
| 697 |
+
|
| 698 |
+
Records all interaction steps with input/output for analysis.
|
| 699 |
+
"""
|
| 700 |
+
step_id: str
|
| 701 |
+
session_id: str
|
| 702 |
+
message_id: str
|
| 703 |
+
step_type: str # "classification", "explanation", "permission_check", "follow_up", "referral", "feedback"
|
| 704 |
+
input_text: str
|
| 705 |
+
model_output: str
|
| 706 |
+
approval_status: Optional[str] = None # "approved", "disapproved", None
|
| 707 |
+
tagging_data: Optional[TaggingRecord] = None
|
| 708 |
+
timestamp: datetime = field(default_factory=datetime.now)
|
| 709 |
+
|
| 710 |
+
def __post_init__(self):
|
| 711 |
+
"""Validate step type."""
|
| 712 |
+
if self.step_type not in INTERACTION_STEP_TYPES:
|
| 713 |
+
raise ValueError(f"Invalid step type: {self.step_type}")
|
| 714 |
+
if self.approval_status and self.approval_status not in ("approved", "disapproved"):
|
| 715 |
+
raise ValueError(f"Invalid approval status: {self.approval_status}")
|
| 716 |
+
|
| 717 |
+
def to_dict(self) -> dict:
|
| 718 |
+
"""Convert log entry to dictionary for serialization."""
|
| 719 |
+
return {
|
| 720 |
+
"step_id": self.step_id,
|
| 721 |
+
"session_id": self.session_id,
|
| 722 |
+
"message_id": self.message_id,
|
| 723 |
+
"step_type": self.step_type,
|
| 724 |
+
"input_text": self.input_text,
|
| 725 |
+
"model_output": self.model_output,
|
| 726 |
+
"approval_status": self.approval_status,
|
| 727 |
+
"tagging_data": self.tagging_data.to_dict() if self.tagging_data else None,
|
| 728 |
+
"timestamp": self.timestamp.isoformat(),
|
| 729 |
+
}
|
| 730 |
+
|
| 731 |
+
@classmethod
|
| 732 |
+
def from_dict(cls, data: dict) -> "InteractionStepLog":
|
| 733 |
+
"""Create log entry from dictionary."""
|
| 734 |
+
data_copy = data.copy()
|
| 735 |
+
if isinstance(data_copy.get("timestamp"), str):
|
| 736 |
+
data_copy["timestamp"] = datetime.fromisoformat(data_copy["timestamp"])
|
| 737 |
+
|
| 738 |
+
# Convert nested tagging data
|
| 739 |
+
tagging_data = data_copy.pop("tagging_data", None)
|
| 740 |
+
if tagging_data:
|
| 741 |
+
tagging_data = TaggingRecord.from_dict(tagging_data)
|
| 742 |
+
|
| 743 |
+
log = cls(**data_copy)
|
| 744 |
+
log.tagging_data = tagging_data
|
| 745 |
+
return log
|
src/core/classification_flow_manager.py
ADDED

@@ -0,0 +1,310 @@
# classification_flow_manager.py
"""
Classification Flow Manager for Chaplain Feedback System.

Orchestrates RED/YELLOW/GREEN classification flows and integrates with ContentGenerator
to produce complete classification results with appropriate content.
"""

from typing import List, Optional
import uuid
from datetime import datetime

from src.core.chaplain_models import (
    DistressIndicator,
    FollowUpQuestion,
    ClassificationFlowResult,
)
from src.core.content_generator import ContentGenerator


class ClassificationFlowManager:
    """
    Orchestrates RED/YELLOW/GREEN classification flows.

    Integrates with ContentGenerator to produce complete classification results
    with explanations, permission checks, referral messages, and follow-up questions.
    """

    def __init__(self, content_generator: Optional[ContentGenerator] = None):
        """
        Initialize flow manager.

        Args:
            content_generator: ContentGenerator instance; creates a new one if None
        """
        self.content_generator = content_generator or ContentGenerator()

    def execute_classification_flow(
        self,
        message: str,
        classification: str,
        confidence: float,
        indicators: List[DistressIndicator]
    ) -> ClassificationFlowResult:
        """
        Execute complete classification flow based on classification type.

        Args:
            message: Original patient message
            classification: "red", "yellow", or "green"
            confidence: Classification confidence (0.0-1.0)
            indicators: List of detected distress indicators

        Returns:
            Complete ClassificationFlowResult with all generated content
        """
        if classification == "red":
            return self.execute_red_flow(message, confidence, indicators)
        elif classification == "yellow":
            return self.execute_yellow_flow(message, confidence, indicators)
        elif classification == "green":
            return self.execute_green_flow(message, confidence, indicators)
        else:
            raise ValueError(f"Invalid classification: {classification}")

    def execute_red_flow(
        self,
        message: str,
        confidence: float,
        indicators: List[DistressIndicator],
        consent_status: Optional[str] = None
    ) -> ClassificationFlowResult:
        """
        Execute RED flag flow.

        Generates explanation, permission check, and referral message.
        Handles consent granted/declined states.

        Args:
            message: Original patient message
            confidence: Classification confidence
            indicators: List of detected distress indicators
            consent_status: "granted", "declined", or None for simulation

        Returns:
            ClassificationFlowResult with RED flow content
        """
        # Generate explanation
        explanation = self.content_generator.generate_explanation(
            "red", indicators, message
        )

        # Generate permission check message
        permission_check = self.content_generator.generate_permission_check(indicators)

        # Simulate consent if not provided
        if consent_status is None:
            # For testing/demo purposes, simulate consent as granted.
            # In a real implementation, this would come from user interaction.
            consent_status = "granted"

        # Generate referral message if consent granted
        referral_message = None
        if consent_status == "granted":
            referral_message = self.content_generator.generate_referral_message(
                message, indicators, explanation
            )

        return ClassificationFlowResult(
            classification="red",
            confidence=confidence,
            indicators=indicators,
            explanation=explanation,
            permission_check_message=permission_check,
            referral_message=referral_message,
            consent_status=consent_status,
        )

    def execute_yellow_flow(
        self,
        message: str,
        confidence: float,
        indicators: List[DistressIndicator],
        patient_responses: Optional[List[str]] = None
    ) -> ClassificationFlowResult:
        """
        Execute YELLOW flag flow.

        Generates explanation and follow-up questions.
        Handles re-evaluation based on responses.

        Args:
            message: Original patient message
            confidence: Classification confidence
            indicators: List of detected distress indicators
            patient_responses: Simulated patient responses to follow-up questions

        Returns:
            ClassificationFlowResult with YELLOW flow content
        """
        # Generate explanation
        explanation = self.content_generator.generate_explanation(
            "yellow", indicators, message
        )

        # Generate follow-up questions
        follow_up_questions = self.content_generator.generate_follow_up_questions(
            message, indicators
        )

        # Handle patient responses and re-evaluation
        re_evaluation_result = None
        if patient_responses is None:
            # Simulate patient responses for demo/testing
            patient_responses = self._simulate_patient_responses(follow_up_questions)

        if patient_responses:
            re_evaluation_result = self._evaluate_patient_responses(patient_responses)

        return ClassificationFlowResult(
            classification="yellow",
            confidence=confidence,
            indicators=indicators,
            explanation=explanation,
            follow_up_questions=follow_up_questions,
            patient_responses=patient_responses,
            re_evaluation_result=re_evaluation_result,
        )

    def execute_green_flow(
        self,
        message: str,
        confidence: float,
        indicators: List[DistressIndicator]
    ) -> ClassificationFlowResult:
        """
        Execute GREEN flag flow.

        Generates explanation for no indicators.
        Displays "No further steps" status.

        Args:
            message: Original patient message
            confidence: Classification confidence
            indicators: List of detected distress indicators (should be empty)

        Returns:
            ClassificationFlowResult with GREEN flow content
        """
        # Generate explanation
        explanation = self.content_generator.generate_explanation(
            "green", indicators, message
        )

        return ClassificationFlowResult(
            classification="green",
            confidence=confidence,
            indicators=indicators,
            explanation=explanation,
        )

    def escalate_yellow_to_red(
        self,
        yellow_result: ClassificationFlowResult,
        message: str
    ) -> ClassificationFlowResult:
        """
        Escalate YELLOW classification to RED based on patient responses.

        Args:
            yellow_result: Original YELLOW classification result
            message: Original patient message

        Returns:
            New RED ClassificationFlowResult
        """
        # Carry the YELLOW indicators over into the escalated RED flow
        escalated_indicators = yellow_result.indicators.copy()

        # Execute RED flow with escalated indicators
        return self.execute_red_flow(
            message,
            confidence=0.85,  # High confidence for escalated case
            indicators=escalated_indicators,
            consent_status="granted"  # Assume consent for escalated cases
        )

    def downgrade_yellow_to_green(
        self,
        yellow_result: ClassificationFlowResult,
        message: str
    ) -> ClassificationFlowResult:
        """
        Downgrade YELLOW classification to GREEN based on patient responses.

        Args:
            yellow_result: Original YELLOW classification result
            message: Original patient message

        Returns:
            New GREEN ClassificationFlowResult
        """
        # Execute GREEN flow
        return self.execute_green_flow(
            message,
            confidence=0.80,  # High confidence for downgraded case
            indicators=[]  # No indicators for GREEN
        )

    def _simulate_patient_responses(
        self,
        questions: List[FollowUpQuestion]
    ) -> List[str]:
        """
        Simulate patient responses to follow-up questions for demo/testing.

        Args:
            questions: List of follow-up questions

        Returns:
            List of simulated patient responses
        """
        # Simple simulation - in a real implementation, these would come from the user
        responses = [
            "I've been feeling okay, just worried about my treatment.",
            "I have my family to talk to, but sometimes I feel alone.",
            "I think I'd like to talk to someone from the care team."
        ]

        # Return responses matching the number of questions
        return responses[:len(questions)]

    def _evaluate_patient_responses(
        self,
        responses: List[str]
    ) -> Optional[str]:
        """
        Evaluate patient responses to determine whether escalation or downgrade is needed.

        Args:
            responses: List of patient responses

        Returns:
            "red" for escalation, "green" for downgrade, None for no change
        """
        # Simple keyword-based evaluation for demo/testing.
        # A real implementation would use more sophisticated analysis.

        combined_responses = " ".join(responses).lower()

        # Check for escalation keywords (distress indicators)
        escalation_keywords = [
            "hopeless", "worthless", "can't go on", "want to die",
            "no point", "give up", "unbearable", "can't take it"
        ]

        if any(keyword in combined_responses for keyword in escalation_keywords):
            return "red"

        # Check for downgrade keywords (positive indicators)
        downgrade_keywords = [
            "feeling better", "okay", "fine", "good support",
            "not worried", "managing well", "hopeful"
        ]

        if any(keyword in combined_responses for keyword in downgrade_keywords):
            return "green"

        # No clear indication - remain YELLOW
        return None
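The re-evaluation step in the flow manager is plain substring matching over the concatenated, lower-cased responses. A self-contained sketch of that decision rule (keyword lists copied from `_evaluate_patient_responses`; escalation is checked first, so it wins when both lists match):

```python
from typing import List, Optional

# Keyword lists as defined in _evaluate_patient_responses
ESCALATION_KEYWORDS = [
    "hopeless", "worthless", "can't go on", "want to die",
    "no point", "give up", "unbearable", "can't take it",
]
DOWNGRADE_KEYWORDS = [
    "feeling better", "okay", "fine", "good support",
    "not worried", "managing well", "hopeful",
]


def evaluate_responses(responses: List[str]) -> Optional[str]:
    """Return "red" to escalate, "green" to downgrade, None to stay YELLOW."""
    combined = " ".join(responses).lower()
    # Escalation is checked before downgrade, so distress language dominates
    if any(k in combined for k in ESCALATION_KEYWORDS):
        return "red"
    if any(k in combined for k in DOWNGRADE_KEYWORDS):
        return "green"
    return None


# "hopeless" trips escalation even though "okay" also appears
assert evaluate_responses(["I feel hopeless but okay"]) == "red"
```

Note the matching is naive substring containment, so a keyword embedded in a longer word would also match; the check-escalation-first ordering is what keeps mixed responses on the safe side.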
src/core/content_generator.py
ADDED

@@ -0,0 +1,346 @@
# content_generator.py
"""
Content Generation Service for Chaplain Feedback System.

Generates explanations, permission checks, referral messages, and follow-up questions
for RED/YELLOW/GREEN classification flows.
"""

from typing import List
import uuid

from src.core.chaplain_models import (
    DistressIndicator,
    FollowUpQuestion,
)


class ContentGenerator:
    """
    Generates content for classification flows.

    Provides methods to generate:
    - Explanations for RED/YELLOW/GREEN classifications
    - Permission check messages for RED cases
    - Referral messages for spiritual care team
    - Follow-up questions for YELLOW cases
    """

    def generate_explanation(
        self,
        classification: str,
        indicators: List[DistressIndicator],
        message: str
    ) -> str:
        """
        Generate explanation for classification.

        Args:
            classification: "red", "yellow", or "green"
            indicators: List of detected distress indicators
            message: Original patient message

        Returns:
            Explanation text referencing distress indicators
        """
        if classification == "red":
            return self._generate_red_explanation(indicators, message)
        elif classification == "yellow":
            return self._generate_yellow_explanation(indicators, message)
        else:
            return self._generate_green_explanation(message)

    def _generate_red_explanation(
        self,
        indicators: List[DistressIndicator],
        message: str
    ) -> str:
        """Generate explanation for RED classification."""
        explanation_parts = [
            "This message has been classified as RED FLAG (severe spiritual distress) "
            "requiring immediate attention from the spiritual care team."
        ]

        if indicators:
            explanation_parts.append("\n\nDetected distress indicators:")
            for indicator in indicators:
                indicator_line = (
                    f"\n- {indicator.subcategory} ({indicator.category}): "
                    f"'{indicator.indicator_text}' "
                    f"[Ref: {indicator.definition_reference}, Confidence: {indicator.confidence:.0%}]"
                )
                explanation_parts.append(indicator_line)

        explanation_parts.append(
            "\n\nThis classification indicates severe spiritual distress that requires "
            "immediate referral to the spiritual health team. The indicators suggest "
            "the patient may benefit from professional spiritual care support."
        )

        return "".join(explanation_parts)

    def _generate_yellow_explanation(
        self,
        indicators: List[DistressIndicator],
        message: str
    ) -> str:
        """Generate explanation for YELLOW classification."""
        explanation_parts = [
            "This message has been classified as YELLOW FLAG (potential spiritual distress) "
            "requiring clarifying questions."
        ]

        if indicators:
            explanation_parts.append("\n\nDetected potential distress indicators:")
            for indicator in indicators:
                indicator_line = (
                    f"\n- {indicator.subcategory} ({indicator.category}): "
                    f"'{indicator.indicator_text}' "
                    f"[Ref: {indicator.definition_reference}, Confidence: {indicator.confidence:.0%}]"
                )
                explanation_parts.append(indicator_line)

        # Explain why not RED
        explanation_parts.append(
            "\n\nWhy not RED: The indicators detected suggest potential distress but "
            "do not meet the threshold for severe spiritual distress requiring immediate "
            "referral. Further clarification is needed to determine the severity."
        )

        # Explain why not GREEN
        explanation_parts.append(
            "\n\nWhy not GREEN: The message contains indicators that suggest possible "
            "spiritual concerns that warrant follow-up questions to better understand "
            "the patient's spiritual state."
        )

        return "".join(explanation_parts)

    def _generate_green_explanation(self, message: str) -> str:
        """Generate explanation for GREEN classification."""
        explanation_parts = [
            "This message has been classified as GREEN (no spiritual distress indicators detected)."
        ]

        explanation_parts.append(
            "\n\nNo spiritual distress indicators were found in this message. "
            "The content does not suggest spiritual concerns that require follow-up "
            "or referral to the spiritual care team."
        )

        # Explain why not RED or YELLOW
        explanation_parts.append(
            "\n\nWhy not RED or YELLOW: The message does not contain expressions of "
            "spiritual distress, grief, existential questioning, or other indicators "
            "defined in the spiritual distress definitions document."
        )

        explanation_parts.append("\n\nNo further steps required.")

        return "".join(explanation_parts)

    def generate_permission_check(
        self,
        indicators: List[DistressIndicator]
    ) -> str:
        """
        Generate patient consent request message for RED cases.

        Args:
            indicators: List of detected distress indicators

        Returns:
            Permission check message with spiritual support and consent language
        """
        message_parts = [
            "We noticed some things in your message that suggest you might be going "
            "through a difficult time spiritually or emotionally."
        ]

        message_parts.append(
            "\n\nOur hospital has a spiritual care team that provides support to "
            "patients who are experiencing spiritual distress. They can offer "
            "compassionate listening, spiritual guidance, and emotional support."
        )

        message_parts.append(
            "\n\nWould you like us to connect you with a member of our spiritual "
            "care team? Your consent is important to us, and this referral is "
            "entirely voluntary."
        )

        message_parts.append(
            "\n\nPlease let us know if you would like spiritual support, or if you "
            "prefer not to be contacted by the spiritual care team at this time."
        )

        return "".join(message_parts)

    def generate_referral_message(
        self,
        message: str,
        indicators: List[DistressIndicator],
        explanation: str
    ) -> str:
        """
        Generate referral message for spiritual care team.

        Args:
            message: Original patient message
            indicators: List of detected distress indicators
            explanation: Classification explanation

        Returns:
            Referral message with background, indicators, and justification
        """
        referral_parts = ["SPIRITUAL CARE TEAM REFERRAL"]
        referral_parts.append("\n" + "=" * 40)

        # Background section
        referral_parts.append("\n\nBACKGROUND:")
        referral_parts.append(
            f"\nPatient message excerpt: \"{message[:200]}{'...' if len(message) > 200 else ''}\""
        )

        # Indicators section
        referral_parts.append("\n\nINDICATORS DETECTED:")
        if indicators:
            for indicator in indicators:
                referral_parts.append(
                    f"\n- {indicator.subcategory} ({indicator.category})"
                )
                referral_parts.append(f"\n  Severity: {indicator.severity.upper()}")
                referral_parts.append(f"\n  Reference: {indicator.definition_reference}")
                referral_parts.append(f"\n  Confidence: {indicator.confidence:.0%}")
                referral_parts.append(f"\n  Text: \"{indicator.indicator_text}\"")
        else:
            referral_parts.append("\n- No specific indicators (general distress detected)")

        # Justification section
        referral_parts.append("\n\nJUSTIFICATION FOR RED FLAG:")
        referral_parts.append(
            "\nThis patient has been flagged for immediate spiritual care attention "
            "based on the severity of distress indicators detected in their message. "
        )

        if indicators:
            red_indicators = [i for i in indicators if i.severity == "red"]
            if red_indicators:
                referral_parts.append(
                    "\n\nThe following severe (RED) indicators were identified: "
                    f"{', '.join(i.subcategory for i in red_indicators)}."
                )

        referral_parts.append(
            "\n\nRecommended action: Please reach out to this patient at your "
            "earliest convenience to provide spiritual support and assessment."
        )

        referral_parts.append("\n\n" + "=" * 40)
        referral_parts.append("\nPatient has provided consent for this referral.")

        return "".join(referral_parts)

    def generate_follow_up_questions(
        self,
        message: str,
        indicators: List[DistressIndicator]
    ) -> List[FollowUpQuestion]:
        """
        Generate 2-3 clarifying questions for YELLOW cases.

        Each question contains 1-2 short, sensitive clarifying questions
        with a purpose explanation.

        Args:
            message: Original patient message
            indicators: List of detected distress indicators

        Returns:
            List of 2-3 FollowUpQuestion instances
        """
        questions = []

        # Generate questions based on indicator categories
        categories = set(i.category for i in indicators) if indicators else set()

        # Question 1: General well-being check
+
questions.append(FollowUpQuestion(
|
| 270 |
+
question_id=str(uuid.uuid4())[:8],
|
| 271 |
+
question_text=(
|
| 272 |
+
"How have you been feeling overall lately? "
|
| 273 |
+
"Is there anything specific that's been on your mind?"
|
| 274 |
+
),
|
| 275 |
+
purpose=(
|
| 276 |
+
"To understand the patient's general emotional and spiritual state "
|
| 277 |
+
"and identify any underlying concerns."
|
| 278 |
+
)
|
| 279 |
+
))
|
| 280 |
+
|
| 281 |
+
# Question 2: Based on detected categories or general spiritual inquiry
|
| 282 |
+
if "Grief" in categories:
|
| 283 |
+
questions.append(FollowUpQuestion(
|
| 284 |
+
question_id=str(uuid.uuid4())[:8],
|
| 285 |
+
question_text=(
|
| 286 |
+
"It sounds like you may be dealing with some difficult feelings. "
|
| 287 |
+
"Would you like to share more about what you're experiencing?"
|
| 288 |
+
),
|
| 289 |
+
purpose=(
|
| 290 |
+
"To explore potential grief-related concerns and provide "
|
| 291 |
+
"opportunity for the patient to express their feelings."
|
| 292 |
+
)
|
| 293 |
+
))
|
| 294 |
+
elif "Existential" in categories:
|
| 295 |
+
questions.append(FollowUpQuestion(
|
| 296 |
+
question_id=str(uuid.uuid4())[:8],
|
| 297 |
+
question_text=(
|
| 298 |
+
"Sometimes when we're going through health challenges, we find "
|
| 299 |
+
"ourselves thinking about bigger questions. Is that something "
|
| 300 |
+
"you'd like to talk about?"
|
| 301 |
+
),
|
| 302 |
+
purpose=(
|
| 303 |
+
"To explore existential concerns and meaning-making in the "
|
| 304 |
+
"context of the patient's health situation."
|
| 305 |
+
)
|
| 306 |
+
))
|
| 307 |
+
elif "Spiritual" in categories:
|
| 308 |
+
questions.append(FollowUpQuestion(
|
| 309 |
+
question_id=str(uuid.uuid4())[:8],
|
| 310 |
+
question_text=(
|
| 311 |
+
"Do you have any spiritual or religious practices that are "
|
| 312 |
+
"important to you? How has your current situation affected them?"
|
| 313 |
+
),
|
| 314 |
+
purpose=(
|
| 315 |
+
"To understand the patient's spiritual background and how "
|
| 316 |
+
"their current situation may be impacting their spiritual life."
|
| 317 |
+
)
|
| 318 |
+
))
|
| 319 |
+
else:
|
| 320 |
+
questions.append(FollowUpQuestion(
|
| 321 |
+
question_id=str(uuid.uuid4())[:8],
|
| 322 |
+
question_text=(
|
| 323 |
+
"Is there anything that's been particularly challenging for you "
|
| 324 |
+
"during this time? What kind of support would be most helpful?"
|
| 325 |
+
),
|
| 326 |
+
purpose=(
|
| 327 |
+
"To identify specific challenges and understand what type of "
|
| 328 |
+
"support the patient might need."
|
| 329 |
+
)
|
| 330 |
+
))
|
| 331 |
+
|
| 332 |
+
# Question 3: Support and resources
|
| 333 |
+
questions.append(FollowUpQuestion(
|
| 334 |
+
question_id=str(uuid.uuid4())[:8],
|
| 335 |
+
question_text=(
|
| 336 |
+
"Do you have people in your life you can talk to about these things? "
|
| 337 |
+
"Would you be interested in speaking with someone from our care team?"
|
| 338 |
+
),
|
| 339 |
+
purpose=(
|
| 340 |
+
"To assess the patient's support system and gauge interest in "
|
| 341 |
+
"additional spiritual care resources."
|
| 342 |
+
)
|
| 343 |
+
))
|
| 344 |
+
|
| 345 |
+
# Ensure we return 2-3 questions
|
| 346 |
+
return questions[:3]
|
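The question-selection logic above is a first-match if/elif chain over indicator categories. A minimal, self-contained sketch of that selection pattern (using a stand-in `FollowUpQuestion` dataclass and hypothetical question text, not the real `chaplain_models` definitions):

```python
from dataclasses import dataclass
import uuid


@dataclass
class FollowUpQuestion:
    # Minimal stand-in for the model in chaplain_models (assumed fields)
    question_id: str
    question_text: str
    purpose: str


# Hypothetical category-specific prompts mirroring the if/elif branches
CATEGORY_QUESTIONS = {
    "Grief": "Would you like to share more about what you're experiencing?",
    "Existential": "Is that something you'd like to talk about?",
    "Spiritual": "How has your current situation affected your practices?",
}


def pick_questions(categories: set) -> list:
    # Question 1: always a general well-being check
    questions = [FollowUpQuestion(str(uuid.uuid4())[:8],
                                  "How have you been feeling overall lately?",
                                  "General well-being check")]
    # Question 2: first matching category wins, mirroring the if/elif chain
    for category in ("Grief", "Existential", "Spiritual"):
        if category in categories:
            questions.append(FollowUpQuestion(str(uuid.uuid4())[:8],
                                              CATEGORY_QUESTIONS[category],
                                              f"Explore {category.lower()} concerns"))
            break
    else:
        # No category matched: fall back to a general support question
        questions.append(FollowUpQuestion(str(uuid.uuid4())[:8],
                                          "What kind of support would be most helpful?",
                                          "Identify specific challenges"))
    # Question 3: support system and resources
    questions.append(FollowUpQuestion(str(uuid.uuid4())[:8],
                                      "Would you be interested in speaking with our care team?",
                                      "Assess support system"))
    return questions[:3]


qs = pick_questions({"Grief", "Spiritual"})
print(len(qs))  # 3
```

Because the branches are exclusive, a patient whose message shows both grief and spiritual indicators still gets exactly one category-specific question, keeping the total at three.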
src/core/error_pattern_analyzer.py  ADDED  @@ -0,0 +1,283 @@
# error_pattern_analyzer.py
"""
Error Pattern Analyzer for Chaplain Feedback System.

Analyzes tagging records to identify error patterns, calculate subcategory
breakdowns, and provide insights into classifier performance.
"""

from typing import List, Dict, Any

from .chaplain_models import (
    TaggingRecord,
    CLASSIFICATION_SUBCATEGORIES,
    QUESTION_ISSUE_TYPES,
    REFERRAL_ISSUE_TYPES,
)


class ErrorPatternAnalyzer:
    """
    Analyzes error patterns from tagging records.

    Provides methods to calculate subcategory breakdowns, identify common
    error patterns, and generate statistics for session analysis.
    """

    def __init__(self):
        """Initialize the error pattern analyzer."""
        pass

    def analyze_classification_errors(
        self,
        records: List[TaggingRecord]
    ) -> Dict[str, int]:
        """
        Get breakdown of classification error subcategories.

        Counts how many times each classification error subcategory appears
        in the provided records.

        Args:
            records: List of TaggingRecord instances to analyze

        Returns:
            Dictionary mapping subcategory names to counts
            Example: {
                "missed_indicators": 5,
                "false_positive": 2,
                "missed_distress": 3
            }
        """
        subcategory_counts = {subcategory: 0 for subcategory in CLASSIFICATION_SUBCATEGORIES}

        for record in records:
            # Only count records where classification is incorrect
            if not record.is_classification_correct and record.classification_subcategory:
                subcategory = record.classification_subcategory
                if subcategory in subcategory_counts:
                    subcategory_counts[subcategory] += 1

        return subcategory_counts

    def analyze_question_issues(
        self,
        records: List[TaggingRecord]
    ) -> Dict[str, int]:
        """
        Get breakdown of follow-up question issues by subcategory.

        Counts how many times each question issue type appears across
        all records (supporting multi-select).

        Args:
            records: List of TaggingRecord instances to analyze

        Returns:
            Dictionary mapping issue type names to counts
            Example: {
                "inappropriate": 3,
                "not_relevant": 2,
                "too_leading": 1,
                "unclear": 0,
                "tone_clinical": 2,
                "tone_religious": 0,
                "tone_casual": 1
            }
        """
        issue_counts = {issue_type: 0 for issue_type in QUESTION_ISSUE_TYPES}

        for record in records:
            # Count each issue type in the multi-select list
            for issue in record.question_issues:
                if issue in issue_counts:
                    issue_counts[issue] += 1

        return issue_counts

    def analyze_referral_issues(
        self,
        records: List[TaggingRecord]
    ) -> Dict[str, int]:
        """
        Get breakdown of referral message issues by subcategory.

        Counts how many times each referral issue type appears across
        all records (supporting multi-select).

        Args:
            records: List of TaggingRecord instances to analyze

        Returns:
            Dictionary mapping issue type names to counts
            Example: {
                "incomplete_summary": 2,
                "misrepresentation": 1,
                "inappropriate_tone": 3
            }
        """
        issue_counts = {issue_type: 0 for issue_type in REFERRAL_ISSUE_TYPES}

        for record in records:
            # Count each issue type in the multi-select list
            for issue in record.referral_issues:
                if issue in issue_counts:
                    issue_counts[issue] += 1

        return issue_counts

    def analyze_indicator_issues(
        self,
        records: List[TaggingRecord]
    ) -> Dict[str, int]:
        """
        Get breakdown of commonly missed/incorrectly identified indicators.

        Counts how many times each indicator ID appears in the indicator_issues
        lists across all records.

        Args:
            records: List of TaggingRecord instances to analyze

        Returns:
            Dictionary mapping indicator IDs to counts
            Example: {
                "excessive_guilt": 3,
                "crying": 2,
                "anxiety": 1
            }
        """
        indicator_counts: Dict[str, int] = {}

        for record in records:
            # Count each indicator in the list
            for indicator_id in record.indicator_issues:
                if indicator_id not in indicator_counts:
                    indicator_counts[indicator_id] = 0
                indicator_counts[indicator_id] += 1

        return indicator_counts

    def get_common_patterns(
        self,
        records: List[TaggingRecord]
    ) -> List[str]:
        """
        Get list of common error patterns in plain language.

        Analyzes all error types and returns human-readable descriptions
        of the most common patterns found in the records.

        Args:
            records: List of TaggingRecord instances to analyze

        Returns:
            List of plain-language descriptions of common patterns
            Example: [
                "Most common classification error: missed_indicators (5 occurrences)",
                "Most common question issue: inappropriate (3 occurrences)",
                "Most common referral issue: inappropriate_tone (3 occurrences)"
            ]
        """
        patterns = []

        # Analyze classification errors
        classification_errors = self.analyze_classification_errors(records)
        if any(classification_errors.values()):
            max_error = max(classification_errors.items(), key=lambda x: x[1])
            if max_error[1] > 0:
                patterns.append(
                    f"Most common classification error: {max_error[0]} ({max_error[1]} occurrences)"
                )

        # Analyze question issues
        question_issues = self.analyze_question_issues(records)
        if any(question_issues.values()):
            max_issue = max(question_issues.items(), key=lambda x: x[1])
            if max_issue[1] > 0:
                patterns.append(
                    f"Most common question issue: {max_issue[0]} ({max_issue[1]} occurrences)"
                )

        # Analyze referral issues
        referral_issues = self.analyze_referral_issues(records)
        if any(referral_issues.values()):
            max_issue = max(referral_issues.items(), key=lambda x: x[1])
            if max_issue[1] > 0:
                patterns.append(
                    f"Most common referral issue: {max_issue[0]} ({max_issue[1]} occurrences)"
                )

        # Analyze indicator issues
        indicator_issues = self.analyze_indicator_issues(records)
        if indicator_issues:
            max_indicator = max(indicator_issues.items(), key=lambda x: x[1])
            if max_indicator[1] > 0:
                patterns.append(
                    f"Most commonly missed/incorrect indicator: {max_indicator[0]} ({max_indicator[1]} occurrences)"
                )

        return patterns

    def get_statistics_summary(
        self,
        records: List[TaggingRecord]
    ) -> Dict[str, Any]:
        """
        Get comprehensive statistics summary for a session.

        Combines all analysis methods into a single summary dictionary
        suitable for display or export.

        Args:
            records: List of TaggingRecord instances to analyze

        Returns:
            Dictionary containing all statistics
            Example: {
                "total_records": 10,
                "classification_errors": {...},
                "question_issues": {...},
                "referral_issues": {...},
                "indicator_issues": {...},
                "common_patterns": [...]
            }
        """
        return {
            "total_records": len(records),
            "classification_errors": self.analyze_classification_errors(records),
            "question_issues": self.analyze_question_issues(records),
            "referral_issues": self.analyze_referral_issues(records),
            "indicator_issues": self.analyze_indicator_issues(records),
            "common_patterns": self.get_common_patterns(records),
        }

    def get_error_patterns_grouped_by_type(
        self,
        records: List[TaggingRecord]
    ) -> Dict[str, Dict[str, int]]:
        """
        Get error patterns grouped by error type.

        Returns all error types grouped together with their frequency counts,
        suitable for display in error pattern summaries.

        Args:
            records: List of TaggingRecord instances to analyze

        Returns:
            Dictionary with error types as keys and subcategory breakdowns as values
            Example: {
                "classification": {"missed_indicators": 5, "false_positive": 2, ...},
                "question": {"inappropriate": 3, "not_relevant": 2, ...},
                "referral": {"incomplete_summary": 2, "misrepresentation": 1, ...},
                "indicator": {"excessive_guilt": 3, "crying": 2, ...}
            }
        """
        return {
            "classification": self.analyze_classification_errors(records),
            "question": self.analyze_question_issues(records),
            "referral": self.analyze_referral_issues(records),
            "indicator": self.analyze_indicator_issues(records),
        }
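The analyzer pre-seeds its count dictionaries from the known issue-type lists, so every subcategory appears in the output even at zero, and unknown tags are silently skipped rather than raising. A self-contained sketch of that counting pattern (with a stand-in `Record` dataclass and a hypothetical issue-type list, not the real `chaplain_models` definitions):

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical subset of QUESTION_ISSUE_TYPES for illustration
QUESTION_ISSUE_TYPES = ["inappropriate", "not_relevant", "too_leading", "unclear"]


@dataclass
class Record:
    # Stand-in for TaggingRecord's multi-select question_issues field
    question_issues: List[str] = field(default_factory=list)


def count_question_issues(records: List[Record]) -> Dict[str, int]:
    # Pre-seed so every known issue type appears, even with a zero count
    counts = {issue: 0 for issue in QUESTION_ISSUE_TYPES}
    for record in records:
        for issue in record.question_issues:
            if issue in counts:  # unknown tags are ignored, not raised
                counts[issue] += 1
    return counts


records = [Record(["inappropriate", "unclear"]), Record(["inappropriate"]), Record()]
print(count_question_issues(records))
# {'inappropriate': 2, 'not_relevant': 0, 'too_leading': 0, 'unclear': 1}
```

Pre-seeding keeps downstream summaries stable: a UI table or CSV export always shows the full set of subcategories regardless of which issues chaplains happened to tag in a given session.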
src/core/interaction_logger.py  ADDED  @@ -0,0 +1,258 @@
# interaction_logger.py
"""
Interaction logging service for Chaplain Feedback System.

Logs all interaction steps with input/output and supports approval status updates.
"""

import uuid
from typing import List, Optional, Dict, Any
from datetime import datetime

from src.core.chaplain_models import (
    InteractionStepLog,
    TaggingRecord,
)


class InteractionLogger:
    """
    Logs all interaction steps in the chaplain feedback system.

    Records input/output for each step and supports updating approval status
    with tagging data.
    """

    def __init__(self):
        """Initialize the interaction logger."""
        # In-memory storage of logs (can be extended to persist to database/file)
        self._logs: Dict[str, InteractionStepLog] = {}
        self._session_logs: Dict[str, List[str]] = {}  # session_id -> list of step_ids

    def log_step(
        self,
        session_id: str,
        message_id: str,
        step_type: str,
        input_text: str,
        model_output: str,
    ) -> str:
        """
        Log an interaction step.

        Args:
            session_id: ID of the verification session
            message_id: ID of the message being processed
            step_type: Type of step (classification, explanation, permission_check, etc.)
            input_text: Input text for this step
            model_output: Output from the model/system for this step

        Returns:
            step_id: Unique identifier for this logged step

        Raises:
            ValueError: If step_type is invalid
        """
        step_id = str(uuid.uuid4())

        # Create log entry
        log_entry = InteractionStepLog(
            step_id=step_id,
            session_id=session_id,
            message_id=message_id,
            step_type=step_type,
            input_text=input_text,
            model_output=model_output,
            approval_status=None,
            tagging_data=None,
            timestamp=datetime.now(),
        )

        # Store log entry
        self._logs[step_id] = log_entry

        # Track logs by session
        if session_id not in self._session_logs:
            self._session_logs[session_id] = []
        self._session_logs[session_id].append(step_id)

        return step_id

    def update_approval(
        self,
        step_id: str,
        approval_status: str,
        tagging_data: Optional[TaggingRecord] = None,
    ) -> None:
        """
        Update a step with approval status and optional tagging data.

        Args:
            step_id: ID of the step to update
            approval_status: "approved" or "disapproved"
            tagging_data: Optional TaggingRecord with feedback details

        Raises:
            ValueError: If step_id not found or approval_status is invalid
        """
        if step_id not in self._logs:
            raise ValueError(f"Step {step_id} not found")

        if approval_status not in ("approved", "disapproved"):
            raise ValueError(f"Invalid approval_status: {approval_status}")

        log_entry = self._logs[step_id]
        log_entry.approval_status = approval_status
        log_entry.tagging_data = tagging_data

    def get_step(self, step_id: str) -> Optional[InteractionStepLog]:
        """
        Get a specific logged step.

        Args:
            step_id: ID of the step to retrieve

        Returns:
            InteractionStepLog if found, None otherwise
        """
        return self._logs.get(step_id)

    def get_session_logs(self, session_id: str) -> List[InteractionStepLog]:
        """
        Get all logs for a session.

        Args:
            session_id: ID of the session

        Returns:
            List of InteractionStepLog entries for the session, in order
        """
        step_ids = self._session_logs.get(session_id, [])
        return [self._logs[step_id] for step_id in step_ids if step_id in self._logs]

    def get_session_logs_by_type(
        self,
        session_id: str,
        step_type: str,
    ) -> List[InteractionStepLog]:
        """
        Get all logs of a specific type for a session.

        Args:
            session_id: ID of the session
            step_type: Type of step to filter by

        Returns:
            List of InteractionStepLog entries matching the type
        """
        all_logs = self.get_session_logs(session_id)
        return [log for log in all_logs if log.step_type == step_type]

    def get_message_logs(self, message_id: str) -> List[InteractionStepLog]:
        """
        Get all logs for a specific message across all sessions.

        Args:
            message_id: ID of the message

        Returns:
            List of InteractionStepLog entries for the message
        """
        return [log for log in self._logs.values() if log.message_id == message_id]

    def get_unapproved_steps(self, session_id: str) -> List[InteractionStepLog]:
        """
        Get all steps in a session that haven't been approved/disapproved yet.

        Args:
            session_id: ID of the session

        Returns:
            List of InteractionStepLog entries with no approval status
        """
        session_logs = self.get_session_logs(session_id)
        return [log for log in session_logs if log.approval_status is None]

    def get_disapproved_steps(self, session_id: str) -> List[InteractionStepLog]:
        """
        Get all disapproved steps in a session.

        Args:
            session_id: ID of the session

        Returns:
            List of disapproved InteractionStepLog entries
        """
        session_logs = self.get_session_logs(session_id)
        return [log for log in session_logs if log.approval_status == "disapproved"]

    def get_session_statistics(self, session_id: str) -> Dict[str, Any]:
        """
        Get statistics for a session's interaction logs.

        Args:
            session_id: ID of the session

        Returns:
            Dictionary with statistics about the session's interactions
        """
        session_logs = self.get_session_logs(session_id)

        if not session_logs:
            return {
                "session_id": session_id,
                "total_steps": 0,
                "approved_steps": 0,
                "disapproved_steps": 0,
                "unapproved_steps": 0,
                "steps_by_type": {},
            }

        # Count by approval status
        approved = sum(1 for log in session_logs if log.approval_status == "approved")
        disapproved = sum(1 for log in session_logs if log.approval_status == "disapproved")
        unapproved = sum(1 for log in session_logs if log.approval_status is None)

        # Count by step type
        steps_by_type = {}
        for log in session_logs:
            if log.step_type not in steps_by_type:
                steps_by_type[log.step_type] = 0
            steps_by_type[log.step_type] += 1

        return {
            "session_id": session_id,
            "total_steps": len(session_logs),
            "approved_steps": approved,
            "disapproved_steps": disapproved,
            "unapproved_steps": unapproved,
            "steps_by_type": steps_by_type,
        }

    def clear_session(self, session_id: str) -> None:
        """
        Clear all logs for a session.

        Args:
            session_id: ID of the session to clear
        """
        step_ids = self._session_logs.get(session_id, [])
        for step_id in step_ids:
            if step_id in self._logs:
                del self._logs[step_id]

        if session_id in self._session_logs:
            del self._session_logs[session_id]

    def export_session_logs(self, session_id: str) -> List[Dict[str, Any]]:
        """
        Export all logs for a session as dictionaries.

        Args:
            session_id: ID of the session

        Returns:
            List of log entries as dictionaries
        """
        session_logs = self.get_session_logs(session_id)
        return [log.to_dict() for log in session_logs]
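The logger keeps two in-memory indexes: a flat `step_id -> entry` map for O(1) lookup and update, and a `session_id -> [step_id, ...]` list that preserves logging order for session retrieval. A minimal sketch of that two-index pattern with plain dicts (hypothetical field names, not the real `InteractionStepLog` model):

```python
import uuid

# step_id -> log entry; session_id -> ordered list of step_ids
logs = {}
session_index = {}


def log_step(session_id: str, step_type: str) -> str:
    """Record one step and register it under its session, in order."""
    step_id = str(uuid.uuid4())
    logs[step_id] = {
        "session_id": session_id,
        "step_type": step_type,
        "approval_status": None,  # pending until a chaplain reviews it
    }
    session_index.setdefault(session_id, []).append(step_id)
    return step_id


a = log_step("s1", "classification")
b = log_step("s1", "explanation")
logs[a]["approval_status"] = "approved"  # direct O(1) update by step_id

# Ordered session retrieval goes through the session index
ordered = [logs[sid] for sid in session_index["s1"]]
unapproved = [entry for entry in ordered if entry["approval_status"] is None]
print(len(ordered), len(unapproved))  # 2 1
```

The trade-off is that every write touches both structures, so `clear_session` must delete from both to avoid orphaned entries, which is exactly what the class above does.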
src/core/tagging_service.py  ADDED  @@ -0,0 +1,528 @@
# tagging_service.py
"""
Tagging Service for Chaplain Feedback System.

Handles creation, validation, and management of tagging records
for chaplain feedback on classification results.
"""

from typing import List, Optional, Dict

import uuid

from .chaplain_models import (
    TaggingRecord,
    CLASSIFICATION_SUBCATEGORIES,
    QUESTION_ISSUE_TYPES,
    REFERRAL_ISSUE_TYPES,
)


class TaggingService:
    """
    Service for handling tagging record creation and validation.

    Supports multi-select for question and referral issues,
    classification subcategories, and indicator issue tracking.
    """

    def __init__(self):
        """Initialize the tagging service."""
        self._records: Dict[str, TaggingRecord] = {}

    def create_tagging_record(
        self,
        message_id: str,
        is_classification_correct: bool = True,
        classification_subcategory: Optional[str] = None,
        correct_classification: Optional[str] = None,
        question_issues: Optional[List[str]] = None,
        question_comments: Optional[str] = None,
        referral_issues: Optional[List[str]] = None,
        referral_comments: Optional[str] = None,
        indicator_issues: Optional[List[str]] = None,
        indicator_comments: Optional[str] = None,
        general_notes: str = "",
    ) -> TaggingRecord:
        """
        Create a new tagging record with validation.

        Args:
            message_id: ID of the message being tagged
            is_classification_correct: Whether the classification is correct
            classification_subcategory: Subcategory if the classification is wrong
            correct_classification: Correct classification if wrong
            question_issues: List of question issue types (multi-select)
            question_comments: Free-text comments about questions
            referral_issues: List of referral issue types (multi-select)
            referral_comments: Free-text comments about the referral
            indicator_issues: List of incorrectly identified indicator IDs
            indicator_comments: Free-text comments about indicators
            general_notes: General notes about the message

        Returns:
            Created and validated TaggingRecord

        Raises:
            ValueError: If validation fails
        """
        record_id = str(uuid.uuid4())

        # Ensure lists are not None
        question_issues = question_issues or []
        referral_issues = referral_issues or []
        indicator_issues = indicator_issues or []

        # Validate inputs
        self._validate_classification_tagging(
            is_classification_correct,
            classification_subcategory,
            correct_classification,
        )
        self._validate_question_issues(question_issues)
        self._validate_referral_issues(referral_issues)

        # Create record (further validation happens in __post_init__)
        record = TaggingRecord(
            record_id=record_id,
            message_id=message_id,
            is_classification_correct=is_classification_correct,
            classification_subcategory=classification_subcategory,
            correct_classification=correct_classification,
            question_issues=question_issues,
            question_comments=question_comments,
            referral_issues=referral_issues,
            referral_comments=referral_comments,
            indicator_issues=indicator_issues,
            indicator_comments=indicator_comments,
            general_notes=general_notes,
        )

        # Store record
        self._records[record_id] = record
        return record

    def update_tagging_record(self, record_id: str, **updates) -> TaggingRecord:
        """
        Update an existing tagging record.

        Args:
            record_id: ID of the record to update
            **updates: Fields to update

        Returns:
            Updated TaggingRecord

        Raises:
            KeyError: If the record is not found
            ValueError: If validation fails
        """
        if record_id not in self._records:
            raise KeyError(f"Tagging record not found: {record_id}")

        record = self._records[record_id]

        # Merge updates into the existing record data
        record_data = record.to_dict()
        record_data.update(updates)

        # Validate updates
        if 'classification_subcategory' in updates or 'correct_classification' in updates:
            self._validate_classification_tagging(
                record_data.get('is_classification_correct', True),
                record_data.get('classification_subcategory'),
                record_data.get('correct_classification'),
            )

        if 'question_issues' in updates:
            self._validate_question_issues(record_data.get('question_issues', []))

        if 'referral_issues' in updates:
            self._validate_referral_issues(record_data.get('referral_issues', []))

        # Create a new record with the updates applied
        updated_record = TaggingRecord.from_dict(record_data)
        self._records[record_id] = updated_record
        return updated_record

    def get_tagging_record(self, record_id: str) -> Optional[TaggingRecord]:
        """
        Get a tagging record by ID.

        Returns:
            TaggingRecord if found, None otherwise
        """
        return self._records.get(record_id)

    def get_records_for_message(self, message_id: str) -> List[TaggingRecord]:
        """
        Get all tagging records for a specific message.

        Returns:
            List of TaggingRecord instances for the message
        """
        return [
            record for record in self._records.values()
            if record.message_id == message_id
        ]

    def get_all_records(self) -> List[TaggingRecord]:
        """Get all tagging records."""
        return list(self._records.values())

    def delete_tagging_record(self, record_id: str) -> bool:
        """
        Delete a tagging record.

        Returns:
            True if deleted, False if not found
        """
        if record_id in self._records:
            del self._records[record_id]
            return True
        return False

    def get_available_classification_subcategories(self) -> List[str]:
        """Get the list of available classification subcategories."""
        return CLASSIFICATION_SUBCATEGORIES.copy()

    def get_available_question_issue_types(self) -> List[str]:
        """Get the list of available question issue types."""
        return QUESTION_ISSUE_TYPES.copy()

    def get_available_referral_issue_types(self) -> List[str]:
        """Get the list of available referral issue types."""
        return REFERRAL_ISSUE_TYPES.copy()

    def create_classification_correction(
        self,
        message_id: str,
        subcategory: str,
        correct_classification: str,
        general_notes: str = "",
    ) -> TaggingRecord:
        """
        Create a tagging record specifically for a wrong classification.

        This convenience method ensures proper validation for
        classification correction scenarios.

        Raises:
            ValueError: If subcategory or correct_classification is invalid
        """
        return self.create_tagging_record(
            message_id=message_id,
            is_classification_correct=False,
            classification_subcategory=subcategory,
            correct_classification=correct_classification,
            general_notes=general_notes,
        )

    def get_classification_subcategory_descriptions(self) -> Dict[str, str]:
        """Get descriptions for classification subcategories, keyed by code."""
        return {
            "missed_indicators": "Missed key distress indicators",
            "false_positive": "Overly sensitive (false-positive flag)",
            "missed_distress": "Not sensitive enough (missed distress)",
        }

    def create_question_issue_tagging(
        self,
        message_id: str,
        question_issues: List[str],
        question_comments: Optional[str] = None,
        general_notes: str = "",
    ) -> TaggingRecord:
        """
        Create a tagging record specifically for follow-up question issues.

        This convenience method tags YELLOW flow question issues with
        multi-select support and free-text comments.

        Raises:
            ValueError: If question_issues contains invalid types
        """
        return self.create_tagging_record(
            message_id=message_id,
            question_issues=question_issues,
            question_comments=question_comments,
            general_notes=general_notes,
        )

    def get_question_issue_descriptions(self) -> Dict[str, str]:
        """Get descriptions for question issue types, keyed by code."""
        return {
            "inappropriate": "Question is inappropriate or intrusive",
            "not_relevant": "Question is not spiritually relevant",
            "too_leading": "Question is too leading or assumptive",
            "unclear": "Question is unclear or confusing",
            "tone_clinical": "Tone too clinical",
            "tone_religious": "Tone too religious",
            "tone_casual": "Tone too casual",
        }

    def create_referral_issue_tagging(
        self,
        message_id: str,
        referral_issues: List[str],
        referral_comments: Optional[str] = None,
        general_notes: str = "",
    ) -> TaggingRecord:
        """
        Create a tagging record specifically for referral message issues.

        This convenience method tags RED flow referral message issues with
        multi-select support and free-text comments.

        Raises:
            ValueError: If referral_issues contains invalid types
        """
        return self.create_tagging_record(
            message_id=message_id,
            referral_issues=referral_issues,
            referral_comments=referral_comments,
            general_notes=general_notes,
        )

    def get_referral_issue_descriptions(self) -> Dict[str, str]:
        """Get descriptions for referral issue types, keyed by code."""
        return {
            "incomplete_summary": "Incorrect or incomplete summary",
            "misrepresentation": "Misrepresentation of patient message",
            "inappropriate_tone": "Tone inappropriate for spiritual care team",
        }

    def create_indicator_issue_tagging(
        self,
        message_id: str,
        indicator_issues: List[str],
        indicator_comments: Optional[str] = None,
        general_notes: str = "",
    ) -> TaggingRecord:
        """
        Create a tagging record specifically for indicator issues.

        This convenience method marks incorrectly identified distress
        indicators with free-text comments.
        """
        return self.create_tagging_record(
            message_id=message_id,
            indicator_issues=indicator_issues,
            indicator_comments=indicator_comments,
            general_notes=general_notes,
        )

    def validate_indicator_ids(self, indicator_ids: List[str]) -> bool:
        """
        Validate that indicator IDs are reasonable.

        This is a basic format check; a production system might validate
        against the actual indicator IDs from the classification result.

        Returns:
            True if all IDs are non-empty strings, False otherwise
        """
        for indicator_id in indicator_ids:
            if not isinstance(indicator_id, str) or len(indicator_id.strip()) == 0:
                return False
        return True

    def _validate_classification_tagging(
        self,
        is_classification_correct: bool,
        classification_subcategory: Optional[str],
        correct_classification: Optional[str],
    ) -> None:
        """
        Validate classification tagging fields.

        Raises:
            ValueError: If validation fails
        """
        if not is_classification_correct:
            # A wrong classification requires a subcategory and the correct label
            if not classification_subcategory:
                raise ValueError(
                    "classification_subcategory is required when is_classification_correct is False"
                )
            if not correct_classification:
                raise ValueError(
                    "correct_classification is required when is_classification_correct is False"
                )

            if classification_subcategory not in CLASSIFICATION_SUBCATEGORIES:
                raise ValueError(
                    f"Invalid classification_subcategory: {classification_subcategory}. "
                    f"Must be one of: {CLASSIFICATION_SUBCATEGORIES}"
                )

            if correct_classification not in ("red", "yellow", "green"):
                raise ValueError(
                    f"Invalid correct_classification: {correct_classification}. "
                    f"Must be one of: red, yellow, green"
                )
        else:
            # A correct classification must not carry correction fields
            if classification_subcategory is not None:
                raise ValueError(
                    "classification_subcategory must be None when is_classification_correct is True"
                )
            if correct_classification is not None:
                raise ValueError(
                    "correct_classification must be None when is_classification_correct is True"
                )

    def _validate_question_issues(self, question_issues: List[str]) -> None:
        """
        Validate question issue types.

        Raises:
            ValueError: If any issue type is invalid
        """
        for issue in question_issues:
            if issue not in QUESTION_ISSUE_TYPES:
                raise ValueError(
                    f"Invalid question issue type: {issue}. "
                    f"Must be one of: {QUESTION_ISSUE_TYPES}"
                )

    def _validate_referral_issues(self, referral_issues: List[str]) -> None:
        """
        Validate referral issue types.

        Raises:
            ValueError: If any issue type is invalid
        """
        for issue in referral_issues:
            if issue not in REFERRAL_ISSUE_TYPES:
                raise ValueError(
                    f"Invalid referral issue type: {issue}. "
                    f"Must be one of: {REFERRAL_ISSUE_TYPES}"
                )
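All three `_validate_*` helpers in the service reduce to the same pattern: a membership check of each selected value against a fixed vocabulary, raising `ValueError` on the first unknown entry. A minimal self-contained sketch of that pattern (the vocabulary here is an illustrative stand-in, not the project's real `QUESTION_ISSUE_TYPES` constant):

```python
from typing import List

# Hypothetical subset standing in for the project's issue vocabulary
ALLOWED_ISSUE_TYPES = ["inappropriate", "not_relevant", "unclear"]


def validate_issues(issues: List[str], allowed: List[str]) -> None:
    """Raise ValueError if any selected issue is outside the allowed vocabulary."""
    for issue in issues:
        if issue not in allowed:
            raise ValueError(f"Invalid issue type: {issue}. Must be one of: {allowed}")


validate_issues(["unclear"], ALLOWED_ISSUE_TYPES)  # valid multi-select: no error

try:
    validate_issues(["unclear", "typo"], ALLOWED_ISSUE_TYPES)
except ValueError as exc:
    print(exc)
```

Because the check raises on the first invalid entry, a mixed selection like `["unclear", "typo"]` is rejected as a whole, which matches the all-or-nothing behavior of `create_tagging_record`.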
src/core/verification_csv_exporter.py
CHANGED
@@ -2,14 +2,21 @@
"""
CSV export functionality for verification sessions.

Provides methods for generating CSV files with verification results and summaries,
including tagging data, generated content, interaction logs, and error statistics.
"""

import csv
import io
from datetime import datetime
from typing import List, Optional, Dict, Any
from src.core.verification_models import VerificationRecord, VerificationSession
from src.core.chaplain_models import (
    TaggingRecord,
    ClassificationFlowResult,
    InteractionStepLog,
)
from src.core.error_pattern_analyzer import ErrorPatternAnalyzer


class VerificationCSVExporter:

@@ -135,3 +142,207 @@ class VerificationCSVExporter:
            "incorrect": session.incorrect_count,
            "accuracy_percent": accuracy,
        }

    @staticmethod
    def generate_enhanced_csv_content(
        session: VerificationSession,
        tagging_records: Optional[List[TaggingRecord]] = None,
        flow_results: Optional[Dict[str, ClassificationFlowResult]] = None,
        interaction_logs: Optional[List[InteractionStepLog]] = None,
    ) -> str:
        """
        Generate enhanced CSV content with tagging data, generated content, and statistics.

        Includes:
        - Summary section with accuracy metrics
        - Detailed records with tagging categories and subcategories
        - Generated content (explanations, questions, referral messages)
        - Interaction logs
        - Error pattern statistics

        Args:
            session: The verification session to export
            tagging_records: List of TaggingRecord instances (optional)
            flow_results: Dict mapping message_id to ClassificationFlowResult (optional)
            interaction_logs: List of InteractionStepLog instances (optional)

        Returns:
            Enhanced CSV content as a string

        Raises:
            ValueError: If the session has no verified messages
        """
        if session.verified_count == 0:
            raise ValueError("No verified messages to export")

        output = io.StringIO()

        # Summary section
        accuracy = (
            session.correct_count / session.verified_count * 100
            if session.verified_count > 0
            else 0.0
        )
        output.write("VERIFICATION SUMMARY\n")
        output.write(f"Total Messages,{session.verified_count}\n")
        output.write(f"Correct,{session.correct_count}\n")
        output.write(f"Incorrect,{session.incorrect_count}\n")
        output.write(f"Accuracy %,{accuracy:.1f}\n")
        output.write("\n")

        # Detailed records section
        output.write("DETAILED RECORDS\n")
        output.write("Patient Message,Classifier Said,You Said,Notes,Date\n")

        # lineterminator="\n" keeps csv rows consistent with the write() calls above
        writer = csv.writer(output, lineterminator="\n")

        for record in session.verifications:
            classifier_decision = record.classifier_decision.upper()
            ground_truth = record.ground_truth_label.upper()
            timestamp = record.timestamp.strftime("%Y-%m-%d %H:%M:%S")

            writer.writerow([
                record.original_message,
                classifier_decision,
                ground_truth,
                record.verifier_notes,
                timestamp,
            ])

        output.write("\n")

        # Tagging data section (optional)
        if tagging_records:
            output.write("TAGGING DATA\n")
            output.write(
                "Message ID,Classification Correct,Classification Subcategory,"
                "Correct Classification,Question Issues,Question Comments,"
                "Referral Issues,Referral Comments,Indicator Issues,"
                "Indicator Comments,General Notes\n"
            )

            for record in tagging_records:
                writer.writerow([
                    record.message_id,
                    "Yes" if record.is_classification_correct else "No",
                    record.classification_subcategory or "",
                    record.correct_classification or "",
                    "; ".join(record.question_issues) if record.question_issues else "",
                    record.question_comments or "",
                    "; ".join(record.referral_issues) if record.referral_issues else "",
                    record.referral_comments or "",
                    "; ".join(record.indicator_issues) if record.indicator_issues else "",
                    record.indicator_comments or "",
                    record.general_notes,
                ])

            output.write("\n")

        # Generated content section (optional)
        if flow_results:
            output.write("GENERATED CONTENT\n")
            output.write(
                "Message ID,Classification,Explanation,Permission Check Message,"
                "Referral Message,Follow-Up Questions,Patient Responses,"
                "Re-evaluation Result\n"
            )

            for message_id, result in flow_results.items():
                questions_text = (
                    "; ".join(q.question_text for q in result.follow_up_questions)
                    if result.follow_up_questions else ""
                )
                responses_text = (
                    "; ".join(result.patient_responses)
                    if result.patient_responses else ""
                )

                writer.writerow([
                    message_id,
                    result.classification.upper(),
                    result.explanation,
                    result.permission_check_message or "",
                    result.referral_message or "",
                    questions_text,
                    responses_text,
                    result.re_evaluation_result or "",
                ])

            output.write("\n")

        # Interaction logs section (optional)
        if interaction_logs:
            output.write("INTERACTION LOGS\n")
            output.write("Step ID,Session ID,Message ID,Step Type,Input Text,Model Output,Approval Status,Timestamp\n")

            for log in interaction_logs:
                writer.writerow([
                    log.step_id,
                    log.session_id,
                    log.message_id,
                    log.step_type,
                    log.input_text,
                    log.model_output,
                    log.approval_status or "",
                    log.timestamp.strftime("%Y-%m-%d %H:%M:%S"),
                ])

            output.write("\n")

        # Error pattern statistics (only when tagging records are present)
        if tagging_records:
            output.write("ERROR PATTERN STATISTICS\n")

            analyzer = ErrorPatternAnalyzer()
            stats = analyzer.get_statistics_summary(tagging_records)

            output.write("Classification Errors\n")
            for subcategory, count in stats["classification_errors"].items():
                output.write(f"{subcategory},{count}\n")
            output.write("\n")

            output.write("Question Issues\n")
            for issue_type, count in stats["question_issues"].items():
                output.write(f"{issue_type},{count}\n")
            output.write("\n")

            output.write("Referral Issues\n")
            for issue_type, count in stats["referral_issues"].items():
                output.write(f"{issue_type},{count}\n")
            output.write("\n")

            output.write("Indicator Issues\n")
            for indicator_id, count in stats["indicator_issues"].items():
                output.write(f"{indicator_id},{count}\n")
            output.write("\n")

            output.write("Common Patterns\n")
            for pattern in stats["common_patterns"]:
                output.write(f"{pattern}\n")
            output.write("\n")

        return output.getvalue()

    @staticmethod
    def export_enhanced_session_to_csv(
        session: VerificationSession,
        tagging_records: Optional[List[TaggingRecord]] = None,
        flow_results: Optional[Dict[str, ClassificationFlowResult]] = None,
        interaction_logs: Optional[List[InteractionStepLog]] = None,
    ) -> tuple:
        """
        Export a verification session with enhanced data to CSV format.

        Returns:
            Tuple of (csv_content, filename)

        Raises:
            ValueError: If the session has no verified messages
        """
        csv_content = VerificationCSVExporter.generate_enhanced_csv_content(
            session,
            tagging_records=tagging_records,
            flow_results=flow_results,
            interaction_logs=interaction_logs,
        )
        filename = VerificationCSVExporter.generate_csv_filename(session.created_at)

        return csv_content, filename
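The enhanced exporter mixes hand-written section headers (`output.write(...)`) with `csv.writer` rows in one `StringIO` buffer. A minimal self-contained sketch of that multi-section layout with toy data (the rows here are illustrative; the real exporter pulls values from session and record objects):

```python
import csv
import io

output = io.StringIO()

# Hand-written summary section, one "key,value" line per metric
output.write("VERIFICATION SUMMARY\n")
output.write("Total Messages,2\n")
output.write("Correct,1\n")
output.write("\n")

# Records section: header written by hand, rows via csv.writer
output.write("DETAILED RECORDS\n")
output.write("Patient Message,Classifier Said,You Said\n")

# lineterminator="\n" keeps csv rows consistent with the write() calls above,
# which otherwise differ (csv.writer defaults to "\r\n")
writer = csv.writer(output, lineterminator="\n")
writer.writerow(["I feel hopeless, alone", "RED", "RED"])  # comma gets quoted automatically
writer.writerow(["Thanks for checking in", "GREEN", "GREEN"])

content = output.getvalue()
print(content)
```

Note that the resulting file is not a single uniform CSV table: spreadsheet tools will open it, but programmatic consumers need to split on the blank lines between sections before parsing each one.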
src/interface/chaplain_feedback_ui.py
ADDED
|
@@ -0,0 +1,450 @@
```python
# chaplain_feedback_ui.py
"""
Gradio UI components for Chaplain Feedback & Tagging System.

Provides interface components for displaying classification flows,
collecting chaplain feedback, and displaying error patterns.

Requirements: 1.5, 2.3, 3.3, 4.1, 5.1, 5.3, 6.1, 6.3, 8.1, 8.2, 8.3, 10.1, 10.2, 10.3
"""

import gradio as gr
from typing import List, Dict, Tuple, Optional, Any
from dataclasses import dataclass

from src.core.chaplain_models import (
    ClassificationFlowResult,
    DistressIndicator,
    FollowUpQuestion,
    TaggingRecord,
    CLASSIFICATION_SUBCATEGORIES,
    QUESTION_ISSUE_TYPES,
    REFERRAL_ISSUE_TYPES,
)


class ChaplainFeedbackUIComponents:
    """Manages Gradio UI components for chaplain feedback system."""

    # Color mappings for classification badges
    BADGE_COLORS = {
        "red": "🔴",
        "yellow": "🟡",
        "green": "🟢",
    }

    BADGE_LABELS = {
        "red": "RED - Severe Distress",
        "yellow": "YELLOW - Potential Distress",
        "green": "GREEN - No Distress",
    }

    # Severity color codes for indicators
    SEVERITY_COLORS = {
        "red": "#ea9999",     # Red from definitions document
        "yellow": "#ffe599",  # Yellow from definitions document
    }

    @staticmethod
    def create_classification_flow_display() -> Tuple[gr.Component, gr.Component, gr.Component, gr.Component]:
        """
        Create ClassificationFlowDisplay component.

        Displays RED/YELLOW/GREEN flow results with all generated content.

        Returns:
            Tuple of (classification_badge, explanation, content_section, indicators_section) components

        Requirements: 1.5, 2.3, 3.3
        """
        classification_badge = gr.Markdown(
            value="🔄 Loading classification...",
            label="Classification Result",
        )

        explanation = gr.Markdown(
            value="",
            label="Explanation",
        )

        content_section = gr.Markdown(
            value="",
            label="Generated Content",
        )

        indicators_section = gr.Markdown(
            value="",
            label="Detected Indicators",
        )

        return classification_badge, explanation, content_section, indicators_section

    @staticmethod
    def render_classification_flow(
        flow_result: ClassificationFlowResult,
    ) -> Tuple[str, str, str, str]:
        """
        Render complete classification flow result.

        Args:
            flow_result: ClassificationFlowResult with all flow data

        Returns:
            Tuple of (badge, explanation, content, indicators) markdown strings
        """
        # Classification badge
        badge_emoji = ChaplainFeedbackUIComponents.BADGE_COLORS.get(flow_result.classification, "❓")
        badge_label = ChaplainFeedbackUIComponents.BADGE_LABELS.get(flow_result.classification, "UNKNOWN")
        confidence_pct = int(round(flow_result.confidence * 100))
        badge = f"## {badge_emoji} {badge_label}\n\n**Confidence:** {confidence_pct}%"

        # Explanation
        explanation = f"### Explanation\n\n{flow_result.explanation}"

        # Generated content based on classification
        content = ""
        if flow_result.classification == "red":
            content = ChaplainFeedbackUIComponents._render_red_flow_content(flow_result)
        elif flow_result.classification == "yellow":
            content = ChaplainFeedbackUIComponents._render_yellow_flow_content(flow_result)
        elif flow_result.classification == "green":
            content = ChaplainFeedbackUIComponents._render_green_flow_content(flow_result)

        # Indicators
        indicators = ChaplainFeedbackUIComponents._render_indicators(flow_result.indicators)

        return badge, explanation, content, indicators

    @staticmethod
    def _render_red_flow_content(flow_result: ClassificationFlowResult) -> str:
        """Render RED flow content (permission check + referral message)."""
        content = "### 🔴 RED FLAG - Severe Distress Detected\n\n"

        if flow_result.permission_check_message:
            content += "#### Patient Permission Check\n\n"
            content += f"{flow_result.permission_check_message}\n\n"

        if flow_result.consent_status:
            content += f"**Consent Status:** {flow_result.consent_status}\n\n"

        if flow_result.referral_message and flow_result.consent_status == "granted":
            content += "#### Referral Message for Spiritual Care Team\n\n"
            content += f"{flow_result.referral_message}\n\n"
        elif flow_result.consent_status == "declined":
            content += "**Status:** No further action - patient declined spiritual support referral\n\n"

        return content

    @staticmethod
    def _render_yellow_flow_content(flow_result: ClassificationFlowResult) -> str:
        """Render YELLOW flow content (follow-up questions + re-evaluation)."""
        content = "### 🟡 YELLOW FLAG - Potential Distress\n\n"

        if flow_result.follow_up_questions:
            content += "#### Follow-Up Questions\n\n"
            for i, question in enumerate(flow_result.follow_up_questions, 1):
                content += f"**Question {i}:** {question.question_text}\n\n"
                content += f"*Purpose:* {question.purpose}\n\n"

        if flow_result.patient_responses:
            content += "#### Patient Responses\n\n"
            for i, response in enumerate(flow_result.patient_responses, 1):
                content += f"**Response {i}:** {response}\n\n"

        if flow_result.re_evaluation_result:
            content += "#### Re-Evaluation Result\n\n"
            if flow_result.re_evaluation_result == "red":
                content += "🔴 **Escalated to RED** - Severe distress detected in responses\n\n"
            elif flow_result.re_evaluation_result == "green":
                content += "🟢 **Downgraded to GREEN** - No distress indicators in responses\n\n"

        return content

    @staticmethod
    def _render_green_flow_content(flow_result: ClassificationFlowResult) -> str:
        """Render GREEN flow content (no distress)."""
        content = "### 🟢 GREEN FLAG - No Distress Detected\n\n"
        content += "**Status:** No further steps required\n\n"
        content += "No spiritual distress indicators were detected in this message.\n\n"
        return content

    @staticmethod
    def _render_indicators(indicators: List[DistressIndicator]) -> str:
        """Render detected indicators with categories and severity."""
        if not indicators:
            return "### Detected Indicators\n\nNo indicators detected"

        content = "### Detected Indicators\n\n"

        # Group by severity
        red_indicators = [i for i in indicators if i.severity == "red"]
        yellow_indicators = [i for i in indicators if i.severity == "yellow"]

        if red_indicators:
            content += "#### 🔴 RED Indicators (Severe)\n\n"
            for indicator in red_indicators:
                confidence_pct = int(round(indicator.confidence * 100))
                content += f"• **{indicator.subcategory}** ({confidence_pct}% confidence)\n"
                content += f"  - Category: {indicator.category}\n"
                content += f"  - Reference: {indicator.definition_reference}\n\n"

        if yellow_indicators:
            content += "#### 🟡 YELLOW Indicators (Potential)\n\n"
            for indicator in yellow_indicators:
                confidence_pct = int(round(indicator.confidence * 100))
                content += f"• **{indicator.subcategory}** ({confidence_pct}% confidence)\n"
                content += f"  - Category: {indicator.category}\n"
                content += f"  - Reference: {indicator.definition_reference}\n\n"

        return content

    @staticmethod
    def create_tagging_interface() -> Tuple[gr.Component, gr.Component, gr.Component, gr.Component, gr.Component, gr.Component, gr.Component, gr.Component, gr.Component, gr.Component]:
        """
        Create TaggingInterface component.

        Provides classification subcategory selector, multi-select for issues,
        and free-text comment fields.

        Returns:
            Tuple of individual tagging components for use in event handlers

        Requirements: 4.1, 5.1, 5.3, 6.1, 6.3
        """
        # Classification tagging components
        is_correct = gr.Radio(
            choices=[("✓ Correct", True), ("✗ Incorrect", False)],
            label="Is the classification correct?",
            interactive=True,
            visible=False,
        )

        subcategory = gr.Dropdown(
            choices=CLASSIFICATION_SUBCATEGORIES,
            label="What type of error? (if incorrect)",
            interactive=True,
            visible=False,
        )

        correct_classification = gr.Radio(
            choices=[
                ("🟢 GREEN - No Distress", "green"),
                ("🟡 YELLOW - Potential Distress", "yellow"),
                ("🔴 RED - Severe Distress", "red"),
            ],
            label="What should the correct classification be?",
            interactive=True,
            visible=False,
        )

        # Follow-up question issues components
        question_issues = gr.CheckboxGroup(
            choices=QUESTION_ISSUE_TYPES,
            label="Issues with follow-up questions (select all that apply)",
            interactive=True,
            visible=False,
        )

        question_comments = gr.Textbox(
            label="Comments on questions",
            placeholder="e.g., 'Too clinical', 'Not spiritually relevant'",
            lines=2,
            interactive=True,
            visible=False,
        )

        # Referral message issues components
        referral_issues = gr.CheckboxGroup(
            choices=REFERRAL_ISSUE_TYPES,
            label="Issues with referral message (select all that apply)",
            interactive=True,
            visible=False,
        )

        referral_comments = gr.Textbox(
            label="Comments on referral message",
            placeholder="e.g., 'Incomplete summary', 'Tone inappropriate'",
            lines=2,
            interactive=True,
            visible=False,
        )

        # Indicator issues components
        indicator_issues = gr.Textbox(
            label="Incorrectly identified indicators",
            placeholder="List indicator IDs or names that were incorrectly identified",
            lines=2,
            interactive=True,
            visible=False,
        )

        indicator_comments = gr.Textbox(
            label="Comments on indicators",
            placeholder="e.g., 'Missed anxiety indicators', 'False positive on grief'",
            lines=2,
            interactive=True,
            visible=False,
        )

        # General notes component
        notes_section = gr.Textbox(
            label="General Notes",
            placeholder="Any additional feedback or observations",
            lines=3,
            interactive=True,
            visible=False,
        )

        return is_correct, subcategory, correct_classification, question_issues, question_comments, referral_issues, referral_comments, indicator_issues, indicator_comments, notes_section

    @staticmethod
    def create_indicator_display() -> Tuple[gr.Component, gr.Component]:
        """
        Create IndicatorDisplay component.

        Shows indicators with categories and allows tagging incorrect indicators.

        Returns:
            Tuple of (indicators_display, indicator_tagging) components

        Requirements: 8.1, 8.2, 8.3
        """
        indicators_display = gr.Markdown(
            value="No indicators to display",
            label="Detected Indicators",
        )

        indicator_tagging = gr.Group(visible=False)
        with indicator_tagging:
            incorrect_indicators = gr.CheckboxGroup(
                choices=[],
                label="Select indicators that are incorrectly identified",
                interactive=True,
            )

            indicator_notes = gr.Textbox(
                label="Why are these indicators incorrect?",
                placeholder="Explain why these indicators don't apply",
                lines=2,
                interactive=True,
            )

        return indicators_display, indicator_tagging

    @staticmethod
    def create_error_pattern_summary() -> Tuple[gr.Component, gr.Component, gr.Component]:
        """
        Create ErrorPatternSummary component.

        Displays error patterns grouped by type with frequent subcategories highlighted.

        Returns:
            Tuple of (error_patterns, subcategory_breakdown, recommendations) components

        Requirements: 10.1, 10.2, 10.3
        """
        error_patterns = gr.Markdown(
            value="No error patterns yet",
            label="Error Patterns",
        )

        subcategory_breakdown = gr.Markdown(
            value="No data",
            label="Subcategory Breakdown",
        )

        recommendations = gr.Markdown(
            value="No recommendations yet",
            label="Recommendations for Improvement",
        )

        return error_patterns, subcategory_breakdown, recommendations

    @staticmethod
    def render_error_patterns(
        classification_errors: Dict[str, int],
        question_errors: Dict[str, int],
        referral_errors: Dict[str, int],
    ) -> Tuple[str, str, str]:
        """
        Render error patterns summary.

        Args:
            classification_errors: Dict of classification error subcategories with counts
            question_errors: Dict of question issue types with counts
            referral_errors: Dict of referral issue types with counts

        Returns:
            Tuple of (patterns, breakdown, recommendations) markdown strings
        """
        # Error patterns grouped by type
        patterns = "### Error Patterns\n\n"

        total_classification_errors = sum(classification_errors.values())
        total_question_errors = sum(question_errors.values())
        total_referral_errors = sum(referral_errors.values())

        if total_classification_errors > 0:
            patterns += f"#### Classification Errors: {total_classification_errors} total\n\n"
            for subcategory, count in sorted(classification_errors.items(), key=lambda x: x[1], reverse=True):
                patterns += f"• {subcategory}: {count}\n"
            patterns += "\n"

        if total_question_errors > 0:
            patterns += f"#### Follow-Up Question Issues: {total_question_errors} total\n\n"
            for issue_type, count in sorted(question_errors.items(), key=lambda x: x[1], reverse=True):
                patterns += f"• {issue_type}: {count}\n"
            patterns += "\n"

        if total_referral_errors > 0:
            patterns += f"#### Referral Message Issues: {total_referral_errors} total\n\n"
            for issue_type, count in sorted(referral_errors.items(), key=lambda x: x[1], reverse=True):
                patterns += f"• {issue_type}: {count}\n"
            patterns += "\n"

        # Subcategory breakdown
        breakdown = "### Subcategory Breakdown\n\n"

        if classification_errors:
            breakdown += "**Classification Errors:**\n"
            for subcategory, count in sorted(classification_errors.items(), key=lambda x: x[1], reverse=True):
                breakdown += f"- {subcategory}: {count}\n"
            breakdown += "\n"

        if question_errors:
            breakdown += "**Question Issues:**\n"
            for issue_type, count in sorted(question_errors.items(), key=lambda x: x[1], reverse=True):
                breakdown += f"- {issue_type}: {count}\n"
            breakdown += "\n"

        if referral_errors:
            breakdown += "**Referral Issues:**\n"
            for issue_type, count in sorted(referral_errors.items(), key=lambda x: x[1], reverse=True):
                breakdown += f"- {issue_type}: {count}\n"
            breakdown += "\n"

        # Recommendations
        recommendations = "### Recommendations for Improvement\n\n"

        # Find most common errors
        all_errors = {}
        for subcategory, count in classification_errors.items():
            all_errors[f"Classification: {subcategory}"] = count
        for issue_type, count in question_errors.items():
            all_errors[f"Questions: {issue_type}"] = count
        for issue_type, count in referral_errors.items():
            all_errors[f"Referral: {issue_type}"] = count

        if all_errors:
            sorted_errors = sorted(all_errors.items(), key=lambda x: x[1], reverse=True)
            top_3 = sorted_errors[:3]

            recommendations += "**Top areas for improvement:**\n\n"
            for rank, (error_type, count) in enumerate(top_3, 1):
                recommendations += f"{rank}. **{error_type}** ({count} occurrences)\n"
                recommendations += "   - Review prompts and logic for this error type\n"
                recommendations += "   - Consider additional training data\n\n"
        else:
            recommendations += "No errors detected yet. Great job!\n\n"

        return patterns, breakdown, recommendations
```
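The recommendation logic above merges three separately tracked error tallies into one ranking before picking the top three. The core of that aggregation can be sketched standalone (function name and sample tags are illustrative, not from the project):

```python
def top_error_areas(classification_errors, question_errors, referral_errors, n=3):
    """Merge error counts from all three feedback types, return the n most frequent."""
    merged = {}
    for sub, count in classification_errors.items():
        merged[f"Classification: {sub}"] = count
    for issue, count in question_errors.items():
        merged[f"Questions: {issue}"] = count
    for issue, count in referral_errors.items():
        merged[f"Referral: {issue}"] = count
    # Sort by descending count; ties keep insertion order (sorted is stable)
    return sorted(merged.items(), key=lambda x: x[1], reverse=True)[:n]

top = top_error_areas(
    {"false_red": 4, "missed_yellow": 1},
    {"too_clinical": 3},
    {"incomplete_summary": 2},
)
```

Prefixing each key with its feedback type keeps the three namespaces from colliding when the same tag name appears in more than one tally.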
src/interface/simplified_gradio_app.py
CHANGED
|
@@ -29,10 +29,13 @@ from typing import Dict, Any, Optional, List
|
|
| 29 |
from src.core.simplified_medical_app import SimplifiedMedicalApp
|
| 30 |
from src.core.spiritual_state import SpiritualState
|
| 31 |
from src.interface.verification_ui import VerificationUIComponents
|
|
|
|
| 32 |
from src.core.test_datasets import TestDatasetManager
|
| 33 |
from src.core.verification_models import VerificationSession, VerificationRecord, TestMessage
|
| 34 |
from src.core.verification_store import JSONVerificationStore
|
| 35 |
from src.core.verification_csv_exporter import VerificationCSVExporter
|
|
|
|
|
|
|
| 36 |
|
| 37 |
try:
|
| 38 |
from app_config import GRADIO_CONFIG
|
|
@@ -159,9 +162,9 @@ def create_simplified_interface():
|
|
| 159 |
skip_btn = gr.Button("⏭️ Skip", scale=1)
|
| 160 |
next_btn = gr.Button("Next ➡️", scale=1)
|
| 161 |
|
| 162 |
-
# Save results button
|
| 163 |
with gr.Row():
|
| 164 |
-
save_results_btn = gr.
|
| 165 |
clear_session_btn = gr.Button("🗑️ Clear Session", scale=1)
|
| 166 |
|
| 167 |
with gr.Column(scale=1):
|
|
@@ -174,6 +177,28 @@ def create_simplified_interface():
|
|
| 174 |
# Summary card
|
| 175 |
summary_card = VerificationUIComponents.create_summary_card_component()
|
| 176 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 177 |
# Results section
|
| 178 |
with gr.Row(visible=False) as results_section:
|
| 179 |
with gr.Column():
|
|
@@ -196,8 +221,8 @@ def create_simplified_interface():
|
|
| 196 |
# Error message display
|
| 197 |
error_message = gr.Markdown(
|
| 198 |
value="",
|
| 199 |
-
visible=
|
| 200 |
-
label="
|
| 201 |
)
|
| 202 |
|
| 203 |
# Hidden state for tracking
|
|
@@ -1238,32 +1263,30 @@ To revert, use "Reset to Default" button.
|
|
| 1238 |
)
|
| 1239 |
|
| 1240 |
def handle_download_csv(session: VerificationSession, store: JSONVerificationStore):
|
| 1241 |
-
"""Handle CSV download."""
|
| 1242 |
try:
|
| 1243 |
if not session or session.verified_count == 0:
|
| 1244 |
-
return None
|
| 1245 |
|
| 1246 |
csv_content = VerificationCSVExporter.generate_csv_content(session)
|
| 1247 |
filename = VerificationCSVExporter.generate_csv_filename()
|
| 1248 |
|
| 1249 |
-
# Write to temporary file
|
| 1250 |
-
import tempfile
|
| 1251 |
import os
|
|
|
|
| 1252 |
|
| 1253 |
-
#
|
| 1254 |
-
temp_dir =
|
| 1255 |
-
os.
|
| 1256 |
|
| 1257 |
-
|
| 1258 |
-
temp_path = os.path.join(temp_dir, filename)
|
| 1259 |
-
with open(temp_path, 'w') as f:
|
| 1260 |
f.write(csv_content)
|
| 1261 |
|
| 1262 |
-
|
| 1263 |
-
return temp_path, success_msg
|
| 1264 |
|
| 1265 |
except Exception as e:
|
| 1266 |
-
|
|
|
|
|
|
|
| 1267 |
|
| 1268 |
# Bind verification events
|
| 1269 |
load_dataset_btn.click(
|
|
@@ -1536,11 +1559,11 @@ To revert, use "Reset to Default" button.
|
|
| 1536 |
]
|
| 1537 |
)
|
| 1538 |
|
| 1539 |
-
# Save results button
|
| 1540 |
save_results_btn.click(
|
| 1541 |
handle_download_csv,
|
| 1542 |
inputs=[verification_session, verification_store],
|
| 1543 |
-
outputs=[
|
| 1544 |
)
|
| 1545 |
|
| 1546 |
# Clear session button
|
|
@@ -1576,6 +1599,93 @@ To revert, use "Reset to Default" button.
|
|
| 1576 |
]
|
| 1577 |
)
|
| 1578 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1579 |
# Bind events
|
| 1580 |
demo.load(
|
| 1581 |
initialize_session,
|
|
|
|
| 29 |
from src.core.simplified_medical_app import SimplifiedMedicalApp
|
| 30 |
from src.core.spiritual_state import SpiritualState
|
| 31 |
from src.interface.verification_ui import VerificationUIComponents
|
| 32 |
+
from src.interface.chaplain_feedback_ui import ChaplainFeedbackUIComponents
|
| 33 |
from src.core.test_datasets import TestDatasetManager
|
| 34 |
from src.core.verification_models import VerificationSession, VerificationRecord, TestMessage
|
| 35 |
from src.core.verification_store import JSONVerificationStore
|
| 36 |
from src.core.verification_csv_exporter import VerificationCSVExporter
|
| 37 |
+
from src.core.chaplain_models import ClassificationFlowResult, DistressIndicator, FollowUpQuestion
|
| 38 |
+
from src.core.error_pattern_analyzer import ErrorPatternAnalyzer
|
| 39 |
|
| 40 |
try:
|
| 41 |
from app_config import GRADIO_CONFIG
|
|
|
|
| 162 |
skip_btn = gr.Button("⏭️ Skip", scale=1)
|
| 163 |
next_btn = gr.Button("Next ➡️", scale=1)
|
| 164 |
|
| 165 |
+
# Save results button - using DownloadButton for Hugging Face compatibility
|
| 166 |
with gr.Row():
|
| 167 |
+
save_results_btn = gr.DownloadButton("💾 Download Results (CSV)", variant="primary", scale=2)
|
| 168 |
clear_session_btn = gr.Button("🗑️ Clear Session", scale=1)
|
| 169 |
|
| 170 |
with gr.Column(scale=1):
|
|
|
|
| 177 |
# Summary card
|
| 178 |
summary_card = VerificationUIComponents.create_summary_card_component()
|
| 179 |
|
| 180 |
+
# Chaplain Feedback Section - for displaying classification flows and collecting feedback
|
| 181 |
+
chaplain_feedback_section = gr.Row(visible=False)
|
| 182 |
+
with chaplain_feedback_section:
|
| 183 |
+
with gr.Column(scale=2):
|
| 184 |
+
# Classification flow display
|
| 185 |
+
flow_badge, flow_explanation, flow_content, flow_indicators = ChaplainFeedbackUIComponents.create_classification_flow_display()
|
| 186 |
+
|
| 187 |
+
# Tagging interface - returns individual components
|
| 188 |
+
(is_correct, subcategory, correct_classification,
|
| 189 |
+
question_issues, question_comments,
|
| 190 |
+
referral_issues, referral_comments,
|
| 191 |
+
indicator_issues, indicator_comments, general_notes) = ChaplainFeedbackUIComponents.create_tagging_interface()
|
| 192 |
+
|
| 193 |
+
# Submit feedback button
|
| 194 |
+
with gr.Row():
|
| 195 |
+
submit_feedback_btn = gr.Button("✓ Submit Feedback", variant="primary", scale=2)
|
| 196 |
+
skip_feedback_btn = gr.Button("⏭️ Skip Feedback", scale=1)
|
| 197 |
+
|
| 198 |
+
with gr.Column(scale=1):
|
| 199 |
+
# Error pattern summary
|
| 200 |
+
error_patterns, subcategory_breakdown, recommendations = ChaplainFeedbackUIComponents.create_error_pattern_summary()
|
| 201 |
+
|
| 202 |
# Results section
|
| 203 |
with gr.Row(visible=False) as results_section:
|
| 204 |
with gr.Column():
|
|
|
|
| 221 |
# Error message display
|
| 222 |
error_message = gr.Markdown(
|
| 223 |
value="",
|
| 224 |
+
visible=True,
|
| 225 |
+
label="Status"
|
| 226 |
)
|
| 227 |
|
| 228 |
# Hidden state for tracking
|
|
|
|
| 1263 |
)
|
| 1264 |
|
| 1265 |
def handle_download_csv(session: VerificationSession, store: JSONVerificationStore):
|
| 1266 |
+
"""Handle CSV download - returns file path for DownloadButton."""
|
| 1267 |
try:
|
| 1268 |
if not session or session.verified_count == 0:
|
| 1269 |
+
return None
|
| 1270 |
|
| 1271 |
csv_content = VerificationCSVExporter.generate_csv_content(session)
|
| 1272 |
filename = VerificationCSVExporter.generate_csv_filename()
|
| 1273 |
|
|
|
|
|
|
|
| 1274 |
import os
|
| 1275 |
+
import tempfile
|
| 1276 |
|
| 1277 |
+
# Use temp directory for Hugging Face compatibility
|
| 1278 |
+
temp_dir = tempfile.gettempdir()
|
| 1279 |
+
file_path = os.path.join(temp_dir, filename)
|
| 1280 |
|
| 1281 |
+
with open(file_path, 'w', encoding='utf-8') as f:
|
|
|
|
|
|
|
| 1282 |
f.write(csv_content)
|
| 1283 |
|
| 1284 |
+
return file_path
|
|
|
|
| 1285 |
|
| 1286 |
except Exception as e:
|
| 1287 |
+
import traceback
|
| 1288 |
+
print(f"CSV Export Error: {traceback.format_exc()}")
|
| 1289 |
+
return None
|
| 1290 |
|
| 1291 |
# Bind verification events
|
| 1292 |
load_dataset_btn.click(
|
|
|
|
| 1559 |
]
|
| 1560 |
)
|
| 1561 |
|
| 1562 |
+
# Save results button - DownloadButton triggers download directly
|
| 1563 |
save_results_btn.click(
|
| 1564 |
handle_download_csv,
|
| 1565 |
inputs=[verification_session, verification_store],
|
| 1566 |
+
outputs=[save_results_btn]
|
| 1567 |
)
|
| 1568 |
|
| 1569 |
# Clear session button
|
|
|
|
| 1599 |
]
|
| 1600 |
)
|
| 1601 |
|
| 1602 |
+
# Chaplain Feedback Event Handlers
|
| 1603 |
+
def show_chaplain_feedback_section():
|
| 1604 |
+
"""Show chaplain feedback section after message review."""
|
| 1605 |
+
return gr.Row(visible=True)
|
| 1606 |
+
|
| 1607 |
+
def handle_submit_feedback(
|
| 1608 |
+
classification_correct: bool,
|
| 1609 |
+
classification_subcategory: Optional[str],
|
| 1610 |
+
correct_classification: Optional[str],
|
| 1611 |
+
question_issues: List[str],
|
| 1612 |
+
question_comments: str,
|
| 1613 |
+
referral_issues: List[str],
|
| 1614 |
+
referral_comments: str,
|
| 1615 |
+
indicator_issues: str,
|
| 1616 |
+
indicator_comments: str,
|
| 1617 |
+
general_notes: str,
|
| 1618 |
+
session: VerificationSession,
|
| 1619 |
+
current_idx: int,
|
+        message_queue: List[str],
+    ):
+        """Handle chaplain feedback submission."""
+        try:
+            if not session or current_idx >= len(message_queue):
+                return "❌ Error: Invalid session state", session, current_idx
+
+            # Create tagging record
+            from src.core.chaplain_models import TaggingRecord
+            import uuid
+
+            current_message_id = message_queue[current_idx]
+
+            tagging_record = TaggingRecord(
+                record_id=str(uuid.uuid4()),
+                message_id=current_message_id,
+                is_classification_correct=classification_correct,
+                classification_subcategory=classification_subcategory,
+                correct_classification=correct_classification,
+                question_issues=question_issues or [],
+                question_comments=question_comments,
+                referral_issues=referral_issues or [],
+                referral_comments=referral_comments,
+                indicator_issues=[i.strip() for i in indicator_issues.split(",") if i.strip()],
+                indicator_comments=indicator_comments,
+                general_notes=general_notes,
+            )
+
+            # Store tagging record in session (would need to extend VerificationSession)
+            # For now, just confirm submission
+            success_msg = f"✅ Feedback submitted for message {current_idx + 1}"
+
+            return success_msg, session, current_idx
+
+        except Exception as e:
+            return f"❌ Error: {str(e)}", session, current_idx
+
+    def display_classification_flow(flow_result: Optional[ClassificationFlowResult]):
+        """Display classification flow result."""
+        if not flow_result:
+            return "", "", "", ""
+
+        badge, explanation, content, indicators = ChaplainFeedbackUIComponents.render_classification_flow(flow_result)
+        return badge, explanation, content, indicators
+
+    # Bind chaplain feedback events
+    submit_feedback_btn.click(
+        handle_submit_feedback,
+        inputs=[
+            is_correct,              # is_correct radio
+            subcategory,             # subcategory dropdown
+            correct_classification,  # correct_classification radio
+            question_issues,         # question_issues checkbox
+            question_comments,       # question_comments textbox
+            referral_issues,         # referral_issues checkbox
+            referral_comments,       # referral_comments textbox
+            indicator_issues,        # indicator_issues textbox
+            indicator_comments,      # indicator_comments textbox
+            general_notes,
+            verification_session,
+            current_message_index,
+            message_queue,
+        ],
+        outputs=[error_message, verification_session, current_message_index]
+    ).then(
+        lambda: gr.Row(visible=False),
+        outputs=[chaplain_feedback_section]
+    )
+
     # Bind events
     demo.load(
         initialize_session,
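Aside: the comma-splitting comprehension used to build `indicator_issues` for the `TaggingRecord` above can be checked in isolation. A minimal sketch (the helper name `parse_indicator_issues` is hypothetical, not part of this commit):

```python
def parse_indicator_issues(raw: str) -> list[str]:
    """Split a comma-separated string into trimmed, non-empty items.

    Mirrors the comprehension used in handle_submit_feedback:
    whitespace-only segments are dropped, others are stripped.
    """
    return [item.strip() for item in raw.split(",") if item.strip()]
```

Note that an empty input yields an empty list (not `[""]`), because `"".split(",")` produces one empty segment that the `if item.strip()` guard filters out.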
tests/chaplain_feedback/__init__.py ADDED
@@ -0,0 +1,2 @@
+# tests/chaplain_feedback/__init__.py
+"""Tests for Chaplain Feedback & Tagging System."""
tests/chaplain_feedback/conftest.py ADDED
@@ -0,0 +1,145 @@
+# conftest.py
+"""
+Pytest fixtures for Chaplain Feedback tests.
+"""
+
+import pytest
+from hypothesis import strategies as st
+from datetime import datetime
+
+from src.core.chaplain_models import (
+    DistressIndicator,
+    FollowUpQuestion,
+    ClassificationFlowResult,
+    TaggingRecord,
+    InteractionStepLog,
+    INDICATOR_DEFINITIONS,
+    CLASSIFICATION_SUBCATEGORIES,
+    QUESTION_ISSUE_TYPES,
+    REFERRAL_ISSUE_TYPES,
+    INTERACTION_STEP_TYPES,
+)
+
+
+# =============================================================================
+# Hypothesis Strategies for generating test data
+# =============================================================================
+
+def valid_id_strategy():
+    """Generate valid IDs."""
+    return st.text(
+        alphabet="abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-",
+        min_size=1,
+        max_size=20,
+    )
+
+
+def distress_indicator_strategy():
+    """Generate random DistressIndicator instances."""
+    return st.builds(
+        DistressIndicator,
+        indicator_text=st.text(min_size=1, max_size=200),
+        category=st.sampled_from([
+            "Emotional", "Grief", "Existential", "Expressions",
+            "Spiritual", "Medical", "Social", "Cultural",
+            "Engagement", "Guilt", "Anger", "Aging",
+            "Environment", "Independence"
+        ]),
+        subcategory=st.text(min_size=1, max_size=100),
+        severity=st.sampled_from(["red", "yellow"]),
+        confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+        definition_reference=st.text(max_size=20),
+    )
+
+
+def follow_up_question_strategy():
+    """Generate random FollowUpQuestion instances."""
+    return st.builds(
+        FollowUpQuestion,
+        question_id=valid_id_strategy(),
+        question_text=st.text(min_size=1, max_size=500),
+        purpose=st.text(min_size=1, max_size=200),
+    )
+
+
+def classification_flow_result_strategy():
+    """Generate random ClassificationFlowResult instances."""
+    return st.builds(
+        ClassificationFlowResult,
+        classification=st.sampled_from(["red", "yellow", "green"]),
+        confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+        indicators=st.lists(distress_indicator_strategy(), max_size=5),
+        explanation=st.text(max_size=500),
+        permission_check_message=st.one_of(st.none(), st.text(max_size=300)),
+        referral_message=st.one_of(st.none(), st.text(max_size=500)),
+        consent_status=st.one_of(st.none(), st.sampled_from(["granted", "declined"])),
+        follow_up_questions=st.lists(follow_up_question_strategy(), max_size=3),
+        patient_responses=st.lists(st.text(max_size=200), max_size=3),
+        re_evaluation_result=st.one_of(st.none(), st.sampled_from(["red", "green"])),
+    )
+
+
+def tagging_record_strategy():
+    """Generate random TaggingRecord instances."""
+    return st.builds(
+        TaggingRecord,
+        record_id=valid_id_strategy(),
+        message_id=valid_id_strategy(),
+        is_classification_correct=st.booleans(),
+        classification_subcategory=st.one_of(
+            st.none(),
+            st.sampled_from(CLASSIFICATION_SUBCATEGORIES)
+        ),
+        correct_classification=st.one_of(
+            st.none(),
+            st.sampled_from(["red", "yellow", "green"])
+        ),
+        question_issues=st.lists(
+            st.sampled_from(QUESTION_ISSUE_TYPES),
+            max_size=3,
+            unique=True
+        ),
+        question_comments=st.one_of(st.none(), st.text(max_size=200)),
+        referral_issues=st.lists(
+            st.sampled_from(REFERRAL_ISSUE_TYPES),
+            max_size=3,
+            unique=True
+        ),
+        referral_comments=st.one_of(st.none(), st.text(max_size=200)),
+        indicator_issues=st.lists(st.text(min_size=1, max_size=50), max_size=5),
+        indicator_comments=st.one_of(st.none(), st.text(max_size=200)),
+        general_notes=st.text(max_size=300),
+        timestamp=st.just(datetime.now()),
+    )
+
+
+def interaction_step_log_strategy():
+    """Generate random InteractionStepLog instances (without nested tagging)."""
+    return st.builds(
+        InteractionStepLog,
+        step_id=valid_id_strategy(),
+        session_id=valid_id_strategy(),
+        message_id=valid_id_strategy(),
+        step_type=st.sampled_from(INTERACTION_STEP_TYPES),
+        input_text=st.text(max_size=500),
+        model_output=st.text(max_size=500),
+        approval_status=st.one_of(st.none(), st.sampled_from(["approved", "disapproved"])),
+        tagging_data=st.none(),  # Simplified - no nested tagging for basic tests
+        timestamp=st.just(datetime.now()),
+    )
+
+
+def interaction_step_log_with_tagging_strategy():
+    """Generate random InteractionStepLog instances with nested tagging."""
+    return st.builds(
+        InteractionStepLog,
+        step_id=valid_id_strategy(),
+        session_id=valid_id_strategy(),
+        message_id=valid_id_strategy(),
+        step_type=st.sampled_from(INTERACTION_STEP_TYPES),
+        input_text=st.text(max_size=500),
+        model_output=st.text(max_size=500),
+        approval_status=st.one_of(st.none(), st.sampled_from(["approved", "disapproved"])),
+        tagging_data=st.one_of(st.none(), tagging_record_strategy()),
+        timestamp=st.just(datetime.now()),
+    )
tests/chaplain_feedback/test_properties_classification_flow.py ADDED
@@ -0,0 +1,297 @@
+# test_properties_classification_flow.py
+"""
+Property-based tests for Classification Flow Manager.
+
+Tests universal properties that should hold across all inputs for
+RED/YELLOW/GREEN classification flows.
+"""
+
+import pytest
+from hypothesis import given, strategies as st
+
+from src.core.classification_flow_manager import ClassificationFlowManager
+from src.core.content_generator import ContentGenerator
+from src.core.chaplain_models import DistressIndicator
+from tests.chaplain_feedback.conftest import distress_indicator_strategy
+
+
+class TestClassificationFlowProperties:
+    """Property-based tests for ClassificationFlowManager."""
+
+    def setup_method(self):
+        """Set up test fixtures."""
+        self.content_generator = ContentGenerator()
+        self.flow_manager = ClassificationFlowManager(self.content_generator)
+
+    @given(
+        message=st.text(min_size=1, max_size=500),
+        confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+        indicators=st.lists(distress_indicator_strategy(), min_size=1, max_size=5),
+        consent_status=st.sampled_from(["granted", "declined"])
+    )
+    def test_property_4_red_flow_displays_all_content(
+        self, message, confidence, indicators, consent_status
+    ):
+        """
+        **Feature: chaplain-feedback-system, Property 4: RED Flow Displays All Content**
+        **Validates: Requirements 1.5**
+
+        For any RED classification result, the UI should display all three content types:
+        explanation, permission check message, and referral message (if consent granted).
+        """
+        # Execute RED flow
+        result = self.flow_manager.execute_red_flow(
+            message=message,
+            confidence=confidence,
+            indicators=indicators,
+            consent_status=consent_status
+        )
+
+        # Verify all required content is present
+        assert result.classification == "red"
+        assert result.explanation is not None and result.explanation.strip() != ""
+        assert result.permission_check_message is not None and result.permission_check_message.strip() != ""
+        assert result.consent_status == consent_status
+
+        # If consent granted, referral message should be present
+        if consent_status == "granted":
+            assert result.referral_message is not None and result.referral_message.strip() != ""
+        else:
+            # If consent declined, referral message should be None
+            assert result.referral_message is None
+
+        # Verify indicators are preserved
+        assert result.indicators == indicators
+        assert result.confidence == confidence
+
+    @given(
+        message=st.text(min_size=1, max_size=500),
+        confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+        indicators=st.lists(distress_indicator_strategy(), min_size=1, max_size=5)
+    )
+    def test_property_5_yellow_explanation_differentiates(
+        self, message, confidence, indicators
+    ):
+        """
+        **Feature: chaplain-feedback-system, Property 5: YELLOW Explanation Differentiates**
+        **Validates: Requirements 2.1**
+
+        For any YELLOW classification, the explanation should contain reasoning
+        for why it's not RED and why it's not GREEN.
+        """
+        # Execute YELLOW flow
+        result = self.flow_manager.execute_yellow_flow(
+            message=message,
+            confidence=confidence,
+            indicators=indicators
+        )
+
+        # Verify explanation differentiates from RED and GREEN
+        explanation = result.explanation.lower()
+
+        # Should explain why not RED
+        assert any(phrase in explanation for phrase in [
+            "why not red", "not red", "not meet the threshold",
+            "do not meet", "further clarification", "not severe"
+        ]), f"Explanation should explain why not RED: {result.explanation}"
+
+        # Should explain why not GREEN
+        assert any(phrase in explanation for phrase in [
+            "why not green", "not green", "indicators", "concerns",
+            "warrant follow-up", "suggest possible"
+        ]), f"Explanation should explain why not GREEN: {result.explanation}"
+
+        # Verify other YELLOW flow properties
+        assert result.classification == "yellow"
+        assert result.explanation is not None and result.explanation.strip() != ""
+        assert len(result.follow_up_questions) >= 2
+        assert len(result.follow_up_questions) <= 3
+
+    @given(
+        message=st.text(min_size=1, max_size=500),
+        confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+        indicators=st.lists(distress_indicator_strategy(), min_size=1, max_size=5)
+    )
+    def test_property_6_yellow_generates_2_3_questions(
+        self, message, confidence, indicators
+    ):
+        """
+        **Feature: chaplain-feedback-system, Property 6: YELLOW Generates 2-3 Questions**
+        **Validates: Requirements 2.2**
+
+        For any YELLOW classification, the system should generate between 2 and 3
+        follow-up questions, each containing 1-2 clarifying questions.
+        """
+        # Execute YELLOW flow
+        result = self.flow_manager.execute_yellow_flow(
+            message=message,
+            confidence=confidence,
+            indicators=indicators
+        )
+
+        # Verify question count
+        assert 2 <= len(result.follow_up_questions) <= 3, (
+            f"Expected 2-3 questions, got {len(result.follow_up_questions)}"
+        )
+
+        # Verify each question has required fields
+        for question in result.follow_up_questions:
+            assert question.question_id is not None and question.question_id.strip() != ""
+            assert question.question_text is not None and question.question_text.strip() != ""
+            assert question.purpose is not None and question.purpose.strip() != ""
+
+            # Each question should contain 1-2 clarifying questions (check for question marks)
+            question_marks = question.question_text.count("?")
+            assert 1 <= question_marks <= 2, (
+                f"Expected 1-2 questions per follow-up, got {question_marks} in: {question.question_text}"
+            )
+
+    @given(
+        message=st.text(min_size=1, max_size=500),
+        confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+        indicators=st.lists(distress_indicator_strategy(), max_size=2)  # GREEN should have few/no indicators
+    )
+    def test_property_9_green_explanation_generated(
+        self, message, confidence, indicators
+    ):
+        """
+        **Feature: chaplain-feedback-system, Property 9: GREEN Explanation Generated**
+        **Validates: Requirements 3.1, 3.2**
+
+        For any GREEN classification, an explanation should be generated explaining
+        why no spiritual indicators were found.
+        """
+        # Execute GREEN flow
+        result = self.flow_manager.execute_green_flow(
+            message=message,
+            confidence=confidence,
+            indicators=indicators
+        )
+
+        # Verify explanation is generated
+        assert result.classification == "green"
+        assert result.explanation is not None and result.explanation.strip() != ""
+
+        # Explanation should mention no indicators or no distress
+        explanation = result.explanation.lower()
+        assert any(phrase in explanation for phrase in [
+            "no spiritual distress", "no indicators", "not suggest spiritual",
+            "no spiritual concerns", "no further steps"
+        ]), f"GREEN explanation should mention no distress: {result.explanation}"
+
+        # Should explain why not RED or YELLOW
+        assert any(phrase in explanation for phrase in [
+            "why not red", "why not yellow", "not contain", "does not suggest"
+        ]), f"GREEN explanation should differentiate from RED/YELLOW: {result.explanation}"
+
+        # GREEN flow should not have RED/YELLOW specific content
+        assert result.permission_check_message is None
+        assert result.referral_message is None
+        assert result.consent_status is None
+        assert len(result.follow_up_questions) == 0
+        assert len(result.patient_responses) == 0
+        assert result.re_evaluation_result is None
+
+    @given(
+        message=st.text(min_size=1, max_size=500),
+        confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+        indicators=st.lists(distress_indicator_strategy(), min_size=1, max_size=5),
+        # Generate responses that contain escalation keywords
+        escalation_responses=st.lists(
+            st.sampled_from([
+                "I feel hopeless about everything",
+                "I feel worthless and can't go on",
+                "There's no point in anything anymore",
+                "I want to give up completely",
+                "This is unbearable, I can't take it"
+            ]),
+            min_size=1,
+            max_size=3
+        )
+    )
+    def test_property_7_yellow_escalation_to_red(
+        self, message, confidence, indicators, escalation_responses
+    ):
+        """
+        **Feature: chaplain-feedback-system, Property 7: YELLOW Escalation to RED**
+        **Validates: Requirements 2.4**
+
+        For any YELLOW classification where simulated patient responses indicate distress,
+        the system should transition to RED FLAG flow.
+        """
+        # Execute YELLOW flow with escalation responses
+        result = self.flow_manager.execute_yellow_flow(
+            message=message,
+            confidence=confidence,
+            indicators=indicators,
+            patient_responses=escalation_responses
+        )
+
+        # Verify escalation occurred
+        assert result.re_evaluation_result == "red", (
+            f"Expected escalation to RED, got {result.re_evaluation_result} "
+            f"for responses: {escalation_responses}"
+        )
+
+        # Test the escalation method
+        escalated_result = self.flow_manager.escalate_yellow_to_red(result, message)
+
+        # Verify escalated result is RED
+        assert escalated_result.classification == "red"
+        assert escalated_result.explanation is not None
+        assert escalated_result.permission_check_message is not None
+        assert escalated_result.referral_message is not None  # Should have consent granted
+        assert escalated_result.consent_status == "granted"
+
+    @given(
+        message=st.text(min_size=1, max_size=500),
+        confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+        indicators=st.lists(distress_indicator_strategy(), min_size=1, max_size=5),
+        # Generate responses that contain downgrade keywords
+        downgrade_responses=st.lists(
+            st.sampled_from([
+                "I'm feeling better now",
+                "Everything is okay",
+                "I have good support from my family",
+                "I'm not worried about it",
+                "I'm managing well",
+                "I feel hopeful about the future"
+            ]),
+            min_size=1,
+            max_size=3
+        )
+    )
+    def test_property_8_yellow_downgrade_to_green(
+        self, message, confidence, indicators, downgrade_responses
+    ):
+        """
+        **Feature: chaplain-feedback-system, Property 8: YELLOW Downgrade to GREEN**
+        **Validates: Requirements 2.5**
+
+        For any YELLOW classification where simulated patient responses indicate no distress,
+        the system should transition to GREEN status.
+        """
+        # Execute YELLOW flow with downgrade responses
+        result = self.flow_manager.execute_yellow_flow(
+            message=message,
+            confidence=confidence,
+            indicators=indicators,
+            patient_responses=downgrade_responses
+        )
+
+        # Verify downgrade occurred
+        assert result.re_evaluation_result == "green", (
+            f"Expected downgrade to GREEN, got {result.re_evaluation_result} "
+            f"for responses: {downgrade_responses}"
+        )
+
+        # Test the downgrade method
+        downgraded_result = self.flow_manager.downgrade_yellow_to_green(result, message)
+
+        # Verify downgraded result is GREEN
+        assert downgraded_result.classification == "green"
+        assert downgraded_result.explanation is not None
+        assert downgraded_result.permission_check_message is None
+        assert downgraded_result.referral_message is None
+        assert downgraded_result.consent_status is None
+        assert len(downgraded_result.follow_up_questions) == 0
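Aside: properties 7 and 8 above assume the flow manager re-evaluates YELLOW cases by matching keywords in simulated patient responses. A minimal keyword-matching sketch of that idea (the helper and phrase list here are hypothetical illustrations, not the commit's actual implementation):

```python
# Phrases drawn from the escalation examples used in the tests above
ESCALATION_PHRASES = (
    "hopeless", "worthless", "no point", "give up",
    "unbearable", "can't take it", "can't go on",
)

def detect_escalation(responses: list[str]) -> bool:
    """Return True if any response contains an escalation phrase
    (case-insensitive substring match)."""
    return any(
        phrase in response.lower()
        for response in responses
        for phrase in ESCALATION_PHRASES
    )
```

Under this sketch, every sampled escalation response in `test_property_7_yellow_escalation_to_red` would trigger a RED re-evaluation, while the downgrade responses in property 8 would not.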
tests/chaplain_feedback/test_properties_content_generator.py ADDED
@@ -0,0 +1,399 @@
+# test_properties_content_generator.py
+"""
+Property-based tests for Content Generator Service.
+
+Tests that content generation follows the specification requirements.
+"""
+
+import pytest
+from hypothesis import given, settings, assume
+from hypothesis import strategies as st
+
+from src.core.chaplain_models import (
+    DistressIndicator,
+    FollowUpQuestion,
+)
+from src.core.content_generator import ContentGenerator
+
+from tests.chaplain_feedback.conftest import (
+    distress_indicator_strategy,
+)
+
+
+# =============================================================================
+# Strategies for content generator tests
+# =============================================================================
+
+def non_empty_indicators_strategy():
+    """Generate non-empty list of distress indicators."""
+    return st.lists(distress_indicator_strategy(), min_size=1, max_size=5)
+
+
+def red_indicators_strategy():
+    """Generate list with at least one RED severity indicator."""
+    return st.lists(
+        distress_indicator_strategy(),
+        min_size=1,
+        max_size=5
+    ).filter(lambda indicators: any(i.severity == "red" for i in indicators))
+
+
+def patient_message_strategy():
+    """Generate patient message text."""
+    return st.text(min_size=10, max_size=500).filter(lambda s: s.strip())
+
+
+# =============================================================================
+# Property Tests for RED Explanation
+# =============================================================================
+
+class TestRedExplanationContainsIndicators:
+    """
+    **Feature: chaplain-feedback-system, Property 1: RED Explanation Contains Indicators**
+    **Validates: Requirements 1.1**
+
+    For any RED classification, the generated explanation should reference
+    at least one distress indicator from the definitions document categories.
+    """
+
+    @given(
+        indicators=non_empty_indicators_strategy(),
+        message=patient_message_strategy()
+    )
+    @settings(max_examples=100)
+    def test_red_explanation_contains_indicator_references(self, indicators, message):
+        """
+        **Feature: chaplain-feedback-system, Property 1: RED Explanation Contains Indicators**
+        **Validates: Requirements 1.1**
+
+        For any RED classification with indicators, the explanation should
+        reference at least one indicator's subcategory or category.
+        """
+        generator = ContentGenerator()
+        explanation = generator.generate_explanation("red", indicators, message)
+
+        # The explanation should contain at least one indicator reference
+        indicator_referenced = False
+        for indicator in indicators:
+            if indicator.subcategory in explanation or indicator.category in explanation:
+                indicator_referenced = True
+                break
+
+        assert indicator_referenced, (
+            f"RED explanation should reference at least one indicator. "
+            f"Indicators: {[i.subcategory for i in indicators]}"
+        )
+
+    @given(
+        indicators=non_empty_indicators_strategy(),
+        message=patient_message_strategy()
+    )
+    @settings(max_examples=100)
+    def test_red_explanation_mentions_red_flag(self, indicators, message):
+        """
+        **Feature: chaplain-feedback-system, Property 1: RED Explanation Contains Indicators**
+        **Validates: Requirements 1.1**
+
+        For any RED classification, the explanation should mention RED FLAG.
+        """
+        generator = ContentGenerator()
+        explanation = generator.generate_explanation("red", indicators, message)
+
+        assert "RED FLAG" in explanation or "red" in explanation.lower(), (
+            "RED explanation should mention RED FLAG classification"
+        )
+
+    @given(
+        indicators=non_empty_indicators_strategy(),
+        message=patient_message_strategy()
+    )
+    @settings(max_examples=100)
+    def test_red_explanation_mentions_spiritual_care(self, indicators, message):
+        """
+        **Feature: chaplain-feedback-system, Property 1: RED Explanation Contains Indicators**
+        **Validates: Requirements 1.1**
+
+        For any RED classification, the explanation should mention spiritual care team.
+        """
+        generator = ContentGenerator()
+        explanation = generator.generate_explanation("red", indicators, message)
+
+        assert "spiritual" in explanation.lower(), (
+            "RED explanation should mention spiritual care"
+        )
+
+
+# =============================================================================
+# Property Tests for Permission Check Message
+# =============================================================================
+
+class TestRedPermissionCheckGenerated:
+    """
+    **Feature: chaplain-feedback-system, Property 2: RED Permission Check Generated**
+    **Validates: Requirements 1.2**
+
+    For any RED classification, a patient permission check message should be
+    generated and contain consent-related language.
+    """
+
+    @given(indicators=non_empty_indicators_strategy())
+    @settings(max_examples=100)
+    def test_permission_check_contains_spiritual_support(self, indicators):
+        """
+        **Feature: chaplain-feedback-system, Property 2: RED Permission Check Generated**
+        **Validates: Requirements 1.2**
+
+        For any RED classification, the permission check message should
+        contain "spiritual" language.
+        """
+        generator = ContentGenerator()
+        message = generator.generate_permission_check(indicators)
+
+        assert "spiritual" in message.lower(), (
+            "Permission check message should mention spiritual support"
+        )
+
+    @given(indicators=non_empty_indicators_strategy())
+    @settings(max_examples=100)
+    def test_permission_check_contains_consent_language(self, indicators):
+        """
+        **Feature: chaplain-feedback-system, Property 2: RED Permission Check Generated**
+        **Validates: Requirements 1.2**
+
+        For any RED classification, the permission check message should
+        contain consent-related language.
+        """
+        generator = ContentGenerator()
+        message = generator.generate_permission_check(indicators)
+
+        # Check for consent-related terms
+        consent_terms = ["consent", "permission", "voluntary", "would you like"]
+        has_consent_language = any(term in message.lower() for term in consent_terms)
|
| 173 |
+
|
| 174 |
+
assert has_consent_language, (
|
| 175 |
+
f"Permission check message should contain consent language. "
|
| 176 |
+
f"Message: {message[:200]}..."
|
| 177 |
+
)
|
| 178 |
+
|
| 179 |
+
@given(indicators=non_empty_indicators_strategy())
|
| 180 |
+
@settings(max_examples=100)
|
| 181 |
+
def test_permission_check_is_non_empty(self, indicators):
|
| 182 |
+
"""
|
| 183 |
+
**Feature: chaplain-feedback-system, Property 2: RED Permission Check Generated**
|
| 184 |
+
**Validates: Requirements 1.2**
|
| 185 |
+
|
| 186 |
+
For any RED classification, a non-empty permission check message
|
| 187 |
+
should be generated.
|
| 188 |
+
"""
|
| 189 |
+
generator = ContentGenerator()
|
| 190 |
+
message = generator.generate_permission_check(indicators)
|
| 191 |
+
|
| 192 |
+
assert message and len(message.strip()) > 0, (
|
| 193 |
+
"Permission check message should not be empty"
|
| 194 |
+
)
|
| 195 |
+
|
| 196 |
+
|
| 197 |
+
|
| 198 |
+
# =============================================================================
|
| 199 |
+
# Property Tests for Referral Message
|
| 200 |
+
# =============================================================================
|
| 201 |
+
|
| 202 |
+
class TestRedReferralMessageContainsRequiredSections:
|
| 203 |
+
"""
|
| 204 |
+
**Feature: chaplain-feedback-system, Property 3: RED Referral Message Contains Required Sections**
|
| 205 |
+
**Validates: Requirements 1.3**
|
| 206 |
+
|
| 207 |
+
For any RED classification with granted consent, the referral message should
|
| 208 |
+
contain: background information, detected indicators, and justification.
|
| 209 |
+
"""
|
| 210 |
+
|
| 211 |
+
@given(
|
| 212 |
+
indicators=non_empty_indicators_strategy(),
|
| 213 |
+
message=patient_message_strategy()
|
| 214 |
+
)
|
| 215 |
+
@settings(max_examples=100)
|
| 216 |
+
def test_referral_message_contains_background(self, indicators, message):
|
| 217 |
+
"""
|
| 218 |
+
**Feature: chaplain-feedback-system, Property 3: RED Referral Message Contains Required Sections**
|
| 219 |
+
**Validates: Requirements 1.3**
|
| 220 |
+
|
| 221 |
+
For any RED classification, the referral message should contain
|
| 222 |
+
background information section.
|
| 223 |
+
"""
|
| 224 |
+
generator = ContentGenerator()
|
| 225 |
+
explanation = generator.generate_explanation("red", indicators, message)
|
| 226 |
+
referral = generator.generate_referral_message(message, indicators, explanation)
|
| 227 |
+
|
| 228 |
+
assert "BACKGROUND" in referral.upper(), (
|
| 229 |
+
"Referral message should contain BACKGROUND section"
|
| 230 |
+
)
|
| 231 |
+
|
| 232 |
+
@given(
|
| 233 |
+
indicators=non_empty_indicators_strategy(),
|
| 234 |
+
message=patient_message_strategy()
|
| 235 |
+
)
|
| 236 |
+
@settings(max_examples=100)
|
| 237 |
+
def test_referral_message_contains_indicators_section(self, indicators, message):
|
| 238 |
+
"""
|
| 239 |
+
**Feature: chaplain-feedback-system, Property 3: RED Referral Message Contains Required Sections**
|
| 240 |
+
**Validates: Requirements 1.3**
|
| 241 |
+
|
| 242 |
+
For any RED classification, the referral message should contain
|
| 243 |
+
indicators section.
|
| 244 |
+
"""
|
| 245 |
+
generator = ContentGenerator()
|
| 246 |
+
explanation = generator.generate_explanation("red", indicators, message)
|
| 247 |
+
referral = generator.generate_referral_message(message, indicators, explanation)
|
| 248 |
+
|
| 249 |
+
assert "INDICATORS" in referral.upper(), (
|
| 250 |
+
"Referral message should contain INDICATORS section"
|
| 251 |
+
)
|
| 252 |
+
|
| 253 |
+
@given(
|
| 254 |
+
indicators=non_empty_indicators_strategy(),
|
| 255 |
+
message=patient_message_strategy()
|
| 256 |
+
)
|
| 257 |
+
@settings(max_examples=100)
|
| 258 |
+
def test_referral_message_contains_justification(self, indicators, message):
|
| 259 |
+
"""
|
| 260 |
+
**Feature: chaplain-feedback-system, Property 3: RED Referral Message Contains Required Sections**
|
| 261 |
+
**Validates: Requirements 1.3**
|
| 262 |
+
|
| 263 |
+
For any RED classification, the referral message should contain
|
| 264 |
+
justification section.
|
| 265 |
+
"""
|
| 266 |
+
generator = ContentGenerator()
|
| 267 |
+
explanation = generator.generate_explanation("red", indicators, message)
|
| 268 |
+
referral = generator.generate_referral_message(message, indicators, explanation)
|
| 269 |
+
|
| 270 |
+
assert "JUSTIFICATION" in referral.upper(), (
|
| 271 |
+
"Referral message should contain JUSTIFICATION section"
|
| 272 |
+
)
|
| 273 |
+
|
| 274 |
+
@given(
|
| 275 |
+
indicators=non_empty_indicators_strategy(),
|
| 276 |
+
message=patient_message_strategy()
|
| 277 |
+
)
|
| 278 |
+
@settings(max_examples=100)
|
| 279 |
+
def test_referral_message_references_indicators(self, indicators, message):
|
| 280 |
+
"""
|
| 281 |
+
**Feature: chaplain-feedback-system, Property 3: RED Referral Message Contains Required Sections**
|
| 282 |
+
**Validates: Requirements 1.3**
|
| 283 |
+
|
| 284 |
+
For any RED classification with indicators, the referral message should
|
| 285 |
+
reference at least one indicator.
|
| 286 |
+
"""
|
| 287 |
+
generator = ContentGenerator()
|
| 288 |
+
explanation = generator.generate_explanation("red", indicators, message)
|
| 289 |
+
referral = generator.generate_referral_message(message, indicators, explanation)
|
| 290 |
+
|
| 291 |
+
# Check that at least one indicator is referenced
|
| 292 |
+
indicator_referenced = False
|
| 293 |
+
for indicator in indicators:
|
| 294 |
+
if indicator.subcategory in referral or indicator.category in referral:
|
| 295 |
+
indicator_referenced = True
|
| 296 |
+
break
|
| 297 |
+
|
| 298 |
+
assert indicator_referenced, (
|
| 299 |
+
f"Referral message should reference at least one indicator. "
|
| 300 |
+
f"Indicators: {[i.subcategory for i in indicators]}"
|
| 301 |
+
)
|
| 302 |
+
|
| 303 |
+
|
| 304 |
+
|
| 305 |
+
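The referral-message properties above only assert on section headers and indicator mentions, so they can be illustrated without the project code. A minimal stdlib-only sketch of a message format that would satisfy them; `build_referral` is a hypothetical stand-in, not the project's `ContentGenerator.generate_referral_message`:

```python
# Hypothetical stand-in for ContentGenerator.generate_referral_message,
# showing the sectioned layout the property tests above assert on.
def build_referral(patient_message, subcategories, explanation):
    lines = [
        "BACKGROUND:",
        f"  Patient message: {patient_message}",
        "INDICATORS:",
    ]
    lines += [f"  - {sub}" for sub in subcategories]
    lines += ["JUSTIFICATION:", f"  {explanation}"]
    return "\n".join(lines)


referral = build_referral(
    "I feel so guilty about everything",
    ["Excessive guilt"],
    "RED FLAG: indicators of spiritual distress detected.",
)

# The same checks the property tests perform:
assert "BACKGROUND" in referral.upper()
assert "INDICATORS" in referral.upper()
assert "JUSTIFICATION" in referral.upper()
assert "Excessive guilt" in referral  # at least one indicator referenced
```

Any generator whose output keeps these three labeled sections and echoes at least one detected subcategory will pass the four referral properties.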
# =============================================================================
# Property Tests for Follow-Up Questions
# =============================================================================

class TestYellowGenerates2To3Questions:
    """
    **Feature: chaplain-feedback-system, Property 6: YELLOW Generates 2-3 Questions**
    **Validates: Requirements 2.2**

    For any YELLOW classification, the system should generate between 2 and 3
    follow-up questions, each containing 1-2 clarifying questions.
    """

    @given(
        indicators=st.lists(distress_indicator_strategy(), min_size=0, max_size=5),
        message=patient_message_strategy()
    )
    @settings(max_examples=100)
    def test_follow_up_questions_count_in_range(self, indicators, message):
        """
        **Feature: chaplain-feedback-system, Property 6: YELLOW Generates 2-3 Questions**
        **Validates: Requirements 2.2**

        For any YELLOW classification, the number of follow-up questions
        should be between 2 and 3.
        """
        generator = ContentGenerator()
        questions = generator.generate_follow_up_questions(message, indicators)

        assert 2 <= len(questions) <= 3, (
            f"Should generate 2-3 follow-up questions, got {len(questions)}"
        )

    @given(
        indicators=st.lists(distress_indicator_strategy(), min_size=0, max_size=5),
        message=patient_message_strategy()
    )
    @settings(max_examples=100)
    def test_follow_up_questions_have_required_fields(self, indicators, message):
        """
        **Feature: chaplain-feedback-system, Property 6: YELLOW Generates 2-3 Questions**
        **Validates: Requirements 2.2**

        For any YELLOW classification, each follow-up question should have
        question_id, question_text, and purpose fields.
        """
        generator = ContentGenerator()
        questions = generator.generate_follow_up_questions(message, indicators)

        for question in questions:
            assert question.question_id, "Question should have question_id"
            assert question.question_text, "Question should have question_text"
            assert question.purpose, "Question should have purpose"

    @given(
        indicators=st.lists(distress_indicator_strategy(), min_size=0, max_size=5),
        message=patient_message_strategy()
    )
    @settings(max_examples=100)
    def test_follow_up_questions_are_follow_up_question_instances(self, indicators, message):
        """
        **Feature: chaplain-feedback-system, Property 6: YELLOW Generates 2-3 Questions**
        **Validates: Requirements 2.2**

        For any YELLOW classification, all generated questions should be
        FollowUpQuestion instances.
        """
        generator = ContentGenerator()
        questions = generator.generate_follow_up_questions(message, indicators)

        for question in questions:
            assert isinstance(question, FollowUpQuestion), (
                f"Question should be FollowUpQuestion instance, got {type(question)}"
            )

    @given(
        indicators=st.lists(distress_indicator_strategy(), min_size=0, max_size=5),
        message=patient_message_strategy()
    )
    @settings(max_examples=100)
    def test_follow_up_questions_have_unique_ids(self, indicators, message):
        """
        **Feature: chaplain-feedback-system, Property 6: YELLOW Generates 2-3 Questions**
        **Validates: Requirements 2.2**

        For any YELLOW classification, all generated questions should have
        unique question_ids.
        """
        generator = ContentGenerator()
        questions = generator.generate_follow_up_questions(message, indicators)

        question_ids = [q.question_id for q in questions]
        assert len(question_ids) == len(set(question_ids)), (
            f"Question IDs should be unique, got: {question_ids}"
        )
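The four YELLOW-question properties above boil down to three invariants on the returned list: count in [2, 3], all fields populated, and IDs unique. A stdlib-only sketch with a hypothetical stub generator (`StubQuestion` and `stub_generate_questions` are illustrations, not project code):

```python
import uuid
from dataclasses import dataclass


@dataclass
class StubQuestion:  # hypothetical stand-in for FollowUpQuestion
    question_id: str
    question_text: str
    purpose: str


def stub_generate_questions(message, n_indicators):
    # Always return 2-3 questions, mirroring the count property above.
    count = 3 if n_indicators >= 2 else 2
    return [
        StubQuestion(
            question_id=uuid.uuid4().hex,  # unique per question
            question_text=f"Can you tell me more about that? ({i + 1})",
            purpose="clarify severity",
        )
        for i in range(count)
    ]


questions = stub_generate_questions("I feel lost lately", n_indicators=3)

# The same invariants the property tests assert:
assert 2 <= len(questions) <= 3
assert all(q.question_id and q.question_text and q.purpose for q in questions)
ids = [q.question_id for q in questions]
assert len(ids) == len(set(ids))  # unique question_ids
```

The `len(ids) == len(set(ids))` idiom used in the last test is the cheapest way to assert pairwise uniqueness without comparing every pair explicitly.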
tests/chaplain_feedback/test_properties_csv_export.py
ADDED
@@ -0,0 +1,290 @@
# test_properties_csv_export.py
"""
Property-based tests for Enhanced CSV Export functionality.

Tests that CSV export includes all tagging data, generated content,
interaction logs, and statistics.
"""

import pytest
from hypothesis import given, settings
from datetime import datetime

from src.core.verification_csv_exporter import VerificationCSVExporter
from src.core.verification_models import VerificationSession, VerificationRecord
from src.core.chaplain_models import (
    TaggingRecord,
    ClassificationFlowResult,
    InteractionStepLog,
    DistressIndicator,
    FollowUpQuestion,
)

from tests.chaplain_feedback.conftest import (
    tagging_record_strategy,
    classification_flow_result_strategy,
    interaction_step_log_strategy,
)


class TestExportContainsAllTags:
    """
    **Feature: chaplain-feedback-system, Property 17: Export Contains All Tags**

    Tests that CSV export includes all tagging categories and subcategories.
    """

    @given(tagging_record_strategy())
    @settings(max_examples=100)
    def test_export_contains_all_tags(self, tagging_record):
        """
        **Feature: chaplain-feedback-system, Property 17: Export Contains All Tags**
        **Validates: Requirements 9.1**

        For any TaggingRecord, the CSV export should contain all tagging
        categories and subcategories from that record.
        """
        # Create a minimal session
        session = VerificationSession(
            session_id="test_session",
            verifier_name="Test Verifier",
            dataset_id="test_dataset",
            dataset_name="Test Dataset",
            total_messages=1,
            verified_count=1,
            correct_count=1,
            incorrect_count=0,
        )

        # Add a verification record
        verification = VerificationRecord(
            message_id=tagging_record.message_id,
            original_message="Test message",
            classifier_decision="red",
            classifier_confidence=0.9,
            classifier_indicators=["indicator1"],
            ground_truth_label="red",
            is_correct=True,
        )
        session.verifications.append(verification)

        # Generate CSV with tagging records
        csv_content = VerificationCSVExporter.generate_enhanced_csv_content(
            session,
            tagging_records=[tagging_record],
        )

        # Verify tagging data section exists
        assert "TAGGING DATA" in csv_content

        # Verify message ID is in export
        assert tagging_record.message_id in csv_content

        # Verify classification correctness is in export
        correctness_str = "Yes" if tagging_record.is_classification_correct else "No"
        assert correctness_str in csv_content

        # Verify classification subcategory is in export (if present)
        if tagging_record.classification_subcategory:
            assert tagging_record.classification_subcategory in csv_content

        # Verify correct classification is in export (if present)
        if tagging_record.correct_classification:
            assert tagging_record.correct_classification in csv_content

        # Verify question issues are in export (if present)
        if tagging_record.question_issues:
            for issue in tagging_record.question_issues:
                assert issue in csv_content

        # Verify referral issues are in export (if present)
        if tagging_record.referral_issues:
            for issue in tagging_record.referral_issues:
                assert issue in csv_content

        # Verify indicator issues are in export (if present)
        if tagging_record.indicator_issues:
            for indicator_id in tagging_record.indicator_issues:
                assert indicator_id in csv_content


class TestExportContainsGeneratedContent:
    """
    **Feature: chaplain-feedback-system, Property 18: Export Contains Generated Content**

    Tests that CSV export includes all generated content.
    """

    @given(classification_flow_result_strategy())
    @settings(max_examples=100)
    def test_export_contains_generated_content(self, flow_result):
        """
        **Feature: chaplain-feedback-system, Property 18: Export Contains Generated Content**
        **Validates: Requirements 9.2**

        For any ClassificationFlowResult, the CSV export should contain
        all generated content (explanations, questions, referral messages).
        """
        # Create a minimal session
        session = VerificationSession(
            session_id="test_session",
            verifier_name="Test Verifier",
            dataset_id="test_dataset",
            dataset_name="Test Dataset",
            total_messages=1,
            verified_count=1,
            correct_count=1,
            incorrect_count=0,
        )

        # Add a verification record
        message_id = "msg_001"
        verification = VerificationRecord(
            message_id=message_id,
            original_message="Test message",
            classifier_decision=flow_result.classification,
            classifier_confidence=flow_result.confidence,
            classifier_indicators=[ind.indicator_text for ind in flow_result.indicators],
            ground_truth_label=flow_result.classification,
            is_correct=True,
        )
        session.verifications.append(verification)

        # Generate CSV with flow results
        flow_results = {message_id: flow_result}
        csv_content = VerificationCSVExporter.generate_enhanced_csv_content(
            session,
            flow_results=flow_results,
        )

        # Verify generated content section exists
        assert "GENERATED CONTENT" in csv_content

        # Verify message ID is in export
        assert message_id in csv_content

        # Verify classification is in export
        assert flow_result.classification.upper() in csv_content


class TestExportContainsInteractionLogs:
    """
    Tests that CSV export includes interaction logs.
    """

    @given(interaction_step_log_strategy())
    @settings(max_examples=100)
    def test_export_contains_interaction_logs(self, log):
        """
        For any InteractionStepLog, the CSV export should contain
        all logged interaction steps.
        """
        # Create a minimal session
        session = VerificationSession(
            session_id=log.session_id,
            verifier_name="Test Verifier",
            dataset_id="test_dataset",
            dataset_name="Test Dataset",
            total_messages=1,
            verified_count=1,
            correct_count=1,
            incorrect_count=0,
        )

        # Add a verification record
        verification = VerificationRecord(
            message_id=log.message_id,
            original_message="Test message",
            classifier_decision="red",
            classifier_confidence=0.9,
            classifier_indicators=["indicator1"],
            ground_truth_label="red",
            is_correct=True,
        )
        session.verifications.append(verification)

        # Generate CSV with interaction logs
        csv_content = VerificationCSVExporter.generate_enhanced_csv_content(
            session,
            interaction_logs=[log],
        )

        # Verify interaction logs section exists
        assert "INTERACTION LOGS" in csv_content

        # Verify step ID is in export
        assert log.step_id in csv_content

        # Verify session ID is in export
        assert log.session_id in csv_content

        # Verify message ID is in export
        assert log.message_id in csv_content

        # Verify step type is in export
        assert log.step_type in csv_content

        # Verify approval status is in export (if present)
        if log.approval_status:
            assert log.approval_status in csv_content


class TestExportContainsStatistics:
    """
    Tests that CSV export includes error pattern statistics.
    """

    @given(tagging_record_strategy())
    @settings(max_examples=100)
    def test_export_contains_statistics(self, tagging_record):
        """
        For any set of TaggingRecords, the CSV export should contain
        error pattern statistics with subcategory breakdowns.
        """
        # Create a minimal session
        session = VerificationSession(
            session_id="test_session",
            verifier_name="Test Verifier",
            dataset_id="test_dataset",
            dataset_name="Test Dataset",
            total_messages=1,
            verified_count=1,
            correct_count=1,
            incorrect_count=0,
        )

        # Add a verification record
        verification = VerificationRecord(
            message_id=tagging_record.message_id,
            original_message="Test message",
            classifier_decision="red",
            classifier_confidence=0.9,
            classifier_indicators=["indicator1"],
            ground_truth_label="red",
            is_correct=True,
        )
        session.verifications.append(verification)

        # Generate CSV with tagging records (which triggers statistics)
        csv_content = VerificationCSVExporter.generate_enhanced_csv_content(
            session,
            tagging_records=[tagging_record],
        )

        # Verify statistics section exists
        assert "ERROR PATTERN STATISTICS" in csv_content

        # Verify classification errors section exists
        assert "Classification Errors" in csv_content

        # Verify question issues section exists
        assert "Question Issues" in csv_content

        # Verify referral issues section exists
        assert "Referral Issues" in csv_content

        # Verify indicator issues section exists
        assert "Indicator Issues" in csv_content

        # Verify common patterns section exists
        assert "Common Patterns" in csv_content
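These export tests all assert on labeled section headers inside a single CSV string. A stdlib-only sketch of that sectioned layout; `build_sectioned_csv` is a hypothetical simplification of what `generate_enhanced_csv_content` is expected to emit, with the section names taken from the tests:

```python
import csv
import io


def build_sectioned_csv(tag_rows, log_rows):
    # Each section is a header row followed by its own column headers
    # and data rows, all written into one string buffer.
    buf = io.StringIO()
    writer = csv.writer(buf)

    writer.writerow(["TAGGING DATA"])
    writer.writerow(["Message ID", "Correct?"])
    for row in tag_rows:
        writer.writerow(row)
    writer.writerow([])  # blank separator row

    writer.writerow(["INTERACTION LOGS"])
    writer.writerow(["Step ID", "Step Type"])
    for row in log_rows:
        writer.writerow(row)
    writer.writerow([])

    writer.writerow(["ERROR PATTERN STATISTICS"])
    return buf.getvalue()


content = build_sectioned_csv(
    tag_rows=[["msg_001", "Yes"]],
    log_rows=[["step_1", "permission_check"]],
)

# The same substring checks the property tests perform:
assert "TAGGING DATA" in content
assert "INTERACTION LOGS" in content
assert "ERROR PATTERN STATISTICS" in content
assert "msg_001" in content
```

Because the tests only check substring membership, any exporter that writes these section labels and echoes the record fields verbatim (with `csv.writer` handling quoting) will satisfy them.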
tests/chaplain_feedback/test_properties_data_models.py
ADDED
@@ -0,0 +1,250 @@
# test_properties_data_models.py
"""
Property-based tests for Chaplain Feedback data model serialization.

Tests that all data models serialize and deserialize correctly (round-trip).
"""

import pytest
from hypothesis import given, settings
from datetime import datetime

from src.core.chaplain_models import (
    DistressIndicator,
    FollowUpQuestion,
    ClassificationFlowResult,
    TaggingRecord,
    InteractionStepLog,
    INDICATOR_DEFINITIONS,
)

from tests.chaplain_feedback.conftest import (
    distress_indicator_strategy,
    follow_up_question_strategy,
    classification_flow_result_strategy,
    tagging_record_strategy,
    interaction_step_log_strategy,
    interaction_step_log_with_tagging_strategy,
)


class TestDistressIndicatorRoundTrip:
    """
    **Feature: chaplain-feedback-system, Property: Data Model Round Trip**

    Tests that DistressIndicator serializes and deserializes correctly.
    """

    @given(distress_indicator_strategy())
    @settings(max_examples=100)
    def test_distress_indicator_round_trip(self, indicator):
        """
        **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
        **Validates: Requirements 8.5**

        For any DistressIndicator, converting to dict and back should
        preserve all fields exactly.
        """
        # Convert to dict and back
        indicator_dict = indicator.to_dict()
        restored = DistressIndicator.from_dict(indicator_dict)

        # Verify all fields match
        assert restored.indicator_text == indicator.indicator_text
        assert restored.category == indicator.category
        assert restored.subcategory == indicator.subcategory
        assert restored.severity == indicator.severity
        assert restored.confidence == indicator.confidence
        assert restored.definition_reference == indicator.definition_reference

    def test_distress_indicator_from_definition(self):
        """
        Test creating a DistressIndicator from INDICATOR_DEFINITIONS.
        """
        # Test with a known indicator
        indicator = DistressIndicator.from_definition(
            indicator_key="excessive_guilt",
            indicator_text="I feel so guilty about everything",
            confidence=0.85
        )

        assert indicator.category == "Guilt"
        assert indicator.subcategory == "Excessive guilt"
        assert indicator.severity == "red"
        assert indicator.definition_reference == "II.D"
        assert indicator.confidence == 0.85


class TestFollowUpQuestionRoundTrip:
    """
    **Feature: chaplain-feedback-system, Property: Data Model Round Trip**

    Tests that FollowUpQuestion serializes and deserializes correctly.
    """

    @given(follow_up_question_strategy())
    @settings(max_examples=100)
    def test_follow_up_question_round_trip(self, question):
        """
        **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
        **Validates: Requirements 8.5**

        For any FollowUpQuestion, converting to dict and back should
        preserve all fields exactly.
        """
        # Convert to dict and back
        question_dict = question.to_dict()
        restored = FollowUpQuestion.from_dict(question_dict)

        # Verify all fields match
        assert restored.question_id == question.question_id
        assert restored.question_text == question.question_text
        assert restored.purpose == question.purpose


class TestClassificationFlowResultRoundTrip:
    """
    **Feature: chaplain-feedback-system, Property: Data Model Round Trip**

    Tests that ClassificationFlowResult serializes and deserializes correctly.
    """

    @given(classification_flow_result_strategy())
    @settings(max_examples=100)
    def test_classification_flow_result_round_trip(self, result):
        """
        **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
        **Validates: Requirements 8.5**

        For any ClassificationFlowResult, converting to dict and back should
        preserve all fields exactly.
        """
        # Convert to dict and back
        result_dict = result.to_dict()
        restored = ClassificationFlowResult.from_dict(result_dict)

        # Verify basic fields match
        assert restored.classification == result.classification
        assert restored.confidence == result.confidence
        assert restored.explanation == result.explanation
        assert restored.permission_check_message == result.permission_check_message
        assert restored.referral_message == result.referral_message
        assert restored.consent_status == result.consent_status
        assert restored.patient_responses == result.patient_responses
        assert restored.re_evaluation_result == result.re_evaluation_result

        # Verify nested indicators
        assert len(restored.indicators) == len(result.indicators)
        for orig, rest in zip(result.indicators, restored.indicators):
            assert rest.indicator_text == orig.indicator_text
            assert rest.category == orig.category
            assert rest.severity == orig.severity

        # Verify nested follow-up questions
        assert len(restored.follow_up_questions) == len(result.follow_up_questions)
        for orig, rest in zip(result.follow_up_questions, restored.follow_up_questions):
            assert rest.question_id == orig.question_id
            assert rest.question_text == orig.question_text
            assert rest.purpose == orig.purpose


class TestTaggingRecordRoundTrip:
    """
    **Feature: chaplain-feedback-system, Property: Data Model Round Trip**

    Tests that TaggingRecord serializes and deserializes correctly.
    """

    @given(tagging_record_strategy())
    @settings(max_examples=100)
    def test_tagging_record_round_trip(self, record):
        """
        **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
        **Validates: Requirements 8.5**

        For any TaggingRecord, converting to dict and back should
|
| 166 |
+
preserve all fields exactly.
|
| 167 |
+
"""
|
| 168 |
+
# Convert to dict and back
|
| 169 |
+
record_dict = record.to_dict()
|
| 170 |
+
restored = TaggingRecord.from_dict(record_dict)
|
| 171 |
+
|
| 172 |
+
# Verify all fields match
|
| 173 |
+
assert restored.record_id == record.record_id
|
| 174 |
+
assert restored.message_id == record.message_id
|
| 175 |
+
assert restored.is_classification_correct == record.is_classification_correct
|
| 176 |
+
assert restored.classification_subcategory == record.classification_subcategory
|
| 177 |
+
assert restored.correct_classification == record.correct_classification
|
| 178 |
+
assert restored.question_issues == record.question_issues
|
| 179 |
+
assert restored.question_comments == record.question_comments
|
| 180 |
+
assert restored.referral_issues == record.referral_issues
|
| 181 |
+
assert restored.referral_comments == record.referral_comments
|
| 182 |
+
assert restored.indicator_issues == record.indicator_issues
|
| 183 |
+
assert restored.indicator_comments == record.indicator_comments
|
| 184 |
+
assert restored.general_notes == record.general_notes
|
| 185 |
+
|
| 186 |
+
|
| 187 |
+
class TestInteractionStepLogRoundTrip:
|
| 188 |
+
"""
|
| 189 |
+
**Feature: chaplain-feedback-system, Property: Data Model Round Trip**
|
| 190 |
+
|
| 191 |
+
Tests that InteractionStepLog serializes and deserializes correctly.
|
| 192 |
+
"""
|
| 193 |
+
|
| 194 |
+
@given(interaction_step_log_strategy())
|
| 195 |
+
@settings(max_examples=100)
|
| 196 |
+
def test_interaction_step_log_round_trip(self, log):
|
| 197 |
+
"""
|
| 198 |
+
**Feature: chaplain-feedback-system, Property: Data Model Round Trip**
|
| 199 |
+
**Validates: Requirements 8.5**
|
| 200 |
+
|
| 201 |
+
For any InteractionStepLog, converting to dict and back should
|
| 202 |
+
preserve all fields exactly.
|
| 203 |
+
"""
|
| 204 |
+
# Convert to dict and back
|
| 205 |
+
log_dict = log.to_dict()
|
| 206 |
+
restored = InteractionStepLog.from_dict(log_dict)
|
| 207 |
+
|
| 208 |
+
# Verify all fields match
|
| 209 |
+
assert restored.step_id == log.step_id
|
| 210 |
+
assert restored.session_id == log.session_id
|
| 211 |
+
assert restored.message_id == log.message_id
|
| 212 |
+
assert restored.step_type == log.step_type
|
| 213 |
+
assert restored.input_text == log.input_text
|
| 214 |
+
assert restored.model_output == log.model_output
|
| 215 |
+
assert restored.approval_status == log.approval_status
|
| 216 |
+
assert restored.tagging_data == log.tagging_data
|
| 217 |
+
|
| 218 |
+
@given(interaction_step_log_with_tagging_strategy())
|
| 219 |
+
@settings(max_examples=100)
|
| 220 |
+
def test_interaction_step_log_with_tagging_round_trip(self, log):
|
| 221 |
+
"""
|
| 222 |
+
**Feature: chaplain-feedback-system, Property: Data Model Round Trip**
|
| 223 |
+
**Validates: Requirements 8.5**
|
| 224 |
+
|
| 225 |
+
For any InteractionStepLog with nested TaggingRecord, converting to dict
|
| 226 |
+
and back should preserve all fields exactly.
|
| 227 |
+
"""
|
| 228 |
+
# Convert to dict and back
|
| 229 |
+
log_dict = log.to_dict()
|
| 230 |
+
restored = InteractionStepLog.from_dict(log_dict)
|
| 231 |
+
|
| 232 |
+
# Verify basic fields match
|
| 233 |
+
assert restored.step_id == log.step_id
|
| 234 |
+
assert restored.session_id == log.session_id
|
| 235 |
+
assert restored.message_id == log.message_id
|
| 236 |
+
assert restored.step_type == log.step_type
|
| 237 |
+
assert restored.input_text == log.input_text
|
| 238 |
+
assert restored.model_output == log.model_output
|
| 239 |
+
assert restored.approval_status == log.approval_status
|
| 240 |
+
|
| 241 |
+
# Verify nested tagging data
|
| 242 |
+
if log.tagging_data is None:
|
| 243 |
+
assert restored.tagging_data is None
|
| 244 |
+
else:
|
| 245 |
+
assert restored.tagging_data is not None
|
| 246 |
+
assert restored.tagging_data.record_id == log.tagging_data.record_id
|
| 247 |
+
assert restored.tagging_data.message_id == log.tagging_data.message_id
|
| 248 |
+
assert restored.tagging_data.is_classification_correct == log.tagging_data.is_classification_correct
|
| 249 |
+
assert restored.tagging_data.question_issues == log.tagging_data.question_issues
|
| 250 |
+
assert restored.tagging_data.referral_issues == log.tagging_data.referral_issues
|
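These round-trip tests all rely on each model exposing a `to_dict`/`from_dict` pair whose composition is the identity. A minimal sketch of that pattern, using a hypothetical `FollowUpQuestion` dataclass whose field names are taken from the assertions above (the real model lives in `src/core/chaplain_models.py` and may differ):

```python
from dataclasses import dataclass, asdict


@dataclass
class FollowUpQuestion:
    question_id: str
    question_text: str
    purpose: str

    def to_dict(self) -> dict:
        # asdict recursively converts dataclass fields to plain dict/list/str values
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "FollowUpQuestion":
        # Inverse of to_dict for flat models; nested models need per-field handling
        return cls(**data)


q = FollowUpQuestion("q1", "How have you been sleeping?", "assess_mood")
assert FollowUpQuestion.from_dict(q.to_dict()) == q
```

Dataclass equality compares field-by-field, which is why the single `==` in the usage line checks the same thing the explicit per-field assertions in the tests do.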
tests/chaplain_feedback/test_properties_error_pattern_analyzer.py ADDED
@@ -0,0 +1,194 @@
"""
Property-based tests for ErrorPatternAnalyzer.

Tests universal properties that should hold across all inputs
for the error pattern analysis functionality.
"""

from hypothesis import given, strategies as st

from src.core.error_pattern_analyzer import ErrorPatternAnalyzer
from src.core.chaplain_models import (
    CLASSIFICATION_SUBCATEGORIES,
    QUESTION_ISSUE_TYPES,
    REFERRAL_ISSUE_TYPES,
)
from .conftest import tagging_record_strategy


class TestErrorPatternAnalyzerProperties:
    """Property-based tests for ErrorPatternAnalyzer."""

    @given(st.lists(tagging_record_strategy(), min_size=1, max_size=20))
    def test_property_19_statistics_include_subcategory_breakdown(self, records):
        """
        **Feature: chaplain-feedback-system, Property 19: Statistics Include Subcategory Breakdown**
        **Validates: Requirements 4.4, 5.4, 6.4**
        """
        analyzer = ErrorPatternAnalyzer()
        stats = analyzer.get_statistics_summary(records)

        assert "total_records" in stats
        assert "classification_errors" in stats
        assert "question_issues" in stats
        assert "referral_issues" in stats
        assert "indicator_issues" in stats
        assert "common_patterns" in stats

        assert stats["total_records"] == len(records)

        classification_errors = stats["classification_errors"]
        for subcategory in CLASSIFICATION_SUBCATEGORIES:
            assert subcategory in classification_errors
            assert isinstance(classification_errors[subcategory], int)
            assert classification_errors[subcategory] >= 0

        question_issues = stats["question_issues"]
        for issue_type in QUESTION_ISSUE_TYPES:
            assert issue_type in question_issues
            assert isinstance(question_issues[issue_type], int)
            assert question_issues[issue_type] >= 0

        referral_issues = stats["referral_issues"]
        for issue_type in REFERRAL_ISSUE_TYPES:
            assert issue_type in referral_issues
            assert isinstance(referral_issues[issue_type], int)
            assert referral_issues[issue_type] >= 0

        indicator_issues = stats["indicator_issues"]
        assert isinstance(indicator_issues, dict)
        for indicator_id, count in indicator_issues.items():
            assert isinstance(indicator_id, str)
            assert isinstance(count, int)
            assert count >= 0

        common_patterns = stats["common_patterns"]
        assert isinstance(common_patterns, list)

    @given(st.lists(tagging_record_strategy(), min_size=1, max_size=20))
    def test_property_20_error_patterns_grouped_by_type(self, records):
        """
        **Feature: chaplain-feedback-system, Property 20: Error Patterns Grouped by Type**
        **Validates: Requirements 10.2, 10.3**
        """
        analyzer = ErrorPatternAnalyzer()
        grouped_patterns = analyzer.get_error_patterns_grouped_by_type(records)

        assert "classification" in grouped_patterns
        assert "question" in grouped_patterns
        assert "referral" in grouped_patterns
        assert "indicator" in grouped_patterns

        classification_group = grouped_patterns["classification"]
        assert isinstance(classification_group, dict)
        for subcategory in CLASSIFICATION_SUBCATEGORIES:
            assert subcategory in classification_group
            assert isinstance(classification_group[subcategory], int)
            assert classification_group[subcategory] >= 0

        question_group = grouped_patterns["question"]
        assert isinstance(question_group, dict)
        for issue_type in QUESTION_ISSUE_TYPES:
            assert issue_type in question_group
            assert isinstance(question_group[issue_type], int)
            assert question_group[issue_type] >= 0

        referral_group = grouped_patterns["referral"]
        assert isinstance(referral_group, dict)
        for issue_type in REFERRAL_ISSUE_TYPES:
            assert issue_type in referral_group
            assert isinstance(referral_group[issue_type], int)
            assert referral_group[issue_type] >= 0

        indicator_group = grouped_patterns["indicator"]
        assert isinstance(indicator_group, dict)
        for indicator_id, count in indicator_group.items():
            assert isinstance(indicator_id, str)
            assert isinstance(count, int)
            assert count >= 0

    @given(st.lists(tagging_record_strategy(), min_size=0, max_size=20))
    def test_classification_error_analysis_counts_correctly(self, records):
        """Test that classification error analysis counts errors correctly."""
        analyzer = ErrorPatternAnalyzer()
        error_counts = analyzer.analyze_classification_errors(records)

        for subcategory in CLASSIFICATION_SUBCATEGORIES:
            assert subcategory in error_counts
            assert isinstance(error_counts[subcategory], int)
            assert error_counts[subcategory] >= 0

        expected_counts = {subcategory: 0 for subcategory in CLASSIFICATION_SUBCATEGORIES}
        for record in records:
            if not record.is_classification_correct and record.classification_subcategory:
                if record.classification_subcategory in expected_counts:
                    expected_counts[record.classification_subcategory] += 1

        assert error_counts == expected_counts

    @given(st.lists(tagging_record_strategy(), min_size=0, max_size=20))
    def test_question_issue_analysis_counts_correctly(self, records):
        """Test that question issue analysis counts issues correctly."""
        analyzer = ErrorPatternAnalyzer()
        issue_counts = analyzer.analyze_question_issues(records)

        for issue_type in QUESTION_ISSUE_TYPES:
            assert issue_type in issue_counts
            assert isinstance(issue_counts[issue_type], int)
            assert issue_counts[issue_type] >= 0

        expected_counts = {issue_type: 0 for issue_type in QUESTION_ISSUE_TYPES}
        for record in records:
            for issue in record.question_issues:
                if issue in expected_counts:
                    expected_counts[issue] += 1

        assert issue_counts == expected_counts

    @given(st.lists(tagging_record_strategy(), min_size=0, max_size=20))
    def test_referral_issue_analysis_counts_correctly(self, records):
        """Test that referral issue analysis counts issues correctly."""
        analyzer = ErrorPatternAnalyzer()
        issue_counts = analyzer.analyze_referral_issues(records)

        for issue_type in REFERRAL_ISSUE_TYPES:
            assert issue_type in issue_counts
            assert isinstance(issue_counts[issue_type], int)
            assert issue_counts[issue_type] >= 0

        expected_counts = {issue_type: 0 for issue_type in REFERRAL_ISSUE_TYPES}
        for record in records:
            for issue in record.referral_issues:
                if issue in expected_counts:
                    expected_counts[issue] += 1

        assert issue_counts == expected_counts

    @given(st.lists(tagging_record_strategy(), min_size=0, max_size=20))
    def test_indicator_issue_analysis_counts_correctly(self, records):
        """Test that indicator issue analysis counts indicators correctly."""
        analyzer = ErrorPatternAnalyzer()
        indicator_counts = analyzer.analyze_indicator_issues(records)

        assert isinstance(indicator_counts, dict)

        expected_counts = {}
        for record in records:
            for indicator_id in record.indicator_issues:
                if indicator_id not in expected_counts:
                    expected_counts[indicator_id] = 0
                expected_counts[indicator_id] += 1

        assert indicator_counts == expected_counts

    @given(st.lists(tagging_record_strategy(), min_size=0, max_size=20))
    def test_common_patterns_returns_list(self, records):
        """Test that common patterns analysis returns a list of strings."""
        analyzer = ErrorPatternAnalyzer()
        patterns = analyzer.get_common_patterns(records)

        assert isinstance(patterns, list)

        for pattern in patterns:
            assert isinstance(pattern, str)
            assert len(pattern) > 0
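The expected-count loops in these tests are all the same reference computation: a frequency count over per-record issue lists, restricted to a known vocabulary. With `collections.Counter` the same oracle fits in one expression. A sketch with hypothetical issue-type values and a dict-shaped stand-in for `TaggingRecord` (the real types live in `src/core/chaplain_models.py`):

```python
from collections import Counter

# Hypothetical vocabulary; the real QUESTION_ISSUE_TYPES is project-defined
QUESTION_ISSUE_TYPES = ["too_clinical", "leading", "irrelevant"]

records = [
    {"question_issues": ["leading", "irrelevant"]},
    {"question_issues": ["leading"]},
]

# Count every issue that belongs to the vocabulary, across all records
counts = Counter(
    issue
    for record in records
    for issue in record["question_issues"]
    if issue in QUESTION_ISSUE_TYPES
)

# Expand to the full vocabulary so absent issue types report 0
expected = {t: counts.get(t, 0) for t in QUESTION_ISSUE_TYPES}
assert expected == {"too_clinical": 0, "leading": 2, "irrelevant": 1}
```

Using a second, independently written oracle like this is the standard way to keep a property test from merely re-running the implementation's own loop.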
tests/chaplain_feedback/test_properties_interaction_logging.py ADDED
@@ -0,0 +1,705 @@
# test_properties_interaction_logging.py
"""
Property-based tests for Chaplain Feedback interaction logging.

Tests that interaction logging correctly records all steps with input/output
and supports approval status updates.
"""

import pytest
from hypothesis import given, settings
from datetime import datetime

from src.core.interaction_logger import InteractionLogger
from src.core.chaplain_models import (
    InteractionStepLog,
    TaggingRecord,
    INTERACTION_STEP_TYPES,
)

from tests.chaplain_feedback.conftest import (
    valid_id_strategy,
    tagging_record_strategy,
)


class TestInteractionLoggingCompleteness:
    """
    **Feature: chaplain-feedback-system, Property 14: Interaction Step Logging Complete**

    Tests that interaction logging records all required fields for each step.
    """

    def test_interaction_step_logging_complete_all_types(self):
        """
        **Feature: chaplain-feedback-system, Property 14: Interaction Step Logging Complete**
        **Validates: Requirements 7.1, 7.2**

        For any interaction step, the log should contain: input text, model output, and timestamp.
        """
        logger = InteractionLogger()

        # Test all step types
        for step_type in INTERACTION_STEP_TYPES:
            session_id = f"session_{step_type}"
            message_id = f"msg_{step_type}"
            input_text = f"input for {step_type}"
            model_output = f"output for {step_type}"

            # Log a step
            step_id = logger.log_step(
                session_id=session_id,
                message_id=message_id,
                step_type=step_type,
                input_text=input_text,
                model_output=model_output,
            )

            # Retrieve the logged step
            logged_step = logger.get_step(step_id)

            # Verify all required fields are present and correct
            assert logged_step is not None
            assert logged_step.step_id == step_id
            assert logged_step.session_id == session_id
            assert logged_step.message_id == message_id
            assert logged_step.step_type == step_type
            assert logged_step.input_text == input_text
            assert logged_step.model_output == model_output
            assert logged_step.timestamp is not None
            assert isinstance(logged_step.timestamp, datetime)
            assert logged_step.approval_status is None  # Initially no approval
            assert logged_step.tagging_data is None  # Initially no tagging

    def test_interaction_step_logging_multiple_steps(self):
        """
        Test that multiple steps are logged correctly for a session.
        """
        logger = InteractionLogger()
        session_id = "test_session_1"
        message_id = "test_message_1"

        # Log multiple steps
        step_ids = []
        for i in range(3):
            step_id = logger.log_step(
                session_id=session_id,
                message_id=message_id,
                step_type="classification",
                input_text=f"input {i}",
                model_output=f"output {i}",
            )
            step_ids.append(step_id)

        # Retrieve all session logs
        session_logs = logger.get_session_logs(session_id)

        # Verify all steps are logged
        assert len(session_logs) == 3
        for i, log in enumerate(session_logs):
            assert log.input_text == f"input {i}"
            assert log.model_output == f"output {i}"

    def test_interaction_step_logging_preserves_order(self):
        """
        Test that logged steps are retrieved in the order they were logged.
        """
        logger = InteractionLogger()
        session_id = "test_session_order"

        # Log steps in order
        step_ids = []
        for i in range(5):
            step_id = logger.log_step(
                session_id=session_id,
                message_id=f"msg_{i}",
                step_type="classification",
                input_text=f"input_{i}",
                model_output=f"output_{i}",
            )
            step_ids.append(step_id)

        # Retrieve logs
        session_logs = logger.get_session_logs(session_id)

        # Verify order is preserved
        assert len(session_logs) == 5
        for i, log in enumerate(session_logs):
            assert log.message_id == f"msg_{i}"
            assert log.input_text == f"input_{i}"

    def test_interaction_step_logging_by_type(self):
        """
        Test filtering logs by step type.
        """
        logger = InteractionLogger()
        session_id = "test_session_types"

        # Log different types of steps
        logger.log_step(session_id, "msg1", "classification", "input1", "output1")
        logger.log_step(session_id, "msg2", "explanation", "input2", "output2")
        logger.log_step(session_id, "msg3", "classification", "input3", "output3")
        logger.log_step(session_id, "msg4", "referral", "input4", "output4")

        # Filter by type
        classification_logs = logger.get_session_logs_by_type(session_id, "classification")
        explanation_logs = logger.get_session_logs_by_type(session_id, "explanation")
        referral_logs = logger.get_session_logs_by_type(session_id, "referral")

        # Verify filtering
        assert len(classification_logs) == 2
        assert len(explanation_logs) == 1
        assert len(referral_logs) == 1

    def test_interaction_step_logging_message_logs(self):
        """
        Test retrieving logs for a specific message across sessions.
        """
        logger = InteractionLogger()
        message_id = "shared_message"

        # Log same message in different sessions
        logger.log_step("session1", message_id, "classification", "input1", "output1")
        logger.log_step("session2", message_id, "explanation", "input2", "output2")
        logger.log_step("session1", "other_msg", "referral", "input3", "output3")

        # Get logs for the message
        message_logs = logger.get_message_logs(message_id)

        # Verify we get logs from both sessions
        assert len(message_logs) == 2
        assert all(log.message_id == message_id for log in message_logs)

    def test_interaction_step_logging_empty_strings(self):
        """
        Test that empty input/output strings are logged correctly.
        """
        logger = InteractionLogger()

        step_id = logger.log_step(
            session_id="test_session",
            message_id="test_msg",
            step_type="classification",
            input_text="",
            model_output="",
        )

        logged_step = logger.get_step(step_id)

        assert logged_step.input_text == ""
        assert logged_step.model_output == ""

    def test_interaction_step_logging_long_text(self):
        """
        Test that long input/output text is logged correctly.
        """
        logger = InteractionLogger()
        long_text = "x" * 10000

        step_id = logger.log_step(
            session_id="test_session",
            message_id="test_msg",
            step_type="classification",
            input_text=long_text,
            model_output=long_text,
        )

        logged_step = logger.get_step(step_id)

        assert logged_step.input_text == long_text
        assert logged_step.model_output == long_text
        assert len(logged_step.input_text) == 10000

    def test_interaction_step_logging_special_characters(self):
        """
        Test that special characters in input/output are preserved.
        """
        logger = InteractionLogger()
        special_text = "Test with special chars: !@#$%^&*()_+-=[]{}|;:',.<>?/~`"

        step_id = logger.log_step(
            session_id="test_session",
            message_id="test_msg",
            step_type="classification",
            input_text=special_text,
            model_output=special_text,
        )

        logged_step = logger.get_step(step_id)

        assert logged_step.input_text == special_text
        assert logged_step.model_output == special_text

    def test_interaction_step_logging_unicode(self):
        """
        Test that Unicode characters in input/output are preserved.
        """
        logger = InteractionLogger()
        unicode_text = "Test with Unicode: 你好世界 🌍 Привет мир"

        step_id = logger.log_step(
            session_id="test_session",
            message_id="test_msg",
            step_type="classification",
            input_text=unicode_text,
            model_output=unicode_text,
        )

        logged_step = logger.get_step(step_id)

        assert logged_step.input_text == unicode_text
        assert logged_step.model_output == unicode_text

    def test_interaction_step_logging_statistics(self):
        """
        Test that session statistics are calculated correctly.
        """
        logger = InteractionLogger()
        session_id = "test_session_stats"

        # Log some steps
        logger.log_step(session_id, "msg1", "classification", "input1", "output1")
        logger.log_step(session_id, "msg2", "explanation", "input2", "output2")
        logger.log_step(session_id, "msg3", "referral", "input3", "output3")

        # Get statistics
        stats = logger.get_session_statistics(session_id)

        # Verify statistics
        assert stats["session_id"] == session_id
        assert stats["total_steps"] == 3
        assert stats["approved_steps"] == 0
        assert stats["disapproved_steps"] == 0
        assert stats["unapproved_steps"] == 3
        assert stats["steps_by_type"]["classification"] == 1
        assert stats["steps_by_type"]["explanation"] == 1
        assert stats["steps_by_type"]["referral"] == 1

    def test_interaction_step_logging_invalid_step_type(self):
        """
        Test that invalid step types raise an error.
        """
        logger = InteractionLogger()

        with pytest.raises(ValueError):
            logger.log_step(
                session_id="test_session",
                message_id="test_msg",
                step_type="invalid_type",
                input_text="input",
                model_output="output",
            )

    def test_interaction_step_logging_nonexistent_step(self):
        """
        Test that retrieving a nonexistent step returns None.
        """
        logger = InteractionLogger()

        result = logger.get_step("nonexistent_step_id")

        assert result is None

    def test_interaction_step_logging_empty_session(self):
        """
        Test that retrieving logs for an empty session returns an empty list.
        """
        logger = InteractionLogger()

        session_logs = logger.get_session_logs("nonexistent_session")

        assert session_logs == []

    def test_interaction_step_logging_export(self):
        """
        Test that session logs can be exported as dictionaries.
        """
        logger = InteractionLogger()
        session_id = "test_session_export"

        # Log some steps
        logger.log_step(session_id, "msg1", "classification", "input1", "output1")
        logger.log_step(session_id, "msg2", "explanation", "input2", "output2")

        # Export logs
        exported = logger.export_session_logs(session_id)

        # Verify export
        assert len(exported) == 2
        assert all(isinstance(log, dict) for log in exported)
        assert all("step_id" in log for log in exported)
        assert all("input_text" in log for log in exported)
        assert all("model_output" in log for log in exported)
|
| 333 |
+
assert all("timestamp" in log for log in exported)
|
| 334 |
+
|
| 335 |
+
|
| 336 |
+
class TestFeedbackLogging:
|
| 337 |
+
"""
|
| 338 |
+
**Feature: chaplain-feedback-system, Property 15: Feedback Logging Complete**
|
| 339 |
+
|
| 340 |
+
Tests that feedback logging correctly records approval/disapproval status
|
| 341 |
+
with tagging categories and comments.
|
| 342 |
+
"""
|
| 343 |
+
|
| 344 |
+
def test_feedback_logging_approved_status(self):
|
| 345 |
+
"""
|
| 346 |
+
**Feature: chaplain-feedback-system, Property 15: Feedback Logging Complete**
|
| 347 |
+
**Validates: Requirements 7.3, 7.4**
|
| 348 |
+
|
| 349 |
+
For any feedback, the log should record approval status.
|
| 350 |
+
"""
|
| 351 |
+
logger = InteractionLogger()
|
| 352 |
+
session_id = "test_session_feedback"
|
| 353 |
+
|
| 354 |
+
# Log a step
|
| 355 |
+
step_id = logger.log_step(
|
| 356 |
+
session_id=session_id,
|
| 357 |
+
message_id="msg1",
|
| 358 |
+
step_type="classification",
|
| 359 |
+
input_text="input",
|
| 360 |
+
model_output="output",
|
| 361 |
+
)
|
| 362 |
+
|
| 363 |
+
# Update with approved status
|
| 364 |
+
logger.update_approval(step_id, "approved")
|
| 365 |
+
|
| 366 |
+
# Retrieve and verify
|
| 367 |
+
logged_step = logger.get_step(step_id)
|
| 368 |
+
assert logged_step.approval_status == "approved"
|
| 369 |
+
assert logged_step.tagging_data is None
|
| 370 |
+
|
| 371 |
+
def test_feedback_logging_disapproved_status(self):
|
| 372 |
+
"""
|
| 373 |
+
Test that disapproved status is recorded correctly.
|
| 374 |
+
"""
|
| 375 |
+
logger = InteractionLogger()
|
| 376 |
+
session_id = "test_session_feedback"
|
| 377 |
+
|
| 378 |
+
# Log a step
|
| 379 |
+
step_id = logger.log_step(
|
| 380 |
+
session_id=session_id,
|
| 381 |
+
message_id="msg1",
|
| 382 |
+
step_type="classification",
|
| 383 |
+
input_text="input",
|
| 384 |
+
model_output="output",
|
| 385 |
+
)
|
| 386 |
+
|
| 387 |
+
# Update with disapproved status
|
| 388 |
+
logger.update_approval(step_id, "disapproved")
|
| 389 |
+
|
| 390 |
+
# Retrieve and verify
|
| 391 |
+
logged_step = logger.get_step(step_id)
|
| 392 |
+
assert logged_step.approval_status == "disapproved"
|
| 393 |
+
|
| 394 |
+
@given(tagging_record_strategy())
|
| 395 |
+
@settings(max_examples=100)
|
| 396 |
+
def test_feedback_logging_with_tagging_data(self, tagging_record):
|
| 397 |
+
"""
|
| 398 |
+
**Feature: chaplain-feedback-system, Property 15: Feedback Logging Complete**
|
| 399 |
+
**Validates: Requirements 7.3, 7.4**
|
| 400 |
+
|
| 401 |
+
For any chaplain feedback, the log should contain: approval/disapproval status,
|
| 402 |
+
and if disapproved, the tagging categories and comments.
|
| 403 |
+
"""
|
| 404 |
+
logger = InteractionLogger()
|
| 405 |
+
session_id = "test_session_tagging"
|
| 406 |
+
|
| 407 |
+
# Log a step
|
| 408 |
+
step_id = logger.log_step(
|
| 409 |
+
session_id=session_id,
|
| 410 |
+
message_id=tagging_record.message_id,
|
| 411 |
+
step_type="classification",
|
| 412 |
+
input_text="input",
|
| 413 |
+
model_output="output",
|
| 414 |
+
)
|
| 415 |
+
|
| 416 |
+
# Update with disapproved status and tagging data
|
| 417 |
+
logger.update_approval(step_id, "disapproved", tagging_record)
|
| 418 |
+
|
| 419 |
+
# Retrieve and verify
|
| 420 |
+
logged_step = logger.get_step(step_id)
|
| 421 |
+
assert logged_step.approval_status == "disapproved"
|
| 422 |
+
assert logged_step.tagging_data is not None
|
| 423 |
+
assert logged_step.tagging_data.record_id == tagging_record.record_id
|
| 424 |
+
assert logged_step.tagging_data.message_id == tagging_record.message_id
|
| 425 |
+
assert logged_step.tagging_data.is_classification_correct == tagging_record.is_classification_correct
|
| 426 |
+
assert logged_step.tagging_data.question_issues == tagging_record.question_issues
|
| 427 |
+
assert logged_step.tagging_data.referral_issues == tagging_record.referral_issues
|
| 428 |
+
|
| 429 |
+
def test_feedback_logging_classification_subcategory(self):
|
| 430 |
+
"""
|
| 431 |
+
Test that classification subcategory is recorded in tagging data.
|
| 432 |
+
"""
|
| 433 |
+
logger = InteractionLogger()
|
| 434 |
+
session_id = "test_session_classification"
|
| 435 |
+
|
| 436 |
+
# Create tagging record with classification subcategory
|
| 437 |
+
tagging = TaggingRecord(
|
| 438 |
+
record_id="tag1",
|
| 439 |
+
message_id="msg1",
|
| 440 |
+
is_classification_correct=False,
|
| 441 |
+
classification_subcategory="missed_indicators",
|
| 442 |
+
correct_classification="red",
|
| 443 |
+
)
|
| 444 |
+
|
| 445 |
+
# Log a step
|
| 446 |
+
step_id = logger.log_step(
|
| 447 |
+
session_id=session_id,
|
| 448 |
+
message_id="msg1",
|
| 449 |
+
step_type="classification",
|
| 450 |
+
input_text="input",
|
| 451 |
+
model_output="output",
|
| 452 |
+
)
|
| 453 |
+
|
| 454 |
+
# Update with tagging
|
| 455 |
+
logger.update_approval(step_id, "disapproved", tagging)
|
| 456 |
+
|
| 457 |
+
# Retrieve and verify
|
| 458 |
+
logged_step = logger.get_step(step_id)
|
| 459 |
+
assert logged_step.tagging_data.classification_subcategory == "missed_indicators"
|
| 460 |
+
assert logged_step.tagging_data.correct_classification == "red"
|
| 461 |
+
|
| 462 |
+
def test_feedback_logging_question_issues(self):
|
| 463 |
+
"""
|
| 464 |
+
Test that question issues are recorded in tagging data.
|
| 465 |
+
"""
|
| 466 |
+
logger = InteractionLogger()
|
| 467 |
+
session_id = "test_session_questions"
|
| 468 |
+
|
| 469 |
+
# Create tagging record with question issues
|
| 470 |
+
tagging = TaggingRecord(
|
| 471 |
+
record_id="tag1",
|
| 472 |
+
message_id="msg1",
|
| 473 |
+
is_classification_correct=True,
|
| 474 |
+
question_issues=["inappropriate", "too_leading"],
|
| 475 |
+
question_comments="Questions were too intrusive",
|
| 476 |
+
)
|
| 477 |
+
|
| 478 |
+
# Log a step
|
| 479 |
+
step_id = logger.log_step(
|
| 480 |
+
session_id=session_id,
|
| 481 |
+
message_id="msg1",
|
| 482 |
+
step_type="follow_up",
|
| 483 |
+
input_text="input",
|
| 484 |
+
model_output="output",
|
| 485 |
+
)
|
| 486 |
+
|
| 487 |
+
# Update with tagging
|
| 488 |
+
logger.update_approval(step_id, "disapproved", tagging)
|
| 489 |
+
|
| 490 |
+
# Retrieve and verify
|
| 491 |
+
logged_step = logger.get_step(step_id)
|
| 492 |
+
assert logged_step.tagging_data.question_issues == ["inappropriate", "too_leading"]
|
| 493 |
+
assert logged_step.tagging_data.question_comments == "Questions were too intrusive"
|
| 494 |
+
|
| 495 |
+
def test_feedback_logging_referral_issues(self):
|
| 496 |
+
"""
|
| 497 |
+
Test that referral issues are recorded in tagging data.
|
| 498 |
+
"""
|
| 499 |
+
logger = InteractionLogger()
|
| 500 |
+
session_id = "test_session_referral"
|
| 501 |
+
|
| 502 |
+
# Create tagging record with referral issues
|
| 503 |
+
tagging = TaggingRecord(
|
| 504 |
+
record_id="tag1",
|
| 505 |
+
message_id="msg1",
|
| 506 |
+
is_classification_correct=True,
|
| 507 |
+
referral_issues=["incomplete_summary", "inappropriate_tone"],
|
| 508 |
+
referral_comments="Message was incomplete",
|
| 509 |
+
)
|
| 510 |
+
|
| 511 |
+
# Log a step
|
| 512 |
+
step_id = logger.log_step(
|
| 513 |
+
session_id=session_id,
|
| 514 |
+
message_id="msg1",
|
| 515 |
+
step_type="referral",
|
| 516 |
+
input_text="input",
|
| 517 |
+
model_output="output",
|
| 518 |
+
)
|
| 519 |
+
|
| 520 |
+
# Update with tagging
|
| 521 |
+
logger.update_approval(step_id, "disapproved", tagging)
|
| 522 |
+
|
| 523 |
+
# Retrieve and verify
|
| 524 |
+
logged_step = logger.get_step(step_id)
|
| 525 |
+
assert logged_step.tagging_data.referral_issues == ["incomplete_summary", "inappropriate_tone"]
|
| 526 |
+
assert logged_step.tagging_data.referral_comments == "Message was incomplete"
|
| 527 |
+
|
| 528 |
+
def test_feedback_logging_indicator_issues(self):
|
| 529 |
+
"""
|
| 530 |
+
Test that indicator issues are recorded in tagging data.
|
| 531 |
+
"""
|
| 532 |
+
logger = InteractionLogger()
|
| 533 |
+
session_id = "test_session_indicators"
|
| 534 |
+
|
| 535 |
+
# Create tagging record with indicator issues
|
| 536 |
+
tagging = TaggingRecord(
|
| 537 |
+
record_id="tag1",
|
| 538 |
+
message_id="msg1",
|
| 539 |
+
is_classification_correct=True,
|
| 540 |
+
indicator_issues=["indicator_1", "indicator_2"],
|
| 541 |
+
indicator_comments="These indicators were incorrectly identified",
|
| 542 |
+
)
|
| 543 |
+
|
| 544 |
+
# Log a step
|
| 545 |
+
step_id = logger.log_step(
|
| 546 |
+
session_id=session_id,
|
| 547 |
+
message_id="msg1",
|
| 548 |
+
step_type="classification",
|
| 549 |
+
input_text="input",
|
| 550 |
+
model_output="output",
|
| 551 |
+
)
|
| 552 |
+
|
| 553 |
+
# Update with tagging
|
| 554 |
+
logger.update_approval(step_id, "disapproved", tagging)
|
| 555 |
+
|
| 556 |
+
# Retrieve and verify
|
| 557 |
+
logged_step = logger.get_step(step_id)
|
| 558 |
+
assert logged_step.tagging_data.indicator_issues == ["indicator_1", "indicator_2"]
|
| 559 |
+
assert logged_step.tagging_data.indicator_comments == "These indicators were incorrectly identified"
|
| 560 |
+
|
| 561 |
+
def test_feedback_logging_general_notes(self):
|
| 562 |
+
"""
|
| 563 |
+
Test that general notes are recorded in tagging data.
|
| 564 |
+
"""
|
| 565 |
+
logger = InteractionLogger()
|
| 566 |
+
session_id = "test_session_notes"
|
| 567 |
+
|
| 568 |
+
# Create tagging record with general notes
|
| 569 |
+
tagging = TaggingRecord(
|
| 570 |
+
record_id="tag1",
|
| 571 |
+
message_id="msg1",
|
| 572 |
+
is_classification_correct=True,
|
| 573 |
+
general_notes="Overall good classification but needs improvement in tone",
|
| 574 |
+
)
|
| 575 |
+
|
| 576 |
+
# Log a step
|
| 577 |
+
step_id = logger.log_step(
|
| 578 |
+
session_id=session_id,
|
| 579 |
+
message_id="msg1",
|
| 580 |
+
step_type="classification",
|
| 581 |
+
input_text="input",
|
| 582 |
+
model_output="output",
|
| 583 |
+
)
|
| 584 |
+
|
| 585 |
+
# Update with tagging
|
| 586 |
+
logger.update_approval(step_id, "approved", tagging)
|
| 587 |
+
|
| 588 |
+
# Retrieve and verify
|
| 589 |
+
logged_step = logger.get_step(step_id)
|
| 590 |
+
assert logged_step.tagging_data.general_notes == "Overall good classification but needs improvement in tone"
|
| 591 |
+
|
| 592 |
+
def test_feedback_logging_disapproved_steps_retrieval(self):
|
| 593 |
+
"""
|
| 594 |
+
Test that disapproved steps can be retrieved from a session.
|
| 595 |
+
"""
|
| 596 |
+
logger = InteractionLogger()
|
| 597 |
+
session_id = "test_session_disapproved"
|
| 598 |
+
|
| 599 |
+
# Log multiple steps
|
| 600 |
+
step_id_1 = logger.log_step(session_id, "msg1", "classification", "input1", "output1")
|
| 601 |
+
step_id_2 = logger.log_step(session_id, "msg2", "explanation", "input2", "output2")
|
| 602 |
+
step_id_3 = logger.log_step(session_id, "msg3", "referral", "input3", "output3")
|
| 603 |
+
|
| 604 |
+
# Approve first, disapprove second and third
|
| 605 |
+
logger.update_approval(step_id_1, "approved")
|
| 606 |
+
logger.update_approval(step_id_2, "disapproved")
|
| 607 |
+
logger.update_approval(step_id_3, "disapproved")
|
| 608 |
+
|
| 609 |
+
# Get disapproved steps
|
| 610 |
+
disapproved = logger.get_disapproved_steps(session_id)
|
| 611 |
+
|
| 612 |
+
# Verify
|
| 613 |
+
assert len(disapproved) == 2
|
| 614 |
+
assert all(log.approval_status == "disapproved" for log in disapproved)
|
| 615 |
+
|
| 616 |
+
def test_feedback_logging_unapproved_steps_retrieval(self):
|
| 617 |
+
"""
|
| 618 |
+
Test that unapproved steps can be retrieved from a session.
|
| 619 |
+
"""
|
| 620 |
+
logger = InteractionLogger()
|
| 621 |
+
session_id = "test_session_unapproved"
|
| 622 |
+
|
| 623 |
+
# Log multiple steps
|
| 624 |
+
step_id_1 = logger.log_step(session_id, "msg1", "classification", "input1", "output1")
|
| 625 |
+
step_id_2 = logger.log_step(session_id, "msg2", "explanation", "input2", "output2")
|
| 626 |
+
step_id_3 = logger.log_step(session_id, "msg3", "referral", "input3", "output3")
|
| 627 |
+
|
| 628 |
+
# Approve first, leave others unapproved
|
| 629 |
+
logger.update_approval(step_id_1, "approved")
|
| 630 |
+
|
| 631 |
+
# Get unapproved steps
|
| 632 |
+
unapproved = logger.get_unapproved_steps(session_id)
|
| 633 |
+
|
| 634 |
+
# Verify
|
| 635 |
+
assert len(unapproved) == 2
|
| 636 |
+
assert all(log.approval_status is None for log in unapproved)
|
| 637 |
+
|
| 638 |
+
def test_feedback_logging_invalid_approval_status(self):
|
| 639 |
+
"""
|
| 640 |
+
Test that invalid approval status raises an error.
|
| 641 |
+
"""
|
| 642 |
+
logger = InteractionLogger()
|
| 643 |
+
session_id = "test_session_invalid"
|
| 644 |
+
|
| 645 |
+
# Log a step
|
| 646 |
+
step_id = logger.log_step(
|
| 647 |
+
session_id=session_id,
|
| 648 |
+
message_id="msg1",
|
| 649 |
+
step_type="classification",
|
| 650 |
+
input_text="input",
|
| 651 |
+
model_output="output",
|
| 652 |
+
)
|
| 653 |
+
|
| 654 |
+
# Try to update with invalid status
|
| 655 |
+
with pytest.raises(ValueError):
|
| 656 |
+
logger.update_approval(step_id, "invalid_status")
|
| 657 |
+
|
| 658 |
+
def test_feedback_logging_nonexistent_step(self):
|
| 659 |
+
"""
|
| 660 |
+
Test that updating a nonexistent step raises an error.
|
| 661 |
+
"""
|
| 662 |
+
logger = InteractionLogger()
|
| 663 |
+
|
| 664 |
+
with pytest.raises(ValueError):
|
| 665 |
+
logger.update_approval("nonexistent_step", "approved")
|
| 666 |
+
|
| 667 |
+
def test_feedback_logging_export_with_tagging(self):
|
| 668 |
+
"""
|
| 669 |
+
Test that exported logs include tagging data.
|
| 670 |
+
"""
|
| 671 |
+
logger = InteractionLogger()
|
| 672 |
+
session_id = "test_session_export_tagging"
|
| 673 |
+
|
| 674 |
+
# Create tagging record
|
| 675 |
+
tagging = TaggingRecord(
|
| 676 |
+
record_id="tag1",
|
| 677 |
+
message_id="msg1",
|
| 678 |
+
is_classification_correct=False,
|
| 679 |
+
classification_subcategory="missed_indicators",
|
| 680 |
+
correct_classification="red",
|
| 681 |
+
general_notes="Missed key indicators",
|
| 682 |
+
)
|
| 683 |
+
|
| 684 |
+
# Log a step
|
| 685 |
+
step_id = logger.log_step(
|
| 686 |
+
session_id=session_id,
|
| 687 |
+
message_id="msg1",
|
| 688 |
+
step_type="classification",
|
| 689 |
+
input_text="input",
|
| 690 |
+
model_output="output",
|
| 691 |
+
)
|
| 692 |
+
|
| 693 |
+
# Update with tagging
|
| 694 |
+
logger.update_approval(step_id, "disapproved", tagging)
|
| 695 |
+
|
| 696 |
+
# Export logs
|
| 697 |
+
exported = logger.export_session_logs(session_id)
|
| 698 |
+
|
| 699 |
+
# Verify export includes tagging data
|
| 700 |
+
assert len(exported) == 1
|
| 701 |
+
assert exported[0]["approval_status"] == "disapproved"
|
| 702 |
+
assert exported[0]["tagging_data"] is not None
|
| 703 |
+
assert exported[0]["tagging_data"]["classification_subcategory"] == "missed_indicators"
|
| 704 |
+
assert exported[0]["tagging_data"]["correct_classification"] == "red"
|
| 705 |
+
assert exported[0]["tagging_data"]["general_notes"] == "Missed key indicators"
|
tests/chaplain_feedback/test_properties_tagging_service.py
ADDED
@@ -0,0 +1,223 @@
# test_properties_tagging_service.py
"""
Property-based tests for TaggingService.

Tests universal properties that should hold across all inputs
for the tagging system functionality.
"""

from typing import Optional

import pytest
from hypothesis import given, strategies as st

from src.core.tagging_service import TaggingService
from src.core.chaplain_models import (
    CLASSIFICATION_SUBCATEGORIES,
    QUESTION_ISSUE_TYPES,
    REFERRAL_ISSUE_TYPES,
)
from .conftest import valid_id_strategy


class TestTaggingServiceProperties:
    """Property-based tests for TaggingService."""

    @given(
        message_id=valid_id_strategy(),
        general_notes=st.text(max_size=200)
    )
    def test_property_10_wrong_classification_subcategories_available(
        self, message_id: str, general_notes: str
    ):
        """
        **Feature: chaplain-feedback-system, Property 10: Wrong Classification Subcategories Available**
        **Validates: Requirements 4.1**

        For any incorrect classification feedback, the system should provide
        all three subcategory options: "missed_indicators", "false_positive", "missed_distress".
        """
        service = TaggingService()

        # Get available subcategories
        available_subcategories = service.get_available_classification_subcategories()

        # Should contain exactly the three required subcategories
        expected_subcategories = {"missed_indicators", "false_positive", "missed_distress"}
        assert set(available_subcategories) == expected_subcategories

        # Should be able to create records with each subcategory
        for subcategory in available_subcategories:
            record = service.create_classification_correction(
                message_id=f"{message_id}_{subcategory}",
                subcategory=subcategory,
                correct_classification="red",
                general_notes=general_notes
            )

            assert record.classification_subcategory == subcategory
            assert record.is_classification_correct is False
            assert record.correct_classification == "red"

    @given(
        message_id=valid_id_strategy(),
        subcategory=st.sampled_from(CLASSIFICATION_SUBCATEGORIES),
        correct_classification=st.sampled_from(["red", "yellow", "green"]),
        general_notes=st.text(max_size=200)
    )
    def test_property_11_wrong_classification_saves_subcategory(
        self,
        message_id: str,
        subcategory: str,
        correct_classification: str,
        general_notes: str
    ):
        """
        **Feature: chaplain-feedback-system, Property 11: Wrong Classification Saves Subcategory**
        **Validates: Requirements 4.3**

        For any wrong classification tag submission, the saved record should contain
        both the subcategory and the correct classification.
        """
        service = TaggingService()

        # Create classification correction
        record = service.create_classification_correction(
            message_id=message_id,
            subcategory=subcategory,
            correct_classification=correct_classification,
            general_notes=general_notes
        )

        # Record should be saved and retrievable
        retrieved_record = service.get_tagging_record(record.record_id)
        assert retrieved_record is not None

        # Should contain both the subcategory and the correct classification
        assert retrieved_record.classification_subcategory == subcategory
        assert retrieved_record.correct_classification == correct_classification
        assert retrieved_record.is_classification_correct is False

        # Should also be retrievable by message ID
        message_records = service.get_records_for_message(message_id)
        assert len(message_records) == 1
        assert message_records[0].classification_subcategory == subcategory
        assert message_records[0].correct_classification == correct_classification

    @given(
        message_id=valid_id_strategy(),
        question_issues=st.lists(
            st.sampled_from(QUESTION_ISSUE_TYPES),
            min_size=1,
            max_size=len(QUESTION_ISSUE_TYPES),
            unique=True
        ),
        question_comments=st.one_of(st.none(), st.text(max_size=200))
    )
    def test_property_12_question_issues_multi_select(
        self,
        message_id: str,
        question_issues: list,
        question_comments: Optional[str]
    ):
        """
        **Feature: chaplain-feedback-system, Property 12: Question Issues Multi-Select**
        **Validates: Requirements 5.2**

        For any follow-up question issue tagging, the system should allow
        selecting multiple subcategories and save all selected values.
        """
        service = TaggingService()

        # Create record with multiple question issues
        record = service.create_tagging_record(
            message_id=message_id,
            question_issues=question_issues,
            question_comments=question_comments
        )

        # Should save all selected question issues
        assert set(record.question_issues) == set(question_issues)
        assert record.question_comments == question_comments

        # Should be retrievable with all issues intact
        retrieved_record = service.get_tagging_record(record.record_id)
        assert retrieved_record is not None
        assert set(retrieved_record.question_issues) == set(question_issues)
        assert retrieved_record.question_comments == question_comments

    @given(
        message_id=valid_id_strategy(),
        referral_issues=st.lists(
            st.sampled_from(REFERRAL_ISSUE_TYPES),
            min_size=1,
            max_size=len(REFERRAL_ISSUE_TYPES),
            unique=True
        ),
        referral_comments=st.one_of(st.none(), st.text(max_size=200))
    )
    def test_property_13_referral_issues_multi_select(
        self,
        message_id: str,
        referral_issues: list,
        referral_comments: Optional[str]
    ):
        """
        **Feature: chaplain-feedback-system, Property 13: Referral Issues Multi-Select**
        **Validates: Requirements 6.2**

        For any referral message issue tagging, the system should allow
        selecting multiple subcategories and save all selected values.
        """
        service = TaggingService()

        # Create record with multiple referral issues
        record = service.create_tagging_record(
            message_id=message_id,
            referral_issues=referral_issues,
            referral_comments=referral_comments
        )

        # Should save all selected referral issues
        assert set(record.referral_issues) == set(referral_issues)
        assert record.referral_comments == referral_comments

        # Should be retrievable with all issues intact
        retrieved_record = service.get_tagging_record(record.record_id)
        assert retrieved_record is not None
        assert set(retrieved_record.referral_issues) == set(referral_issues)
        assert retrieved_record.referral_comments == referral_comments

    @given(
        message_id=valid_id_strategy(),
        indicator_issues=st.lists(st.text(min_size=1, max_size=50), min_size=1, max_size=5),
        indicator_comments=st.one_of(st.none(), st.text(max_size=200))
    )
    def test_indicator_issue_tagging_functionality(
        self,
        message_id: str,
        indicator_issues: list,
        indicator_comments: Optional[str]
    ):
        """
        Test that indicator issue tagging works correctly.

        This tests the indicator issue tagging functionality to ensure that
        incorrectly identified indicators can be marked with comments.
        """
        service = TaggingService()

        # Create record with indicator issues
        record = service.create_indicator_issue_tagging(
            message_id=message_id,
            indicator_issues=indicator_issues,
            indicator_comments=indicator_comments
        )

        # Should save all indicator issues
        assert record.indicator_issues == indicator_issues
        assert record.indicator_comments == indicator_comments

        # Should be retrievable with all issues intact
        retrieved_record = service.get_tagging_record(record.record_id)
        assert retrieved_record is not None
        assert retrieved_record.indicator_issues == indicator_issues
        assert retrieved_record.indicator_comments == indicator_comments