DocUA committed on
Commit ab93d81 · 1 Parent(s): a3934b1

Fix CSV download button for Hugging Face Spaces - use DownloadButton for direct file download

exports/verification_results_2025-12-10.csv ADDED
@@ -0,0 +1,18 @@
+ VERIFICATION SUMMARY
+ Total Messages,11
+ Correct,10
+ Incorrect,1
+ Accuracy %,90.9
+
+ Patient Message,Classifier Said,You Said,Notes,Date
+ I don't see any point in living anymore. Everything feels hopeless and I can't go on.,RED,RED,,2025-12-10 18:47:25
+ I've been thinking about ending it all. I have a plan and I'm ready.,RED,RED,,2025-12-10 18:47:26
+ My family would be better off without me. I'm just a burden to everyone.,RED,RED,,2025-12-10 18:47:27
+ I've already said goodbye to my loved ones. I'm going to do it tonight.,RED,RED,,2025-12-10 18:47:27
+ Nothing matters anymore. I'm tired of fighting. I just want the pain to stop.,RED,YELLOW,івкрврврарчарвар,2025-12-10 18:47:36
+ I've been researching ways to hurt myself. I think I'm ready to try.,RED,RED,,2025-12-10 18:47:38
+ Everyone would be happier if I was gone. I'm a failure and a disappointment.,RED,RED,,2025-12-10 18:47:38
+ I can't take this anymore. The pain is unbearable and I see no way out.,RED,RED,,2025-12-10 18:47:39
+ I've written a note. I'm going to do it. Please don't try to stop me.,RED,RED,,2025-12-10 18:47:39
+ I'm standing on the edge right now. I'm about to jump.,RED,RED,,2025-12-10 18:47:40
+ I have the pills in my hand. This is it. Goodbye.,RED,RED,,2025-12-10 18:47:41
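The export above stacks a summary block on top of a detail table, with an empty line between them. A minimal sketch of reading such a file and recomputing the accuracy figure, assuming that layout holds for every export. `parse_export` and `recompute_accuracy` are hypothetical helper names, and the message texts are replaced by placeholders that keep the same label pattern (ten agreements, one RED/YELLOW disagreement):

```python
import csv
import io


def parse_export(text):
    """Split an export into (summary dict, list of detail-row dicts)."""
    summary_part, _, detail_part = text.partition("\n\n")
    summary = {}
    for row in csv.reader(io.StringIO(summary_part)):
        if len(row) == 2:  # skips the one-cell "VERIFICATION SUMMARY" title
            summary[row[0]] = row[1]
    details = list(csv.DictReader(io.StringIO(detail_part)))
    return summary, details


def recompute_accuracy(details):
    """Percentage of rows where the reviewer agreed with the classifier."""
    correct = sum(1 for row in details if row["Classifier Said"] == row["You Said"])
    return round(100.0 * correct / len(details), 1)


# Placeholder rows (assumption): same columns and label pattern as the export.
detail_lines = ["Patient Message,Classifier Said,You Said,Notes,Date"]
detail_lines += [f"placeholder {i},RED,RED,,2025-12-10 18:47:25" for i in range(10)]
detail_lines.append("placeholder 10,RED,YELLOW,,2025-12-10 18:47:36")
sample = (
    "VERIFICATION SUMMARY\nTotal Messages,11\nCorrect,10\nIncorrect,1\nAccuracy %,90.9\n\n"
    + "\n".join(detail_lines)
    + "\n"
)

summary, details = parse_export(sample)
print(summary["Accuracy %"], recompute_accuracy(details))
```

Recomputing the percentage from the detail rows gives a cheap consistency check on the summary block before the file is used for analysis.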
src/core/chaplain_models.py ADDED
@@ -0,0 +1,745 @@
+ # chaplain_models.py
+ """
+ Data models for Chaplain Feedback & Tagging System.
+
+ Defines core data structures for classification flows, tagging records,
+ distress indicators, and interaction logging.
+ """
+
+ from dataclasses import dataclass, field
+ from typing import List, Optional, Dict, Any
+ from datetime import datetime
+
+
+ # =============================================================================
+ # INDICATOR DEFINITIONS - Based on Spiritual Distress Definitions Document
+ # =============================================================================
+
+ # Mapping of all indicators from the definitions document with their categories,
+ # subcategories, severity (red/yellow), and definition references.
+ # RED (#ea9999): Severe distress - requires immediate attention
+ # YELLOW (#ffe599): Potential distress - requires clarification
+
+ INDICATOR_DEFINITIONS: Dict[str, Dict[str, Any]] = {
+     # Section II.A - Emotional expressions
+     "crying": {
+         "category": "Emotional",
+         "subcategory": "Crying",
+         "severity": "red",
+         "definition_reference": "II.A",
+         "description": "Crying as expression of spiritual distress"
+     },
+     "dysomnias": {
+         "category": "Emotional",
+         "subcategory": "Dysomnias/Difficulty sleeping",
+         "severity": "yellow",
+         "definition_reference": "II.A",
+         "description": "Sleep disturbances related to spiritual distress"
+     },
+     "fatigue": {
+         "category": "Emotional",
+         "subcategory": "Fatigue, emotional exhaustion",
+         "severity": "yellow",
+         "definition_reference": "II.A",
+         "description": "Fatigue and emotional exhaustion"
+     },
+     "anxiety": {
+         "category": "Emotional",
+         "subcategory": "Anxiety",
+         "severity": "yellow",
+         "definition_reference": "II.A",
+         "description": "Anxiety as expression of spiritual distress"
+     },
+     "fear": {
+         "category": "Emotional",
+         "subcategory": "Fear",
+         "severity": "yellow",
+         "definition_reference": "II.A",
+         "description": "Fear as expression of spiritual distress"
+     },
+     "anger": {
+         "category": "Emotional",
+         "subcategory": "Anger",
+         "severity": "red",
+         "definition_reference": "II.A",
+         "description": "Anger as expression of spiritual distress"
+     },
+     "depressive_symptoms": {
+         "category": "Emotional",
+         "subcategory": "Depressive symptoms",
+         "severity": "yellow",
+         "definition_reference": "II.A",
+         "description": "Depressive symptoms"
+     },
+
+     # Section II.B - Decreased engagement
+     "decreased_engagement": {
+         "category": "Engagement",
+         "subcategory": "Decreased engagement with hobbies",
+         "severity": "yellow",
+         "definition_reference": "II.B",
+         "description": "Decreased engagement with hobbies, creative expression, and personal interests"
+     },
+
+     # Section II.C - Disinterest in nature
+     "disinterest_nature": {
+         "category": "Engagement",
+         "subcategory": "Disinterest in nature",
+         "severity": "yellow",
+         "definition_reference": "II.C",
+         "description": "Disinterest in nature due to spiritual, emotional and physical limitations"
+     },
+
+     # Section II.D - Excessive guilt
+     "excessive_guilt": {
+         "category": "Guilt",
+         "subcategory": "Excessive guilt",
+         "severity": "red",
+         "definition_reference": "II.D",
+         "description": "Excessive guilt - existential, religious, or relational"
+     },
+
+     # Section II.E - Anger behaviors of spiritual nature
+     "anger_spiritual": {
+         "category": "Anger",
+         "subcategory": "Anger behaviors of a spiritual nature",
+         "severity": "red",
+         "definition_reference": "II.E",
+         "description": "Anger toward power greater than self"
+     },
+
+     # Section II.F - Grief types
+     "anticipatory_grieving": {
+         "category": "Grief",
+         "subcategory": "Anticipatory grieving",
+         "severity": "red",
+         "definition_reference": "II.F",
+         "description": "Emotional response to anticipated death"
+     },
+     "disenfranchised_grief": {
+         "category": "Grief",
+         "subcategory": "Disenfranchised grief",
+         "severity": "red",
+         "definition_reference": "II.F",
+         "description": "Grief unacknowledged or unsupported by society"
+     },
+     "life_review_grieving": {
+         "category": "Grief",
+         "subcategory": "Grieving in the setting of life review",
+         "severity": "yellow",
+         "definition_reference": "II.F",
+         "description": "Grieving during life review process"
+     },
+     "maladaptive_grieving": {
+         "category": "Grief",
+         "subcategory": "Maladaptive grieving",
+         "severity": "red",
+         "definition_reference": "II.F",
+         "description": "Prolonged grief disorder"
+     },
+     "complicated_grief": {
+         "category": "Grief",
+         "subcategory": "Complicated grief",
+         "severity": "red",
+         "definition_reference": "II.F",
+         "description": "Persistent, intense grief disrupting daily life"
+     },
+     "loss_loved_one": {
+         "category": "Grief",
+         "subcategory": "Loss of a loved one",
+         "severity": "red",
+         "definition_reference": "II.F",
+         "description": "Loss of family member or friend"
+     },
+
+     # Section II.G - Expressions of Spiritual Distress
+     "expresses_alienation": {
+         "category": "Expressions",
+         "subcategory": "Expresses alienation",
+         "severity": "yellow",
+         "definition_reference": "II.G",
+         "description": "Feeling separation, isolation, disconnection"
+     },
+     "concern_beliefs": {
+         "category": "Expressions",
+         "subcategory": "Expresses concern about beliefs",
+         "severity": "yellow",
+         "definition_reference": "II.G",
+         "description": "Questions or struggles with spiritual/religious beliefs"
+     },
+     "concern_future": {
+         "category": "Expressions",
+         "subcategory": "Expresses concern about the future",
+         "severity": "red",
+         "definition_reference": "II.G",
+         "description": "Anxious, fearful, or uncertain about what lies ahead"
+     },
+     "concern_values": {
+         "category": "Expressions",
+         "subcategory": "Expresses concern about values system",
+         "severity": "yellow",
+         "definition_reference": "II.G",
+         "description": "Conflicted about moral or ethical principles"
+     },
+     "concern_family": {
+         "category": "Expressions",
+         "subcategory": "Expresses concerns about family",
+         "severity": "yellow",
+         "definition_reference": "II.G",
+         "description": "Distressed about family well-being or relationships"
+     },
+     "feeling_emptiness": {
+         "category": "Expressions",
+         "subcategory": "Expresses feeling of emptiness",
+         "severity": "red",
+         "definition_reference": "II.G",
+         "description": "Deep inner void or lack of meaning"
+     },
+     "feeling_unloved": {
+         "category": "Expressions",
+         "subcategory": "Expresses feeling unloved",
+         "severity": "red",
+         "definition_reference": "II.G",
+         "description": "Feels unworthy of love or disconnected from caring relationships"
+     },
+     "feeling_worthless": {
+         "category": "Expressions",
+         "subcategory": "Expresses feeling worthless",
+         "severity": "red",
+         "definition_reference": "II.G",
+         "description": "Perceives themselves as having little or no value"
+     },
+     "insufficient_courage": {
+         "category": "Expressions",
+         "subcategory": "Expresses insufficient courage",
+         "severity": "yellow",
+         "definition_reference": "II.G",
+         "description": "Fear or lack of strength to face suffering"
+     },
+     "loss_confidence": {
+         "category": "Expressions",
+         "subcategory": "Expresses loss of confidence",
+         "severity": "yellow",
+         "definition_reference": "II.G",
+         "description": "Diminished trust in themselves or abilities"
+     },
+     "loss_control": {
+         "category": "Expressions",
+         "subcategory": "Expresses loss of control",
+         "severity": "yellow",
+         "definition_reference": "II.G",
+         "description": "Feels powerless over life circumstances"
+     },
+     "loss_hope": {
+         "category": "Expressions",
+         "subcategory": "Expresses loss of hope",
+         "severity": "red",
+         "definition_reference": "II.G",
+         "description": "Feels despair or believes future holds no possibility"
+     },
+     "loss_serenity": {
+         "category": "Expressions",
+         "subcategory": "Expresses loss of serenity",
+         "severity": "yellow",
+         "definition_reference": "II.G",
+         "description": "Inner turmoil, anxiety, or restlessness"
+     },
+     "need_forgiveness": {
+         "category": "Expressions",
+         "subcategory": "Expresses need for forgiveness",
+         "severity": "red",
+         "definition_reference": "II.G",
+         "description": "Feels guilt or remorse and desires reconciliation"
+     },
+     "expresses_regret": {
+         "category": "Expressions",
+         "subcategory": "Expresses regret",
+         "severity": "yellow",
+         "definition_reference": "II.G",
+         "description": "Sorrow over past actions or missed opportunities"
+     },
+     "expresses_suffering": {
+         "category": "Expressions",
+         "subcategory": "Expresses suffering",
+         "severity": "red",
+         "definition_reference": "II.G",
+         "description": "Deep physical, emotional, or spiritual pain"
+     },
+     "concern_medical_treatment": {
+         "category": "Medical",
+         "subcategory": "Expresses concern about medical treatment",
+         "severity": "red",
+         "definition_reference": "II.G",
+         "description": "Concern about treatment or medical team"
+     },
+     "unfinished_business": {
+         "category": "Expressions",
+         "subcategory": "Expresses feeling of having unfinished business",
+         "severity": "red",
+         "definition_reference": "II.G",
+         "description": "Important matters remain unresolved"
+     },
+     "desire_share_spiritual": {
+         "category": "Spiritual",
+         "subcategory": "Expresses desire to share intense spiritual experiences",
+         "severity": "yellow",
+         "definition_reference": "II.G",
+         "description": "Wants to share intense spiritual/religious experiences"
+     },
+     "inability_transcendence": {
+         "category": "Spiritual",
+         "subcategory": "Inability to experience transcendence",
+         "severity": "red",
+         "definition_reference": "II.G",
+         "description": "Cannot experience supportive forces larger than oneself"
+     },
+     "impaired_introspection": {
+         "category": "Spiritual",
+         "subcategory": "Impaired ability for introspection",
+         "severity": "yellow",
+         "definition_reference": "II.G",
+         "description": "Impaired ability for self-reflection"
+     },
+
+     # Section II.H - Existential questioning
+     "questioning_identity": {
+         "category": "Existential",
+         "subcategory": "Questioning one's identity",
+         "severity": "yellow",
+         "definition_reference": "II.H",
+         "description": "Confused about identity when illness takes away roles"
+     },
+     "questioning_meaning_life": {
+         "category": "Existential",
+         "subcategory": "Questioning the meaning of life",
+         "severity": "red",
+         "definition_reference": "II.H",
+         "description": "Grapples with fundamental questions about existence"
+     },
+     "questioning_meaning_suffering": {
+         "category": "Existential",
+         "subcategory": "Questioning the meaning of suffering",
+         "severity": "red",
+         "definition_reference": "II.H",
+         "description": "Struggles to understand if pain has purpose"
+     },
+     "questioning_dignity": {
+         "category": "Existential",
+         "subcategory": "Questioning one's own dignity",
+         "severity": "red",
+         "definition_reference": "II.H",
+         "description": "Questions inherent worth and value as person"
+     },
+
+     # Section II.I - Social isolation
+     "social_isolation": {
+         "category": "Social",
+         "subcategory": "Social isolation expressions",
+         "severity": "yellow",
+         "definition_reference": "II.I",
+         "description": "Avoids interaction, estrangement, loneliness"
+     },
+
+     # Section II.J - Changes in spiritual/religious practices
+     "altered_religious_ritual": {
+         "category": "Spiritual",
+         "subcategory": "Altered religious ritual",
+         "severity": "yellow",
+         "definition_reference": "II.J.a",
+         "description": "Disruption to religious practices"
+     },
+     "altered_spiritual_practice": {
+         "category": "Spiritual",
+         "subcategory": "Altered spiritual practice",
+         "severity": "yellow",
+         "definition_reference": "II.J.b",
+         "description": "Disruption to personal spiritual activities"
+     },
+
+     # Section II.K - Cultural conflict
+     "cultural_conflict": {
+         "category": "Cultural",
+         "subcategory": "Cultural conflict",
+         "severity": "yellow",
+         "definition_reference": "II.K",
+         "description": "Clash between cultural beliefs and healthcare culture"
+     },
+
+     # Section II.L - Sociocultural deprivation
+     "sociocultural_deprivation": {
+         "category": "Cultural",
+         "subcategory": "Sociocultural deprivation",
+         "severity": "yellow",
+         "definition_reference": "II.L",
+         "description": "Separated from cultural community"
+     },
+
+     # Section II.M - Difficulty accepting aging
+     "difficulty_accepting_aging": {
+         "category": "Aging",
+         "subcategory": "Difficulty accepting aging",
+         "severity": "yellow",
+         "definition_reference": "II.M",
+         "description": "Grief over lost abilities, resistance to mortality"
+     },
+
+     # Section II.N - Inadequate environmental control
+     "inadequate_environmental_control": {
+         "category": "Environment",
+         "subcategory": "Inadequate environmental control",
+         "severity": "yellow",
+         "definition_reference": "II.N",
+         "description": "Unable to shape surroundings for spiritual needs"
+     },
+
+     # Section II.O - Loss of independence
+     "loss_independence": {
+         "category": "Independence",
+         "subcategory": "Loss of independence",
+         "severity": "yellow",
+         "definition_reference": "II.O",
+         "description": "Dependency threatens personal and spiritual agency"
+     },
+
+     # Section II.P - Uncontrolled pain
+     "uncontrolled_pain": {
+         "category": "Medical",
+         "subcategory": "Uncontrolled pain",
+         "severity": "red",
+         "definition_reference": "II.P",
+         "description": "Persistent physical pain causing existential distress"
+     },
+
+     # Section II.Q - Spiritual pain
+     "spiritual_pain": {
+         "category": "Spiritual",
+         "subcategory": "Spiritual pain",
+         "severity": "red",
+         "definition_reference": "II.Q",
+         "description": "Soul-level suffering beyond physical symptoms"
+     },
+ }
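Because every entry in `INDICATOR_DEFINITIONS` carries a `severity` field, the table can be regrouped for quick lookups. A minimal sketch, assuming a caller wants the indicator keys per severity; `keys_by_severity` is a hypothetical helper that is not part of this commit, and `DEFINITIONS_EXCERPT` copies just three entries from the table so the snippet runs on its own:

```python
from collections import defaultdict
from typing import Any, Dict, List

# Three entries copied from INDICATOR_DEFINITIONS; the real table is much longer.
DEFINITIONS_EXCERPT: Dict[str, Dict[str, Any]] = {
    "crying": {
        "category": "Emotional",
        "subcategory": "Crying",
        "severity": "red",
        "definition_reference": "II.A",
    },
    "anxiety": {
        "category": "Emotional",
        "subcategory": "Anxiety",
        "severity": "yellow",
        "definition_reference": "II.A",
    },
    "excessive_guilt": {
        "category": "Guilt",
        "subcategory": "Excessive guilt",
        "severity": "red",
        "definition_reference": "II.D",
    },
}


def keys_by_severity(definitions: Dict[str, Dict[str, Any]]) -> Dict[str, List[str]]:
    """Group indicator keys by their severity, preserving table order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for key, entry in definitions.items():
        grouped[entry["severity"]].append(key)
    return dict(grouped)


grouped = keys_by_severity(DEFINITIONS_EXCERPT)
print(grouped)
```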
+
+
+ # =============================================================================
+ # DATA MODELS
+ # =============================================================================
+
+ @dataclass
+ class DistressIndicator:
+     """
+     Detected distress indicator with category and severity.
+
+     Based on the Spiritual Distress Definitions document with color coding:
+     - RED (#ea9999): Severe distress - requires immediate attention
+     - YELLOW (#ffe599): Potential distress - requires clarification
+     """
+     indicator_text: str
+     category: str  # e.g. "Emotional", "Grief", "Existential"; see INDICATOR_DEFINITIONS for all values
+     subcategory: str  # Specific indicator name from definitions document
+     severity: str  # "red" or "yellow" - based on color coding in definitions document
+     confidence: float  # 0.0-1.0
+     definition_reference: str = ""  # Section reference (e.g., "II.D", "II.G")
+
+     def __post_init__(self):
+         """Validate severity and confidence values."""
+         if self.severity not in ("red", "yellow"):
+             raise ValueError(f"Severity must be 'red' or 'yellow', got '{self.severity}'")
+         if not 0.0 <= self.confidence <= 1.0:
+             raise ValueError(f"Confidence must be between 0.0 and 1.0, got {self.confidence}")
+
+     def to_dict(self) -> dict:
+         """Convert indicator to dictionary for serialization."""
+         return {
+             "indicator_text": self.indicator_text,
+             "category": self.category,
+             "subcategory": self.subcategory,
+             "severity": self.severity,
+             "confidence": self.confidence,
+             "definition_reference": self.definition_reference,
+         }
+
+     @classmethod
+     def from_dict(cls, data: dict) -> "DistressIndicator":
+         """Create indicator from dictionary."""
+         return cls(**data)
+
+     @classmethod
+     def from_definition(cls, indicator_key: str, indicator_text: str, confidence: float) -> "DistressIndicator":
+         """
+         Create indicator from INDICATOR_DEFINITIONS constant.
+
+         Args:
+             indicator_key: Key in INDICATOR_DEFINITIONS (e.g., "excessive_guilt")
+             indicator_text: The actual text that triggered this indicator
+             confidence: Confidence score 0.0-1.0
+
+         Returns:
+             DistressIndicator with category, subcategory, severity from definitions
+
+         Raises:
+             KeyError: If indicator_key not found in INDICATOR_DEFINITIONS
+         """
+         if indicator_key not in INDICATOR_DEFINITIONS:
+             raise KeyError(f"Unknown indicator key: {indicator_key}")
+
+         defn = INDICATOR_DEFINITIONS[indicator_key]
+         return cls(
+             indicator_text=indicator_text,
+             category=defn["category"],
+             subcategory=defn["subcategory"],
+             severity=defn["severity"],
+             confidence=confidence,
+             definition_reference=defn["definition_reference"],
+         )
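`from_dict` passes the dictionary straight to the constructor, so any key that is not a dataclass field raises `TypeError`. If stored records may carry extra keys (an assumption, since the commit does not show the storage layer), one option is to filter the payload first. The sketch below uses a trimmed stand-in class named `Indicator` and a hypothetical helper `filter_known_fields`; neither exists in the commit:

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class Indicator:
    """Trimmed stand-in for DistressIndicator so the snippet runs on its own."""
    indicator_text: str
    category: str
    subcategory: str
    severity: str
    confidence: float
    definition_reference: str = ""


def filter_known_fields(cls: type, data: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the keys that match a field declared on the dataclass."""
    known = {f.name for f in fields(cls)}
    return {key: value for key, value in data.items() if key in known}


payload = {
    "indicator_text": "placeholder text",
    "category": "Guilt",
    "subcategory": "Excessive guilt",
    "severity": "red",
    "confidence": 0.8,
    "definition_reference": "II.D",
    "legacy_score": 3,  # hypothetical extra key that would break cls(**data)
}

indicator = Indicator(**filter_known_fields(Indicator, payload))
print(indicator)
```

Silently dropping unknown keys trades strictness for tolerance; whether that is acceptable depends on how much the stored format is expected to drift.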
+
+
+
+ @dataclass
+ class FollowUpQuestion:
+     """
+     Generated follow-up question for YELLOW cases.
+
+     Contains 1-2 short, sensitive clarifying questions with purpose explanation.
+     """
+     question_id: str
+     question_text: str
+     purpose: str  # Why this question is being asked
+
+     def to_dict(self) -> dict:
+         """Convert question to dictionary for serialization."""
+         return {
+             "question_id": self.question_id,
+             "question_text": self.question_text,
+             "purpose": self.purpose,
+         }
+
+     @classmethod
+     def from_dict(cls, data: dict) -> "FollowUpQuestion":
+         """Create question from dictionary."""
+         return cls(**data)
+
+
+ @dataclass
+ class ClassificationFlowResult:
+     """
+     Complete result of classification flow.
+
+     Contains all flow-specific fields for RED/YELLOW/GREEN classifications.
+     """
+     classification: str  # "red", "yellow", "green"
+     confidence: float  # 0.0-1.0
+     indicators: List[DistressIndicator] = field(default_factory=list)
+     explanation: str = ""
+
+     # RED-specific fields
+     permission_check_message: Optional[str] = None
+     referral_message: Optional[str] = None
+     consent_status: Optional[str] = None  # "granted", "declined", None
+
+     # YELLOW-specific fields
+     follow_up_questions: List[FollowUpQuestion] = field(default_factory=list)
+     patient_responses: List[str] = field(default_factory=list)
+     re_evaluation_result: Optional[str] = None  # "red", "green", None
+
+     def __post_init__(self):
+         """Validate classification value."""
+         if self.classification not in ("red", "yellow", "green"):
+             raise ValueError(f"Classification must be 'red', 'yellow', or 'green', got '{self.classification}'")
+         if not 0.0 <= self.confidence <= 1.0:
+             raise ValueError(f"Confidence must be between 0.0 and 1.0, got {self.confidence}")
+
+     def to_dict(self) -> dict:
+         """Convert result to dictionary for serialization."""
+         return {
+             "classification": self.classification,
+             "confidence": self.confidence,
+             "indicators": [i.to_dict() for i in self.indicators],
+             "explanation": self.explanation,
+             "permission_check_message": self.permission_check_message,
+             "referral_message": self.referral_message,
+             "consent_status": self.consent_status,
+             "follow_up_questions": [q.to_dict() for q in self.follow_up_questions],
+             "patient_responses": self.patient_responses,
+             "re_evaluation_result": self.re_evaluation_result,
+         }
+
+     @classmethod
+     def from_dict(cls, data: dict) -> "ClassificationFlowResult":
+         """Create result from dictionary."""
+         data_copy = data.copy()
+
+         # Convert nested indicators
+         indicators_data = data_copy.pop("indicators", [])
+         indicators = [DistressIndicator.from_dict(i) for i in indicators_data]
+
+         # Convert nested follow-up questions
+         questions_data = data_copy.pop("follow_up_questions", [])
+         questions = [FollowUpQuestion.from_dict(q) for q in questions_data]
+
+         result = cls(**data_copy)
+         result.indicators = indicators
+         result.follow_up_questions = questions
+         return result
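`ClassificationFlowResult.from_dict` pops the nested lists out of the payload and rebuilds them as objects before calling the constructor. The same pattern in isolation, as a minimal sketch: `Question` and `FlowResult` are hypothetical, trimmed stand-ins for `FollowUpQuestion` and `ClassificationFlowResult`, and the JSON round trip is an assumed use case that the commit does not show:

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Question:
    """Hypothetical stand-in for FollowUpQuestion."""
    question_id: str
    question_text: str
    purpose: str


@dataclass
class FlowResult:
    """Hypothetical, trimmed stand-in for ClassificationFlowResult."""
    classification: str
    confidence: float
    follow_up_questions: List[Question] = field(default_factory=list)

    def to_dict(self) -> dict:
        return {
            "classification": self.classification,
            "confidence": self.confidence,
            "follow_up_questions": [asdict(q) for q in self.follow_up_questions],
        }

    @classmethod
    def from_dict(cls, data: dict) -> "FlowResult":
        data_copy = dict(data)  # leave the caller's dictionary untouched
        raw_questions = data_copy.pop("follow_up_questions", [])
        questions = [Question(**q) for q in raw_questions]
        return cls(follow_up_questions=questions, **data_copy)


original = FlowResult(
    "yellow",
    0.6,
    [Question("q1", "Placeholder question text?", "placeholder purpose")],
)
restored = FlowResult.from_dict(json.loads(json.dumps(original.to_dict())))
print(restored == original)
```

Passing the rebuilt list to the constructor, rather than assigning it afterwards as the commit does, is a stylistic variation with the same result here.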
+
+
+ # Tagging category constants
+ CLASSIFICATION_SUBCATEGORIES = [
+     "missed_indicators",  # Missed key distress indicators
+     "false_positive",  # Overly sensitive (false-positive flag)
+     "missed_distress",  # Not sensitive enough (missed distress)
+ ]
+
+ QUESTION_ISSUE_TYPES = [
+     "inappropriate",  # Question is inappropriate or intrusive
+     "not_relevant",  # Question is not spiritually relevant
+     "too_leading",  # Question is too leading or assumptive
+     "unclear",  # Question is unclear or confusing
+     "tone_clinical",  # Tone too clinical
+     "tone_religious",  # Tone too religious
+     "tone_casual",  # Tone too casual
+ ]
+
+ REFERRAL_ISSUE_TYPES = [
+     "incomplete_summary",  # Incorrect or incomplete summary
+     "misrepresentation",  # Misrepresentation of patient message
+     "inappropriate_tone",  # Tone inappropriate for spiritual care team
+ ]
+
+
+ @dataclass
+ class TaggingRecord:
+     """
+     Structured tagging feedback from chaplain.
+
+     Supports multi-select for question and referral issues.
+     """
+     record_id: str
+     message_id: str
+
+     # Classification feedback
+     is_classification_correct: bool = True
+     classification_subcategory: Optional[str] = None  # "missed_indicators", "false_positive", "missed_distress"
+     correct_classification: Optional[str] = None  # "red", "yellow", "green"
+
+     # Follow-up question feedback (YELLOW only)
+     question_issues: List[str] = field(default_factory=list)  # Multi-select from QUESTION_ISSUE_TYPES
+     question_comments: Optional[str] = None
+
+     # Referral message feedback (RED only)
+     referral_issues: List[str] = field(default_factory=list)  # Multi-select from REFERRAL_ISSUE_TYPES
+     referral_comments: Optional[str] = None
+
+     # Indicator feedback
+     indicator_issues: List[str] = field(default_factory=list)  # List of incorrectly identified indicator IDs
+     indicator_comments: Optional[str] = None
+
+     # General
+     general_notes: str = ""
+     timestamp: datetime = field(default_factory=datetime.now)
+
+     def __post_init__(self):
+         """Validate tagging values."""
+         if self.classification_subcategory and self.classification_subcategory not in CLASSIFICATION_SUBCATEGORIES:
+             raise ValueError(f"Invalid classification subcategory: {self.classification_subcategory}")
+         if self.correct_classification and self.correct_classification not in ("red", "yellow", "green"):
+             raise ValueError(f"Invalid correct_classification: {self.correct_classification}")
+         for issue in self.question_issues:
+             if issue not in QUESTION_ISSUE_TYPES:
+                 raise ValueError(f"Invalid question issue type: {issue}")
+         for issue in self.referral_issues:
+             if issue not in REFERRAL_ISSUE_TYPES:
+                 raise ValueError(f"Invalid referral issue type: {issue}")
+
+     def to_dict(self) -> dict:
+         """Convert record to dictionary for serialization."""
+         return {
+             "record_id": self.record_id,
+             "message_id": self.message_id,
+             "is_classification_correct": self.is_classification_correct,
+             "classification_subcategory": self.classification_subcategory,
+             "correct_classification": self.correct_classification,
+             "question_issues": self.question_issues,
+             "question_comments": self.question_comments,
+             "referral_issues": self.referral_issues,
+             "referral_comments": self.referral_comments,
+             "indicator_issues": self.indicator_issues,
+             "indicator_comments": self.indicator_comments,
+             "general_notes": self.general_notes,
+             "timestamp": self.timestamp.isoformat(),
+         }
+
+     @classmethod
+     def from_dict(cls, data: dict) -> "TaggingRecord":
+         """Create record from dictionary."""
+         data_copy = data.copy()
+         if isinstance(data_copy.get("timestamp"), str):
+             data_copy["timestamp"] = datetime.fromisoformat(data_copy["timestamp"])
+         return cls(**data_copy)
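`TaggingRecord.__post_init__` stops at the first invalid issue code. For a multi-select form it can be friendlier to report every invalid value in one pass. A minimal sketch of that variation, assuming the UI wants the full list; `find_invalid` is a hypothetical helper, and the codes `too_long` and `rude` are made up to trigger the check:

```python
from typing import Iterable, List, Sequence

# Copied from QUESTION_ISSUE_TYPES so the snippet runs on its own.
QUESTION_ISSUE_TYPES = [
    "inappropriate",
    "not_relevant",
    "too_leading",
    "unclear",
    "tone_clinical",
    "tone_religious",
    "tone_casual",
]


def find_invalid(selected: Iterable[str], allowed: Sequence[str]) -> List[str]:
    """Return every selected value that is not in the allowed list, in input order."""
    allowed_set = set(allowed)
    return [item for item in selected if item not in allowed_set]


selection = ["unclear", "too_long", "tone_casual", "rude"]
invalid = find_invalid(selection, QUESTION_ISSUE_TYPES)
if invalid:
    print(f"Invalid question issue types: {', '.join(invalid)}")
```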
+
+
+
+ # Interaction step types
+ INTERACTION_STEP_TYPES = [
+     "classification",  # Initial classification
+     "explanation",  # Explanation generation
+     "permission_check",  # Patient consent request
+     "follow_up",  # Follow-up questions
+     "referral",  # Referral message generation
+     "feedback",  # Chaplain feedback
+ ]
+
+
+ @dataclass
+ class InteractionStepLog:
+     """
+     Log entry for a single interaction step.
+
+     Records all interaction steps with input/output for analysis.
+     """
+     step_id: str
+     session_id: str
+     message_id: str
+     step_type: str  # "classification", "explanation", "permission_check", "follow_up", "referral", "feedback"
+     input_text: str
+     model_output: str
+     approval_status: Optional[str] = None  # "approved", "disapproved", None
+     tagging_data: Optional[TaggingRecord] = None
+     timestamp: datetime = field(default_factory=datetime.now)
+
+     def __post_init__(self):
+         """Validate step type."""
+         if self.step_type not in INTERACTION_STEP_TYPES:
+             raise ValueError(f"Invalid step type: {self.step_type}")
+         if self.approval_status and self.approval_status not in ("approved", "disapproved"):
+             raise ValueError(f"Invalid approval status: {self.approval_status}")
+
+     def to_dict(self) -> dict:
+         """Convert log entry to dictionary for serialization."""
+         return {
+             "step_id": self.step_id,
+             "session_id": self.session_id,
+             "message_id": self.message_id,
+             "step_type": self.step_type,
+             "input_text": self.input_text,
+             "model_output": self.model_output,
+             "approval_status": self.approval_status,
+             "tagging_data": self.tagging_data.to_dict() if self.tagging_data else None,
+             "timestamp": self.timestamp.isoformat(),
+         }
+
+     @classmethod
+     def from_dict(cls, data: dict) -> "InteractionStepLog":
+         """Create log entry from dictionary."""
+         data_copy = data.copy()
+         if isinstance(data_copy.get("timestamp"), str):
+             data_copy["timestamp"] = datetime.fromisoformat(data_copy["timestamp"])
+
+         # Convert nested tagging data
+         tagging_data = data_copy.pop("tagging_data", None)
+         if tagging_data:
+             tagging_data = TaggingRecord.from_dict(tagging_data)
+
+         log = cls(**data_copy)
+         log.tagging_data = tagging_data
+         return log
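Since `to_dict` emits plain JSON-compatible values with the timestamp in ISO 8601 form, log entries could be persisted one per line. The commit does not show how logs are stored, so the sketch below is only one possible approach, assuming a JSON Lines file; `write_jsonl` and `read_jsonl` are hypothetical helpers, and the entry mirrors the keys of `InteractionStepLog.to_dict` with placeholder values:

```python
import io
import json
from datetime import datetime
from typing import Iterable, List, TextIO


def write_jsonl(entries: Iterable[dict], stream: TextIO) -> None:
    """Write one JSON document per line."""
    for entry in entries:
        stream.write(json.dumps(entry, ensure_ascii=False) + "\n")


def read_jsonl(stream: TextIO) -> List[dict]:
    """Read entries back and restore the timestamp as a datetime."""
    entries = []
    for line in stream:
        if not line.strip():
            continue  # tolerate blank lines
        entry = json.loads(line)
        entry["timestamp"] = datetime.fromisoformat(entry["timestamp"])
        entries.append(entry)
    return entries


entry = {
    "step_id": "step-1",
    "session_id": "session-1",
    "message_id": "message-1",
    "step_type": "classification",
    "input_text": "placeholder input",
    "model_output": "red",
    "approval_status": None,
    "tagging_data": None,
    "timestamp": datetime(2025, 12, 10, 18, 47, 25).isoformat(),
}

buffer = io.StringIO()  # stands in for a file opened in text mode
write_jsonl([entry], buffer)
buffer.seek(0)
loaded = read_jsonl(buffer)
print(loaded[0]["timestamp"])
```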
src/core/classification_flow_manager.py ADDED
@@ -0,0 +1,310 @@
+ # classification_flow_manager.py
+ """
+ Classification Flow Manager for Chaplain Feedback System.
+
+ Orchestrates RED/YELLOW/GREEN classification flows and integrates with ContentGenerator
+ to produce complete classification results with appropriate content.
+ """
+
+ from typing import List, Optional
+ import uuid
+ from datetime import datetime
+
+ from src.core.chaplain_models import (
+     DistressIndicator,
+     FollowUpQuestion,
+     ClassificationFlowResult,
+ )
+ from src.core.content_generator import ContentGenerator
+
+
+ class ClassificationFlowManager:
+     """
+     Orchestrates RED/YELLOW/GREEN classification flows.
+
+     Integrates with ContentGenerator to produce complete classification results
+     with explanations, permission checks, referral messages, and follow-up questions.
+     """
+
+     def __init__(self, content_generator: Optional[ContentGenerator] = None):
+         """
+         Initialize flow manager.
+
+         Args:
+             content_generator: ContentGenerator instance, creates new one if None
+         """
+         self.content_generator = content_generator or ContentGenerator()
+
+     def execute_classification_flow(
+         self,
+         message: str,
+         classification: str,
+         confidence: float,
+         indicators: List[DistressIndicator]
+     ) -> ClassificationFlowResult:
+         """
+         Execute complete classification flow based on classification type.
+
+         Args:
+             message: Original patient message
+             classification: "red", "yellow", or "green"
+             confidence: Classification confidence (0.0-1.0)
+             indicators: List of detected distress indicators
+
+         Returns:
+             Complete ClassificationFlowResult with all generated content
+         """
+         if classification == "red":
+             return self.execute_red_flow(message, confidence, indicators)
+         elif classification == "yellow":
+             return self.execute_yellow_flow(message, confidence, indicators)
+         elif classification == "green":
+             return self.execute_green_flow(message, confidence, indicators)
+         else:
+             raise ValueError(f"Invalid classification: {classification}")
65
+
66
+ def execute_red_flow(
67
+ self,
68
+ message: str,
69
+ confidence: float,
70
+ indicators: List[DistressIndicator],
71
+ consent_status: Optional[str] = None
72
+ ) -> ClassificationFlowResult:
73
+ """
74
+ Execute RED flag flow.
75
+
76
+ Generates explanation, permission check, and referral message.
77
+ Handles consent granted/declined states.
78
+
79
+ Args:
80
+ message: Original patient message
81
+ confidence: Classification confidence
82
+ indicators: List of detected distress indicators
83
+ consent_status: "granted", "declined", or None for simulation
84
+
85
+ Returns:
86
+ ClassificationFlowResult with RED flow content
87
+ """
88
+ # Generate explanation
89
+ explanation = self.content_generator.generate_explanation(
90
+ "red", indicators, message
91
+ )
92
+
93
+ # Generate permission check message
94
+ permission_check = self.content_generator.generate_permission_check(indicators)
95
+
96
+ # Simulate consent if not provided
97
+ if consent_status is None:
98
+ # For testing/demo purposes, simulate consent as granted
99
+ # In real implementation, this would come from user interaction
100
+ consent_status = "granted"
101
+
102
+ # Generate referral message if consent granted
103
+ referral_message = None
104
+ if consent_status == "granted":
105
+ referral_message = self.content_generator.generate_referral_message(
106
+ message, indicators, explanation
107
+ )
108
+
109
+ return ClassificationFlowResult(
110
+ classification="red",
111
+ confidence=confidence,
112
+ indicators=indicators,
113
+ explanation=explanation,
114
+ permission_check_message=permission_check,
115
+ referral_message=referral_message,
116
+ consent_status=consent_status,
117
+ )
118
+
119
+ def execute_yellow_flow(
120
+ self,
121
+ message: str,
122
+ confidence: float,
123
+ indicators: List[DistressIndicator],
124
+ patient_responses: Optional[List[str]] = None
125
+ ) -> ClassificationFlowResult:
126
+ """
127
+ Execute YELLOW flag flow.
128
+
129
+ Generates explanation and follow-up questions.
130
+ Handles re-evaluation based on responses.
131
+
132
+ Args:
133
+ message: Original patient message
134
+ confidence: Classification confidence
135
+ indicators: List of detected distress indicators
136
+ patient_responses: Simulated patient responses to follow-up questions
137
+
138
+ Returns:
139
+ ClassificationFlowResult with YELLOW flow content
140
+ """
141
+ # Generate explanation
142
+ explanation = self.content_generator.generate_explanation(
143
+ "yellow", indicators, message
144
+ )
145
+
146
+ # Generate follow-up questions
147
+ follow_up_questions = self.content_generator.generate_follow_up_questions(
148
+ message, indicators
149
+ )
150
+
151
+ # Handle patient responses and re-evaluation
152
+ re_evaluation_result = None
153
+ if patient_responses is None:
154
+ # Simulate patient responses for demo/testing
155
+ patient_responses = self._simulate_patient_responses(follow_up_questions)
156
+
157
+ if patient_responses:
158
+ re_evaluation_result = self._evaluate_patient_responses(patient_responses)
159
+
160
+ return ClassificationFlowResult(
161
+ classification="yellow",
162
+ confidence=confidence,
163
+ indicators=indicators,
164
+ explanation=explanation,
165
+ follow_up_questions=follow_up_questions,
166
+ patient_responses=patient_responses,
167
+ re_evaluation_result=re_evaluation_result,
168
+ )
169
+
170
+ def execute_green_flow(
171
+ self,
172
+ message: str,
173
+ confidence: float,
174
+ indicators: List[DistressIndicator]
175
+ ) -> ClassificationFlowResult:
176
+ """
177
+ Execute GREEN flag flow.
178
+
179
+ Generates explanation for no indicators.
180
+ Displays "No further steps" status.
181
+
182
+ Args:
183
+ message: Original patient message
184
+ confidence: Classification confidence
185
+ indicators: List of detected distress indicators (should be empty)
186
+
187
+ Returns:
188
+ ClassificationFlowResult with GREEN flow content
189
+ """
190
+ # Generate explanation
191
+ explanation = self.content_generator.generate_explanation(
192
+ "green", indicators, message
193
+ )
194
+
195
+ return ClassificationFlowResult(
196
+ classification="green",
197
+ confidence=confidence,
198
+ indicators=indicators,
199
+ explanation=explanation,
200
+ )
201
+
202
+ def escalate_yellow_to_red(
203
+ self,
204
+ yellow_result: ClassificationFlowResult,
205
+ message: str
206
+ ) -> ClassificationFlowResult:
207
+ """
208
+ Escalate YELLOW classification to RED based on patient responses.
209
+
210
+ Args:
211
+ yellow_result: Original YELLOW classification result
212
+ message: Original patient message
213
+
214
+ Returns:
215
+ New RED ClassificationFlowResult
216
+ """
217
+ # Create new RED indicators based on escalation
218
+ escalated_indicators = yellow_result.indicators.copy()
219
+
220
+ # Execute RED flow with escalated indicators
221
+ return self.execute_red_flow(
222
+ message,
223
+ confidence=0.85, # High confidence for escalated case
224
+ indicators=escalated_indicators,
225
+ consent_status="granted" # Assume consent for escalated cases
226
+ )
227
+
228
+ def downgrade_yellow_to_green(
229
+ self,
230
+ yellow_result: ClassificationFlowResult,
231
+ message: str
232
+ ) -> ClassificationFlowResult:
233
+ """
234
+ Downgrade YELLOW classification to GREEN based on patient responses.
235
+
236
+ Args:
237
+ yellow_result: Original YELLOW classification result
238
+ message: Original patient message
239
+
240
+ Returns:
241
+ New GREEN ClassificationFlowResult
242
+ """
243
+ # Execute GREEN flow
244
+ return self.execute_green_flow(
245
+ message,
246
+ confidence=0.80, # High confidence for downgraded case
247
+ indicators=[] # No indicators for GREEN
248
+ )
249
+
250
+ def _simulate_patient_responses(
251
+ self,
252
+ questions: List[FollowUpQuestion]
253
+ ) -> List[str]:
254
+ """
255
+ Simulate patient responses to follow-up questions for demo/testing.
256
+
257
+ Args:
258
+ questions: List of follow-up questions
259
+
260
+ Returns:
261
+ List of simulated patient responses
262
+ """
263
+ # Simple simulation - in real implementation, these would come from user
264
+ responses = [
265
+ "I've been feeling okay, just worried about my treatment.",
266
+ "I have my family to talk to, but sometimes I feel alone.",
267
+ "I think I'd like to talk to someone from the care team."
268
+ ]
269
+
270
+ # Return responses matching the number of questions
271
+ return responses[:len(questions)]
272
+
273
+ def _evaluate_patient_responses(
274
+ self,
275
+ responses: List[str]
276
+ ) -> Optional[str]:
277
+ """
278
+ Evaluate patient responses to determine if escalation or downgrade needed.
279
+
280
+ Args:
281
+ responses: List of patient responses
282
+
283
+ Returns:
284
+ "red" for escalation, "green" for downgrade, None for no change
285
+ """
286
+ # Simple evaluation logic for demo/testing
287
+ # In real implementation, this would use more sophisticated analysis
288
+
289
+ combined_responses = " ".join(responses).lower()
290
+
291
+ # Check for escalation keywords (distress indicators)
292
+ escalation_keywords = [
293
+ "hopeless", "worthless", "can't go on", "want to die",
294
+ "no point", "give up", "unbearable", "can't take it"
295
+ ]
296
+
297
+ if any(keyword in combined_responses for keyword in escalation_keywords):
298
+ return "red"
299
+
300
+ # Check for downgrade keywords (positive indicators)
301
+ downgrade_keywords = [
302
+ "feeling better", "okay", "fine", "good support",
303
+ "not worried", "managing well", "hopeful"
304
+ ]
305
+
306
+ if any(keyword in combined_responses for keyword in downgrade_keywords):
307
+ return "green"
308
+
309
+ # No clear indication - remain YELLOW
310
+ return None
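The keyword check in `_evaluate_patient_responses` uses plain substring tests, so a short keyword such as "fine" would also match inside "define". One possible tightening, shown here only as a sketch, is to match whole words or phrases with a regular expression. The function names and the shortened keyword lists below are assumptions for illustration and are not part of the commit.

```python
import re
from typing import List, Optional

# Shortened, illustrative keyword lists (the commit defines longer ones)
ESCALATION_KEYWORDS = ["hopeless", "want to die", "can't go on"]
DOWNGRADE_KEYWORDS = ["okay", "fine", "hopeful"]


def contains_phrase(text: str, phrases: List[str]) -> bool:
    """Return True if any phrase occurs in text bounded by non-word characters."""
    for phrase in phrases:
        # Lookarounds reject matches embedded in a longer word ("fine" in "define")
        pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
        if re.search(pattern, text):
            return True
    return False


def evaluate_responses(responses: List[str]) -> Optional[str]:
    """Return "red" to escalate, "green" to downgrade, or None to stay YELLOW."""
    combined = " ".join(responses).lower()
    # Escalation is checked first, so distress language wins over reassurance
    if contains_phrase(combined, ESCALATION_KEYWORDS):
        return "red"
    if contains_phrase(combined, DOWNGRADE_KEYWORDS):
        return "green"
    return None
```

Checking escalation before downgrade mirrors the order used in the diff: a response that mixes "hopeless" with "okay" still escalates.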
src/core/content_generator.py ADDED
@@ -0,0 +1,346 @@
1
+ # content_generator.py
2
+ """
3
+ Content Generation Service for Chaplain Feedback System.
4
+
5
+ Generates explanations, permission checks, referral messages, and follow-up questions
6
+ for RED/YELLOW/GREEN classification flows.
7
+ """
8
+
9
+ from typing import List
10
+ import uuid
11
+
12
+ from src.core.chaplain_models import (
13
+ DistressIndicator,
14
+ FollowUpQuestion,
15
+ )
16
+
17
+
18
+ class ContentGenerator:
19
+ """
20
+ Generates content for classification flows.
21
+
22
+ Provides methods to generate:
23
+ - Explanations for RED/YELLOW/GREEN classifications
24
+ - Permission check messages for RED cases
25
+ - Referral messages for spiritual care team
26
+ - Follow-up questions for YELLOW cases
27
+ """
28
+
29
+ def generate_explanation(
30
+ self,
31
+ classification: str,
32
+ indicators: List[DistressIndicator],
33
+ message: str
34
+ ) -> str:
35
+ """
36
+ Generate explanation for classification.
37
+
38
+ Args:
39
+ classification: "red", "yellow", or "green"
40
+ indicators: List of detected distress indicators
41
+ message: Original patient message
42
+
43
+ Returns:
44
+ Explanation text referencing distress indicators
45
+ """
46
+ if classification == "red":
47
+ return self._generate_red_explanation(indicators, message)
48
+ elif classification == "yellow":
49
+ return self._generate_yellow_explanation(indicators, message)
50
+ else:
51
+ return self._generate_green_explanation(message)
52
+
53
+ def _generate_red_explanation(
54
+ self,
55
+ indicators: List[DistressIndicator],
56
+ message: str
57
+ ) -> str:
58
+ """Generate explanation for RED classification."""
59
+ explanation_parts = [
60
+ "This message has been classified as RED FLAG (severe spiritual distress) "
61
+ "requiring immediate attention from the spiritual care team."
62
+ ]
63
+
64
+ if indicators:
65
+ explanation_parts.append("\n\nDetected distress indicators:")
66
+ for indicator in indicators:
67
+ indicator_line = (
68
+ f"\n- {indicator.subcategory} ({indicator.category}): "
69
+ f"'{indicator.indicator_text}' "
70
+ f"[Ref: {indicator.definition_reference}, Confidence: {indicator.confidence:.0%}]"
71
+ )
72
+ explanation_parts.append(indicator_line)
73
+
74
+ explanation_parts.append(
75
+ "\n\nThis classification indicates severe spiritual distress that requires "
76
+ "immediate referral to the spiritual health team. The indicators suggest "
77
+ "the patient may benefit from professional spiritual care support."
78
+ )
79
+
80
+ return "".join(explanation_parts)
81
+
82
+ def _generate_yellow_explanation(
83
+ self,
84
+ indicators: List[DistressIndicator],
85
+ message: str
86
+ ) -> str:
87
+ """Generate explanation for YELLOW classification."""
88
+ explanation_parts = [
89
+ "This message has been classified as YELLOW FLAG (potential spiritual distress) "
90
+ "requiring clarifying questions."
91
+ ]
92
+
93
+ if indicators:
94
+ explanation_parts.append("\n\nDetected potential distress indicators:")
95
+ for indicator in indicators:
96
+ indicator_line = (
97
+ f"\n- {indicator.subcategory} ({indicator.category}): "
98
+ f"'{indicator.indicator_text}' "
99
+ f"[Ref: {indicator.definition_reference}, Confidence: {indicator.confidence:.0%}]"
100
+ )
101
+ explanation_parts.append(indicator_line)
102
+
103
+ # Explain why not RED
104
+ explanation_parts.append(
105
+ "\n\nWhy not RED: The indicators detected suggest potential distress but "
106
+ "do not meet the threshold for severe spiritual distress requiring immediate "
107
+ "referral. Further clarification is needed to determine the severity."
108
+ )
109
+
110
+ # Explain why not GREEN
111
+ explanation_parts.append(
112
+ "\n\nWhy not GREEN: The message contains indicators that suggest possible "
113
+ "spiritual concerns that warrant follow-up questions to better understand "
114
+ "the patient's spiritual state."
115
+ )
116
+
117
+ return "".join(explanation_parts)
118
+
119
+ def _generate_green_explanation(self, message: str) -> str:
120
+ """Generate explanation for GREEN classification."""
121
+ explanation_parts = [
122
+ "This message has been classified as GREEN (no spiritual distress indicators detected)."
123
+ ]
124
+
125
+ explanation_parts.append(
126
+ "\n\nNo spiritual distress indicators were found in this message. "
127
+ "The content does not suggest spiritual concerns that require follow-up "
128
+ "or referral to the spiritual care team."
129
+ )
130
+
131
+ # Explain why not RED or YELLOW
132
+ explanation_parts.append(
133
+ "\n\nWhy not RED or YELLOW: The message does not contain expressions of "
134
+ "spiritual distress, grief, existential questioning, or other indicators "
135
+ "defined in the spiritual distress definitions document."
136
+ )
137
+
138
+ explanation_parts.append("\n\nNo further steps required.")
139
+
140
+ return "".join(explanation_parts)
141
+
142
+
143
+ def generate_permission_check(
144
+ self,
145
+ indicators: List[DistressIndicator]
146
+ ) -> str:
147
+ """
148
+ Generate patient consent request message for RED cases.
149
+
150
+ Args:
151
+ indicators: List of detected distress indicators
152
+
153
+ Returns:
154
+ Permission check message with spiritual support and consent language
155
+ """
156
+ message_parts = [
157
+ "We noticed some things in your message that suggest you might be going "
158
+ "through a difficult time spiritually or emotionally."
159
+ ]
160
+
161
+ message_parts.append(
162
+ "\n\nOur hospital has a spiritual care team that provides support to "
163
+ "patients who are experiencing spiritual distress. They can offer "
164
+ "compassionate listening, spiritual guidance, and emotional support."
165
+ )
166
+
167
+ message_parts.append(
168
+ "\n\nWould you like us to connect you with a member of our spiritual "
169
+ "care team? Your consent is important to us, and this referral is "
170
+ "entirely voluntary."
171
+ )
172
+
173
+ message_parts.append(
174
+ "\n\nPlease let us know if you would like spiritual support, or if you "
175
+ "prefer not to be contacted by the spiritual care team at this time."
176
+ )
177
+
178
+ return "".join(message_parts)
179
+
180
+ def generate_referral_message(
181
+ self,
182
+ message: str,
183
+ indicators: List[DistressIndicator],
184
+ explanation: str
185
+ ) -> str:
186
+ """
187
+ Generate referral message for spiritual care team.
188
+
189
+ Args:
190
+ message: Original patient message
191
+ indicators: List of detected distress indicators
192
+ explanation: Classification explanation
193
+
194
+ Returns:
195
+ Referral message with background, indicators, and justification
196
+ """
197
+ referral_parts = ["SPIRITUAL CARE TEAM REFERRAL"]
198
+ referral_parts.append("\n" + "=" * 40)
199
+
200
+ # Background section
201
+ referral_parts.append("\n\nBACKGROUND:")
202
+ referral_parts.append(
203
+ f"\nPatient message excerpt: \"{message[:200]}{'...' if len(message) > 200 else ''}\""
204
+ )
205
+
206
+ # Indicators section
207
+ referral_parts.append("\n\nINDICATORS DETECTED:")
208
+ if indicators:
209
+ for indicator in indicators:
210
+ referral_parts.append(
211
+ f"\n- {indicator.subcategory} ({indicator.category})"
212
+ )
213
+ referral_parts.append(f"\n Severity: {indicator.severity.upper()}")
214
+ referral_parts.append(f"\n Reference: {indicator.definition_reference}")
215
+ referral_parts.append(f"\n Confidence: {indicator.confidence:.0%}")
216
+ referral_parts.append(f"\n Text: \"{indicator.indicator_text}\"")
217
+ else:
218
+ referral_parts.append("\n- No specific indicators (general distress detected)")
219
+
220
+ # Justification section
221
+ referral_parts.append("\n\nJUSTIFICATION FOR RED FLAG:")
222
+ referral_parts.append(
223
+ "\nThis patient has been flagged for immediate spiritual care attention "
224
+ "based on the severity of distress indicators detected in their message. "
225
+ )
226
+
227
+ if indicators:
228
+ red_indicators = [i for i in indicators if i.severity == "red"]
229
+ if red_indicators:
230
+ referral_parts.append(
231
+ f"\n\nThe following severe (RED) indicators were identified: "
232
+ f"{', '.join(i.subcategory for i in red_indicators)}."
233
+ )
234
+
235
+ referral_parts.append(
236
+ "\n\nRecommended action: Please reach out to this patient at your "
237
+ "earliest convenience to provide spiritual support and assessment."
238
+ )
239
+
240
+ referral_parts.append("\n\n" + "=" * 40)
241
+ referral_parts.append("\nPatient has provided consent for this referral.")
242
+
243
+ return "".join(referral_parts)
244
+
245
+ def generate_follow_up_questions(
246
+ self,
247
+ message: str,
248
+ indicators: List[DistressIndicator]
249
+ ) -> List[FollowUpQuestion]:
250
+ """
251
+ Generate 2-3 clarifying questions for YELLOW cases.
252
+
253
+ Each question contains 1-2 short, sensitive clarifying questions
254
+ with a purpose explanation.
255
+
256
+ Args:
257
+ message: Original patient message
258
+ indicators: List of detected distress indicators
259
+
260
+ Returns:
261
+ List of 2-3 FollowUpQuestion instances
262
+ """
263
+ questions = []
264
+
265
+ # Generate questions based on indicator categories
266
+ categories = set(i.category for i in indicators) if indicators else set()
267
+
268
+ # Question 1: General well-being check
269
+ questions.append(FollowUpQuestion(
270
+ question_id=str(uuid.uuid4())[:8],
271
+ question_text=(
272
+ "How have you been feeling overall lately? "
273
+ "Is there anything specific that's been on your mind?"
274
+ ),
275
+ purpose=(
276
+ "To understand the patient's general emotional and spiritual state "
277
+ "and identify any underlying concerns."
278
+ )
279
+ ))
280
+
281
+ # Question 2: Based on detected categories or general spiritual inquiry
282
+ if "Grief" in categories:
283
+ questions.append(FollowUpQuestion(
284
+ question_id=str(uuid.uuid4())[:8],
285
+ question_text=(
286
+ "It sounds like you may be dealing with some difficult feelings. "
287
+ "Would you like to share more about what you're experiencing?"
288
+ ),
289
+ purpose=(
290
+ "To explore potential grief-related concerns and provide "
291
+ "opportunity for the patient to express their feelings."
292
+ )
293
+ ))
294
+ elif "Existential" in categories:
295
+ questions.append(FollowUpQuestion(
296
+ question_id=str(uuid.uuid4())[:8],
297
+ question_text=(
298
+ "Sometimes when we're going through health challenges, we find "
299
+ "ourselves thinking about bigger questions. Is that something "
300
+ "you'd like to talk about?"
301
+ ),
302
+ purpose=(
303
+ "To explore existential concerns and meaning-making in the "
304
+ "context of the patient's health situation."
305
+ )
306
+ ))
307
+ elif "Spiritual" in categories:
308
+ questions.append(FollowUpQuestion(
309
+ question_id=str(uuid.uuid4())[:8],
310
+ question_text=(
311
+ "Do you have any spiritual or religious practices that are "
312
+ "important to you? How has your current situation affected them?"
313
+ ),
314
+ purpose=(
315
+ "To understand the patient's spiritual background and how "
316
+ "their current situation may be impacting their spiritual life."
317
+ )
318
+ ))
319
+ else:
320
+ questions.append(FollowUpQuestion(
321
+ question_id=str(uuid.uuid4())[:8],
322
+ question_text=(
323
+ "Is there anything that's been particularly challenging for you "
324
+ "during this time? What kind of support would be most helpful?"
325
+ ),
326
+ purpose=(
327
+ "To identify specific challenges and understand what type of "
328
+ "support the patient might need."
329
+ )
330
+ ))
331
+
332
+ # Question 3: Support and resources
333
+ questions.append(FollowUpQuestion(
334
+ question_id=str(uuid.uuid4())[:8],
335
+ question_text=(
336
+ "Do you have people in your life you can talk to about these things? "
337
+ "Would you be interested in speaking with someone from our care team?"
338
+ ),
339
+ purpose=(
340
+ "To assess the patient's support system and gauge interest in "
341
+ "additional spiritual care resources."
342
+ )
343
+ ))
344
+
345
+ # Cap the result at three questions (exactly three are built above)
346
+ return questions[:3]
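Two small formatting idioms recur in `ContentGenerator`: the referral message truncates the patient message to 200 characters with a trailing ellipsis, and confidences are rendered with the `:.0%` format spec. Pulled out as helpers, they might look like the sketch below. The helper names `excerpt` and `format_confidence` are hypothetical; the commit inlines both expressions.

```python
def excerpt(message: str, limit: int = 200) -> str:
    """Return at most `limit` characters, appending '...' only when text was cut."""
    if len(message) <= limit:
        return message
    return message[:limit] + "..."


def format_confidence(confidence: float) -> str:
    """Render a 0.0-1.0 confidence as a whole-number percentage string."""
    # The % presentation type multiplies by 100 and appends a percent sign
    return f"{confidence:.0%}"
```

Naming the helpers would keep the 200-character limit in one place if the referral format changes.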
src/core/error_pattern_analyzer.py ADDED
@@ -0,0 +1,283 @@
1
+ # error_pattern_analyzer.py
2
+ """
3
+ Error Pattern Analyzer for Chaplain Feedback System.
4
+
5
+ Analyzes tagging records to identify error patterns, calculate subcategory
6
+ breakdowns, and provide insights into classifier performance.
7
+ """
8
+
9
+ from typing import List, Dict, Any
10
+ from collections import Counter
11
+
12
+ from .chaplain_models import (
13
+ TaggingRecord,
14
+ CLASSIFICATION_SUBCATEGORIES,
15
+ QUESTION_ISSUE_TYPES,
16
+ REFERRAL_ISSUE_TYPES,
17
+ )
18
+
19
+
20
+ class ErrorPatternAnalyzer:
21
+ """
22
+ Analyzes error patterns from tagging records.
23
+
24
+ Provides methods to calculate subcategory breakdowns, identify common
25
+ error patterns, and generate statistics for session analysis.
26
+ """
27
+
28
+ def __init__(self):
29
+ """Initialize the error pattern analyzer."""
30
+ pass
31
+
32
+ def analyze_classification_errors(
33
+ self,
34
+ records: List[TaggingRecord]
35
+ ) -> Dict[str, int]:
36
+ """
37
+ Get breakdown of classification error subcategories.
38
+
39
+ Counts how many times each classification error subcategory appears
40
+ in the provided records.
41
+
42
+ Args:
43
+ records: List of TaggingRecord instances to analyze
44
+
45
+ Returns:
46
+ Dictionary mapping subcategory names to counts
47
+ Example: {
48
+ "missed_indicators": 5,
49
+ "false_positive": 2,
50
+ "missed_distress": 3
51
+ }
52
+ """
53
+ subcategory_counts = {subcategory: 0 for subcategory in CLASSIFICATION_SUBCATEGORIES}
54
+
55
+ for record in records:
56
+ # Only count records where classification is incorrect
57
+ if not record.is_classification_correct and record.classification_subcategory:
58
+ subcategory = record.classification_subcategory
59
+ if subcategory in subcategory_counts:
60
+ subcategory_counts[subcategory] += 1
61
+
62
+ return subcategory_counts
63
+
64
+ def analyze_question_issues(
65
+ self,
66
+ records: List[TaggingRecord]
67
+ ) -> Dict[str, int]:
68
+ """
69
+ Get breakdown of follow-up question issues by subcategory.
70
+
71
+ Counts how many times each question issue type appears across
72
+ all records (supporting multi-select).
73
+
74
+ Args:
75
+ records: List of TaggingRecord instances to analyze
76
+
77
+ Returns:
78
+ Dictionary mapping issue type names to counts
79
+ Example: {
80
+ "inappropriate": 3,
81
+ "not_relevant": 2,
82
+ "too_leading": 1,
83
+ "unclear": 0,
84
+ "tone_clinical": 2,
85
+ "tone_religious": 0,
86
+ "tone_casual": 1
87
+ }
88
+ """
89
+ issue_counts = {issue_type: 0 for issue_type in QUESTION_ISSUE_TYPES}
90
+
91
+ for record in records:
92
+ # Count each issue type in the multi-select list
93
+ for issue in record.question_issues:
94
+ if issue in issue_counts:
95
+ issue_counts[issue] += 1
96
+
97
+ return issue_counts
98
+
99
+ def analyze_referral_issues(
100
+ self,
101
+ records: List[TaggingRecord]
102
+ ) -> Dict[str, int]:
103
+ """
104
+ Get breakdown of referral message issues by subcategory.
105
+
106
+ Counts how many times each referral issue type appears across
107
+ all records (supporting multi-select).
108
+
109
+ Args:
110
+ records: List of TaggingRecord instances to analyze
111
+
112
+ Returns:
113
+ Dictionary mapping issue type names to counts
114
+ Example: {
115
+ "incomplete_summary": 2,
116
+ "misrepresentation": 1,
117
+ "inappropriate_tone": 3
118
+ }
119
+ """
120
+ issue_counts = {issue_type: 0 for issue_type in REFERRAL_ISSUE_TYPES}
121
+
122
+ for record in records:
123
+ # Count each issue type in the multi-select list
124
+ for issue in record.referral_issues:
125
+ if issue in issue_counts:
126
+ issue_counts[issue] += 1
127
+
128
+ return issue_counts
129
+
130
+ def analyze_indicator_issues(
131
+ self,
132
+ records: List[TaggingRecord]
133
+ ) -> Dict[str, int]:
134
+ """
135
+ Get breakdown of commonly missed/incorrectly identified indicators.
136
+
137
+ Counts how many times each indicator ID appears in the indicator_issues
138
+ lists across all records.
139
+
140
+ Args:
141
+ records: List of TaggingRecord instances to analyze
142
+
143
+ Returns:
144
+ Dictionary mapping indicator IDs to counts
145
+ Example: {
146
+ "excessive_guilt": 3,
147
+ "crying": 2,
148
+ "anxiety": 1
149
+ }
150
+ """
151
+ indicator_counts: Dict[str, int] = {}
152
+
153
+ for record in records:
154
+ # Count each indicator in the list
155
+ for indicator_id in record.indicator_issues:
156
+ if indicator_id not in indicator_counts:
157
+ indicator_counts[indicator_id] = 0
158
+ indicator_counts[indicator_id] += 1
159
+
160
+ return indicator_counts
161
+
162
+ def get_common_patterns(
163
+ self,
164
+ records: List[TaggingRecord]
165
+ ) -> List[str]:
166
+ """
167
+ Get list of common error patterns in plain language.
168
+
169
+ Analyzes all error types and returns human-readable descriptions
170
+ of the most common patterns found in the records.
171
+
172
+ Args:
173
+ records: List of TaggingRecord instances to analyze
174
+
175
+ Returns:
176
+ List of plain-language descriptions of common patterns
177
+ Example: [
178
+ "Most common classification error: missed_indicators (5 occurrences)",
179
+ "Most common question issue: inappropriate (3 occurrences)",
180
+ "Most common referral issue: inappropriate_tone (3 occurrences)"
181
+ ]
182
+ """
183
+ patterns = []
184
+
185
+ # Analyze classification errors
186
+ classification_errors = self.analyze_classification_errors(records)
187
+ if any(classification_errors.values()):
188
+ max_error = max(classification_errors.items(), key=lambda x: x[1])
189
+ if max_error[1] > 0:
190
+ patterns.append(
191
+ f"Most common classification error: {max_error[0]} ({max_error[1]} occurrences)"
192
+ )
193
+
194
+ # Analyze question issues
195
+ question_issues = self.analyze_question_issues(records)
196
+ if any(question_issues.values()):
197
+ max_issue = max(question_issues.items(), key=lambda x: x[1])
198
+ if max_issue[1] > 0:
199
+ patterns.append(
200
+ f"Most common question issue: {max_issue[0]} ({max_issue[1]} occurrences)"
201
+ )
202
+
203
+ # Analyze referral issues
204
+ referral_issues = self.analyze_referral_issues(records)
205
+ if any(referral_issues.values()):
206
+ max_issue = max(referral_issues.items(), key=lambda x: x[1])
207
+ if max_issue[1] > 0:
208
+ patterns.append(
209
+ f"Most common referral issue: {max_issue[0]} ({max_issue[1]} occurrences)"
210
+ )
211
+
212
+ # Analyze indicator issues
213
+ indicator_issues = self.analyze_indicator_issues(records)
214
+ if indicator_issues:
215
+ max_indicator = max(indicator_issues.items(), key=lambda x: x[1])
216
+ if max_indicator[1] > 0:
217
+ patterns.append(
218
+ f"Most commonly missed/incorrect indicator: {max_indicator[0]} ({max_indicator[1]} occurrences)"
219
+ )
220
+
221
+ return patterns
222
+
223
+ def get_statistics_summary(
224
+ self,
225
+ records: List[TaggingRecord]
226
+ ) -> Dict[str, Any]:
227
+ """
228
+ Get comprehensive statistics summary for a session.
229
+
230
+ Combines all analysis methods into a single summary dictionary
231
+ suitable for display or export.
232
+
233
+ Args:
234
+ records: List of TaggingRecord instances to analyze
235
+
236
+ Returns:
237
+ Dictionary containing all statistics
238
+ Example: {
239
+ "total_records": 10,
240
+ "classification_errors": {...},
241
+ "question_issues": {...},
242
+ "referral_issues": {...},
243
+ "indicator_issues": {...},
244
+ "common_patterns": [...]
245
+ }
246
+ """
247
+ return {
248
+ "total_records": len(records),
249
+ "classification_errors": self.analyze_classification_errors(records),
250
+ "question_issues": self.analyze_question_issues(records),
251
+ "referral_issues": self.analyze_referral_issues(records),
252
+ "indicator_issues": self.analyze_indicator_issues(records),
253
+ "common_patterns": self.get_common_patterns(records),
254
+ }
255
+
256
+ def get_error_patterns_grouped_by_type(
257
+ self,
258
+ records: List[TaggingRecord]
259
+ ) -> Dict[str, Dict[str, int]]:
260
+ """
261
+ Get error patterns grouped by error type.
262
+
263
+ Returns all error types grouped together with their frequency counts,
264
+ suitable for display in error pattern summaries.
265
+
266
+ Args:
267
+ records: List of TaggingRecord instances to analyze
268
+
269
+ Returns:
270
+ Dictionary with error types as keys and subcategory breakdowns as values
271
+ Example: {
272
+ "classification": {"missed_indicators": 5, "false_positive": 2, ...},
273
+ "question": {"inappropriate": 3, "not_relevant": 2, ...},
274
+ "referral": {"incomplete_summary": 2, "misrepresentation": 1, ...},
275
+ "indicator": {"excessive_guilt": 3, "crying": 2, ...}
276
+ }
277
+ """
278
+ return {
279
+ "classification": self.analyze_classification_errors(records),
280
+ "question": self.analyze_question_issues(records),
281
+ "referral": self.analyze_referral_issues(records),
282
+ "indicator": self.analyze_indicator_issues(records),
283
+ }
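The module imports `Counter` but the analysis methods count with hand-written loops. A `Counter`-based version of the multi-select counting is sketched below. It assumes the issue-type constants are iterables of strings, which matches how the diff iterates over them but is not confirmed by the definitions shown; `count_issues` and `KNOWN_ISSUE_TYPES` are hypothetical names.

```python
from collections import Counter
from typing import Dict, Iterable, List

# Hypothetical subset of issue types; the real list lives in chaplain_models
KNOWN_ISSUE_TYPES = ["inappropriate", "not_relevant", "unclear"]


def count_issues(issue_lists: Iterable[List[str]], known: List[str]) -> Dict[str, int]:
    """Count multi-select issues across records.

    Known types always appear in the result (with 0 if unseen); values not in
    `known` are ignored, matching the `if issue in issue_counts` guard in the diff.
    """
    counts = Counter(issue for issues in issue_lists for issue in issues)
    # Counter returns 0 for missing keys, so unseen known types come out as 0
    return {issue_type: counts[issue_type] for issue_type in known}


result = count_issues(
    [["inappropriate", "unclear"], ["inappropriate"], ["not_a_known_type"], []],
    KNOWN_ISSUE_TYPES,
)
```

The dictionary comprehension preserves the order of `known`, so exported breakdowns list issue types in a stable order.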
src/core/interaction_logger.py ADDED
@@ -0,0 +1,258 @@
1
+ # interaction_logger.py
2
+ """
3
+ Interaction logging service for Chaplain Feedback System.
4
+
5
+ Logs all interaction steps with input/output and supports approval status updates.
6
+ """
7
+
8
+ import uuid
9
+ from typing import List, Optional, Dict, Any
10
+ from datetime import datetime
11
+
12
+ from src.core.chaplain_models import (
13
+ InteractionStepLog,
14
+ TaggingRecord,
15
+ )
16
+
17
+
18
+ class InteractionLogger:
19
+ """
20
+ Logs all interaction steps in the chaplain feedback system.
21
+
22
+ Records input/output for each step and supports updating approval status
23
+ with tagging data.
24
+ """
25
+
26
+ def __init__(self):
27
+ """Initialize the interaction logger."""
28
+ # In-memory storage of logs (can be extended to persist to database/file)
29
+ self._logs: Dict[str, InteractionStepLog] = {}
30
+ self._session_logs: Dict[str, List[str]] = {} # session_id -> list of step_ids
31
+
32
+ def log_step(
33
+ self,
34
+ session_id: str,
35
+ message_id: str,
36
+ step_type: str,
37
+ input_text: str,
38
+ model_output: str,
39
+ ) -> str:
40
+ """
41
+ Log an interaction step.
42
+
43
+ Args:
44
+ session_id: ID of the verification session
45
+ message_id: ID of the message being processed
46
+ step_type: Type of step (classification, explanation, permission_check, etc.)
47
+ input_text: Input text for this step
48
+ model_output: Output from the model/system for this step
49
+
50
+ Returns:
51
+ step_id: Unique identifier for this logged step
52
+
53
+ Raises:
54
+ ValueError: If step_type is invalid
55
+ """
56
+ step_id = str(uuid.uuid4())
57
+
58
+ # Create log entry
59
+ log_entry = InteractionStepLog(
60
+ step_id=step_id,
61
+ session_id=session_id,
62
+ message_id=message_id,
63
+ step_type=step_type,
64
+ input_text=input_text,
65
+ model_output=model_output,
66
+ approval_status=None,
67
+ tagging_data=None,
68
+ timestamp=datetime.now(),
69
+ )
70
+
71
+ # Store log entry
72
+ self._logs[step_id] = log_entry
73
+
74
+ # Track logs by session
75
+ if session_id not in self._session_logs:
76
+ self._session_logs[session_id] = []
77
+ self._session_logs[session_id].append(step_id)
78
+
79
+ return step_id
80
+
81
+ def update_approval(
82
+ self,
83
+ step_id: str,
84
+ approval_status: str,
85
+ tagging_data: Optional[TaggingRecord] = None,
86
+ ) -> None:
87
+ """
88
+ Update a step with approval status and optional tagging data.
89
+
90
+ Args:
91
+ step_id: ID of the step to update
92
+ approval_status: "approved" or "disapproved"
93
+ tagging_data: Optional TaggingRecord with feedback details
94
+
95
+ Raises:
96
+ ValueError: If step_id not found or approval_status is invalid
97
+ """
98
+ if step_id not in self._logs:
99
+ raise ValueError(f"Step {step_id} not found")
100
+
101
+ if approval_status not in ("approved", "disapproved"):
102
+ raise ValueError(f"Invalid approval_status: {approval_status}")
103
+
104
+ log_entry = self._logs[step_id]
105
+ log_entry.approval_status = approval_status
106
+ log_entry.tagging_data = tagging_data
107
+
108
+ def get_step(self, step_id: str) -> Optional[InteractionStepLog]:
109
+ """
110
+ Get a specific logged step.
111
+
112
+ Args:
113
+ step_id: ID of the step to retrieve
114
+
115
+ Returns:
116
+ InteractionStepLog if found, None otherwise
117
+ """
118
+ return self._logs.get(step_id)
119
+
120
+ def get_session_logs(self, session_id: str) -> List[InteractionStepLog]:
121
+ """
122
+ Get all logs for a session.
123
+
124
+ Args:
125
+ session_id: ID of the session
126
+
127
+ Returns:
128
+ List of InteractionStepLog entries for the session, in order
129
+ """
130
+ step_ids = self._session_logs.get(session_id, [])
131
+ return [self._logs[step_id] for step_id in step_ids if step_id in self._logs]
132
+
133
+ def get_session_logs_by_type(
134
+ self,
135
+ session_id: str,
136
+ step_type: str,
137
+ ) -> List[InteractionStepLog]:
138
+ """
139
+ Get all logs of a specific type for a session.
140
+
141
+ Args:
142
+ session_id: ID of the session
143
+ step_type: Type of step to filter by
144
+
145
+ Returns:
146
+ List of InteractionStepLog entries matching the type
147
+ """
148
+ all_logs = self.get_session_logs(session_id)
149
+ return [log for log in all_logs if log.step_type == step_type]
150
+
151
+ def get_message_logs(self, message_id: str) -> List[InteractionStepLog]:
152
+ """
153
+ Get all logs for a specific message across all sessions.
154
+
155
+ Args:
156
+ message_id: ID of the message
157
+
158
+ Returns:
159
+ List of InteractionStepLog entries for the message
160
+ """
161
+ return [log for log in self._logs.values() if log.message_id == message_id]
162
+
163
+ def get_unapproved_steps(self, session_id: str) -> List[InteractionStepLog]:
164
+ """
165
+ Get all steps in a session that haven't been approved/disapproved yet.
166
+
167
+ Args:
168
+ session_id: ID of the session
169
+
170
+ Returns:
171
+ List of InteractionStepLog entries with no approval status
172
+ """
173
+ session_logs = self.get_session_logs(session_id)
174
+ return [log for log in session_logs if log.approval_status is None]
175
+
176
+ def get_disapproved_steps(self, session_id: str) -> List[InteractionStepLog]:
177
+ """
178
+ Get all disapproved steps in a session.
179
+
180
+ Args:
181
+ session_id: ID of the session
182
+
183
+ Returns:
184
+ List of disapproved InteractionStepLog entries
185
+ """
186
+ session_logs = self.get_session_logs(session_id)
187
+ return [log for log in session_logs if log.approval_status == "disapproved"]
188
+
189
+ def get_session_statistics(self, session_id: str) -> Dict[str, Any]:
190
+ """
191
+ Get statistics for a session's interaction logs.
192
+
193
+ Args:
194
+ session_id: ID of the session
195
+
196
+ Returns:
197
+ Dictionary with statistics about the session's interactions
198
+ """
199
+ session_logs = self.get_session_logs(session_id)
200
+
201
+ if not session_logs:
202
+ return {
203
+ "session_id": session_id,
204
+ "total_steps": 0,
205
+ "approved_steps": 0,
206
+ "disapproved_steps": 0,
207
+ "unapproved_steps": 0,
208
+ "steps_by_type": {},
209
+ }
210
+
211
+ # Count by approval status
212
+ approved = sum(1 for log in session_logs if log.approval_status == "approved")
213
+ disapproved = sum(1 for log in session_logs if log.approval_status == "disapproved")
214
+ unapproved = sum(1 for log in session_logs if log.approval_status is None)
215
+
216
+ # Count by step type
217
+ steps_by_type = {}
218
+ for log in session_logs:
219
+ if log.step_type not in steps_by_type:
220
+ steps_by_type[log.step_type] = 0
221
+ steps_by_type[log.step_type] += 1
222
+
223
+ return {
224
+ "session_id": session_id,
225
+ "total_steps": len(session_logs),
226
+ "approved_steps": approved,
227
+ "disapproved_steps": disapproved,
228
+ "unapproved_steps": unapproved,
229
+ "steps_by_type": steps_by_type,
230
+ }
231
+
232
+ def clear_session(self, session_id: str) -> None:
233
+ """
234
+ Clear all logs for a session.
235
+
236
+ Args:
237
+ session_id: ID of the session to clear
238
+ """
239
+ step_ids = self._session_logs.get(session_id, [])
240
+ for step_id in step_ids:
241
+ if step_id in self._logs:
242
+ del self._logs[step_id]
243
+
244
+ if session_id in self._session_logs:
245
+ del self._session_logs[session_id]
246
+
247
+ def export_session_logs(self, session_id: str) -> List[Dict[str, Any]]:
248
+ """
249
+ Export all logs for a session as dictionaries.
250
+
251
+ Args:
252
+ session_id: ID of the session
253
+
254
+ Returns:
255
+ List of log entries as dictionaries
256
+ """
257
+ session_logs = self.get_session_logs(session_id)
258
+ return [log.to_dict() for log in session_logs]
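The logger above keeps two in-memory indexes: one from step ID to log entry and one from session ID to an ordered list of step IDs. The sketch below reproduces that layout in a reduced form so the bookkeeping can be run in isolation. `StepLog` and `MiniLogger` are hypothetical stand-ins; the real `InteractionStepLog` lives in `chaplain_models.py`, which is not shown here, and may carry more fields and its own validation.

```python
import uuid
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass
class StepLog:
    """Hypothetical stand-in for InteractionStepLog (real fields may differ)."""
    step_id: str
    session_id: str
    step_type: str
    approval_status: Optional[str] = None
    timestamp: datetime = field(default_factory=datetime.now)


class MiniLogger:
    """Reduced illustration of the two-index layout, not the committed class."""

    def __init__(self) -> None:
        self._logs: Dict[str, StepLog] = {}            # step_id -> entry
        self._session_logs: Dict[str, List[str]] = {}  # session_id -> step_ids

    def log_step(self, session_id: str, step_type: str) -> str:
        step_id = str(uuid.uuid4())
        self._logs[step_id] = StepLog(step_id, session_id, step_type)
        # setdefault creates the per-session list on first use.
        self._session_logs.setdefault(session_id, []).append(step_id)
        return step_id

    def update_approval(self, step_id: str, status: str) -> None:
        if step_id not in self._logs:
            raise ValueError(f"Step {step_id} not found")
        if status not in ("approved", "disapproved"):
            raise ValueError(f"Invalid approval_status: {status}")
        self._logs[step_id].approval_status = status

    def statistics(self, session_id: str) -> Dict[str, Any]:
        logs = [self._logs[s] for s in self._session_logs.get(session_id, [])]
        return {
            "total_steps": len(logs),
            "approved_steps": sum(l.approval_status == "approved" for l in logs),
            "disapproved_steps": sum(l.approval_status == "disapproved" for l in logs),
            "unapproved_steps": sum(l.approval_status is None for l in logs),
            "steps_by_type": dict(Counter(l.step_type for l in logs)),
        }


logger = MiniLogger()
first = logger.log_step("s1", "classification")
second = logger.log_step("s1", "explanation")
logger.update_approval(first, "approved")
stats = logger.statistics("s1")
print(stats)
```

Using `collections.Counter` for the per-type tally is a design choice of this sketch; the committed code builds the same dictionary with an explicit loop.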
src/core/tagging_service.py ADDED
@@ -0,0 +1,528 @@
1
+ # tagging_service.py
2
+ """
3
+ Tagging Service for Chaplain Feedback System.
4
+
5
+ Handles creation, validation, and management of tagging records
6
+ for chaplain feedback on classification results.
7
+ """
8
+
9
+ from typing import List, Optional, Dict, Any
10
+ import uuid
11
+ from datetime import datetime
12
+
13
+ from .chaplain_models import (
14
+ TaggingRecord,
15
+ CLASSIFICATION_SUBCATEGORIES,
16
+ QUESTION_ISSUE_TYPES,
17
+ REFERRAL_ISSUE_TYPES,
18
+ )
19
+
20
+
21
+ class TaggingService:
22
+ """
23
+ Service for handling tagging record creation and validation.
24
+
25
+ Supports multi-select for question and referral issues,
26
+ classification subcategories, and indicator issue tracking.
27
+ """
28
+
29
+ def __init__(self):
30
+ """Initialize the tagging service."""
31
+ self._records: Dict[str, TaggingRecord] = {}
32
+
33
+ def create_tagging_record(
34
+ self,
35
+ message_id: str,
36
+ is_classification_correct: bool = True,
37
+ classification_subcategory: Optional[str] = None,
38
+ correct_classification: Optional[str] = None,
39
+ question_issues: Optional[List[str]] = None,
40
+ question_comments: Optional[str] = None,
41
+ referral_issues: Optional[List[str]] = None,
42
+ referral_comments: Optional[str] = None,
43
+ indicator_issues: Optional[List[str]] = None,
44
+ indicator_comments: Optional[str] = None,
45
+ general_notes: str = "",
46
+ ) -> TaggingRecord:
47
+ """
48
+ Create a new tagging record with validation.
49
+
50
+ Args:
51
+ message_id: ID of the message being tagged
52
+ is_classification_correct: Whether classification is correct
53
+ classification_subcategory: Subcategory if classification is wrong
54
+ correct_classification: Correct classification if wrong
55
+ question_issues: List of question issue types (multi-select)
56
+ question_comments: Free-text comments about questions
57
+ referral_issues: List of referral issue types (multi-select)
58
+ referral_comments: Free-text comments about referral
59
+ indicator_issues: List of incorrectly identified indicator IDs
60
+ indicator_comments: Free-text comments about indicators
61
+ general_notes: General notes about the message
62
+
63
+ Returns:
64
+ Created and validated TaggingRecord
65
+
66
+ Raises:
67
+ ValueError: If validation fails
68
+ """
69
+ record_id = str(uuid.uuid4())
70
+
71
+ # Ensure lists are not None
72
+ question_issues = question_issues or []
73
+ referral_issues = referral_issues or []
74
+ indicator_issues = indicator_issues or []
75
+
76
+ # Validate inputs
77
+ self._validate_classification_tagging(
78
+ is_classification_correct,
79
+ classification_subcategory,
80
+ correct_classification
81
+ )
82
+ self._validate_question_issues(question_issues)
83
+ self._validate_referral_issues(referral_issues)
84
+
85
+ # Create record (validation happens in __post_init__)
86
+ record = TaggingRecord(
87
+ record_id=record_id,
88
+ message_id=message_id,
89
+ is_classification_correct=is_classification_correct,
90
+ classification_subcategory=classification_subcategory,
91
+ correct_classification=correct_classification,
92
+ question_issues=question_issues,
93
+ question_comments=question_comments,
94
+ referral_issues=referral_issues,
95
+ referral_comments=referral_comments,
96
+ indicator_issues=indicator_issues,
97
+ indicator_comments=indicator_comments,
98
+ general_notes=general_notes,
99
+ )
100
+
101
+ # Store record
102
+ self._records[record_id] = record
103
+ return record
104
+
105
+ def update_tagging_record(
106
+ self,
107
+ record_id: str,
108
+ **updates
109
+ ) -> TaggingRecord:
110
+ """
111
+ Update an existing tagging record.
112
+
113
+ Args:
114
+ record_id: ID of the record to update
115
+ **updates: Fields to update
116
+
117
+ Returns:
118
+ Updated TaggingRecord
119
+
120
+ Raises:
121
+ KeyError: If record not found
122
+ ValueError: If validation fails
123
+ """
124
+ if record_id not in self._records:
125
+ raise KeyError(f"Tagging record not found: {record_id}")
126
+
127
+ record = self._records[record_id]
128
+
129
+ # Create updated data
130
+ record_data = record.to_dict()
131
+ record_data.update(updates)
132
+
133
+ # Validate updates
134
+ if any(key in updates for key in ('is_classification_correct', 'classification_subcategory', 'correct_classification')):
135
+ self._validate_classification_tagging(
136
+ record_data.get('is_classification_correct', True),
137
+ record_data.get('classification_subcategory'),
138
+ record_data.get('correct_classification')
139
+ )
140
+
141
+ if 'question_issues' in updates:
142
+ self._validate_question_issues(record_data.get('question_issues', []))
143
+
144
+ if 'referral_issues' in updates:
145
+ self._validate_referral_issues(record_data.get('referral_issues', []))
146
+
147
+ # Create new record with updates
148
+ updated_record = TaggingRecord.from_dict(record_data)
149
+ self._records[record_id] = updated_record
150
+ return updated_record
151
+
152
+ def get_tagging_record(self, record_id: str) -> Optional[TaggingRecord]:
153
+ """
154
+ Get a tagging record by ID.
155
+
156
+ Args:
157
+ record_id: ID of the record to retrieve
158
+
159
+ Returns:
160
+ TaggingRecord if found, None otherwise
161
+ """
162
+ return self._records.get(record_id)
163
+
164
+ def get_records_for_message(self, message_id: str) -> List[TaggingRecord]:
165
+ """
166
+ Get all tagging records for a specific message.
167
+
168
+ Args:
169
+ message_id: ID of the message
170
+
171
+ Returns:
172
+ List of TaggingRecord instances for the message
173
+ """
174
+ return [
175
+ record for record in self._records.values()
176
+ if record.message_id == message_id
177
+ ]
178
+
179
+ def get_all_records(self) -> List[TaggingRecord]:
180
+ """
181
+ Get all tagging records.
182
+
183
+ Returns:
184
+ List of all TaggingRecord instances
185
+ """
186
+ return list(self._records.values())
187
+
188
+ def delete_tagging_record(self, record_id: str) -> bool:
189
+ """
190
+ Delete a tagging record.
191
+
192
+ Args:
193
+ record_id: ID of the record to delete
194
+
195
+ Returns:
196
+ True if deleted, False if not found
197
+ """
198
+ if record_id in self._records:
199
+ del self._records[record_id]
200
+ return True
201
+ return False
202
+
203
+ def get_available_classification_subcategories(self) -> List[str]:
204
+ """
205
+ Get list of available classification subcategories.
206
+
207
+ Returns:
208
+ List of classification subcategory options
209
+ """
210
+ return CLASSIFICATION_SUBCATEGORIES.copy()
211
+
212
+ def get_available_question_issue_types(self) -> List[str]:
213
+ """
214
+ Get list of available question issue types.
215
+
216
+ Returns:
217
+ List of question issue type options
218
+ """
219
+ return QUESTION_ISSUE_TYPES.copy()
220
+
221
+ def get_available_referral_issue_types(self) -> List[str]:
222
+ """
223
+ Get list of available referral issue types.
224
+
225
+ Returns:
226
+ List of referral issue type options
227
+ """
228
+ return REFERRAL_ISSUE_TYPES.copy()
229
+
230
+ def create_classification_correction(
231
+ self,
232
+ message_id: str,
233
+ subcategory: str,
234
+ correct_classification: str,
235
+ general_notes: str = ""
236
+ ) -> TaggingRecord:
237
+ """
238
+ Create a tagging record specifically for wrong classification.
239
+
240
+ This is a convenience method that ensures proper validation
241
+ for classification correction scenarios.
242
+
243
+ Args:
244
+ message_id: ID of the message being corrected
245
+ subcategory: Classification error subcategory
246
+ correct_classification: The correct classification
247
+ general_notes: Additional notes about the correction
248
+
249
+ Returns:
250
+ TaggingRecord for the classification correction
251
+
252
+ Raises:
253
+ ValueError: If subcategory or correct_classification is invalid
254
+ """
255
+ return self.create_tagging_record(
256
+ message_id=message_id,
257
+ is_classification_correct=False,
258
+ classification_subcategory=subcategory,
259
+ correct_classification=correct_classification,
260
+ general_notes=general_notes
261
+ )
262
+
263
+ def get_classification_subcategory_descriptions(self) -> Dict[str, str]:
264
+ """
265
+ Get descriptions for classification subcategories.
266
+
267
+ Returns:
268
+ Dictionary mapping subcategory codes to descriptions
269
+ """
270
+ return {
271
+ "missed_indicators": "Missed key distress indicators",
272
+ "false_positive": "Overly sensitive (false-positive flag)",
273
+ "missed_distress": "Not sensitive enough (missed distress)",
274
+ }
275
+
276
+ def create_question_issue_tagging(
277
+ self,
278
+ message_id: str,
279
+ question_issues: List[str],
280
+ question_comments: Optional[str] = None,
281
+ general_notes: str = ""
282
+ ) -> TaggingRecord:
283
+ """
284
+ Create a tagging record specifically for follow-up question issues.
285
+
286
+ This is a convenience method for tagging YELLOW flow question issues
287
+ with multi-select support and free-text comments.
288
+
289
+ Args:
290
+ message_id: ID of the message being tagged
291
+ question_issues: List of question issue types (multi-select)
292
+ question_comments: Free-text comments about questions
293
+ general_notes: Additional notes
294
+
295
+ Returns:
296
+ TaggingRecord for the question issues
297
+
298
+ Raises:
299
+ ValueError: If question_issues contains invalid types
300
+ """
301
+ return self.create_tagging_record(
302
+ message_id=message_id,
303
+ question_issues=question_issues,
304
+ question_comments=question_comments,
305
+ general_notes=general_notes
306
+ )
307
+
308
+ def get_question_issue_descriptions(self) -> Dict[str, str]:
309
+ """
310
+ Get descriptions for question issue types.
311
+
312
+ Returns:
313
+ Dictionary mapping issue codes to descriptions
314
+ """
315
+ return {
316
+ "inappropriate": "Question is inappropriate or intrusive",
317
+ "not_relevant": "Question is not spiritually relevant",
318
+ "too_leading": "Question is too leading or assumptive",
319
+ "unclear": "Question is unclear or confusing",
320
+ "tone_clinical": "Tone too clinical",
321
+ "tone_religious": "Tone too religious",
322
+ "tone_casual": "Tone too casual",
323
+ }
324
+
325
+ def create_referral_issue_tagging(
326
+ self,
327
+ message_id: str,
328
+ referral_issues: List[str],
329
+ referral_comments: Optional[str] = None,
330
+ general_notes: str = ""
331
+ ) -> TaggingRecord:
332
+ """
333
+ Create a tagging record specifically for referral message issues.
334
+
335
+ This is a convenience method for tagging RED flow referral message issues
336
+ with multi-select support and free-text comments.
337
+
338
+ Args:
339
+ message_id: ID of the message being tagged
340
+ referral_issues: List of referral issue types (multi-select)
341
+ referral_comments: Free-text comments about referral message
342
+ general_notes: Additional notes
343
+
344
+ Returns:
345
+ TaggingRecord for the referral issues
346
+
347
+ Raises:
348
+ ValueError: If referral_issues contains invalid types
349
+ """
350
+ return self.create_tagging_record(
351
+ message_id=message_id,
352
+ referral_issues=referral_issues,
353
+ referral_comments=referral_comments,
354
+ general_notes=general_notes
355
+ )
356
+
357
+ def get_referral_issue_descriptions(self) -> Dict[str, str]:
358
+ """
359
+ Get descriptions for referral issue types.
360
+
361
+ Returns:
362
+ Dictionary mapping issue codes to descriptions
363
+ """
364
+ return {
365
+ "incomplete_summary": "Incorrect or incomplete summary",
366
+ "misrepresentation": "Misrepresentation of patient message",
367
+ "inappropriate_tone": "Tone inappropriate for spiritual care team",
368
+ }
369
+
370
+ def create_indicator_issue_tagging(
371
+ self,
372
+ message_id: str,
373
+ indicator_issues: List[str],
374
+ indicator_comments: Optional[str] = None,
375
+ general_notes: str = ""
376
+ ) -> TaggingRecord:
377
+ """
378
+ Create a tagging record specifically for indicator issues.
379
+
380
+ This is a convenience method for tagging incorrectly identified indicators
381
+ with free-text comments.
382
+
383
+ Args:
384
+ message_id: ID of the message being tagged
385
+ indicator_issues: List of incorrectly identified indicator IDs
386
+ indicator_comments: Free-text comments about indicators
387
+ general_notes: Additional notes
388
+
389
+ Returns:
390
+ TaggingRecord for the indicator issues
391
+ """
392
+ return self.create_tagging_record(
393
+ message_id=message_id,
394
+ indicator_issues=indicator_issues,
395
+ indicator_comments=indicator_comments,
396
+ general_notes=general_notes
397
+ )
398
+
399
+ def create_indicator_issue_tagging(
400
+ self,
401
+ message_id: str,
402
+ indicator_issues: List[str],
403
+ indicator_comments: Optional[str] = None,
404
+ general_notes: str = ""
405
+ ) -> TaggingRecord:
406
+ """
407
+ Create a tagging record specifically for indicator issues.
408
+
409
+ This is a convenience method for marking incorrectly identified
410
+ distress indicators with comments.
411
+
412
+ Args:
413
+ message_id: ID of the message being tagged
414
+ indicator_issues: List of incorrectly identified indicator IDs
415
+ indicator_comments: Free-text comments about indicators
416
+ general_notes: Additional notes
417
+
418
+ Returns:
419
+ TaggingRecord for the indicator issues
420
+ """
421
+ return self.create_tagging_record(
422
+ message_id=message_id,
423
+ indicator_issues=indicator_issues,
424
+ indicator_comments=indicator_comments,
425
+ general_notes=general_notes
426
+ )
427
+
428
+ def validate_indicator_ids(self, indicator_ids: List[str]) -> bool:
429
+ """
430
+ Validate that each indicator ID is a non-empty string.
431
+
432
+ This is a basic validation - in a real system, you might
433
+ validate against actual indicator IDs from the classification result.
434
+
435
+ Args:
436
+ indicator_ids: List of indicator IDs to validate
437
+
438
+ Returns:
439
+ True if every ID is a non-empty string, False otherwise
440
+ """
441
+ for indicator_id in indicator_ids:
442
+ if not isinstance(indicator_id, str) or len(indicator_id.strip()) == 0:
443
+ return False
444
+ return True
445
+
446
+ def _validate_classification_tagging(
447
+ self,
448
+ is_classification_correct: bool,
449
+ classification_subcategory: Optional[str],
450
+ correct_classification: Optional[str]
451
+ ) -> None:
452
+ """
453
+ Validate classification tagging fields.
454
+
455
+ Args:
456
+ is_classification_correct: Whether classification is correct
457
+ classification_subcategory: Subcategory if wrong
458
+ correct_classification: Correct classification if wrong
459
+
460
+ Raises:
461
+ ValueError: If validation fails
462
+ """
463
+ if not is_classification_correct:
464
+ # If classification is wrong, require subcategory and correct classification
465
+ if not classification_subcategory:
466
+ raise ValueError(
467
+ "classification_subcategory is required when is_classification_correct is False"
468
+ )
469
+ if not correct_classification:
470
+ raise ValueError(
471
+ "correct_classification is required when is_classification_correct is False"
472
+ )
473
+
474
+ if classification_subcategory not in CLASSIFICATION_SUBCATEGORIES:
475
+ raise ValueError(
476
+ f"Invalid classification_subcategory: {classification_subcategory}. "
477
+ f"Must be one of: {CLASSIFICATION_SUBCATEGORIES}"
478
+ )
479
+
480
+ if correct_classification not in ("red", "yellow", "green"):
481
+ raise ValueError(
482
+ f"Invalid correct_classification: {correct_classification}. "
483
+ "Must be one of: red, yellow, green"
484
+ )
485
+ else:
486
+ # If classification is correct, these fields should be None
487
+ if classification_subcategory is not None:
488
+ raise ValueError(
489
+ "classification_subcategory must be None when is_classification_correct is True"
490
+ )
491
+ if correct_classification is not None:
492
+ raise ValueError(
493
+ "correct_classification must be None when is_classification_correct is True"
494
+ )
495
+
496
+ def _validate_question_issues(self, question_issues: List[str]) -> None:
497
+ """
498
+ Validate question issue types.
499
+
500
+ Args:
501
+ question_issues: List of question issue types
502
+
503
+ Raises:
504
+ ValueError: If any issue type is invalid
505
+ """
506
+ for issue in question_issues:
507
+ if issue not in QUESTION_ISSUE_TYPES:
508
+ raise ValueError(
509
+ f"Invalid question issue type: {issue}. "
510
+ f"Must be one of: {QUESTION_ISSUE_TYPES}"
511
+ )
512
+
513
+ def _validate_referral_issues(self, referral_issues: List[str]) -> None:
514
+ """
515
+ Validate referral issue types.
516
+
517
+ Args:
518
+ referral_issues: List of referral issue types
519
+
520
+ Raises:
521
+ ValueError: If any issue type is invalid
522
+ """
523
+ for issue in referral_issues:
524
+ if issue not in REFERRAL_ISSUE_TYPES:
525
+ raise ValueError(
526
+ f"Invalid referral issue type: {issue}. "
527
+ f"Must be one of: {REFERRAL_ISSUE_TYPES}"
528
+ )
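The classification rules enforced by `_validate_classification_tagging` can be summarised as: a correct classification must carry no correction fields, and an incorrect one must carry both a known subcategory and a lowercase label. The sketch below restates those rules as a standalone function so they can be exercised without the rest of the service. The subcategory list is an assumption inferred from the keys of `get_classification_subcategory_descriptions`; the real `CLASSIFICATION_SUBCATEGORIES` constant is defined in `chaplain_models.py`, and `is_rejected` is a hypothetical helper.

```python
from typing import Optional

# Assumed values; the real constant lives in chaplain_models.py.
SUBCATEGORIES = ["missed_indicators", "false_positive", "missed_distress"]
LABELS = ("red", "yellow", "green")


def validate_classification_tagging(
    is_correct: bool,
    subcategory: Optional[str],
    correct_label: Optional[str],
) -> None:
    """Raise ValueError when the three fields contradict each other."""
    if is_correct:
        # A correct classification must not carry correction details.
        if subcategory is not None or correct_label is not None:
            raise ValueError("correction fields must be None when correct")
        return
    # An incorrect classification needs both correction fields.
    if subcategory not in SUBCATEGORIES:
        raise ValueError(f"Invalid classification_subcategory: {subcategory}")
    if correct_label not in LABELS:
        raise ValueError(f"Invalid correct_classification: {correct_label}")


def is_rejected(
    is_correct: bool,
    subcategory: Optional[str],
    correct_label: Optional[str],
) -> bool:
    """Hypothetical helper: True when validation raises ValueError."""
    try:
        validate_classification_tagging(is_correct, subcategory, correct_label)
    except ValueError:
        return True
    return False


results = {
    "correct, no extras": is_rejected(True, None, None),
    "correct, stray subcategory": is_rejected(True, "false_positive", None),
    "wrong, missing subcategory": is_rejected(False, None, "red"),
    "wrong, uppercase label": is_rejected(False, "false_positive", "RED"),
    "wrong, complete": is_rejected(False, "false_positive", "yellow"),
}
for case, rejected in results.items():
    print(f"{case}: {'rejected' if rejected else 'accepted'}")
```

The uppercase case is worth noting: the CSV export upper-cases labels for display, while validation accepts only lowercase values, so labels read back from an export would need normalising before they are passed in.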
src/core/verification_csv_exporter.py CHANGED
@@ -2,14 +2,21 @@
2
  """
3
  CSV export functionality for verification sessions.
4
 
5
- Provides methods for generating CSV files with verification results and summaries.
 
6
  """
7
 
8
  import csv
9
  import io
10
  from datetime import datetime
11
- from typing import List
12
  from src.core.verification_models import VerificationRecord, VerificationSession
 
 
 
 
 
 
13
 
14
 
15
  class VerificationCSVExporter:
@@ -135,3 +142,207 @@ class VerificationCSVExporter:
135
  "incorrect": session.incorrect_count,
136
  "accuracy_percent": accuracy,
137
  }
2
  """
3
  CSV export functionality for verification sessions.
4
 
5
+ Provides methods for generating CSV files with verification results and summaries,
6
+ including tagging data, generated content, interaction logs, and error statistics.
7
  """
8
 
9
  import csv
10
  import io
11
  from datetime import datetime
12
+ from typing import List, Optional, Dict, Any
13
  from src.core.verification_models import VerificationRecord, VerificationSession
14
+ from src.core.chaplain_models import (
15
+ TaggingRecord,
16
+ ClassificationFlowResult,
17
+ InteractionStepLog,
18
+ )
19
+ from src.core.error_pattern_analyzer import ErrorPatternAnalyzer
20
 
21
 
22
  class VerificationCSVExporter:
 
142
  "incorrect": session.incorrect_count,
143
  "accuracy_percent": accuracy,
144
  }
145
+
146
+ @staticmethod
147
+ def generate_enhanced_csv_content(
148
+ session: VerificationSession,
149
+ tagging_records: Optional[List[TaggingRecord]] = None,
150
+ flow_results: Optional[Dict[str, ClassificationFlowResult]] = None,
151
+ interaction_logs: Optional[List[InteractionStepLog]] = None,
152
+ ) -> str:
153
+ """
154
+ Generate enhanced CSV content with tagging data, generated content, and statistics.
155
+
156
+ Includes:
157
+ - Summary section with accuracy metrics
158
+ - Detailed records with tagging categories and subcategories
159
+ - Generated content (explanations, questions, referral messages)
160
+ - Interaction logs
161
+ - Error pattern statistics
162
+
163
+ Args:
164
+ session: The verification session to export
165
+ tagging_records: List of TaggingRecord instances (optional)
166
+ flow_results: Dict mapping message_id to ClassificationFlowResult (optional)
167
+ interaction_logs: List of InteractionStepLog instances (optional)
168
+
169
+ Returns:
170
+ Enhanced CSV content as a string
171
+
172
+ Raises:
173
+ ValueError: If session has no verified messages
174
+ """
175
+ if session.verified_count == 0:
176
+ raise ValueError("No verified messages to export")
177
+
178
+ output = io.StringIO()
179
+
180
+ # Add summary section
181
+ accuracy = (
182
+ session.correct_count / session.verified_count * 100
183
+ if session.verified_count > 0
184
+ else 0.0
185
+ )
186
+ output.write("VERIFICATION SUMMARY\n")
187
+ output.write(f"Total Messages,{session.verified_count}\n")
188
+ output.write(f"Correct,{session.correct_count}\n")
189
+ output.write(f"Incorrect,{session.incorrect_count}\n")
190
+ output.write(f"Accuracy %,{accuracy:.1f}\n")
191
+ output.write("\n")
192
+
193
+ # Add detailed records section
194
+ output.write("DETAILED RECORDS\n")
195
+ output.write("Patient Message,Classifier Said,You Said,Notes,Date\n")
196
+
197
+ writer = csv.writer(output, lineterminator="\n")  # keep line endings consistent with output.write
198
+
199
+ for record in session.verifications:
200
+ classifier_decision = record.classifier_decision.upper()
201
+ ground_truth = record.ground_truth_label.upper()
202
+ timestamp = record.timestamp.strftime("%Y-%m-%d %H:%M:%S")
203
+
204
+ writer.writerow([
205
+ record.original_message,
206
+ classifier_decision,
207
+ ground_truth,
208
+ record.verifier_notes,
209
+ timestamp,
210
+ ])
211
+
212
+ output.write("\n")
213
+
214
+ # Add tagging data section if provided
215
+ if tagging_records:
216
+ output.write("TAGGING DATA\n")
217
+ output.write("Message ID,Classification Correct,Classification Subcategory,Correct Classification,Question Issues,Question Comments,Referral Issues,Referral Comments,Indicator Issues,Indicator Comments,General Notes\n")
218
+
219
+ for record in tagging_records:
220
+ writer.writerow([
221
+ record.message_id,
222
+ "Yes" if record.is_classification_correct else "No",
223
+ record.classification_subcategory or "",
224
+ record.correct_classification or "",
225
+ "; ".join(record.question_issues) if record.question_issues else "",
226
+ record.question_comments or "",
227
+ "; ".join(record.referral_issues) if record.referral_issues else "",
228
+ record.referral_comments or "",
229
+ "; ".join(record.indicator_issues) if record.indicator_issues else "",
230
+ record.indicator_comments or "",
231
+ record.general_notes,
232
+ ])
233
+
234
+ output.write("\n")
235
+
236
+ # Add generated content section if provided
237
+ if flow_results:
238
+ output.write("GENERATED CONTENT\n")
239
+ output.write("Message ID,Classification,Explanation,Permission Check Message,Referral Message,Follow-Up Questions,Patient Responses,Re-evaluation Result\n")
240
+
241
+ for message_id, result in flow_results.items():
242
+ questions_text = "; ".join([q.question_text for q in result.follow_up_questions]) if result.follow_up_questions else ""
243
+ responses_text = "; ".join(result.patient_responses) if result.patient_responses else ""
244
+
245
+ writer.writerow([
246
+ message_id,
247
+ result.classification.upper(),
248
+ result.explanation,
249
+ result.permission_check_message or "",
250
+ result.referral_message or "",
251
+ questions_text,
252
+ responses_text,
253
+ result.re_evaluation_result or "",
254
+ ])
255
+
256
+ output.write("\n")
257
+
258
+ # Add interaction logs section if provided
259
+ if interaction_logs:
260
+ output.write("INTERACTION LOGS\n")
261
+ output.write("Step ID,Session ID,Message ID,Step Type,Input Text,Model Output,Approval Status,Timestamp\n")
262
+
263
+ for log in interaction_logs:
264
+ writer.writerow([
265
+ log.step_id,
266
+ log.session_id,
267
+ log.message_id,
268
+ log.step_type,
269
+ log.input_text,
270
+ log.model_output,
271
+ log.approval_status or "",
272
+ log.timestamp.strftime("%Y-%m-%d %H:%M:%S"),
273
+ ])
274
+
275
+ output.write("\n")
276
+
277
+ # Add statistics section if tagging records provided
278
+ if tagging_records:
279
+ output.write("ERROR PATTERN STATISTICS\n")
280
+
281
+ analyzer = ErrorPatternAnalyzer()
282
+ stats = analyzer.get_statistics_summary(tagging_records)
283
+
284
+ # Classification errors
285
+ output.write("Classification Errors\n")
286
+ for subcategory, count in stats["classification_errors"].items():
287
+ writer.writerow([subcategory, count])  # csv.writer quotes labels that contain commas
288
+ output.write("\n")
289
+
290
+ # Question issues
291
+ output.write("Question Issues\n")
292
+ for issue_type, count in stats["question_issues"].items():
293
+ writer.writerow([issue_type, count])  # csv.writer quotes labels that contain commas
294
+ output.write("\n")
295
+
296
+ # Referral issues
297
+ output.write("Referral Issues\n")
298
+ for issue_type, count in stats["referral_issues"].items():
299
+ writer.writerow([issue_type, count])  # csv.writer quotes labels that contain commas
300
+ output.write("\n")
301
+
302
+ # Indicator issues
303
+ output.write("Indicator Issues\n")
304
+ for indicator_id, count in stats["indicator_issues"].items():
305
+ writer.writerow([indicator_id, count])  # csv.writer quotes IDs that contain commas
306
+ output.write("\n")
307
+
308
+ # Common patterns
309
+ output.write("Common Patterns\n")
310
+ for pattern in stats["common_patterns"]:
311
+ writer.writerow([pattern])  # keep free-text patterns in a single CSV cell
312
+ output.write("\n")
313
+
314
+ return output.getvalue()
315
+
316
+ @staticmethod
317
+ def export_enhanced_session_to_csv(
318
+ session: VerificationSession,
319
+ tagging_records: Optional[List[TaggingRecord]] = None,
320
+ flow_results: Optional[Dict[str, ClassificationFlowResult]] = None,
321
+ interaction_logs: Optional[List[InteractionStepLog]] = None,
322
+ ) -> tuple:
323
+ """
324
+ Export a verification session with enhanced data to CSV format.
325
+
326
+ Returns both the CSV content and the filename.
327
+
328
+ Args:
329
+ session: The verification session to export
330
+ tagging_records: List of TaggingRecord instances (optional)
331
+ flow_results: Dict mapping message_id to ClassificationFlowResult (optional)
332
+ interaction_logs: List of InteractionStepLog instances (optional)
333
+
334
+ Returns:
335
+ Tuple of (csv_content, filename)
336
+
337
+ Raises:
338
+ ValueError: If session has no verified messages
339
+ """
340
+ csv_content = VerificationCSVExporter.generate_enhanced_csv_content(
341
+ session,
342
+ tagging_records=tagging_records,
343
+ flow_results=flow_results,
344
+ interaction_logs=interaction_logs,
345
+ )
346
+ filename = VerificationCSVExporter.generate_csv_filename(session.created_at)
347
+
348
+ return csv_content, filename
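The statistics sections above are plain "label,count" rows under a title line. A minimal sketch of writing such a section through `csv.writer`, so that labels containing commas stay in one cell, might look like the following. The helper name `write_stats_section` and the sample label are hypothetical and not part of this commit.

```python
import csv
import io
from typing import Dict


def write_stats_section(title: str, counts: Dict[str, int]) -> str:
    """Return one titled CSV section; every row goes through csv.writer."""
    output = io.StringIO()
    # lineterminator="\n" keeps the section consistent with plain "\n" separators.
    writer = csv.writer(output, lineterminator="\n")
    writer.writerow([title])
    for label, count in counts.items():
        # A label with a comma is quoted automatically instead of splitting the row.
        writer.writerow([label, count])
    output.write("\n")  # blank line separates this section from the next one
    return output.getvalue()


# Hypothetical sample data, for illustration only.
section = write_stats_section("Question Issues", {"Too clinical, not pastoral": 2})
rows = list(csv.reader(io.StringIO(section)))
```

Reading the section back with `csv.reader` returns the label as a single field, which is the property a hand-built f-string row would lose.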
src/interface/chaplain_feedback_ui.py ADDED
@@ -0,0 +1,450 @@
1
+ # chaplain_feedback_ui.py
2
+ """
3
+ Gradio UI components for Chaplain Feedback & Tagging System.
4
+
5
+ Provides interface components for displaying classification flows,
6
+ collecting chaplain feedback, and displaying error patterns.
7
+
8
+ Requirements: 1.5, 2.3, 3.3, 4.1, 5.1, 5.3, 6.1, 6.3, 8.1, 8.2, 8.3, 10.1, 10.2, 10.3
9
+ """
10
+
11
+ import gradio as gr
12
+ from typing import List, Dict, Tuple, Optional, Any
13
+ from dataclasses import dataclass
14
+
15
+ from src.core.chaplain_models import (
16
+ ClassificationFlowResult,
17
+ DistressIndicator,
18
+ FollowUpQuestion,
19
+ TaggingRecord,
20
+ CLASSIFICATION_SUBCATEGORIES,
21
+ QUESTION_ISSUE_TYPES,
22
+ REFERRAL_ISSUE_TYPES,
23
+ )
24
+
25
+
26
+ class ChaplainFeedbackUIComponents:
27
+ """Manages Gradio UI components for chaplain feedback system."""
28
+
29
+ # Color mappings for classification badges
30
+ BADGE_COLORS = {
31
+ "red": "🔴",
32
+ "yellow": "🟡",
33
+ "green": "🟢",
34
+ }
35
+
36
+ BADGE_LABELS = {
37
+ "red": "RED - Severe Distress",
38
+ "yellow": "YELLOW - Potential Distress",
39
+ "green": "GREEN - No Distress",
40
+ }
41
+
42
+ # Severity color codes for indicators
43
+ SEVERITY_COLORS = {
44
+ "red": "#ea9999", # Red from definitions document
45
+ "yellow": "#ffe599", # Yellow from definitions document
46
+ }
47
+
48
+ @staticmethod
49
+ def create_classification_flow_display() -> Tuple[gr.Component, gr.Component, gr.Component, gr.Component]:
50
+ """
51
+ Create ClassificationFlowDisplay component.
52
+
53
+ Displays RED/YELLOW/GREEN flow results with all generated content.
54
+
55
+ Returns:
56
+ Tuple of (classification_badge, explanation, content_section, indicators_section) components
57
+
58
+ Requirements: 1.5, 2.3, 3.3
59
+ """
60
+ classification_badge = gr.Markdown(
61
+ value="🔄 Loading classification...",
62
+ label="Classification Result",
63
+ )
64
+
65
+ explanation = gr.Markdown(
66
+ value="",
67
+ label="Explanation",
68
+ )
69
+
70
+ content_section = gr.Markdown(
71
+ value="",
72
+ label="Generated Content",
73
+ )
74
+
75
+ indicators_section = gr.Markdown(
76
+ value="",
77
+ label="Detected Indicators",
78
+ )
79
+
80
+ return classification_badge, explanation, content_section, indicators_section
81
+
82
+ @staticmethod
83
+ def render_classification_flow(
84
+ flow_result: ClassificationFlowResult,
85
+ ) -> Tuple[str, str, str, str]:
86
+ """
87
+ Render complete classification flow result.
88
+
89
+ Args:
90
+ flow_result: ClassificationFlowResult with all flow data
91
+
92
+ Returns:
93
+ Tuple of (badge, explanation, content, indicators) markdown strings
94
+ """
95
+ # Classification badge
96
+ badge_emoji = ChaplainFeedbackUIComponents.BADGE_COLORS.get(flow_result.classification, "❓")
97
+ badge_label = ChaplainFeedbackUIComponents.BADGE_LABELS.get(flow_result.classification, "UNKNOWN")
98
+ confidence_pct = int(round(flow_result.confidence * 100))
99
+ badge = f"## {badge_emoji} {badge_label}\n\n**Confidence:** {confidence_pct}%"
100
+
101
+ # Explanation
102
+ explanation = f"### Explanation\n\n{flow_result.explanation}"
103
+
104
+ # Generated content based on classification
105
+ content = ""
106
+ if flow_result.classification == "red":
107
+ content = ChaplainFeedbackUIComponents._render_red_flow_content(flow_result)
108
+ elif flow_result.classification == "yellow":
109
+ content = ChaplainFeedbackUIComponents._render_yellow_flow_content(flow_result)
110
+ elif flow_result.classification == "green":
111
+ content = ChaplainFeedbackUIComponents._render_green_flow_content(flow_result)
112
+
113
+ # Indicators
114
+ indicators = ChaplainFeedbackUIComponents._render_indicators(flow_result.indicators)
115
+
116
+ return badge, explanation, content, indicators
117
+
118
+ @staticmethod
119
+ def _render_red_flow_content(flow_result: ClassificationFlowResult) -> str:
120
+ """Render RED flow content (permission check + referral message)."""
121
+ content = "### 🔴 RED FLAG - Severe Distress Detected\n\n"
122
+
123
+ if flow_result.permission_check_message:
124
+ content += "#### Patient Permission Check\n\n"
125
+ content += f"{flow_result.permission_check_message}\n\n"
126
+
127
+ if flow_result.consent_status:
128
+ content += f"**Consent Status:** {flow_result.consent_status}\n\n"
129
+
130
+ if flow_result.referral_message and flow_result.consent_status == "granted":
131
+ content += "#### Referral Message for Spiritual Care Team\n\n"
132
+ content += f"{flow_result.referral_message}\n\n"
133
+ elif flow_result.consent_status == "declined":
134
+ content += "**Status:** No further action - patient declined spiritual support referral\n\n"
135
+
136
+ return content
137
+
138
+ @staticmethod
139
+ def _render_yellow_flow_content(flow_result: ClassificationFlowResult) -> str:
140
+ """Render YELLOW flow content (follow-up questions + re-evaluation)."""
141
+ content = "### 🟡 YELLOW FLAG - Potential Distress\n\n"
142
+
143
+ if flow_result.follow_up_questions:
144
+ content += "#### Follow-Up Questions\n\n"
145
+ for i, question in enumerate(flow_result.follow_up_questions, 1):
146
+ content += f"**Question {i}:** {question.question_text}\n\n"
147
+ content += f"*Purpose:* {question.purpose}\n\n"
148
+
149
+ if flow_result.patient_responses:
150
+ content += "#### Patient Responses\n\n"
151
+ for i, response in enumerate(flow_result.patient_responses, 1):
152
+ content += f"**Response {i}:** {response}\n\n"
153
+
154
+ if flow_result.re_evaluation_result:
155
+ content += "#### Re-Evaluation Result\n\n"
156
+ if flow_result.re_evaluation_result == "red":
157
+ content += "🔴 **Escalated to RED** - Severe distress detected in responses\n\n"
158
+ elif flow_result.re_evaluation_result == "green":
159
+ content += "🟢 **Downgraded to GREEN** - No distress indicators in responses\n\n"
160
+
161
+ return content
162
+
163
+ @staticmethod
164
+ def _render_green_flow_content(flow_result: ClassificationFlowResult) -> str:
165
+ """Render GREEN flow content (no distress)."""
166
+ content = "### 🟢 GREEN FLAG - No Distress Detected\n\n"
167
+ content += "**Status:** No further steps required\n\n"
168
+ content += "No spiritual distress indicators were detected in this message.\n\n"
169
+ return content
170
+
171
+ @staticmethod
172
+ def _render_indicators(indicators: List[DistressIndicator]) -> str:
173
+ """Render detected indicators with categories and severity."""
174
+ if not indicators:
175
+ return "### Detected Indicators\n\nNo indicators detected"
176
+
177
+ content = "### Detected Indicators\n\n"
178
+
179
+ # Group by severity
180
+ red_indicators = [i for i in indicators if i.severity == "red"]
181
+ yellow_indicators = [i for i in indicators if i.severity == "yellow"]
182
+
183
+ if red_indicators:
184
+ content += "#### 🔴 RED Indicators (Severe)\n\n"
185
+ for indicator in red_indicators:
186
+ confidence_pct = int(round(indicator.confidence * 100))
187
+ content += f"• **{indicator.subcategory}** ({confidence_pct}% confidence)\n"
188
+ content += f" - Category: {indicator.category}\n"
189
+ content += f" - Reference: {indicator.definition_reference}\n\n"
190
+
191
+ if yellow_indicators:
192
+ content += "#### 🟡 YELLOW Indicators (Potential)\n\n"
193
+ for indicator in yellow_indicators:
194
+ confidence_pct = int(round(indicator.confidence * 100))
195
+ content += f"• **{indicator.subcategory}** ({confidence_pct}% confidence)\n"
196
+ content += f" - Category: {indicator.category}\n"
197
+ content += f" - Reference: {indicator.definition_reference}\n\n"
198
+
199
+ return content
200
+
201
+ @staticmethod
202
+ def create_tagging_interface() -> Tuple[gr.Component, gr.Component, gr.Component, gr.Component, gr.Component, gr.Component, gr.Component, gr.Component, gr.Component, gr.Component]:
203
+ """
204
+ Create TaggingInterface component.
205
+
206
+ Provides classification subcategory selector, multi-select for issues,
207
+ and free-text comment fields.
208
+
209
+ Returns:
210
+ Tuple of individual tagging components for use in event handlers
211
+
212
+ Requirements: 4.1, 5.1, 5.3, 6.1, 6.3
213
+ """
214
+ # Classification tagging components
215
+ is_correct = gr.Radio(
216
+ choices=[("✓ Correct", True), ("✗ Incorrect", False)],
217
+ label="Is the classification correct?",
218
+ interactive=True,
219
+ visible=False,
220
+ )
221
+
222
+ subcategory = gr.Dropdown(
223
+ choices=CLASSIFICATION_SUBCATEGORIES,
224
+ label="What type of error? (if incorrect)",
225
+ interactive=True,
226
+ visible=False,
227
+ )
228
+
229
+ correct_classification = gr.Radio(
230
+ choices=[
231
+ ("🟢 GREEN - No Distress", "green"),
232
+ ("🟡 YELLOW - Potential Distress", "yellow"),
233
+ ("🔴 RED - Severe Distress", "red"),
234
+ ],
235
+ label="What should the correct classification be?",
236
+ interactive=True,
237
+ visible=False,
238
+ )
239
+
240
+ # Follow-up question issues components
241
+ question_issues = gr.CheckboxGroup(
242
+ choices=QUESTION_ISSUE_TYPES,
243
+ label="Issues with follow-up questions (select all that apply)",
244
+ interactive=True,
245
+ visible=False,
246
+ )
247
+
248
+ question_comments = gr.Textbox(
249
+ label="Comments on questions",
250
+ placeholder="e.g., 'Too clinical', 'Not spiritually relevant'",
251
+ lines=2,
252
+ interactive=True,
253
+ visible=False,
254
+ )
255
+
256
+ # Referral message issues components
257
+ referral_issues = gr.CheckboxGroup(
258
+ choices=REFERRAL_ISSUE_TYPES,
259
+ label="Issues with referral message (select all that apply)",
260
+ interactive=True,
261
+ visible=False,
262
+ )
263
+
264
+ referral_comments = gr.Textbox(
265
+ label="Comments on referral message",
266
+ placeholder="e.g., 'Incomplete summary', 'Tone inappropriate'",
267
+ lines=2,
268
+ interactive=True,
269
+ visible=False,
270
+ )
271
+
272
+ # Indicator issues components
273
+ indicator_issues = gr.Textbox(
274
+ label="Incorrectly identified indicators",
275
+ placeholder="List indicator IDs or names that were incorrectly identified",
276
+ lines=2,
277
+ interactive=True,
278
+ visible=False,
279
+ )
280
+
281
+ indicator_comments = gr.Textbox(
282
+ label="Comments on indicators",
283
+ placeholder="e.g., 'Missed anxiety indicators', 'False positive on grief'",
284
+ lines=2,
285
+ interactive=True,
286
+ visible=False,
287
+ )
288
+
289
+ # General notes component
290
+ notes_section = gr.Textbox(
291
+ label="General Notes",
292
+ placeholder="Any additional feedback or observations",
293
+ lines=3,
294
+ interactive=True,
295
+ visible=False,
296
+ )
297
+
298
+ return is_correct, subcategory, correct_classification, question_issues, question_comments, referral_issues, referral_comments, indicator_issues, indicator_comments, notes_section
299
+
300
+ @staticmethod
301
+ def create_indicator_display() -> Tuple[gr.Component, gr.Component]:
302
+ """
303
+ Create IndicatorDisplay component.
304
+
305
+ Shows indicators with categories and allows tagging incorrect indicators.
306
+
307
+ Returns:
308
+ Tuple of (indicators_display, indicator_tagging) components
309
+
310
+ Requirements: 8.1, 8.2, 8.3
311
+ """
312
+ indicators_display = gr.Markdown(
313
+ value="No indicators to display",
314
+ label="Detected Indicators",
315
+ )
316
+
317
+ indicator_tagging = gr.Group(visible=False)
318
+ with indicator_tagging:
319
+ incorrect_indicators = gr.CheckboxGroup(
320
+ choices=[],
321
+ label="Select indicators that are incorrectly identified",
322
+ interactive=True,
323
+ )
324
+
325
+ indicator_notes = gr.Textbox(
326
+ label="Why are these indicators incorrect?",
327
+ placeholder="Explain why these indicators don't apply",
328
+ lines=2,
329
+ interactive=True,
330
+ )
331
+
332
+ return indicators_display, indicator_tagging
333
+
334
+ @staticmethod
335
+ def create_error_pattern_summary() -> Tuple[gr.Component, gr.Component, gr.Component]:
336
+ """
337
+ Create ErrorPatternSummary component.
338
+
339
+ Displays error patterns grouped by type with frequent subcategories highlighted.
340
+
341
+ Returns:
342
+ Tuple of (error_patterns, subcategory_breakdown, recommendations) components
343
+
344
+ Requirements: 10.1, 10.2, 10.3
345
+ """
346
+ error_patterns = gr.Markdown(
347
+ value="No error patterns yet",
348
+ label="Error Patterns",
349
+ )
350
+
351
+ subcategory_breakdown = gr.Markdown(
352
+ value="No data",
353
+ label="Subcategory Breakdown",
354
+ )
355
+
356
+ recommendations = gr.Markdown(
357
+ value="No recommendations yet",
358
+ label="Recommendations for Improvement",
359
+ )
360
+
361
+ return error_patterns, subcategory_breakdown, recommendations
362
+
363
+ @staticmethod
364
+ def render_error_patterns(
365
+ classification_errors: Dict[str, int],
366
+ question_errors: Dict[str, int],
367
+ referral_errors: Dict[str, int],
368
+ ) -> Tuple[str, str, str]:
369
+ """
370
+ Render error patterns summary.
371
+
372
+ Args:
373
+ classification_errors: Dict of classification error subcategories with counts
374
+ question_errors: Dict of question issue types with counts
375
+ referral_errors: Dict of referral issue types with counts
376
+
377
+ Returns:
378
+ Tuple of (patterns, breakdown, recommendations) markdown strings
379
+ """
380
+ # Error patterns grouped by type
381
+ patterns = "### Error Patterns\n\n"
382
+
383
+ total_classification_errors = sum(classification_errors.values())
384
+ total_question_errors = sum(question_errors.values())
385
+ total_referral_errors = sum(referral_errors.values())
386
+
387
+ if total_classification_errors > 0:
388
+ patterns += f"#### Classification Errors: {total_classification_errors} total\n\n"
389
+ for subcategory, count in sorted(classification_errors.items(), key=lambda x: x[1], reverse=True):
390
+ patterns += f"• {subcategory}: {count}\n"
391
+ patterns += "\n"
392
+
393
+ if total_question_errors > 0:
394
+ patterns += f"#### Follow-Up Question Issues: {total_question_errors} total\n\n"
395
+ for issue_type, count in sorted(question_errors.items(), key=lambda x: x[1], reverse=True):
396
+ patterns += f"• {issue_type}: {count}\n"
397
+ patterns += "\n"
398
+
399
+ if total_referral_errors > 0:
400
+ patterns += f"#### Referral Message Issues: {total_referral_errors} total\n\n"
401
+ for issue_type, count in sorted(referral_errors.items(), key=lambda x: x[1], reverse=True):
402
+ patterns += f"• {issue_type}: {count}\n"
403
+ patterns += "\n"
404
+
405
+ # Subcategory breakdown
406
+ breakdown = "### Subcategory Breakdown\n\n"
407
+
408
+ if classification_errors:
409
+ breakdown += "**Classification Errors:**\n"
410
+ for subcategory, count in sorted(classification_errors.items(), key=lambda x: x[1], reverse=True):
411
+ breakdown += f"- {subcategory}: {count}\n"
412
+ breakdown += "\n"
413
+
414
+ if question_errors:
415
+ breakdown += "**Question Issues:**\n"
416
+ for issue_type, count in sorted(question_errors.items(), key=lambda x: x[1], reverse=True):
417
+ breakdown += f"- {issue_type}: {count}\n"
418
+ breakdown += "\n"
419
+
420
+ if referral_errors:
421
+ breakdown += "**Referral Issues:**\n"
422
+ for issue_type, count in sorted(referral_errors.items(), key=lambda x: x[1], reverse=True):
423
+ breakdown += f"- {issue_type}: {count}\n"
424
+ breakdown += "\n"
425
+
426
+ # Recommendations
427
+ recommendations = "### Recommendations for Improvement\n\n"
428
+
429
+ # Find most common errors
430
+ all_errors = {}
431
+ for subcategory, count in classification_errors.items():
432
+ all_errors[f"Classification: {subcategory}"] = count
433
+ for issue_type, count in question_errors.items():
434
+ all_errors[f"Questions: {issue_type}"] = count
435
+ for issue_type, count in referral_errors.items():
436
+ all_errors[f"Referral: {issue_type}"] = count
437
+
438
+ if all_errors:
439
+ sorted_errors = sorted(all_errors.items(), key=lambda x: x[1], reverse=True)
440
+ top_3 = sorted_errors[:3]
441
+
442
+ recommendations += "**Top areas for improvement:**\n\n"
443
+ for error_type, count in top_3:
444
+ recommendations += f"1. **{error_type}** ({count} occurrences)\n"
445
+ recommendations += " - Review prompts and logic for this error type\n"
446
+ recommendations += " - Consider additional training data\n\n"
447
+ else:
448
+ recommendations += "No errors detected yet. Great job!\n\n"
449
+
450
+ return patterns, breakdown, recommendations
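The recommendations step in `render_error_patterns` merges the per-category counts into one dictionary and keeps the three largest. A minimal standalone sketch of that ranking is shown here; the function name `top_error_types` and the sample counts are hypothetical, chosen only to illustrate the merge-and-sort step.

```python
from typing import Dict, List, Tuple


def top_error_types(
    groups: Dict[str, Dict[str, int]], limit: int = 3
) -> List[Tuple[str, int]]:
    """Merge per-category counts and return the most frequent error types."""
    merged = {
        f"{group}: {name}": count
        for group, counts in groups.items()
        for name, count in counts.items()
    }
    # sorted() is stable, so ties keep their original insertion order.
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)[:limit]


# Hypothetical counts, for illustration only.
ranked = top_error_types(
    {
        "Classification": {"missed_red": 4, "false_yellow": 1},
        "Questions": {"too_clinical": 3},
        "Referral": {"incomplete": 2},
    }
)
```

Prefixing each key with its category keeps identical issue names from different categories apart after the merge.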
src/interface/simplified_gradio_app.py CHANGED
@@ -29,10 +29,13 @@ from typing import Dict, Any, Optional, List
29
  from src.core.simplified_medical_app import SimplifiedMedicalApp
30
  from src.core.spiritual_state import SpiritualState
31
  from src.interface.verification_ui import VerificationUIComponents
 
32
  from src.core.test_datasets import TestDatasetManager
33
  from src.core.verification_models import VerificationSession, VerificationRecord, TestMessage
34
  from src.core.verification_store import JSONVerificationStore
35
  from src.core.verification_csv_exporter import VerificationCSVExporter
 
 
36
 
37
  try:
38
  from app_config import GRADIO_CONFIG
@@ -159,9 +162,9 @@ def create_simplified_interface():
159
  skip_btn = gr.Button("⏭️ Skip", scale=1)
160
  next_btn = gr.Button("Next ➡️", scale=1)
161
 
162
- # Save results button
163
  with gr.Row():
164
- save_results_btn = gr.Button("💾 Save Results (CSV)", variant="primary", scale=2)
165
  clear_session_btn = gr.Button("🗑️ Clear Session", scale=1)
166
 
167
  with gr.Column(scale=1):
@@ -174,6 +177,28 @@ def create_simplified_interface():
174
  # Summary card
175
  summary_card = VerificationUIComponents.create_summary_card_component()
176

177
  # Results section
178
  with gr.Row(visible=False) as results_section:
179
  with gr.Column():
@@ -196,8 +221,8 @@ def create_simplified_interface():
196
  # Error message display
197
  error_message = gr.Markdown(
198
  value="",
199
- visible=False,
200
- label="Error"
201
  )
202
 
203
  # Hidden state for tracking
@@ -1238,32 +1263,30 @@ To revert, use "Reset to Default" button.
1238
  )
1239
 
1240
  def handle_download_csv(session: VerificationSession, store: JSONVerificationStore):
1241
- """Handle CSV download."""
1242
  try:
1243
  if not session or session.verified_count == 0:
1244
- return None, "❌ No verified messages to export"
1245
 
1246
  csv_content = VerificationCSVExporter.generate_csv_content(session)
1247
  filename = VerificationCSVExporter.generate_csv_filename()
1248
 
1249
- # Write to temporary file
1250
- import tempfile
1251
  import os
 
1252
 
1253
- # Create temp directory if it doesn't exist
1254
- temp_dir = "/tmp/verification_exports"
1255
- os.makedirs(temp_dir, exist_ok=True)
1256
 
1257
- # Write to file with proper filename
1258
- temp_path = os.path.join(temp_dir, filename)
1259
- with open(temp_path, 'w') as f:
1260
  f.write(csv_content)
1261
 
1262
- success_msg = f"✅ Results exported: {filename}"
1263
- return temp_path, success_msg
1264
 
1265
  except Exception as e:
1266
- return None, f"❌ Error exporting CSV: {str(e)}"
 
 
1267
 
1268
  # Bind verification events
1269
  load_dataset_btn.click(
@@ -1536,11 +1559,11 @@ To revert, use "Reset to Default" button.
1536
  ]
1537
  )
1538
 
1539
- # Save results button
1540
  save_results_btn.click(
1541
  handle_download_csv,
1542
  inputs=[verification_session, verification_store],
1543
- outputs=[csv_download, error_message]
1544
  )
1545
 
1546
  # Clear session button
@@ -1576,6 +1599,93 @@ To revert, use "Reset to Default" button.
1576
  ]
1577
  )
1578

1579
  # Bind events
1580
  demo.load(
1581
  initialize_session,
 
29
  from src.core.simplified_medical_app import SimplifiedMedicalApp
30
  from src.core.spiritual_state import SpiritualState
31
  from src.interface.verification_ui import VerificationUIComponents
32
+ from src.interface.chaplain_feedback_ui import ChaplainFeedbackUIComponents
33
  from src.core.test_datasets import TestDatasetManager
34
  from src.core.verification_models import VerificationSession, VerificationRecord, TestMessage
35
  from src.core.verification_store import JSONVerificationStore
36
  from src.core.verification_csv_exporter import VerificationCSVExporter
37
+ from src.core.chaplain_models import ClassificationFlowResult, DistressIndicator, FollowUpQuestion
38
+ from src.core.error_pattern_analyzer import ErrorPatternAnalyzer
39
 
40
  try:
41
  from app_config import GRADIO_CONFIG
 
162
  skip_btn = gr.Button("⏭️ Skip", scale=1)
163
  next_btn = gr.Button("Next ➡️", scale=1)
164
 
165
+ # Save results button - using DownloadButton for Hugging Face compatibility
166
  with gr.Row():
167
+ save_results_btn = gr.DownloadButton("💾 Download Results (CSV)", variant="primary", scale=2)
168
  clear_session_btn = gr.Button("🗑️ Clear Session", scale=1)
169
 
170
  with gr.Column(scale=1):
 
177
  # Summary card
178
  summary_card = VerificationUIComponents.create_summary_card_component()
179
 
180
+ # Chaplain Feedback Section - for displaying classification flows and collecting feedback
181
+ chaplain_feedback_section = gr.Row(visible=False)
182
+ with chaplain_feedback_section:
183
+ with gr.Column(scale=2):
184
+ # Classification flow display
185
+ flow_badge, flow_explanation, flow_content, flow_indicators = ChaplainFeedbackUIComponents.create_classification_flow_display()
186
+
187
+ # Tagging interface - returns individual components
188
+ (is_correct, subcategory, correct_classification,
189
+ question_issues, question_comments,
190
+ referral_issues, referral_comments,
191
+ indicator_issues, indicator_comments, general_notes) = ChaplainFeedbackUIComponents.create_tagging_interface()
192
+
193
+ # Submit feedback button
194
+ with gr.Row():
195
+ submit_feedback_btn = gr.Button("✓ Submit Feedback", variant="primary", scale=2)
196
+ skip_feedback_btn = gr.Button("⏭️ Skip Feedback", scale=1)
197
+
198
+ with gr.Column(scale=1):
199
+ # Error pattern summary
200
+ error_patterns, subcategory_breakdown, recommendations = ChaplainFeedbackUIComponents.create_error_pattern_summary()
201
+
202
  # Results section
203
  with gr.Row(visible=False) as results_section:
204
  with gr.Column():
 
221
  # Error message display
222
  error_message = gr.Markdown(
223
  value="",
224
+ visible=True,
225
+ label="Status"
226
  )
227
 
228
  # Hidden state for tracking
 
1263
  )
1264
 
1265
  def handle_download_csv(session: VerificationSession, store: JSONVerificationStore):
1266
+ """Handle CSV download - returns file path for DownloadButton."""
1267
  try:
1268
  if not session or session.verified_count == 0:
1269
+ return None
1270
 
1271
  csv_content = VerificationCSVExporter.generate_csv_content(session)
1272
  filename = VerificationCSVExporter.generate_csv_filename()
1273
 
 
 
1274
  import os
1275
+ import tempfile
1276
 
1277
+ # Use temp directory for Hugging Face compatibility
1278
+ temp_dir = tempfile.gettempdir()
1279
+ file_path = os.path.join(temp_dir, filename)
1280
 
1281
+ with open(file_path, 'w', encoding='utf-8', newline='') as f:  # newline='' avoids doubled line endings in CSV output
 
 
1282
  f.write(csv_content)
1283
 
1284
+ return file_path
 
1285
 
1286
  except Exception as e:
1287
+ import traceback
1288
+ print(f"CSV Export Error: {traceback.format_exc()}")
1289
+ return None
1290
 
1291
  # Bind verification events
1292
  load_dataset_btn.click(
 
1559
  ]
1560
  )
1561
 
1562
+ # Save results button - DownloadButton triggers download directly
1563
  save_results_btn.click(
1564
  handle_download_csv,
1565
  inputs=[verification_session, verification_store],
1566
+ outputs=[save_results_btn]
1567
  )
1568
 
1569
  # Clear session button
 
1599
  ]
1600
  )
1601
 
1602
+ # Chaplain Feedback Event Handlers
1603
+ def show_chaplain_feedback_section():
1604
+ """Show chaplain feedback section after message review."""
1605
+ return gr.Row(visible=True)
1606
+
1607
+ def handle_submit_feedback(
1608
+ classification_correct: bool,
1609
+ classification_subcategory: Optional[str],
1610
+ correct_classification: Optional[str],
1611
+ question_issues: List[str],
1612
+ question_comments: str,
1613
+ referral_issues: List[str],
1614
+ referral_comments: str,
1615
+ indicator_issues: str,
1616
+ indicator_comments: str,
1617
+ general_notes: str,
1618
+ session: VerificationSession,
1619
+ current_idx: int,
1620
+ message_queue: List[str],
1621
+ ):
1622
+ """Handle chaplain feedback submission."""
1623
+ try:
1624
+ if not session or current_idx >= len(message_queue):
1625
+ return "❌ Error: Invalid session state", session, current_idx
1626
+
1627
+ # Create tagging record
1628
+ from src.core.chaplain_models import TaggingRecord
1629
+ import uuid
1630
+
1631
+ current_message_id = message_queue[current_idx]
1632
+
1633
+ tagging_record = TaggingRecord(
1634
+ record_id=str(uuid.uuid4()),
1635
+ message_id=current_message_id,
1636
+ is_classification_correct=classification_correct,
1637
+ classification_subcategory=classification_subcategory,
1638
+ correct_classification=correct_classification,
1639
+ question_issues=question_issues or [],
1640
+ question_comments=question_comments,
1641
+ referral_issues=referral_issues or [],
1642
+ referral_comments=referral_comments,
1643
+ indicator_issues=[i.strip() for i in (indicator_issues or "").split(",") if i.strip()],
1644
+ indicator_comments=indicator_comments,
1645
+ general_notes=general_notes,
1646
+ )
1647
+
1648
+ # Store tagging record in session (would need to extend VerificationSession)
1649
+ # For now, just confirm submission
1650
+ success_msg = f"✅ Feedback submitted for message {current_idx + 1}"
1651
+
1652
+ return success_msg, session, current_idx
1653
+
1654
+ except Exception as e:
1655
+ return f"❌ Error: {str(e)}", session, current_idx
1656
+
1657
+ def display_classification_flow(flow_result: Optional[ClassificationFlowResult]):
1658
+ """Display classification flow result."""
1659
+ if not flow_result:
1660
+ return "", "", "", ""
1661
+
1662
+ badge, explanation, content, indicators = ChaplainFeedbackUIComponents.render_classification_flow(flow_result)
1663
+ return badge, explanation, content, indicators
1664
+
1665
+ # Bind chaplain feedback events
1666
+ submit_feedback_btn.click(
1667
+ handle_submit_feedback,
1668
+ inputs=[
1669
+ is_correct, # is_correct radio
1670
+ subcategory, # subcategory dropdown
1671
+ correct_classification, # correct_classification radio
1672
+ question_issues, # question_issues checkbox
1673
+ question_comments, # question_comments textbox
1674
+ referral_issues, # referral_issues checkbox
1675
+ referral_comments, # referral_comments textbox
1676
+ indicator_issues, # indicator_issues textbox
1677
+ indicator_comments, # indicator_comments textbox
1678
+ general_notes,
1679
+ verification_session,
1680
+ current_message_index,
1681
+ message_queue,
1682
+ ],
1683
+ outputs=[error_message, verification_session, current_message_index]
1684
+ ).then(
1685
+ lambda: gr.Row(visible=False),
1686
+ outputs=[chaplain_feedback_section]
1687
+ )
1688
+
1689
  # Bind events
1690
  demo.load(
1691
  initialize_session,
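The reworked `handle_download_csv` writes the CSV text into the system temp directory and returns the path, which the download button then serves. A minimal sketch of just the file-writing part, without any Gradio dependency, is given here. The helper name `write_csv_for_download` and the sample file name are hypothetical; the `newline=""` argument is a suggestion, not something the commit itself uses.

```python
import os
import tempfile


def write_csv_for_download(csv_content: str, filename: str) -> str:
    """Write CSV text to the system temp directory and return the file path."""
    file_path = os.path.join(tempfile.gettempdir(), filename)
    # newline="" stops Python from translating line endings that the csv
    # module already wrote into the content.
    with open(file_path, "w", encoding="utf-8", newline="") as f:
        f.write(csv_content)
    return file_path


# Hypothetical content and file name, for illustration only.
sample = "Patient Message,Classifier Said\r\nhello,GREEN\r\n"
path = write_csv_for_download(sample, "example_export.csv")
```

Returning a path rather than a status message matches how a file-serving component expects its value, at the cost of moving user-facing error text elsewhere.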
tests/chaplain_feedback/__init__.py ADDED
@@ -0,0 +1,2 @@
1
+ # tests/chaplain_feedback/__init__.py
2
+ """Tests for Chaplain Feedback & Tagging System."""
tests/chaplain_feedback/conftest.py ADDED
@@ -0,0 +1,145 @@
+ # conftest.py
+ """
+ Pytest fixtures for Chaplain Feedback tests.
+ """
+
+ import pytest
+ from hypothesis import strategies as st
+ from datetime import datetime
+
+ from src.core.chaplain_models import (
+     DistressIndicator,
+     FollowUpQuestion,
+     ClassificationFlowResult,
+     TaggingRecord,
+     InteractionStepLog,
+     INDICATOR_DEFINITIONS,
+     CLASSIFICATION_SUBCATEGORIES,
+     QUESTION_ISSUE_TYPES,
+     REFERRAL_ISSUE_TYPES,
+     INTERACTION_STEP_TYPES,
+ )
+
+
+ # =============================================================================
+ # Hypothesis Strategies for generating test data
+ # =============================================================================
+
+ def valid_id_strategy():
+     """Generate valid IDs."""
+     return st.text(
+         alphabet="abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_-",
+         min_size=1,
+         max_size=20,
+     )
+
+
+ def distress_indicator_strategy():
+     """Generate random DistressIndicator instances."""
+     return st.builds(
+         DistressIndicator,
+         indicator_text=st.text(min_size=1, max_size=200),
+         category=st.sampled_from([
+             "Emotional", "Grief", "Existential", "Expressions",
+             "Spiritual", "Medical", "Social", "Cultural",
+             "Engagement", "Guilt", "Anger", "Aging",
+             "Environment", "Independence"
+         ]),
+         subcategory=st.text(min_size=1, max_size=100),
+         severity=st.sampled_from(["red", "yellow"]),
+         confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+         definition_reference=st.text(max_size=20),
+     )
+
+
+ def follow_up_question_strategy():
+     """Generate random FollowUpQuestion instances."""
+     return st.builds(
+         FollowUpQuestion,
+         question_id=valid_id_strategy(),
+         question_text=st.text(min_size=1, max_size=500),
+         purpose=st.text(min_size=1, max_size=200),
+     )
+
+
+ def classification_flow_result_strategy():
+     """Generate random ClassificationFlowResult instances."""
+     return st.builds(
+         ClassificationFlowResult,
+         classification=st.sampled_from(["red", "yellow", "green"]),
+         confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+         indicators=st.lists(distress_indicator_strategy(), max_size=5),
+         explanation=st.text(max_size=500),
+         permission_check_message=st.one_of(st.none(), st.text(max_size=300)),
+         referral_message=st.one_of(st.none(), st.text(max_size=500)),
+         consent_status=st.one_of(st.none(), st.sampled_from(["granted", "declined"])),
+         follow_up_questions=st.lists(follow_up_question_strategy(), max_size=3),
+         patient_responses=st.lists(st.text(max_size=200), max_size=3),
+         re_evaluation_result=st.one_of(st.none(), st.sampled_from(["red", "green"])),
+     )
+
+
+ def tagging_record_strategy():
+     """Generate random TaggingRecord instances."""
+     return st.builds(
+         TaggingRecord,
+         record_id=valid_id_strategy(),
+         message_id=valid_id_strategy(),
+         is_classification_correct=st.booleans(),
+         classification_subcategory=st.one_of(
+             st.none(),
+             st.sampled_from(CLASSIFICATION_SUBCATEGORIES)
+         ),
+         correct_classification=st.one_of(
+             st.none(),
+             st.sampled_from(["red", "yellow", "green"])
+         ),
+         question_issues=st.lists(
+             st.sampled_from(QUESTION_ISSUE_TYPES),
+             max_size=3,
+             unique=True
+         ),
+         question_comments=st.one_of(st.none(), st.text(max_size=200)),
+         referral_issues=st.lists(
+             st.sampled_from(REFERRAL_ISSUE_TYPES),
+             max_size=3,
+             unique=True
+         ),
+         referral_comments=st.one_of(st.none(), st.text(max_size=200)),
+         indicator_issues=st.lists(st.text(min_size=1, max_size=50), max_size=5),
+         indicator_comments=st.one_of(st.none(), st.text(max_size=200)),
+         general_notes=st.text(max_size=300),
+         timestamp=st.just(datetime.now()),
+     )
+
+
+ def interaction_step_log_strategy():
+     """Generate random InteractionStepLog instances (without nested tagging)."""
+     return st.builds(
+         InteractionStepLog,
+         step_id=valid_id_strategy(),
+         session_id=valid_id_strategy(),
+         message_id=valid_id_strategy(),
+         step_type=st.sampled_from(INTERACTION_STEP_TYPES),
+         input_text=st.text(max_size=500),
+         model_output=st.text(max_size=500),
+         approval_status=st.one_of(st.none(), st.sampled_from(["approved", "disapproved"])),
+         tagging_data=st.none(),  # Simplified - no nested tagging for basic tests
+         timestamp=st.just(datetime.now()),
+     )
+
+
+ def interaction_step_log_with_tagging_strategy():
+     """Generate random InteractionStepLog instances with nested tagging."""
+     return st.builds(
+         InteractionStepLog,
+         step_id=valid_id_strategy(),
+         session_id=valid_id_strategy(),
+         message_id=valid_id_strategy(),
+         step_type=st.sampled_from(INTERACTION_STEP_TYPES),
+         input_text=st.text(max_size=500),
+         model_output=st.text(max_size=500),
+         approval_status=st.one_of(st.none(), st.sampled_from(["approved", "disapproved"])),
+         tagging_data=st.one_of(st.none(), tagging_record_strategy()),
+         timestamp=st.just(datetime.now()),
+     )
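The strategies in `conftest.py` pass keyword arguments straight into the model constructors, so the field names give a rough picture of the data model. What follows is a minimal sketch, assuming `DistressIndicator` behaves like a plain dataclass. The real definition lives in `src/core/chaplain_models.py` and is not shown in this hunk, so `DistressIndicatorSketch`, its validation, and the default value are hypothetical.

```python
from dataclasses import dataclass


# Hypothetical stand-in for DistressIndicator; field names are copied from
# the keyword arguments used in distress_indicator_strategy().
@dataclass
class DistressIndicatorSketch:
    indicator_text: str
    category: str
    subcategory: str
    severity: str      # the strategy samples only "red" or "yellow"
    confidence: float  # the strategy draws floats in [0.0, 1.0], never NaN
    definition_reference: str = ""

    def __post_init__(self) -> None:
        # Assumed validation, mirroring the ranges the strategy generates.
        if self.severity not in ("red", "yellow"):
            raise ValueError(f"unexpected severity: {self.severity!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError(f"confidence out of range: {self.confidence!r}")


example = DistressIndicatorSketch(
    indicator_text="I feel hopeless",
    category="Emotional",
    subcategory="Hopelessness",
    severity="red",
    confidence=0.9,
)
```

If the real model validates its inputs in a similar way, the bounded `st.floats` and `st.sampled_from` calls in the strategies are what keep `st.builds` from generating instances that the constructor would reject.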
tests/chaplain_feedback/test_properties_classification_flow.py ADDED
@@ -0,0 +1,297 @@
+ # test_properties_classification_flow.py
+ """
+ Property-based tests for Classification Flow Manager.
+
+ Tests universal properties that should hold across all inputs for
+ RED/YELLOW/GREEN classification flows.
+ """
+
+ import pytest
+ from hypothesis import given, strategies as st
+
+ from src.core.classification_flow_manager import ClassificationFlowManager
+ from src.core.content_generator import ContentGenerator
+ from src.core.chaplain_models import DistressIndicator
+ from tests.chaplain_feedback.conftest import distress_indicator_strategy
+
+
+ class TestClassificationFlowProperties:
+     """Property-based tests for ClassificationFlowManager."""
+
+     def setup_method(self):
+         """Set up test fixtures."""
+         self.content_generator = ContentGenerator()
+         self.flow_manager = ClassificationFlowManager(self.content_generator)
+
+     @given(
+         message=st.text(min_size=1, max_size=500),
+         confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+         indicators=st.lists(distress_indicator_strategy(), min_size=1, max_size=5),
+         consent_status=st.sampled_from(["granted", "declined"])
+     )
+     def test_property_4_red_flow_displays_all_content(
+         self, message, confidence, indicators, consent_status
+     ):
+         """
+         **Feature: chaplain-feedback-system, Property 4: RED Flow Displays All Content**
+         **Validates: Requirements 1.5**
+
+         For any RED classification result, the UI should display all three content types:
+         explanation, permission check message, and referral message (if consent granted).
+         """
+         # Execute RED flow
+         result = self.flow_manager.execute_red_flow(
+             message=message,
+             confidence=confidence,
+             indicators=indicators,
+             consent_status=consent_status
+         )
+
+         # Verify all required content is present
+         assert result.classification == "red"
+         assert result.explanation is not None and result.explanation.strip() != ""
+         assert result.permission_check_message is not None and result.permission_check_message.strip() != ""
+         assert result.consent_status == consent_status
+
+         # If consent granted, referral message should be present
+         if consent_status == "granted":
+             assert result.referral_message is not None and result.referral_message.strip() != ""
+         else:
+             # If consent declined, referral message should be None
+             assert result.referral_message is None
+
+         # Verify indicators are preserved
+         assert result.indicators == indicators
+         assert result.confidence == confidence
+
+     @given(
+         message=st.text(min_size=1, max_size=500),
+         confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+         indicators=st.lists(distress_indicator_strategy(), min_size=1, max_size=5)
+     )
+     def test_property_5_yellow_explanation_differentiates(
+         self, message, confidence, indicators
+     ):
+         """
+         **Feature: chaplain-feedback-system, Property 5: YELLOW Explanation Differentiates**
+         **Validates: Requirements 2.1**
+
+         For any YELLOW classification, the explanation should contain reasoning
+         for why it's not RED and why it's not GREEN.
+         """
+         # Execute YELLOW flow
+         result = self.flow_manager.execute_yellow_flow(
+             message=message,
+             confidence=confidence,
+             indicators=indicators
+         )
+
+         # Verify explanation differentiates from RED and GREEN
+         explanation = result.explanation.lower()
+
+         # Should explain why not RED
+         assert any(phrase in explanation for phrase in [
+             "why not red", "not red", "not meet the threshold",
+             "do not meet", "further clarification", "not severe"
+         ]), f"Explanation should explain why not RED: {result.explanation}"
+
+         # Should explain why not GREEN
+         assert any(phrase in explanation for phrase in [
+             "why not green", "not green", "indicators", "concerns",
+             "warrant follow-up", "suggest possible"
+         ]), f"Explanation should explain why not GREEN: {result.explanation}"
+
+         # Verify other YELLOW flow properties
+         assert result.classification == "yellow"
+         assert result.explanation is not None and result.explanation.strip() != ""
+         assert len(result.follow_up_questions) >= 2
+         assert len(result.follow_up_questions) <= 3
+
+     @given(
+         message=st.text(min_size=1, max_size=500),
+         confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+         indicators=st.lists(distress_indicator_strategy(), min_size=1, max_size=5)
+     )
+     def test_property_6_yellow_generates_2_3_questions(
+         self, message, confidence, indicators
+     ):
+         """
+         **Feature: chaplain-feedback-system, Property 6: YELLOW Generates 2-3 Questions**
+         **Validates: Requirements 2.2**
+
+         For any YELLOW classification, the system should generate between 2 and 3
+         follow-up questions, each containing 1-2 clarifying questions.
+         """
+         # Execute YELLOW flow
+         result = self.flow_manager.execute_yellow_flow(
+             message=message,
+             confidence=confidence,
+             indicators=indicators
+         )
+
+         # Verify question count
+         assert 2 <= len(result.follow_up_questions) <= 3, (
+             f"Expected 2-3 questions, got {len(result.follow_up_questions)}"
+         )
+
+         # Verify each question has required fields
+         for question in result.follow_up_questions:
+             assert question.question_id is not None and question.question_id.strip() != ""
+             assert question.question_text is not None and question.question_text.strip() != ""
+             assert question.purpose is not None and question.purpose.strip() != ""
+
+             # Each question should contain 1-2 clarifying questions (check for question marks)
+             question_marks = question.question_text.count("?")
+             assert 1 <= question_marks <= 2, (
+                 f"Expected 1-2 questions per follow-up, got {question_marks} in: {question.question_text}"
+             )
+
+     @given(
+         message=st.text(min_size=1, max_size=500),
+         confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+         indicators=st.lists(distress_indicator_strategy(), max_size=2)  # GREEN should have few/no indicators
+     )
+     def test_property_9_green_explanation_generated(
+         self, message, confidence, indicators
+     ):
+         """
+         **Feature: chaplain-feedback-system, Property 9: GREEN Explanation Generated**
+         **Validates: Requirements 3.1, 3.2**
+
+         For any GREEN classification, an explanation should be generated explaining
+         why no spiritual indicators were found.
+         """
+         # Execute GREEN flow
+         result = self.flow_manager.execute_green_flow(
+             message=message,
+             confidence=confidence,
+             indicators=indicators
+         )
+
+         # Verify explanation is generated
+         assert result.classification == "green"
+         assert result.explanation is not None and result.explanation.strip() != ""
+
+         # Explanation should mention no indicators or no distress
+         explanation = result.explanation.lower()
+         assert any(phrase in explanation for phrase in [
+             "no spiritual distress", "no indicators", "not suggest spiritual",
+             "no spiritual concerns", "no further steps"
+         ]), f"GREEN explanation should mention no distress: {result.explanation}"
+
+         # Should explain why not RED or YELLOW
+         assert any(phrase in explanation for phrase in [
+             "why not red", "why not yellow", "not contain", "does not suggest"
+         ]), f"GREEN explanation should differentiate from RED/YELLOW: {result.explanation}"
+
+         # GREEN flow should not have RED/YELLOW specific content
+         assert result.permission_check_message is None
+         assert result.referral_message is None
+         assert result.consent_status is None
+         assert len(result.follow_up_questions) == 0
+         assert len(result.patient_responses) == 0
+         assert result.re_evaluation_result is None
+
+     @given(
+         message=st.text(min_size=1, max_size=500),
+         confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+         indicators=st.lists(distress_indicator_strategy(), min_size=1, max_size=5),
+         # Generate responses that contain escalation keywords
+         escalation_responses=st.lists(
+             st.sampled_from([
+                 "I feel hopeless about everything",
+                 "I feel worthless and can't go on",
+                 "There's no point in anything anymore",
+                 "I want to give up completely",
+                 "This is unbearable, I can't take it"
+             ]),
+             min_size=1,
+             max_size=3
+         )
+     )
+     def test_property_7_yellow_escalation_to_red(
+         self, message, confidence, indicators, escalation_responses
+     ):
+         """
+         **Feature: chaplain-feedback-system, Property 7: YELLOW Escalation to RED**
+         **Validates: Requirements 2.4**
+
+         For any YELLOW classification where simulated patient responses indicate distress,
+         the system should transition to RED FLAG flow.
+         """
+         # Execute YELLOW flow with escalation responses
+         result = self.flow_manager.execute_yellow_flow(
+             message=message,
+             confidence=confidence,
+             indicators=indicators,
+             patient_responses=escalation_responses
+         )
+
+         # Verify escalation occurred
+         assert result.re_evaluation_result == "red", (
+             f"Expected escalation to RED, got {result.re_evaluation_result} "
+             f"for responses: {escalation_responses}"
+         )
+
+         # Test the escalation method
+         escalated_result = self.flow_manager.escalate_yellow_to_red(result, message)
+
+         # Verify escalated result is RED
+         assert escalated_result.classification == "red"
+         assert escalated_result.explanation is not None
+         assert escalated_result.permission_check_message is not None
+         assert escalated_result.referral_message is not None  # Should have consent granted
+         assert escalated_result.consent_status == "granted"
+
+     @given(
+         message=st.text(min_size=1, max_size=500),
+         confidence=st.floats(min_value=0.0, max_value=1.0, allow_nan=False),
+         indicators=st.lists(distress_indicator_strategy(), min_size=1, max_size=5),
+         # Generate responses that contain downgrade keywords
+         downgrade_responses=st.lists(
+             st.sampled_from([
+                 "I'm feeling better now",
+                 "Everything is okay",
+                 "I have good support from my family",
+                 "I'm not worried about it",
+                 "I'm managing well",
+                 "I feel hopeful about the future"
+             ]),
+             min_size=1,
+             max_size=3
+         )
+     )
+     def test_property_8_yellow_downgrade_to_green(
+         self, message, confidence, indicators, downgrade_responses
+     ):
+         """
+         **Feature: chaplain-feedback-system, Property 8: YELLOW Downgrade to GREEN**
+         **Validates: Requirements 2.5**
+
+         For any YELLOW classification where simulated patient responses indicate no distress,
+         the system should transition to GREEN status.
+         """
+         # Execute YELLOW flow with downgrade responses
+         result = self.flow_manager.execute_yellow_flow(
+             message=message,
+             confidence=confidence,
+             indicators=indicators,
+             patient_responses=downgrade_responses
+         )
+
+         # Verify downgrade occurred
+         assert result.re_evaluation_result == "green", (
+             f"Expected downgrade to GREEN, got {result.re_evaluation_result} "
+             f"for responses: {downgrade_responses}"
+         )
+
+         # Test the downgrade method
+         downgraded_result = self.flow_manager.downgrade_yellow_to_green(result, message)
+
+         # Verify downgraded result is GREEN
+         assert downgraded_result.classification == "green"
+         assert downgraded_result.explanation is not None
+         assert downgraded_result.permission_check_message is None
+         assert downgraded_result.referral_message is None
+         assert downgraded_result.consent_status is None
+         assert len(downgraded_result.follow_up_questions) == 0
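Properties 7 and 8 feed the YELLOW flow canned responses that the test comments describe as containing escalation or downgrade keywords. One way such a re-evaluation could work is plain keyword matching. The sketch below rests on that assumption: `re_evaluate`, both keyword tuples, and the rule that escalation wins over downgrade are hypothetical and are not taken from `ClassificationFlowManager`, whose implementation is not part of this hunk.

```python
from typing import List, Optional

# Hypothetical keyword lists, picked from the sample responses in the tests.
ESCALATION_KEYWORDS = ("hopeless", "worthless", "no point", "give up", "unbearable")
DOWNGRADE_KEYWORDS = (
    "feeling better", "okay", "good support",
    "not worried", "managing well", "hopeful",
)


def re_evaluate(patient_responses: List[str]) -> Optional[str]:
    """Return "red", "green", or None when the responses are inconclusive."""
    text = " ".join(patient_responses).lower()
    # Escalation is checked first so that any distress signal takes priority.
    if any(keyword in text for keyword in ESCALATION_KEYWORDS):
        return "red"
    if any(keyword in text for keyword in DOWNGRADE_KEYWORDS):
        return "green"
    return None
```

Substring matching of this kind ignores negation ("I am not hopeful" would still match "hopeful"), so a production classifier would likely need something more robust than this illustration.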
tests/chaplain_feedback/test_properties_content_generator.py ADDED
@@ -0,0 +1,399 @@
+ # test_properties_content_generator.py
+ """
+ Property-based tests for Content Generator Service.
+
+ Tests that content generation follows the specification requirements.
+ """
+
+ import pytest
+ from hypothesis import given, settings, assume
+ from hypothesis import strategies as st
+
+ from src.core.chaplain_models import (
+     DistressIndicator,
+     FollowUpQuestion,
+ )
+ from src.core.content_generator import ContentGenerator
+
+ from tests.chaplain_feedback.conftest import (
+     distress_indicator_strategy,
+ )
+
+
+ # =============================================================================
+ # Strategies for content generator tests
+ # =============================================================================
+
+ def non_empty_indicators_strategy():
+     """Generate non-empty list of distress indicators."""
+     return st.lists(distress_indicator_strategy(), min_size=1, max_size=5)
+
+
+ def red_indicators_strategy():
+     """Generate list with at least one RED severity indicator."""
+     return st.lists(
+         distress_indicator_strategy(),
+         min_size=1,
+         max_size=5
+     ).filter(lambda indicators: any(i.severity == "red" for i in indicators))
+
+
+ def patient_message_strategy():
+     """Generate patient message text."""
+     return st.text(min_size=10, max_size=500).filter(lambda s: s.strip())
+
+
+ # =============================================================================
+ # Property Tests for RED Explanation
+ # =============================================================================
+
+ class TestRedExplanationContainsIndicators:
+     """
+     **Feature: chaplain-feedback-system, Property 1: RED Explanation Contains Indicators**
+     **Validates: Requirements 1.1**
+
+     For any RED classification, the generated explanation should reference
+     at least one distress indicator from the definitions document categories.
+     """
+
+     @given(
+         indicators=non_empty_indicators_strategy(),
+         message=patient_message_strategy()
+     )
+     @settings(max_examples=100)
+     def test_red_explanation_contains_indicator_references(self, indicators, message):
+         """
+         **Feature: chaplain-feedback-system, Property 1: RED Explanation Contains Indicators**
+         **Validates: Requirements 1.1**
+
+         For any RED classification with indicators, the explanation should
+         reference at least one indicator's subcategory or category.
+         """
+         generator = ContentGenerator()
+         explanation = generator.generate_explanation("red", indicators, message)
+
+         # The explanation should contain at least one indicator reference
+         indicator_referenced = False
+         for indicator in indicators:
+             if indicator.subcategory in explanation or indicator.category in explanation:
+                 indicator_referenced = True
+                 break
+
+         assert indicator_referenced, (
+             f"RED explanation should reference at least one indicator. "
+             f"Indicators: {[i.subcategory for i in indicators]}"
+         )
+
+     @given(
+         indicators=non_empty_indicators_strategy(),
+         message=patient_message_strategy()
+     )
+     @settings(max_examples=100)
+     def test_red_explanation_mentions_red_flag(self, indicators, message):
+         """
+         **Feature: chaplain-feedback-system, Property 1: RED Explanation Contains Indicators**
+         **Validates: Requirements 1.1**
+
+         For any RED classification, the explanation should mention RED FLAG.
+         """
+         generator = ContentGenerator()
+         explanation = generator.generate_explanation("red", indicators, message)
+
+         assert "RED FLAG" in explanation or "red" in explanation.lower(), (
+             "RED explanation should mention RED FLAG classification"
+         )
+
+     @given(
+         indicators=non_empty_indicators_strategy(),
+         message=patient_message_strategy()
+     )
+     @settings(max_examples=100)
+     def test_red_explanation_mentions_spiritual_care(self, indicators, message):
+         """
+         **Feature: chaplain-feedback-system, Property 1: RED Explanation Contains Indicators**
+         **Validates: Requirements 1.1**
+
+         For any RED classification, the explanation should mention spiritual care team.
+         """
+         generator = ContentGenerator()
+         explanation = generator.generate_explanation("red", indicators, message)
+
+         assert "spiritual" in explanation.lower(), (
+             "RED explanation should mention spiritual care"
+         )
+
+
+
+ # =============================================================================
+ # Property Tests for Permission Check Message
+ # =============================================================================
+
+ class TestRedPermissionCheckGenerated:
+     """
+     **Feature: chaplain-feedback-system, Property 2: RED Permission Check Generated**
+     **Validates: Requirements 1.2**
+
+     For any RED classification, a patient permission check message should be
+     generated and contain consent-related language.
+     """
+
+     @given(indicators=non_empty_indicators_strategy())
+     @settings(max_examples=100)
+     def test_permission_check_contains_spiritual_support(self, indicators):
+         """
+         **Feature: chaplain-feedback-system, Property 2: RED Permission Check Generated**
+         **Validates: Requirements 1.2**
+
+         For any RED classification, the permission check message should
+         contain "spiritual" language.
+         """
+         generator = ContentGenerator()
+         message = generator.generate_permission_check(indicators)
+
+         assert "spiritual" in message.lower(), (
+             "Permission check message should mention spiritual support"
+         )
+
+     @given(indicators=non_empty_indicators_strategy())
+     @settings(max_examples=100)
+     def test_permission_check_contains_consent_language(self, indicators):
+         """
+         **Feature: chaplain-feedback-system, Property 2: RED Permission Check Generated**
+         **Validates: Requirements 1.2**
+
+         For any RED classification, the permission check message should
+         contain consent-related language.
+         """
+         generator = ContentGenerator()
+         message = generator.generate_permission_check(indicators)
+
+         # Check for consent-related terms
+         consent_terms = ["consent", "permission", "voluntary", "would you like"]
+         has_consent_language = any(term in message.lower() for term in consent_terms)
+
+         assert has_consent_language, (
+             f"Permission check message should contain consent language. "
+             f"Message: {message[:200]}..."
+         )
+
+     @given(indicators=non_empty_indicators_strategy())
+     @settings(max_examples=100)
+     def test_permission_check_is_non_empty(self, indicators):
+         """
+         **Feature: chaplain-feedback-system, Property 2: RED Permission Check Generated**
+         **Validates: Requirements 1.2**
+
+         For any RED classification, a non-empty permission check message
+         should be generated.
+         """
+         generator = ContentGenerator()
+         message = generator.generate_permission_check(indicators)
+
+         assert message and len(message.strip()) > 0, (
+             "Permission check message should not be empty"
+         )
+
+
+
+ # =============================================================================
+ # Property Tests for Referral Message
+ # =============================================================================
+
+ class TestRedReferralMessageContainsRequiredSections:
+     """
+     **Feature: chaplain-feedback-system, Property 3: RED Referral Message Contains Required Sections**
+     **Validates: Requirements 1.3**
+
+     For any RED classification with granted consent, the referral message should
+     contain: background information, detected indicators, and justification.
+     """
+
+     @given(
+         indicators=non_empty_indicators_strategy(),
+         message=patient_message_strategy()
+     )
+     @settings(max_examples=100)
+     def test_referral_message_contains_background(self, indicators, message):
+         """
+         **Feature: chaplain-feedback-system, Property 3: RED Referral Message Contains Required Sections**
+         **Validates: Requirements 1.3**
+
+         For any RED classification, the referral message should contain
+         background information section.
+         """
+         generator = ContentGenerator()
+         explanation = generator.generate_explanation("red", indicators, message)
+         referral = generator.generate_referral_message(message, indicators, explanation)
+
+         assert "BACKGROUND" in referral.upper(), (
+             "Referral message should contain BACKGROUND section"
+         )
+
+     @given(
+         indicators=non_empty_indicators_strategy(),
+         message=patient_message_strategy()
+     )
+     @settings(max_examples=100)
+     def test_referral_message_contains_indicators_section(self, indicators, message):
+         """
+         **Feature: chaplain-feedback-system, Property 3: RED Referral Message Contains Required Sections**
+         **Validates: Requirements 1.3**
+
+         For any RED classification, the referral message should contain
+         indicators section.
+         """
+         generator = ContentGenerator()
+         explanation = generator.generate_explanation("red", indicators, message)
+         referral = generator.generate_referral_message(message, indicators, explanation)
+
+         assert "INDICATORS" in referral.upper(), (
+             "Referral message should contain INDICATORS section"
+         )
+
+     @given(
+         indicators=non_empty_indicators_strategy(),
+         message=patient_message_strategy()
+     )
+     @settings(max_examples=100)
+     def test_referral_message_contains_justification(self, indicators, message):
+         """
+         **Feature: chaplain-feedback-system, Property 3: RED Referral Message Contains Required Sections**
+         **Validates: Requirements 1.3**
+
+         For any RED classification, the referral message should contain
+         justification section.
+         """
+         generator = ContentGenerator()
+         explanation = generator.generate_explanation("red", indicators, message)
+         referral = generator.generate_referral_message(message, indicators, explanation)
+
+         assert "JUSTIFICATION" in referral.upper(), (
+             "Referral message should contain JUSTIFICATION section"
+         )
+
+     @given(
+         indicators=non_empty_indicators_strategy(),
+         message=patient_message_strategy()
+     )
+     @settings(max_examples=100)
+     def test_referral_message_references_indicators(self, indicators, message):
+         """
+         **Feature: chaplain-feedback-system, Property 3: RED Referral Message Contains Required Sections**
+         **Validates: Requirements 1.3**
+
+         For any RED classification with indicators, the referral message should
+         reference at least one indicator.
+         """
+         generator = ContentGenerator()
+         explanation = generator.generate_explanation("red", indicators, message)
+         referral = generator.generate_referral_message(message, indicators, explanation)
+
+         # Check that at least one indicator is referenced
+         indicator_referenced = False
+         for indicator in indicators:
+             if indicator.subcategory in referral or indicator.category in referral:
+                 indicator_referenced = True
+                 break
+
+         assert indicator_referenced, (
+             f"Referral message should reference at least one indicator. "
+             f"Indicators: {[i.subcategory for i in indicators]}"
+         )
+
+
+
305
+ # =============================================================================
306
+ # Property Tests for Follow-Up Questions
307
+ # =============================================================================
308
+
309
+class TestYellowGenerates2To3Questions:
+    """
+    **Feature: chaplain-feedback-system, Property 6: YELLOW Generates 2-3 Questions**
+    **Validates: Requirements 2.2**
+
+    For any YELLOW classification, the system should generate between 2 and 3
+    follow-up questions, each containing 1-2 clarifying questions.
+    """
+
+    @given(
+        indicators=st.lists(distress_indicator_strategy(), min_size=0, max_size=5),
+        message=patient_message_strategy()
+    )
+    @settings(max_examples=100)
+    def test_follow_up_questions_count_in_range(self, indicators, message):
+        """
+        **Feature: chaplain-feedback-system, Property 6: YELLOW Generates 2-3 Questions**
+        **Validates: Requirements 2.2**
+
+        For any YELLOW classification, the number of follow-up questions
+        should be between 2 and 3.
+        """
+        generator = ContentGenerator()
+        questions = generator.generate_follow_up_questions(message, indicators)
+
+        assert 2 <= len(questions) <= 3, (
+            f"Should generate 2-3 follow-up questions, got {len(questions)}"
+        )
+
+    @given(
+        indicators=st.lists(distress_indicator_strategy(), min_size=0, max_size=5),
+        message=patient_message_strategy()
+    )
+    @settings(max_examples=100)
+    def test_follow_up_questions_have_required_fields(self, indicators, message):
+        """
+        **Feature: chaplain-feedback-system, Property 6: YELLOW Generates 2-3 Questions**
+        **Validates: Requirements 2.2**
+
+        For any YELLOW classification, each follow-up question should have
+        question_id, question_text, and purpose fields.
+        """
+        generator = ContentGenerator()
+        questions = generator.generate_follow_up_questions(message, indicators)
+
+        for question in questions:
+            assert question.question_id, "Question should have question_id"
+            assert question.question_text, "Question should have question_text"
+            assert question.purpose, "Question should have purpose"
+
+    @given(
+        indicators=st.lists(distress_indicator_strategy(), min_size=0, max_size=5),
+        message=patient_message_strategy()
+    )
+    @settings(max_examples=100)
+    def test_follow_up_questions_are_follow_up_question_instances(self, indicators, message):
+        """
+        **Feature: chaplain-feedback-system, Property 6: YELLOW Generates 2-3 Questions**
+        **Validates: Requirements 2.2**
+
+        For any YELLOW classification, all generated questions should be
+        FollowUpQuestion instances.
+        """
+        generator = ContentGenerator()
+        questions = generator.generate_follow_up_questions(message, indicators)
+
+        for question in questions:
+            assert isinstance(question, FollowUpQuestion), (
+                f"Question should be FollowUpQuestion instance, got {type(question)}"
+            )
+
+    @given(
+        indicators=st.lists(distress_indicator_strategy(), min_size=0, max_size=5),
+        message=patient_message_strategy()
+    )
+    @settings(max_examples=100)
+    def test_follow_up_questions_have_unique_ids(self, indicators, message):
+        """
+        **Feature: chaplain-feedback-system, Property 6: YELLOW Generates 2-3 Questions**
+        **Validates: Requirements 2.2**
+
+        For any YELLOW classification, all generated questions should have
+        unique question_ids.
+        """
+        generator = ContentGenerator()
+        questions = generator.generate_follow_up_questions(message, indicators)
+
+        question_ids = [q.question_id for q in questions]
+        assert len(question_ids) == len(set(question_ids)), (
+            f"Question IDs should be unique, got: {question_ids}"
+        )
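Review note: the four properties above (count in range, required fields, instance type, unique IDs) can be illustrated with a small stand-in generator. This is only a sketch under stated assumptions: `FollowUpQuestionSketch`, `generate_follow_up_questions_sketch`, the template texts, and the "third question when two or more indicators are present" rule are all hypothetical and are not taken from `ContentGenerator`.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class FollowUpQuestionSketch:
    """Hypothetical stand-in for FollowUpQuestion (same three field names)."""
    question_id: str
    question_text: str
    purpose: str


# Hypothetical templates; the real generator's wording is not shown in this diff.
_TEMPLATES = [
    ("Can you tell me more about what you are feeling?", "clarify emotional state"),
    ("How long have you been feeling this way?", "assess duration"),
    ("Is there someone you can talk to right now?", "assess available support"),
]


def generate_follow_up_questions_sketch(
    message: str, indicators: Sequence[str]
) -> List[FollowUpQuestionSketch]:
    """Return 2 questions by default, or 3 when at least two indicators are present."""
    count = 3 if len(indicators) >= 2 else 2
    return [
        # IDs are derived from the position, so they are unique within one result.
        FollowUpQuestionSketch(question_id=f"q_{i + 1}", question_text=text, purpose=purpose)
        for i, (text, purpose) in enumerate(_TEMPLATES[:count])
    ]
```

Deriving IDs from the list position keeps them unique within a single result without any shared state, which is the simplest way to satisfy the uniqueness property.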
tests/chaplain_feedback/test_properties_csv_export.py ADDED
@@ -0,0 +1,290 @@
+# test_properties_csv_export.py
+"""
+Property-based tests for Enhanced CSV Export functionality.
+
+Tests that CSV export includes all tagging data, generated content,
+interaction logs, and statistics.
+"""
+
+import pytest
+from hypothesis import given, settings
+from datetime import datetime
+
+from src.core.verification_csv_exporter import VerificationCSVExporter
+from src.core.verification_models import VerificationSession, VerificationRecord
+from src.core.chaplain_models import (
+    TaggingRecord,
+    ClassificationFlowResult,
+    InteractionStepLog,
+    DistressIndicator,
+    FollowUpQuestion,
+)
+
+from tests.chaplain_feedback.conftest import (
+    tagging_record_strategy,
+    classification_flow_result_strategy,
+    interaction_step_log_strategy,
+)
+
+
+class TestExportContainsAllTags:
+    """
+    **Feature: chaplain-feedback-system, Property 17: Export Contains All Tags**
+
+    Tests that CSV export includes all tagging categories and subcategories.
+    """
+
+    @given(tagging_record_strategy())
+    @settings(max_examples=100)
+    def test_export_contains_all_tags(self, tagging_record):
+        """
+        **Feature: chaplain-feedback-system, Property 17: Export Contains All Tags**
+        **Validates: Requirements 9.1**
+
+        For any TaggingRecord, the CSV export should contain all tagging
+        categories and subcategories from that record.
+        """
+        # Create a minimal session
+        session = VerificationSession(
+            session_id="test_session",
+            verifier_name="Test Verifier",
+            dataset_id="test_dataset",
+            dataset_name="Test Dataset",
+            total_messages=1,
+            verified_count=1,
+            correct_count=1,
+            incorrect_count=0,
+        )
+
+        # Add a verification record
+        verification = VerificationRecord(
+            message_id=tagging_record.message_id,
+            original_message="Test message",
+            classifier_decision="red",
+            classifier_confidence=0.9,
+            classifier_indicators=["indicator1"],
+            ground_truth_label="red",
+            is_correct=True,
+        )
+        session.verifications.append(verification)
+
+        # Generate CSV with tagging records
+        csv_content = VerificationCSVExporter.generate_enhanced_csv_content(
+            session,
+            tagging_records=[tagging_record],
+        )
+
+        # Verify tagging data section exists
+        assert "TAGGING DATA" in csv_content
+
+        # Verify message ID is in export
+        assert tagging_record.message_id in csv_content
+
+        # Verify classification correctness is in export
+        correctness_str = "Yes" if tagging_record.is_classification_correct else "No"
+        assert correctness_str in csv_content
+
+        # Verify classification subcategory is in export (if present)
+        if tagging_record.classification_subcategory:
+            assert tagging_record.classification_subcategory in csv_content
+
+        # Verify correct classification is in export (if present)
+        if tagging_record.correct_classification:
+            assert tagging_record.correct_classification in csv_content
+
+        # Verify question issues are in export (if present)
+        if tagging_record.question_issues:
+            for issue in tagging_record.question_issues:
+                assert issue in csv_content
+
+        # Verify referral issues are in export (if present)
+        if tagging_record.referral_issues:
+            for issue in tagging_record.referral_issues:
+                assert issue in csv_content
+
+        # Verify indicator issues are in export (if present)
+        if tagging_record.indicator_issues:
+            for indicator_id in tagging_record.indicator_issues:
+                assert indicator_id in csv_content
+
+
+class TestExportContainsGeneratedContent:
+    """
+    **Feature: chaplain-feedback-system, Property 18: Export Contains Generated Content**
+
+    Tests that CSV export includes all generated content.
+    """
+
+    @given(classification_flow_result_strategy())
+    @settings(max_examples=100)
+    def test_export_contains_generated_content(self, flow_result):
+        """
+        **Feature: chaplain-feedback-system, Property 18: Export Contains Generated Content**
+        **Validates: Requirements 9.2**
+
+        For any ClassificationFlowResult, the CSV export should contain
+        all generated content (explanations, questions, referral messages).
+        """
+        # Create a minimal session
+        session = VerificationSession(
+            session_id="test_session",
+            verifier_name="Test Verifier",
+            dataset_id="test_dataset",
+            dataset_name="Test Dataset",
+            total_messages=1,
+            verified_count=1,
+            correct_count=1,
+            incorrect_count=0,
+        )
+
+        # Add a verification record
+        message_id = "msg_001"
+        verification = VerificationRecord(
+            message_id=message_id,
+            original_message="Test message",
+            classifier_decision=flow_result.classification,
+            classifier_confidence=flow_result.confidence,
+            classifier_indicators=[ind.indicator_text for ind in flow_result.indicators],
+            ground_truth_label=flow_result.classification,
+            is_correct=True,
+        )
+        session.verifications.append(verification)
+
+        # Generate CSV with flow results
+        flow_results = {message_id: flow_result}
+        csv_content = VerificationCSVExporter.generate_enhanced_csv_content(
+            session,
+            flow_results=flow_results,
+        )
+
+        # Verify generated content section exists
+        assert "GENERATED CONTENT" in csv_content
+
+        # Verify message ID is in export
+        assert message_id in csv_content
+
+        # Verify classification is in export
+        assert flow_result.classification.upper() in csv_content
+
+
+class TestExportContainsInteractionLogs:
+    """
+    Tests that CSV export includes interaction logs.
+    """
+
+    @given(interaction_step_log_strategy())
+    @settings(max_examples=100)
+    def test_export_contains_interaction_logs(self, log):
+        """
+        For any InteractionStepLog, the CSV export should contain
+        all logged interaction steps.
+        """
+        # Create a minimal session
+        session = VerificationSession(
+            session_id=log.session_id,
+            verifier_name="Test Verifier",
+            dataset_id="test_dataset",
+            dataset_name="Test Dataset",
+            total_messages=1,
+            verified_count=1,
+            correct_count=1,
+            incorrect_count=0,
+        )
+
+        # Add a verification record
+        verification = VerificationRecord(
+            message_id=log.message_id,
+            original_message="Test message",
+            classifier_decision="red",
+            classifier_confidence=0.9,
+            classifier_indicators=["indicator1"],
+            ground_truth_label="red",
+            is_correct=True,
+        )
+        session.verifications.append(verification)
+
+        # Generate CSV with interaction logs
+        csv_content = VerificationCSVExporter.generate_enhanced_csv_content(
+            session,
+            interaction_logs=[log],
+        )
+
+        # Verify interaction logs section exists
+        assert "INTERACTION LOGS" in csv_content
+
+        # Verify step ID is in export
+        assert log.step_id in csv_content
+
+        # Verify session ID is in export
+        assert log.session_id in csv_content
+
+        # Verify message ID is in export
+        assert log.message_id in csv_content
+
+        # Verify step type is in export
+        assert log.step_type in csv_content
+
+        # Verify approval status is in export (if present)
+        if log.approval_status:
+            assert log.approval_status in csv_content
+
+
+class TestExportContainsStatistics:
+    """
+    Tests that CSV export includes error pattern statistics.
+    """
+
+    @given(tagging_record_strategy())
+    @settings(max_examples=100)
+    def test_export_contains_statistics(self, tagging_record):
+        """
+        For any set of TaggingRecords, the CSV export should contain
+        error pattern statistics with subcategory breakdowns.
+        """
+        # Create a minimal session
+        session = VerificationSession(
+            session_id="test_session",
+            verifier_name="Test Verifier",
+            dataset_id="test_dataset",
+            dataset_name="Test Dataset",
+            total_messages=1,
+            verified_count=1,
+            correct_count=1,
+            incorrect_count=0,
+        )
+
+        # Add a verification record
+        verification = VerificationRecord(
+            message_id=tagging_record.message_id,
+            original_message="Test message",
+            classifier_decision="red",
+            classifier_confidence=0.9,
+            classifier_indicators=["indicator1"],
+            ground_truth_label="red",
+            is_correct=True,
+        )
+        session.verifications.append(verification)
+
+        # Generate CSV with tagging records (which triggers statistics)
+        csv_content = VerificationCSVExporter.generate_enhanced_csv_content(
+            session,
+            tagging_records=[tagging_record],
+        )
+
+        # Verify statistics section exists
+        assert "ERROR PATTERN STATISTICS" in csv_content
+
+        # Verify classification errors section exists
+        assert "Classification Errors" in csv_content
+
+        # Verify question issues section exists
+        assert "Question Issues" in csv_content
+
+        # Verify referral issues section exists
+        assert "Referral Issues" in csv_content
+
+        # Verify indicator issues section exists
+        assert "Indicator Issues" in csv_content
+
+        # Verify common patterns section exists
+        assert "Common Patterns" in csv_content
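Review note: several assertions above check for a short value such as "Yes" or "No" anywhere in the whole CSV string, which can pass by accident (for example, "No" is a substring of "Notes"). A possibly stricter alternative is to parse the export and compare one column of one section. The sketch below assumes the export is a sequence of titled sections separated by blank rows, as the `VERIFICATION SUMMARY` block in the exported file suggests; the helper names and the column names `Message ID` and `Classification Correct` are hypothetical.

```python
import csv
import io
from typing import List


def find_section_rows(csv_content: str, section_title: str) -> List[List[str]]:
    """Return the rows after a section title row, up to the next blank row."""
    rows = list(csv.reader(io.StringIO(csv_content)))
    for index, row in enumerate(rows):
        if row and row[0] == section_title:
            section = []
            for candidate in rows[index + 1:]:
                # A blank line parses to an empty row and ends the section.
                if not any(cell.strip() for cell in candidate):
                    break
                section.append(candidate)
            return section
    return []


def column_values(section_rows: List[List[str]], column_name: str) -> List[str]:
    """Treat the first row as a header and return one column's data values."""
    header = section_rows[0]
    position = header.index(column_name)
    return [row[position] for row in section_rows[1:]]


# Hypothetical export layout, used only to show the difference between checks.
sample = (
    "VERIFICATION SUMMARY\n"
    "Notes,No issues recorded\n"
    "\n"
    "TAGGING DATA\n"
    "Message ID,Classification Correct\n"
    "msg_001,Yes\n"
)

tagging_rows = find_section_rows(sample, "TAGGING DATA")
correctness = column_values(tagging_rows, "Classification Correct")
```

With this sample, the substring check `"No" in sample` succeeds even though the only tagging row is marked "Yes", while the column-based check sees only the value that was actually exported for that field.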
tests/chaplain_feedback/test_properties_data_models.py ADDED
@@ -0,0 +1,250 @@
+# test_properties_data_models.py
+"""
+Property-based tests for Chaplain Feedback data model serialization.
+
+Tests that all data models serialize and deserialize correctly (round-trip).
+"""
+
+import pytest
+from hypothesis import given, settings
+from datetime import datetime
+
+from src.core.chaplain_models import (
+    DistressIndicator,
+    FollowUpQuestion,
+    ClassificationFlowResult,
+    TaggingRecord,
+    InteractionStepLog,
+    INDICATOR_DEFINITIONS,
+)
+
+from tests.chaplain_feedback.conftest import (
+    distress_indicator_strategy,
+    follow_up_question_strategy,
+    classification_flow_result_strategy,
+    tagging_record_strategy,
+    interaction_step_log_strategy,
+    interaction_step_log_with_tagging_strategy,
+)
+
+
+class TestDistressIndicatorRoundTrip:
+    """
+    **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
+
+    Tests that DistressIndicator serializes and deserializes correctly.
+    """
+
+    @given(distress_indicator_strategy())
+    @settings(max_examples=100)
+    def test_distress_indicator_round_trip(self, indicator):
+        """
+        **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
+        **Validates: Requirements 8.5**
+
+        For any DistressIndicator, converting to dict and back should
+        preserve all fields exactly.
+        """
+        # Convert to dict and back
+        indicator_dict = indicator.to_dict()
+        restored = DistressIndicator.from_dict(indicator_dict)
+
+        # Verify all fields match
+        assert restored.indicator_text == indicator.indicator_text
+        assert restored.category == indicator.category
+        assert restored.subcategory == indicator.subcategory
+        assert restored.severity == indicator.severity
+        assert restored.confidence == indicator.confidence
+        assert restored.definition_reference == indicator.definition_reference
+
+    def test_distress_indicator_from_definition(self):
+        """
+        Test creating DistressIndicator from INDICATOR_DEFINITIONS.
+        """
+        # Test with a known indicator
+        indicator = DistressIndicator.from_definition(
+            indicator_key="excessive_guilt",
+            indicator_text="I feel so guilty about everything",
+            confidence=0.85
+        )
+
+        assert indicator.category == "Guilt"
+        assert indicator.subcategory == "Excessive guilt"
+        assert indicator.severity == "red"
+        assert indicator.definition_reference == "II.D"
+        assert indicator.confidence == 0.85
+
+
+class TestFollowUpQuestionRoundTrip:
+    """
+    **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
+
+    Tests that FollowUpQuestion serializes and deserializes correctly.
+    """
+
+    @given(follow_up_question_strategy())
+    @settings(max_examples=100)
+    def test_follow_up_question_round_trip(self, question):
+        """
+        **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
+        **Validates: Requirements 8.5**
+
+        For any FollowUpQuestion, converting to dict and back should
+        preserve all fields exactly.
+        """
+        # Convert to dict and back
+        question_dict = question.to_dict()
+        restored = FollowUpQuestion.from_dict(question_dict)
+
+        # Verify all fields match
+        assert restored.question_id == question.question_id
+        assert restored.question_text == question.question_text
+        assert restored.purpose == question.purpose
+
+
+class TestClassificationFlowResultRoundTrip:
+    """
+    **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
+
+    Tests that ClassificationFlowResult serializes and deserializes correctly.
+    """
+
+    @given(classification_flow_result_strategy())
+    @settings(max_examples=100)
+    def test_classification_flow_result_round_trip(self, result):
+        """
+        **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
+        **Validates: Requirements 8.5**
+
+        For any ClassificationFlowResult, converting to dict and back should
+        preserve all fields exactly.
+        """
+        # Convert to dict and back
+        result_dict = result.to_dict()
+        restored = ClassificationFlowResult.from_dict(result_dict)
+
+        # Verify basic fields match
+        assert restored.classification == result.classification
+        assert restored.confidence == result.confidence
+        assert restored.explanation == result.explanation
+        assert restored.permission_check_message == result.permission_check_message
+        assert restored.referral_message == result.referral_message
+        assert restored.consent_status == result.consent_status
+        assert restored.patient_responses == result.patient_responses
+        assert restored.re_evaluation_result == result.re_evaluation_result
+
+        # Verify nested indicators
+        assert len(restored.indicators) == len(result.indicators)
+        for orig, rest in zip(result.indicators, restored.indicators):
+            assert rest.indicator_text == orig.indicator_text
+            assert rest.category == orig.category
+            assert rest.severity == orig.severity
+
+        # Verify nested follow-up questions
+        assert len(restored.follow_up_questions) == len(result.follow_up_questions)
+        for orig, rest in zip(result.follow_up_questions, restored.follow_up_questions):
+            assert rest.question_id == orig.question_id
+            assert rest.question_text == orig.question_text
+            assert rest.purpose == orig.purpose
+
+
+class TestTaggingRecordRoundTrip:
+    """
+    **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
+
+    Tests that TaggingRecord serializes and deserializes correctly.
+    """
+
+    @given(tagging_record_strategy())
+    @settings(max_examples=100)
+    def test_tagging_record_round_trip(self, record):
+        """
+        **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
+        **Validates: Requirements 8.5**
+
+        For any TaggingRecord, converting to dict and back should
+        preserve all fields exactly.
+        """
+        # Convert to dict and back
+        record_dict = record.to_dict()
+        restored = TaggingRecord.from_dict(record_dict)
+
+        # Verify all fields match
+        assert restored.record_id == record.record_id
+        assert restored.message_id == record.message_id
+        assert restored.is_classification_correct == record.is_classification_correct
+        assert restored.classification_subcategory == record.classification_subcategory
+        assert restored.correct_classification == record.correct_classification
+        assert restored.question_issues == record.question_issues
+        assert restored.question_comments == record.question_comments
+        assert restored.referral_issues == record.referral_issues
+        assert restored.referral_comments == record.referral_comments
+        assert restored.indicator_issues == record.indicator_issues
+        assert restored.indicator_comments == record.indicator_comments
+        assert restored.general_notes == record.general_notes
+
+
+class TestInteractionStepLogRoundTrip:
+    """
+    **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
+
+    Tests that InteractionStepLog serializes and deserializes correctly.
+    """
+
+    @given(interaction_step_log_strategy())
+    @settings(max_examples=100)
+    def test_interaction_step_log_round_trip(self, log):
+        """
+        **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
+        **Validates: Requirements 8.5**
+
+        For any InteractionStepLog, converting to dict and back should
+        preserve all fields exactly.
+        """
+        # Convert to dict and back
+        log_dict = log.to_dict()
+        restored = InteractionStepLog.from_dict(log_dict)
+
+        # Verify all fields match
+        assert restored.step_id == log.step_id
+        assert restored.session_id == log.session_id
+        assert restored.message_id == log.message_id
+        assert restored.step_type == log.step_type
+        assert restored.input_text == log.input_text
+        assert restored.model_output == log.model_output
+        assert restored.approval_status == log.approval_status
+        assert restored.tagging_data == log.tagging_data
+
+    @given(interaction_step_log_with_tagging_strategy())
+    @settings(max_examples=100)
+    def test_interaction_step_log_with_tagging_round_trip(self, log):
+        """
+        **Feature: chaplain-feedback-system, Property: Data Model Round Trip**
+        **Validates: Requirements 8.5**
+
+        For any InteractionStepLog with nested TaggingRecord, converting to dict
+        and back should preserve all fields exactly.
+        """
+        # Convert to dict and back
+        log_dict = log.to_dict()
+        restored = InteractionStepLog.from_dict(log_dict)
+
+        # Verify basic fields match
+        assert restored.step_id == log.step_id
+        assert restored.session_id == log.session_id
+        assert restored.message_id == log.message_id
+        assert restored.step_type == log.step_type
+        assert restored.input_text == log.input_text
+        assert restored.model_output == log.model_output
+        assert restored.approval_status == log.approval_status
+
+        # Verify nested tagging data
+        if log.tagging_data is None:
+            assert restored.tagging_data is None
+        else:
+            assert restored.tagging_data is not None
+            assert restored.tagging_data.record_id == log.tagging_data.record_id
+            assert restored.tagging_data.message_id == log.tagging_data.message_id
+            assert restored.tagging_data.is_classification_correct == log.tagging_data.is_classification_correct
+            assert restored.tagging_data.question_issues == log.tagging_data.question_issues
+            assert restored.tagging_data.referral_issues == log.tagging_data.referral_issues
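Review note: the nested round-trip test above depends on `from_dict` rebuilding the inner tagging object rather than leaving a plain dict in its place. A minimal sketch of that pattern with standard-library dataclasses follows. `TaggingSketch` and `StepLogSketch` are hypothetical and carry only a few of the field names seen in the tests; the real models in `chaplain_models.py` may be implemented differently.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class TaggingSketch:
    """Hypothetical, reduced stand-in for TaggingRecord."""
    record_id: str
    question_issues: List[str] = field(default_factory=list)


@dataclass
class StepLogSketch:
    """Hypothetical, reduced stand-in for InteractionStepLog."""
    step_id: str
    step_type: str
    tagging_data: Optional[TaggingSketch] = None

    def to_dict(self) -> Dict[str, Any]:
        # asdict() recurses into nested dataclasses, so tagging_data becomes a dict.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "StepLogSketch":
        nested = data.get("tagging_data")
        return cls(
            step_id=data["step_id"],
            step_type=data["step_type"],
            # Rebuild the nested object explicitly; None stays None.
            tagging_data=TaggingSketch(**nested) if nested is not None else None,
        )
```

Because dataclasses generate `__eq__` by default, a round trip on this sketch can be checked with a single `restored == original` comparison, including the nested record.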
tests/chaplain_feedback/test_properties_error_pattern_analyzer.py ADDED
@@ -0,0 +1,194 @@
+"""
+Property-based tests for ErrorPatternAnalyzer.
+
+Tests universal properties that should hold across all inputs
+for the error pattern analysis functionality.
+"""
+
+from hypothesis import given, strategies as st
+from src.core.error_pattern_analyzer import ErrorPatternAnalyzer
+from src.core.chaplain_models import (
+    CLASSIFICATION_SUBCATEGORIES,
+    QUESTION_ISSUE_TYPES,
+    REFERRAL_ISSUE_TYPES,
+)
+from .conftest import tagging_record_strategy
+
+
+class TestErrorPatternAnalyzerProperties:
+    """Property-based tests for ErrorPatternAnalyzer."""
+
+    @given(st.lists(tagging_record_strategy(), min_size=1, max_size=20))
+    def test_property_19_statistics_include_subcategory_breakdown(self, records):
+        """
+        **Feature: chaplain-feedback-system, Property 19: Statistics Include Subcategory Breakdown**
+        **Validates: Requirements 4.4, 5.4, 6.4**
+        """
+        analyzer = ErrorPatternAnalyzer()
+        stats = analyzer.get_statistics_summary(records)
+
+        assert "total_records" in stats
+        assert "classification_errors" in stats
+        assert "question_issues" in stats
+        assert "referral_issues" in stats
+        assert "indicator_issues" in stats
+        assert "common_patterns" in stats
+
+        assert stats["total_records"] == len(records)
+
+        classification_errors = stats["classification_errors"]
+        for subcategory in CLASSIFICATION_SUBCATEGORIES:
+            assert subcategory in classification_errors
+            assert isinstance(classification_errors[subcategory], int)
+            assert classification_errors[subcategory] >= 0
+
+        question_issues = stats["question_issues"]
+        for issue_type in QUESTION_ISSUE_TYPES:
+            assert issue_type in question_issues
+            assert isinstance(question_issues[issue_type], int)
+            assert question_issues[issue_type] >= 0
+
+        referral_issues = stats["referral_issues"]
+        for issue_type in REFERRAL_ISSUE_TYPES:
+            assert issue_type in referral_issues
+            assert isinstance(referral_issues[issue_type], int)
+            assert referral_issues[issue_type] >= 0
+
+        indicator_issues = stats["indicator_issues"]
+        assert isinstance(indicator_issues, dict)
+        for indicator_id, count in indicator_issues.items():
+            assert isinstance(indicator_id, str)
+            assert isinstance(count, int)
+            assert count >= 0
+
+        common_patterns = stats["common_patterns"]
+        assert isinstance(common_patterns, list)
+
+    @given(st.lists(tagging_record_strategy(), min_size=1, max_size=20))
+    def test_property_20_error_patterns_grouped_by_type(self, records):
+        """
+        **Feature: chaplain-feedback-system, Property 20: Error Patterns Grouped by Type**
+        **Validates: Requirements 10.2, 10.3**
+        """
+        analyzer = ErrorPatternAnalyzer()
+        grouped_patterns = analyzer.get_error_patterns_grouped_by_type(records)
+
+        assert "classification" in grouped_patterns
+        assert "question" in grouped_patterns
+        assert "referral" in grouped_patterns
+        assert "indicator" in grouped_patterns
+
+        classification_group = grouped_patterns["classification"]
+        assert isinstance(classification_group, dict)
+        for subcategory in CLASSIFICATION_SUBCATEGORIES:
+            assert subcategory in classification_group
+            assert isinstance(classification_group[subcategory], int)
+            assert classification_group[subcategory] >= 0
+
+        question_group = grouped_patterns["question"]
+        assert isinstance(question_group, dict)
+        for issue_type in QUESTION_ISSUE_TYPES:
+            assert issue_type in question_group
+            assert isinstance(question_group[issue_type], int)
+            assert question_group[issue_type] >= 0
+
+        referral_group = grouped_patterns["referral"]
+        assert isinstance(referral_group, dict)
+        for issue_type in REFERRAL_ISSUE_TYPES:
+            assert issue_type in referral_group
+            assert isinstance(referral_group[issue_type], int)
+            assert referral_group[issue_type] >= 0
+
+        indicator_group = grouped_patterns["indicator"]
+        assert isinstance(indicator_group, dict)
+        for indicator_id, count in indicator_group.items():
+            assert isinstance(indicator_id, str)
+            assert isinstance(count, int)
+            assert count >= 0
+
+
+    @given(st.lists(tagging_record_strategy(), min_size=0, max_size=20))
+    def test_classification_error_analysis_counts_correctly(self, records):
+        """Test that classification error analysis counts errors correctly."""
+        analyzer = ErrorPatternAnalyzer()
+        error_counts = analyzer.analyze_classification_errors(records)
+
+        for subcategory in CLASSIFICATION_SUBCATEGORIES:
+            assert subcategory in error_counts
+            assert isinstance(error_counts[subcategory], int)
+            assert error_counts[subcategory] >= 0
+
+        expected_counts = {subcategory: 0 for subcategory in CLASSIFICATION_SUBCATEGORIES}
+        for record in records:
+            if not record.is_classification_correct and record.classification_subcategory:
+                if record.classification_subcategory in expected_counts:
+                    expected_counts[record.classification_subcategory] += 1
+
+        assert error_counts == expected_counts
+
+    @given(st.lists(tagging_record_strategy(), min_size=0, max_size=20))
+    def test_question_issue_analysis_counts_correctly(self, records):
+        """Test that question issue analysis counts issues correctly."""
+        analyzer = ErrorPatternAnalyzer()
+        issue_counts = analyzer.analyze_question_issues(records)
+
+        for issue_type in QUESTION_ISSUE_TYPES:
+            assert issue_type in issue_counts
+            assert isinstance(issue_counts[issue_type], int)
+            assert issue_counts[issue_type] >= 0
+
+        expected_counts = {issue_type: 0 for issue_type in QUESTION_ISSUE_TYPES}
+        for record in records:
+            for issue in record.question_issues:
+                if issue in expected_counts:
+                    expected_counts[issue] += 1
+
+        assert issue_counts == expected_counts
+
+    @given(st.lists(tagging_record_strategy(), min_size=0, max_size=20))
+    def test_referral_issue_analysis_counts_correctly(self, records):
+        """Test that referral issue analysis counts issues correctly."""
+        analyzer = ErrorPatternAnalyzer()
+        issue_counts = analyzer.analyze_referral_issues(records)
+
+        for issue_type in REFERRAL_ISSUE_TYPES:
+            assert issue_type in issue_counts
+            assert isinstance(issue_counts[issue_type], int)
+            assert issue_counts[issue_type] >= 0
+
+        expected_counts = {issue_type: 0 for issue_type in REFERRAL_ISSUE_TYPES}
+        for record in records:
+            for issue in record.referral_issues:
+                if issue in expected_counts:
+                    expected_counts[issue] += 1
+
+        assert issue_counts == expected_counts
+
+    @given(st.lists(tagging_record_strategy(), min_size=0, max_size=20))
+    def test_indicator_issue_analysis_counts_correctly(self, records):
+        """Test that indicator issue analysis counts indicators correctly."""
+        analyzer = ErrorPatternAnalyzer()
+        indicator_counts = analyzer.analyze_indicator_issues(records)
+
+        assert isinstance(indicator_counts, dict)
+
+        expected_counts = {}
+        for record in records:
+            for indicator_id in record.indicator_issues:
+                if indicator_id not in expected_counts:
+                    expected_counts[indicator_id] = 0
+                expected_counts[indicator_id] += 1
+
+        assert indicator_counts == expected_counts
+
+    @given(st.lists(tagging_record_strategy(), min_size=0, max_size=20))
+    def test_common_patterns_returns_list(self, records):
+        """Test that common patterns analysis returns a list of strings."""
+        analyzer = ErrorPatternAnalyzer()
+        patterns = analyzer.get_common_patterns(records)
+
+        assert isinstance(patterns, list)
+
+        for pattern in patterns:
+            assert isinstance(pattern, str)
+            assert len(pattern) > 0
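Review note: the counting tests above encode two different conventions. Classification, question, and referral counts are seeded with zero for every known type and ignore unknown values, while indicator counts are open-ended and contain only the IDs that actually occur. The sketch below shows both conventions side by side; the function names and the example issue types are hypothetical and are not taken from `ErrorPatternAnalyzer`.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def count_known_issues(
    issues_per_record: Iterable[Sequence[str]], known_types: Sequence[str]
) -> Dict[str, int]:
    """Closed vocabulary: every known type gets a key, unknown values are skipped."""
    counts = {issue_type: 0 for issue_type in known_types}
    for issues in issues_per_record:
        for issue in issues:
            if issue in counts:
                counts[issue] += 1
    return counts


def count_open_ended(ids_per_record: Iterable[Sequence[str]]) -> Dict[str, int]:
    """Open vocabulary: only IDs that occur get a key."""
    counter = Counter()
    for ids in ids_per_record:
        counter.update(ids)
    return dict(counter)
```

Seeding the closed-vocabulary result with zeros is what lets the property tests assert that every subcategory key is present even when the input list is empty.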
tests/chaplain_feedback/test_properties_interaction_logging.py ADDED
@@ -0,0 +1,705 @@
1
+ # test_properties_interaction_logging.py
2
+ """
3
+ Property-based tests for Chaplain Feedback interaction logging.
4
+
5
+ Tests that interaction logging correctly records all steps with input/output
6
+ and supports approval status updates.
7
+ """
8
+
9
+ import pytest
10
+ from hypothesis import given, settings
11
+ from datetime import datetime
12
+
13
+ from src.core.interaction_logger import InteractionLogger
14
+ from src.core.chaplain_models import (
15
+ InteractionStepLog,
16
+ TaggingRecord,
17
+ INTERACTION_STEP_TYPES,
18
+ )
19
+
20
+ from tests.chaplain_feedback.conftest import (
21
+ valid_id_strategy,
22
+ tagging_record_strategy,
23
+ )
24
+
25
+
26
+ class TestInteractionLoggingCompleteness:
27
+ """
28
+ **Feature: chaplain-feedback-system, Property 14: Interaction Step Logging Complete**
29
+
30
+ Tests that interaction logging records all required fields for each step.
31
+ """
32
+
33
+ def test_interaction_step_logging_complete_all_types(self):
34
+ """
35
+ **Feature: chaplain-feedback-system, Property 14: Interaction Step Logging Complete**
36
+ **Validates: Requirements 7.1, 7.2**
37
+
38
+ For any interaction step, the log should contain: input text, model output, and timestamp.
39
+ """
40
+ logger = InteractionLogger()
41
+
42
+ # Test all step types
43
+ for step_type in INTERACTION_STEP_TYPES:
44
+ session_id = f"session_{step_type}"
45
+ message_id = f"msg_{step_type}"
46
+ input_text = f"input for {step_type}"
47
+ model_output = f"output for {step_type}"
48
+
49
+ # Log a step
50
+ step_id = logger.log_step(
51
+ session_id=session_id,
52
+ message_id=message_id,
53
+ step_type=step_type,
54
+ input_text=input_text,
55
+ model_output=model_output,
56
+ )
57
+
58
+ # Retrieve the logged step
59
+ logged_step = logger.get_step(step_id)
60
+
61
+ # Verify all required fields are present and correct
62
+ assert logged_step is not None
63
+ assert logged_step.step_id == step_id
64
+ assert logged_step.session_id == session_id
65
+ assert logged_step.message_id == message_id
66
+ assert logged_step.step_type == step_type
67
+ assert logged_step.input_text == input_text
68
+ assert logged_step.model_output == model_output
69
+ assert logged_step.timestamp is not None
70
+ assert isinstance(logged_step.timestamp, datetime)
71
+ assert logged_step.approval_status is None # Initially no approval
72
+ assert logged_step.tagging_data is None # Initially no tagging
73
+
74
+ def test_interaction_step_logging_multiple_steps(self):
75
+ """
76
+ Test that multiple steps are logged correctly for a session.
77
+ """
78
+ logger = InteractionLogger()
79
+ session_id = "test_session_1"
80
+ message_id = "test_message_1"
81
+
82
+ # Log multiple steps
83
+ step_ids = []
84
+ for i in range(3):
85
+ step_id = logger.log_step(
86
+ session_id=session_id,
87
+ message_id=message_id,
88
+ step_type="classification",
89
+ input_text=f"input {i}",
90
+ model_output=f"output {i}",
91
+ )
92
+ step_ids.append(step_id)
93
+
94
+ # Retrieve all session logs
95
+ session_logs = logger.get_session_logs(session_id)
96
+
97
+ # Verify all steps are logged
98
+ assert len(session_logs) == 3
99
+ for i, log in enumerate(session_logs):
100
+ assert log.input_text == f"input {i}"
101
+ assert log.model_output == f"output {i}"
102
+
103
+ def test_interaction_step_logging_preserves_order(self):
104
+ """
105
+ Test that logged steps are retrieved in the order they were logged.
106
+ """
107
+ logger = InteractionLogger()
108
+ session_id = "test_session_order"
109
+
110
+ # Log steps in order
111
+ step_ids = []
112
+ for i in range(5):
113
+ step_id = logger.log_step(
114
+ session_id=session_id,
115
+ message_id=f"msg_{i}",
116
+ step_type="classification",
117
+ input_text=f"input_{i}",
118
+ model_output=f"output_{i}",
119
+ )
120
+ step_ids.append(step_id)
121
+
122
+ # Retrieve logs
123
+ session_logs = logger.get_session_logs(session_id)
124
+
125
+ # Verify order is preserved
126
+ assert len(session_logs) == 5
127
+ for i, log in enumerate(session_logs):
128
+ assert log.message_id == f"msg_{i}"
129
+ assert log.input_text == f"input_{i}"
130
+
131
+ def test_interaction_step_logging_by_type(self):
132
+ """
133
+ Test filtering logs by step type.
134
+ """
135
+ logger = InteractionLogger()
136
+ session_id = "test_session_types"
137
+
138
+ # Log different types of steps
139
+ logger.log_step(session_id, "msg1", "classification", "input1", "output1")
140
+ logger.log_step(session_id, "msg2", "explanation", "input2", "output2")
141
+ logger.log_step(session_id, "msg3", "classification", "input3", "output3")
142
+ logger.log_step(session_id, "msg4", "referral", "input4", "output4")
143
+
144
+ # Filter by type
145
+ classification_logs = logger.get_session_logs_by_type(session_id, "classification")
146
+ explanation_logs = logger.get_session_logs_by_type(session_id, "explanation")
147
+ referral_logs = logger.get_session_logs_by_type(session_id, "referral")
148
+
149
+ # Verify filtering
150
+ assert len(classification_logs) == 2
151
+ assert len(explanation_logs) == 1
152
+ assert len(referral_logs) == 1
153
+
154
+ def test_interaction_step_logging_message_logs(self):
155
+ """
156
+ Test retrieving logs for a specific message across sessions.
157
+ """
158
+ logger = InteractionLogger()
159
+ message_id = "shared_message"
160
+
161
+ # Log same message in different sessions
162
+ logger.log_step("session1", message_id, "classification", "input1", "output1")
163
+ logger.log_step("session2", message_id, "explanation", "input2", "output2")
164
+ logger.log_step("session1", "other_msg", "referral", "input3", "output3")
165
+
166
+ # Get logs for the message
167
+ message_logs = logger.get_message_logs(message_id)
168
+
169
+ # Verify we get logs from both sessions
170
+ assert len(message_logs) == 2
171
+ assert all(log.message_id == message_id for log in message_logs)
172
+
173
+ def test_interaction_step_logging_empty_strings(self):
174
+ """
175
+ Test that empty input/output strings are logged correctly.
176
+ """
177
+ logger = InteractionLogger()
178
+
179
+ step_id = logger.log_step(
180
+ session_id="test_session",
181
+ message_id="test_msg",
182
+ step_type="classification",
183
+ input_text="",
184
+ model_output="",
185
+ )
186
+
187
+ logged_step = logger.get_step(step_id)
188
+
189
+ assert logged_step.input_text == ""
190
+ assert logged_step.model_output == ""
191
+
192
+ def test_interaction_step_logging_long_text(self):
193
+ """
194
+ Test that long input/output text is logged correctly.
195
+ """
196
+ logger = InteractionLogger()
197
+ long_text = "x" * 10000
198
+
199
+ step_id = logger.log_step(
200
+ session_id="test_session",
201
+ message_id="test_msg",
202
+ step_type="classification",
203
+ input_text=long_text,
204
+ model_output=long_text,
205
+ )
206
+
207
+ logged_step = logger.get_step(step_id)
208
+
209
+ assert logged_step.input_text == long_text
210
+ assert logged_step.model_output == long_text
211
+ assert len(logged_step.input_text) == 10000
212
+
213
+ def test_interaction_step_logging_special_characters(self):
214
+ """
215
+ Test that special characters in input/output are preserved.
216
+ """
217
+ logger = InteractionLogger()
218
+ special_text = "Test with special chars: !@#$%^&*()_+-=[]{}|;:',.<>?/~`"
219
+
220
+ step_id = logger.log_step(
221
+ session_id="test_session",
222
+ message_id="test_msg",
223
+ step_type="classification",
224
+ input_text=special_text,
225
+ model_output=special_text,
226
+ )
227
+
228
+ logged_step = logger.get_step(step_id)
229
+
230
+ assert logged_step.input_text == special_text
231
+ assert logged_step.model_output == special_text
232
+
233
+ def test_interaction_step_logging_unicode(self):
234
+ """
235
+ Test that Unicode characters in input/output are preserved.
236
+ """
237
+ logger = InteractionLogger()
238
+ unicode_text = "Test with Unicode: 你好世界 🌍 Привет мир"
239
+
240
+ step_id = logger.log_step(
241
+ session_id="test_session",
242
+ message_id="test_msg",
243
+ step_type="classification",
244
+ input_text=unicode_text,
245
+ model_output=unicode_text,
246
+ )
247
+
248
+ logged_step = logger.get_step(step_id)
249
+
250
+ assert logged_step.input_text == unicode_text
251
+ assert logged_step.model_output == unicode_text
252
+
253
+ def test_interaction_step_logging_statistics(self):
254
+ """
255
+ Test that session statistics are calculated correctly.
256
+ """
257
+ logger = InteractionLogger()
258
+ session_id = "test_session_stats"
259
+
260
+ # Log some steps
261
+ logger.log_step(session_id, "msg1", "classification", "input1", "output1")
262
+ logger.log_step(session_id, "msg2", "explanation", "input2", "output2")
263
+ logger.log_step(session_id, "msg3", "referral", "input3", "output3")
264
+
265
+ # Get statistics
266
+ stats = logger.get_session_statistics(session_id)
267
+
268
+ # Verify statistics
269
+ assert stats["session_id"] == session_id
270
+ assert stats["total_steps"] == 3
271
+ assert stats["approved_steps"] == 0
272
+ assert stats["disapproved_steps"] == 0
273
+ assert stats["unapproved_steps"] == 3
274
+ assert stats["steps_by_type"]["classification"] == 1
275
+ assert stats["steps_by_type"]["explanation"] == 1
276
+ assert stats["steps_by_type"]["referral"] == 1
277
+
278
+ def test_interaction_step_logging_invalid_step_type(self):
279
+ """
280
+ Test that invalid step types raise an error.
281
+ """
282
+ logger = InteractionLogger()
283
+
284
+ with pytest.raises(ValueError):
285
+ logger.log_step(
286
+ session_id="test_session",
287
+ message_id="test_msg",
288
+ step_type="invalid_type",
289
+ input_text="input",
290
+ model_output="output",
291
+ )
292
+
293
+ def test_interaction_step_logging_nonexistent_step(self):
294
+ """
295
+ Test that retrieving a nonexistent step returns None.
296
+ """
297
+ logger = InteractionLogger()
298
+
299
+ result = logger.get_step("nonexistent_step_id")
300
+
301
+ assert result is None
302
+
303
+ def test_interaction_step_logging_empty_session(self):
304
+ """
305
+ Test that retrieving logs for a session with no logged steps returns an empty list.
306
+ """
307
+ logger = InteractionLogger()
308
+
309
+ session_logs = logger.get_session_logs("nonexistent_session")
310
+
311
+ assert session_logs == []
312
+
313
+ def test_interaction_step_logging_export(self):
314
+ """
315
+ Test that session logs can be exported as dictionaries.
316
+ """
317
+ logger = InteractionLogger()
318
+ session_id = "test_session_export"
319
+
320
+ # Log some steps
321
+ logger.log_step(session_id, "msg1", "classification", "input1", "output1")
322
+ logger.log_step(session_id, "msg2", "explanation", "input2", "output2")
323
+
324
+ # Export logs
325
+ exported = logger.export_session_logs(session_id)
326
+
327
+ # Verify export
328
+ assert len(exported) == 2
329
+ assert all(isinstance(log, dict) for log in exported)
330
+ assert all("step_id" in log for log in exported)
331
+ assert all("input_text" in log for log in exported)
332
+ assert all("model_output" in log for log in exported)
333
+ assert all("timestamp" in log for log in exported)
334
+
335
+
336
+ class TestFeedbackLogging:
337
+ """
338
+ **Feature: chaplain-feedback-system, Property 15: Feedback Logging Complete**
339
+
340
+ Tests that feedback logging correctly records approval/disapproval status
341
+ with tagging categories and comments.
342
+ """
343
+
344
+ def test_feedback_logging_approved_status(self):
345
+ """
346
+ **Feature: chaplain-feedback-system, Property 15: Feedback Logging Complete**
347
+ **Validates: Requirements 7.3, 7.4**
348
+
349
+ For any feedback, the log should record approval status.
350
+ """
351
+ logger = InteractionLogger()
352
+ session_id = "test_session_feedback"
353
+
354
+ # Log a step
355
+ step_id = logger.log_step(
356
+ session_id=session_id,
357
+ message_id="msg1",
358
+ step_type="classification",
359
+ input_text="input",
360
+ model_output="output",
361
+ )
362
+
363
+ # Update with approved status
364
+ logger.update_approval(step_id, "approved")
365
+
366
+ # Retrieve and verify
367
+ logged_step = logger.get_step(step_id)
368
+ assert logged_step.approval_status == "approved"
369
+ assert logged_step.tagging_data is None
370
+
371
+ def test_feedback_logging_disapproved_status(self):
372
+ """
373
+ Test that disapproved status is recorded correctly.
374
+ """
375
+ logger = InteractionLogger()
376
+ session_id = "test_session_feedback"
377
+
378
+ # Log a step
379
+ step_id = logger.log_step(
380
+ session_id=session_id,
381
+ message_id="msg1",
382
+ step_type="classification",
383
+ input_text="input",
384
+ model_output="output",
385
+ )
386
+
387
+ # Update with disapproved status
388
+ logger.update_approval(step_id, "disapproved")
389
+
390
+ # Retrieve and verify
391
+ logged_step = logger.get_step(step_id)
392
+ assert logged_step.approval_status == "disapproved"
393
+
394
+ @given(tagging_record_strategy())
395
+ @settings(max_examples=100)
396
+ def test_feedback_logging_with_tagging_data(self, tagging_record):
397
+ """
398
+ **Feature: chaplain-feedback-system, Property 15: Feedback Logging Complete**
399
+ **Validates: Requirements 7.3, 7.4**
400
+
401
+ For any chaplain feedback, the log should contain: approval/disapproval status,
402
+ and if disapproved, the tagging categories and comments.
403
+ """
404
+ logger = InteractionLogger()
405
+ session_id = "test_session_tagging"
406
+
407
+ # Log a step
408
+ step_id = logger.log_step(
409
+ session_id=session_id,
410
+ message_id=tagging_record.message_id,
411
+ step_type="classification",
412
+ input_text="input",
413
+ model_output="output",
414
+ )
415
+
416
+ # Update with disapproved status and tagging data
417
+ logger.update_approval(step_id, "disapproved", tagging_record)
418
+
419
+ # Retrieve and verify
420
+ logged_step = logger.get_step(step_id)
421
+ assert logged_step.approval_status == "disapproved"
422
+ assert logged_step.tagging_data is not None
423
+ assert logged_step.tagging_data.record_id == tagging_record.record_id
424
+ assert logged_step.tagging_data.message_id == tagging_record.message_id
425
+ assert logged_step.tagging_data.is_classification_correct == tagging_record.is_classification_correct
426
+ assert logged_step.tagging_data.question_issues == tagging_record.question_issues
427
+ assert logged_step.tagging_data.referral_issues == tagging_record.referral_issues
428
+
429
+ def test_feedback_logging_classification_subcategory(self):
430
+ """
431
+ Test that classification subcategory is recorded in tagging data.
432
+ """
433
+ logger = InteractionLogger()
434
+ session_id = "test_session_classification"
435
+
436
+ # Create tagging record with classification subcategory
437
+ tagging = TaggingRecord(
438
+ record_id="tag1",
439
+ message_id="msg1",
440
+ is_classification_correct=False,
441
+ classification_subcategory="missed_indicators",
442
+ correct_classification="red",
443
+ )
444
+
445
+ # Log a step
446
+ step_id = logger.log_step(
447
+ session_id=session_id,
448
+ message_id="msg1",
449
+ step_type="classification",
450
+ input_text="input",
451
+ model_output="output",
452
+ )
453
+
454
+ # Update with tagging
455
+ logger.update_approval(step_id, "disapproved", tagging)
456
+
457
+ # Retrieve and verify
458
+ logged_step = logger.get_step(step_id)
459
+ assert logged_step.tagging_data.classification_subcategory == "missed_indicators"
460
+ assert logged_step.tagging_data.correct_classification == "red"
461
+
462
+ def test_feedback_logging_question_issues(self):
463
+ """
464
+ Test that question issues are recorded in tagging data.
465
+ """
466
+ logger = InteractionLogger()
467
+ session_id = "test_session_questions"
468
+
469
+ # Create tagging record with question issues
470
+ tagging = TaggingRecord(
471
+ record_id="tag1",
472
+ message_id="msg1",
473
+ is_classification_correct=True,
474
+ question_issues=["inappropriate", "too_leading"],
475
+ question_comments="Questions were too intrusive",
476
+ )
477
+
478
+ # Log a step
479
+ step_id = logger.log_step(
480
+ session_id=session_id,
481
+ message_id="msg1",
482
+ step_type="follow_up",
483
+ input_text="input",
484
+ model_output="output",
485
+ )
486
+
487
+ # Update with tagging
488
+ logger.update_approval(step_id, "disapproved", tagging)
489
+
490
+ # Retrieve and verify
491
+ logged_step = logger.get_step(step_id)
492
+ assert logged_step.tagging_data.question_issues == ["inappropriate", "too_leading"]
493
+ assert logged_step.tagging_data.question_comments == "Questions were too intrusive"
494
+
495
+ def test_feedback_logging_referral_issues(self):
496
+ """
497
+ Test that referral issues are recorded in tagging data.
498
+ """
499
+ logger = InteractionLogger()
500
+ session_id = "test_session_referral"
501
+
502
+ # Create tagging record with referral issues
503
+ tagging = TaggingRecord(
504
+ record_id="tag1",
505
+ message_id="msg1",
506
+ is_classification_correct=True,
507
+ referral_issues=["incomplete_summary", "inappropriate_tone"],
508
+ referral_comments="Message was incomplete",
509
+ )
510
+
511
+ # Log a step
512
+ step_id = logger.log_step(
513
+ session_id=session_id,
514
+ message_id="msg1",
515
+ step_type="referral",
516
+ input_text="input",
517
+ model_output="output",
518
+ )
519
+
520
+ # Update with tagging
521
+ logger.update_approval(step_id, "disapproved", tagging)
522
+
523
+ # Retrieve and verify
524
+ logged_step = logger.get_step(step_id)
525
+ assert logged_step.tagging_data.referral_issues == ["incomplete_summary", "inappropriate_tone"]
526
+ assert logged_step.tagging_data.referral_comments == "Message was incomplete"
527
+
528
+ def test_feedback_logging_indicator_issues(self):
529
+ """
530
+ Test that indicator issues are recorded in tagging data.
531
+ """
532
+ logger = InteractionLogger()
533
+ session_id = "test_session_indicators"
534
+
535
+ # Create tagging record with indicator issues
536
+ tagging = TaggingRecord(
537
+ record_id="tag1",
538
+ message_id="msg1",
539
+ is_classification_correct=True,
540
+ indicator_issues=["indicator_1", "indicator_2"],
541
+ indicator_comments="These indicators were incorrectly identified",
542
+ )
543
+
544
+ # Log a step
545
+ step_id = logger.log_step(
546
+ session_id=session_id,
547
+ message_id="msg1",
548
+ step_type="classification",
549
+ input_text="input",
550
+ model_output="output",
551
+ )
552
+
553
+ # Update with tagging
554
+ logger.update_approval(step_id, "disapproved", tagging)
555
+
556
+ # Retrieve and verify
557
+ logged_step = logger.get_step(step_id)
558
+ assert logged_step.tagging_data.indicator_issues == ["indicator_1", "indicator_2"]
559
+ assert logged_step.tagging_data.indicator_comments == "These indicators were incorrectly identified"
560
+
561
+ def test_feedback_logging_general_notes(self):
562
+ """
563
+ Test that general notes are recorded in tagging data.
564
+ """
565
+ logger = InteractionLogger()
566
+ session_id = "test_session_notes"
567
+
568
+ # Create tagging record with general notes
569
+ tagging = TaggingRecord(
570
+ record_id="tag1",
571
+ message_id="msg1",
572
+ is_classification_correct=True,
573
+ general_notes="Overall good classification but needs improvement in tone",
574
+ )
575
+
576
+ # Log a step
577
+ step_id = logger.log_step(
578
+ session_id=session_id,
579
+ message_id="msg1",
580
+ step_type="classification",
581
+ input_text="input",
582
+ model_output="output",
583
+ )
584
+
585
+ # Update with tagging
586
+ logger.update_approval(step_id, "approved", tagging)
587
+
588
+ # Retrieve and verify
589
+ logged_step = logger.get_step(step_id)
590
+ assert logged_step.tagging_data.general_notes == "Overall good classification but needs improvement in tone"
591
+
592
+ def test_feedback_logging_disapproved_steps_retrieval(self):
593
+ """
594
+ Test that disapproved steps can be retrieved from a session.
595
+ """
596
+ logger = InteractionLogger()
597
+ session_id = "test_session_disapproved"
598
+
599
+ # Log multiple steps
600
+ step_id_1 = logger.log_step(session_id, "msg1", "classification", "input1", "output1")
601
+ step_id_2 = logger.log_step(session_id, "msg2", "explanation", "input2", "output2")
602
+ step_id_3 = logger.log_step(session_id, "msg3", "referral", "input3", "output3")
603
+
604
+ # Approve first, disapprove second and third
605
+ logger.update_approval(step_id_1, "approved")
606
+ logger.update_approval(step_id_2, "disapproved")
607
+ logger.update_approval(step_id_3, "disapproved")
608
+
609
+ # Get disapproved steps
610
+ disapproved = logger.get_disapproved_steps(session_id)
611
+
612
+ # Verify
613
+ assert len(disapproved) == 2
614
+ assert all(log.approval_status == "disapproved" for log in disapproved)
615
+
616
+ def test_feedback_logging_unapproved_steps_retrieval(self):
617
+ """
618
+ Test that unapproved steps can be retrieved from a session.
619
+ """
620
+ logger = InteractionLogger()
621
+ session_id = "test_session_unapproved"
622
+
623
+ # Log multiple steps
624
+ step_id_1 = logger.log_step(session_id, "msg1", "classification", "input1", "output1")
625
+ step_id_2 = logger.log_step(session_id, "msg2", "explanation", "input2", "output2")
626
+ step_id_3 = logger.log_step(session_id, "msg3", "referral", "input3", "output3")
627
+
628
+ # Approve first, leave others unapproved
629
+ logger.update_approval(step_id_1, "approved")
630
+
631
+ # Get unapproved steps
632
+ unapproved = logger.get_unapproved_steps(session_id)
633
+
634
+ # Verify
635
+ assert len(unapproved) == 2
636
+ assert all(log.approval_status is None for log in unapproved)
637
+
638
+ def test_feedback_logging_invalid_approval_status(self):
639
+ """
640
+ Test that invalid approval status raises an error.
641
+ """
642
+ logger = InteractionLogger()
643
+ session_id = "test_session_invalid"
644
+
645
+ # Log a step
646
+ step_id = logger.log_step(
647
+ session_id=session_id,
648
+ message_id="msg1",
649
+ step_type="classification",
650
+ input_text="input",
651
+ model_output="output",
652
+ )
653
+
654
+ # Try to update with invalid status
655
+ with pytest.raises(ValueError):
656
+ logger.update_approval(step_id, "invalid_status")
657
+
658
+ def test_feedback_logging_nonexistent_step(self):
659
+ """
660
+ Test that updating a nonexistent step raises an error.
661
+ """
662
+ logger = InteractionLogger()
663
+
664
+ with pytest.raises(ValueError):
665
+ logger.update_approval("nonexistent_step", "approved")
666
+
667
+ def test_feedback_logging_export_with_tagging(self):
668
+ """
669
+ Test that exported logs include tagging data.
670
+ """
671
+ logger = InteractionLogger()
672
+ session_id = "test_session_export_tagging"
673
+
674
+ # Create tagging record
675
+ tagging = TaggingRecord(
676
+ record_id="tag1",
677
+ message_id="msg1",
678
+ is_classification_correct=False,
679
+ classification_subcategory="missed_indicators",
680
+ correct_classification="red",
681
+ general_notes="Missed key indicators",
682
+ )
683
+
684
+ # Log a step
685
+ step_id = logger.log_step(
686
+ session_id=session_id,
687
+ message_id="msg1",
688
+ step_type="classification",
689
+ input_text="input",
690
+ model_output="output",
691
+ )
692
+
693
+ # Update with tagging
694
+ logger.update_approval(step_id, "disapproved", tagging)
695
+
696
+ # Export logs
697
+ exported = logger.export_session_logs(session_id)
698
+
699
+ # Verify export includes tagging data
700
+ assert len(exported) == 1
701
+ assert exported[0]["approval_status"] == "disapproved"
702
+ assert exported[0]["tagging_data"] is not None
703
+ assert exported[0]["tagging_data"]["classification_subcategory"] == "missed_indicators"
704
+ assert exported[0]["tagging_data"]["correct_classification"] == "red"
705
+ assert exported[0]["tagging_data"]["general_notes"] == "Missed key indicators"
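Review note: the implementation of `InteractionLogger` is not part of this file, so the behaviour the tests above rely on can only be inferred. The sketch below is one minimal in-memory design that would satisfy the core assertions (required fields, insertion order, approval updates, statistics, `ValueError` on bad input). The names `SketchInteractionLogger`, `StepLog`, `STEP_TYPES` and `APPROVAL_STATUSES` are hypothetical stand-ins, and the list of step types is an assumption based only on the values used in the tests.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional

# Assumption: only these four step types appear in the tests; the real
# INTERACTION_STEP_TYPES constant may contain others.
STEP_TYPES = ("classification", "explanation", "follow_up", "referral")
APPROVAL_STATUSES = ("approved", "disapproved")


@dataclass
class StepLog:
    """Hypothetical stand-in for InteractionStepLog."""
    step_id: str
    session_id: str
    message_id: str
    step_type: str
    input_text: str
    model_output: str
    timestamp: datetime = field(default_factory=datetime.now)
    approval_status: Optional[str] = None
    tagging_data: Optional[Any] = None


class SketchInteractionLogger:
    """In-memory logger; a dict preserves the order in which steps were logged."""

    def __init__(self) -> None:
        self._steps: Dict[str, StepLog] = {}

    def log_step(self, session_id: str, message_id: str, step_type: str,
                 input_text: str, model_output: str) -> str:
        if step_type not in STEP_TYPES:
            raise ValueError(f"unknown step type: {step_type!r}")
        step_id = uuid.uuid4().hex
        self._steps[step_id] = StepLog(
            step_id, session_id, message_id, step_type, input_text, model_output
        )
        return step_id

    def get_step(self, step_id: str) -> Optional[StepLog]:
        # Unknown IDs yield None rather than raising.
        return self._steps.get(step_id)

    def get_session_logs(self, session_id: str) -> List[StepLog]:
        return [s for s in self._steps.values() if s.session_id == session_id]

    def update_approval(self, step_id: str, status: str,
                        tagging_data: Optional[Any] = None) -> None:
        if status not in APPROVAL_STATUSES:
            raise ValueError(f"invalid approval status: {status!r}")
        step = self._steps.get(step_id)
        if step is None:
            raise ValueError(f"no such step: {step_id!r}")
        step.approval_status = status
        step.tagging_data = tagging_data

    def get_session_statistics(self, session_id: str) -> Dict[str, Any]:
        logs = self.get_session_logs(session_id)
        by_type: Dict[str, int] = {}
        for log in logs:
            by_type[log.step_type] = by_type.get(log.step_type, 0) + 1
        return {
            "session_id": session_id,
            "total_steps": len(logs),
            "approved_steps": sum(log.approval_status == "approved" for log in logs),
            "disapproved_steps": sum(log.approval_status == "disapproved" for log in logs),
            "unapproved_steps": sum(log.approval_status is None for log in logs),
            "steps_by_type": by_type,
        }
```

Keeping every step in a single insertion-ordered dict makes the per-session, per-message and per-type queries simple filters, at the cost of a linear scan per query. That trade-off seems acceptable for a test double but may not match the real class.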
tests/chaplain_feedback/test_properties_tagging_service.py ADDED
@@ -0,0 +1,223 @@
1
+ # test_properties_tagging_service.py
2
+ """
3
+ Property-based tests for TaggingService.
4
+
5
+ Tests universal properties that should hold across all inputs
6
+ for the tagging system functionality.
7
+ """
8
+
9
+ import pytest
10
+ from hypothesis import given, strategies as st
11
+
12
+ from src.core.tagging_service import TaggingService
13
+ from src.core.chaplain_models import (
14
+ CLASSIFICATION_SUBCATEGORIES,
15
+ QUESTION_ISSUE_TYPES,
16
+ REFERRAL_ISSUE_TYPES,
17
+ )
18
+ from .conftest import valid_id_strategy
19
+
20
+
21
+ class TestTaggingServiceProperties:
22
+ """Property-based tests for TaggingService."""
23
+
24
+ @given(
25
+ message_id=valid_id_strategy(),
26
+ general_notes=st.text(max_size=200)
27
+ )
28
+ def test_property_10_wrong_classification_subcategories_available(
29
+ self, message_id: str, general_notes: str
30
+ ):
31
+ """
32
+ **Feature: chaplain-feedback-system, Property 10: Wrong Classification Subcategories Available**
33
+ **Validates: Requirements 4.1**
34
+
35
+ For any incorrect classification feedback, the system should provide
36
+ all three subcategory options: "missed_indicators", "false_positive", "missed_distress".
37
+ """
38
+ service = TaggingService()
39
+
40
+ # Get available subcategories
41
+ available_subcategories = service.get_available_classification_subcategories()
42
+
43
+ # Should contain all three required subcategories
44
+ expected_subcategories = {"missed_indicators", "false_positive", "missed_distress"}
45
+ assert set(available_subcategories) == expected_subcategories
46
+
47
+ # Should be able to create records with each subcategory
48
+ for subcategory in available_subcategories:
49
+ record = service.create_classification_correction(
50
+ message_id=f"{message_id}_{subcategory}",
51
+ subcategory=subcategory,
52
+ correct_classification="red",
53
+ general_notes=general_notes
54
+ )
55
+
56
+ assert record.classification_subcategory == subcategory
57
+ assert record.is_classification_correct is False
58
+ assert record.correct_classification == "red"
59
+
60
+ @given(
61
+ message_id=valid_id_strategy(),
62
+ subcategory=st.sampled_from(CLASSIFICATION_SUBCATEGORIES),
63
+ correct_classification=st.sampled_from(["red", "yellow", "green"]),
64
+ general_notes=st.text(max_size=200)
65
+ )
66
+ def test_property_11_wrong_classification_saves_subcategory(
67
+ self,
68
+ message_id: str,
69
+ subcategory: str,
70
+ correct_classification: str,
71
+ general_notes: str
72
+ ):
73
+ """
74
+ **Feature: chaplain-feedback-system, Property 11: Wrong Classification Saves Subcategory**
75
+ **Validates: Requirements 4.3**
76
+
77
+ For any wrong classification tag submission, the saved record should contain
78
+ both the subcategory and the correct classification.
79
+ """
80
+ service = TaggingService()
81
+
82
+ # Create classification correction
83
+ record = service.create_classification_correction(
84
+ message_id=message_id,
85
+ subcategory=subcategory,
86
+ correct_classification=correct_classification,
87
+ general_notes=general_notes
88
+ )
89
+
90
+ # Record should be saved and retrievable
91
+ retrieved_record = service.get_tagging_record(record.record_id)
92
+ assert retrieved_record is not None
93
+
94
+ # Should contain both subcategory and correct classification
95
+ assert retrieved_record.classification_subcategory == subcategory
96
+ assert retrieved_record.correct_classification == correct_classification
97
+ assert retrieved_record.is_classification_correct is False
98
+
99
+ # Should also be retrievable by message ID
100
+ message_records = service.get_records_for_message(message_id)
101
+ assert len(message_records) == 1
102
+ assert message_records[0].classification_subcategory == subcategory
103
+ assert message_records[0].correct_classification == correct_classification
104
+
105
+ @given(
106
+ message_id=valid_id_strategy(),
107
+ question_issues=st.lists(
108
+ st.sampled_from(QUESTION_ISSUE_TYPES),
109
+ min_size=1,
110
+ max_size=len(QUESTION_ISSUE_TYPES),
111
+ unique=True
112
+ ),
113
+ question_comments=st.one_of(st.none(), st.text(max_size=200))
114
+ )
115
+ def test_property_12_question_issues_multi_select(
116
+ self,
117
+ message_id: str,
118
+ question_issues: list,
119
+ question_comments: str
120
+ ):
121
+ """
122
+ **Feature: chaplain-feedback-system, Property 12: Question Issues Multi-Select**
123
+ **Validates: Requirements 5.2**
124
+
125
+ For any follow-up question issue tagging, the system should allow
126
+ selecting multiple subcategories and save all selected values.
127
+ """
128
+ service = TaggingService()
129
+
130
+ # Create record with multiple question issues
131
+ record = service.create_tagging_record(
132
+ message_id=message_id,
133
+ question_issues=question_issues,
134
+ question_comments=question_comments
135
+ )
136
+
137
+ # Should save all selected question issues
138
+ assert set(record.question_issues) == set(question_issues)
139
+ assert record.question_comments == question_comments
140
+
141
+ # Should be retrievable with all issues intact
142
+ retrieved_record = service.get_tagging_record(record.record_id)
143
+ assert retrieved_record is not None
144
+ assert set(retrieved_record.question_issues) == set(question_issues)
145
+ assert retrieved_record.question_comments == question_comments
146
+
147
+ @given(
148
+ message_id=valid_id_strategy(),
149
+ referral_issues=st.lists(
150
+ st.sampled_from(REFERRAL_ISSUE_TYPES),
151
+ min_size=1,
152
+ max_size=len(REFERRAL_ISSUE_TYPES),
153
+ unique=True
154
+ ),
155
+ referral_comments=st.one_of(st.none(), st.text(max_size=200))
156
+ )
157
+ def test_property_13_referral_issues_multi_select(
158
+ self,
159
+ message_id: str,
160
+ referral_issues: list,
161
+ referral_comments: str
162
+ ):
163
+ """
164
+ **Feature: chaplain-feedback-system, Property 13: Referral Issues Multi-Select**
165
+ **Validates: Requirements 6.2**
166
+
167
+ For any referral message issue tagging, the system should allow
168
+ selecting multiple subcategories and save all selected values.
169
+ """
170
+ service = TaggingService()
171
+
172
+ # Create record with multiple referral issues
173
+ record = service.create_tagging_record(
174
+ message_id=message_id,
175
+ referral_issues=referral_issues,
176
+ referral_comments=referral_comments
177
+ )
178
+
179
+ # Should save all selected referral issues
180
+ assert set(record.referral_issues) == set(referral_issues)
181
+ assert record.referral_comments == referral_comments
182
+
183
+ # Should be retrievable with all issues intact
184
+ retrieved_record = service.get_tagging_record(record.record_id)
185
+ assert retrieved_record is not None
186
+ assert set(retrieved_record.referral_issues) == set(referral_issues)
187
+ assert retrieved_record.referral_comments == referral_comments
188
+
189
+ @given(
190
+ message_id=valid_id_strategy(),
191
+ indicator_issues=st.lists(st.text(min_size=1, max_size=50), min_size=1, max_size=5),
192
+ indicator_comments=st.one_of(st.none(), st.text(max_size=200))
193
+ )
194
+ def test_indicator_issue_tagging_functionality(
195
+ self,
196
+ message_id: str,
197
+ indicator_issues: list,
198
+ indicator_comments: str
199
+ ):
200
+ """
201
+ Test that indicator issue tagging works correctly.
202
+
203
+ Ensures that incorrectly identified indicators can be tagged and that
203
+ the accompanying comments are saved and retrievable.
205
+ """
206
+ service = TaggingService()
207
+
208
+ # Create record with indicator issues
209
+ record = service.create_indicator_issue_tagging(
210
+ message_id=message_id,
211
+ indicator_issues=indicator_issues,
212
+ indicator_comments=indicator_comments
213
+ )
214
+
215
+ # Should save all indicator issues
216
+ assert record.indicator_issues == indicator_issues
217
+ assert record.indicator_comments == indicator_comments
218
+
219
+ # Should be retrievable with all issues intact
220
+ retrieved_record = service.get_tagging_record(record.record_id)
221
+ assert retrieved_record is not None
222
+ assert retrieved_record.indicator_issues == indicator_issues
223
+ assert retrieved_record.indicator_comments == indicator_comments
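Review note: `TaggingService` itself is not shown in this file. To make the save-and-retrieve round trip exercised by Properties 10 to 13 concrete, here is a minimal in-memory sketch. `SketchTaggingService` and `SketchRecord` are hypothetical names, the subcategory and classification values are copied from the test docstrings and strategies above, and the input validation is an assumption rather than confirmed behaviour of the real service.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Values taken from the Property 10 docstring and the Property 11 strategy.
SUBCATEGORIES = ("missed_indicators", "false_positive", "missed_distress")
CLASSIFICATIONS = ("red", "yellow", "green")


@dataclass
class SketchRecord:
    """Hypothetical, reduced stand-in for TaggingRecord."""
    record_id: str
    message_id: str
    is_classification_correct: bool = True
    classification_subcategory: Optional[str] = None
    correct_classification: Optional[str] = None
    question_issues: List[str] = field(default_factory=list)
    question_comments: Optional[str] = None
    general_notes: Optional[str] = None


class SketchTaggingService:
    def __init__(self) -> None:
        self._records: Dict[str, SketchRecord] = {}

    def _save(self, record: SketchRecord) -> SketchRecord:
        self._records[record.record_id] = record
        return record

    def create_classification_correction(
        self, message_id: str, subcategory: str,
        correct_classification: str, general_notes: Optional[str] = None,
    ) -> SketchRecord:
        # Assumption: unknown values are rejected; the real service may differ.
        if subcategory not in SUBCATEGORIES:
            raise ValueError(f"unknown subcategory: {subcategory!r}")
        if correct_classification not in CLASSIFICATIONS:
            raise ValueError(f"unknown classification: {correct_classification!r}")
        return self._save(SketchRecord(
            record_id=uuid.uuid4().hex,
            message_id=message_id,
            is_classification_correct=False,
            classification_subcategory=subcategory,
            correct_classification=correct_classification,
            general_notes=general_notes,
        ))

    def create_tagging_record(
        self, message_id: str,
        question_issues: Optional[List[str]] = None,
        question_comments: Optional[str] = None,
    ) -> SketchRecord:
        return self._save(SketchRecord(
            record_id=uuid.uuid4().hex,
            message_id=message_id,
            question_issues=list(question_issues or []),  # copy to avoid aliasing
            question_comments=question_comments,
        ))

    def get_tagging_record(self, record_id: str) -> Optional[SketchRecord]:
        return self._records.get(record_id)

    def get_records_for_message(self, message_id: str) -> List[SketchRecord]:
        return [r for r in self._records.values() if r.message_id == message_id]
```

Because each test constructs a fresh service, an in-memory dict is enough for `get_records_for_message` to return exactly one record per example; a persistent store would need per-test isolation for the `len(...) == 1` assertion to hold.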