Patricksturg committed
Commit 3a7eac5 · verified · 1 Parent(s): 054b0d1

Update cogtest prompts: checklist respondent, label-free analyst, merge-by-issue synthesis

Files changed (1)
  1. dashboard_backend.py +44 -27
dashboard_backend.py CHANGED
@@ -116,19 +116,45 @@ Work through the question naturally, thinking aloud as you go."""
  COGTEST_RESPONDENT_USER = """WHO YOU ARE:
  {backstory}

  SURVEY QUESTION:
  {question_text}

  RESPONSE OPTIONS: {response_options}

- Think aloud as you work through answering this question:

- 1. COMPREHENSION: Put the question in your own words. What is it asking you?
- 2. RETRIEVAL: What from your life or experience are you drawing on?
- 3. JUDGEMENT: Are you having to weigh up or combine different things?
- 4. RESPONSE MAPPING: Which option fits best? How well does it capture what you want to say? Is anything missing?
  5. CHOSEN ANSWER: State your final answer.
- 6. CONFIDENCE: Rate 1-5 how confident you are in that answer."""

  COGTEST_ANALYST_SYSTEM = """You are an expert in cognitive interviewing and survey methodology.
  You are reviewing a think-aloud transcript from a cognitive interview.
@@ -144,25 +170,17 @@ RESPONDENT THINK-ALOUD TRANSCRIPT:

  Identify ONLY problems that are clearly evidenced in this transcript. Do NOT go looking for problems that are not there. Most transcripts will have only 1-2 genuine problems, and some will have none. If the respondent navigated the question without difficulty, return an empty problems list.

- A problem must be supported by specific evidence in the transcript - something the respondent actually said or demonstrably struggled with. Do not infer problems that the respondent did not experience.
-
- When a genuine problem is found, classify it using whichever of these categories fits best:
- - COMPREHENSION PROBLEMS: Respondent misunderstood or misinterpreted terms
- - DOUBLE-BARRELLED: Question asks about two distinct things requiring different answers
- - RETRIEVAL DIFFICULTIES: Respondent struggled to recall or estimate
- - RESPONSE MAPPING FAILURES: Answer did not fit the available options
- - SOCIAL DESIRABILITY / SENSITIVITY: Respondent hedged or showed discomfort
- - PRESUPPOSITION FAILURES: Question assumed something untrue for this respondent
- - TAUTOLOGY / LOGICAL PROBLEMS: Circularity between question wording and response options
-
- If the problem does not fit any category above, use a short descriptive label.

  For each problem:
  - Cite specific evidence from the transcript (what the respondent actually said)
  - Rate SEVERITY 1-10: the degree to which this problem would distort the accuracy of the answer (1 = negligible distortion, 10 = answer rendered meaningless)

  Return ONLY valid JSON, no markdown fencing:
- {{"respondent_id": "{respondent_id}", "problems": [{{"type": "...", "description": "...", "evidence": "...", "severity": 7}}]}}"""

  COGTEST_SYNTHESIS_SYSTEM = """You are a senior survey methodologist writing a cognitive testing report.
  You are synthesising the results of individual transcript analyses from a cognitive interview study."""
@@ -172,22 +190,21 @@ COGTEST_SYNTHESIS_USER = """SURVEY QUESTION:

  RESPONSE OPTIONS: {response_options}

- Below are the coded problems identified by an analyst for each respondent's think-aloud transcript.
- Each problem has a severity score (1-10) indicating the degree to which it would distort accuracy of the answer.
- Your job is to aggregate these into a summary of the distinct problems found across respondents.
- Use the SAME problem type labels as the individual analyses (e.g. if analysts coded "DOUBLE-BARRELLED", use that label, not a paraphrase).

  INDIVIDUAL ANALYSES:
  {analyses_block}

- For each distinct problem type that was identified:
- - Use the same type label as in the individual analyses
  - List which respondents showed evidence of it
- - Calculate the mean severity score across respondents who had this problem
  - Summarise the evidence across respondents

  Return ONLY valid JSON, no markdown fencing:
- {{"question_id": "synthesis", "problems_detected": [{{"type": "...", "description": "...", "respondents_affected": [...], "mean_severity": 6.5, "evidence_summary": "..."}}]}}"""

  # ============================================================================
  # EXPERT REVIEW PIPELINE PROMPTS
 
  COGTEST_RESPONDENT_USER = """WHO YOU ARE:
  {backstory}

+ You are this person. Respond in first person throughout (I, me, my). Do not break character or refer to yourself in the third person.
+
  SURVEY QUESTION:
  {question_text}

  RESPONSE OPTIONS: {response_options}

+ Think aloud as you work through answering this question, following the stages below. At each stage, consider whether you experience any of the specific difficulties described. Not all of the difficulties listed below will apply -- only mention those you actually experience as you work through the question.
+
+ 1. COMPREHENSION: Put the question in your own words. What is it actually asking you?
+
+ As you do this, consider:
+ - DOUBLE-BARRELLED: Is the question asking about two or more distinct things at once, where you might want to give different answers to each part?
+ - VAGUE OR AMBIGUOUS TERMS: Are any words or phrases unclear, open to multiple interpretations, or likely to mean different things to different people?
+ - PRESUPPOSITION: Does the question assume something about you or your situation that may not be true -- for example, that you have used a service, hold an opinion, or have had a particular experience?
+ - LEADING OR LOADED: Does the phrasing of the question push you toward a particular answer, for example through emotive language, built-in justifications, or a question structure that implies one answer is more natural or correct?
+
+ 2. RETRIEVAL: What from your life or experience are you drawing on to answer this?
+
+ As you do this, consider:
+ - RECALL DIFFICULTY: Is it hard to remember the information the question asks about -- for example because it happened a long time ago, happened many times, or requires counting or estimating over a long period?
+ - REFERENCE PERIOD PROBLEMS: Does the question specify a time frame that is difficult to recall over, or is the time frame vague or missing entirely?
+
+ 3. JUDGEMENT: Are you having to weigh up, combine, or summarise different things to arrive at a single answer?
+
+ As you do this, consider:
+ - COMPLEX AGGREGATION: Does the question require you to combine multiple different experiences, views, or considerations into a single summary response in a way that feels forced or oversimplified?
+ - KNOWLEDGE DEFICIT: Is the question asking you to make a judgement about something you don't feel you know enough about to answer meaningfully?
+
+ 4. RESPONSE MAPPING: Look at the response options. Which one fits best? How well does it capture what you actually want to say?
+
+ As you do this, consider:
+ - POOR FIT: Does your answer not map clearly onto any of the available options, forcing you to choose one that doesn't quite capture what you mean?
+ - MISSING OPTIONS: Is there a response you want to give that isn't available -- for example a "not applicable" or "don't know" option, or a middle category?
+ - UNEQUAL INTERVALS: Do the response categories feel unevenly spaced, so that some options cover a much wider range of views or experiences than others?

  5. CHOSEN ANSWER: State your final answer.
+
+ 6. CONFIDENCE: Rate your confidence 1-5, where 1 means you felt very unsure about your answer and 5 means you felt completely certain."""

  COGTEST_ANALYST_SYSTEM = """You are an expert in cognitive interviewing and survey methodology.
  You are reviewing a think-aloud transcript from a cognitive interview.
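
Note on the respondent template: the single-brace fields (`{backstory}`, `{question_text}`, `{response_options}`) look like `str.format` placeholders, and the doubled braces in the JSON lines of the other prompts point the same way. The sketch below shows how such a template might be filled. It assumes `str.format` is the mechanism; `RESPONDENT_TEMPLATE` is a shortened stand-in for `COGTEST_RESPONDENT_USER`, and `build_respondent_prompt`, the `" / "` joiner, and the example persona are hypothetical, not taken from `dashboard_backend.py`.

```python
# Shortened stand-in for COGTEST_RESPONDENT_USER; the real template is longer.
RESPONDENT_TEMPLATE = """WHO YOU ARE:
{backstory}

You are this person. Respond in first person throughout (I, me, my).

SURVEY QUESTION:
{question_text}

RESPONSE OPTIONS: {response_options}"""


def build_respondent_prompt(backstory: str, question_text: str, options: list[str]) -> str:
    """Fill the three placeholders; options are joined into a single line."""
    return RESPONDENT_TEMPLATE.format(
        backstory=backstory.strip(),
        question_text=question_text.strip(),
        # The " / " separator is an assumption; the real joiner is not shown in the diff.
        response_options=" / ".join(options),
    )


prompt = build_respondent_prompt(
    "I am a 42-year-old nurse who commutes by bus.",
    "How satisfied are you with local bus and rail services?",
    ["Very satisfied", "Fairly satisfied", "Not satisfied"],
)
print(prompt)
```

Keeping the first-person instruction directly under the backstory, as the new template does, means it is read before the survey question regardless of how long the backstory is.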
 

  Identify ONLY problems that are clearly evidenced in this transcript. Do NOT go looking for problems that are not there. Most transcripts will have only 1-2 genuine problems, and some will have none. If the respondent navigated the question without difficulty, return an empty problems list.

+ A problem must be supported by specific evidence in the transcript -- something the respondent actually said or demonstrably struggled with. Do not infer problems that the respondent did not experience.

  For each problem:
+ - Describe the problem in plain language (what went wrong for this respondent)
  - Cite specific evidence from the transcript (what the respondent actually said)
  - Rate SEVERITY 1-10: the degree to which this problem would distort the accuracy of the answer (1 = negligible distortion, 10 = answer rendered meaningless)

+ Do NOT classify problems into named categories or assign type labels. Just describe what happened.
+
  Return ONLY valid JSON, no markdown fencing:
+ {{"respondent_id": "{respondent_id}", "problems": [{{"description": "...", "evidence": "...", "severity": 7}}]}}"""

  COGTEST_SYNTHESIS_SYSTEM = """You are a senior survey methodologist writing a cognitive testing report.
  You are synthesising the results of individual transcript analyses from a cognitive interview study."""
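
Note on the analyst schema: with the `type` field dropped, downstream code that still expects it would break, and a model might keep emitting labels out of habit. The sketch below illustrates one way the label-free reply could be checked. It assumes the prompt is rendered with `str.format` (so `{{` and `}}` become literal braces); `ANALYST_TAIL` is a stand-in for the last two lines of the analyst prompt, and `validate_analysis` and `REQUIRED_KEYS` are hypothetical names, not part of the commit.

```python
import json

# Stand-in for the tail of the analyst prompt. After str.format, {{ and }}
# become literal braces and {respondent_id} is substituted.
ANALYST_TAIL = (
    "Return ONLY valid JSON, no markdown fencing:\n"
    '{{"respondent_id": "{respondent_id}", "problems": '
    '[{{"description": "...", "evidence": "...", "severity": 7}}]}}'
)

REQUIRED_KEYS = {"description", "evidence", "severity"}


def validate_analysis(raw: str, respondent_id: str) -> dict:
    """Parse one analyst reply and check it against the label-free schema."""
    data = json.loads(raw)
    if data.get("respondent_id") != respondent_id:
        raise ValueError("respondent_id mismatch")
    for problem in data.get("problems", []):  # an empty list is a valid reply
        missing = REQUIRED_KEYS - set(problem)
        if missing:
            raise ValueError(f"missing keys: {sorted(missing)}")
        if "type" in problem:
            raise ValueError("analyst output should not carry a type label")
        if not 1 <= problem["severity"] <= 10:
            raise ValueError("severity outside 1-10")
    return data


# The rendered example line is itself valid JSON, so it doubles as a test input.
rendered = ANALYST_TAIL.format(respondent_id="R01")
example = validate_analysis(rendered.split("\n", 1)[1], "R01")
```

Rejecting a stray `type` key is a design choice for this sketch; silently dropping the key would be a gentler alternative.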
 

  RESPONSE OPTIONS: {response_options}

+ Below are the problems identified by an analyst for each respondent's think-aloud transcript. Each problem has a severity score (1-10) indicating the degree to which it would distort accuracy of the answer. Your job is to aggregate these into a summary of the DISTINCT problems found across respondents.
+
+ Focus on the UNDERLYING ISSUE. If multiple respondents describe the same fundamental problem with the question -- even in different words or from different angles -- merge them into a single problem. Two descriptions count as the same problem if fixing one would also fix the other.

  INDIVIDUAL ANALYSES:
  {analyses_block}

+ For each distinct underlying problem:
+ - Describe the problem clearly
  - List which respondents showed evidence of it
+ - Calculate the mean severity score across all respondents who had this problem
  - Summarise the evidence across respondents

  Return ONLY valid JSON, no markdown fencing:
+ {{"question_id": "synthesis", "problems_detected": [{{"description": "...", "respondents_affected": [...], "mean_severity": 6.5, "evidence_summary": "..."}}]}}"""

  # ============================================================================
  # EXPERT REVIEW PIPELINE PROMPTS
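
Note on `mean_severity`: the synthesis prompt asks the model to compute the mean itself, and with type labels gone there is no key linking a merged problem back to individual analyst entries. A rough plausibility check is still possible. The sketch below is hypothetical (the function `check_mean_severity`, the `analyses` mapping, and the tolerance are assumptions, not code from the commit). It assumes each affected respondent contributes exactly one analyst problem to a merged issue, so the reported mean must lie between the mean of each respondent's lowest score and the mean of each respondent's highest score.

```python
from statistics import mean


def check_mean_severity(analyses: dict[str, list[dict]], merged: dict, tol: float = 0.05) -> bool:
    """Return True if merged['mean_severity'] is plausible given the analyst scores."""
    per_respondent = [
        [p["severity"] for p in analyses.get(rid, [])]
        for rid in merged["respondents_affected"]
    ]
    if not per_respondent or any(not scores for scores in per_respondent):
        return False  # a listed respondent has no recorded problem at all
    reported = merged["mean_severity"]
    if all(len(scores) == 1 for scores in per_respondent):
        # Unambiguous case: one problem per respondent, so the mean is fixed.
        return abs(reported - mean(s[0] for s in per_respondent)) <= tol
    # Ambiguous case: only bound the mean by the lowest and highest candidates.
    low = mean(min(s) for s in per_respondent)
    high = mean(max(s) for s in per_respondent)
    return low - tol <= reported <= high + tol


analyses = {
    "R01": [{"severity": 7}],
    "R02": [{"severity": 6}],
    "R03": [{"severity": 3}, {"severity": 8}],
}
ok = check_mean_severity(analyses, {"respondents_affected": ["R01", "R02"], "mean_severity": 6.5})
too_high = check_mean_severity(analyses, {"respondents_affected": ["R01", "R03"], "mean_severity": 9.0})
```

A failed check does not say which merge was wrong, only that the reported figure cannot be reproduced from the analyst scores under the stated assumption.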