BtB-ExpC committed
Commit 2a11ca0 · 1 Parent(s): d5dd05e

improving prompts

chains/distractors/distractors_chain.py CHANGED
@@ -47,25 +47,25 @@ class DistractorsChain(BaseModel):
         tasks.append(run_brainstorm(
             self.template_distractors_brainstorm_1,
             self.llm_brainstorm_1,
-            "T1-1"
+            "T1-L1"
         ))
         # Template 1, LLM 2
         tasks.append(run_brainstorm(
             self.template_distractors_brainstorm_1,
             self.llm_brainstorm_2,
-            "T1-2"
+            "T1-L2"
         ))
         # Template 2, LLM 1
         tasks.append(run_brainstorm(
             self.template_distractors_brainstorm_2,
             self.llm_brainstorm_1,
-            "T2-1"
+            "T2-L1"
         ))
         # Template 2, LLM 2
         tasks.append(run_brainstorm(
             self.template_distractors_brainstorm_2,
             self.llm_brainstorm_2,
-            "T2-2"
+            "T2-L2"
         ))
 
         # Kick them off concurrently
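The four template/LLM combinations and their new `T<n>-L<m>` labels follow a regular pattern, so they could also be generated instead of written out by hand. The sketch below is only an illustration under stated assumptions, not the repository's code: `run_brainstorm` is a hypothetical stand-in coroutine that merely echoes its arguments, and the template and LLM values are placeholder strings.

```python
import asyncio
from itertools import product


async def run_brainstorm(template, llm, label):
    """Hypothetical stand-in for the chain's brainstorm call; it only echoes its inputs."""
    await asyncio.sleep(0)  # yield control, as a real LLM call would while waiting
    return label, f"{template} via {llm}"


def build_labels(n_templates=2, n_llms=2):
    """Return labels such as 'T1-L1' for every template/LLM pair, template-major."""
    return [
        f"T{t}-L{m}"
        for t, m in product(range(1, n_templates + 1), range(1, n_llms + 1))
    ]


async def run_all(templates, llms):
    """Create one brainstorm coroutine per (template, LLM) pair and run them concurrently."""
    tasks = [
        run_brainstorm(template, llm, f"T{t}-L{m}")
        for (t, template), (m, llm) in product(
            enumerate(templates, start=1), enumerate(llms, start=1)
        )
    ]
    # asyncio.gather returns results in the order the awaitables were passed in
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    results = asyncio.run(run_all(["template_1", "template_2"], ["llm_1", "llm_2"]))
    for label, text in results:
        print(label, text)
```

Generating the labels keeps them in sync with the combinations if a third template or LLM is added later; the explicit version in the diff remains easier to read for only four cases.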
config/system_prompt_texts.py ADDED
@@ -0,0 +1,260 @@
1
+ # config/system_prompt_texts.py
2
+
3
+ template_standardize_exercise_text = """
4
+ """
5
+
6
+ template_standardize_studytext_text = """
7
+ """
8
+
9
+ template_diagnose_double_negation_text = """
10
+ Analyze a multiple-choice exercise (Question- or Statement-type) for the presence of double negatives: either two negations in the question/statement itself, or a negation in the question/statement AND in an answer option.
11
+ Here are some examples of double negatives:
12
+
13
+ <example>
14
+ <exercise>
15
+ <prompt>
16
+ <type>Vraag</type>
17
+ <text>Wat is geen veel voorkomend symptoom (volgens de enquêteuitslag) van niet gelukkig zijn?</text>
18
+ </prompt>
19
+
20
+ <options>
21
+ 1. Gezondheidsproblemen
22
+ 2. Weinig tijd voor ontspanning
23
+ 3. Vaak alleen zijn
24
+ 4. Veel te doen hebben op je werk
25
+ </options>
26
+
27
+ <correct_answer>4</correct_answer>
28
+ </exercise>
29
+
30
+ <double_negative>
31
+ First negation: "Wat is geen" in prompt
32
+ Second negation: "van niet gelukkig zijn" in prompt
33
+
34
+ Explanation: Two negations in the prompt ("geen" and "niet") form a double negation.
35
+ </double_negative>
36
+ </example>
37
+
38
+ <example>
39
+ <exercise>
40
+ <prompt>
41
+ <type>Stelling</type>
42
+ <text>Expertfolio wordt niet aangeboden door ENI.</text>
43
+ </prompt>
44
+
45
+ <options>
46
+ 1. Deze stelling is correct
47
+ 2. Deze stelling is niet correct
48
+ </options>
49
+
50
+ <correct_answer>1</correct_answer>
51
+ </exercise>
52
+
53
+ <double_negative>
54
+ First negation: "wordt niet aangeboden" in prompt
55
+ Second negation: "is niet correct" in option 2
56
+
57
+ Explanation: Interpreted together (as the student would in their head, trying to pick the correct answer option), they form a statement with a double negation: "De stelling dat Expertfolio niet wordt aangeboden is niet correct"
58
+ </double_negative>
59
+ </example>
60
+
61
+ <example>
62
+ <exercise>
63
+ <prompt>
64
+ <type>Vraag</type>
65
+ <text>Welk aspect hoort niet bij eenzaamheid?</text>
66
+ </prompt>
67
+
68
+ <options>
69
+ 1. Betekenisvolle relaties hebben
70
+ 2. Depressiviteit en angst
71
+ 3. Veel alleen zijn
72
+ 4. Geen lijfelijk contact hebben
73
+ </options>
74
+
75
+ <correct_answer>1</correct_answer>
76
+ </exercise>
77
+
78
+ <double_negative>
79
+ First negation: "hoort niet bij" in prompt
80
+ Second negation: "Geen lijfelijk contact" in option 4
81
+
82
+ Explanation: Together, these create a double negative - "Geen lijfelijk contact
83
+ hebben hoort niet bij eenzaamheid"
84
+ </double_negative>
85
+ </example>
86
+
87
+ If it's obvious that there is or isn't a double negative in this exercise, just give a short one-sentence diagnosis on this.
88
+ If the issue is more nuanced, take more time to do some reasoning first, and give your diagnosis only then.
89
+ """
90
+
91
+ template_diagnose_correct_answer_stands_out_text = """
92
+ You evaluate a multiple-choice exercise to determine if the correct answer
93
+ stands out inappropriately compared to the distractors. If the correct answer is significantly
94
+ longer, much more specific, or grammatically or otherwise structurally different, this is undesirable. Any clear pattern in the answer options that distinguishes the correct answer from the other options makes it easier for students to guess correctly regardless of their factual knowledge. So, we are looking for cases where the correct answer differs from the rest in a cosmetic, meta-, visual or otherwise superficial way that doesn't require factual understanding to spot. It is your task to diagnose such
95
+ cases.
96
+ Here are some examples of cases where the correct answer stands out inappropriately:
97
+
98
+ <examples>
99
+ <example>
100
+ <exercise>
101
+ <context>
102
+ De volgende afbeelding komt uit een onderzoek over eenzaamheid dat in 2012 is uitgevoerd.
103
+ </context>
104
+
105
+ <prompt>
106
+ <type>Vraag</type>
107
+ <text>Bij welke groep komt eenzaamheid volgens dit onderzoek het vaakst voor?</text>
108
+ </prompt>
109
+
110
+ <options>
111
+ 1. Gehandicapten
112
+ 2. Mantelzorgers
113
+ 3. Mensen met langdurige psychische aandoeningen
114
+ 4. Sporters
115
+ </options>
116
+
117
+ <correct_answer>3</correct_answer>
118
+ </exercise>
119
+
120
+ <answer_consistency_analysis>
121
+ Issue: Length difference
122
+ Pattern: All distractors are single words, while correct answer is a multi-word phrase
123
+ Diagnosis: The longer length of the correct answer makes it stand out inappropriately
124
+ </answer_consistency_analysis>
125
+ </example>
126
+
127
+ <example>
128
+ <exercise>
129
+ <prompt>
130
+ <type>Vraag</type>
131
+ <text>Wat is alimentatie?</text>
132
+ </prompt>
133
+
134
+ <options>
135
+ 1. Geld dat betaald moet worden na een scheiding
136
+ 2. Een lening van de overheid
137
+ 3. Een maandelijkse bijdrage aan liefdadigheid
138
+ 4. Een belastingteruggave
139
+ </options>
140
+
141
+ <correct_answer>1</correct_answer>
142
+ </exercise>
143
+
144
+ <answer_consistency_analysis>
145
+ Issue: Grammatical structure difference
146
+ Pattern: All distractors start with "Een", while correct answer starts with "Geld"
147
+ Diagnosis: The different grammatical structure of the correct answer makes it stand out undesirably (a superficial pattern that could hint at the answer)
148
+ </answer_consistency_analysis>
149
+ </example>
150
+
151
+ <example>
152
+ <exercise>
153
+ <prompt>
154
+ <type>Vraag</type>
155
+ <text>Welke onderwijskundige benadering wordt hier beschreven: "Leerlingen werken samen in kleine groepen en hebben elk een eigen rol en verantwoordelijkheid binnen de groep"?</text>
156
+ </prompt>
157
+
158
+ <options>
159
+ 1. Een activerende methode
160
+ 2. Jigsaw cooperative learning
161
+ 3. Benadering waarbij gedrag belangrijk is
162
+ 4. Leren als informatieverwerking
163
+ </options>
164
+
165
+ <correct_answer>2</correct_answer>
166
+ </exercise>
167
+
168
+ <answer_consistency_analysis>
169
+ Issue: Level of specificity difference
170
+ Pattern: While distractors use general educational terms that could apply to many approaches, the correct answer uses a very specific named methodology
171
+ Diagnosis: The precise, technical term in the correct answer stands out against the more general educational concepts in the distractors
172
+ </answer_consistency_analysis>
173
+ </example>
174
+ </examples>
175
+
176
+ Your only focus is to accurately diagnose this issue of an inappropriately different correct answer, no need to provide a fix. Really take your time to arrive at the correct diagnosis, weighing if the pattern is clear enough or not.
177
+ Do some reasoning first, and only then give your diagnosis.
178
+ """
179
+
180
+ template_diagnose_distractor_clearly_wrong_text = """
181
+ """
182
+
183
+ template_diagnose_distractor_partially_correct_text = """
184
+ """
185
+
186
+ diagnose_scorecard_template_text = """
187
+ """
188
+
189
+ template_distractors_brainstorm_1_text = """
190
+ """
191
+
192
+ template_distractors_brainstorm_2_text = """
193
+ """
194
+
195
+ template_consolidate_distractors_text = """
196
+ """
197
+
198
+ template_gen_prompt_a_text = """
199
+ """
200
+
201
+ template_gen_prompt_b_text = """
202
+ """
203
+
204
+ template_sanitize_learning_objectives_text = """
205
+ """
206
+
207
+ XML_templates = [
208
+ """
209
+ <example>
210
+ <exercise>
211
+ <prompt>
212
+ <type>Stelling</type>
213
+ <text></text>
214
+ </prompt>
215
+
216
+ <options>
217
+ 1. Deze stelling is correct
218
+ 2. Deze stelling is niet correct
219
+ </options>
220
+
221
+ <correct_answer></correct_answer>
222
+ </exercise>
223
+
224
+ <double_negative>
225
+ -
226
+ -
227
+
228
+ </double_negative>
229
+ </example>
230
+ """
231
+ ,
232
+ """
233
+ <example>
234
+ <exercise>
235
+ <prompt>
236
+ <type>Vraag</type>
237
+ <text></text>
238
+ </prompt>
239
+
240
+ <options>
241
+ 1.
242
+ 2.
243
+ 3.
244
+ 4.
245
+ </options>
246
+
247
+ <correct_answer></correct_answer>
248
+ </exercise>
249
+
250
+ <double_negative>
251
+ -
252
+ -
253
+
254
+ </double_negative>
255
+ </example>
256
+ """
257
+ ,
258
+ """
259
+ """
260
+ ]
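Because these prompt texts rely on XML-like tags to delimit the few-shot examples, an unbalanced tag is easy to introduce while editing them. A minimal sketch of a checker follows. `find_unbalanced_tags` is a hypothetical helper, not part of the repository, and it assumes simple tag names without attributes or spaces, as used in the examples above.

```python
import re

# Matches <tag> and </tag> where the name is letters, digits or underscores only.
TAG_PATTERN = re.compile(r"<(/?)([A-Za-z_][A-Za-z0-9_]*)>")


def find_unbalanced_tags(text):
    """Return a list of problems found; an empty list means every tag is paired."""
    problems = []
    stack = []  # names of tags that are currently open, innermost last
    for match in TAG_PATTERN.finditer(text):
        is_close = match.group(1) == "/"
        name = match.group(2)
        if not is_close:
            stack.append(name)
        elif stack and stack[-1] == name:
            stack.pop()
        else:
            # A closing tag that does not match the innermost open tag
            problems.append(f"unexpected </{name}>")
    # Anything left on the stack was never closed; report innermost first
    problems.extend(f"unclosed <{name}>" for name in reversed(stack))
    return problems
```

Such a check could be run over every `*_text` string in a unit test, so that a malformed example is caught before it reaches the model.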
config/templates.py CHANGED
@@ -1,5 +1,22 @@
1
  # config/templates.py
2
  from langchain_core.prompts.chat import ChatPromptTemplate
3
 
4
  template_standardize_exercise = ChatPromptTemplate(
5
  messages=[
@@ -28,46 +45,7 @@ template_standardize_studytext = ChatPromptTemplate(
28
 
29
  template_diagnose_double_negation = ChatPromptTemplate(
30
  messages=[
31
- ("system", """Analyze a multiple-choice exercise for the presence of double negatives: either two negations in the question/statement itself, or a negation in the question/statement AND in an answer option.
32
- Here are some examples of double negatives:
33
-
34
- <example 1>
35
- <exercise>
36
- Stelling
37
- Expertfolio wordt niet aangeboden door ENI.
38
-
39
- Keuzeopties:
40
- 1. Deze stelling is niet correct
41
- 2. Deze stelling is correct
42
-
43
- Correct antwoord:
44
- 1. Deze stelling is niet correct
45
- </exercise>
46
- <double negative explanation>
47
- The statement itself contains one negation (wordt 'niet' aangeboden), and one answer option contains another (is 'niet' correct). Interpreted together, this forms a statement with a double negation ('het is niet correct dat Expertfolio niet wordt aangeboden' is een dubbele ontkenning).
48
- </double negative explanation>
49
- </example 1>
50
-
51
- <example 2>
52
- <exercise>
53
- Vraag
54
- Welk aspect hoort niet bij eenzaamheid?
55
-
56
- Keuzeopties:
57
- 1. Betekenisvolle relaties hebben
58
- 2. Depressiviteit en angst
59
- 3. Veel alleen zijn
60
- 4. Geen lijfelijk contact hebben
61
-
62
- Correct antwoord:
63
- 1. Betekenisvolle relaties hebben
64
- </exercise>
65
- <double negative explanation>
66
- The question itself contains one negation (hoort 'niet' bij), and an answer option contains the second ('Geen' lijfelijk contact). Together, the resulting statement contains a double negation ('Geen lichamelijk contact hebben hoort niet bij eenzaamheid').
67
- </double negative explanation>
68
- </example 2>.
69
- If it's obvious that there is or isn't a double negative in this exercise, just give a short one-sentence diagnosis on this.
70
- If the issue is more nuanced, take more time to do some reasoning first, and give your diagnosis only after."""),
71
  ("human", "{standardized_exercise}")
72
  ],
73
  input_variables=["standardized_exercise"]
@@ -75,53 +53,7 @@ template_diagnose_double_negation = ChatPromptTemplate(
75
 
76
  template_diagnose_correct_answer_stands_out = ChatPromptTemplate(
77
  messages=[
78
- ("system", """You evaluate a multiple-choice exercise to determine if the correct answer
79
- stands out too much compared to the distractors. If the correct answer is significantly
80
- longer, more detailed, or structurally or grammatically different, this is undesirable. Identify such
81
- cases.
82
- Here are some examples of cases where the correct answer stands out:
83
-
84
- <example where the correct answer is much longer>
85
- <exercise>
86
- Theorie:
87
- De volgende afbeelding komt uit een onderzoek over eenzaamheid dat in 2012 is uitgevoerd.
88
-
89
- Vraag:
90
- Bij welke groep komt eenzaamheid volgens dit onderzoek het vaakst voor?
91
-
92
- 1. Gehandicapten
93
- 2. Mantelzorgers
94
- 3. Mensen met langdurige psychische aandoeningen
95
- 4. Sporters
96
-
97
- Correct antwoord:
98
- 3. Mensen met langdurige psychische aandoeningen.
99
- </exercise>
100
- <explanation how the correct answer stands out>
101
- Alle afleiders zijn 1 woord (kort), terwijl het correcte antwoord een zin is (duidelijk langer).
102
- </explanation how the correct answer stands out>
103
- </example where X>
104
-
105
- <example where the correct answer is grammatically different>
106
- <exercise>
107
- Vraag: Wat is alimentatie?
108
-
109
- 1. Geld dat betaald moet worden na een scheiding
110
- 2. Een lening van de overheid
111
- 3. Een maandelijkse bijdrage aan liefdadigheid
112
- 4. Een belastingteruggave
113
-
114
- Correct antwoord:
115
- 1. Geld dat betaald moet worden na een scheiding of als men niet meer samen is met de andere ouder van de kinderen.
116
-
117
- </exercise>
118
- <explanation how the correct answer stands out>
119
- Alle afleiders beginnen met "Een", maar het correcte antwoord begint anders.
120
- </explanation how the correct answer stands out>
121
- </example where the correct answer is grammatically different>
122
-
123
- Your only focus is to accurately diagnose this issue, no need to provide a fix. Really take your time to arrive at the correct diagnosis.
124
- Do some reasoning first, and give your diagnosis then."""),
125
  ("human", "{standardized_exercise}")
126
  ],
127
  input_variables=["standardized_exercise"]
@@ -222,7 +154,7 @@ template_distractors_brainstorm_2 = ChatPromptTemplate(
222
  "Those are the two bounds of the spectrum range we aim to operate between during brainstorming.\n"
223
  "So, through the above process of picking some júst faulty distractors in the context of the given question, both barely too correct and barely too obviously false, you establish the two bounds of acceptable distractors. When brainstorming, don't play it entirely safe though; when in doubt about where exactly on the spectrum the distractors would lie, just list the distractors you came up with anyway.\n\n"
224
  "Next, in the brainstorming phase, it's most important that you get really creative and really try to think outside the box, to come up with the required potential alternative answer options to the exercise. We want to approach this task from all different angles, "
225
- "to arrive at a varied selection of options, to serve as inspiration for a later stage of final selection (not now) to make the exercise the best it can be. For now, carry out the above-described prep in writing, then draft the list of{intermediate_distractors_specification} alternative distractors (in the same language as the existing exercise)."),
226
  ("human", "{standardized_exercise}")
227
  ],
228
  input_variables=["standardized_exercise", "intermediate_distractors_specification"]
@@ -262,7 +194,15 @@ template_gen_prompt_a = ChatPromptTemplate(
262
  - Use exactly the same terminology that's used in the study text
263
  - Mirror also the general language level of the study text. If the text is written with very simple words, then the learning objectives should be also written in very simple words
264
  - Mirror also the voice of the text (passive or active voice) and the perspective of the text (second or third person)
265
- - Are as concise as can be: they contain the smallest possible knowledge element. A learning objective does not combine multiple facts, but rather isolates individual facts
266
  - Avoid absolute terms that overstate their universality, like 'always' and 'never', unless that actually is true 100% of the time (usually there are exceptions to every rule, so account for those in your phrasing)
267
  - Conversely, avoid vague terms that say too little, like 'can', 'could', 'might' and 'may' (many things 'can', 'could' or 'might be', this doesn't say much)
268
  - Also avoid subjective terms like 'often', 'sometimes', 'many', 'few', 'common', 'rare'. Instead, make more specific and falsifiable claims like 'in most cases' or 'A is more common than B'
@@ -279,6 +219,17 @@ template_gen_prompt_a = ChatPromptTemplate(
279
  input_variables=["standardized_text"]
280
  )
281
 
282
  template_gen_prompt_b = ChatPromptTemplate(
283
  messages=[
284
  ("system", """
@@ -372,11 +323,11 @@ template_gen_prompt_b = ChatPromptTemplate(
372
  template_sanitize_learning_objectives = ChatPromptTemplate(
373
  messages=[
374
  ("system", "You are given an output of a brainstorming session that led to the generation of learning objectives. Your task is to "
375
- "turn this output into a neat numbered list of just the learning objectives, nothing else. Do not translate or otherwise edit the learning objectives, just relay them as a list.\n"
376
  "<example of a perfect list>\n"
377
- "1. De student weet dat de neus een zintuig is.\n"
378
- "2. De student weet dat de tong een zintuig is.\n"
379
- "3. De student weet dat de huid een zintuig is.\n"
380
  "</example of a perfect list>"),
381
  ("human", "Here is the output:\n "
382
  "{raw_output}")
 
1
  # config/templates.py
2
  from langchain_core.prompts.chat import ChatPromptTemplate
3
+ # config/templates.py
4
+ from config.system_prompt_texts import (
5
+ template_standardize_exercise_text,
6
+ template_standardize_studytext_text,
7
+ template_diagnose_double_negation_text,
8
+ template_diagnose_correct_answer_stands_out_text,
9
+ template_diagnose_distractor_clearly_wrong_text,
10
+ template_diagnose_distractor_partially_correct_text,
11
+ diagnose_scorecard_template_text,
12
+ template_distractors_brainstorm_1_text,
13
+ template_distractors_brainstorm_2_text,
14
+ template_consolidate_distractors_text,
15
+ template_gen_prompt_a_text,
16
+ template_gen_prompt_b_text,
17
+ template_sanitize_learning_objectives_text,
18
+ )
19
+
20
 
21
  template_standardize_exercise = ChatPromptTemplate(
22
  messages=[
 
45
 
46
  template_diagnose_double_negation = ChatPromptTemplate(
47
  messages=[
48
+ ("system", template_diagnose_double_negation_text),
49
  ("human", "{standardized_exercise}")
50
  ],
51
  input_variables=["standardized_exercise"]
 
53
 
54
  template_diagnose_correct_answer_stands_out = ChatPromptTemplate(
55
  messages=[
56
+ ("system", template_diagnose_correct_answer_stands_out_text),
57
  ("human", "{standardized_exercise}")
58
  ],
59
  input_variables=["standardized_exercise"]
 
154
  "Those are the two bounds of the spectrum range we aim to operate between during brainstorming.\n"
155
  "So, through the above process of picking some júst faulty distractors in the context of the given question, both barely too correct and barely too obviously false, you establish the two bounds of acceptable distractors. When brainstorming, don't play it entirely safe though; when in doubt about where exactly on the spectrum the distractors would lie, just list the distractors you came up with anyway.\n\n"
156
  "Next, in the brainstorming phase, it's most important that you get really creative and really try to think outside the box, to come up with the required potential alternative answer options to the exercise. We want to approach this task from all different angles, "
157
+ "to arrive at a varied selection of options, to serve as inspiration for a later stage of final selection (not now) to make the exercise the best it can be. For now, carry out the above-described prep in writing, then draft the list of{intermediate_distractors_specification}alternative distractors (in the same language as the existing exercise)."),
158
  ("human", "{standardized_exercise}")
159
  ],
160
  input_variables=["standardized_exercise", "intermediate_distractors_specification"]
 
194
  - Use exactly the same terminology that's used in the study text
195
  - Mirror also the general language level of the study text. If the text is written with very simple words, then the learning objectives should be also written in very simple words
196
  - Mirror also the voice of the text (passive or active voice) and the perspective of the text (second or third person)
197
+ - Are as **specific** as can be: they contain the smallest possible knowledge element. A learning objective does not combine multiple facts, but rather isolates individual facts
198
+ <illustration of 'specific'>
199
+ <bad example: not specific enough>
200
+
201
+ </bad example: not specific enough>
202
+ <good example: states isolated fact>
203
+
204
+ </good example: states isolated fact>
205
+ </illustration of 'specific'>
206
  - Avoid absolute terms that overstate their universality, like 'always' and 'never', unless that actually is true 100% of the time (usually there are exceptions to every rule, so account for those in your phrasing)
207
  - Conversely, avoid vague terms that say too little, like 'can', 'could', 'might' and 'may' (many things 'can', 'could' or 'might be', this doesn't say much)
208
  - Also avoid subjective terms like 'often', 'sometimes', 'many', 'few', 'common', 'rare'. Instead, make more specific and falsifiable claims like 'in most cases' or 'A is more common than B'
 
219
  input_variables=["standardized_text"]
220
  )
221
 
222
+ """
223
+ <illustration of 'specific'>
224
+ <bad example: not specific enough>
225
+
226
+ </bad example: not specific enough>
227
+ <good example: states isolated fact>
228
+
229
+ </good example: states isolated fact>
230
+ </illustration of 'specific'>
231
+ """
232
+
233
  template_gen_prompt_b = ChatPromptTemplate(
234
  messages=[
235
  ("system", """
 
323
  template_sanitize_learning_objectives = ChatPromptTemplate(
324
  messages=[
325
  ("system", "You are given an output of a brainstorming session that led to the generation of learning objectives. Your task is to "
326
+ "turn this output into a neat clean list of just the learning objectives, nothing else. Do not translate or otherwise edit the learning objectives, just relay them as a list.\n"
327
  "<example of a perfect list>\n"
328
+ "De student weet dat de neus een zintuig is.\n"
329
+ "De student weet dat de tong een zintuig is.\n"
330
+ "De student weet dat de huid een zintuig is.\n"
331
  "</example of a perfect list>"),
332
  ("human", "Here is the output:\n "
333
  "{raw_output}")
test samples.md CHANGED
@@ -1,4 +1,4 @@
-# Exercises
+# Exercises Test Set (don't contaminate prompts)
 ## False positives
 ### 1
 Theorie:
@@ -34,7 +34,7 @@ Correct antwoord:
 ### 3
 
 ---
-## Double negative
+## Clear double negative
 ### 1
 Stelling:
 Voor een volledig overzicht van organisaties die ondersteuning bieden bij eenzaamheid kun je beter niet terecht bij het WMO-loket.
@@ -46,7 +46,21 @@ Correct antwoord:
 2. Deze stelling is niet correct.
 
 
-
+## Clear jumping out answer (specificity pattern)
+### 1
+<prompt>
+<type>Vraag</type>
+<text>Hoe heet de natuurkundige wet die magnetisch gedrag van paramagnetische materialen beschrijft?</text>
+</prompt>
+
+<options>
+1. Wet van verandering
+2. Wet van magnetisme
+3. Wet van Curie-Weiss
+4. Wet van faseovergangen
+</options>
+
+<correct_answer>3</correct_answer>
 ### 2
 Stelling:
 De bovenste holle ader (*Vena Cava Superior*) komt niet uit in de rechterboezem.
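The double-negation samples in this test set follow the rule stated in the diagnosis prompt: either two negations in the prompt itself, or one in the prompt and one in an answer option. As a rough illustration of that rule only, the sketch below applies a keyword heuristic. The word list and both function names are assumptions made for this example; such a heuristic will over-flag harmless cases and is not a substitute for the LLM-based diagnosis.

```python
import re

# Assumed, non-exhaustive list of Dutch negation words; extend as needed.
NEGATIONS = ("niet", "geen", "nooit", "niemand", "nergens")
NEGATION_PATTERN = re.compile(r"\b(" + "|".join(NEGATIONS) + r")\b", re.IGNORECASE)


def count_negations(text):
    """Count whole-word occurrences of the assumed negation words."""
    return len(NEGATION_PATTERN.findall(text))


def flag_possible_double_negation(prompt, options):
    """Flag two negations in the prompt, or one in the prompt plus one in any option."""
    in_prompt = count_negations(prompt)
    if in_prompt >= 2:
        return True
    return in_prompt == 1 and any(count_negations(opt) > 0 for opt in options)


if __name__ == "__main__":
    statement = "De bovenste holle ader (Vena Cava Superior) komt niet uit in de rechterboezem."
    options = ["Deze stelling is correct", "Deze stelling is niet correct"]
    print(flag_possible_double_negation(statement, options))
```

A cheap pre-check like this could serve as a sanity baseline when evaluating the prompt against the test set, for example to spot samples where the model and the heuristic disagree.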