BtB-ExpC/Exercises

BtB-ExpC committed on Feb 15, 2025

Commit

5b78005

1 Parent(s): 77dacd9

Improved scorecard prompt by adding standardized exercise as context

Files changed (4)

app/ui/diagnoser_tab.py

@@ -51,7 +51,12 @@ def build_diagnoser_tab():
         # Create 10 Response textboxes
         with gr.Column():
             diagnoser_responses = [
-                gr.Textbox(label=f"Response {i + 1}", interactive=False, visible=(i == 0))
+                gr.Textbox(
+                    label=f"Response {i + 1}",
+                    interactive=False,
+                    visible=(i == 0),
+                    lines=30
+                )
                 for i in range(10)
             ]

chains/diagnoser/diagnoser_chain.py

@@ -41,7 +41,7 @@ class DiagnoserChain(BaseModel):
         # Step 4: Generate a one-line scorecard
         prompt = await self.template_diagnose_scorecard.aformat_prompt(
-            combined_diagnosis=combined_diagnosis
+            combined_diagnosis=combined_diagnosis, standardized_exercise=standardized_exercise
         )
         scorecard_messages = prompt.to_messages()
         scorecard_response = await self.llm_4o.ainvoke(scorecard_messages)

config/system_prompt_texts.py

@@ -758,7 +758,7 @@ After lots of iterative prep and reasoning, considering a wide range of options,
 ## Pointers
 - Try to exactly match the content and language level in the learning objective. If it's stated in simple words, use equally simple words in the exercises as well.
-- Avoid the use of unnecessarily strong false statements or distractors using words like "all", "never" or "exclusively" etc., because they're often too easy to dismiss (unless the correct answer is similarly extreme). For example: instead of your false statement being "De enige factor die slaapkwaliteit beïnvloedt, is consistent naar bed gaan", it is better to give a less extreme (and therefore more plausible-sounding) statement, like: "De hoofdfactor die slaapkwaliteit beïnvloedt, is consistent naar bed gaan".
+- Avoid the use of unnecessarily strong false statements or distractors using words like "all", "never" or "exclusively" etc., unless the correct answer is equally extreme. Otherwise, distractors using such strong claims are too easy to dismiss. For example: instead of your false statement being "De *enige* factor die slaapkwaliteit beïnvloedt, is consistent naar bed gaan", it is better to give a less extreme (and therefore more plausible-sounding) statement, like: "De hoofdfactor die slaapkwaliteit beïnvloedt, is consistent naar bed gaan".
 - Output format doesn't matter, prioritize careful reasoning.
 """

config/templates.py

@@ -119,9 +119,12 @@ diagnose_scorecard_template = ChatPromptTemplate(
         </example 3>
         Oftentimes, diagnoses will be elaborate and quite nuanced, first viewing the issue from different angles, considering both scenarios of passing and failing equally. For this reason, when deciding on your binary classification, you should focus only on the very last concluding sentences of each diagnosis to determine a pass or fail.
         """),
-        ("human", "{combined_diagnosis}")
+        ("human", "For context, here is the exercise that's being diagnosed:\n"
+                  "{standardized_exercise}\n\n"
+                  "Here are the diagnoses:\n"
+                  "{combined_diagnosis}")
     ],
-    input_variables=["combined_diagnosis"]
+    input_variables=["combined_diagnosis", "standardized_exercise"]
 )
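
To illustrate what the new human message looks like once both variables are filled in, here is a minimal sketch. It uses plain `str.format` as a stand-in for the LangChain template so that it runs without dependencies. The names `HUMAN_TEMPLATE` and `render_human_message`, and the sample inputs, are hypothetical and do not appear in the commit.

```python
# Stand-in for the "human" message of diagnose_scorecard_template.
# Assumption: plain str.format replaces LangChain's ChatPromptTemplate here,
# purely to keep the sketch dependency-free.
HUMAN_TEMPLATE = (
    "For context, here is the exercise that's being diagnosed:\n"
    "{standardized_exercise}\n\n"
    "Here are the diagnoses:\n"
    "{combined_diagnosis}"
)


def render_human_message(standardized_exercise: str, combined_diagnosis: str) -> str:
    """Fill both placeholders; omitting either argument raises a TypeError."""
    return HUMAN_TEMPLATE.format(
        standardized_exercise=standardized_exercise,
        combined_diagnosis=combined_diagnosis,
    )


if __name__ == "__main__":
    # Hypothetical sample inputs, only for demonstration.
    print(render_human_message(
        standardized_exercise="Statement: sleep quality depends on ...",
        combined_diagnosis="Diagnosis 1: ... Conclusion: pass.",
    ))
```

Because the exercise text precedes the diagnoses, the scorecard model reads the context first. Any caller that formats the template must now pass both variables, as the change in `diagnoser_chain.py` does.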