Spaces: aail-hf/ensemble_machine

diana3135 committed on Nov 23, 2024

Commit 856481b

1 Parent(s): 7017268

evaluation prompt

Files changed (1)

utils.py +21 -21

@@ -58,11 +58,11 @@ def merge_texts_parallel(task_description, human_text, ai_text, api_key = None):
     "You are tasked with merging two answers into a single, coherent, and logically structured response. "
     "Your response should synthesize the strengths of both answers, eliminate redundancy, and ensure a logical flow. "
     "Follow this structured chain of thought to guide your process:\n\n"
-    "1. **Understand the Task:** Start by carefully analyzing the task description to ensure your final response aligns fully with the objective.\n"
-    "2. **Extract Key Points:** Review both answers and identify the key ideas and arguments they present.\n"
-    "3. **Resolve Redundancies and Conflicts:** Compare the two answers to eliminate repetitive content and resolve any contradictions.\n"
-    "4. **Integrate Seamlessly:** Combine the extracted points into a single, unified response, ensuring a logical and coherent structure.\n"
-    "5. **Refine for Clarity:** Polish the final response for clarity, conciseness, and alignment with the task requirements.\n\n"
     "Task Description:\n"
     f"{task_description}\n\n"
     "Answer 1:\n"
@@ -79,11 +79,11 @@ def merge_texts_sequential(task_description, human_text, api_key = None):
     f"{human_text}\n\n"
     "Your task is to refine the human's response while ensuring that the final answer fully aligns with their intent. "
     "Follow this logical chain of thought:\n"
-    "1. **Understand the Intent:** Analyze the human's response to identify their key points and intended message.\n"
-    "2. **Evaluate Clarity:** Check for any ambiguities, gaps, or unclear phrasing in the response.\n"
-    "3. **Improve Coherence:** Ensure the response flows logically and connects ideas seamlessly.\n"
-    "4. **Enhance Precision:** Refine wording to be concise and impactful while preserving the human's original intent.\n"
-    "5. **Check Alignment:** Confirm that the final response accurately represents and strengthens the human's input without introducing unintended changes.\n\n"
     "Provide the refined response below, clearly marked as the final version."
 )
@@ -97,11 +97,11 @@ def modify_with_suggestion(task_description, text, suggestions, api_key = None):
     f"{suggestions}\n\n"
     "Your task is to modify the provided answer by systematically incorporating the suggestions while maintaining clarity, coherence, and alignment with the original task. "
     "Follow this logical chain of thought:\n"
-    "1. **Understand the Task:** Reassess the original task description to ensure the modified answer remains aligned with the requirements.\n"
-    "2. **Analyze the Suggestions:** Break down each suggestion to understand its purpose and how it improves the original answer.\n"
-    "3. **Apply Improvements:** Integrate the suggestions into the answer thoughtfully, ensuring each one is addressed adequately.\n"
-    "4. **Ensure Coherence:** Verify that the modified answer flows logically and maintains a clear structure.\n"
-    "5. **Preserve Integrity:** Ensure the modified answer stays true to the original content while enhancing it as per the suggestions.\n\n"
     "Provide the improved answer below, clearly marked as the revised version."
 )
     return generate_text_with_gpt(prompt, api_key)
@@ -111,27 +111,27 @@ def get_evaluation_with_gpt(task_description, text, api_key=None):
     f"Given the task: {task_description}, the provided answer is:\n{text}\n\n"
     "You are tasked with evaluating the answer using a scale from 0 to 10 based on specific criteria. "
     "Follow a structured chain-of-thought approach for your evaluation to ensure thoroughness and objectivity:\n\n"
-    "1. **Understand the Criteria:** Carefully review each evaluation criterion to ensure you fully grasp what is being assessed:\n"
     "   - Novelty: The uniqueness and originality of the ideas.\n"
     "   - Feasibility: The practicality and implementability of suggested actions.\n"
     "   - Inimitability: How difficult for competitors to replicate.\n"
     "   - Alignment: How aligned the ideas are with Airbnb’s business objectives and 17 SDGs.\n\n"
-    "2. **Analyze the Answer:** Break down the answer into its key components and assess how well it meets each criterion.\n"
     "   - Identify strengths, weaknesses, and any gaps in the ideas provided.\n"
     "   - Consider the context of the task and whether the ideas are realistic and relevant.\n\n"
-    "3. **Assign Scores:** Use the following scale to score each criterion:\n"
     "   - 0-2: Poor fit; the idea demonstrates minimal relevance to the criteria.\n"
     "   - 3-5: Partial fit; the idea shows some relevance but contains significant shortcomings.\n"
     "   - 6-8: Good fit; the idea aligns well with the criteria, showing clear relevance and thoughtfulness.\n"
     "   - 9-10: Excellent fit; the idea fully aligns with the criteria, demonstrating exceptional insight.\n"
     "   - Note: Use the entire scoring range (0-10) and avoid defaulting to mid-range scores. If the provided answer is vague or off-topic, assign scores between 0-5.\n\n"
-    "4. **Justify Each Score:** Provide a brief explanation for each score, highlighting specific aspects of the answer that influenced your evaluation.\n\n"
-    "Format your output exactly as follows:\n"
     "Novelty: [Score] - [Justification]\n"
     "Feasibility: [Score] - [Justification]\n"
     "Inimitability: [Score] - [Justification]\n"
     "Alignment: [Score] - [Justification]\n\n"
-    "Begin your evaluation below:"
 )
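     return generate_text_with_gpt(prompt, api_key)

@@ -58,11 +58,11 @@ def merge_texts_parallel(task_description, human_text, ai_text, api_key = None):
     "You are tasked with merging two answers into a single, coherent, and logically structured response. "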
     "Your response should synthesize the strengths of both answers, eliminate redundancy, and ensure a logical flow. "
     "Follow this structured chain of thought to guide your process:\n\n"
+    "1. Understand the Task: Start by carefully analyzing the task description to ensure your final response aligns fully with the objective.\n"
+    "2. Extract Key Points: Review both answers and identify the key ideas and arguments they present.\n"
+    "3. Resolve Redundancies and Conflicts: Compare the two answers to eliminate repetitive content and resolve any contradictions.\n"
+    "4. Integrate Seamlessly: Combine the extracted points into a single, unified response, ensuring a logical and coherent structure.\n"
+    "5. Refine for Clarity: Polish the final response for clarity, conciseness, and alignment with the task requirements.\n\n"
     "Task Description:\n"
     f"{task_description}\n\n"
     "Answer 1:\n"
@@ -79,11 +79,11 @@ def merge_texts_sequential(task_description, human_text, api_key = None):
     f"{human_text}\n\n"
     "Your task is to refine the human's response while ensuring that the final answer fully aligns with their intent. "
     "Follow this logical chain of thought:\n"
+    "1. Understand the Intent: Analyze the human's response to identify their key points and intended message.\n"
+    "2. Evaluate Clarity: Check for any ambiguities, gaps, or unclear phrasing in the response.\n"
+    "3. Improve Coherence: Ensure the response flows logically and connects ideas seamlessly.\n"
+    "4. Enhance Precision: Refine wording to be concise and impactful while preserving the human's original intent.\n"
+    "5. Check Alignment: Confirm that the final response accurately represents and strengthens the human's input without introducing unintended changes.\n\n"
     "Provide the refined response below, clearly marked as the final version."
 )
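@@ -97,11 +97,11 @@ def modify_with_suggestion(task_description, text, suggestions, api_key = None):
     f"{suggestions}\n\n"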
     "Your task is to modify the provided answer by systematically incorporating the suggestions while maintaining clarity, coherence, and alignment with the original task. "
     "Follow this logical chain of thought:\n"
+    "1. Understand the Task: Reassess the original task description to ensure the modified answer remains aligned with the requirements.\n"
+    "2. Analyze the Suggestions: Break down each suggestion to understand its purpose and how it improves the original answer.\n"
+    "3. Apply Improvements: Integrate the suggestions into the answer thoughtfully, ensuring each one is addressed adequately.\n"
+    "4. Ensure Coherence: Verify that the modified answer flows logically and maintains a clear structure.\n"
+    "5. Preserve Integrity: Ensure the modified answer stays true to the original content while enhancing it as per the suggestions.\n\n"
     "Provide the improved answer below, clearly marked as the revised version."
 )
     return generate_text_with_gpt(prompt, api_key)
@@ -111,27 +111,27 @@ def get_evaluation_with_gpt(task_description, text, api_key=None):
     f"{'Given the task: '}{task_description}, the provided answer is:\n{text}\n\n"
     "You are tasked with evaluating the answer using a scale from 0 to 10 based on specific criteria. "
     "Follow a structured chain-of-thought approach for your evaluation to ensure thoroughness and objectivity:\n\n"
+    "1. Understand the Criteria: Carefully review each evaluation criterion to ensure you fully grasp what is being assessed:\n"
     "   - Novelty: The uniqueness and originality of the ideas.\n"
     "   - Feasibility: The practicality and implementability of suggested actions.\n"
     "   - Inimitability: How difficult for competitors to replicate.\n"
     "   - Alignment: How aligned the ideas are with Airbnb’s business objectives and 17 SDGs.\n\n"
+    "2. Analyze the Answer: Break down the answer into its key components and assess how well it meets each criterion.\n"
     "   - Identify strengths, weaknesses, and any gaps in the ideas provided.\n"
     "   - Consider the context of the task and whether the ideas are realistic and relevant.\n\n"
+    "3. Assign Scores: Use the following scale to score each criterion:\n"
     "   - 0-2: Poor fit; the idea demonstrates minimal relevance to the criteria.\n"
     "   - 3-5: Partial fit; the idea shows some relevance but contains significant shortcomings.\n"
     "   - 6-8: Good fit; the idea aligns well with the criteria, showing clear relevance and thoughtfulness.\n"
     "   - 9-10: Excellent fit; the idea fully aligns with the criteria, demonstrating exceptional insight.\n"
     "   - Note: Use the entire scoring range (0-10) and avoid defaulting to mid-range scores. If the provided answer is vague or off-topic, assign scores between 0-5.\n\n"
+    "4. Justify Each Score: Provide a brief explanation for each score, highlighting specific aspects of the answer that influenced your evaluation.\n\n"
+    "You should format your output exactly as follows:\n"
     "Novelty: [Score] - [Justification]\n"
     "Feasibility: [Score] - [Justification]\n"
     "Inimitability: [Score] - [Justification]\n"
     "Alignment: [Score] - [Justification]\n\n"
+    "Output your evaluation below:"
 )
     return generate_text_with_gpt(prompt, api_key)
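
Note on the output format: the evaluation prompt asks the model to reply with one `Criterion: [Score] - [Justification]` line for each of the four criteria. The commit does not show how that reply is consumed, so the following is only a minimal sketch of one way a caller might parse it. `parse_evaluation` and `_LINE_RE` are hypothetical names that do not appear in `utils.py`, and the sketch assumes the model follows the requested format, with or without the square brackets around the score.

```python
import re

# Hypothetical helper, NOT part of utils.py: parses replies shaped like
# "Novelty: [Score] - [Justification]" as requested by the evaluation prompt.
_LINE_RE = re.compile(
    r"^\s*(?P<name>Novelty|Feasibility|Inimitability|Alignment)\s*:\s*"
    r"\[?(?P<score>\d{1,2})\]?\s*-\s*(?P<why>.*\S)\s*$",
    re.MULTILINE,
)


def parse_evaluation(reply):
    """Return {criterion: (score, justification)} for well-formed lines.

    Lines that do not match the requested format, or whose score falls
    outside the 0-10 scale, are skipped rather than guessed at.
    """
    results = {}
    for match in _LINE_RE.finditer(reply):
        score = int(match.group("score"))
        if 0 <= score <= 10:
            results[match.group("name")] = (score, match.group("why"))
    return results


if __name__ == "__main__":
    sample = (
        "Novelty: 7 - Fresh angle on host incentives.\n"
        "Feasibility: [4] - Needs a large budget.\n"
    )
    print(parse_evaluation(sample))
```

A caller would probably want to check that all four criteria are present in the returned dictionary before using the scores, since a reply that drifts from the format yields fewer entries instead of raising an error.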