Commit 856481b · 1 Parent(s): 7017268
diana3135 committed
evaluation prompt
utils.py CHANGED
@@ -58,11 +58,11 @@ def merge_texts_parallel(task_description, human_text, ai_text, api_key = None):
         "You are tasked with merging two answers into a single, coherent, and logically structured response. "
         "Your response should synthesize the strengths of both answers, eliminate redundancy, and ensure a logical flow. "
         "Follow this structured chain of thought to guide your process:\n\n"
-        "1.
-        "2.
-        "3.
-        "4.
-        "5.
         "Task Description:\n"
         f"{task_description}\n\n"
         "Answer 1:\n"
@@ -79,11 +79,11 @@ def merge_texts_sequential(task_description, human_text, api_key = None):
         f"{human_text}\n\n"
         "Your task is to refine the human's response while ensuring that the final answer fully aligns with their intent. "
         "Follow this logical chain of thought:\n"
-        "1.
-        "2.
-        "3.
-        "4.
-        "5.
         "Provide the refined response below, clearly marked as the final version."
     )
 
@@ -97,11 +97,11 @@ def modify_with_suggestion(task_description, text, suggestions, api_key = None):
         f"{suggestions}\n\n"
         "Your task is to modify the provided answer by systematically incorporating the suggestions while maintaining clarity, coherence, and alignment with the original task. "
         "Follow this logical chain of thought:\n"
-        "1.
-        "2.
-        "3.
-        "4.
-        "5.
         "Provide the improved answer below, clearly marked as the revised version."
     )
     return generate_text_with_gpt(prompt, api_key)
@@ -111,27 +111,27 @@ def get_evaluation_with_gpt(task_description, text, api_key=None):
         f"Given the task: {task_description}, the provided answer is:\n{text}\n\n"
         "You are tasked with evaluating the answer using a scale from 0 to 10 based on specific criteria. "
         "Follow a structured chain-of-thought approach for your evaluation to ensure thoroughness and objectivity:\n\n"
-        "1.
         "   - Novelty: The uniqueness and originality of the ideas.\n"
         "   - Feasibility: The practicality and implementability of suggested actions.\n"
         "   - Inimitability: How difficult for competitors to replicate.\n"
         "   - Alignment: How aligned the ideas are with Airbnb’s business objectives and 17 SDGs.\n\n"
-        "2.
         "   - Identify strengths, weaknesses, and any gaps in the ideas provided.\n"
         "   - Consider the context of the task and whether the ideas are realistic and relevant.\n\n"
-        "3.
         "   - 0-2: Poor fit; the idea demonstrates minimal relevance to the criteria.\n"
         "   - 3-5: Partial fit; the idea shows some relevance but contains significant shortcomings.\n"
         "   - 6-8: Good fit; the idea aligns well with the criteria, showing clear relevance and thoughtfulness.\n"
         "   - 9-10: Excellent fit; the idea fully aligns with the criteria, demonstrating exceptional insight.\n"
         "   - Note: Use the entire scoring range (0-10) and avoid defaulting to mid-range scores. If the provided answer is vague or off-topic, assign scores between 0-5.\n\n"
-        "4.
-        "
         "Novelty: [Score] - [Justification]\n"
         "Feasibility: [Score] - [Justification]\n"
         "Inimitability: [Score] - [Justification]\n"
         "Alignment: [Score] - [Justification]\n\n"
-        "
     )
 
     return generate_text_with_gpt(prompt, api_key)
@@ -58,11 +58,11 @@ def merge_texts_parallel(task_description, human_text, ai_text, api_key = None):
         "You are tasked with merging two answers into a single, coherent, and logically structured response. "
         "Your response should synthesize the strengths of both answers, eliminate redundancy, and ensure a logical flow. "
         "Follow this structured chain of thought to guide your process:\n\n"
+        "1. Understand the Task: Start by carefully analyzing the task description to ensure your final response aligns fully with the objective.\n"
+        "2. Extract Key Points: Review both answers and identify the key ideas and arguments they present.\n"
+        "3. Resolve Redundancies and Conflicts: Compare the two answers to eliminate repetitive content and resolve any contradictions.\n"
+        "4. Integrate Seamlessly: Combine the extracted points into a single, unified response, ensuring a logical and coherent structure.\n"
+        "5. Refine for Clarity: Polish the final response for clarity, conciseness, and alignment with the task requirements.\n\n"
         "Task Description:\n"
         f"{task_description}\n\n"
         "Answer 1:\n"
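The prompt in this hunk is built from adjacent string literals, which Python joins into one string; only the pieces prefixed with `f` are interpolated. The sketch below illustrates that pattern in isolation. The helper name `build_merge_prompt`, its parameter names, the shortened step titles, and the "Answer 2" section are assumptions made for illustration; the hunk is cut off after "Answer 1:", so the rest of the real prompt is not known.

```python
def build_merge_prompt(task_description, answer_1, answer_2):
    """Hypothetical helper: assemble a merge prompt from adjacent string literals."""
    return (
        "Follow this structured chain of thought to guide your process:\n\n"
        # Step titles only; the full step text is in the hunk above.
        "1. Understand the Task\n"
        "2. Extract Key Points\n"
        "3. Resolve Redundancies and Conflicts\n"
        "4. Integrate Seamlessly\n"
        "5. Refine for Clarity\n\n"
        "Task Description:\n"
        f"{task_description}\n\n"
        "Answer 1:\n"
        f"{answer_1}\n\n"
        # Assumed continuation: the diff does not show what follows "Answer 1:".
        "Answer 2:\n"
        f"{answer_2}\n"
    )


if __name__ == "__main__":
    print(build_merge_prompt("Plan a campaign", "First answer", "Second answer"))
```

Because the literals are joined with nothing in between, each piece has to carry its own trailing space or `\n`; a piece without one runs straight into the next sentence.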
@@ -79,11 +79,11 @@ def merge_texts_sequential(task_description, human_text, api_key = None):
         f"{human_text}\n\n"
         "Your task is to refine the human's response while ensuring that the final answer fully aligns with their intent. "
         "Follow this logical chain of thought:\n"
+        "1. Understand the Intent: Analyze the human's response to identify their key points and intended message.\n"
+        "2. Evaluate Clarity: Check for any ambiguities, gaps, or unclear phrasing in the response.\n"
+        "3. Improve Coherence: Ensure the response flows logically and connects ideas seamlessly.\n"
+        "4. Enhance Precision: Refine wording to be concise and impactful while preserving the human's original intent.\n"
+        "5. Check Alignment: Confirm that the final response accurately represents and strengthens the human's input without introducing unintended changes.\n\n"
         "Provide the refined response below, clearly marked as the final version."
     )
 
@@ -97,11 +97,11 @@ def modify_with_suggestion(task_description, text, suggestions, api_key = None):
         f"{suggestions}\n\n"
         "Your task is to modify the provided answer by systematically incorporating the suggestions while maintaining clarity, coherence, and alignment with the original task. "
         "Follow this logical chain of thought:\n"
+        "1. Understand the Task: Reassess the original task description to ensure the modified answer remains aligned with the requirements.\n"
+        "2. Analyze the Suggestions: Break down each suggestion to understand its purpose and how it improves the original answer.\n"
+        "3. Apply Improvements: Integrate the suggestions into the answer thoughtfully, ensuring each one is addressed adequately.\n"
+        "4. Ensure Coherence: Verify that the modified answer flows logically and maintains a clear structure.\n"
+        "5. Preserve Integrity: Ensure the modified answer stays true to the original content while enhancing it as per the suggestions.\n\n"
         "Provide the improved answer below, clearly marked as the revised version."
     )
     return generate_text_with_gpt(prompt, api_key)
@@ -111,27 +111,27 @@ def get_evaluation_with_gpt(task_description, text, api_key=None):
         f"Given the task: {task_description}, the provided answer is:\n{text}\n\n"
         "You are tasked with evaluating the answer using a scale from 0 to 10 based on specific criteria. "
         "Follow a structured chain-of-thought approach for your evaluation to ensure thoroughness and objectivity:\n\n"
+        "1. Understand the Criteria: Carefully review each evaluation criterion to ensure you fully grasp what is being assessed:\n"
         "   - Novelty: The uniqueness and originality of the ideas.\n"
         "   - Feasibility: The practicality and implementability of suggested actions.\n"
         "   - Inimitability: How difficult for competitors to replicate.\n"
         "   - Alignment: How aligned the ideas are with Airbnb’s business objectives and 17 SDGs.\n\n"
+        "2. Analyze the Answer: Break down the answer into its key components and assess how well it meets each criterion.\n"
         "   - Identify strengths, weaknesses, and any gaps in the ideas provided.\n"
         "   - Consider the context of the task and whether the ideas are realistic and relevant.\n\n"
+        "3. Assign Scores: Use the following scale to score each criterion:\n"
         "   - 0-2: Poor fit; the idea demonstrates minimal relevance to the criteria.\n"
         "   - 3-5: Partial fit; the idea shows some relevance but contains significant shortcomings.\n"
         "   - 6-8: Good fit; the idea aligns well with the criteria, showing clear relevance and thoughtfulness.\n"
         "   - 9-10: Excellent fit; the idea fully aligns with the criteria, demonstrating exceptional insight.\n"
         "   - Note: Use the entire scoring range (0-10) and avoid defaulting to mid-range scores. If the provided answer is vague or off-topic, assign scores between 0-5.\n\n"
+        "4. Justify Each Score: Provide a brief explanation for each score, highlighting specific aspects of the answer that influenced your evaluation.\n\n"
+        "You should format your output exactly as follows:\n"
         "Novelty: [Score] - [Justification]\n"
         "Feasibility: [Score] - [Justification]\n"
         "Inimitability: [Score] - [Justification]\n"
         "Alignment: [Score] - [Justification]\n\n"
+        "Output your evaluation below:"
     )
 
     return generate_text_with_gpt(prompt, api_key)
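Since the evaluation prompt asks for one `Criterion: [Score] - [Justification]` line per criterion, a caller could read the reply back with a small parser. The following is a minimal sketch under that assumption, not code from this commit: `parse_evaluation` and `score_band` are hypothetical names, and the regular expression assumes the model follows the requested format (optionally keeping the square brackets around the score).

```python
import re

# One line per criterion: "<Name>: <score> - <justification>".
# Brackets around the score are tolerated in case the model echoes them.
LINE_RE = re.compile(
    r"^[ \t]*(Novelty|Feasibility|Inimitability|Alignment)[ \t]*:"
    r"[ \t]*\[?(\d{1,2})\]?[ \t]*-[ \t]*(.*)$",
    re.MULTILINE,
)


def parse_evaluation(reply):
    """Return {criterion: (score, justification)} for each well-formed line."""
    results = {}
    for name, raw_score, justification in LINE_RE.findall(reply):
        score = int(raw_score)
        if 0 <= score <= 10:  # the prompt defines a 0-10 scale
            results[name] = (score, justification.strip())
    return results


def score_band(score):
    """Map a 0-10 score to the band names used in the prompt's scale."""
    if score <= 2:
        return "Poor fit"
    if score <= 5:
        return "Partial fit"
    if score <= 8:
        return "Good fit"
    return "Excellent fit"


if __name__ == "__main__":
    # Made-up reply, used only to exercise the parser.
    sample = (
        "Novelty: 7 - Combines two familiar ideas in a new way.\n"
        "Feasibility: [4] - Costs are not addressed.\n"
    )
    for name, (score, why) in parse_evaluation(sample).items():
        print(name, score, score_band(score), why)
```

Lines that do not match the format, or whose score falls outside 0-10, are skipped rather than raising, so a caller can detect an incomplete reply by checking which of the four criteria are missing from the result.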