Mustafa-albakkar committed
Commit 74006c8 · verified · 1 Parent(s): 3b8d936

Update app.py

Files changed (1)
  1. app.py +52 -6
app.py CHANGED
@@ -33,14 +33,60 @@ model = AutoModelForCausalLM.from_pretrained(
  )
  model.eval()
 
- PROMPT_TEMPLATE = """You are a GAIA final-answer extractor.
- Your job is to extract only what comes after 'Final Answer:' in the text below.
- If no 'Final Answer:' is present, infer the most likely short final answer (one line only).
+ PROMPT_TEMPLATE = """You are an expert final-answer extractor working for the GAIA benchmark. You are facing a GAIA question exam.
+ The exam's evaluator is an AI model that uses exact matching to evaluate your answers, so you must format your Final Answer in the most concise and useful format.
+ Think in logical steps to solve the GAIA benchmark problems.
+ Your task is to read the raw output of an AI agent and extract the final concise answer.
 
- Text:
- {text}
+ ### INSTRUCTIONS:
+ 1. Look for "Final Answer:" or "final answer:".
+ 2. Extract only what comes after it.
+ 3. If there is no explicit "Final Answer:", infer what you think is the final answer.
+
+ 4. Format the answer according to GAIA Benchmark rules:
+ - Numbers → return as-is (e.g., 42)
+ - Dates → format as YYYY-MM-DD
+ - Lists → comma-separated, e.g., "Paris, London"
+ - Short words or names → return exactly that
+ 5. DO NOT include any other text, reasoning, or punctuation beyond the final concise answer.
+ Do NOT take your output from logging or error information.
+ ## EXAMPLES:
+ [
+ {{
+ "question": "What is the name of the fourth planet in the solar system?",
+ "agent_response": "The fourth planet from the Sun comes right after Earth. It is known for its reddish color due to the iron oxide on its surface.",
+ "reasoning": "The planet described as the 'red planet' is Mars.",
+ Final Answer: Mars
+ }},
+ {{
+ "question": "Order the following planets from smallest to largest by diameter: Earth, Mercury, Neptune, Venus.",
+ "agent_response": "The smallest is Mercury, followed by Venus, then Earth, and the largest among these four is Neptune.",
+ "reasoning": "The correct order is Mercury < Venus < Earth < Neptune.",
+ Final Answer: ["Mercury", "Venus", "Earth", "Neptune"]
+ }},
+ {{
+ "question": "In which year did NASA launch the Voyager 1 spacecraft?",
+ "agent_response": "It was launched in the late 1970s, a few weeks after Voyager 2, specifically in September 1977.",
+ "reasoning": "Launch year is 1977.",
+ Final Answer: "1977"
+ }},
+ {{
+ "question": "If a book costs $12 and a pen costs $3, what is the total cost of buying 5 books and 4 pens?",
+ "agent_response": "The books cost $60 in total, and the pens cost $12, so the total is their sum.",
+ "reasoning": "60 + 12 = 72.",
+ Final Answer: "72"
+ }},
+ {{
+ "question": "Which of the following animals is a mammal? [Crocodile, Dolphin, Falcon]",
+ "agent_response": "The crocodile is a reptile, the falcon is a bird, while the dolphin breathes air and nurses its young.",
+ "reasoning": "The only mammal in the list is the dolphin.",
+ Final Answer: "Dolphin"
+ }}
+ ]
+ ---
 
- Return only the final answer (no extra words).
+ Now extract the GAIA final answer from this text:
+ {text}
  """
 
  def extract_final_answer(raw: str) -> str:
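
The body of `extract_final_answer` is not part of this hunk, so the following is only a minimal sketch of how instructions 1 to 4 of the new prompt could be mirrored in plain Python around the model call. It assumes the template is filled with `str.format`, which the doubled `{{` / `}}` braces in the examples suggest. The names `TEMPLATE`, `find_marked_answer`, `normalize_answer`, and `build_prompt` are hypothetical and do not appear in `app.py`. The normalization step matters because the template's own examples quote some answers (`"1977"`) and bracket the list, while rule 4 asks for bare, comma-separated output.

```python
import re
from typing import Optional

# Hypothetical, shortened stand-in for the committed PROMPT_TEMPLATE.
# Literal braces are doubled so that str.format leaves them alone.
TEMPLATE = """Extract the final answer.
Example:
{{
"agent_response": "It is known as the red planet.",
Final Answer: Mars
}}
Now extract the GAIA final answer from this text:
{text}
"""

# Case-insensitive match of the marker; captures the rest of that line.
FINAL_ANSWER_RE = re.compile(r"final answer:\s*(.+)", re.IGNORECASE)


def find_marked_answer(raw: str) -> Optional[str]:
    """Return the text after the last 'Final Answer:' marker, or None if absent."""
    matches = FINAL_ANSWER_RE.findall(raw)
    return matches[-1].strip() if matches else None


def normalize_answer(answer: str) -> str:
    """Strip wrapping quotes and turn a bracketed list into comma-separated text."""
    answer = answer.strip()
    if answer.startswith("[") and answer.endswith("]"):
        items = [item.strip().strip("\"'") for item in answer[1:-1].split(",")]
        return ", ".join(item for item in items if item)
    return answer.strip("\"'")


def build_prompt(raw: str) -> str:
    """Fill the template; used only when no explicit marker is found."""
    return TEMPLATE.format(text=raw)


if __name__ == "__main__":
    raw_output = "The red planet is fourth from the Sun.\nFinal Answer: \"Mars\""
    marked = find_marked_answer(raw_output)
    if marked is not None:
        print(normalize_answer(marked))
    else:
        print(build_prompt(raw_output))
```

With a rule-based fast path like this, the model would only need to be invoked when the agent output lacks an explicit marker, and the same `normalize_answer` could be applied to the model's reply before exact-match scoring. Whether that fits the actual `app.py` depends on code outside this diff.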