kredd25 Claude Opus 4.7 (1M context) committed on
Commit 28ee428 · 1 Parent(s): 18b5aed

v0.15.4: risk_agent plain_verdict empty for zh + ar — language fix

Live testing of all 7 languages (en/es/zh/vi/ht/ar/tl) against the
Chicago Drexel test address showed that every language correctly
populated the advisor's insurance + mitigation arrays (confirming the
v0.15.3 Tier-1 fix) and that advisor.tldr rendered fluent prose in
every language. But risk.plain_verdict came back EMPTY for zh and ar,
the two non-Latin-script languages. Other risk-agent fields
(risk_score, risk_level, fema_gap_explanation) populated normally.
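
A per-language check for this kind of failure could look roughly like
the sketch below. It is not the project's test harness:
find_empty_fields and the sample results dict are hypothetical, and
only the field names are taken from this commit.

```python
# Hypothetical helper: report which text fields came back empty per language.
TEXT_FIELDS = ("plain_verdict", "fema_gap_explanation", "summary")


def find_empty_fields(results: dict[str, dict]) -> dict[str, list[str]]:
    """Map each language code to the text fields that are missing or blank."""
    empty: dict[str, list[str]] = {}
    for lang, risk in results.items():
        blank = [
            field
            for field in TEXT_FIELDS
            if not str(risk.get(field) or "").strip()
        ]
        if blank:
            empty[lang] = blank
    return empty


# Illustrative data only, shaped like the failure described above.
results = {
    "en": {"plain_verdict": "placeholder", "fema_gap_explanation": "x", "summary": "y"},
    "zh": {"plain_verdict": "", "fema_gap_explanation": "x", "summary": "y"},
    "ar": {"plain_verdict": "   ", "fema_gap_explanation": "x", "summary": "y"},
}
print(find_empty_fields(results))
```

Whitespace-only values count as empty here, since they would render as
a blank headline just like an empty string.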

Root cause: the field's instruction said "Second person, plain English,
~10th-grade reading level". When the system-prompt language directive
says "write everything in Mandarin/Arabic", the model gets two
contradictory signals about which language to use for THIS particular
field, and it either skips the field or returns an empty string. The
Latin-script languages were robust to the contradiction; the non-Latin
ones weren't.
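
One way to catch this class of contradiction before it ships is a
crude keyword lint over the per-field instructions. The sketch below
is an assumption, not part of the repo: conflicting_fields, the
LANGUAGE_WORDS list, and the old_instructions dict are all
hypothetical names.

```python
import re

# Hypothetical lint: flag per-field instructions that name a specific language
# while the system prompt asks for a different output language.
LANGUAGE_WORDS = re.compile(r"\b(English|Spanish|Mandarin|Arabic)\b", re.IGNORECASE)


def conflicting_fields(field_instructions: dict[str, str], output_language: str) -> list[str]:
    """Return field names whose instruction mentions a language other than the target."""
    conflicts = []
    for name, instruction in field_instructions.items():
        mentioned = {m.lower() for m in LANGUAGE_WORDS.findall(instruction)}
        if mentioned - {output_language.lower()}:
            conflicts.append(name)
    return conflicts


old_instructions = {
    "plain_verdict": "Second person, plain English, ~10th-grade reading level",
    "summary": "1 sentence for the status feed",
}
print(conflicting_fields(old_instructions, "Arabic"))
print(conflicting_fields(old_instructions, "English"))
```

Because it only matches keywords, this lint would also flag negative
phrasing such as "do NOT default to English", so its hits need a human
look rather than automatic rejection.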

Fix:
- Drop "plain English" from the plain_verdict instruction and replace it
with an explicit "Write the VALUE in the user's chosen output language
(per the language directive in the system prompt) — do NOT default
to English if another language was requested. The JSON key
'plain_verdict' itself stays English."
- Mark the field "REQUIRED, NEVER EMPTY" and add a closing reminder
near the end of the prompt: "Generate plain_verdict early in the
JSON object (it is the most prominent field on the dossier)."
- Reorder the JSON schema so plain_verdict comes third (right after
risk_score and risk_level) rather than fifth. Token-budget safety:
if reasoning mode produces a long CoT and the JSON output truncates,
the user still gets the headline verdict before later fields drop.
- Also extend the "value in user's chosen language" reminder to
fema_gap_explanation, visual_corroboration, key_risk_factors,
mitigating_factors, and summary — same class of risk, fix once.
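
The token-budget argument for the reorder can be illustrated with a
small salvage parser. This is a sketch under assumptions, not code
from risk_agent.py: salvage_string_fields is hypothetical, the sample
payloads are made up, and the regex does not unescape values.

```python
import re

# Hypothetical illustration of the ordering argument: when output is cut off
# mid-object, only string fields that were already closed can be recovered.
COMPLETE_STRING_FIELD = re.compile(r'"(\w+)"\s*:\s*"((?:[^"\\]|\\.)*)"')


def salvage_string_fields(truncated_json: str) -> dict[str, str]:
    """Recover fully emitted "key": "value" pairs from a truncated JSON object."""
    return dict(COMPLETE_STRING_FIELD.findall(truncated_json))


# Same truncation pressure, two field orders (sample values are illustrative).
new_order = (
    '{"risk_score": 72, "risk_level": "high", '
    '"plain_verdict": "Expect basement flooding.", "aep_estimate": 0.0'
)
old_order = (
    '{"risk_score": 72, "risk_level": "high", "aep_estimate": 0.04, '
    '"mortgage_30yr_probability": 0.68, "plain_verd'
)

print(salvage_string_fields(new_order))
print(salvage_string_fields(old_order))
```

With plain_verdict third, the headline survives a cut that lands
anywhere after it; with the old fifth position, the same cut loses it.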

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Files changed (1)
  1. app/agents/risk_agent.py +10 -8
app/agents/risk_agent.py CHANGED
@@ -142,23 +142,25 @@ IMPORTANT CONTEXT:
  - If there are many 311 flood reports but FEMA says "minimal risk", the FEMA designation is misleading
  - Chicago's sewer system overwhelms after ~0.67 inches of rain per hour

- Return a JSON object with:
+ Return a JSON object with these fields, IN THIS ORDER:
  {{
  "risk_score": <0-100 integer>,
  "risk_level": "low" | "medium" | "high",
+ "plain_verdict": "<REQUIRED, NEVER EMPTY. The BOTTOM LINE for someone about to live/buy/rent here, as if you were the friend they texted who happens to be a flood expert. Second person, ~10th-grade reading level, ONE paragraph (3-5 sentences). Write the VALUE in the user's chosen output language (per the language directive in the system prompt) — do NOT default to English if another language was requested. The JSON key 'plain_verdict' itself stays English. Lead with the verdict in sentence 1, then the single most important reason, then the trend direction (improving / stable / worsening). Quantify where possible (probabilities, counts, dollar figures). Do not hedge; do not generate alarm; sound like an analyst, not a parent.>",
  "aep_estimate": <estimated annual exceedance probability as decimal, e.g. 0.04>,
  "mortgage_30yr_probability": <cumulative probability over 30 years, e.g. 0.68>,
- "plain_verdict": "<the BOTTOM LINE for someone about to live/buy/rent here, written as if you were the friend they texted who happens to be a flood expert. Second person, plain English, ~10th-grade reading level, ONE paragraph (3-5 sentences). Lead with the verdict in the first sentence, then the single most important reason, then the trend direction (improving / stable / worsening) and why. Quantify where possible (probabilities, counts, dollar figures). Do not hedge unnecessarily and do not generate alarm; sound like an analyst, not a parent.>",
- "fema_gap_explanation": "<2-3 sentences explaining if/why FEMA designation is misleading>",
- "visual_corroboration": {"<2-3 sentences on what the photo confirms, contradicts, or adds beyond the data; '' if no image was provided>" if has_image else "''"},
- "key_risk_factors": ["<ranked list of top risk factors>"],
- "mitigating_factors": ["<factors that reduce risk>"],
- "summary": "<1 sentence for the status feed>"
+ "fema_gap_explanation": "<2-3 sentences explaining if/why FEMA designation is misleading. Value in the user's chosen language.>",
+ "visual_corroboration": {"<2-3 sentences on what the photo confirms, contradicts, or adds beyond the data. Value in the user's chosen language. '' if no image was provided>" if has_image else "''"},
+ "key_risk_factors": ["<ranked list of top risk factors. Each entry in the user's chosen language.>"],
+ "mitigating_factors": ["<factors that reduce risk. Each entry in the user's chosen language.>"],
+ "summary": "<1 sentence for the status feed, in the user's chosen language.>"
  }}

  Think step by step. Integrate visual and data evidence. Reference the
  photo directly in your reasoning ("I can see ...", "The image shows ...")
- when relevant. Return ONLY the JSON object at the end."""
+ when relevant. Generate plain_verdict early in the JSON object (it is
+ the most prominent field on the dossier). Return ONLY the JSON object
+ at the end."""

  # Build the user message content. Per Gemma 4 best practice,
  # image content parts go BEFORE the text part. Order matches the
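
The doubled braces and the inline conditional in the hunk above suggest
the prompt is built as a Python f-string. The sketch below shows how
such a template renders, using a cut-down stand-in rather than the real
prompt; schema_snippet and its shortened field text are hypothetical.

```python
def schema_snippet(has_image: bool) -> str:
    """Render a cut-down stand-in for the schema text (not the real prompt)."""
    # In an f-string, {{ and }} emit literal braces, while a single-brace
    # group is evaluated as an expression and replaced by its value.
    return f"""{{
  "risk_score": <0-100 integer>,
  "visual_corroboration": {"<what the photo confirms>" if has_image else "''"}
}}"""


print(schema_snippet(has_image=True))
print(schema_snippet(has_image=False))
```

In this stand-in the double quotes around the placeholder belong to the
Python string literal, so they do not appear in the rendered text; only
the placeholder itself, or '' when no image is supplied, reaches the
model.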