v0.15.4: risk_agent plain_verdict empty for zh + ar — language fix
Live testing of all 7 languages (en/es/zh/vi/ht/ar/tl against the
Chicago Drexel test address) showed every language correctly populated
the advisor's insurance + mitigation arrays (confirming the v0.15.3
Tier-1 fix), AND advisor.tldr rendered fluent prose in every language.
But risk.plain_verdict came back EMPTY for zh and ar — the two
non-Latin-script languages. Other risk-agent fields (risk_score,
risk_level, fema_gap_explanation) populated normally.
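A per-language check of this kind can be sketched as below. This is a minimal illustration, not the project's test harness: `find_empty_fields` and the sample payloads are hypothetical, and only the field names come from this commit.

```python
# Hypothetical helper: report which required risk-agent fields came back
# missing or blank for each language. The field names are taken from the
# commit message; the function and sample data are illustrative assumptions.
REQUIRED_FIELDS = ("risk_score", "risk_level", "plain_verdict", "fema_gap_explanation")


def find_empty_fields(responses):
    """Map language code -> list of required fields that are missing or blank."""
    problems = {}
    for lang, payload in responses.items():
        empty = []
        for field in REQUIRED_FIELDS:
            value = payload.get(field)
            # None and whitespace-only strings count as empty; 0 is a valid score.
            if value is None or (isinstance(value, str) and not value.strip()):
                empty.append(field)
        if empty:
            problems[lang] = empty
    return problems


if __name__ == "__main__":
    sample = {
        "en": {"risk_score": 72, "risk_level": "high",
               "plain_verdict": "Expect basement flooding.",
               "fema_gap_explanation": "..."},
        "zh": {"risk_score": 72, "risk_level": "high",
               "plain_verdict": "",
               "fema_gap_explanation": "..."},
    }
    print(find_empty_fields(sample))  # {'zh': ['plain_verdict']}
```

Run against one parsed response per language, a check like this would surface the zh/ar gap directly while leaving fully populated languages out of the report.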
Root cause: the field's instruction said "Second person, plain English,
~10th-grade reading level" — when the system-prompt language directive
says "write everything in Mandarin/Arabic", the model gets two
contradictory signals about what language to use for THIS particular
field and either skips it or returns an empty string. The Latin-script
languages were robust to the contradiction; the non-Latin ones weren't.
Fix:
- Drop "plain English" from the plain_verdict instruction — replace
with explicit "Write the VALUE in the user's chosen output language
(per the language directive in the system prompt) — do NOT default
to English if another language was requested. The JSON key
'plain_verdict' itself stays English."
- Mark the field "REQUIRED, NEVER EMPTY" and add a closing reminder
near the end of the prompt: "Generate plain_verdict early in the
JSON object (it is the most prominent field on the dossier)."
- Reorder the JSON schema so plain_verdict comes third (right after
risk_score and risk_level) rather than fifth. Token-budget safety:
if reasoning mode produces a long CoT and the JSON output truncates,
the user still gets the headline verdict before later fields drop.
- Also extend the "value in user's chosen language" reminder to
fema_gap_explanation, visual_corroboration, key_risk_factors,
mitigating_factors, and summary — same class of risk, fix once.
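The token-budget argument for the reordering can be illustrated with a small sketch. Nothing in this commit says the app salvages truncated output, so `salvage_string_field` and the sample strings are hypothetical. The sketch only shows that a field emitted early survives a cut that would lose the same field if it came later.

```python
import json
import re


def salvage_string_field(partial_json, key):
    """Return a fully closed string field from possibly truncated JSON, else None.

    Hypothetical helper for illustration only. It does a simple pattern
    search, so it could be fooled by the key name appearing inside another
    string value.
    """
    # Match "key": "<string literal>" only when the literal has its closing quote.
    pattern = r'"%s"\s*:\s*("(?:[^"\\]|\\.)*")' % re.escape(key)
    match = re.search(pattern, partial_json)
    if match is None:
        return None
    return json.loads(match.group(1))


if __name__ == "__main__":
    # New order: plain_verdict is third, so it is complete before the cut.
    new_order = ('{"risk_score": 72, "risk_level": "high", '
                 '"plain_verdict": "Expect water in the basement.", '
                 '"aep_estimate": 0.0')
    # Old order: plain_verdict is fifth, so the same budget cuts it off mid-string.
    old_order = ('{"risk_score": 72, "risk_level": "high", "aep_estimate": 0.04, '
                 '"mortgage_30yr_probability": 0.68, '
                 '"plain_verdict": "Expect water in the ba')

    print(salvage_string_field(new_order, "plain_verdict"))  # Expect water in the basement.
    print(salvage_string_field(old_order, "plain_verdict"))  # None
```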
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- app/agents/risk_agent.py +10 -8
@@ -142,23 +142,25 @@ IMPORTANT CONTEXT:
 - If there are many 311 flood reports but FEMA says "minimal risk", the FEMA designation is misleading
 - Chicago's sewer system overwhelms after ~0.67 inches of rain per hour
 
-Return a JSON object with:
+Return a JSON object with these fields, IN THIS ORDER:
 {{
 "risk_score": <0-100 integer>,
 "risk_level": "low" | "medium" | "high",
+"plain_verdict": "<REQUIRED, NEVER EMPTY. The BOTTOM LINE for someone about to live/buy/rent here, as if you were the friend they texted who happens to be a flood expert. Second person, ~10th-grade reading level, ONE paragraph (3-5 sentences). Write the VALUE in the user's chosen output language (per the language directive in the system prompt) — do NOT default to English if another language was requested. The JSON key 'plain_verdict' itself stays English. Lead with the verdict in sentence 1, then the single most important reason, then the trend direction (improving / stable / worsening). Quantify where possible (probabilities, counts, dollar figures). Do not hedge; do not generate alarm; sound like an analyst, not a parent.>",
 "aep_estimate": <estimated annual exceedance probability as decimal, e.g. 0.04>,
 "mortgage_30yr_probability": <cumulative probability over 30 years, e.g. 0.68>,
-"
-"
-"
-"
-"
-"summary": "<1 sentence for the status feed>"
+"fema_gap_explanation": "<2-3 sentences explaining if/why FEMA designation is misleading. Value in the user's chosen language.>",
+"visual_corroboration": {"<2-3 sentences on what the photo confirms, contradicts, or adds beyond the data. Value in the user's chosen language. '' if no image was provided>" if has_image else "''"},
+"key_risk_factors": ["<ranked list of top risk factors. Each entry in the user's chosen language.>"],
+"mitigating_factors": ["<factors that reduce risk. Each entry in the user's chosen language.>"],
+"summary": "<1 sentence for the status feed, in the user's chosen language.>"
 }}
 
 Think step by step. Integrate visual and data evidence. Reference the
 photo directly in your reasoning ("I can see ...", "The image shows ...")
-when relevant.
+when relevant. Generate plain_verdict early in the JSON object (it is
+the most prominent field on the dossier). Return ONLY the JSON object
+at the end."""
 
 # Build the user message content. Per Gemma 4 best practice,
 # image content parts go BEFORE the text part. Order matches the