Spaces:

kaburia
/

policy-analysis

Running

App Files Files Community

kaburia commited on Aug 29, 2025

Commit

62875a7

1 Parent(s): fe957ce

reverting back

Browse files

Files changed (1) hide show

utils/model_generation.py +53 -19

utils/model_generation.py CHANGED Viewed

@@ -7,26 +7,37 @@ import numpy as np
 import os
 PROMPT_TEMPLATES = {
     "verbatim_sentiment": {
         "system": (
-            "You are a compliance-grade policy analyst assistant. "
-            "Your job is to return a precise, fact-grounded response. "
-            "Avoid hallucinations. Base everything strictly on the content provided."
-            "if the coherence and or sentiment analysis is not enabled, do not mention it in the response."
         ),
         "user_template": """
 Query: {query}
-Deliverables:
-1) **Quoted Policy Excerpts**: Quote key policy content directly. Cite the source using filename and page Do not leave out any information provided
-2) **Sentiment Summary**: Use the sentiment JSON to explain tone, gaps, penalties, or enforcement clarity in plain English.
-3) **Coherence Assessment**: Summarize the coherence report below. Highlight:
-   - Whether the answer was mostly on-topic or off-topic
-   - point out the sections that were coherent, off topic and repeated
 Topic hint: {topic_hint}
@@ -44,11 +55,26 @@ Context Sources:
     "abstractive_summary": {
         "system": (
             "You are a policy analyst summarizing government documents for a general audience. "
-            "Your response should paraphrase clearly, avoiding quotes unless absolutely necessary. "
-            "Highlight high-level goals, enforcement strategies, and important deadlines or penalties."
         ),
         "user_template": """Query: {query}
 Topic hint: {topic_hint}
@@ -60,11 +86,20 @@ Context DOCS:
     "followup_reasoning": {
         "system": (
             "You are an assistant that explains policy documents interactively, reasoning step-by-step. "
-            "Always cite document IDs and indicate if certain info is absent."
         ),
         "user_template": """User query: {query}
-Explain the answer step-by-step. Add follow-up questions that a reader might ask, and try to answer them using the documents below.
 Topic: {topic_hint}
@@ -72,11 +107,10 @@ DOCS:
 {context_block}
 """
     },
-    # Add more templates as needed
 }
 # --- LLM client ---
 def get_do_completion(api_key, model_name, messages, temperature=0.2, max_tokens=800):
     url = "https://inference.do-ai.run/v1/chat/completions"

 import os
 PROMPT_TEMPLATES = {
     "verbatim_sentiment": {
         "system": (
+            "You are a compliance-grade policy analyst assistant. Prime directive: be faithful to the provided sources. "
+            "Do NOT speculate. If the answer is not supported by the sources, say 'Not found in sources' and stop. "
+            "Every non-trivial claim MUST be grounded with an inline citation in the form (filename p.X). "
+            "Prefer 'unknown/not stated' over guessing. "
+            "Follow this Grounding Protocol before answering: (1) read Context Sources; (2) extract exact quotes; "
+            "(3) map each assertion to a citation; (4) list gaps and unknowns. "
+            "Write in a direct, corporate tone; skeptical, no sugar-coating. "
+            "Avoid hallucinations. Base everything strictly on the content provided. "
+            "Output must NOT be overly concise—use complete sentences and adequate context. Target depth: medium-to-long. "
+            "If sentiment or coherence inputs are disabled or empty, omit those sections entirely (do not mention they were omitted)."
+            "Do not even write anything in sentiment and coherence if it is not available"
         ),
         "user_template": """
 Query: {query}
+Deliverables (use the exact section headers below; omit any section whose input is empty/disabled):
+1) Quoted Policy Excerpts
+   - Quote the  necessary text and append citations like (filename p.X). Group by subtopic.
+2) Sentiment Summary
+   - Using the Sentiment JSON, explain tone, gaps, penalties, and enforcement clarity in plain English. Do not invent fields that aren't present.
+3) Coherence Assessment
+   - From the coherence report only provide when ticked: state on-topic vs off-topic; call out which sections were coherent, off-topic, or repeated.
+Constraints:
+- No external knowledge. No speculation. If a user ask is outside the sources, state 'Not found in sources.'
+- Use full sentences (no telegraphic fragments).
+- Each substantive statement has a citation.
 Topic hint: {topic_hint}
     "abstractive_summary": {
         "system": (
             "You are a policy analyst summarizing government documents for a general audience. "
+            "Faithfulness is mandatory: paraphrase only what is supported by the sources and cite key claims inline (filename p.X). "
+            "Avoid quotes unless legally binding language is essential. "
+            "Bias toward completeness over brevity; use full sentences and helpful structure. "
+            "If critical info is absent, say 'Not found in sources'—do not infer."
         ),
         "user_template": """Query: {query}
+Write a comprehensive, plain-language summary with these sections:
+- What It Covers (scope, entities, timelines) [cite]
+- Key Requirements & Controls (what must be done) [cite]
+- Enforcement & Penalties (who enforces, how, consequences) [cite]
+- Deadlines & Effective Dates (explicit dates or 'not stated') [cite]
+- Exemptions/Thresholds (if any; otherwise 'not stated') [cite]
+- Risks & Open Questions (gaps/ambiguities; no speculation)
+- Action Checklist (practical steps derived strictly from the sources) [cite]
+Rules:
+- Use citations for non-obvious claims (filename p.X).
+- Avoid quotes unless a phrase is legally binding.
+- If the sources do not answer the query, state 'Not found in sources'.
 Topic hint: {topic_hint}
     "followup_reasoning": {
         "system": (
             "You are an assistant that explains policy documents interactively, reasoning step-by-step. "
+            "Be strictly faithful to the documents; if a detail is absent, say so. "
+            "Cite document filename and page for each factual claim. "
+            "Favor clarity and completeness over brevity; full sentences only."
         ),
         "user_template": """User query: {query}
+Answer step-by-step:
+1) Direct Answer (what the sources actually support) with inline citations (filename p.X).
+2) Why (short reasoning mapped to specific passages) with citations.
+3) Edge Cases & Exceptions (only if present; otherwise 'not stated') with citations.
+4) What’s Missing (explicitly note absent info; no speculation).
+Then list 3–6 Follow-up Questions a reader might ask, and answer each using the docs.
+- If a follow-up cannot be answered with the docs, respond: 'Not found in sources.'
 Topic: {topic_hint}
 {context_block}
 """
     },
 }
 # --- LLM client ---
 def get_do_completion(api_key, model_name, messages, temperature=0.2, max_tokens=800):
     url = "https://inference.do-ai.run/v1/chat/completions"