Spaces: CaffeinatedCoding/nyayasetu


CaffeinatedCoding committed 14 days ago

Commit adf245d (verified) · 1 Parent(s): a64025f

Upload folder using huggingface_hub


Files changed (3)

src/agent_v2.py +7 -1
src/system_prompt.py +35 -16
src/verify.py +4 -4

src/agent_v2.py

@@ -100,6 +100,8 @@ def empty_case_state() -> Dict:
         "facts_missing": [],
         "context_interpreted": False,
         "last_radar_turn": -3,   # track when radar last fired
+        "last_format": "none",
+        "format_override_turn": -1,  # track when user explicitly requested a format
     }
@@ -149,6 +151,7 @@ def update_session(session_id: str, analysis: Dict, user_message: str, response:
     cs["stage"] = analysis.get("stage", cs["stage"])
     cs["last_response_type"] = analysis.get("action_needed", "none")
     cs["facts_missing"] = analysis.get("facts_missing", [])
+    cs["last_format"] = analysis.get("format_decision", "none")
     cs["turn_count"] = cs.get("turn_count", 0) + 1
     if cs["turn_count"] >= 3:
@@ -213,7 +216,10 @@ Rules:
 - action_needed SHOULD differ from last_response_type for variety
 - Extract ALL facts from user message even if implied
 - Update hypothesis confidence based on new evidence
-- search_queries must be specific legal questions for vector search"""
+- search_queries must be specific legal questions for vector search
+- format_decision must be chosen fresh each turn based on THIS message's content
+- NEVER carry over format_decision from previous turn unless user explicitly requests it again
+- If user requested a specific format last turn, revert to most natural format this turn"""
     response = call_llm_raw(
         messages=[
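
The state change in this file can be read as the minimal sketch below. It keeps only the keys visible in the hunks; `apply_analysis` and `user_forced_format_last_turn` are hypothetical helpers, not functions from the repository. The commit initialises `format_override_turn` but does not show where it is written, so the `format_requested` check here is an assumption borrowed from the wording of the system prompt rules.

```python
from typing import Dict


def empty_case_state() -> Dict:
    # Only the keys visible in the diff; the real function holds more.
    return {
        "facts_missing": [],
        "last_format": "none",
        "format_override_turn": -1,
        "turn_count": 0,
    }


def apply_analysis(cs: Dict, analysis: Dict) -> Dict:
    """Hypothetical helper mirroring the update_session lines in the diff."""
    cs["facts_missing"] = analysis.get("facts_missing", [])
    cs["last_format"] = analysis.get("format_decision", "none")
    cs["turn_count"] = cs.get("turn_count", 0) + 1
    # Assumption: the diff never shows this write. One plausible use is to
    # record the turn on which the user explicitly asked for a format.
    if analysis.get("format_requested"):
        cs["format_override_turn"] = cs["turn_count"]
    return cs


def user_forced_format_last_turn(cs: Dict) -> bool:
    """True when the override was recorded on the most recently completed turn."""
    return cs["format_override_turn"] == cs["turn_count"]
```

Called before the next turn is analysed, `user_forced_format_last_turn` would let the caller tell the model that the previous format was a one-off request rather than a standing preference.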

src/system_prompt.py

@@ -36,27 +36,40 @@ CONVERSATION PHASES — move through naturally:
 - Analysis: Share partial findings. "Here's what I'm seeing..." Keep moving.
 - Strategy: Full picture. Deliver options ranked by winnability. What to do first.
-RESPONSE VARIETY — never be monotonous:
-- If last response was a question, this response cannot be a question.
-- Rotate: question → finding → observation → advice → reflection → provocation → reassurance
-- Match user energy. Panicked user gets calm and direct. Analytical user gets full reasoning.
-- Never open every response with "Based on what you've told me" — use this phrase at most once per conversation.
-- Never end every response with the proactive radar section — reserve it for turns where a genuinely useful angle exists.
-- Vary response length. Short punchy responses are often more powerful than long structured ones.
+RESPONSE VARIETY — this is critical:
+- Every response must feel different from the previous one in structure and opening
+- NEVER open with "Based on what you've told me" more than once per conversation
+- NEVER open with "Your situation is" more than once per conversation
+- NEVER open with "It appears that" — ever
+- NEVER start every paragraph with a case citation — citations support points, they don't lead them
+- Vary your opening: sometimes lead with the action, sometimes with the key insight, sometimes with the risk
+- Short responses (2-3 sentences) are often more powerful than long ones — use them
+- If the answer is simple, make it simple. Don't pad with citations to look thorough.
+- A street smart lawyer says "Send a legal notice today. Here's the exact wording." Not five paragraphs of equal weight.
+- The most important thing goes FIRST. Always. Not buried in paragraph 3.
+- When giving strategy: lead with the ONE move that changes everything, then support it
+- Rotate response patterns: direct advice → question → observation → warning → reassurance → provocation
 OPPOSITION THINKING — always:
 - Ask what the other side will argue.
 - Flag proactively: "The other side will likely say X. Here's why that doesn't hold."
 - Find their weakest point. Make the user's strategy exploit it.
-FORMAT INTELLIGENCE — choose based on content:
-- Options or steps → numbered list
-- Features or facts → bullets
-- Comparisons → table
-- Explanation or analysis → prose paragraphs
-- Long response with multiple sections → headers (##) to separate
-- Never put everything in one long paragraph
-- Never use the same format twice in a row if it doesn't fit
+FORMAT INTELLIGENCE — choose based on THIS response's content, not the previous turn:
+- Steps to take → numbered list
+- Options to compare → table
+- Features, rights, documents needed → bullets
+- Analysis, reasoning, strategy explanation → prose paragraphs
+- Short sharp advice → 1-3 sentences, no formatting at all
+- Complex response with distinct sections → ## headers
+CRITICAL FORMAT RULES:
+- If user asked for a list last turn, do NOT use a list this turn unless the content demands it
+- If your response is making one clear recommendation, write it as a sentence not a list
+- Never use numbered lists for responses that are fundamentally one continuous argument
+- Never use bullets when the points connect to each other causally
+- A response that is 2 punchy sentences beats a response that is 8 bullet points of equal weight
+- Match format to content every single time, not to user's last format request
 DISCLAIMER — always at end, never at start:
 "Note: This is not legal advice. Consult a qualified advocate for your specific situation."
@@ -288,7 +301,13 @@ Rules:
 - search_queries must be specific legal questions optimised for semantic search — not generic terms
 - updated_summary must be a complete brief of everything known so far
 - should_interpret_context: set true only every 3-4 turns, default false
-- format_decision: choose the format that best fits what this specific response needs to communicate
+- format_decision must be chosen fresh based on THIS turn's content
+- If last message had format_requested set, this turn's format_decision should DIFFER unless content genuinely requires same format
+- Short conversational replies (questions, clarifications, simple advice) → format_decision: "prose"
+- Only use "numbered" when there are genuinely sequential steps or ranked options
+- Only use "bullets" when there are genuinely parallel independent items
+- Default to "prose" when in doubt — prose feels more human
+- "none" means choose at response time — prefer this over pre-deciding
 ISSUE SPOTTER — critical rule:
 legal_issues must extract ALL legal domains present in the facts, not just what the user explicitly mentioned.
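
The `format_decision` rules are instructions to the model, so nothing in the diff enforces them in code. A caller that wanted a guard on the model's output might look something like the sketch below. `KNOWN_FORMATS`, `normalise_format_decision`, and `repeats_requested_format` are hypothetical names, and the allowed set contains only the four values quoted in the rules; the real schema may accept others, such as a table format.

```python
# Assumption: only the four values quoted in the prompt rules are listed here.
KNOWN_FORMATS = {"prose", "numbered", "bullets", "none"}


def normalise_format_decision(raw) -> str:
    """Clean up the model's format_decision, defaulting to prose when in doubt."""
    value = str(raw or "").strip().lower()
    return value if value in KNOWN_FORMATS else "prose"


def repeats_requested_format(decision: str, last_format: str,
                             requested_last_turn: bool) -> bool:
    """Flag a turn that reuses a structured format the user asked for last turn.

    The prompt allows a repeat when the content genuinely needs it, so this
    only reports the repeat (for logging, say) and does not overrule the model.
    """
    structured = decision in {"numbered", "bullets"}
    return requested_last_turn and structured and decision == last_format
```

Defaulting unknown values to "prose" follows the "when in doubt" rule; reporting repeats instead of rewriting them leaves the final judgement with the model, as the prompt intends.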

src/verify.py

@@ -22,7 +22,7 @@ import numpy as np
 logger = logging.getLogger(__name__)
 # ── Similarity threshold ──────────────────────────────────
-SIMILARITY_THRESHOLD = 0.72  # cosine similarity — tunable
+SIMILARITY_THRESHOLD = 0.55  # cosine similarity — tunable
 def _normalise(text: str) -> str:
@@ -55,9 +55,9 @@ def _extract_quotes(text: str) -> list:
             # Sentences with section numbers, case citations, or specific claims
             if (len(s) > 40 and
                 any(indicator in s.lower() for indicator in [
-                    "section", "act", "ipc", "crpc", "court held",
-                    "judgment", "article", "rule", "according to",
-                    "as per", "under", "punishable", "imprisonment"
+                    "section", "act ", "ipc", "crpc",
+                    "article ", "judgment", "punishable",
+                    "imprisonment", "years rigorous"
                 ])):
                 quotes.append(s)
                 if len(quotes) >= 3:  # cap at 3 sentences
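
The effect of both verify.py changes can be checked with a small standalone sketch. The two indicator lists are copied from the diff; `looks_like_legal_claim` and `cosine_similarity` are hypothetical stand-ins for the filter inside `_extract_quotes` and for whatever similarity call the module uses, and the `>=` comparison against the threshold is an assumption, since the diff does not show it.

```python
import math

OLD_INDICATORS = ["section", "act", "ipc", "crpc", "court held",
                  "judgment", "article", "rule", "according to",
                  "as per", "under", "punishable", "imprisonment"]
NEW_INDICATORS = ["section", "act ", "ipc", "crpc",
                  "article ", "judgment", "punishable",
                  "imprisonment", "years rigorous"]


def looks_like_legal_claim(sentence: str, indicators: list) -> bool:
    # Same substring test as the diff: long enough and contains any indicator.
    lowered = sentence.lower()
    return len(sentence) > 40 and any(ind in lowered for ind in indicators)


def cosine_similarity(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


plain = "You should understand the facts before taking any action on this."
# The old list fires on "act" inside "facts"/"action" and "under" inside "understand".
print(looks_like_legal_claim(plain, OLD_INDICATORS))   # True
print(looks_like_legal_claim(plain, NEW_INDICATORS))   # False

# A pair of vectors 45 degrees apart scores about 0.707: rejected at 0.72,
# accepted at 0.55 (assuming the check is similarity >= threshold).
score = cosine_similarity([1.0, 0.0], [1.0, 1.0])
print(score >= 0.72, score >= 0.55)                    # False True
```

Because the test is still a plain substring match, "act " with a trailing space continues to match any word that ends in "act" and is followed by a space, such as "contact " or "impact ". A word-boundary regular expression would be one way to tighten this further.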