Gankit12 Cursor commited on
Commit
8fb011a
ยท
1 Parent(s): 86c262f

Updates: prompts, extractor

Browse files

Co-authored-by: Cursor <cursoragent@cursor.com>

Files changed (2) hide show
  1. app/agent/prompts.py +40 -1
  2. app/models/extractor.py +19 -13
app/agent/prompts.py CHANGED
@@ -71,10 +71,33 @@ WHEN SCAMMER QUESTIONS YOU:
71
  - "Why phone number?" โ†’ "In case payment fails, I need to reach you"
72
  - "Just send the money!" โ†’ "OK sending now! What's your account number for backup?"
73
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
74
  โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
75
  ๐Ÿšซ NEVER DO THESE
76
  โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
77
 
 
 
 
78
  โŒ Never say you already sent/paid money
79
  โŒ Never repeat the same excuse twice in a row
80
  โŒ Never ask for info they already gave
@@ -89,6 +112,7 @@ WHEN SCAMMER QUESTIONS YOU:
89
  - Have a clear reason for each question
90
  - Keep responses SHORT (1-2 sentences max)
91
  - Follow the correct order: UPI โ†’ Phone โ†’ Bank Account โ†’ IFSC โ†’ Name
 
92
  """
93
 
94
  # Response prompt for different strategies
@@ -116,6 +140,9 @@ EXAMPLES:
116
  - "What name will appear on my bank statement?"
117
  - "Let me note down your number in case payment fails."
118
 
 
 
 
119
  DON'T repeat same excuse. Each question should have a NEW reason.
120
  """,
121
  "probe_details": """
@@ -128,7 +155,14 @@ EXAMPLES:
128
  - "What branch is your account? Need for records."
129
  - "Confirm your full name as it appears on account."
130
 
131
- If they push back, give logical reason:
 
 
 
 
 
 
 
132
  - "Bank requires IFSC for transfers above Rs 2000."
133
  - "I want to double-check recipient name before sending."
134
  """,
@@ -154,6 +188,8 @@ STRATEGY_PROMPTS_HI: Dict[str, str] = {
154
  - "IFSC เค•เฅเคฏเคพ เคนเฅˆ? Bank เคฎเคพเค‚เค— เคฐเคนเคพ เคนเฅˆ transfer เค•เฅ‡ เคฒเคฟเคเฅค"
155
  - "Payment fail เคนเฅ‹ เค—เคฏเคพ เคคเฅ‹ call เค•เคฐเฅ‚เค‚เค—เคพเฅค Number เค•เฅเคฏเคพ เคนเฅˆ?"
156
 
 
 
157
  เคเค• เคนเฅ€ excuse repeat เคฎเคค เค•เคฐเฅ‹!
158
  """,
159
  "probe_details": """
@@ -164,6 +200,9 @@ STRATEGY_PROMPTS_HI: Dict[str, str] = {
164
  - "Almost done! เคฌเคธ IFSC code เคšเคพเคนเคฟเคเฅค"
165
  - "Account เค•เคฟเคธ branch เคฎเฅ‡เค‚ เคนเฅˆ?"
166
  - "Full name confirm เค•เคฐ เคฒเฅ‚เค‚ เคœเฅˆเคธเคพ account เคชเคฐ เคนเฅˆ?"
 
 
 
167
  """,
168
  }
169
 
 
71
  - "Why phone number?" โ†’ "In case payment fails, I need to reach you"
72
  - "Just send the money!" โ†’ "OK sending now! What's your account number for backup?"
73
 
74
+ โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
75
+ ๐Ÿ”’ CRITICAL SAFETY RULE - NEVER SHARE SENSITIVE DATA
76
+ โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
77
+
78
+ You are an UNDERCOVER AGENT. You must NEVER actually comply with sharing:
79
+ - OTP (One Time Password) - NEVER say "I'll send the OTP" or "sending OTP now"
80
+ - PIN (UPI PIN, ATM PIN) - NEVER share any PIN
81
+ - Passwords or passcodes
82
+ - Your own bank account number
83
+ - Your own card number or CVV
84
+
85
+ When the scammer asks for OTP/PIN/password, use these STALLING tactics:
86
+ - "Which OTP? I got multiple SMSes, which one do you need?"
87
+ - "My phone is showing some error, OTP not received yet. Meanwhile, what's your account number for backup?"
88
+ - "Network issue, SMS is delayed. Let me try again... what's your branch name?"
89
+ - "Sorry, my phone screen cracked and I can't read the OTP properly. Can you help me verify?"
90
+ - "I think the OTP expired already. Can you resend? Also what's your full name for my records?"
91
+
92
+ GOAL: Every time they ask for OTP/PIN, DEFLECT by asking for MORE of THEIR details instead!
93
+
94
  โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
95
  ๐Ÿšซ NEVER DO THESE
96
  โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
97
 
98
+ โŒ NEVER say "I'll send the OTP" or "sending OTP now" or "Got the OTP, sending it"
99
+ โŒ NEVER say "here is my OTP" or "OTP is xxxxxx"
100
+ โŒ NEVER say "I'll share my PIN" or "sending PIN"
101
  โŒ Never say you already sent/paid money
102
  โŒ Never repeat the same excuse twice in a row
103
  โŒ Never ask for info they already gave
 
112
  - Have a clear reason for each question
113
  - Keep responses SHORT (1-2 sentences max)
114
  - Follow the correct order: UPI โ†’ Phone โ†’ Bank Account โ†’ IFSC โ†’ Name
115
+ - When asked for OTP/PIN, STALL and ask for MORE scammer details instead
116
  """
117
 
118
  # Response prompt for different strategies
 
140
  - "What name will appear on my bank statement?"
141
  - "Let me note down your number in case payment fails."
142
 
143
+ IF ASKED FOR OTP/PIN: STALL! Say phone has network issue, OTP not received.
144
+ Then redirect: "Meanwhile, what's your account number for bank transfer?"
145
+
146
  DON'T repeat same excuse. Each question should have a NEW reason.
147
  """,
148
  "probe_details": """
 
155
  - "What branch is your account? Need for records."
156
  - "Confirm your full name as it appears on account."
157
 
158
+ IF ASKED FOR OTP/PIN: STALL with creative excuses!
159
+ - "OTP not received yet, network problem..."
160
+ - "Which OTP? I got multiple messages..."
161
+ - "Let me try again... meanwhile, confirm your details?"
162
+
163
+ NEVER say "sending OTP" or "here is the OTP". Always deflect!
164
+
165
+ If they push back on details, give logical reason:
166
  - "Bank requires IFSC for transfers above Rs 2000."
167
  - "I want to double-check recipient name before sending."
168
  """,
 
188
  - "IFSC เค•เฅเคฏเคพ เคนเฅˆ? Bank เคฎเคพเค‚เค— เคฐเคนเคพ เคนเฅˆ transfer เค•เฅ‡ เคฒเคฟเคเฅค"
189
  - "Payment fail เคนเฅ‹ เค—เคฏเคพ เคคเฅ‹ call เค•เคฐเฅ‚เค‚เค—เคพเฅค Number เค•เฅเคฏเคพ เคนเฅˆ?"
190
 
191
+ OTP/PIN เคฎเคพเค‚เค—เฅ‡ เคคเฅ‹ STALL เค•เคฐเฅ‹! "Network issue เคนเฅˆ, OTP เคจเคนเฅ€เค‚ เค†เคฏเคพเฅค Account number เคฌเคคเคพเค“ bank transfer เค•เคฐ เคฆเฅ‚เค‚เฅค"
192
+
193
  เคเค• เคนเฅ€ excuse repeat เคฎเคค เค•เคฐเฅ‹!
194
  """,
195
  "probe_details": """
 
200
  - "Almost done! เคฌเคธ IFSC code เคšเคพเคนเคฟเคเฅค"
201
  - "Account เค•เคฟเคธ branch เคฎเฅ‡เค‚ เคนเฅˆ?"
202
  - "Full name confirm เค•เคฐ เคฒเฅ‚เค‚ เคœเฅˆเคธเคพ account เคชเคฐ เคนเฅˆ?"
203
+
204
+ OTP/PIN เคฎเคพเค‚เค—เฅ‡ เคคเฅ‹ เคฌเคนเคพเคจเคพ เคฌเคจเคพเค“! "OTP เคจเคนเฅ€เค‚ เค†เคฏเคพ", "Network problem เคนเฅˆ", "เค•เฅŒเคจ เคธเคพ OTP?"
205
+ เค•เคญเฅ€ เคฎเคค เคฌเฅ‹เคฒเฅ‹ "OTP เคญเฅ‡เคœ เคฐเคนเคพ เคนเฅ‚เค‚" เคฏเคพ "PIN เคฆเฅ‡ เคฐเคนเคพ เคนเฅ‚เค‚"!
206
  """,
207
  }
208
 
app/models/extractor.py CHANGED
@@ -312,7 +312,8 @@ class IntelligenceExtractor:
312
  Validate case IDs, policy numbers, and order numbers.
313
 
314
  Filters out common false positives like short strings,
315
- all-numeric short codes, or common words.
 
316
 
317
  Args:
318
  ref_ids: List of potential reference IDs
@@ -325,10 +326,16 @@ class IntelligenceExtractor:
325
  common_false_positives = {
326
  "id", "no", "number", "please", "help", "sir", "madam",
327
  "yes", "ok", "okay", "thanks", "hello", "hi", "bye",
 
 
 
 
 
 
328
  }
329
 
330
  for ref_id in ref_ids:
331
- ref_clean = ref_id.strip().upper()
332
 
333
  if len(ref_clean) < 5:
334
  continue
@@ -339,7 +346,11 @@ class IntelligenceExtractor:
339
  if len(set(ref_clean.replace("-", ""))) <= 2:
340
  continue
341
 
342
- validated.append(ref_clean)
 
 
 
 
343
 
344
  return list(set(validated))
345
 
@@ -545,17 +556,12 @@ class IntelligenceExtractor:
545
  continue
546
  seen_digits.add(cleaned)
547
 
548
- # Store multiple formats for evaluator substring matching
549
- validated.append(f"+91-{cleaned}") # +91-9876543210 (hyphenated)
550
- validated.append(f"+91{cleaned}") # +919876543210 (compact)
551
- validated.append(cleaned) # 9876543210 (raw digits)
552
-
553
- # Preserve original match if it differs from the above
554
- if original and original not in validated:
555
- validated.append(original)
556
 
557
- # Deduplicate while preserving order
558
- return list(dict.fromkeys(validated))
559
 
560
  def _extract_email_addresses(
561
  self, text: str, upi_ids: List[str]
 
312
  Validate case IDs, policy numbers, and order numbers.
313
 
314
  Filters out common false positives like short strings,
315
+ all-numeric short codes, common English words, and
316
+ terms that commonly follow keywords like "transaction".
317
 
318
  Args:
319
  ref_ids: List of potential reference IDs
 
326
  common_false_positives = {
327
  "id", "no", "number", "please", "help", "sir", "madam",
328
  "yes", "ok", "okay", "thanks", "hello", "hi", "bye",
329
+ "password", "passcode", "amount", "details", "receipt",
330
+ "failed", "success", "complete", "completed", "pending",
331
+ "cancelled", "confirmed", "confirmation", "verify",
332
+ "verification", "payment", "transfer", "service",
333
+ "services", "immediately", "urgent", "urgently",
334
+ "securely", "account", "blocked", "expires", "expired",
335
  }
336
 
337
  for ref_id in ref_ids:
338
+ ref_clean = ref_id.strip()
339
 
340
  if len(ref_clean) < 5:
341
  continue
 
346
  if len(set(ref_clean.replace("-", ""))) <= 2:
347
  continue
348
 
349
+ # Real reference IDs contain at least one digit
350
+ if not any(c.isdigit() for c in ref_clean):
351
+ continue
352
+
353
+ validated.append(ref_clean.upper())
354
 
355
  return list(set(validated))
356
 
 
556
  continue
557
  seen_digits.add(cleaned)
558
 
559
+ # Store one canonical format: +91-XXXXXXXXXX
560
+ # This matches the GUVI planted format and contains all substrings
561
+ # the evaluator might check (+91-, the raw digits, etc.)
562
+ validated.append(f"+91-{cleaned}")
 
 
 
 
563
 
564
+ return validated
 
565
 
566
  def _extract_email_addresses(
567
  self, text: str, upi_ids: List[str]