chinmayjha committed
Commit 2fd251c · unverified · 1 Parent(s): d8c683d

Enforce strict citation format in summarizer prompt

- Added explicit examples of CORRECT vs WRONG citation formats
- Emphasized that citations must be EXACTLY [Doc X] with nothing else
- Updated system message to enforce strict formatting rules
- Added critical rules section to prevent LLM from being creative with citations
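A prompt-only rule like this can still be checked after generation. The following is a minimal sketch, not part of this commit, of a validator that flags any bracketed span in the answer section that is not exactly `[Doc N]`. The function names and the `"Sources"` split marker are assumptions made for illustration.

```python
import re
from typing import List

# Exactly "[Doc N]" with a positive integer N, as the commit message requires.
STRICT_CITATION = re.compile(r"\[Doc [1-9]\d*\]")
# Any bracketed span; used to spot citations that deviate from the strict form.
ANY_BRACKET = re.compile(r"\[[^\[\]]*\]")


def answer_section(response: str, sources_marker: str = "Sources") -> str:
    """Return the text before the first Sources heading (marker is an assumption)."""
    idx = response.find(sources_marker)
    return response if idx == -1 else response[:idx]


def find_bad_citations(response: str) -> List[str]:
    """List bracketed spans in the answer section that are not exactly [Doc N]."""
    answer = answer_section(response)
    return [
        match.group(0)
        for match in ANY_BRACKET.finditer(answer)
        if not STRICT_CITATION.fullmatch(match.group(0))
    ]
```

Only the answer section is checked because the Sources section legitimately contains other bracketed tags, such as `[Pricing Concern/High]`.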

src/second_brain_online/application/agents/tools/summarizer.py CHANGED
@@ -87,13 +87,19 @@ Create a two-part response:
  - DO NOT mention specific customer names or personal identifiers
  - Group related insights by topic with bullet points
  - Be concise and general, highlighting the problem/concern rather than individuals
- - Add INLINE CITATIONS at the end of each point using format: [Doc X]
+ - Add INLINE CITATIONS at the end of each point using ONLY this format: [Doc X]
+ - CRITICAL: Citations must be EXACTLY "[Doc 1]", "[Doc 2]", etc. - nothing else
+ - DO NOT add any other information in citations (no titles, dates, IDs, or sources in the citation)
  - Number each unique document sequentially (Doc 1, Doc 2, etc.)
 
- Example:
+ CORRECT Example:
  • Organizations are planning phone number porting transitions, but custom porting is expensive (~$1,000) and should be done in bulk [Doc 1]
  • Questions about additional license requirements for integrations ($45 per user) [Doc 1]
  • Ringtone volume issues in embedded Salesforce app [Doc 2]
+
+ WRONG Example (DO NOT DO THIS):
+ • Custom porting costs around $1,000 [Source: JustCall Checkin, Document ID: abc123]
+ • License fees are $45 per user [JustCall, 2025-10-07]
 
  2. **📚 Sources** (at the end):
  - List ONLY UNIQUE documents (de-duplicate by Document ID)
@@ -131,7 +137,12 @@ Create a two-part response:
  - [Pricing Concern/High] Request for discount due to porting delays
  - [Policy Gap/Medium] No current policy for inactivity-based discounts
 
- Provide a focused answer with inline citations followed by the well-formatted Sources section with conversation insights."""
+ Provide a focused answer with inline citations followed by the well-formatted Sources section with conversation insights.
+
+ CRITICAL RULES:
+ - In the ANSWER section, use ONLY [Doc X] format for citations
+ - In the Sources section, provide full details about each Doc
+ - NEVER mix citation formats - keep them separate and clean"""
 
     def __init__(self, *args, **kwargs) -> None:
         super().__init__(*args, **kwargs)
@@ -156,7 +167,7 @@ Provide a focused answer with inline citations followed by the well-formatted So
 messages=[
     {
         "role": "system",
-        "content": "You are an expert analyst. Answer the user's question based on the search results provided. Create a comprehensive answer with a Sources section."
+        "content": "You are an expert analyst. Follow the formatting instructions EXACTLY. Use only [Doc X] citations in the answer section, never include titles, dates, or IDs in citations."
     },
     {
         "role": "user",
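
The prompt asks the model to number each unique document sequentially and to de-duplicate the Sources list by Document ID. As a rough illustration of that numbering rule, and not code from this repository, the same mapping can be computed deterministically. `number_documents`, `cite`, and the ID `xyz789` are hypothetical.

```python
from typing import Dict, Iterable


def number_documents(doc_ids: Iterable[str]) -> Dict[str, int]:
    """Map each unique document ID to 1, 2, 3, ... in first-seen order."""
    numbering: Dict[str, int] = {}
    for doc_id in doc_ids:
        # A repeated ID keeps the number it was given the first time.
        if doc_id not in numbering:
            numbering[doc_id] = len(numbering) + 1
    return numbering


def cite(doc_id: str, numbering: Dict[str, int]) -> str:
    """Render the strict inline citation for a document ID."""
    return f"[Doc {numbering[doc_id]}]"


# Two search hits come from the same document, so they share one number.
numbering = number_documents(["abc123", "xyz789", "abc123"])
print(cite("abc123", numbering), cite("xyz789", numbering))
```

Precomputing the numbers this way would let the Sources section and the inline citations be checked against one mapping, instead of relying on the model to keep them consistent.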