Enforce strict citation format in summarizer prompt
- Added explicit examples of CORRECT vs WRONG citation formats
- Emphasized that citations must be EXACTLY [Doc X] with nothing else
- Updated system message to enforce strict formatting rules
- Added critical rules section to prevent LLM from being creative with citations
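
The strict `[Doc X]` rule described in the commit message can also be checked after generation rather than relying on the prompt alone. The following is a minimal sketch, not part of this commit: `find_bad_citations` and both regexes are hypothetical names, and the sketch assumes every bracketed span in the answer section is meant to be a citation. It should only be run on the answer section, since the Sources section legitimately contains other bracketed tags such as `[Pricing Concern/High]`.

```python
import re

# Hypothetical helper for illustration; not part of the commit.
STRICT_CITATION = re.compile(r"\[Doc \d+\]")   # exactly "[Doc 1]", "[Doc 2]", ...
ANY_BRACKETED = re.compile(r"\[[^\[\]]*\]")    # any "[...]" span without nesting


def find_bad_citations(answer: str) -> list[str]:
    """Return bracketed spans that are not exactly of the form [Doc N]."""
    return [
        span
        for span in ANY_BRACKETED.findall(answer)
        if not STRICT_CITATION.fullmatch(span)
    ]


good = "• Ringtone volume issues in embedded Salesforce app [Doc 2]"
bad = "• License fees are $45 per user [JustCall, 2025-10-07]"
print(find_bad_citations(good))  # []
print(find_bad_citations(bad))   # ['[JustCall, 2025-10-07]']
```

A non-empty result could be used to trigger a retry or a log entry; which of those is appropriate depends on the surrounding agent code, which this page does not show.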
src/second_brain_online/application/agents/tools/summarizer.py
CHANGED
@@ -87,13 +87,19 @@ Create a two-part response:
  - DO NOT mention specific customer names or personal identifiers
  - Group related insights by topic with bullet points
  - Be concise and general, highlighting the problem/concern rather than individuals
- - Add INLINE CITATIONS at the end of each point using format: [Doc X]
+ - Add INLINE CITATIONS at the end of each point using ONLY this format: [Doc X]
+ - CRITICAL: Citations must be EXACTLY "[Doc 1]", "[Doc 2]", etc. - nothing else
+ - DO NOT add any other information in citations (no titles, dates, IDs, or sources in the citation)
  - Number each unique document sequentially (Doc 1, Doc 2, etc.)

- Example:
+ CORRECT Example:
  • Organizations are planning phone number porting transitions, but custom porting is expensive (~$1,000) and should be done in bulk [Doc 1]
  • Questions about additional license requirements for integrations ($45 per user) [Doc 1]
  • Ringtone volume issues in embedded Salesforce app [Doc 2]
+
+ WRONG Example (DO NOT DO THIS):
+ • Custom porting costs around $1,000 [Source: JustCall Checkin, Document ID: abc123]
+ • License fees are $45 per user [JustCall, 2025-10-07]

  2. **π Sources** (at the end):
  - List ONLY UNIQUE documents (de-duplicate by Document ID)
@@ -131,7 +137,12 @@ Create a two-part response:
  - [Pricing Concern/High] Request for discount due to porting delays
  - [Policy Gap/Medium] No current policy for inactivity-based discounts

- Provide a focused answer with inline citations followed by the well-formatted Sources section with conversation insights."""
+ Provide a focused answer with inline citations followed by the well-formatted Sources section with conversation insights.
+
+ CRITICAL RULES:
+ - In the ANSWER section, use ONLY [Doc X] format for citations
+ - In the Sources section, provide full details about each Doc
+ - NEVER mix citation formats - keep them separate and clean"""

  def __init__(self, *args, **kwargs) -> None:
      super().__init__(*args, **kwargs)
@@ -156,7 +167,7 @@ Provide a focused answer with inline citations followed by the well-formatted So
  messages=[
      {
          "role": "system",
-         "content": "You are an expert analyst.
+         "content": "You are an expert analyst. Follow the formatting instructions EXACTLY. Use only [Doc X] citations in the answer section, never include titles, dates, or IDs in citations."
      },
      {
          "role": "user",
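
The prompt asks the model to number each unique document sequentially (Doc 1, Doc 2, etc.) and to de-duplicate sources by Document ID. One way to make that numbering deterministic is to assign the labels in code before the prompt is built. This is a minimal sketch under assumptions: `assign_doc_labels` is a hypothetical helper, and the retrieved chunks are assumed to be dicts with a `document_id` key. Neither appears in the commit.

```python
def assign_doc_labels(chunks: list[dict]) -> dict[str, str]:
    """Map each unique document ID to "Doc N" in first-seen order."""
    labels: dict[str, str] = {}
    for chunk in chunks:
        doc_id = chunk["document_id"]  # assumed key name, not from the commit
        if doc_id not in labels:
            # The next label number is one more than the count of IDs seen so far.
            labels[doc_id] = f"Doc {len(labels) + 1}"
    return labels


chunks = [
    {"document_id": "abc123", "text": "..."},
    {"document_id": "def456", "text": "..."},
    {"document_id": "abc123", "text": "..."},  # repeat of the first document
]
print(assign_doc_labels(chunks))
# {'abc123': 'Doc 1', 'def456': 'Doc 2'}
```

With labels fixed up front, the same mapping could be used both to tag the context passed to the model and to render the Sources section, so the two cannot drift apart.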