Spaces:
Running
Running
fix(prompt): prevent hallucinated citations - enforce strict source index
Browse files- LLM was inventing [1]-[9] citations from training data when only 2 sources existed
- Now explicitly tells LLM: 'You have EXACTLY N sources, only cite those numbers'
- Source Index moved to top of prompt (before context) for higher attention
- Added hard stop rule: if no sources, say so and stop - no training data fallback
- Applied fix to both execute_chat and execute_stream prompts
src/core/use_cases/rag_chat_use_case.py
CHANGED
|
@@ -769,45 +769,45 @@ JSON:"""
|
|
| 769 |
|
| 770 |
prompt = f"""You are ARKI AI, a real-time news assistant. Today's date is {datetime.utcnow().strftime("%B %d, %Y")}.
|
| 771 |
|
| 772 |
-
|
| 773 |
-
|
| 774 |
-
|
| 775 |
-
|
| 776 |
-
|
| 777 |
-
|
| 778 |
-
|
| 779 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 780 |
Read the News Context below and determine:
|
| 781 |
|
| 782 |
A) DIRECT MATCH β Sources directly answer the question:
|
| 783 |
-
β
|
| 784 |
-
β
|
| 785 |
-
β Use numbered points with **bold** headlines
|
| 786 |
|
| 787 |
B) RELATED INFORMATION β Sources have related but not exact information:
|
| 788 |
-
β
|
| 789 |
-
β
|
| 790 |
-
β Provide the related information anyway (it may still be helpful)
|
| 791 |
|
| 792 |
-
C) NO
|
| 793 |
β Say clearly: "I couldn't find relevant news on that topic in today's feed."
|
| 794 |
-
β
|
| 795 |
|
| 796 |
-
STEP
|
| 797 |
1. Use ONLY facts from the News Context below. NEVER use training data or general knowledge.
|
| 798 |
-
2.
|
| 799 |
-
3.
|
| 800 |
-
4.
|
| 801 |
-
5.
|
| 802 |
-
6. Be helpful and flexible β if exact match not found, offer related information.
|
| 803 |
-
7. At the END of your answer, on a new line, write exactly:
|
| 804 |
FOLLOW_UP: question1 | question2 | question3
|
| 805 |
-
|
| 806 |
|
| 807 |
-
Source Index:
|
| 808 |
-
{source_index_lines}
|
| 809 |
News Context (from live multilingual database):
|
| 810 |
-
{context_text}
|
| 811 |
|
| 812 |
Conversation History:
|
| 813 |
{history_text}
|
|
@@ -903,45 +903,45 @@ Answer:"""
|
|
| 903 |
|
| 904 |
prompt_stream = f"""You are ARKI AI, a real-time news assistant. Today's date is {datetime.utcnow().strftime("%B %d, %Y")}.
|
| 905 |
|
| 906 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
| 907 |
|
| 908 |
-
|
| 909 |
-
-
|
| 910 |
-
-
|
| 911 |
-
-
|
|
|
|
|
|
|
| 912 |
|
| 913 |
-
STEP
|
| 914 |
Read the News Context below and determine:
|
| 915 |
|
| 916 |
A) DIRECT MATCH β Sources directly answer the question:
|
| 917 |
-
β
|
| 918 |
-
β
|
| 919 |
-
β Use numbered points with **bold** headlines
|
| 920 |
|
| 921 |
B) RELATED INFORMATION β Sources have related but not exact information:
|
| 922 |
-
β
|
| 923 |
-
β
|
| 924 |
-
β Provide the related information anyway (it may still be helpful)
|
| 925 |
|
| 926 |
-
C) NO
|
| 927 |
β Say clearly: "I couldn't find relevant news on that topic in today's feed."
|
| 928 |
-
β
|
| 929 |
|
| 930 |
-
STEP
|
| 931 |
1. Use ONLY facts from the News Context below. NEVER use training data or general knowledge.
|
| 932 |
-
2.
|
| 933 |
-
3.
|
| 934 |
-
4.
|
| 935 |
-
5.
|
| 936 |
-
6. Be helpful and flexible β if exact match not found, offer related information.
|
| 937 |
-
7. At the END of your answer, on a new line, write exactly:
|
| 938 |
FOLLOW_UP: question1 | question2 | question3
|
| 939 |
-
|
| 940 |
|
| 941 |
-
Source Index:
|
| 942 |
-
{source_index_lines}
|
| 943 |
News Context (from live multilingual database):
|
| 944 |
-
{context_text}
|
| 945 |
|
| 946 |
Conversation History:
|
| 947 |
{history_text}
|
|
|
|
| 769 |
|
| 770 |
prompt = f"""You are ARKI AI, a real-time news assistant. Today's date is {datetime.utcnow().strftime("%B %d, %Y")}.
|
| 771 |
|
| 772 |
+
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
|
| 773 |
+
SOURCE INDEX β ONLY THESE SOURCES EXIST. DO NOT INVENT ANY OTHERS.
|
| 774 |
+
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
|
| 775 |
+
{source_index_lines if source_index_lines else "NO SOURCES RETRIEVED."}
|
| 776 |
+
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
|
| 777 |
+
|
| 778 |
+
CRITICAL CITATION RULE:
|
| 779 |
+
- You have EXACTLY {len(final_sources)} source(s) listed above.
|
| 780 |
+
- ONLY cite numbers that appear in the Source Index above (e.g. if you have 2 sources, only use [1] and [2]).
|
| 781 |
+
- NEVER write [3], [4], [5]... if those numbers are not in the Source Index.
|
| 782 |
+
- NEVER invent sources, facts, or citations from your training data.
|
| 783 |
+
- Every fact you state MUST come from the News Context below AND be cited with its number.
|
| 784 |
+
|
| 785 |
+
STEP 1 β EVALUATE THE SOURCES:
|
| 786 |
Read the News Context below and determine:
|
| 787 |
|
| 788 |
A) DIRECT MATCH β Sources directly answer the question:
|
| 789 |
+
β Answer using ONLY facts from the context, cite each fact with [number]
|
| 790 |
+
β Use **bold** headlines for structure
|
|
|
|
| 791 |
|
| 792 |
B) RELATED INFORMATION β Sources have related but not exact information:
|
| 793 |
+
β Say: "I found articles about [related topic], but not specifically about [exact query]."
|
| 794 |
+
β Share what IS in the context, citing with [number]
|
|
|
|
| 795 |
|
| 796 |
+
C) NO SOURCES / NO RELEVANT INFORMATION:
|
| 797 |
β Say clearly: "I couldn't find relevant news on that topic in today's feed."
|
| 798 |
+
β STOP. Do not add any information from your training data.
|
| 799 |
|
| 800 |
+
STEP 2 β ANSWER RULES:
|
| 801 |
1. Use ONLY facts from the News Context below. NEVER use training data or general knowledge.
|
| 802 |
+
2. Cite every fact with its source number: [1] or [2] etc. Only use numbers from the Source Index.
|
| 803 |
+
3. Non-English articles β translate content to English in your answer.
|
| 804 |
+
4. Always respond in English.
|
| 805 |
+
5. At the END of your answer, on a new line, write exactly:
|
|
|
|
|
|
|
| 806 |
FOLLOW_UP: question1 | question2 | question3
|
| 807 |
+
(3 short follow-up questions based only on what you actually found)
|
| 808 |
|
|
|
|
|
|
|
| 809 |
News Context (from live multilingual database):
|
| 810 |
+
{context_text if context_text else "NO CONTEXT RETRIEVED."}
|
| 811 |
|
| 812 |
Conversation History:
|
| 813 |
{history_text}
|
|
|
|
| 903 |
|
| 904 |
prompt_stream = f"""You are ARKI AI, a real-time news assistant. Today's date is {datetime.utcnow().strftime("%B %d, %Y")}.
|
| 905 |
|
| 906 |
+
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
|
| 907 |
+
SOURCE INDEX β ONLY THESE SOURCES EXIST. DO NOT INVENT ANY OTHERS.
|
| 908 |
+
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
|
| 909 |
+
{source_index_lines if source_index_lines else "NO SOURCES RETRIEVED."}
|
| 910 |
+
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
|
| 911 |
|
| 912 |
+
CRITICAL CITATION RULE:
|
| 913 |
+
- You have EXACTLY {len(final_sources)} source(s) listed above.
|
| 914 |
+
- ONLY cite numbers that appear in the Source Index above (e.g. if you have 2 sources, only use [1] and [2]).
|
| 915 |
+
- NEVER write [3], [4], [5]... if those numbers are not in the Source Index.
|
| 916 |
+
- NEVER invent sources, facts, or citations from your training data.
|
| 917 |
+
- Every fact you state MUST come from the News Context below AND be cited with its number.
|
| 918 |
|
| 919 |
+
STEP 1 β EVALUATE THE SOURCES:
|
| 920 |
Read the News Context below and determine:
|
| 921 |
|
| 922 |
A) DIRECT MATCH β Sources directly answer the question:
|
| 923 |
+
β Answer using ONLY facts from the context, cite each fact with [number]
|
| 924 |
+
β Use **bold** headlines for structure
|
|
|
|
| 925 |
|
| 926 |
B) RELATED INFORMATION β Sources have related but not exact information:
|
| 927 |
+
β Say: "I found articles about [related topic], but not specifically about [exact query]."
|
| 928 |
+
β Share what IS in the context, citing with [number]
|
|
|
|
| 929 |
|
| 930 |
+
C) NO SOURCES / NO RELEVANT INFORMATION:
|
| 931 |
β Say clearly: "I couldn't find relevant news on that topic in today's feed."
|
| 932 |
+
β STOP. Do not add any information from your training data.
|
| 933 |
|
| 934 |
+
STEP 2 β ANSWER RULES:
|
| 935 |
1. Use ONLY facts from the News Context below. NEVER use training data or general knowledge.
|
| 936 |
+
2. Cite every fact with its source number: [1] or [2] etc. Only use numbers from the Source Index.
|
| 937 |
+
3. Non-English articles β translate content to English in your answer.
|
| 938 |
+
4. Always respond in English.
|
| 939 |
+
5. At the END of your answer, on a new line, write exactly:
|
|
|
|
|
|
|
| 940 |
FOLLOW_UP: question1 | question2 | question3
|
| 941 |
+
(3 short follow-up questions based only on what you actually found)
|
| 942 |
|
|
|
|
|
|
|
| 943 |
News Context (from live multilingual database):
|
| 944 |
+
{context_text if context_text else "NO CONTEXT RETRIEVED."}
|
| 945 |
|
| 946 |
Conversation History:
|
| 947 |
{history_text}
|