Sync backend Docker context from GitHub main
Browse files- backend/routes/predict_stream.py +1 -1
- retriever/generator.py +27 -5
backend/routes/predict_stream.py
CHANGED
|
@@ -100,7 +100,7 @@ def predict_stream(payload: PredictRequest) -> StreamingResponse:
|
|
| 100 |
yield to_ndjson({"type": "token", "token": token})
|
| 101 |
|
| 102 |
inference_time = time.perf_counter() - inference_start
|
| 103 |
-
answer = "".join(answer_parts)
|
| 104 |
retrieved_chunks = build_retrieved_chunks(contexts=contexts, chunk_lookup=chunk_lookup)
|
| 105 |
|
| 106 |
yield to_ndjson(
|
|
|
|
| 100 |
yield to_ndjson({"type": "token", "token": token})
|
| 101 |
|
| 102 |
inference_time = time.perf_counter() - inference_start
|
| 103 |
+
answer = rag_engine.truncate_incomplete_tail("".join(answer_parts))
|
| 104 |
retrieved_chunks = build_retrieved_chunks(contexts=contexts, chunk_lookup=chunk_lookup)
|
| 105 |
|
| 106 |
yield to_ndjson(
|
retriever/generator.py
CHANGED
|
@@ -1,7 +1,29 @@
|
|
| 1 |
#changed the prompt to output as markdown, plus some formatting details
|
| 2 |
#also added get_answer_stream for incremental token rendering on the frontend
|
| 3 |
# --@Qamar
|
|
|
|
|
|
|
|
|
|
| 4 |
class RAGGenerator:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 5 |
def generate_prompt(self, query, retrieved_contexts, context_urls=None):
|
| 6 |
if context_urls:
|
| 7 |
context_text = "\n\n".join([f"[Source {i+1}] {url}: {c}" for i, (c, url) in enumerate(zip(retrieved_contexts, context_urls))])
|
|
@@ -14,9 +36,8 @@ INSTRUCTIONS:
|
|
| 14 |
1. THERAPEUTIC DIALOGUE: Respond directly to the user as your client. Start by briefly validating their feelings, then gently apply CBT concepts, psychoeducation, or interventions found STRICTLY in the provided documents.
|
| 15 |
2. PATIENT EXAMPLES & NAMES (CRITICAL): The provided documents contain transcripts and examples of other patients and therapists (e.g., Abe, Judith, Joseph). These are illustrative case studies ONLY. DO NOT assume the user is "Abe" or any other person mentioned in the text. NEVER address or refer to the user by these names. Extract the CBT concepts/techniques demonstrated in these transcripts and apply them to the current user's unique situation.
|
| 16 |
3. GROUNDING (NO OPINIONS): Do not give your own opinions, general life advice, or use outside knowledge. Every therapeutic concept, identified cognitive distortion, or suggested exercise must come directly from the provided text.
|
| 17 |
-
4.
|
| 18 |
-
5.
|
| 19 |
-
6. MISSING INFO: If the provided excerpts do not contain relevant CBT concepts to address the client's specific statement, explicitly state: "While I hear how difficult this is for you, the clinical materials I have right now do not contain specific steps to address this." Do not invent therapeutic advice.
|
| 20 |
|
| 21 |
RETRIEVED CLINICAL CONTEXT:
|
| 22 |
{context_text}
|
|
@@ -28,7 +49,8 @@ THERAPEUTIC RESPONSE (GROUNDED IN SOURCES):"""
|
|
| 28 |
def get_answer(self, model_instance, query, retrieved_contexts, context_urls=None, **kwargs):
|
| 29 |
"""Uses a specific model instance to generate the final answer."""
|
| 30 |
prompt = self.generate_prompt(query, retrieved_contexts, context_urls)
|
| 31 |
-
|
|
|
|
| 32 |
|
| 33 |
def get_answer_stream(self, model_instance, query, retrieved_contexts, context_urls=None, **kwargs):
|
| 34 |
"""Streams model output token-by-token for incremental UI updates."""
|
|
@@ -43,4 +65,4 @@ THERAPEUTIC RESPONSE (GROUNDED IN SOURCES):"""
|
|
| 43 |
# Fallback for model wrappers that only expose sync generation.
|
| 44 |
answer = model_instance.generate(prompt, **kwargs)
|
| 45 |
if answer:
|
| 46 |
-
yield answer
|
|
|
|
| 1 |
#changed the prompt to output as markdown, plus some formatting details
|
| 2 |
#also added get_answer_stream for incremental token rendering on the frontend
|
| 3 |
# --@Qamar
|
| 4 |
+
|
| 5 |
+
|
| 6 |
+
# called in get_answer_stream, to truncate to last full stop
|
| 7 |
class RAGGenerator:
|
| 8 |
+
def truncate_incomplete_tail(self, answer: str) -> str:
|
| 9 |
+
"""Trim incomplete trailing text so responses end on a full stop."""
|
| 10 |
+
if not answer:
|
| 11 |
+
return answer
|
| 12 |
+
|
| 13 |
+
trimmed = answer.rstrip()
|
| 14 |
+
if not trimmed:
|
| 15 |
+
return trimmed
|
| 16 |
+
|
| 17 |
+
if trimmed.endswith("."):
|
| 18 |
+
return trimmed
|
| 19 |
+
|
| 20 |
+
last_full_stop = trimmed.rfind(".")
|
| 21 |
+
if last_full_stop == -1:
|
| 22 |
+
# If no sentence boundary exists, keep original text.
|
| 23 |
+
return trimmed
|
| 24 |
+
|
| 25 |
+
return trimmed[: last_full_stop + 1].rstrip()
|
| 26 |
+
|
| 27 |
def generate_prompt(self, query, retrieved_contexts, context_urls=None):
|
| 28 |
if context_urls:
|
| 29 |
context_text = "\n\n".join([f"[Source {i+1}] {url}: {c}" for i, (c, url) in enumerate(zip(retrieved_contexts, context_urls))])
|
|
|
|
| 36 |
1. THERAPEUTIC DIALOGUE: Respond directly to the user as your client. Start by briefly validating their feelings, then gently apply CBT concepts, psychoeducation, or interventions found STRICTLY in the provided documents.
|
| 37 |
2. PATIENT EXAMPLES & NAMES (CRITICAL): The provided documents contain transcripts and examples of other patients and therapists (e.g., Abe, Judith, Joseph). These are illustrative case studies ONLY. DO NOT assume the user is "Abe" or any other person mentioned in the text. NEVER address or refer to the user by these names. Extract the CBT concepts/techniques demonstrated in these transcripts and apply them to the current user's unique situation.
|
| 38 |
3. GROUNDING (NO OPINIONS): Do not give your own opinions, general life advice, or use outside knowledge. Every therapeutic concept, identified cognitive distortion, or suggested exercise must come directly from the provided text.
|
| 39 |
+
4. FORMAT: Use clear Markdown formatting. Use paragraphs for conversational tone, and bullet points if you are breaking down specific steps, questions, or exercises found in the text.
|
| 40 |
+
5. MISSING INFO: If the provided excerpts do not contain relevant CBT concepts to address the client's specific statement, explicitly state: "While I hear how difficult this is for you, the clinical materials I have right now do not contain specific steps to address this." Do not invent therapeutic advice.
|
|
|
|
| 41 |
|
| 42 |
RETRIEVED CLINICAL CONTEXT:
|
| 43 |
{context_text}
|
|
|
|
| 49 |
def get_answer(self, model_instance, query, retrieved_contexts, context_urls=None, **kwargs):
    """Run one non-streaming generation pass with the given model.

    Builds the grounded therapeutic prompt from the retrieved contexts,
    asks the model for a complete answer, and trims any incomplete
    trailing sentence before returning it.
    """
    # Compose prompt -> generate -> clean up the tail, as one pipeline.
    return self.truncate_incomplete_tail(
        model_instance.generate(
            self.generate_prompt(query, retrieved_contexts, context_urls),
            **kwargs,
        )
    )
|
| 54 |
|
| 55 |
def get_answer_stream(self, model_instance, query, retrieved_contexts, context_urls=None, **kwargs):
|
| 56 |
"""Streams model output token-by-token for incremental UI updates."""
|
|
|
|
| 65 |
# Fallback for model wrappers that only expose sync generation.
|
| 66 |
answer = model_instance.generate(prompt, **kwargs)
|
| 67 |
if answer:
|
| 68 |
+
yield self.truncate_incomplete_tail(answer)
|