Space: Zeggai/AgenticRAG

Zeggai Abdellah committed on Jun 8, 2025

Commit

b9bf1c6

1 Parent(s): fdc8d14

update the main from test space

Files changed (4)

data/Guide-pratique-de-mise-en-oeuvre-du-calendrier-national-de-vaccination-2023.json +3 -3
data/section_two_chunks.json +3 -3
prepare_env.py +166 -225
rag_pipeline.py +62 -56

data/Guide-pratique-de-mise-en-oeuvre-du-calendrier-national-de-vaccination-2023.json CHANGED

@@ -9061,13 +9061,13 @@
   {
     "type": "TableElement",
     "element_id": "chunk-69_table_-1677335896359719811",
-    "text": "\n  DESCRIPTION:\n  Le calendrier vaccinal 2023 comprend les vaccins BCG, HBV, DTCaVPI-Hib-HBV, VPC, VPO, ROR, DTCaVPI et dT, administrés selon l'âge.\n  :TABLE DATA:\n    Vaccin Âge Naissance 02 mois 04 mois 11 mois 12 mois 18 mois 6 ans 11-13 ans 16-18 ans Tous les 10 ans à partir de 18 ans BCG  BCG HBV  HBV DTCaVPI-Hib-HBV   DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV VPC   VPC VPC VPC VPO   VPO VPO VPO ROR     ROR ROR DTCaVPI       DTCaVPI dT        dT dT dT\n  :END TABLE DATA:\n\n  ",
     "filename": "Guide-pratique-de-mise-en-oeuvre-du-calendrier-national-de-vaccination-2023.pdf",
     "filetype": "application/pdf",
     "elements": {
       "type": "Table",
       "element_id": "2fbbc8eb32492ddeb5527973a42eeada",
-      "text": "Vaccin Âge Naissance 02 mois 04 mois 11 mois 12 mois 18 mois 6 ans 11-13 ans 16-18 ans Tous les 10 ans à partir de 18 ans BCG  BCG HBV  HBV DTCaVPI-Hib-HBV   DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV VPC   VPC VPC VPC VPO   VPO VPO VPO ROR     ROR ROR DTCaVPI       DTCaVPI dT        dT dT dT",
       "metadata": {
         "category_depth": 1,
         "page_number": 10,
@@ -9128,7 +9128,7 @@
       {
         "type": "Table",
         "element_id": "2fbbc8eb32492ddeb5527973a42eeada",
-        "text": "Vaccin Âge Naissance 02 mois 04 mois 11 mois 12 mois 18 mois 6 ans 11-13 ans 16-18 ans Tous les 10 ans à partir de 18 ans BCG  BCG HBV  HBV DTCaVPI-Hib-HBV   DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV VPC   VPC VPC VPC VPO   VPO VPO VPO ROR     ROR ROR DTCaVPI       DTCaVPI dT        dT dT dT",
         "metadata": {
           "category_depth": 1,
           "page_number": 10,

   {
     "type": "TableElement",
     "element_id": "chunk-69_table_-1677335896359719811",
+    "text": "\n  DESCRIPTION:\n  Le calendrier vaccinal 2023 comprend les vaccins BCG, HBV, DTCaVPI-Hib-HBV, VPC, VPO, ROR, DTCaVPI et dT, administrés selon l'âge.\n  :TABLE DATA:\n    * **Naissance (Birth):** BCG, HBV\n* **02 mois (2 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **04 mois (4 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **11 mois (11 months):** ROR\n* **12 mois (12 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **18 mois (18 months):** ROR\n* **6 ans (6 years):** DTCaVPI\n* **11-13 ans (11-13 years):** dT\n* **16-18 ans (16-18 years):** dT\n* **Tous les 10 ans à partir de 18 ans (Every 10 years from age 18):** dT\n\nVaccin Âge Naissance 02 mois 04 mois 11 mois 12 mois 18 mois 6 ans 11-13 ans 16-18 ans Tous les 10 ans à partir de 18 ans BCG  BCG HBV  HBV DTCaVPI-Hib-HBV   DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV VPC   VPC VPC VPC VPO   VPO VPO VPO ROR     ROR ROR DTCaVPI       DTCaVPI dT        dT dT dT\n  :END TABLE DATA:\n\n  ",
     "filename": "Guide-pratique-de-mise-en-oeuvre-du-calendrier-national-de-vaccination-2023.pdf",
     "filetype": "application/pdf",
     "elements": {
       "type": "Table",
       "element_id": "2fbbc8eb32492ddeb5527973a42eeada",
+      "text": "* **Naissance (Birth):** BCG, HBV\n* **02 mois (2 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **04 mois (4 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **11 mois (11 months):** ROR\n* **12 mois (12 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **18 mois (18 months):** ROR\n* **6 ans (6 years):** DTCaVPI\n* **11-13 ans (11-13 years):** dT\n* **16-18 ans (16-18 years):** dT\n* **Tous les 10 ans à partir de 18 ans (Every 10 years from age 18):** dT\n\nVaccin Âge Naissance 02 mois 04 mois 11 mois 12 mois 18 mois 6 ans 11-13 ans 16-18 ans Tous les 10 ans à partir de 18 ans BCG  BCG HBV  HBV DTCaVPI-Hib-HBV   DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV VPC   VPC VPC VPC VPO   VPO VPO VPO ROR     ROR ROR DTCaVPI       DTCaVPI dT        dT dT dT",
       "metadata": {
         "category_depth": 1,
         "page_number": 10,
       {
         "type": "Table",
         "element_id": "2fbbc8eb32492ddeb5527973a42eeada",
+        "text": "* **Naissance (Birth):** BCG, HBV\n* **02 mois (2 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **04 mois (4 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **11 mois (11 months):** ROR\n* **12 mois (12 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **18 mois (18 months):** ROR\n* **6 ans (6 years):** DTCaVPI\n* **11-13 ans (11-13 years):** dT\n* **16-18 ans (16-18 years):** dT\n* **Tous les 10 ans à partir de 18 ans (Every 10 years from age 18):** dT\n\nVaccin Âge Naissance 02 mois 04 mois 11 mois 12 mois 18 mois 6 ans 11-13 ans 16-18 ans Tous les 10 ans à partir de 18 ans BCG  BCG HBV  HBV DTCaVPI-Hib-HBV   DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV VPC   VPC VPC VPC VPO   VPO VPO VPO ROR     ROR ROR DTCaVPI       DTCaVPI dT        dT dT dT",
         "metadata": {
           "category_depth": 1,
           "page_number": 10,

data/section_two_chunks.json CHANGED

@@ -6609,13 +6609,13 @@
   {
     "type": "TableElement",
     "element_id": "chunk-59_table_-3658241022386714145",
-    "text": "\n  DESCRIPTION:\n  Le calendrier vaccinal 2023 débute à la naissance avec BCG et HBV, se poursuit avec des vaccins combinés et des rappels réguliers tout au long de la vie.\n  :TABLE DATA:\n    Vaccin Âge Naissance 02 mois 04 mois 11 mois 12 mois 18 mois 6 ans 11-13 ans 16-18 ans Tous les 10 ans à partir de 18 ans BCG  BCG HBV  HBV DTCaVPI-Hib-HBV   DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV VPC   VPC VPC VPC VPO   VPO VPO VPO ROR     ROR ROR DTCaVPI       DTCaVPI dT        dT dT dT\n  :END TABLE DATA:\n\n  ",
     "filename": "Guide-pratique-de-mise-en-oeuvre-du-calendrier-national-de-vaccination-2023.pdf",
     "filetype": "application/pdf",
     "elements": {
       "type": "Table",
       "element_id": "2fbbc8eb32492ddeb5527973a42eeada",
-      "text": "Vaccin Âge Naissance 02 mois 04 mois 11 mois 12 mois 18 mois 6 ans 11-13 ans 16-18 ans Tous les 10 ans à partir de 18 ans BCG  BCG HBV  HBV DTCaVPI-Hib-HBV   DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV VPC   VPC VPC VPC VPO   VPO VPO VPO ROR     ROR ROR DTCaVPI       DTCaVPI dT        dT dT dT",
       "metadata": {
         "category_depth": 1,
         "page_number": 10,
@@ -6676,7 +6676,7 @@
       {
         "type": "Table",
         "element_id": "2fbbc8eb32492ddeb5527973a42eeada",
-        "text": "Vaccin Âge Naissance 02 mois 04 mois 11 mois 12 mois 18 mois 6 ans 11-13 ans 16-18 ans Tous les 10 ans à partir de 18 ans BCG  BCG HBV  HBV DTCaVPI-Hib-HBV   DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV VPC   VPC VPC VPC VPO   VPO VPO VPO ROR     ROR ROR DTCaVPI       DTCaVPI dT        dT dT dT",
         "metadata": {
           "category_depth": 1,
           "page_number": 10,

   {
     "type": "TableElement",
     "element_id": "chunk-59_table_-3658241022386714145",
+    "text": "\n  DESCRIPTION:\n  Le calendrier vaccinal 2023 comprend les vaccins BCG, HBV, DTCaVPI-Hib-HBV, VPC, VPO, ROR, DTCaVPI et dT, administrés selon l'âge.\n  :TABLE DATA:\n    * **Naissance (Birth):** BCG, HBV\n* **02 mois (2 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **04 mois (4 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **11 mois (11 months):** ROR\n* **12 mois (12 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **18 mois (18 months):** ROR\n* **6 ans (6 years):** DTCaVPI\n* **11-13 ans (11-13 years):** dT\n* **16-18 ans (16-18 years):** dT\n* **Tous les 10 ans à partir de 18 ans (Every 10 years from age 18):** dT\n\nVaccin Âge Naissance 02 mois 04 mois 11 mois 12 mois 18 mois 6 ans 11-13 ans 16-18 ans Tous les 10 ans à partir de 18 ans BCG  BCG HBV  HBV DTCaVPI-Hib-HBV   DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV VPC   VPC VPC VPC VPO   VPO VPO VPO ROR     ROR ROR DTCaVPI       DTCaVPI dT        dT dT dT\n  :END TABLE DATA:\n\n  ",
     "filename": "Guide-pratique-de-mise-en-oeuvre-du-calendrier-national-de-vaccination-2023.pdf",
     "filetype": "application/pdf",
     "elements": {
       "type": "Table",
       "element_id": "2fbbc8eb32492ddeb5527973a42eeada",
+      "text": "* **Naissance (Birth):** BCG, HBV\n* **02 mois (2 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **04 mois (4 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **11 mois (11 months):** ROR\n* **12 mois (12 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **18 mois (18 months):** ROR\n* **6 ans (6 years):** DTCaVPI\n* **11-13 ans (11-13 years):** dT\n* **16-18 ans (16-18 years):** dT\n* **Tous les 10 ans à partir de 18 ans (Every 10 years from age 18):** dT\n\nVaccin Âge Naissance 02 mois 04 mois 11 mois 12 mois 18 mois 6 ans 11-13 ans 16-18 ans Tous les 10 ans à partir de 18 ans BCG  BCG HBV  HBV DTCaVPI-Hib-HBV   DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV VPC   VPC VPC VPC VPO   VPO VPO VPO ROR     ROR ROR DTCaVPI       DTCaVPI dT        dT dT dT",
       "metadata": {
         "category_depth": 1,
         "page_number": 10,
       {
         "type": "Table",
         "element_id": "2fbbc8eb32492ddeb5527973a42eeada",
+        "text": "* **Naissance (Birth):** BCG, HBV\n* **02 mois (2 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **04 mois (4 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **11 mois (11 months):** ROR\n* **12 mois (12 months):** DTCaVPI-Hib-HBV, VPC, VPO\n* **18 mois (18 months):** ROR\n* **6 ans (6 years):** DTCaVPI\n* **11-13 ans (11-13 years):** dT\n* **16-18 ans (16-18 years):** dT\n* **Tous les 10 ans à partir de 18 ans (Every 10 years from age 18):** dT\n\nVaccin Âge Naissance 02 mois 04 mois 11 mois 12 mois 18 mois 6 ans 11-13 ans 16-18 ans Tous les 10 ans à partir de 18 ans BCG  BCG HBV  HBV DTCaVPI-Hib-HBV   DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV DTCaVPI-Hib-HBV VPC   VPC VPC VPC VPO   VPO VPO VPO ROR     ROR ROR DTCaVPI       DTCaVPI dT        dT dT dT",
         "metadata": {
           "category_depth": 1,
           "page_number": 10,
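
The data change above prepends a per-age bullet list to the flattened table text in both JSON files. A minimal sketch of how such a list could be rendered from an age-to-vaccines mapping follows. The `SCHEDULE` mapping and the `render_schedule` helper are hypothetical names used only for illustration (the values are copied from the added lines); nothing here is taken from the commit's own tooling.

```python
from typing import Dict, List

# Hypothetical mapping, copied from the bullet list added in the diff above.
SCHEDULE: Dict[str, List[str]] = {
    "Naissance (Birth)": ["BCG", "HBV"],
    "02 mois (2 months)": ["DTCaVPI-Hib-HBV", "VPC", "VPO"],
    "04 mois (4 months)": ["DTCaVPI-Hib-HBV", "VPC", "VPO"],
    "11 mois (11 months)": ["ROR"],
    "12 mois (12 months)": ["DTCaVPI-Hib-HBV", "VPC", "VPO"],
    "18 mois (18 months)": ["ROR"],
    "6 ans (6 years)": ["DTCaVPI"],
    "11-13 ans (11-13 years)": ["dT"],
    "16-18 ans (16-18 years)": ["dT"],
    "Tous les 10 ans à partir de 18 ans (Every 10 years from age 18)": ["dT"],
}


def render_schedule(schedule: Dict[str, List[str]]) -> str:
    """Render one markdown bullet per age, listing the vaccines due at that age."""
    return "\n".join(
        f"* **{age}:** {', '.join(vaccines)}" for age, vaccines in schedule.items()
    )


bullets = render_schedule(SCHEDULE)
```

Grouping by age rather than by vaccine keeps each retrievable line self-contained, which is presumably why the bullet form was added ahead of the raw table text.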

prepare_env.py CHANGED Viewed

@@ -17,7 +17,11 @@ from langchain.retrievers.multi_query import MultiQueryRetriever
 from langchain_google_genai import ChatGoogleGenerativeAI
 from llama_index.core.tools import FunctionTool
 from llama_index.core.schema import TextNode
 def extract_source_ids(response_text):
     """
@@ -124,21 +128,54 @@ def create_vectorstore_from_json(json_path: str, collection_name: str, embedding
     print(f"✅ Vector store created with collection: {collection_name}")
     return vectorstore, documents
-def create_retriever(vectorstore, docs, llm):
-    """Create ensemble retriever with vector and BM25 search"""
     print("🔍 Creating ensemble retriever...")
     # Vector retriever
     vector_retriever = vectorstore.as_retriever(
         search_type="similarity",
-        search_kwargs={"k": 6}
     )
-    print("✅ Vector retriever created (k=6)")
     # BM25 retriever
     bm25_retriever = BM25Retriever.from_documents(docs)
-    bm25_retriever.k = 2
-    print("✅ BM25 retriever created (k=2)")
     # Ensemble retriever
     ensemble_retriever = EnsembleRetriever(
@@ -147,10 +184,12 @@ def create_retriever(vectorstore, docs, llm):
     )
     print("✅ Ensemble retriever created (weights: 0.5, 0.5)")
-    # Multi-query expanding retriever
     expanding_retriever = MultiQueryRetriever.from_llm(
         retriever=ensemble_retriever,
-        llm=llm
     )
     print("✅ Multi-query expanding retriever created")
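
The hunk above combines a vector retriever and a BM25 retriever with equal weights. LangChain documents `EnsembleRetriever` as merging ranked lists with Reciprocal Rank Fusion; the sketch below illustrates that idea in plain Python under stated assumptions. The function name `weighted_rrf`, the document ids, and the smoothing constant `c=60` are assumptions for illustration, and this is not the library's code.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def weighted_rrf(
    ranked_lists: Sequence[Sequence[str]],
    weights: Sequence[float],
    c: int = 60,  # assumed smoothing constant; larger values flatten rank differences
) -> List[str]:
    """Fuse ranked lists of document ids with weighted Reciprocal Rank Fusion."""
    if len(ranked_lists) != len(weights):
        raise ValueError("need exactly one weight per ranked list")
    scores: Dict[str, float] = defaultdict(float)
    for docs, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(docs, start=1):
            # Each list contributes weight / (rank + c) for every document it returns.
            scores[doc_id] += weight / (rank + c)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results: "d3" appears in both lists, so it rises to the top.
vector_hits = ["d1", "d2", "d3"]
bm25_hits = ["d3", "d4"]
fused = weighted_rrf([vector_hits, bm25_hits], weights=[0.5, 0.5])
```

A document found by both retrievers accumulates two contributions, so agreement between lexical and semantic search is rewarded even when neither retriever ranks the document first.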
@@ -219,13 +258,15 @@ def section_tool_wrapper(retriever, section_path_chunks, query):
         return f"Error retrieving documents: {str(e)}"
 def create_section_tools(embedding_function, llm):
-    """Create all section-specific retrieval tools"""
-    print("🛠️ Creating section-specific retrieval tools...")
     # Define section paths - Fixed path structure
     section_paths = {
         'one': './data/section_one_chunks.json',
-        'two': './data/section_two_chunks.json',
         'three': './data/section_three_chunks.json',
         'four': './data/section_four_chunks.json',
         'five': './data/section_five_chunks.json',
@@ -235,7 +276,7 @@ def create_section_tools(embedding_function, llm):
         'nine': './data/section_nine_chunks.json',
         'ten': './data/section_ten_chunks.json'
     }
     # Create retrievers for each section
     section_retrievers = {}
     for section, path in section_paths.items():
@@ -243,7 +284,7 @@ def create_section_tools(embedding_function, llm):
             if os.path.exists(path):
                 print(f"📁 Creating retriever for section {section} from {path}")
                 vstore, docs = create_vectorstore_from_json(path, f"Guide_2023_{section}", embedding_function)
-                section_retrievers[section] = create_retriever(vstore, docs, llm)
                 print(f"✅ Successfully created retriever for section {section}")
             else:
                 print(f"⚠️ Warning: File not found for section {section}: {path}")
@@ -251,7 +292,7 @@ def create_section_tools(embedding_function, llm):
         except Exception as e:
             print(f"❌ Error creating retriever for section {section}: {e}")
             section_retrievers[section] = None
     # Create main guide retriever
     guide_path = './data/Guide-pratique-de-mise-en-oeuvre-du-calendrier-national-de-vaccination-2023.json'
     guide_retriever = None
@@ -284,71 +325,48 @@ def create_section_tools(embedding_function, llm):
     except Exception as e:
         print(f"❌ Error creating immunization retriever: {e}")
-    # General-purpose tool (entire Algerian guide)
-    def guide_retrieval_tool(query: str) -> str:
-        """
-        General-purpose retrieval tool for the entire Algerian National Vaccination Guide (2023).
-        Use ONLY when a query spans multiple unrelated sections or cannot be confidently routed
-        to a more specific tool. This tool provides a fallback when the intent is ambiguous
-        or multi-topic (e.g., combining schedule, cold chain, and public outreach in one query).
-        Do NOT use this tool for clearly scoped questions related to vaccination schedules, disease profiles,
-        vaccine logistics, or procedural workflows — use the specific section tools instead.
-        Secondary source: WHO Immunization Guide (use `immunization_tool` when broader context is required).
         Args:
-            query (str): A general, complex, or cross-sectional vaccination query.
         Returns:
-            str: Synthesized answer from the entire national guide.
         """
-        print(f"🏥 GUIDE TOOL CALLED: {query[:50]}...")
         if not guide_retriever:
-            print("❌ Guide retriever not available - main guide file may be missing")
             return "Guide retriever not available - main guide file may be missing"
-        try:
-            return section_tool_wrapper(guide_retriever, guide_path, query)
-        except Exception as e:
-            print(f"❌ Error accessing guide retriever: {str(e)}")
-            return f"Error accessing guide retriever: {str(e)}"
-    def immunization_tool(query: str) -> str:
         """
-        WHO Immunization in Practice 2015 retrieval tool.
-        Use ONLY when global guidance or procedural context is needed that is not covered or
-        is unclear in the Algerian guide. This is a secondary reference for training standards,
-        general immunization logistics, and vaccine delivery practices.
-        Do NOT use this tool to answer country-specific policy or scheduling questions.
         Args:
-            query (str): A question seeking global immunization practices.
         Returns:
             str: Content from the WHO Immunization in Practice guide.
         """
         print(f"🌍 WHO TOOL CALLED: {query[:50]}...")
         if not immunization_retriever:
-            print("❌ Immunization in Practice retriever not available - WHO guide file may be missing")
             return "Immunization in Practice retriever not available - WHO guide file may be missing"
-        try:
-            return section_tool_wrapper(immunization_retriever, immunization_path, query)
-        except Exception as e:
-            print(f"❌ Error accessing immunization retriever: {str(e)}")
-            return f"Error accessing immunization retriever: {str(e)}"
-    # Section-Specific Tools - Fixed implementation
-    def section_one_tool(query: str) -> str:
         """
-        Section 1: Programme Élargi de Vaccination (PEV)
-        Use for queries about the national immunization program structure: its objectives,
-        history, evaluation, and rationale for updates to the Algerian calendar.
-        Do NOT use for vaccine schedules, disease information, or administration techniques.
         Args:
             query (str): A question about the foundation or evolution of the PEV.
@@ -356,24 +374,16 @@ def create_section_tools(embedding_function, llm):
         Returns:
             str: Response from Section 1.
         """
-        print(f"📋 SECTION 1 TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('one'):
-            print("❌ Section 1 retriever not available - file may be missing")
-            return "Section 1 retriever not available - file may be missing"
-        try:
-            return section_tool_wrapper(section_retrievers['one'], section_paths['one'], query)
-        except Exception as e:
-            print(f"❌ Error accessing section 1: {str(e)}")
-            return f"Error accessing section 1: {str(e)}"
-    def section_two_tool(query: str) -> str:
         """
-        Section 2: Maladies Ciblées
-        Use ONLY for questions about the characteristics of vaccine-preventable diseases:
-        symptoms, transmission, complications, and prevention.
-        Do NOT use for questions about vaccines, administration schedules, or procedures.
         Args:
             query (str): A question about a disease covered by the national vaccination program.
@@ -381,99 +391,67 @@ def create_section_tools(embedding_function, llm):
         Returns:
             str: Disease-specific content from Section 2.
         """
-        print(f"🦠 SECTION 2 TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('two'):
-            print("❌ Section 2 retriever not available - file may be missing")
-            return "Section 2 retriever not available - file may be missing"
-        try:
-            return section_tool_wrapper(section_retrievers['two'], section_paths['two'], query)
-        except Exception as e:
-            print(f"❌ Error accessing section 2: {str(e)}")
-            return f"Error accessing section 2: {str(e)}"
-    def section_three_tool(query: str) -> str:
         """
-        Section 3: Vaccins du Calendrier
-        Use ONLY for questions about the vaccines themselves: their types, compositions,
-        methods of administration, and how they work.
-        Do NOT use for schedule timing, catch-up protocols, or disease information.
         Args:
-            query (str): A question about a vaccine's formulation or method of delivery.
         Returns:
-            str: Vaccine info from Section 3.
         """
-        print(f"💉 SECTION 3 TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('three'):
-            print("❌ Section 3 retriever not available - file may be missing")
-            return "Section 3 retriever not available - file may be missing"
-        try:
-            return section_tool_wrapper(section_retrievers['three'], section_paths['three'], query)
-        except Exception as e:
-            print(f"❌ Error accessing section 3: {str(e)}")
-            return f"Error accessing section 3: {str(e)}"
-    def section_four_tool(query: str) -> str:
         """
-        Section 4: Rattrapage Vaccinal
-        Use ONLY when the question involves missed or delayed vaccinations and how to reschedule them
-        based on the child's current age.
-        Do NOT use for standard schedules (on-time), vaccine properties, or cold chain issues.
         Args:
-            query (str): A question about catch-up vaccination based on delay or omission.
         Returns:
-            str: Catch-up guidance from Section 4.
         """
-        print(f"🔄 SECTION 4 TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('four'):
-            print("❌ Section 4 retriever not available - file may be missing")
-            return "Section 4 retriever not available - file may be missing"
-        try:
-            return section_tool_wrapper(section_retrievers['four'], section_paths['four'], query)
-        except Exception as e:
-            print(f"❌ Error accessing section 4: {str(e)}")
-            return f"Error accessing section 4: {str(e)}"
-    def section_five_tool(query: str) -> str:
         """
-        Section 5: Populations Particulières
-        Use ONLY for vaccination questions concerning special populations:
-        preterm infants, immunosuppressed patients, chronic illness, or allergy conditions.
-        Do NOT use for general population, standard calendar, or vaccine preparation.
         Args:
-            query (str): A question about tailored vaccination for vulnerable groups.
         Returns:
             str: Custom recommendations from Section 5.
         """
-        print(f"👥 SECTION 5 TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('five'):
-            print("❌ Section 5 retriever not available - file may be missing")
-            return "Section 5 retriever not available - file may be missing"
-        try:
-            return section_tool_wrapper(section_retrievers['five'], section_paths['five'], query)
-        except Exception as e:
-            print(f"❌ Error accessing section 5: {str(e)}")
-            return f"Error accessing section 5: {str(e)}"
-    def section_six_tool(query: str) -> str:
         """
-        Section 6: Chaîne du Froid
-        Use ONLY for questions about vaccine storage, transport, cold chain equipment,
-        temperature monitoring, or cold chain failures.
-        Do NOT use for dose timing, administration methods, or disease information.
         Args:
             query (str): A logistics-related question about vaccine temperature management.
@@ -481,99 +459,67 @@ def create_section_tools(embedding_function, llm):
         Returns:
             str: Cold chain instructions from Section 6.
         """
-        print(f"❄️ SECTION 6 TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('six'):
-            print("❌ Section 6 retriever not available - file may be missing")
-            return "Section 6 retriever not available - file may be missing"
-        try:
-            return section_tool_wrapper(section_retrievers['six'], section_paths['six'], query)
-        except Exception as e:
-            print(f"❌ Error accessing section 6: {str(e)}")
-            return f"Error accessing section 6: {str(e)}"
-    def section_seven_tool(query: str) -> str:
         """
-        Section 7: Sécurité des Injections
-        Use ONLY for questions related to the safe administration of vaccines:
-        equipment use, technique, safety precautions, and waste disposal.
-        Do NOT use for vaccine types, schedules, or cold chain issues.
         Args:
-            query (str): A question about how to inject vaccines safely.
         Returns:
             str: Best practices from Section 7.
         """
-        print(f"🛡️ SECTION 7 TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('seven'):
-            print("❌ Section 7 retriever not available - file may be missing")
-            return "Section 7 retriever not available - file may be missing"
-        try:
-            return section_tool_wrapper(section_retrievers['seven'], section_paths['seven'], query)
-        except Exception as e:
-            print(f"❌ Error accessing section 7: {str(e)}")
-            return f"Error accessing section 7: {str(e)}"
-    def section_eight_tool(query: str) -> str:
         """
-        Section 8: Séance de Vaccination & Vaccinovigilance
-        Use ONLY for questions about running a vaccination session, preparing the setting,
-        recording injections, and monitoring for adverse events (AEFI).
-        Do NOT use for disease, vaccine, or scheduling details.
         Args:
-            query (str): A question about operational conduct during vaccination.
         Returns:
             str: Workflow and safety monitoring details from Section 8.
         """
-        print(f"📊 SECTION 8 TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('eight'):
-            print("❌ Section 8 retriever not available - file may be missing")
-            return "Section 8 retriever not available - file may be missing"
-        try:
-            return section_tool_wrapper(section_retrievers['eight'], section_paths['eight'], query)
-        except Exception as e:
-            print(f"❌ Error accessing section 8: {str(e)}")
-            return f"Error accessing section 8: {str(e)}"
-    def section_nine_tool(query: str) -> str:
         """
-        Section 9: Planification des Séances de Vaccination
-        Use ONLY for planning and logistics questions: session scheduling, stock estimation,
-        and operational preparation at the facility level.
-        Do NOT use for vaccine info, schedules, or injection techniques.
         Args:
-            query (str): A question about how to plan or organize vaccination services.
         Returns:
             str: Planning and stock guidance from Section 9.
         """
-        print(f"📅 SECTION 9 TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('nine'):
-            print("❌ Section 9 retriever not available - file may be missing")
-            return "Section 9 retriever not available - file may be missing"
-        try:
-            return section_tool_wrapper(section_retrievers['nine'], section_paths['nine'], query)
-        except Exception as e:
-            print(f"❌ Error accessing section 9: {str(e)}")
-            return f"Error accessing section 9: {str(e)}"
-    def section_ten_tool(query: str) -> str:
         """
-        Section 10: Mobilisation Sociale
-        Use ONLY for questions about communication strategies, overcoming vaccine hesitancy,
-        rumor management, or community outreach.
-        Do NOT use for medical, logistical, or procedural topics.
         Args:
             query (str): A question about public engagement or communication for vaccination.
@@ -581,34 +527,29 @@ def create_section_tools(embedding_function, llm):
         Returns:
             str: Public mobilization strategies from Section 10.
         """
-        print(f"📢 SECTION 10 TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('ten'):
-            print("❌ Section 10 retriever not available - file may be missing")
-            return "Section 10 retriever not available - file may be missing"
-        try:
-            return section_tool_wrapper(section_retrievers['ten'], section_paths['ten'], query)
-        except Exception as e:
-            print(f"❌ Error accessing section 10: {str(e)}")
-            return f"Error accessing section 10: {str(e)}"
-    # Create FunctionTool objects
     tools = [
-        FunctionTool.from_defaults(name="Guide_vector_tool", fn=guide_retrieval_tool),
-        FunctionTool.from_defaults(name="Immunization_in_Practice_tool", fn=immunization_tool),
         # Section-specific tools
-        FunctionTool.from_defaults(name="section_one_vector_query_tool", fn=section_one_tool),
-        FunctionTool.from_defaults(name="section_two_vector_query_tool", fn=section_two_tool),
-        FunctionTool.from_defaults(name="section_three_vector_query_tool", fn=section_three_tool),
-        FunctionTool.from_defaults(name="section_four_vector_query_tool", fn=section_four_tool),
-        FunctionTool.from_defaults(name="section_five_vector_query_tool", fn=section_five_tool),
-        FunctionTool.from_defaults(name="section_six_vector_query_tool", fn=section_six_tool),
-        FunctionTool.from_defaults(name="section_seven_vector_query_tool", fn=section_seven_tool),
-        FunctionTool.from_defaults(name="section_eight_vector_query_tool", fn=section_eight_tool),
-        FunctionTool.from_defaults(name="section_nine_vector_query_tool", fn=section_nine_tool),
-        FunctionTool.from_defaults(name="section_ten_vector_query_tool", fn=section_ten_tool),
     ]
-    print(f"✅ Created {len(tools)} section tools")
     return tools
 def prepare_environment():
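
The ten `section_*_tool` functions removed above differ only in their section key, label, and docstring. One way to express that shared shape is a closure factory. The sketch below is an assumption for illustration, not the commit's actual refactor: `make_section_tool` and `fake_wrapper` are hypothetical names, and `fake_wrapper` merely stands in for the real `section_tool_wrapper`.

```python
from typing import Callable, Optional


def make_section_tool(
    label: str,
    description: str,
    retriever: Optional[object],
    path: str,
    wrapper: Callable[[object, str, str], str],
) -> Callable[[str], str]:
    """Build one retrieval tool function for a section (hypothetical helper)."""

    def tool(query: str) -> str:
        if not retriever:
            # Same fallback message shape as the removed per-section functions.
            return f"{label} retriever not available - file may be missing"
        try:
            return wrapper(retriever, path, query)
        except Exception as e:
            return f"Error accessing {label.lower()}: {e}"

    # The docstrings in the diff read as routing descriptions, so keep one per tool.
    tool.__doc__ = description
    return tool


def fake_wrapper(retriever: object, path: str, query: str) -> str:
    """Stand-in for section_tool_wrapper; only echoes its inputs."""
    return f"[{path}] {query}"


section_one = make_section_tool(
    "Section 1", "Section 1: Programme Élargi de Vaccination (PEV)",
    object(), "./data/section_one_chunks.json", fake_wrapper,
)
missing = make_section_tool(
    "Section 2", "Section 2: Maladies Ciblées",
    None, "./data/section_two_chunks.json", fake_wrapper,
)
```

Each generated function keeps the `query: str -> str` signature of the originals, so it could be passed wherever the hand-written functions were used.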

 from langchain_google_genai import ChatGoogleGenerativeAI
 from llama_index.core.tools import FunctionTool
 from llama_index.core.schema import TextNode
+from langchain.prompts import PromptTemplate
+import logging
+logging.basicConfig()
+logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)
 def extract_source_ids(response_text):
     """
     print(f"✅ Vector store created with collection: {collection_name}")
     return vectorstore, documents
+def create_retriever(vectorstore, docs, llm, bm25_k=3,vector_k=6):
+    """Create ensemble retriever with vector and BM25 search
+    Args:
+        vectorstore: The vector store for similarity search
+        docs: Documents for BM25 retriever
+        llm: Language model for multi-query generation
+        bm25_k: Number of documents to retrieve with BM25
+        vector_k: Number of documents to retrieve with vector search
+    Returns:
+        Configured retriever (MultiQueryRetriever or EnsembleRetriever)
+    """
     print("🔍 Creating ensemble retriever...")
+    # PromptTemplate for Vaccine Assistant MultiQuery Retriever
+    VACCINE_MULTIQUERY_PROMPT = PromptTemplate(
+        input_variables=["question"],
+        template="""You are an AI assistant specialized in vaccine-related medical information retrieval.
+    Your task is to generate multiple search queries based on the original question to find relevant information from official vaccine medical documents.
+    IMPORTANT GUIDELINES:
+    - Keep all vaccine-specific terminology and medical terms intact
+    - Maintain the clinical and medical context
+    - Focus on evidence-based vaccine information
+    - Preserve any specific vaccine names, diseases, or medical conditions mentioned
+    - Generate queries that would help retrieve information about vaccine schedules, dosing, contraindications, adverse events, and disease prevention
+    Original question: {question}
+    Generate 4 different search queries that rephrase the original question while maintaining vaccine terminology and medical accuracy. Each query should approach the topic from a slightly different angle to maximize retrieval from vaccine medical documents.
+    Provide only the alternative questions, one per line."""
+    )
     # Vector retriever
     vector_retriever = vectorstore.as_retriever(
         search_type="similarity",
+        search_kwargs={"k": vector_k}
     )
+    print(f"✅ Vector retriever created (k={vector_k})")
     # BM25 retriever
     bm25_retriever = BM25Retriever.from_documents(docs)
+    bm25_retriever.k = bm25_k
+    print(f"✅ BM25 retriever created (k={bm25_k})")
     # Ensemble retriever
     ensemble_retriever = EnsembleRetriever(
     )
     print("✅ Ensemble retriever created (weights: 0.5, 0.5)")
+    # Multi-query expanding retriever (only for filtered mode)
     expanding_retriever = MultiQueryRetriever.from_llm(
         retriever=ensemble_retriever,
+        llm=llm,
+        prompt=VACCINE_MULTIQUERY_PROMPT,
     )
     print("✅ Multi-query expanding retriever created")
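
The prompt above asks the model for alternative questions, one per line, and the retriever then searches with each of them. A minimal sketch of that fan-out and de-duplication follows. `parse_queries`, `multi_query_retrieve`, and the stub data are hypothetical names for illustration; this is not LangChain's implementation, and the real retriever works on document objects rather than plain strings.

```python
from typing import Callable, Dict, List


def parse_queries(llm_output: str) -> List[str]:
    """Split model output into one query per non-empty line."""
    return [line.strip() for line in llm_output.splitlines() if line.strip()]


def multi_query_retrieve(
    question: str,
    generate_queries: Callable[[str], List[str]],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Run every generated query and return the unique union, in first-seen order."""
    seen = set()
    merged: List[str] = []
    for query in generate_queries(question):
        for doc in retrieve(query):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


# Stub model output and index, standing in for the LLM and the ensemble retriever.
FAKE_OUTPUT = "Quel âge pour le ROR ?\n\nCalendrier ROR 2023\n"
FAKE_INDEX: Dict[str, List[str]] = {
    "Quel âge pour le ROR ?": ["chunk-69", "chunk-12"],
    "Calendrier ROR 2023": ["chunk-69", "chunk-70"],
}

docs = multi_query_retrieve(
    "When is ROR given?",
    generate_queries=lambda _question: parse_queries(FAKE_OUTPUT),
    retrieve=lambda query: FAKE_INDEX.get(query, []),
)
```

Because every rewritten query hits the same retriever, raising `bm25_k` and `vector_k` multiplies the number of candidates to merge, which is worth keeping in mind when tuning those values.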
         return f"Error retrieving documents: {str(e)}"
 def create_section_tools(embedding_function, llm):
+    """
+    Create all section-specific retrieval tools with improved descriptions for accurate routing.
+    """
+    print("🛠️ Creating section-specific retrieval tools with enhanced descriptions...")
     # Define section paths - Fixed path structure
     section_paths = {
         'one': './data/section_one_chunks.json',
+        'two': './data/section_two_chunks.json',
         'three': './data/section_three_chunks.json',
         'four': './data/section_four_chunks.json',
         'five': './data/section_five_chunks.json',
         'nine': './data/section_nine_chunks.json',
         'ten': './data/section_ten_chunks.json'
     }
     # Create retrievers for each section
     section_retrievers = {}
     for section, path in section_paths.items():
         try:
             if os.path.exists(path):
                 print(f"📁 Creating retriever for section {section} from {path}")
                 vstore, docs = create_vectorstore_from_json(path, f"Guide_2023_{section}", embedding_function)
+                section_retrievers[section] = create_retriever(vstore, docs, llm, bm25_k=7, vector_k=10)
                 print(f"✅ Successfully created retriever for section {section}")
             else:
                 print(f"⚠️ Warning: File not found for section {section}: {path}")
         except Exception as e:
             print(f"❌ Error creating retriever for section {section}: {e}")
             section_retrievers[section] = None
     # Create main guide retriever
     guide_path = './data/Guide-pratique-de-mise-en-oeuvre-du-calendrier-national-de-vaccination-2023.json'
     guide_retriever = None
     except Exception as e:
         print(f"❌ Error creating immunization retriever: {e}")
+    # --- Tool Definitions with Improved Descriptions ---
+    def general_guide_tool(query: str) -> str:
+        """
+        A general-purpose tool for the Algerian National Vaccination Guide.
+        **Use this tool as a fallback** if no other specific tool seems appropriate, or for very broad, multi-topic questions
+        (e.g., 'Summarize the Algerian vaccination policy and its safety measures').
+        **Always prefer a more specific tool if the query matches its description** (e.g., use 'cold_chain_tool' for temperature questions).
         Args:
+            query (str): A broad or ambiguous question about the Algerian National Vaccination Guide.
         Returns:
+            str: Content retrieved from the entire guide.
         """
+        print(f"🏥 GENERAL GUIDE TOOL CALLED (FALLBACK): {query[:50]}...")
         if not guide_retriever:
             return "Guide retriever not available - main guide file may be missing"
+        return section_tool_wrapper(guide_retriever, guide_path, query)
+    def who_immunization_tool(query: str) -> str:
         """
+        Provides information from the WHO's 'Immunization in Practice' guide. Use this for questions about
+        **global immunization standards**, international best practices, or for comparing Algerian policy to
+        general WHO recommendations on topics like cold chain, safety, and disease control.
         Args:
+            query (str): A question seeking global or general immunization practices.
         Returns:
             str: Content from the WHO Immunization in Practice guide.
         """
         print(f"🌍 WHO TOOL CALLED: {query[:50]}...")
         if not immunization_retriever:
             return "Immunization in Practice retriever not available - WHO guide file may be missing"
+        return section_tool_wrapper(immunization_retriever, immunization_path, query)
+    def program_overview_tool(query: str) -> str:
         """
+        (Section 1) The primary tool for questions about the **history, objectives, and structure** of Algeria's
+        national immunization program (PEV - Programme Élargi de Vaccination). Use this for topics like
+        the program's rationale, key achievements, and the reasons for updates to the vaccination calendar.
         Args:
             query (str): A question about the foundation or evolution of the PEV.
         Returns:
             str: Response from Section 1.
         """
+        print(f"📋 PROGRAM OVERVIEW (S1) TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('one'):
+            return "Section 1 retriever not available"
+        return section_tool_wrapper(section_retrievers['one'], section_paths['one'], query)
+    def disease_info_tool(query: str) -> str:
         """
+        (Section 2) The definitive tool for information on **specific vaccine-preventable diseases**.
+        Use this to find details on **symptoms, transmission methods, complications**, and prevention
+        strategies for diseases like Diphtheria, Measles, Polio, Tetanus, etc.
         Args:
             query (str): A question about a disease covered by the national vaccination program.
         Returns:
             str: Disease-specific content from Section 2.
         """
+        print(f"🦠 DISEASE INFO (S2) TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('two'):
+            return "Section 2 retriever not available"
+        return section_tool_wrapper(section_retrievers['two'], section_paths['two'], query)
+    def vaccine_properties_tool(query: str) -> str:
         """
+        (Section 3) The specific tool for questions about the **vaccines themselves**: their types (e.g., BCG, ROR,
+        DTCaVPI), composition, whether they are live or inactivated, and the correct **method of administration**
+        (e.g., intradermal, intramuscular, oral).
         Args:
+            query (str): A question about a vaccine's formulation or how it is administered.
         Returns:
+            str: Vaccine-specific info from Section 3.
         """
+        print(f"💉 VACCINE PROPERTIES (S3) TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('three'):
+            return "Section 3 retriever not available"
+        return section_tool_wrapper(section_retrievers['three'], section_paths['three'], query)
+    def catch_up_vaccination_tool(query: str) -> str:
         """
+        (Section 4) Specialized tool for **missed or delayed vaccinations (rattrapage vaccinal)**.
+        Use this for questions about creating a **catch-up schedule** for a child who is behind
+        on their shots, based on their age and vaccination history.
         Args:
+            query (str): A question about catch-up vaccination due to a delay or missed dose.
         Returns:
+            str: Catch-up schedule guidance from Section 4.
         """
+        print(f"🔄 CATCH-UP (S4) TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('four'):
+            return "Section 4 retriever not available"
+        return section_tool_wrapper(section_retrievers['four'], section_paths['four'], query)
+    def special_populations_tool(query: str) -> str:
         """
+        (Section 5) The designated tool for vaccination guidelines concerning **special populations**.
+        Use for questions about vaccinating preterm infants, allergic children, or patients with
+        immunosuppression, chronic illnesses (cardiac, pulmonary), or other specific health conditions.
         Args:
+            query (str): A question about tailored vaccination for a vulnerable or special group.
         Returns:
             str: Custom recommendations from Section 5.
         """
+        print(f"👥 SPECIAL POPULATIONS (S5) TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('five'):
+            return "Section 5 retriever not available"
+        return section_tool_wrapper(section_retrievers['five'], section_paths['five'], query)
+    def cold_chain_tool(query: str) -> str:
         """
+        (Section 6) The definitive tool for all questions about the **cold chain**, including vaccine **storage
+        temperatures**, transport protocols, refrigerators, temperature monitoring (like PCV pastilles),
+        and procedures for handling cold chain failures or power outages.
         Args:
             query (str): A logistics-related question about vaccine temperature management.
         Returns:
             str: Cold chain instructions from Section 6.
         """
+        print(f"❄️ COLD CHAIN (S6) TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('six'):
+            return "Section 6 retriever not available"
+        return section_tool_wrapper(section_retrievers['six'], section_paths['six'], query)
+    def injection_safety_tool(query: str) -> str:
         """
+        (Section 7) The primary tool for questions related to the **safe administration of injections**.
+        Use for topics like sterile equipment, proper injection techniques, preventing needlestick injuries,
+        and safe disposal of medical waste (DASRI).
         Args:
+            query (str): A question about how to perform vaccine injections safely.
         Returns:
             str: Best practices from Section 7.
         """
+        print(f"🛡️ INJECTION SAFETY (S7) TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('seven'):
+            return "Section 7 retriever not available"
+        return section_tool_wrapper(section_retrievers['seven'], section_paths['seven'], query)
+    def session_management_tool(query: str) -> str:
         """
+        (Section 8) Use this tool for questions about the **operational conduct of a vaccination session**
+        and **vaccinovigilance**. This includes preparing the session, material setup, registering vaccination
+        acts, and monitoring/reporting adverse events post-vaccination (MPVI).
         Args:
+            query (str): A question about running a vaccination session or post-vaccine monitoring.
         Returns:
             str: Workflow and safety monitoring details from Section 8.
         """
+        print(f"📊 SESSION MGMT (S8) TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('eight'):
+            return "Section 8 retriever not available"
+        return section_tool_wrapper(section_retrievers['eight'], section_paths['eight'], query)
+    def planning_and_logistics_tool(query: str) -> str:
         """
+        (Section 9) This tool is for **planning vaccination sessions and managing logistics**. Use it for
+        questions about creating operational maps, estimating vaccine and supply needs, managing stock,
+        and reducing vaccine wastage.
         Args:
+            query (str): A question about organizing vaccination services or managing stock.
         Returns:
             str: Planning and stock guidance from Section 9.
         """
+        print(f"📅 PLANNING & LOGISTICS (S9) TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('nine'):
+            return "Section 9 retriever not available"
+        return section_tool_wrapper(section_retrievers['nine'], section_paths['nine'], query)
+    def communication_tool(query: str) -> str:
         """
+        (Section 10) The specific tool for **social mobilization and communication**. Use this for
+        questions about communication strategies, addressing **vaccine hesitancy**, managing rumors,
+        and community outreach to promote vaccination.
         Args:
             query (str): A question about public engagement or communication for vaccination.
         Returns:
             str: Public mobilization strategies from Section 10.
         """
+        print(f"📢 COMMUNICATION (S10) TOOL CALLED: {query[:50]}...")
         if not section_retrievers.get('ten'):
+            return "Section 10 retriever not available"
+        return section_tool_wrapper(section_retrievers['ten'], section_paths['ten'], query)
+    # Create FunctionTool objects with new, clearer names
     tools = [
+        FunctionTool.from_defaults(name="general_guide_tool", fn=general_guide_tool),
+        FunctionTool.from_defaults(name="who_immunization_tool", fn=who_immunization_tool),
         # Section-specific tools
+        FunctionTool.from_defaults(name="program_overview_tool", fn=program_overview_tool),
+        FunctionTool.from_defaults(name="disease_info_tool", fn=disease_info_tool),
+        FunctionTool.from_defaults(name="vaccine_properties_tool", fn=vaccine_properties_tool),
+        FunctionTool.from_defaults(name="catch_up_vaccination_tool", fn=catch_up_vaccination_tool),
+        FunctionTool.from_defaults(name="special_populations_tool", fn=special_populations_tool),
+        FunctionTool.from_defaults(name="cold_chain_tool", fn=cold_chain_tool),
+        FunctionTool.from_defaults(name="injection_safety_tool", fn=injection_safety_tool),
+        FunctionTool.from_defaults(name="session_management_tool", fn=session_management_tool),
+        FunctionTool.from_defaults(name="planning_and_logistics_tool", fn=planning_and_logistics_tool),
+        FunctionTool.from_defaults(name="communication_tool", fn=communication_tool),
     ]
+    print(f"✅ Created {len(tools)} tools with improved routing descriptions")
     return tools
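The ten section tools above share one shape: check that the retriever exists, then delegate to `section_tool_wrapper`. A closure-based factory is one possible way to express that pattern once. This is a sketch under stated assumptions: `make_section_tool` and `fake_wrapper` are hypothetical names, and `fake_wrapper` only stands in for `section_tool_wrapper`, whose body is not shown in this diff.

```python
from typing import Callable, Dict, Optional


def make_section_tool(
    label: str,
    retriever: Optional[object],
    source_path: str,
    wrapper: Callable[[object, str, str], str],
) -> Callable[[str], str]:
    """Build a query function for one section (hypothetical helper)."""

    def tool(query: str) -> str:
        # Same guard as the hand-written tools: fail softly if the
        # retriever could not be created at startup
        if retriever is None:
            return f"{label} retriever not available"
        return wrapper(retriever, source_path, query)

    tool.__name__ = label.lower().replace(" ", "_") + "_tool"
    return tool


def fake_wrapper(retriever: object, path: str, query: str) -> str:
    """Stand-in for section_tool_wrapper; the real body is not in this diff."""
    return f"{path}: {query}"


# Section 9 is deliberately left without a retriever to show the guard
retrievers: Dict[str, Optional[object]] = {"one": object(), "nine": None}
paths = {
    "one": "./data/section_one_chunks.json",
    "nine": "./data/section_nine_chunks.json",
}
tools = {
    key: make_section_tool(f"Section {number}", retrievers[key], paths[key], fake_wrapper)
    for key, number in [("one", 1), ("nine", 9)]
}

print(tools["one"]("BCG schedule"))
print(tools["nine"]("stock management"))
```

The trade-off is that the routing descriptions, which this commit keeps in each function's docstring, would have to be supplied separately when registering the tools, for example through the `description` argument of `FunctionTool.from_defaults`.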
 def prepare_environment():

rag_pipeline.py CHANGED

@@ -121,7 +121,7 @@ You provide evidence-based guidance using only information from official vaccine
 Answer the doctor's question accurately and concisely using only the provided information.
 ## FALLBACK MODE INSTRUCTIONS
-- You have access to only 2 powerful tools: Guide_vector_tool (Algerian National Vaccination Guide) and Immunization_in_Practice_tool (WHO global guidance).
 - **MANDATORY TOOL USAGE**: Always use the relevant tool(s) to search for information before answering, even if you initially think no information is available.
 - Be direct and efficient - search once with each tool if needed, then provide your answer.
 - Do not overthink or search repeatedly - these tools are comprehensive.
@@ -132,7 +132,7 @@ Answer the doctor's question accurately and concisely using only the provided in
 1. For each fact in your response, include an inline citation in the format [Source ID] immediately following the information, e.g., [e795ebd28318886c0b1a5395ac30ad90].
 2. The Source ID must be the exact alphanumeric identifier from the search results, NOT the tool name or any other text.
 3. Do NOT use 'Source:' in the citation format; use only the Source ID in square brackets.
-4. Do NOT use tool names (like Guide_vector_tool, Immunization_in_Practice_tool) as citations.
 5. If a fact is supported by multiple sources, use adjacent citations: [e795ebd28318886c0b1a5395ac30ad90][21a932b2340bb16707763f57f0ad2]
 6. Use ONLY the provided information from tool outputs and never include facts from your general knowledge.
@@ -146,20 +146,18 @@ Answer the doctor's question accurately and concisely using only the provided in
 ### CRITICAL: Efficient Fallback Strategy
 1. **MANDATORY SEARCH**: Use each relevant tool at least once to search for information, even if you suspect the information might not be available.
-2. **BREAK DOWN COMPLEX QUERIES**: For comparative or multi-part questions (e.g., comparing Algerian and WHO guidelines), break the query into sub-queries and use the appropriate tool for each part:
-   - Use Guide_vector_tool for Algerian-specific information (e.g., national schedules, coverage targets).
-   - Use Immunization_in_Practice_tool for WHO-specific information (e.g., global recommendations, coverage targets).
 3. **DO NOT STOP PREMATURELY**: Do not conclude "no information is available" without using the relevant tool(s) to search for the answer.
 4. **BE DECISIVE**: Once you find relevant information for each sub-query, formulate your response immediately.
 5. **ANSWER FULLY**: Address all parts of the question, using multiple tools if required by the query.
 ### Response Guidelines
 - **MANDATORY TOOL SELECTION**:
-  - For queries mentioning "WHO," "World Health Organization," "international," "global guidance," or WHO documents (e.g., page numbers), use Immunization_in_Practice_tool first.
-  - For queries mentioning "Algerian," "national guide," or Algerian-specific terms (e.g., page numbers), use Guide_vector_tool first.
-  - For comparative queries (e.g., Algerian vs. WHO), use both Guide_vector_tool and Immunization_in_Practice_tool, addressing each part systematically.
 - **EXPLICIT REASONING**: Before answering, log your reasoning steps, including which tools you will use and why, based on the query’s content.
-- **Query Decomposition**: Break comparative or multi-part queries into sub-queries (e.g., one for Algerian information, one for WHO information) and use the appropriate tool for each.
 - Provide all found information with proper citations using Source IDs only.
 - If information is limited, clearly state: "Based on the available documents, I can provide the following information..." and indicate what is not available.
@@ -178,7 +176,7 @@ Answer the doctor's question accurately and concisely using only the provided in
 1. For each fact in your response, include an inline citation in the format [Source ID] immediately following the information, e.g., [e795ebd28318886c0b1a5395ac30ad90].
 2. The Source ID must be the exact alphanumeric identifier from the search results, NOT the tool name or any other text.
 3. Do NOT use 'Source:' in the citation format; use only the Source ID in square brackets.
-4. Do NOT use tool names (like Guide_vector_tool, Immunization_in_Practice_tool) as citations.
 5. If a fact is supported by multiple sources, use adjacent citations: [e795ebd28318886c0b1a5395ac30ad90][21a932b2340bb16707763f57f0ad2]
 6. Use ONLY the provided information from tool outputs and never include facts from your general knowledge.
@@ -193,28 +191,23 @@ Answer the doctor's question accurately and concisely using only the provided in
 ### CRITICAL: Efficient Response Strategy
 1. **MANDATORY SEARCH**: Always use the relevant tool(s) to search for information before answering, even if you initially think no information is available.
 2. **MANDATORY TOOL SELECTION**:
-   - For queries mentioning "WHO," "World Health Organization," "international," "global guidance," or WHO documents (e.g., page numbers), use Immunization_in_Practice_tool first.
-   - For queries mentioning "Algerian," "national guide," or Algerian-specific terms (e.g., page numbers), use Guide_vector_tool first.
-   - For comparative queries (e.g., Algerian vs. WHO), use both Guide_vector_tool and Immunization_in_Practice_tool, addressing each part systematically.
-3. **Query Decomposition**: Break comparative or multi-part queries into sub-queries (e.g., one for Algerian information, one for WHO information) and use the appropriate tool for each.
 4. **DO NOT STOP PREMATURELY**: Do not conclude "no information is available" without using the relevant tool(s) to search for the answer.
-5. **EXPLICIT REASONING**: Before answering, log your reasoning steps, including which tools you will use and why, based on the query’s content.
-6. **BE DECISIVE**: Once you find relevant information for each sub-query, formulate your response immediately.
-7. **ANSWER FULLY**: Address all parts of the question, using multiple tools if required by the query.
-8. **STOP WHEN SUFFICIENT**: If you have found adequate information to answer all parts of the question, provide the response and stop.
 ### Response Guidelines for Complex Questions
-- For comparative questions: Break the query into sub-queries (e.g., Algerian vs. WHO), use Guide_vector_tool for Algerian specifics and Immunization_in_Practice_tool for WHO specifics, then provide the comparison.
 - For multi-part questions: Address each part systematically, using the appropriate tool for each sub-query.
 - If information is not found after using the relevant tool(s): State clearly: "Based on the available documents, I can provide the following information..." and specify what is not available.
-- Do not repeatedly search for the same terms or rephrase searches excessively.
-### When Information is Limited
-If you cannot find complete information to fully answer a question:
-1. Provide whatever relevant information you did find with proper citations using Source IDs only.
-2. Clearly state: "Based on the available documents, I can provide the following information..."
-3. Indicate what specific information is not available: "However, information about [specific topic] was not found in the provided documents after searching with the relevant tool(s)."
-4. Do not conclude "no information is available" without attempting a search with the appropriate tool(s).
 ---
 """
@@ -243,11 +236,13 @@ If you cannot find complete information to fully answer a question:
         print(f"[LOG] ⚠️ Using fallback prompt template for {'fallback' if is_fallback else 'standard'} agent")
         return PromptTemplate(template=safe_template)
 def create_agent(tools, llm, is_fallback=False):
     """Create the ReAct agent with custom prompt"""
     agent_type = "FALLBACK" if is_fallback else "STANDARD"
-    max_iter = 3 if is_fallback else 8
     print(f"[LOG] Creating {agent_type} ReAct agent with {len(tools)} tools and max_iterations={max_iter}")
@@ -256,14 +251,14 @@ def create_agent(tools, llm, is_fallback=False):
         tools,
         llm=llm,
         verbose=True,
-        max_iterations=max_iter,  # Reduced iterations for fallback agent
     )
-    # Create and apply appropriate custom prompt
     try:
         safe_custom_prompt = create_safe_custom_prompt(tools, llm, is_fallback=is_fallback)
         agent.update_prompts({"agent_worker:system_prompt": safe_custom_prompt})
-        print(f"✅ Successfully updated {agent_type} agent with custom prompt")
     except Exception as e:
         print(f"❌ {agent_type} agent prompt update failed: {e}")
         print(f"⚠️  Using original {agent_type} agent without modifications")
@@ -273,16 +268,16 @@ def create_agent(tools, llm, is_fallback=False):
 def create_fallback_tools(all_tools):
-    """Extract only the guide_retrieval_tool and immunization_tool for fallback agent"""
-    print("[LOG] Creating fallback tools (guide + immunization only)")
     fallback_tools = []
     tool_names_found = []
     for tool in all_tools:
         tool_name = tool.metadata.name if hasattr(tool, 'metadata') else str(tool)
-        if tool_name in ["Guide_vector_tool", "Immunization_in_Practice_tool"]:
             fallback_tools.append(tool)
             tool_names_found.append(tool_name)
@@ -333,7 +328,14 @@ def initialize_rag_pipeline(tools):
 def detect_max_iterations_error(response_text):
-    """Detect if the response indicates a max iterations error"""
     max_iteration_indicators = [
         "max iterations",
@@ -343,11 +345,10 @@ def detect_max_iterations_error(response_text):
         "iteration limit"
     ]
-    response_lower = response_text.lower()
-    # Check for max iterations indicators
     for indicator in max_iteration_indicators:
         if indicator in response_lower:
             return True
     # Check for very short or empty responses (often indicates failure)
@@ -388,7 +389,7 @@ def process_question(agents_dict, question: str) -> str:
         # Check if we need to use fallback
         if detect_max_iterations_error(response_text):
-            print("[LOG] 🔄 Max iterations detected, switching to FALLBACK AGENT...")
             if fallback_agent is None:
                 print("[LOG] ❌ Fallback agent not available, returning error message")
@@ -418,7 +419,7 @@ def process_question(agents_dict, question: str) -> str:
                 # Check if fallback also failed
                 if detect_max_iterations_error(fallback_text):
-                    print("[LOG] ❌ Fallback agent also hit max iterations")
                     return ("I apologize, but I'm having difficulty finding specific information about your question in the available documents. "
                            "Please try asking a more specific question or rephrasing your query.")
@@ -500,16 +501,17 @@ def process_question_with_sequential_citations(agents_dict, question: str, chunk
     print(f"[LOG] Chunks directory: {chunks_directory}")
     start_time = time.time()
-    used_fallback = False
     try:
         # Get the response using the enhanced process_question function
         response_text = process_question(agents_dict, question)
-        # Check if this looks like a fallback was used (simple heuristic)
-        if "fallback" in response_text.lower() or len(response_text) < 50:
             used_fallback = True
-            print("[LOG] 🛡️ Fallback agent was likely used")
         agent_time = time.time() - start_time
         print(f"[LOG] Agent processing completed in {agent_time:.2f} seconds")
@@ -533,6 +535,10 @@ def process_question_with_sequential_citations(agents_dict, question: str, chunk
         for json_file in min_chunks_files:
             json_path = os.path.join(chunks_directory, json_file)
             print(f"[LOG] Loading {json_file}...")
             try:
                 with open(json_path, "r", encoding="utf-8") as f:
@@ -548,23 +554,23 @@ def process_question_with_sequential_citations(agents_dict, question: str, chunk
         print("[LOG] Finding cited elements...")
         cited_elements_ordered = []
         for i, source_id in enumerate(unique_ids):  # This preserves the order
-            print(f"[LOG] Looking for source ID {i+1}/{len(unique_ids)}: {source_id}")
             found = False
             for element in all_chunks_data:
                 if element.get("type") == 'TableElement':
-                    if element.get("elements",{}).get("element_id") == source_id:
-                        cited_elements_ordered.append(element.get("elements",{}))
                         found = True
                         break
-                else:
-                    if "elements" in element:
-                        for nested_element in element["elements"]:
-                            if nested_element.get("element_id") == source_id:
-                                cited_elements_ordered.append(nested_element)
-                                found = True
-                                break
-                        else:
-                            continue
                         break
             if not found:
                 print(f"[LOG] ⚠️ Source ID {source_id} not found in chunks data")
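The lookup on the removed lines scans every chunk once per cited Source ID. One alternative is to build an `element_id` index once and then resolve citations in order of first appearance. The sketch below assumes the data shape visible in this diff (a `TableElement` chunk holds a single dict under `"elements"`, other chunks hold a list) and assumes Source IDs are hexadecimal strings in square brackets. `extract_source_ids`, `build_element_index`, the `"Other"` type, and the sample data are hypothetical and are not taken from the commit.

```python
import re
from typing import Any, Dict, List

# Assumption: Source IDs are lowercase hexadecimal, at least 16 characters
CITATION_RE = re.compile(r"\[([0-9a-f]{16,})\]")


def extract_source_ids(text: str) -> List[str]:
    """Return cited Source IDs in order of first appearance, without duplicates."""
    return list(dict.fromkeys(CITATION_RE.findall(text)))


def build_element_index(chunks: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Map element_id to element for table chunks and regular chunks alike."""
    index: Dict[str, Dict[str, Any]] = {}
    for chunk in chunks:
        nested = chunk.get("elements", [])
        # Table chunks store one dict; other chunks store a list of dicts
        candidates = [nested] if isinstance(nested, dict) else nested
        for element in candidates:
            element_id = element.get("element_id")
            if element_id is not None:
                # Keep the first occurrence, like the early break in the loop
                index.setdefault(element_id, element)
    return index


# Illustrative data only; IDs and texts are made up
chunks = [
    {"type": "TableElement",
     "elements": {"element_id": "aaaaaaaaaaaaaaaa1111", "text": "table"}},
    {"type": "Other",
     "elements": [{"element_id": "bbbbbbbbbbbbbbbb2222", "text": "para"}]},
]
answer = "Fact one [bbbbbbbbbbbbbbbb2222]. Fact two [aaaaaaaaaaaaaaaa1111][bbbbbbbbbbbbbbbb2222]."

ids = extract_source_ids(answer)
index = build_element_index(chunks)
cited = [index[source_id] for source_id in ids if source_id in index]
print([element["text"] for element in cited])
```

Building the index costs one pass over the chunks, after which each citation resolves with a dictionary lookup, and IDs missing from the index can still be logged as not found.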

 Answer the doctor's question accurately and concisely using only the provided information.
 ## FALLBACK MODE INSTRUCTIONS
+- You have access to only 2 powerful tools: general_guide_tool (Algerian National Vaccination Guide) and who_immunization_tool (WHO global guidance).
 - **MANDATORY TOOL USAGE**: Always use the relevant tool(s) to search for information before answering, even if you initially think no information is available.
 - Be direct and efficient - search once with each tool if needed, then provide your answer.
 - Do not overthink or search repeatedly - these tools are comprehensive.
 1. For each fact in your response, include an inline citation in the format [Source ID] immediately following the information, e.g., [e795ebd28318886c0b1a5395ac30ad90].
 2. The Source ID must be the exact alphanumeric identifier from the search results, NOT the tool name or any other text.
 3. Do NOT use 'Source:' in the citation format; use only the Source ID in square brackets.
+4. Do NOT use tool names (like general_guide_tool, who_immunization_tool) as citations.
 5. If a fact is supported by multiple sources, use adjacent citations: [e795ebd28318886c0b1a5395ac30ad90][21a932b2340bb16707763f57f0ad2]
 6. Use ONLY the provided information from tool outputs and never include facts from your general knowledge.
 ### CRITICAL: Efficient Fallback Strategy
 1. **MANDATORY SEARCH**: Use each relevant tool at least once to search for information, even if you suspect the information might not be available.
+2. **BREAK DOWN COMPLEX QUERIES**: For comparative or multi-part questions (e.g., comparing Algerian and WHO guidelines), break the query into sub-queries and use the appropriate tool for each part.
 3. **DO NOT STOP PREMATURELY**: Do not conclude "no information is available" without using the relevant tool(s) to search for the answer.
 4. **BE DECISIVE**: Once you find relevant information for each sub-query, formulate your response immediately.
 5. **ANSWER FULLY**: Address all parts of the question, using multiple tools if required by the query.
+6. **FINAL ANSWER**: Once you have your answer, present it directly. Do not output your internal 'thought' or 'action' steps. Your final output must be the synthesized answer itself.
 ### Response Guidelines
 - **MANDATORY TOOL SELECTION**:
+  - For queries mentioning "WHO," "World Health Organization," "international," "global guidance," or WHO documents, use who_immunization_tool first.
+  - For queries mentioning "Algerian," "national guide," or Algerian-specific terms, use general_guide_tool first.
+  - For comparative queries (e.g., Algerian vs. WHO), use both tools, addressing each part systematically.
 - **EXPLICIT REASONING**: Before answering, log your reasoning steps, including which tools you will use and why, based on the query’s content.
 - Provide all found information with proper citations using Source IDs only.
 - If information is limited, clearly state: "Based on the available documents, I can provide the following information..." and indicate what is not available.
 1. For each fact in your response, include an inline citation in the format [Source ID] immediately following the information, e.g., [e795ebd28318886c0b1a5395ac30ad90].
 2. The Source ID must be the exact alphanumeric identifier from the search results, NOT the tool name or any other text.
 3. Do NOT use 'Source:' in the citation format; use only the Source ID in square brackets.
+4. Do NOT use tool names (like general_guide_tool, cold_chain_tool) as citations.
 5. If a fact is supported by multiple sources, use adjacent citations: [e795ebd28318886c0b1a5395ac30ad90][21a932b2340bb16707763f57f0ad2]
 6. Use ONLY the provided information from tool outputs and never include facts from your general knowledge.
 ### CRITICAL: Efficient Response Strategy
 1. **MANDATORY SEARCH**: Always use the relevant tool(s) to search for information before answering, even if you initially think no information is available.
 2. **MANDATORY TOOL SELECTION**:
+   - For queries about global standards or WHO, use who_immunization_tool.
+   - For broad questions about the Algerian guide, use general_guide_tool.
+   - For specific topics like cold chain, disease info, etc., use the most specific tool (e.g., cold_chain_tool, disease_info_tool).
+3. **Query Decomposition**: Break comparative or multi-part queries into sub-queries and use the appropriate tool for each.
 4. **DO NOT STOP PREMATURELY**: Do not conclude "no information is available" without using the relevant tool(s) to search for the answer.
+5. **EXPLICIT REASONING**: Before answering, log your reasoning steps, including which tools you will use and why.
+6. **BE DECISIVE**: Once you find relevant information, formulate your response.
+### Final Answer Generation
+- **STOP WHEN SUFFICIENT**: Once you have gathered enough information from the tools to answer the user's question completely, you MUST stop using tools and formulate a final answer.
+- **SYNTHESIZE THE ANSWER**: Formulate a comprehensive, final answer based ONLY on the observed tool outputs.
+- **PRESENT CLEANLY**: Present this final answer directly to the user. Your final output must be the answer itself, not your internal 'thought' or 'action' steps.
 ### Response Guidelines for Complex Questions
+- For comparative questions: Break the query into sub-queries, use the appropriate tools, then provide the comparison.
 - For multi-part questions: Address each part systematically, using the appropriate tool for each sub-query.
 - If information is not found after using the relevant tool(s): State clearly: "Based on the available documents, I can provide the following information..." and specify what is not available.
 ---
 """
         print(f"[LOG] ⚠️ Using fallback prompt template for {'fallback' if is_fallback else 'standard'} agent")
         return PromptTemplate(template=safe_template)
 def create_agent(tools, llm, is_fallback=False):
     """Create the ReAct agent with custom prompt"""
     agent_type = "FALLBACK" if is_fallback else "STANDARD"
+    # FIX: Increased max_iterations to give the agent more steps to reason
+    max_iter = 15
     print(f"[LOG] Creating {agent_type} ReAct agent with {len(tools)} tools and max_iterations={max_iter}")
         tools,
         llm=llm,
         verbose=True,
+        max_iterations=max_iter,
     )
+    # Create and apply safe custom prompt
     try:
         safe_custom_prompt = create_safe_custom_prompt(tools, llm, is_fallback=is_fallback)
         agent.update_prompts({"agent_worker:system_prompt": safe_custom_prompt})
+        print(f"✅ Successfully updated {agent_type} agent with safe custom prompt")
     except Exception as e:
         print(f"❌ {agent_type} agent prompt update failed: {e}")
         print(f"⚠️  Using original {agent_type} agent without modifications")
 def create_fallback_tools(all_tools):
+    """Extract only the general_guide_tool and who_immunization_tool for fallback agent"""
+    print("[LOG] Creating fallback tools (guide + WHO only)")
     fallback_tools = []
     tool_names_found = []
     for tool in all_tools:
         tool_name = tool.metadata.name if hasattr(tool, 'metadata') else str(tool)
+        if tool_name in ["general_guide_tool", "who_immunization_tool"]:
             fallback_tools.append(tool)
             tool_names_found.append(tool_name)
 def detect_max_iterations_error(response_text):
+    """Detect if the response indicates a max iterations error OR is an unfinished thought."""
+    response_lower = response_text.lower().strip()
+    # **FIX**: Check if the response is the agent's raw thought process.
+    if response_lower.startswith("a:```thought") or response_lower.startswith("```thought"):
+        print("[LOG] Detected unfinished agent thought process.")
+        return True
     max_iteration_indicators = [
         "max iterations",
         "iteration limit"
     ]
+    # Check for explicit max iterations indicators
     for indicator in max_iteration_indicators:
         if indicator in response_lower:
+            print(f"[LOG] Detected max iteration indicator: '{indicator}'")
             return True
     # Check for very short or empty responses (often indicates failure)
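The detection logic in this hunk can be sketched as one self-contained function. This is a minimal sketch, not the author's full implementation: the name `looks_unfinished` and the `min_length` threshold are assumptions, and only the two indicators visible in the diff are used.

```python
def looks_unfinished(response_text: str, min_length: int = 20) -> bool:
    """Return True when a response looks like a failed or unfinished agent run."""
    text = response_text.lower().strip()

    # Raw ReAct scratchpad leaked into the answer instead of a final response.
    if text.startswith(("a:```thought", "```thought")):
        return True

    # Explicit iteration-limit messages (the two indicators shown in the diff).
    if any(marker in text for marker in ("max iterations", "iteration limit")):
        return True

    # Hypothetical threshold: very short or empty answers are treated as failures.
    return len(text) < min_length
```

Passing a tuple to `str.startswith` checks both prefixes in one call, which keeps the prefix list easy to extend.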
         # Check if we need to use fallback
         if detect_max_iterations_error(response_text):
+            print("[LOG] 🔄 Max iterations or unfinished thought detected, switching to FALLBACK AGENT...")
             if fallback_agent is None:
                 print("[LOG] ❌ Fallback agent not available, returning error message")
                 # Check if fallback also failed
                 if detect_max_iterations_error(fallback_text):
+                    print("[LOG] ❌ Fallback agent also hit max iterations or failed to produce an answer.")
                     return ("I apologize, but I'm having difficulty finding specific information about your question in the available documents. "
                            "Please try asking a more specific question or rephrasing your query.")
     print(f"[LOG] Chunks directory: {chunks_directory}")
     start_time = time.time()
+    used_fallback = False  # This flag is a heuristic
     try:
         # Get the response using the enhanced process_question function
         response_text = process_question(agents_dict, question)
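+        # Heuristic: flag fallback use if the returned text contains the switch message.
+        # That message appears to be printed to the log rather than returned, so this check may never fire.
+        # A more robust way would be for `process_question` to return a tuple (response, used_fallback)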
+        if "switching to fallback agent" in response_text.lower():
             used_fallback = True
+            print("[LOG] 🛡️ Fallback agent was likely used based on log indicators.")
         agent_time = time.time() - start_time
         print(f"[LOG] Agent processing completed in {agent_time:.2f} seconds")
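The comment in this hunk suggests returning a tuple instead of inspecting the response text. A minimal sketch of that idea follows; `answer_with_fallback` and its callable parameters are hypothetical stand-ins for the real agents and for `detect_max_iterations_error`, not code from this repository.

```python
from typing import Callable, Optional, Tuple


def answer_with_fallback(
    question: str,
    primary: Callable[[str], str],
    fallback: Optional[Callable[[str], str]],
    is_failure: Callable[[str], bool],
) -> Tuple[str, bool]:
    """Return (response_text, used_fallback) instead of a bare string."""
    text = primary(question)
    if not is_failure(text):
        return text, False

    # Primary agent failed; without a fallback there is nothing else to try.
    if fallback is None:
        return "Fallback agent not available.", False

    return fallback(question), True
```

With this shape the caller can unpack `response_text, used_fallback = ...` and no longer depends on a log message appearing in the answer.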
         for json_file in min_chunks_files:
             json_path = os.path.join(chunks_directory, json_file)
+            if not os.path.exists(json_path):
+                print(f"[LOG] ⚠️ Skipping non-existent file: {json_path}")
+                continue
             print(f"[LOG] Loading {json_file}...")
             try:
                 with open(json_path, "r", encoding="utf-8") as f:
         print("[LOG] Finding cited elements...")
         cited_elements_ordered = []
         for i, source_id in enumerate(unique_ids):  # This preserves the order
+            # print(f"[LOG] Looking for source ID {i+1}/{len(unique_ids)}: {source_id}") # This is too verbose for normal operation
             found = False
             for element in all_chunks_data:
+                # Handle TableElement structure
                 if element.get("type") == 'TableElement':
+                    if element.get("elements", {}).get("element_id") == source_id:
+                        cited_elements_ordered.append(element.get("elements", {}))
                         found = True
                         break
+                # Handle other element structures
+                elif "elements" in element and isinstance(element["elements"], list):
+                    for nested_element in element["elements"]:
+                        if isinstance(nested_element, dict) and nested_element.get("element_id") == source_id:
+                            cited_elements_ordered.append(nested_element)
+                            found = True
+                            break
+                    if found:
                         break
             if not found:
                 print(f"[LOG] ⚠️ Source ID {source_id} not found in chunks data")
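The nested lookup above scans every chunk once per source ID. One alternative, sketched here under the assumption that chunks have the two layouts handled in the loop (a dict under `elements` for `TableElement`, a list otherwise), is to build an index once and look IDs up in it. The helper names are hypothetical.

```python
def build_element_index(all_chunks_data):
    """Map element_id -> element for both chunk layouts handled by the loop."""
    index = {}
    for chunk in all_chunks_data:
        nested = chunk.get("elements")
        if chunk.get("type") == "TableElement" and isinstance(nested, dict):
            candidates = [nested]
        elif isinstance(nested, list):
            candidates = [e for e in nested if isinstance(e, dict)]
        else:
            candidates = []
        for element in candidates:
            element_id = element.get("element_id")
            if element_id is not None:
                # setdefault keeps the first match, like the loop's `break`.
                index.setdefault(element_id, element)
    return index


def find_cited_elements(unique_ids, all_chunks_data):
    """Return cited elements in the order of unique_ids, skipping unknown IDs."""
    index = build_element_index(all_chunks_data)
    return [index[source_id] for source_id in unique_ids if source_id in index]
```

This keeps citation order while replacing the repeated scan with a dictionary lookup; the warning for missing IDs would still need to be logged by the caller.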