Ahmd1 committed e37d541 (0 parents)

Legal Assistant with RAG evaluation
.gitignore ADDED
@@ -0,0 +1,51 @@
```
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python

# Virtual Environment
venv/
env/
ENV/

# Environment variables
.env
.env.local

# Reranker model files
reranker/

# Vector database
chroma_db/

# Data files - CSV
*.csv

# Data files - JSON (exclude all except specific test file)
*.json
!test_dataset_5_questions.json

# Markdown files (exclude all except README)
*.md
!README.md

# Wheel files
*.whl

# IDE
.vscode/
.idea/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# Logs
*.log

# Jupyter Notebook checkpoints
.ipynb_checkpoints/
```
README.md ADDED
@@ -0,0 +1,383 @@
# ⚖️ Constitutional Legal Assistant - Egyptian Constitution Chatbot

An intelligent RAG-based chatbot for answering questions about the Egyptian Constitution in Arabic.

---

## 📁 Project Structure

```
Chatbot_me/
├── app_final.py                    # Main Streamlit app (v1 - basic)
├── app_final_pheonix.py            # Streamlit app with Phoenix tracing
├── app_final_updated.py            # Latest production version with improvements
├── evaluate_rag.py                 # RAG evaluation with RAGAS metrics (simplified output)
├── evaluate.py                     # Full standalone evaluation script
├── requirements.txt                # Python dependencies
├── .env                            # Environment variables (create this - NOT in repo)
├── .gitignore                      # Git ignore rules
├── test_dataset_5_questions.json   # Test dataset (5 questions from different categories)
├── data/                           # Legal documents (NOT in repo)
│   ├── Egyptian_Constitution_legalnature_only.json
│   ├── Egyptian_Civil.json
│   ├── Egyptian_Labour_Law.json
│   ├── Egyptian_Personal Status Laws.json
│   ├── Technology Crimes Law.json
│   └── قانون_الإجراءات_الجنائية.json
├── chroma_db/                      # Vector database (auto-generated - NOT in repo)
├── reranker/                       # Arabic reranker model files (NOT in repo)
│   ├── model.safetensors
│   ├── config.json
│   └── ...
└── *.whl                           # Local wheel packages for Phoenix (NOT in repo)
```

---

## 🚀 Quick Start

### Step 1: Create a Virtual Environment (Recommended)

```powershell
# Create virtual environment
python -m venv venv

# Activate it (Windows PowerShell)
.\venv\Scripts\Activate.ps1

# Or (Windows CMD)
.\venv\Scripts\activate.bat
```

### Step 2: Install Dependencies

```powershell
# Install all requirements
pip install -r requirements.txt
```

### Step 3: Install Local Wheel Packages (For Phoenix Tracing)

```powershell
# Install OpenInference instrumentation packages
pip install openinference_instrumentation_langchain-0.1.56-py3-none-any.whl
pip install openinference_instrumentation_openai-0.1.41-py3-none-any.whl
```

### Step 4: Create the `.env` File

Create a `.env` file in the project root with:

```env
# Required: Groq API key (get one from https://console.groq.com)
GROQ_API_KEY=gsk_your_groq_api_key_here

# Optional: for Phoenix tracing
PHOENIX_OTLP_ENDPOINT=http://localhost:6006/v1/traces
PHOENIX_SERVICE_NAME=constitutional-assistant
```

---

## 🏃 Running the Applications

### 1. Run the Latest Production App (`app_final_updated.py`) ⭐ RECOMMENDED

The most recent version, with improved prompt engineering and decision-tree logic:

```powershell
streamlit run app_final_updated.py
```

Then open: **http://localhost:8501**

**Features:**
- Enhanced Arabic RTL support
- Improved decision tree for handling different question types
- Better handling of procedural vs. constitutional questions
- Cleaner response formatting

---

### 2. Run the Basic App (`app_final.py`)

The original version:

```powershell
streamlit run app_final.py
```

Then open: **http://localhost:8501**

---

### 3. Run the App with Phoenix Tracing (`app_final_pheonix.py`)

This version includes observability/tracing with Phoenix.

#### Step A: Start the Phoenix Server First

```powershell
# In a separate terminal
python -m phoenix.server.main serve
```

The Phoenix UI will be at: **http://localhost:6006**

#### Step B: Run the App

```powershell
streamlit run app_final_pheonix.py
```

Then open:
- **App**: http://localhost:8501
- **Phoenix traces**: http://localhost:6006

---

### 4. Run the Evaluation (`evaluate_rag.py`) ⭐ NEW SIMPLIFIED FORMAT

Evaluate the RAG system with simplified output showing only the essential information:

```powershell
# Uses the default test dataset (test_dataset_5_questions.json)
python evaluate_rag.py

# With a custom test file
python evaluate_rag.py path/to/your_test.json

# Set via environment variable
set QA_FILE_PATH=test_dataset_5_questions.json
python evaluate_rag.py
```

**Output files:**
- `evaluation_breakdown.json` - **simplified format** with:
  - Question
  - Ground truth
  - Actual answer
  - Score (average of all metrics per question)
  - Average score across all questions
- `evaluation_results.json` - detailed metrics breakdown
- `evaluation_detailed.json` - full raw evaluation data

**Sample output format:**
```json
{
  "questions": [
    {
      "question": "ما الطبيعة القانونية لحق العمل في الدستور المصري؟",
      "ground_truth": "حق أساسي/حرية: العمل حق وواجب...",
      "actual_answer": "حسب المادة (12) من الدستور المصري...",
      "score": 0.8542
    }
  ],
  "average_score": 0.8542
}
```
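The reported `average_score` is simply the mean of the per-question `score` values. A small sanity-check sketch that recomputes it from a breakdown dict shaped like the sample above (the sample values here are made up):

```python
def summarize_breakdown(breakdown):
    """Recompute the average score from an evaluation_breakdown.json-style dict."""
    scores = [q["score"] for q in breakdown["questions"]]
    return round(sum(scores) / len(scores), 4) if scores else 0.0

sample = {
    "questions": [
        {"question": "q1", "ground_truth": "g1", "actual_answer": "a1", "score": 0.9},
        {"question": "q2", "ground_truth": "g2", "actual_answer": "a2", "score": 0.7},
    ],
    "average_score": 0.8,
}
print(summarize_breakdown(sample))  # → 0.8
```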

**⚠️ Note:** This script waits **60 seconds** between questions to avoid Groq API rate limits.

---

### 5. Run the Full Evaluation (`evaluate.py`)

A more comprehensive evaluation with an external test dataset and rate limiting:

```powershell
# Basic run (uses test_dataset.json)
python evaluate.py

# With a custom test file
python evaluate.py test_dataset_small.json

# With custom test and output files
python evaluate.py test_dataset_small.json my_results.json
```

**⚠️ Note:** This script waits **2 minutes** between questions to avoid Groq API rate limits.

---

## 📊 Test Dataset

The project includes a curated test dataset with 5 questions covering different legal categories.

**`test_dataset_5_questions.json`** includes:
1. **الدستور (Constitution)** - constitutional rights and principles
2. **قانون العمل (Labour Law)** - workplace rights and regulations
3. **الإجراءات الجنائية (Criminal Procedures)** - criminal law procedures
4. **جرائم تقنية المعلومات (Technology Crimes)** - cybercrime laws
5. **الأحوال الشخصية (Personal Status Laws)** - family law matters

This diverse dataset ensures testing across all major legal domains covered by the system.

---

## 📊 Understanding RAGAS Metrics

The evaluation system uses RAGAS metrics to assess the quality of the RAG pipeline. The simplified output combines these into a single score per question:

| Metric | Description | Good score |
|--------|-------------|------------|
| **faithfulness** | Is the answer grounded in the retrieved context? | > 0.7 |
| **answer_relevancy** | Does the answer address the question? | > 0.8 |
| **context_precision** | How much of the retrieved context was useful? | > 0.6 |
| **context_recall** | Was all the needed information retrieved? | > 0.7 |

**Question Score** = average of all four metrics (0-1 scale)

**Overall Score** = average of all question scores
233
+ ---
234
+
235
+ ## � Repository Structure & Git
236
+
237
+ ### Files NOT Included in Repository (via `.gitignore`)
238
+
239
+ The following files are excluded from version control for security, size, or privacy reasons:
240
+
241
+ 1. **`reranker/`** - Large model files (download separately or train locally)
242
+ 2. **`__pycache__/`** - Python compiled bytecode
243
+ 3. **`chroma_db/`** - Vector database (auto-generated on first run)
244
+ 4. **`.env`** - Environment variables with API keys (NEVER commit this!)
245
+ 5. **`*.json`** - All JSON files EXCEPT `test_dataset_5_questions.json`
246
+ 6. **`*.csv`** - CSV data files
247
+ 7. **`*.md`** - All markdown files EXCEPT `README.md`
248
+ 8. **`*.whl`** - Wheel package files
249
+
250
+ ### First-Time Setup
251
+
252
+ When cloning this repository, you'll need to:
253
+
254
+ 1. **Create `.env` file** with your API keys
255
+ 2. **Download/prepare data files** in the `data/` folder
256
+ 3. **Download reranker model** to `reranker/` folder
257
+ 4. **Install dependencies** from `requirements.txt`
258
+ 5. **Run the app** - ChromaDB will auto-generate on first run
259
+
260
+ ---
261
+
262
+ ## �🔧 Troubleshooting
263
+
264
+ ### "GROQ_API_KEY not found"
265
+ Make sure your `.env` file exists and contains:
266
+ ```env
267
+ GROQ_API_KEY=gsk_your_key_here
268
+ ```
269
+
270
+ ### "Reranker path not found"
271
+ Ensure the `reranker/` folder exists with model files:
272
+ ```
273
+ reranker/
274
+ ├── model.safetensors
275
+ ├── config.json
276
+ ├── tokenizer.json
277
+ └── ...
278
+ ```
279
+
280
+ ### "Phoenix connection refused"
281
+ Start Phoenix server first:
282
+ ```powershell
283
+ python -m phoenix.server.main serve
284
+ ```
285
+
286
+ ### Rate Limit Errors (Groq)
287
+ - Wait a few minutes and try again
288
+ - Use `test_dataset_small.json` for fewer questions
289
+ - The `evaluate.py` script has built-in 2-minute delays
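
The built-in delays avoid the limits preemptively; if you still hit rate-limit errors, a generic retry-with-backoff wrapper is another option. This is a sketch only — the exception type to catch depends on the Groq client, and the function you wrap is your own call, not part of this repo:

```python
import time

def with_retries(fn, max_attempts=4, base_delay=5.0):
    """Call fn(); on a rate-limit error, wait with exponential backoff and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RuntimeError as err:  # substitute the Groq client's rate-limit exception
            if attempt == max_attempts:
                raise
            delay = base_delay * (2 ** (attempt - 1))  # 5s, 10s, 20s, ...
            print(f"Rate limited ({err}); retrying in {delay:.0f}s")
            time.sleep(delay)
```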

### Import Errors
```powershell
# Reinstall all dependencies
pip install -r requirements.txt --force-reinstall
```

---

## 📝 API Keys Required

| Service | Purpose | Get a key from |
|---------|---------|----------------|
| **Groq** | LLM (Llama 3.1 8B) | https://console.groq.com |
| **HuggingFace** | Embeddings (auto-download) | No key needed |

---

## 🔄 How the System Works

```
User Question (Arabic)
        ↓
┌─────────────────────────────────┐
│  Decision Tree Logic            │
│  (app_final_updated.py)         │
│  ├── Constitutional questions   │
│  ├── Procedural questions       │
│  ├── General legal advice       │
│  └── Out-of-scope filtering     │
└─────────────────────────────────┘
        ↓
┌─────────────────────────────────┐
│  Hybrid Retrieval (RRF)         │
│  ├── Semantic Search (50%)      │
│  ├── BM25 Keyword (30%)         │
│  └── Metadata Filter (20%)      │
└─────────────────────────────────┘
        ↓
┌─────────────────────────────────┐
│  Cross-Reference Expansion      │
│  (Fetch related articles)       │
└─────────────────────────────────┘
        ↓
┌─────────────────────────────────┐
│  Arabic Reranker (ARM-V1)       │
│  (Select top 5 most relevant)   │
└─────────────────────────────────┘
        ↓
┌─────────────────────────────────┐
│  LLM (Llama 3.1 via Groq)       │
│  (Generate Arabic answer)       │
│  - Separate system/user prompts │
│  - Citation with article numbers│
│  - Temperature: 0.3             │
└─────────────────────────────────┘
        ↓
    Final Answer
```
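The retrieval stage merges three ranked lists with weighted Reciprocal Rank Fusion: each list contributes `weight / (k + rank)` per document, and documents are re-sorted by the summed score. A minimal sketch using the 50/30/20 weights (the document IDs here are illustrative):

```python
def weighted_rrf(ranked_lists, weights, k=60, top_k=5):
    """Fuse ranked lists of doc IDs: score(d) = sum over lists of weight / (k + rank)."""
    scores = {}
    for docs, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(docs, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first
    return [d for d, _ in sorted(scores.items(), key=lambda x: x[1], reverse=True)][:top_k]

semantic = ["art_53", "art_12", "art_9"]
bm25     = ["art_12", "art_53", "art_4"]
metadata = ["art_12", "art_9"]
print(weighted_rrf([semantic, bm25, metadata], weights=[0.5, 0.3, 0.2]))
# → ['art_12', 'art_53', 'art_9', 'art_4']
```

`art_12` wins because it appears in all three lists; the constant `k=60` dampens the advantage of the very top ranks, which is the standard RRF setting.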

---

## 📋 Version History

### Latest Updates (Feb 2026)
- ✅ Added `app_final_updated.py` with improved decision-tree logic
- ✅ Simplified evaluation output (question, ground_truth, answer, score)
- ✅ Created a curated 5-question test dataset covering 5 legal categories
- ✅ Added a comprehensive `.gitignore` for repository management
- ✅ Updated documentation with all recent changes
- ✅ Improved Arabic RTL support and number formatting

### Previous Features
- Multi-source legal document support (Constitution, Civil, Labour, etc.)
- Hybrid retrieval with RRF (Reciprocal Rank Fusion)
- Arabic-specific reranker integration
- Phoenix tracing for observability
- RAGAS-based evaluation system

---

## 📞 Support

For issues, check that:
1. The `.env` file has the correct API keys
2. All dependencies are installed
3. The `reranker/` folder exists and contains the model files
4. You have an internet connection for API calls

---

## 📄 License

This project is for educational purposes - Egyptian Constitution Legal Assistant.
app_final.py ADDED
@@ -0,0 +1,625 @@
```python
# -*- coding: utf-8 -*-
import os
import sys
import json
from dotenv import load_dotenv
import streamlit as st
import logging
import warnings

# Suppress progress bars from transformers/tqdm
os.environ['TRANSFORMERS_NO_PROGRESS_BAR'] = '1'
warnings.filterwarnings('ignore')

# 1. Loaders & Splitters
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.retrievers import BaseRetriever
from langchain_core.callbacks import CallbackManagerForRetrieverRun
from typing import List
from rank_bm25 import BM25Okapi
import numpy as np

# 2. Vector Store & Embeddings
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

# 3. Reranker Imports
from langchain_classic.retrievers.document_compressors import CrossEncoderReranker
from langchain_classic.retrievers import ContextualCompressionRetriever
from langchain_community.cross_encoders import HuggingFaceCrossEncoder

# 4. LLM
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableParallel

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

load_dotenv()

# ==========================================
# 🎨 UI SETUP (CSS FOR ARABIC & RTL)
# ==========================================
st.set_page_config(page_title="المساعد القانوني", page_icon="⚖️")

# This CSS block fixes the "001" number issue and right alignment
st.markdown("""
<style>
/* Force the main app container to be Right-to-Left */
.stApp {
    direction: rtl;
    text-align: right;
}

/* Fix input fields to type from the right */
.stTextInput input {
    direction: rtl;
    text-align: right;
}

/* Fix chat message alignment */
.stChatMessage {
    direction: rtl;
    text-align: right;
}

/* Ensure proper paragraph spacing */
.stMarkdown p {
    margin: 0.5em 0 !important;
    line-height: 1.6;
    word-spacing: 0.1em;
}

/* Ensure numbers display correctly in RTL */
p, div, span, label {
    unicode-bidi: embed;
    direction: inherit;
    white-space: normal;
    word-wrap: break-word;
}

/* Force all content to respect RTL */
* {
    direction: rtl !important;
}

/* Preserve line breaks and spacing */
.stMarkdown pre {
    direction: rtl;
    text-align: right;
    white-space: pre-wrap;
    word-wrap: break-word;
}

/* Hide the "Deploy" button and standard menu for a cleaner look */
#MainMenu {visibility: hidden;}
footer {visibility: hidden;}

</style>
""", unsafe_allow_html=True)
```
```python
# Helper: convert Western digits to Eastern Arabic digits
def convert_to_eastern_arabic(text):
    """Converts 0123456789 to ٠١٢٣٤٥٦٧٨٩"""
    if not isinstance(text, str):
        return text
    western_numerals = '0123456789'
    eastern_numerals = '٠١٢٣٤٥٦٧٨٩'
    translation_table = str.maketrans(western_numerals, eastern_numerals)
    return text.translate(translation_table)

st.title("⚖️ المساعد القانوني الذكي (دستور مصر)")

# ==========================================
# 🚀 CACHED RESOURCE LOADING
# ==========================================
# This decorator tells Streamlit: "Run this ONCE and cache the result."
@st.cache_resource
def initialize_rag_pipeline():
    print("🔄 Initializing system...")
    print("📥 Loading data...")

    # 1. Load JSON
    json_path = "Egyptian_Constitution_legalnature_only.json"
    if not os.path.exists(json_path):
        raise FileNotFoundError(f"File not found: {json_path}")

    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)

    # Create a mapping of article numbers for cross-reference lookup
    article_map = {str(item['article_number']): item for item in data}

    docs = []
    for item in data:
        # Build the cross-reference section
        cross_ref_text = ""
        if item.get('cross_references') and len(item['cross_references']) > 0:
            cross_ref_text = "\nالمواد ذات الصلة (المراجع المتقاطعة): " + ", ".join(
                [f"المادة {ref}" for ref in item['cross_references']]
            )

        # Construct the page content
        page_content = f"""
رقم المادة: {item['article_number']}
النص الأصلي: {item['original_text']}
الشرح المبسط: {item['simplified_summary']}{cross_ref_text}
"""
        metadata = {
            "article_id": item['article_id'],
            "article_number": str(item['article_number']),
            "legal_nature": item['legal_nature'],
            "keywords": ", ".join(item['keywords']),
            "part": item.get('part (Bab)', ''),
            "chapter": item.get('chapter (Fasl)', ''),
            # Chroma metadata must be scalar, so the list is stored as a string
            "cross_references": ", ".join([str(ref) for ref in item.get('cross_references', [])])
        }
        docs.append(Document(page_content=page_content, metadata=metadata))

    print(f"✅ Loaded {len(docs)} constitutional articles")

    # 2. Embeddings
    print("Loading embeddings model...")
    embeddings = HuggingFaceEmbeddings(
        model_name="Omartificial-Intelligence-Space/GATE-AraBert-v1"
    )
    print("✅ Embeddings model ready")

    # 3. No splitting - keep articles as complete units
    chunks = docs

    # 4. Vector Store
    print("Building vector database...")
    vectorstore = Chroma.from_documents(
        chunks,
        embeddings,
        persist_directory="chroma_db"
    )
    base_retriever = vectorstore.as_retriever(search_kwargs={"k": 15})
    print("✅ Vector database ready")
```
```python
    # 5. BM25 keyword retriever
    class BM25Retriever(BaseRetriever):
        """BM25-based keyword retriever for constitutional articles"""
        corpus_docs: List[Document]
        bm25: BM25Okapi = None
        k: int = 15

        class Config:
            arbitrary_types_allowed = True

        def __init__(self, **data):
            super().__init__(**data)
            # Tokenize the corpus for BM25
            tokenized_corpus = [doc.page_content.split() for doc in self.corpus_docs]
            self.bm25 = BM25Okapi(tokenized_corpus)

        def _get_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            # Tokenize the query and score the corpus
            tokenized_query = query.split()
            scores = self.bm25.get_scores(tokenized_query)
            # Return the top-k documents with a positive score
            top_indices = np.argsort(scores)[::-1][:self.k]
            return [self.corpus_docs[i] for i in top_indices if scores[i] > 0]

        async def _aget_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            return self._get_relevant_documents(query, run_manager=run_manager)

    bm25_retriever = BM25Retriever(corpus_docs=docs, k=15)
    print("✅ BM25 keyword retriever ready")

    # 6. Metadata filter retriever
    class MetadataFilterRetriever(BaseRetriever):
        """Metadata-based filtering retriever"""
        corpus_docs: List[Document]
        k: int = 15

        class Config:
            arbitrary_types_allowed = True

        def _get_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            query_lower = query.lower()
            scored_docs = []

            for doc in self.corpus_docs:
                score = 0
                # Match keywords
                keywords = doc.metadata.get('keywords', '').lower()
                if any(word in keywords for word in query_lower.split()):
                    score += 3

                # Match legal nature
                legal_nature = doc.metadata.get('legal_nature', '').lower()
                if any(word in legal_nature for word in query_lower.split()):
                    score += 2

                # Match part/chapter
                part = doc.metadata.get('part', '').lower()
                chapter = doc.metadata.get('chapter', '').lower()
                if any(word in part or word in chapter for word in query_lower.split()):
                    score += 1

                # Match in content
                if any(word in doc.page_content.lower() for word in query_lower.split()):
                    score += 1

                if score > 0:
                    scored_docs.append((doc, score))

            # Sort by score and return the top k
            scored_docs.sort(key=lambda x: x[1], reverse=True)
            return [doc for doc, _ in scored_docs[:self.k]]

        async def _aget_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            return self._get_relevant_documents(query, run_manager=run_manager)

    metadata_retriever = MetadataFilterRetriever(corpus_docs=docs, k=15)
    print("✅ Metadata filter retriever ready")
```
```python
    # 7. Hybrid RRF retriever
    class HybridRRFRetriever(BaseRetriever):
        """Combines semantic, BM25, and metadata retrievers using Reciprocal Rank Fusion"""
        semantic_retriever: BaseRetriever
        bm25_retriever: BM25Retriever
        metadata_retriever: MetadataFilterRetriever
        beta_semantic: float = 0.6   # Weight for semantic search
        beta_keyword: float = 0.2    # Weight for BM25 keyword search
        beta_metadata: float = 0.2   # Weight for metadata filtering
        k: int = 60                  # RRF constant (typically 60)
        top_k: int = 15

        class Config:
            arbitrary_types_allowed = True

        def _get_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            # Get results from all three retrievers
            semantic_docs = self.semantic_retriever.invoke(query)
            bm25_docs = self.bm25_retriever.invoke(query)
            metadata_docs = self.metadata_retriever.invoke(query)

            # Apply weighted Reciprocal Rank Fusion
            rrf_scores = {}

            # Process semantic results
            for rank, doc in enumerate(semantic_docs, start=1):
                doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
                rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_semantic / (self.k + rank)

            # Process BM25 results
            for rank, doc in enumerate(bm25_docs, start=1):
                doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
                rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_keyword / (self.k + rank)

            # Process metadata results
            for rank, doc in enumerate(metadata_docs, start=1):
                doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
                rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_metadata / (self.k + rank)

            # Create a document lookup
            all_docs = {}
            for doc in semantic_docs + bm25_docs + metadata_docs:
                doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
                if doc_id not in all_docs:
                    all_docs[doc_id] = doc

            # Sort by RRF score
            sorted_doc_ids = sorted(rrf_scores.items(), key=lambda x: x[1], reverse=True)

            # Return the top-k documents
            result_docs = []
            for doc_id, score in sorted_doc_ids[:self.top_k]:
                if doc_id in all_docs:
                    result_docs.append(all_docs[doc_id])

            return result_docs

        async def _aget_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            return self._get_relevant_documents(query, run_manager=run_manager)

    # Create the hybrid retriever with tuned beta weights
    hybrid_retriever = HybridRRFRetriever(
        semantic_retriever=base_retriever,
        bm25_retriever=bm25_retriever,
        metadata_retriever=metadata_retriever,
        beta_semantic=0.5,   # Semantic search gets the highest weight (most reliable)
        beta_keyword=0.3,    # BM25 keyword search (good for exact term matches)
        beta_metadata=0.2,   # Metadata filtering (supporting role)
        k=60,
        top_k=20
    )
    print("✅ Hybrid RRF retriever ready with β weights: semantic=0.5, keyword=0.3, metadata=0.2")
```
```python
    # 8. Cross-reference enhanced retriever
    class CrossReferenceRetriever(BaseRetriever):
        """Enhances retrieval by automatically fetching cross-referenced articles"""
        base_retriever: BaseRetriever
        article_map: dict

        class Config:
            arbitrary_types_allowed = True

        def _get_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            # Get the initial results
            initial_docs = self.base_retriever.invoke(query)

            # Collect all related article numbers
            all_article_numbers = set()
            for doc in initial_docs:
                if 'article_number' in doc.metadata:
                    all_article_numbers.add(doc.metadata['article_number'])
                # Parse cross_references (stored as a comma-separated string)
                cross_refs_str = doc.metadata.get('cross_references', '')
                if cross_refs_str:
                    cross_refs = [ref.strip() for ref in cross_refs_str.split(',')]
                    for ref in cross_refs:
                        if ref:  # Skip empty strings
                            all_article_numbers.add(str(ref))

            # Build the enhanced document list
            enhanced_docs = []
            seen_numbers = set()

            # Add the initially retrieved documents
            for doc in initial_docs:
                enhanced_docs.append(doc)
                seen_numbers.add(doc.metadata.get('article_number'))

            # Add cross-referenced articles not yet retrieved
            for article_num in all_article_numbers:
                if article_num not in seen_numbers and article_num in self.article_map:
                    article_data = self.article_map[article_num]
                    cross_ref_text = ""
                    if article_data.get('cross_references'):
                        cross_ref_text = "\nالمواد ذات الصلة: " + ", ".join(
                            [f"المادة {ref}" for ref in article_data['cross_references']]
                        )

                    page_content = f"""
رقم المادة: {article_data['article_number']}
النص الأصلي: {article_data['original_text']}
الشرح المبسط: {article_data['simplified_summary']}{cross_ref_text}
"""

                    enhanced_doc = Document(
                        page_content=page_content,
                        metadata={
                            "article_id": article_data['article_id'],
                            "article_number": str(article_data['article_number']),
                            "legal_nature": article_data['legal_nature'],
                            "keywords": ", ".join(article_data['keywords']),
                            "cross_references": ", ".join([str(ref) for ref in article_data.get('cross_references', [])])
                        }
                    )
                    enhanced_docs.append(enhanced_doc)
                    seen_numbers.add(article_num)

            return enhanced_docs

        async def _aget_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            return self._get_relevant_documents(query, run_manager=run_manager)

    cross_ref_retriever = CrossReferenceRetriever(
        base_retriever=hybrid_retriever,
        article_map=article_map
    )
    print("✅ Cross-reference retriever ready (using hybrid RRF base)")
```
428
+
429
+ # 9. Reranker
430
+ print("Loading reranker model...")
431
+ local_model_path = r"D:\FOE\Senior 2\Graduation Project\Chatbot_me\reranker"
432
+
433
+ if not os.path.exists(local_model_path):
434
+ raise FileNotFoundError(f"Reranker path not found: {local_model_path}")
435
+
436
+ model = HuggingFaceCrossEncoder(model_name=local_model_path)
437
+ compressor = CrossEncoderReranker(model=model, top_n=5)
438
+
439
+ compression_retriever = ContextualCompressionRetriever(
440
+ base_compressor=compressor,
441
+ base_retriever=cross_ref_retriever
442
+ )
443
+ print("✅ Reranker model ready")
444
+
445
+ # 7. LLM - Balanced for consistency with slight creativity
446
+ # 7. LLM Configuration
447
+ llm = ChatGroq(
448
+ groq_api_key=os.getenv("GROQ_API_KEY"),
449
+ model_name="llama-3.1-8b-instant",
450
+ temperature=0.3, # Slightly increased to allow helpful general advice
451
+ model_kwargs={"top_p": 0.9}
452
+ )
453
+
454
+ # ==================================================
455
+ # 🛠️ THE FIX: SEPARATE SYSTEM INSTRUCTIONS FROM USER INPUT
456
+ # ==================================================
457
+
458
+ # ==================================================
459
+ # 🧠 PROMPT ENGINEERING: DECISION TREE LOGIC
460
+ # ==================================================
461
+
462
+ system_instructions = """
463
+ <role>
464
+ أنت "المساعد القانوني الذكي"، خبير متخصص في الدستور المصري والقوانين الإجرائية.
465
+ مهمتك: تقديم إجابات دقيقة بناءً على "السياق التشريعي" المرفق أولاً، أو تقديم نصائح إجرائية عامة عند الضرورة.
466
+ </role>
467
+
468
+ <decision_logic>
469
+ عليك تحليل "سؤال المستخدم" و"السياق التشريعي" وتصنيف الحالة واختيار الرد المناسب بناءً على القواعد التالية بدقة:
470
+
471
+ 🔴 الحالة الأولى: (الإجابة موجودة في السياق التشريعي)
472
+ الشرط: إذا وجدت معلومات داخل "السياق التشريعي المتاح" تجيب على السؤال.
473
+ الفعل:
474
+ 1. استخرج الإجابة من السياق فقط.
475
+ 2. ابدأ الإجابة مباشرة دون مقدمات.
476
+ 3. يجب توثيق الإجابة برقم المادة (مثال: "نصت المادة (50) على...").
477
+ 4. توقف هنا. لا تضف أي معلومات خارجية.
478
+
479
+ 🟡 الحالة الثانية: (السياق فارغ/غير مفيد + السؤال إجرائي/عملي)
480
+ الشرط: إذا لم تجد الإجابة في السياق، وكان السؤال عن إجراءات عملية (مثل: حادث، سرقة، طلاق، تحرير محضر، تعامل مع الشرطة).
481
+ الفعل:
482
+ 1. تجاهل السياق الفارغ.
483
+ 2. استخدم معرفتك العامة بالقانون المصري.
484
+ 3. ابدأ وجوباً بعبارة: "بناءً على الإجراءات القانونية العامة في مصر (وليس نصاً دستورياً محدداً):"
485
+ 4. قدم الخطوات في نقاط مرقمة واضحة ومختصرة (1، 2، 3).
486
+ 5. تحذير: لا تذكر أرقام مواد قانونية (لا تخترع أرقام مواد).
487
+
488
+ 🔵 الحالة الثالثة: (السياق فارغ + السؤال عن نص دستوري محدد)
489
+ الشرط: إذا سأل عن (مجلس الشعب، الشورى، مادة محددة) ولم تجدها في السياق.
490
+ الفعل:
491
+ 1. قل بوضوح: "عذراً، لم يرد ذكر لهذا الموضوع في المواد الدستورية التي تم استرجاعها في السياق الحالي."
492
+ 2. لا تحاول الإجابة من ذاكرتك لكي لا تخطئ في النصوص الدستورية الحساسة.
493
+
494
+ 🟢 الحالة الرابعة: (محادثة ودية)
495
+ الشرط: تحية، شكر، أو "كيف حالك".
496
+ الفعل: رد بتحية مهذبة جداً ومقتضبة، ثم قل: "أنا جاهز للإجابة على استفساراتك القانونية."
497
+
498
+ ⚫ الحالة الخامسة: (خارج النطاق تماماً)
499
+ الشرط: طبخ، رياضة، برمجة، أو أي موضوع غير قانوني.
500
+ الفعل: اعتذر بلطف ووجه المستخدم للسؤال في القانون.
501
+ </decision_logic>
502
+
503
+ <formatting_rules>
504
+ - لا تكرر هذه التعليمات في ردك.
505
+ - استخدم فقرات قصيرة واترك سطراً فارغاً بينها.
506
+ - لا تستخدم عبارات مثل "بناء على السياق المرفق" في بداية الجملة، بل ادخل في صلب الموضوع فوراً.
507
+ - التزم باللغة العربية الفصحى المبسطة والرصينة.
508
+ </formatting_rules>
509
+ """
510
+
511
+ # We use .from_messages to strictly separate instructions from data
512
+ prompt = ChatPromptTemplate.from_messages([
513
+ ("system", system_instructions),
514
+ ("system", "السياق التشريعي المتاح (المصدر الأساسي):\n{context}"),
515
+ ("human", "سؤال المستفيد:\n{input}")
516
+ ])
517
+
518
+ # 11. Build Chain with RunnableParallel (returns both context and answer)
519
+ qa_chain = (
520
+ RunnableParallel({
521
+ "context": compression_retriever,
522
+ "input": RunnablePassthrough()
523
+ })
524
+ .assign(answer=(
525
+ prompt
526
+ | llm
527
+ | StrOutputParser()
528
+ ))
529
+ )
530
+
531
+ print("✅ System ready to use!")
532
+ return qa_chain
533
+
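The `RunnableParallel` + `.assign` pattern above first materializes a dict of `{"context", "input"}`, then adds the generated `answer` key, which is why the UI can read both `result["answer"]` and `result["context"]` from a single `invoke`. A minimal pure-Python sketch of that data flow (no LangChain; `retriever` and `generate` are placeholder callables, not names from the source):

```python
# Pure-Python sketch of the RunnableParallel + .assign data flow.
def run_chain(question, retriever, generate):
    state = {"context": retriever(question), "input": question}  # RunnableParallel step
    state["answer"] = generate(state)                            # .assign(answer=...) step
    return state

result = run_chain(
    "ما هي مدة الرئاسة؟",
    retriever=lambda q: ["المادة 140: ..."],
    generate=lambda s: f"استناداً إلى {len(s['context'])} مادة",
)
# result now holds 'context', 'input', and 'answer' keys
```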
534
+ # ==========================================
535
+ # ⚡ MAIN EXECUTION
536
+ # ==========================================
537
+
538
+ try:
539
+ # Only need the chain now - it handles all retrieval internally
540
+ qa_chain = initialize_rag_pipeline()
541
+
542
+ except Exception as e:
543
+ st.error(f"Critical Error loading application: {e}")
544
+ st.stop()
545
+
546
+ # ==========================================
547
+ # 💬 CHAT LOOP
548
+ # ==========================================
549
+ if "messages" not in st.session_state:
550
+ st.session_state.messages = []
551
+
552
+ # Display Chat History (with Eastern Arabic numerals)
553
+ for message in st.session_state.messages:
554
+ with st.chat_message(message["role"]):
555
+ # Convert to Eastern Arabic when displaying from history
556
+ st.markdown(convert_to_eastern_arabic(message["content"]))
557
+
558
+ # Handle New User Input
559
+ if prompt_input := st.chat_input("اكتب سؤالك القانوني هنا..."):
560
+ # Show user message
561
+ st.session_state.messages.append({"role": "user", "content": prompt_input})
562
+ with st.chat_message("user"):
563
+ st.markdown(prompt_input)
564
+
565
+ # Generate Response
566
+ with st.chat_message("assistant"):
567
+ with st.spinner("جاري التحليل القانوني..."):
568
+ try:
569
+ # Invoke chain ONCE - returns Dict with 'context', 'input', and 'answer'
570
+ result = qa_chain.invoke(prompt_input)
571
+
572
+ # Extract answer and context from result
573
+ response_text = result["answer"]
574
+ source_docs = result["context"] # Context is already in the result!
575
+
576
+ # Display Answer
577
+ response_text_arabic = convert_to_eastern_arabic(response_text)
578
+ st.markdown(response_text_arabic)
579
+
580
+ # Display Sources
581
+ if source_docs and len(source_docs) > 0:
582
+ print(f"✅ Found {len(source_docs)} documents")
583
+ # Deduplicate documents by article_number
584
+ seen_articles = set()
585
+ unique_docs = []
586
+
587
+ for doc in source_docs:
588
+ article_num = str(doc.metadata.get('article_number', '')).strip()
589
+ if article_num and article_num not in seen_articles:
590
+ seen_articles.add(article_num)
591
+ unique_docs.append(doc)
592
+
593
+ st.markdown("---") # Separator before sources
594
+
595
+ if unique_docs:
596
+ with st.expander(f"📚 المصادر المستخدمة ({len(unique_docs)} مادة)"):
597
+ st.markdown("### المواد الدستورية المستخدمة في التحليل:")
598
+ st.markdown("---")
599
+
600
+ for idx, doc in enumerate(unique_docs, 1):
601
+ article_num = str(doc.metadata.get('article_number', '')).strip()
602
+ legal_nature = doc.metadata.get('legal_nature', '')
603
+
604
+ if article_num:
605
+ st.markdown(f"**المادة رقم {convert_to_eastern_arabic(article_num)}**")
606
+ if legal_nature:
607
+ st.markdown(f"*الطبيعة القانونية: {legal_nature}*")
608
+
609
+ # Display article content
610
+ content_lines = doc.page_content.strip().split('\n')
611
+ for line in content_lines:
612
+ line = line.strip()
613
+ if line:
614
+ st.markdown(convert_to_eastern_arabic(line))
615
+
616
+ st.markdown("---")
617
+ else:
618
+ st.info("📌 لم يتم العثور على مصادر")
619
+ else:
620
+ st.info("📌 لم يتم العثور على مصادر")
621
+
622
+ # Persist the raw answer to avoid double conversion glitches on rerun
623
+ st.session_state.messages.append({"role": "assistant", "content": response_text})
624
+ except Exception as e:
625
+ st.error(f"حدث خطأ: {e}")
app_final_pheonix.py ADDED
@@ -0,0 +1,838 @@
1
+ # === Phoenix Observability Setup ===
2
+ import os
3
+ from datetime import datetime
4
+
5
+ try:
6
+ # OpenTelemetry SDK + OTLP exporter (Phoenix consumes OTLP)
7
+ from opentelemetry import trace
8
+ from opentelemetry.sdk.resources import SERVICE_NAME, Resource
9
+ from opentelemetry.sdk.trace import TracerProvider
10
+ from opentelemetry.sdk.trace.export import BatchSpanProcessor
11
+ from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
12
+ PHOENIX_AVAILABLE = True
13
+ except Exception:
14
+ PHOENIX_AVAILABLE = False
15
+
16
+
17
+ def setup_phoenix_tracing():
18
+ """Configure OTLP tracing for Phoenix. Uses PHOENIX_OTLP_ENDPOINT env if set."""
19
+ if not PHOENIX_AVAILABLE:
20
+ return None
21
+
22
+ service_name = os.getenv("PHOENIX_SERVICE_NAME", "constitutional-assistant")
23
+ otlp_endpoint = os.getenv("PHOENIX_OTLP_ENDPOINT", "http://localhost:6006/v1/traces")
24
+
25
+ resource = Resource(attributes={SERVICE_NAME: service_name})
26
+ provider = TracerProvider(resource=resource)
27
+ exporter = OTLPSpanExporter(endpoint=otlp_endpoint)
28
+ span_processor = BatchSpanProcessor(exporter)
29
+ provider.add_span_processor(span_processor)
30
+ trace.set_tracer_provider(provider)
31
+ return trace.get_tracer(service_name)
32
+
33
+
34
+ # Create a module-level tracer
35
+ _phoenix_tracer = setup_phoenix_tracing()
36
+
37
+
38
+ class PhoenixSpan:
39
+ """Context manager helper to create spans with proper parent-child hierarchy."""
40
+ def __init__(self, name: str, attributes: dict | None = None, kind: str = "INTERNAL"):
41
+ self.name = name
42
+ self.attributes = attributes or {}
43
+ self.kind = kind
44
+ self._span_context = None
45
+ self._span = None
46
+ self._start_time = None
47
+
48
+ def __enter__(self):
49
+ if _phoenix_tracer:
50
+ from opentelemetry.trace import SpanKind
51
+ import time
52
+ self._start_time = time.time()
53
+
54
+ # Map string kind to SpanKind enum
55
+ kind_map = {
56
+ "CLIENT": SpanKind.CLIENT,
57
+ "SERVER": SpanKind.SERVER,
58
+ "INTERNAL": SpanKind.INTERNAL,
59
+ }
60
+ span_kind = kind_map.get(self.kind, SpanKind.INTERNAL)
61
+
62
+ # Use start_as_current_span to establish parent-child relationships
63
+ self._span_context = _phoenix_tracer.start_as_current_span(
64
+ self.name,
65
+ kind=span_kind
66
+ )
67
+ self._span = self._span_context.__enter__()
68
+ for k, v in self.attributes.items():
69
+ try:
70
+ self._span.set_attribute(k, v)
71
+ except Exception:
72
+ pass
73
+ return self
74
+
75
+ def set_attr(self, key: str, value):
76
+ if self._span:
77
+ try:
78
+ self._span.set_attribute(key, value)
79
+ except Exception:
80
+ pass
81
+
82
+ def __exit__(self, exc_type, exc, tb):
83
+ if self._span_context:
84
+ try:
85
+ if exc_type:
86
+ self._span.record_exception(exc)
87
+ from opentelemetry.trace import Status, StatusCode
88
+ self._span.set_status(Status(StatusCode.ERROR, str(exc)))
89
+ else:
90
+ # Add duration as attribute
91
+ if self._start_time:
92
+ import time
93
+ duration = time.time() - self._start_time
94
+ self._span.set_attribute("duration_ms", round(duration * 1000, 2))
95
+ from opentelemetry.trace import Status, StatusCode
96
+ self._span.set_status(Status(StatusCode.OK))
97
+ self._span_context.__exit__(exc_type, exc, tb)
98
+ except Exception:
99
+ pass
100
+
101
+ # -*- coding: utf-8 -*-
102
+ import os
103
+ import sys
104
+ import json
105
+ from dotenv import load_dotenv
106
+ import streamlit as st
107
+ import logging
108
+ import warnings
109
+
110
+ # Suppress progress bars from transformers/tqdm
111
+ os.environ['TRANSFORMERS_NO_PROGRESS_BAR'] = '1'
112
+ warnings.filterwarnings('ignore')
113
+
114
+ # 1. Loaders & Splitters
115
+ from langchain_core.documents import Document
116
+ from langchain_text_splitters import RecursiveCharacterTextSplitter
117
+ from langchain_core.retrievers import BaseRetriever
118
+ from langchain_core.callbacks import CallbackManagerForRetrieverRun
119
+ from typing import List
120
+ from rank_bm25 import BM25Okapi
121
+ import numpy as np
122
+
123
+ # 2. Vector Store & Embeddings
124
+ from langchain_chroma import Chroma
125
+ from langchain_huggingface import HuggingFaceEmbeddings
126
+
127
+ # 3. Reranker Imports
128
+ from langchain_classic.retrievers.document_compressors import CrossEncoderReranker
129
+ from langchain_classic.retrievers import ContextualCompressionRetriever
130
+ from langchain_community.cross_encoders import HuggingFaceCrossEncoder
131
+
132
+ # 4. LLM
133
+ from langchain_groq import ChatGroq
134
+ from langchain_core.prompts import ChatPromptTemplate
135
+ from langchain_core.output_parsers import StrOutputParser
136
+ from langchain_core.runnables import RunnablePassthrough, RunnableParallel
137
+
138
+ # Configure logging
139
+ logging.basicConfig(level=logging.INFO)
140
+ logger = logging.getLogger(__name__)
141
+
142
+ load_dotenv()
143
+
144
+ # ==========================================
145
+ # 🎨 UI SETUP (CSS FOR ARABIC & RTL)
146
+ # ==========================================
147
+ st.set_page_config(page_title="المساعد القانوني", page_icon="⚖️")
148
+
149
+ # This CSS block fixes the "001" number issue and right alignment
150
+ st.markdown("""
151
+ <style>
152
+ /* Force the main app container to be Right-to-Left */
153
+ .stApp {
154
+ direction: rtl;
155
+ text-align: right;
156
+ }
157
+
158
+ /* Fix input fields to type from right */
159
+ .stTextInput input {
160
+ direction: rtl;
161
+ text-align: right;
162
+ }
163
+
164
+ /* Fix chat messages alignment */
165
+ .stChatMessage {
166
+ direction: rtl;
167
+ text-align: right;
168
+ }
169
+
170
+ /* Ensure proper paragraph spacing */
171
+ .stMarkdown p {
172
+ margin: 0.5em 0 !important;
173
+ line-height: 1.6;
174
+ word-spacing: 0.1em;
175
+ }
176
+
177
+ /* Ensure numbers display correctly in RTL */
178
+ p, div, span, label {
179
+ unicode-bidi: embed;
180
+ direction: inherit;
181
+ white-space: normal;
182
+ word-wrap: break-word;
183
+ }
184
+
185
+ /* Force all content to respect RTL */
186
+ * {
187
+ direction: rtl !important;
188
+ }
189
+
190
+ /* Preserve line breaks and spacing */
191
+ .stMarkdown pre {
192
+ direction: rtl;
193
+ text-align: right;
194
+ white-space: pre-wrap;
195
+ word-wrap: break-word;
196
+ }
197
+
198
+ /* Hide the "Deploy" button and standard menu for cleaner look */
199
+ #MainMenu {visibility: hidden;}
200
+ footer {visibility: hidden;}
201
+
202
+ </style>
203
+ """, unsafe_allow_html=True)
204
+
205
+ # Put this at the top of your code
206
+ def convert_to_eastern_arabic(text):
207
+ """Converts 0123456789 to ٠١٢٣٤٥٦٧٨٩"""
208
+ if not isinstance(text, str):
209
+ return text
210
+ western_numerals = '0123456789'
211
+ eastern_numerals = '٠١٢٣٤٥٦٧٨٩'
212
+ translation_table = str.maketrans(western_numerals, eastern_numerals)
213
+ return text.translate(translation_table)
214
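As a standalone illustration of the translation-table approach used by `convert_to_eastern_arabic`:

```python
# str.maketrans builds a per-character mapping; translate applies it
# while leaving non-digit characters untouched.
table = str.maketrans('0123456789', '٠١٢٣٤٥٦٧٨٩')
converted = "نصت المادة 50 لسنة 2014".translate(table)
# → "نصت المادة ٥٠ لسنة ٢٠١٤"
```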
+
215
+ st.title("⚖️ المساعد القانوني الذكي (دستور مصر)")
216
+
217
+ # ==========================================
218
+ # 🚀 CACHED RESOURCE LOADING (THE FIX)
219
+ # ==========================================
220
+ # This decorator tells Streamlit: "Run this ONCE and save the result."
221
+ @st.cache_resource
222
+ def initialize_rag_pipeline():
223
+ print("🔄 Initializing system...")
224
+ print("📥 Loading data...")
225
+
226
+ # 1. Load JSON
227
+ json_path = "Egyptian_Constitution_legalnature_only.json"
228
+ if not os.path.exists(json_path):
229
+ raise FileNotFoundError(f"File not found: {json_path}")
230
+
231
+ with open(json_path, "r", encoding="utf-8") as f:
232
+ data = json.load(f)
233
+
234
+ # Create a mapping of article numbers for cross-reference lookup
235
+ article_map = {str(item['article_number']): item for item in data}
236
+
237
+ docs = []
238
+ for item in data:
239
+ # Build cross-reference section
240
+ cross_ref_text = ""
241
+ if item.get('cross_references') and len(item['cross_references']) > 0:
242
+ cross_ref_text = "\nالمواد ذات الصلة (المراجع المتقاطعة): " + ", ".join(
243
+ [f"المادة {ref}" for ref in item['cross_references']]
244
+ )
245
+
246
+ # Construct content
247
+ page_content = f"""
248
+ رقم المادة: {item['article_number']}
249
+ النص الأصلي: {item['original_text']}
250
+ الشرح المبسط: {item['simplified_summary']}{cross_ref_text}
251
+ """
252
+ metadata = {
253
+ "article_id": item['article_id'],
254
+ "article_number": str(item['article_number']),
255
+ "legal_nature": item['legal_nature'],
256
+ "keywords": ", ".join(item['keywords']),
257
+ "part": item.get('part (Bab)', ''),
258
+ "chapter": item.get('chapter (Fasl)', ''),
259
+ "cross_references": ", ".join([str(ref) for ref in item.get('cross_references', [])]) # Convert list to string
260
+ }
261
+ docs.append(Document(page_content=page_content, metadata=metadata))
262
+
263
+ print(f"✅ Loaded {len(docs)} constitutional articles")
264
+
265
+ # 2. Embeddings
266
+ print("Loading embeddings model...")
267
+ embeddings = HuggingFaceEmbeddings(
268
+ model_name="Omartificial-Intelligence-Space/GATE-AraBert-v1"
269
+ )
270
+ print("✅ Embeddings model ready")
271
+
272
+ # 3. No splitting - keep articles as complete units
273
+ chunks = docs
274
+
275
+ # 4. Vector Store
276
+ print("Building vector database...")
277
+ vectorstore = Chroma.from_documents(
278
+ chunks,
279
+ embeddings,
280
+ persist_directory="chroma_db"
281
+ )
282
+ base_retriever = vectorstore.as_retriever(search_kwargs={"k": 15})
283
+ print("✅ Vector database ready")
284
+
285
+ # 5. Create BM25 Keyword Retriever
286
+ class BM25Retriever(BaseRetriever):
287
+ """BM25-based keyword retriever for constitutional articles"""
288
+ corpus_docs: List[Document]
289
+ bm25: BM25Okapi = None
290
+ k: int = 15
291
+
292
+ class Config:
293
+ arbitrary_types_allowed = True
294
+
295
+ def __init__(self, **data):
296
+ super().__init__(**data)
297
+ # Tokenize corpus for BM25
298
+ tokenized_corpus = [doc.page_content.split() for doc in self.corpus_docs]
299
+ self.bm25 = BM25Okapi(tokenized_corpus)
300
+
301
+ def _get_relevant_documents(
302
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
303
+ ) -> List[Document]:
304
+ # Tokenize query
305
+ tokenized_query = query.split()
306
+ # Get BM25 scores
307
+ scores = self.bm25.get_scores(tokenized_query)
308
+ # Get top k indices
309
+ top_indices = np.argsort(scores)[::-1][:self.k]
310
+ # Return documents
311
+ return [self.corpus_docs[i] for i in top_indices if scores[i] > 0]
312
+
313
+ async def _aget_relevant_documents(
314
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
315
+ ) -> List[Document]:
316
+ return self._get_relevant_documents(query, run_manager=run_manager)
317
+
318
+ bm25_retriever = BM25Retriever(corpus_docs=docs, k=15)
319
+ print("✅ BM25 keyword retriever ready")
320
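The top-k selection inside `_get_relevant_documents` can be sketched in isolation: sort the BM25 scores descending via `argsort`, truncate to `k`, and drop zero-score documents (the score values here are illustrative):

```python
import numpy as np

# Rank scores descending, keep the k best, filter out zero-score hits.
scores = np.array([0.0, 2.3, 0.4, 1.1])
k = 2
top_indices = np.argsort(scores)[::-1][:k]      # indices of the two highest scores
kept = [int(i) for i in top_indices if scores[i] > 0]
# → [1, 3]
```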
+
321
+ # 6. Create Metadata Filter Retriever
322
+ class MetadataFilterRetriever(BaseRetriever):
323
+ """Metadata-based filtering retriever"""
324
+ corpus_docs: List[Document]
325
+ k: int = 15
326
+
327
+ class Config:
328
+ arbitrary_types_allowed = True
329
+
330
+ def _get_relevant_documents(
331
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
332
+ ) -> List[Document]:
333
+ query_lower = query.lower()
334
+ scored_docs = []
335
+
336
+ for doc in self.corpus_docs:
337
+ score = 0
338
+ # Match keywords (boosted)
339
+ keywords = doc.metadata.get('keywords', '').lower()
340
+ if any(word in keywords for word in query_lower.split()):
341
+ score += 4
342
+
343
+ # Match legal nature (boosted)
344
+ legal_nature = doc.metadata.get('legal_nature', '').lower()
345
+ if any(word in legal_nature for word in query_lower.split()):
346
+ score += 3
347
+
348
+ # Match part/chapter
349
+ part = doc.metadata.get('part', '').lower()
350
+ chapter = doc.metadata.get('chapter', '').lower()
351
+ if any(word in part or word in chapter for word in query_lower.split()):
352
+ score += 1
353
+
354
+ # Match in content
355
+ if any(word in doc.page_content.lower() for word in query_lower.split()):
356
+ score += 1
357
+
358
+ if score > 0:
359
+ scored_docs.append((doc, score))
360
+
361
+ # Sort by score and return top k
362
+ scored_docs.sort(key=lambda x: x[1], reverse=True)
363
+ return [doc for doc, _ in scored_docs[:self.k]]
364
+
365
+ async def _aget_relevant_documents(
366
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
367
+ ) -> List[Document]:
368
+ return self._get_relevant_documents(query, run_manager=run_manager)
369
+
370
+ metadata_retriever = MetadataFilterRetriever(corpus_docs=docs, k=15)
371
+ print("✅ Metadata filter retriever ready")
372
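A self-contained sketch of the additive scoring above, simplified to three of the four terms (keyword hits weigh 4, legal-nature hits 3, body-text hits 1; the part/chapter term is omitted, and the plain dict stands in for `Document` metadata):

```python
def metadata_score(query_words, doc):
    score = 0
    if any(w in doc["keywords"] for w in query_words):      # boosted keyword match
        score += 4
    if any(w in doc["legal_nature"] for w in query_words):  # boosted legal-nature match
        score += 3
    if any(w in doc["text"] for w in query_words):          # plain content match
        score += 1
    return score

doc = {"keywords": "حرية, تعبير", "legal_nature": "حريات عامة", "text": "حرية الرأي مكفولة"}
metadata_score(["حرية"], doc)  # keywords +4, text +1 → 5
```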
+
373
+ # 7. Create Hybrid RRF Retriever
374
+ class HybridRRFRetriever(BaseRetriever):
375
+ """Combines semantic, BM25, and metadata retrievers using Reciprocal Rank Fusion"""
376
+ semantic_retriever: BaseRetriever
377
+ bm25_retriever: BM25Retriever
378
+ metadata_retriever: MetadataFilterRetriever
379
+ beta_semantic: float = 0.6 # Weight for semantic search
380
+ beta_keyword: float = 0.25 # Weight for BM25 keyword search
381
+ beta_metadata: float = 0.15 # Weight for metadata filtering
382
+ k: int = 60 # RRF constant (typically 60)
383
+ top_k: int = 25
384
+
385
+ class Config:
386
+ arbitrary_types_allowed = True
387
+
388
+ def _get_relevant_documents(
389
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
390
+ ) -> List[Document]:
391
+ # Get results from all three retrievers (no separate spans - details logged in hybrid_retrieval span)
392
+ semantic_docs = self.semantic_retriever.invoke(query)
393
+ bm25_docs = self.bm25_retriever.invoke(query)
394
+ metadata_docs = self.metadata_retriever.invoke(query)
395
+
396
+ # Apply Reciprocal Rank Fusion
397
+ rrf_scores = {}
398
+
399
+ # Process semantic results
400
+ for rank, doc in enumerate(semantic_docs, start=1):
401
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
402
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_semantic / (self.k + rank)
403
+
404
+ # Process BM25 results
405
+ for rank, doc in enumerate(bm25_docs, start=1):
406
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
407
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_keyword / (self.k + rank)
408
+
409
+ # Process metadata results
410
+ for rank, doc in enumerate(metadata_docs, start=1):
411
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
412
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_metadata / (self.k + rank)
413
+
414
+ # Create document lookup
415
+ all_docs = {}
416
+ for doc in semantic_docs + bm25_docs + metadata_docs:
417
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
418
+ if doc_id not in all_docs:
419
+ all_docs[doc_id] = doc
420
+
421
+ # Sort by RRF score
422
+ sorted_doc_ids = sorted(rrf_scores.items(), key=lambda x: x[1], reverse=True)
423
+
424
+ # Return top k documents
425
+ result_docs = []
426
+ for doc_id, score in sorted_doc_ids[:self.top_k]:
427
+ if doc_id in all_docs:
428
+ result_docs.append(all_docs[doc_id])
429
+
430
+ # Log all retrieval details in one place (no nested spans to avoid hierarchy issues)
431
+ try:
432
+ with PhoenixSpan("hybrid_retrieval", {
433
+ "query": query[:200],
434
+ "beta_semantic": self.beta_semantic,
435
+ "beta_keyword": self.beta_keyword,
436
+ "beta_metadata": self.beta_metadata,
437
+ "rrf_k_constant": self.k,
438
+ "top_k_limit": self.top_k
439
+ }, kind="INTERNAL") as fusion_span:
440
+ # Semantic retrieval details
441
+ fusion_span.set_attr("semantic_input_count", len(semantic_docs))
442
+ if semantic_docs:
443
+ fusion_span.set_attr("semantic_top_5", ", ".join([d.metadata.get('article_number', 'N/A') for d in semantic_docs[:5]]))
444
+
445
+ # BM25 retrieval details
446
+ fusion_span.set_attr("bm25_input_count", len(bm25_docs))
447
+ if bm25_docs:
448
+ fusion_span.set_attr("bm25_top_5", ", ".join([d.metadata.get('article_number', 'N/A') for d in bm25_docs[:5]]))
449
+
450
+ # Metadata retrieval details
451
+ fusion_span.set_attr("metadata_input_count", len(metadata_docs))
452
+ if metadata_docs:
453
+ fusion_span.set_attr("metadata_top_5", ", ".join([d.metadata.get('article_number', 'N/A') for d in metadata_docs[:5]]))
454
+
455
+ # Fusion results
456
+ fusion_span.set_attr("unique_docs_before_fusion", len(all_docs))
457
+ fusion_span.set_attr("final_doc_count", len(result_docs))
458
+ if result_docs:
459
+ top_article_nums = [d.metadata.get('article_number', 'N/A') for d in result_docs[:10]]
460
+ fusion_span.set_attr("fused_top_10_articles", ", ".join(map(str, top_article_nums)))
461
+ # Show top 5 RRF scores
462
+ top_scores = [(doc_id, f"{score:.4f}") for doc_id, score in sorted_doc_ids[:5]]
463
+ fusion_span.set_attr("top_5_rrf_scores", str(top_scores))
464
+ fusion_span.set_attr("top_doc_preview", result_docs[0].page_content[:300])
465
+ except Exception:
466
+ pass
467
+
468
+ return result_docs
469
+
470
+ async def _aget_relevant_documents(
471
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
472
+ ) -> List[Document]:
473
+ return self._get_relevant_documents(query, run_manager=run_manager)
474
+
475
+ # Create hybrid retriever with tuned beta weights
476
+ hybrid_retriever = HybridRRFRetriever(
477
+ semantic_retriever=base_retriever,
478
+ bm25_retriever=bm25_retriever,
479
+ metadata_retriever=metadata_retriever,
480
+ beta_semantic=0.6, # Semantic search gets highest weight (most reliable)
481
+ beta_keyword=0.25, # BM25 keyword search (good for exact term matches)
482
+ beta_metadata=0.15, # Metadata filtering (supporting role)
483
+ k=60,
484
+ top_k=25
485
+ )
486
+ print("✅ Hybrid RRF retriever ready with β weights: semantic=0.6, keyword=0.25, metadata=0.15, top_k=25")
487
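The fusion math above reduces to a few lines. This sketch uses the same β weights and RRF constant k=60; the article numbers and ranked lists are illustrative, not from the source:

```python
# Weighted Reciprocal Rank Fusion: each retriever contributes
# beta / (k + rank) per document, and documents are sorted by total score.
def weighted_rrf(ranked_lists, betas, k=60):
    scores = {}
    for docs, beta in zip(ranked_lists, betas):
        for rank, doc_id in enumerate(docs, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + beta / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["50", "62", "14"]   # ranked outputs of the three retrievers
bm25     = ["62", "7"]
meta     = ["50"]
fused = weighted_rrf([semantic, bm25, meta], [0.6, 0.25, 0.15])
# "62" outranks "50": a top BM25 hit plus a #2 semantic hit beats
# a #1 semantic hit plus a low-weight metadata hit
```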
+
488
+ # 8. Create Cross-Reference Enhanced Retriever
489
+ class CrossReferenceRetriever(BaseRetriever):
490
+ """Enhances retrieval by automatically fetching cross-referenced articles"""
491
+ base_retriever: BaseRetriever
492
+ article_map: dict
493
+
494
+ class Config:
495
+ arbitrary_types_allowed = True
496
+
497
+ def _get_relevant_documents(
498
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
499
+ ) -> List[Document]:
500
+ with PhoenixSpan("cross_reference_expansion", {"query": query[:200]}, kind="INTERNAL") as xref_span:
501
+ # Get initial results
502
+ initial_docs = self.base_retriever.invoke(query)
503
+ xref_span.set_attr("initial_doc_count", len(initial_docs))
504
+
505
+ # Collect all related article numbers
506
+ all_article_numbers = set()
507
+ for doc in initial_docs:
508
+ if 'article_number' in doc.metadata:
509
+ all_article_numbers.add(doc.metadata['article_number'])
510
+ # Parse cross_references (now stored as comma-separated string)
511
+ cross_refs_str = doc.metadata.get('cross_references', '')
512
+ if cross_refs_str:
513
+ cross_refs = [ref.strip() for ref in cross_refs_str.split(',')]
514
+ for ref in cross_refs:
515
+ if ref: # Skip empty strings
516
+ all_article_numbers.add(str(ref))
517
+
518
+ # Build enhanced document list
519
+ enhanced_docs = []
520
+ seen_numbers = set()
521
+
522
+ # Add initially retrieved documents
523
+ for doc in initial_docs:
524
+ enhanced_docs.append(doc)
525
+ seen_numbers.add(doc.metadata.get('article_number'))
526
+
527
+ # Add cross-referenced articles not yet retrieved
528
+ for article_num in all_article_numbers:
529
+ if article_num not in seen_numbers and article_num in self.article_map:
530
+ article_data = self.article_map[article_num]
531
+ cross_ref_text = ""
532
+ if article_data.get('cross_references'):
533
+ cross_ref_text = "\nالمواد ذات الصلة: " + ", ".join(
534
+ [f"المادة {ref}" for ref in article_data['cross_references']]
535
+ )
536
+
537
+ page_content = f"""
538
+ رقم المادة: {article_data['article_number']}
539
+ النص الأصلي: {article_data['original_text']}
540
+ الشرح المبسط: {article_data['simplified_summary']}{cross_ref_text}
541
+ """
542
+
543
+ enhanced_doc = Document(
544
+ page_content=page_content,
545
+ metadata={
546
+ "article_id": article_data['article_id'],
547
+ "article_number": str(article_data['article_number']),
548
+ "legal_nature": article_data['legal_nature'],
549
+ "keywords": ", ".join(article_data['keywords']),
550
+ "cross_references": ", ".join([str(ref) for ref in article_data.get('cross_references', [])])
551
+ }
552
+ )
553
+ enhanced_docs.append(enhanced_doc)
554
+ seen_numbers.add(article_num)
555
+
556
+ # Record expansion stats (OUTSIDE the loop, at the end)
557
+ expanded_articles = [doc.metadata.get('article_number') for doc in enhanced_docs if doc not in initial_docs]
558
+ xref_span.set_attr("cross_refs_added", len(expanded_articles))
559
+ xref_span.set_attr("final_doc_count", len(enhanced_docs))
560
+ if expanded_articles:
561
+ xref_span.set_attr("expanded_article_numbers", ", ".join(map(str, expanded_articles[:15])))
562
+
563
+ return enhanced_docs
564
+
565
+ async def _aget_relevant_documents(
566
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
567
+ ) -> List[Document]:
568
+ return self._get_relevant_documents(query, run_manager=run_manager)
569
+
570
+ cross_ref_retriever = CrossReferenceRetriever(
571
+ base_retriever=hybrid_retriever,
572
+ article_map=article_map
573
+ )
574
+ print("✅ Cross-reference retriever ready (using hybrid RRF base)")
575
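Chroma metadata fields must be scalar values, which is why `cross_references` is serialized to a comma-separated string at index time and parsed back during expansion, as above. The round-trip looks like this (example values are illustrative):

```python
# Serialize at index time, parse back at retrieval time.
refs_list = [5, 12, 99]
serialized = ", ".join(str(r) for r in refs_list)                 # stored in metadata
parsed = [r.strip() for r in serialized.split(',') if r.strip()]  # read back
# parsed == ['5', '12', '99']; an empty string parses to []
```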
+
576
+ # 9. Reranker
577
+ print("Loading reranker model...")
578
+ local_model_path = os.getenv("RERANKER_PATH", "reranker")  # RERANKER_PATH is an assumed env var; defaults to the local ./reranker folder instead of a machine-specific absolute path
579
+
580
+ if not os.path.exists(local_model_path):
581
+ raise FileNotFoundError(f"Reranker path not found: {local_model_path}")
582
+
583
+ model = HuggingFaceCrossEncoder(model_name=local_model_path)
584
+ compressor = CrossEncoderReranker(model=model, top_n=10)
585
+
586
+ # Wrap compression retriever to add Phoenix spans
587
+ class InstrumentedCompressionRetriever(BaseRetriever):
588
+ base_retriever: ContextualCompressionRetriever
589
+
590
+ class Config:
591
+ arbitrary_types_allowed = True
592
+
593
+ def _get_relevant_documents(
594
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
595
+ ) -> List[Document]:
596
+ with PhoenixSpan("reranker_compression", {
597
+ "query": query[:200],
598
+ "model": "HuggingFaceCrossEncoder",
599
+ "top_n": 10
600
+ }, kind="INTERNAL") as rerank_span:
601
+ # Apply reranking (this will call cross_ref_retriever internally)
602
+ reranked_docs = self.base_retriever.invoke(query)
603
+
604
+ rerank_span.set_attr("output_doc_count", len(reranked_docs))
605
+ if reranked_docs:
606
+ output_articles = [d.metadata.get('article_number', 'N/A') for d in reranked_docs]
607
+ rerank_span.set_attr("reranked_articles", ", ".join(map(str, output_articles)))
608
+ rerank_span.set_attr("top_doc_preview", reranked_docs[0].page_content[:400] if reranked_docs else "")
609
+
610
+ return reranked_docs
611
+
612
+ async def _aget_relevant_documents(
613
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
614
+ ) -> List[Document]:
615
+ return self._get_relevant_documents(query, run_manager=run_manager)
616
+
617
+ base_compression_retriever = ContextualCompressionRetriever(
618
+ base_compressor=compressor,
619
+ base_retriever=cross_ref_retriever
620
+ )
621
+ compression_retriever = InstrumentedCompressionRetriever(base_retriever=base_compression_retriever)
622
+ print("✅ Reranker model ready")
623
+
624
+ # 10. LLM - more deterministic for relevance
626
+ llm = ChatGroq(
627
+ groq_api_key=os.getenv("GROQ_API_KEY"),
628
+ model_name="llama-3.1-8b-instant",
629
+ temperature=0.1,
630
+ model_kwargs={"top_p": 0.9}
631
+ )
632
+
633
+ # ==================================================
634
+ # 🛠️ THE FIX: SEPARATE SYSTEM INSTRUCTIONS FROM USER INPUT
635
+ # ==================================================
636
+
637
+ # ==================================================
638
+ # 🧠 PROMPT ENGINEERING: DECISION TREE LOGIC
639
+ # ==================================================
640
+
641
+ system_instructions = """
642
+ <role>
643
+ أنت "المساعد القانوني الذكي"، خبير متخصص في الدستور المصري والقوانين الإجرائية.
644
+ مهمتك: تقديم إجابات دقيقة بناءً على "السياق التشريعي" المرفق أولاً، أو تقديم نصائح إجرائية عامة عند الضرورة.
645
+ </role>
646
+
647
+ <decision_logic>
648
+ عليك تحليل "سؤال المستخدم" و"السياق التشريعي" وتصنيف الحالة واختيار الرد المناسب بناءً على القواعد التالية بدقة:
649
+
650
+ 🔴 الحالة الأولى: (الإجابة موجودة في السياق التشريعي)
651
+ الشرط: إذا وجدت معلومات داخل "السياق التشريعي المتاح" تجيب على السؤال.
652
+ الفعل:
653
+ 1. استخرج الإجابة من السياق فقط.
654
+ 2. ابدأ الإجابة مباشرة دون مقدمات.
655
+ 3. يجب توثيق الإجابة برقم المادة (مثال: "نصت المادة (50) على...").
656
+ 4. توقف هنا. لا تضف أي معلومات خارجية.
657
+
658
+ 🟡 الحالة الثانية: (السياق فارغ/غير مفيد + السؤال إجرائي/عملي)
659
+ الشرط: إذا لم تجد الإجابة في السياق، وكان السؤال عن إجراءات عملية (مثل: حادث، سرقة، طلاق، تحرير محضر، تعامل مع الشرطة).
660
+ الفعل:
661
+ 1. تجاهل السياق الفارغ.
662
+ 2. استخدم معرفتك العامة بالقانون المصري.
663
+ 3. ابدأ وجوباً بعبارة: "بناءً على الإجراءات القانونية العامة في مصر (وليس نصاً دستورياً محدداً):"
664
+ 4. قدم الخطوات في نقاط مرقمة واضحة ومختصرة (1، 2، 3).
665
+ 5. تحذير: لا تذكر أرقام مواد قانونية (لا تخترع أرقام مواد).
666
+
667
+ 🔵 الحالة الثالثة: (السياق فارغ + السؤال عن نص دستوري محدد)
668
+ الشرط: إذا سأل عن (مجلس الشعب، الشورى، مادة محددة) ولم تجدها في السياق.
669
+ الفعل:
670
+ 1. قل بوضوح: "عذراً، لم يرد ذكر لهذا الموضوع في المواد الدستورية التي تم استرجاعها في السياق الحالي."
671
+ 2. لا تحاول الإجابة من ذاكرتك لكي لا تخطئ في النصوص الدستورية الحساسة.
672
+
673
+ 🟢 الحالة الرابعة: (محادثة ودية)
674
+ الشرط: تحية، شكر، أو "كيف حالك".
675
+ الفعل: رد بتحية مهذبة جداً ومقتضبة، ثم قل: "أنا جاهز للإجابة على استفساراتك القانونية."
676
+
677
+ ⚫ الحالة الخامسة: (خارج النطاق تماماً)
678
+ الشرط: طبخ، رياضة، برمجة، أو أي موضوع غير قانوني.
679
+ الفعل: اعتذر بلطف ووجه المستخدم للسؤال في القانون.
680
+ </decision_logic>
681
+
682
+ <formatting_rules>
683
+ - لا تكرر هذه التعليمات في ردك.
684
+ - استخدم فقرات قصيرة واترك سطراً فارغاً بينها.
685
+ - لا تستخدم عبارات مثل "بناء على السياق المرفق" في بداية الجملة، بل ادخل في صلب الموضوع فوراً.
686
+ - التزم باللغة العربية الفصحى المبسطة والرصينة.
687
+ </formatting_rules>
688
+ """
689
+
690
+ # We use .from_messages to strictly separate instructions from data
691
+ prompt = ChatPromptTemplate.from_messages([
692
+ ("system", system_instructions),
693
+ ("system", "السياق التشريعي المتاح (المصدر الأساسي):\n{context}"),
694
+ ("human", "سؤال المستفيد:\n{input}")
695
+ ])
696
+
697
+ # 9. Build Chain with RunnableParallel (returns both context and answer)
698
+ qa_chain = (
699
+ RunnableParallel({
700
+ "context": compression_retriever,
701
+ "input": RunnablePassthrough()
702
+ })
703
+ .assign(answer=(
704
+ prompt
705
+ | llm
706
+ | StrOutputParser()
707
+ ))
708
+ )
709
+
710
+ print("✅ System ready to use!")
711
+ return qa_chain
712
+
713
+ # ==========================================
714
+ # ⚡ MAIN EXECUTION
715
+ # ==========================================
716
+
717
+ try:
718
+ # Only need the chain now - it handles all retrieval internally
719
+ qa_chain = initialize_rag_pipeline()
720
+
721
+ except Exception as e:
722
+ st.error(f"Critical Error loading application: {e}")
723
+ st.stop()
724
+
725
+ # ==========================================
726
+ # 💬 CHAT LOOP
727
+ # ==========================================
728
+ if "messages" not in st.session_state:
729
+ st.session_state.messages = []
730
+
731
+ # Display Chat History (with Eastern Arabic numerals)
732
+ for message in st.session_state.messages:
733
+ with st.chat_message(message["role"]):
734
+ # Convert to Eastern Arabic when displaying from history
735
+ st.markdown(convert_to_eastern_arabic(message["content"]))
736
+
737
+ # Handle New User Input
738
+ if prompt_input := st.chat_input("اكتب سؤالك القانوني هنا..."):
739
+ # Show user message
740
+ st.session_state.messages.append({"role": "user", "content": prompt_input})
741
+ with st.chat_message("user"):
742
+ st.markdown(prompt_input)
743
+
744
+ # Generate Response
745
+ with st.chat_message("assistant"):
746
+ with st.spinner("جاري التحليل القانوني..."):
747
+ try:
748
+ # Invoke chain ONCE - returns Dict with 'context', 'input', and 'answer'
749
+ with PhoenixSpan("chat_request", {
750
+ "question": prompt_input,
751
+ "question_len": len(prompt_input or ""),
752
+ "timestamp": datetime.utcnow().isoformat(),
753
+ }, kind="SERVER") as span:
754
+ result = qa_chain.invoke(prompt_input)
755
+
756
+ # Extract answer and context from result
757
+ response_text = result["answer"]
758
+ source_docs = result["context"]
759
+
760
+ # Attach detailed context attributes
761
+ try:
762
+ ctx_list = result.get("context", []) or []
763
+ ctx_count = len(ctx_list)
764
+ span.set_attr("context_count", ctx_count)
765
+ if ctx_count:
766
+ # Record all article numbers
767
+ article_nums = [doc.metadata.get("article_number", "N/A") for doc in ctx_list]
768
+ span.set_attr("context_articles", ", ".join(map(str, article_nums)))
769
+ # Record legal natures
770
+ legal_natures = [doc.metadata.get("legal_nature", "N/A") for doc in ctx_list]
771
+ span.set_attr("legal_natures", ", ".join(legal_natures[:5]))
772
+ # Add context preview (first doc)
773
+ span.set_attr("context_preview", ctx_list[0].page_content[:500])
774
+ except Exception:
775
+ pass
776
+
777
+ # Log LLM generation as a nested span (properly nested under chat_request)
778
+ with PhoenixSpan("llm_generation", {
779
+ "model": "llama-3.1-8b-instant",
780
+ "temperature": 0.1,
781
+ "top_p": 0.9,
782
+ "prompt_preview": prompt_input[:300]
783
+ }, kind="CLIENT") as llm_span:
784
+ llm_span.set_attr("response", response_text)
785
+ llm_span.set_attr("response_len", len(response_text))
786
+ llm_span.set_attr("response_preview", response_text[:500])
787
+ llm_span.set_attr("context_docs_used", len(source_docs))
788
+
789
+ # Display Answer
790
+ response_text_arabic = convert_to_eastern_arabic(response_text)
791
+ st.markdown(response_text_arabic)
792
+
793
+ # Display Sources
794
+ if source_docs and len(source_docs) > 0:
795
+ print(f"✅ Found {len(source_docs)} documents")
796
+ # Deduplicate documents by article_number
797
+ seen_articles = set()
798
+ unique_docs = []
799
+
800
+ for doc in source_docs:
801
+ article_num = str(doc.metadata.get('article_number', '')).strip()
802
+ if article_num and article_num not in seen_articles:
803
+ seen_articles.add(article_num)
804
+ unique_docs.append(doc)
805
+
806
+ st.markdown("---") # Separator before sources
807
+
808
+ if unique_docs:
809
+ with st.expander(f"📚 المصادر المستخدمة ({len(unique_docs)} مادة)"):
810
+ st.markdown("### المواد الدستورية المستخدمة في التحليل:")
811
+ st.markdown("---")
812
+
813
+ for idx, doc in enumerate(unique_docs, 1):
814
+ article_num = str(doc.metadata.get('article_number', '')).strip()
815
+ legal_nature = doc.metadata.get('legal_nature', '')
816
+
817
+ if article_num:
818
+ st.markdown(f"**المادة رقم {convert_to_eastern_arabic(article_num)}**")
819
+ if legal_nature:
820
+ st.markdown(f"*الطبيعة القانونية: {legal_nature}*")
821
+
822
+ # Display article content
823
+ content_lines = doc.page_content.strip().split('\n')
824
+ for line in content_lines:
825
+ line = line.strip()
826
+ if line:
827
+ st.markdown(convert_to_eastern_arabic(line))
828
+
829
+ st.markdown("---")
830
+ else:
831
+ st.info("📌 لم يتم العثور على مصادر")
832
+ else:
833
+ st.info("📌 لم يتم العثور على مصادر")
834
+
835
+ # Persist the raw answer to avoid double conversion glitches on rerun
836
+ st.session_state.messages.append({"role": "assistant", "content": response_text})
837
+ except Exception as e:
838
+ st.error(f"حدث خطأ: {e}")
app_final_updated.py ADDED
@@ -0,0 +1,704 @@
+ # -*- coding: utf-8 -*-
+ import os
+ import sys
+ import json
+ from dotenv import load_dotenv
+ import logging
+ import warnings
+
+ # Suppress progress bars from transformers/tqdm
+ os.environ['TRANSFORMERS_NO_PROGRESS_BAR'] = '1'
+ warnings.filterwarnings('ignore')
+
+ # 1. Documents & Retriever Interfaces
+ from langchain_core.documents import Document
+ from langchain_core.retrievers import BaseRetriever
+ from langchain_core.callbacks import CallbackManagerForRetrieverRun
+ from typing import List
+ from rank_bm25 import BM25Okapi
+ import numpy as np
+
+ # 2. Vector Store & Embeddings
+ from langchain_chroma import Chroma
+ from langchain_huggingface import HuggingFaceEmbeddings
+
+ # 3. Reranker Imports
+ from langchain_classic.retrievers.document_compressors import CrossEncoderReranker
+ from langchain_classic.retrievers import ContextualCompressionRetriever
+ from langchain_community.cross_encoders import HuggingFaceCrossEncoder
+
+ # 4. LLM
+ from langchain_groq import ChatGroq
+ from langchain_core.prompts import ChatPromptTemplate
+ from langchain_core.output_parsers import StrOutputParser
+ from langchain_core.runnables import RunnablePassthrough, RunnableParallel
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ load_dotenv()
+
+ # ==========================================
+ # 🧭 RUNTIME MODE (UI vs CLI)
+ # ==========================================
+ RUN_MODE = os.getenv("RUN_MODE", "").strip().lower()
+ IS_CLI = RUN_MODE in {"cli", "terminal", "eval", "evaluation"}
+
+ if IS_CLI:
+ class _DummyStreamlit:
+ @staticmethod
+ def cache_resource(func=None, **_kwargs):
+ if func is None:
+ def decorator(f):
+ return f
+ return decorator
+ return func
+
+ st = _DummyStreamlit()
+ else:
+ import streamlit as st
+
+ # ==========================================
+ # 📁 PATHS (use project-relative folders)
+ # ==========================================
+ BASE_DIR = os.path.dirname(os.path.abspath(__file__))
+ DATA_DIR = os.path.join(BASE_DIR, "data")
+ CHROMA_DIR = os.path.join(BASE_DIR, "chroma_db")
+
+ if not IS_CLI:
+ # ==========================================
+ # 🎨 UI SETUP (CSS FOR ARABIC & RTL)
+ # ==========================================
+ st.set_page_config(page_title="المساعد القانوني", page_icon="⚖️")
+
+ # This CSS block fixes the "001" number issue and right alignment
+ st.markdown("""
+ <style>
+ /* Force the main app container to be Right-to-Left */
+ .stApp {
+ direction: rtl;
+ text-align: right;
+ }
+
+ /* Fix input fields to type from right */
+ .stTextInput input {
+ direction: rtl;
+ text-align: right;
+ }
+
+ /* Fix chat messages alignment */
+ .stChatMessage {
+ direction: rtl;
+ text-align: right;
+ }
+
+ /* Ensure proper paragraph spacing */
+ .stMarkdown p {
+ margin: 0.5em 0 !important;
+ line-height: 1.6;
+ word-spacing: 0.1em;
+ }
+
+ /* Ensure numbers display correctly in RTL */
+ p, div, span, label {
+ unicode-bidi: embed;
+ direction: inherit;
+ white-space: normal;
+ word-wrap: break-word;
+ }
+
+ /* Force all content to respect RTL */
+ * {
+ direction: rtl !important;
+ }
+
+ /* Preserve line breaks and spacing */
+ .stMarkdown pre {
+ direction: rtl;
+ text-align: right;
+ white-space: pre-wrap;
+ word-wrap: break-word;
+ }
+
+ /* Hide the "Deploy" button and standard menu for cleaner look */
+ #MainMenu {visibility: hidden;}
+ footer {visibility: hidden;}
+
+ </style>
+ """, unsafe_allow_html=True)
+
+ # Numeral conversion helper (shared by the UI rendering below)
+ def convert_to_eastern_arabic(text):
+ """Converts 0123456789 to ٠١٢٣٤٥٦٧٨٩"""
+ if not isinstance(text, str):
+ return text
+ western_numerals = '0123456789'
+ eastern_numerals = '٠١٢٣٤٥٦٧٨٩'
+ translation_table = str.maketrans(western_numerals, eastern_numerals)
+ return text.translate(translation_table)
+
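The numeral helper above is a straight character-for-character mapping via `str.maketrans`, so it is safe to call on any string (non-strings pass through unchanged). A self-contained usage sketch:

```python
# Standalone copy of the app's numeral converter, for illustration
western_numerals = '0123456789'
eastern_numerals = '٠١٢٣٤٥٦٧٨٩'
translation_table = str.maketrans(western_numerals, eastern_numerals)

def convert_to_eastern_arabic(text):
    """Map Western digits to Eastern Arabic digits; pass non-strings through."""
    if not isinstance(text, str):
        return text
    return text.translate(translation_table)

print(convert_to_eastern_arabic("المادة 123"))  # المادة ١٢٣
print(convert_to_eastern_arabic(42))            # 42 (unchanged, not a str)
```

Because `str.translate` works per code point, mixed Arabic/Latin text and punctuation are left untouched.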
+ if not IS_CLI:
+ st.title("⚖️ المساعد القانوني الذكي (دستور مصر)")
+
+ # ==========================================
+ # 🚀 CACHED RESOURCE LOADING (THE FIX)
+ # ==========================================
+ # This decorator tells Streamlit: "Run this ONCE and save the result."
+ @st.cache_resource
+ def initialize_rag_pipeline():
+ print("🔄 Initializing system...")
+ print("📥 Loading data...")
+ # 1. Load JSONs from ./data (supports multiple files)
+ def load_json_folder(folder_path: str):
+ all_items = []
+ for filename in os.listdir(folder_path):
+ if not filename.lower().endswith(".json"):
+ continue
+ file_path = os.path.join(folder_path, filename)
+ with open(file_path, "r", encoding="utf-8") as f:
+ obj = json.load(f)
+
+ # Support: list of articles, or dict with 'data'/'articles', or single dict article
+ if isinstance(obj, list):
+ all_items.extend(obj)
+ elif isinstance(obj, dict):
+ if "data" in obj and isinstance(obj["data"], list):
+ all_items.extend(obj["data"])
+ elif "articles" in obj and isinstance(obj["articles"], list):
+ all_items.extend(obj["articles"])
+ else:
+ all_items.append(obj)
+ else:
+ logger.warning(f"Unsupported JSON format in: {file_path}")
+ return all_items
+
+ if not os.path.exists(DATA_DIR):
+ raise FileNotFoundError(f"Data folder not found: {DATA_DIR}")
+
+ data = load_json_folder(DATA_DIR)
+
+ # Optional: de-duplicate (article_id preferred, fallback to article_number)
+ unique = {}
+ for item in data:
+ key = str(item.get("article_id") or item.get("article_number") or hash(json.dumps(item, ensure_ascii=False)))
+ unique[key] = item
+ data = list(unique.values())
+
+ # Create a mapping of article numbers for cross-reference lookup
+ article_map = {str(item['article_number']): item for item in data if 'article_number' in item}
+
+ docs = []
+ for item in data:
+ article_number = item.get("article_number")
+ original_text = item.get("original_text")
+ simplified_summary = item.get("simplified_summary")
+
+ if not article_number or not original_text or not simplified_summary:
+ logger.warning("Skipping item with missing fields (article_number/original_text/simplified_summary)")
+ continue
+
+ cross_refs = item.get("cross_references")
+ if not isinstance(cross_refs, list):
+ cross_refs = []
+
+ # Build cross-reference section
+ cross_ref_text = ""
+ if cross_refs:
+ cross_ref_text = "\nالمواد ذات الصلة (المراجع المتقاطعة): " + ", ".join(
+ [f"المادة {ref}" for ref in cross_refs]
+ )
+
+ # Construct content
+ page_content = f"""
+ رقم المادة: {article_number}
+ النص الأصلي: {original_text}
+ الشرح المبسط: {simplified_summary}{cross_ref_text}
+ """
+
+ metadata = {
+ "article_id": item.get("article_id") or str(article_number),
+ "article_number": str(article_number),
+ "legal_nature": item.get("legal_nature", ""),
+ "keywords": ", ".join(item.get("keywords", []) or []),
+ "part": item.get("part (Bab)", ""),
+ "chapter": item.get("chapter (Fasl)", ""),
+ "cross_references": ", ".join([str(ref) for ref in cross_refs])
+ }
+ docs.append(Document(page_content=page_content, metadata=metadata))
+
+ print(f"✅ Loaded {len(docs)} constitutional articles")
+
+ # 2. Embeddings
+ print("Loading embeddings model...")
+ embeddings = HuggingFaceEmbeddings(
+ model_name="Omartificial-Intelligence-Space/GATE-AraBert-v1"
+ )
+ print("✅ Embeddings model ready")
+
+ # 3. No splitting - keep articles as complete units
+ chunks = docs
+
+ # 4. Vector Store (persist once, load on next runs)
+ if os.path.exists(CHROMA_DIR) and os.listdir(CHROMA_DIR):
+ print("📦 Loading existing vector database...")
+ vectorstore = Chroma(
+ persist_directory=CHROMA_DIR,
+ embedding_function=embeddings
+ )
+ print("✅ Loaded existing Chroma DB (no re-embedding)")
+ else:
+ print("🧱 Building vector database for the first time (this will create embeddings)...")
+ vectorstore = Chroma.from_documents(
+ chunks,
+ embeddings,
+ persist_directory=CHROMA_DIR
+ )
+ print("✅ Built Chroma DB and persisted to disk")
+
+ base_retriever = vectorstore.as_retriever(search_kwargs={"k": 15})
+
+ # 5. Create BM25 Keyword Retriever
+ class BM25Retriever(BaseRetriever):
+ """BM25-based keyword retriever for constitutional articles"""
+ corpus_docs: List[Document]
+ bm25: BM25Okapi = None
+ k: int = 15
+
+ class Config:
+ arbitrary_types_allowed = True
+
+ def __init__(self, **data):
+ super().__init__(**data)
+ # Tokenize corpus for BM25
+ tokenized_corpus = [doc.page_content.split() for doc in self.corpus_docs]
+ self.bm25 = BM25Okapi(tokenized_corpus)
+
+ def _get_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ # Tokenize query
+ tokenized_query = query.split()
+ # Get BM25 scores
+ scores = self.bm25.get_scores(tokenized_query)
+ # Get top k indices
+ top_indices = np.argsort(scores)[::-1][:self.k]
+ # Return documents
+ return [self.corpus_docs[i] for i in top_indices if scores[i] > 0]
+
+ async def _aget_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ return self._get_relevant_documents(query, run_manager=run_manager)
+
+ bm25_retriever = BM25Retriever(corpus_docs=docs, k=15)
+ print("✅ BM25 keyword retriever ready")
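The top-k selection in `_get_relevant_documents` sorts BM25 scores descending with `np.argsort(...)[::-1]` and then drops zero-score documents. A tiny standalone illustration of that pattern (toy scores, no `rank_bm25` dependency):

```python
import numpy as np

# Toy relevance scores for five documents (as BM25 might return them)
scores = np.array([0.0, 2.5, 0.7, 3.1, 0.0])
docs = ["d0", "d1", "d2", "d3", "d4"]
k = 3

# Indices of the k highest scores, best first
top_indices = np.argsort(scores)[::-1][:k]

# Keep only documents that actually matched (score > 0)
result = [docs[i] for i in top_indices if scores[i] > 0]
print(result)  # ['d3', 'd1', 'd2']
```

The `scores[i] > 0` filter matters: with fewer than `k` matching documents, BM25 would otherwise pad the result with irrelevant zero-score entries.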
+
+ # 6. Create Metadata Filter Retriever
+ class MetadataFilterRetriever(BaseRetriever):
+ """Metadata-based filtering retriever"""
+ corpus_docs: List[Document]
+ k: int = 15
+
+ class Config:
+ arbitrary_types_allowed = True
+
+ def _get_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ query_lower = query.lower()
+ scored_docs = []
+
+ for doc in self.corpus_docs:
+ score = 0
+ # Match keywords
+ keywords = doc.metadata.get('keywords', '').lower()
+ if any(word in keywords for word in query_lower.split()):
+ score += 3
+
+ # Match legal nature
+ legal_nature = doc.metadata.get('legal_nature', '').lower()
+ if any(word in legal_nature for word in query_lower.split()):
+ score += 2
+
+ # Match part/chapter
+ part = doc.metadata.get('part', '').lower()
+ chapter = doc.metadata.get('chapter', '').lower()
+ if any(word in part or word in chapter for word in query_lower.split()):
+ score += 1
+
+ # Match in content
+ if any(word in doc.page_content.lower() for word in query_lower.split()):
+ score += 1
+
+ if score > 0:
+ scored_docs.append((doc, score))
+
+ # Sort by score and return top k
+ scored_docs.sort(key=lambda x: x[1], reverse=True)
+ return [doc for doc, _ in scored_docs[:self.k]]
+
+ async def _aget_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ return self._get_relevant_documents(query, run_manager=run_manager)
+
+ metadata_retriever = MetadataFilterRetriever(corpus_docs=docs, k=15)
+ print("✅ Metadata filter retriever ready")
+
+ # 7. Create Hybrid RRF Retriever
+ class HybridRRFRetriever(BaseRetriever):
+ """Combines semantic, BM25, and metadata retrievers using Reciprocal Rank Fusion"""
+ semantic_retriever: BaseRetriever
+ bm25_retriever: BM25Retriever
+ metadata_retriever: MetadataFilterRetriever
+ beta_semantic: float = 0.6 # Weight for semantic search
+ beta_keyword: float = 0.2 # Weight for BM25 keyword search
+ beta_metadata: float = 0.2 # Weight for metadata filtering
+ k: int = 60 # RRF constant (typically 60)
+ top_k: int = 15
+
+ class Config:
+ arbitrary_types_allowed = True
+
+ def _get_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ # Get results from all three retrievers
+ semantic_docs = self.semantic_retriever.invoke(query)
+ bm25_docs = self.bm25_retriever.invoke(query)
+ metadata_docs = self.metadata_retriever.invoke(query)
+
+ # Apply Reciprocal Rank Fusion
+ rrf_scores = {}
+
+ # Process semantic results
+ for rank, doc in enumerate(semantic_docs, start=1):
+ doc_id = (doc.metadata.get('article_id') or doc.metadata.get('article_number') or str(hash(doc.page_content)))
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_semantic / (self.k + rank)
+
+ # Process BM25 results
+ for rank, doc in enumerate(bm25_docs, start=1):
+ doc_id = (doc.metadata.get('article_id') or doc.metadata.get('article_number') or str(hash(doc.page_content)))
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_keyword / (self.k + rank)
+
+ # Process metadata results
+ for rank, doc in enumerate(metadata_docs, start=1):
+ doc_id = (doc.metadata.get('article_id') or doc.metadata.get('article_number') or str(hash(doc.page_content)))
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_metadata / (self.k + rank)
+
+ # Create document lookup
+ all_docs = {}
+ for doc in semantic_docs + bm25_docs + metadata_docs:
+ doc_id = (doc.metadata.get('article_id') or doc.metadata.get('article_number') or str(hash(doc.page_content)))
+ if doc_id not in all_docs:
+ all_docs[doc_id] = doc
+
+ # Sort by RRF score
+ sorted_doc_ids = sorted(rrf_scores.items(), key=lambda x: x[1], reverse=True)
+
+ # Return top k documents
+ result_docs = []
+ for doc_id, score in sorted_doc_ids[:self.top_k]:
+ if doc_id in all_docs:
+ result_docs.append(all_docs[doc_id])
+
+ return result_docs
+
+ async def _aget_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ return self._get_relevant_documents(query, run_manager=run_manager)
+
+ # Create hybrid retriever with tuned beta weights
+ hybrid_retriever = HybridRRFRetriever(
+ semantic_retriever=base_retriever,
+ bm25_retriever=bm25_retriever,
+ metadata_retriever=metadata_retriever,
+ beta_semantic=0.5, # Semantic search gets highest weight (most reliable)
+ beta_keyword=0.3, # BM25 keyword search (good for exact term matches)
+ beta_metadata=0.2, # Metadata filtering (supporting role)
+ k=60,
+ top_k=20
+ )
+ print("✅ Hybrid RRF retriever ready with β weights: semantic=0.5, keyword=0.3, metadata=0.2")
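The fusion above computes, for each document, score(d) = Σ over retrievers of β_r / (k + rank_r(d)) with k = 60, so a document ranked well by several retrievers beats one ranked first by a single retriever. A minimal sketch of that arithmetic with toy document IDs:

```python
def rrf(rankings, betas, k=60):
    """Weighted Reciprocal Rank Fusion over several ranked lists of doc IDs."""
    scores = {}
    for ranked, beta in zip(rankings, betas):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + beta / (k + rank)
    # Doc IDs sorted by fused score, best first
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["A", "B", "C"]   # semantic retriever's ranking
keyword  = ["B", "D"]        # BM25 ranking
metadata = ["A", "D"]        # metadata-filter ranking

fused = rrf([semantic, keyword, metadata], betas=[0.5, 0.3, 0.2])
print(fused)  # ['B', 'A', 'D', 'C']
```

Here "B" wins despite never being ranked first by the dominant retriever, because it scores in both the semantic and keyword lists; this cross-retriever agreement is exactly what RRF rewards.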
+
+ # 8. Create Cross-Reference Enhanced Retriever
+ class CrossReferenceRetriever(BaseRetriever):
+ """Enhances retrieval by automatically fetching cross-referenced articles"""
+ base_retriever: BaseRetriever
+ article_map: dict
+
+ class Config:
+ arbitrary_types_allowed = True
+
+ def _get_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ # Get initial results
+ initial_docs = self.base_retriever.invoke(query)
+
+ # Collect all related article numbers
+ all_article_numbers = set()
+ for doc in initial_docs:
+ if 'article_number' in doc.metadata:
+ all_article_numbers.add(doc.metadata['article_number'])
+ # Parse cross_references (now stored as comma-separated string)
+ cross_refs_str = doc.metadata.get('cross_references', '')
+ if cross_refs_str:
+ cross_refs = [ref.strip() for ref in cross_refs_str.split(',')]
+ for ref in cross_refs:
+ if ref: # Skip empty strings
+ all_article_numbers.add(str(ref))
+
+ # Build enhanced document list
+ enhanced_docs = []
+ seen_numbers = set()
+
+ # Add initially retrieved documents
+ for doc in initial_docs:
+ enhanced_docs.append(doc)
+ seen_numbers.add(doc.metadata.get('article_number'))
+
+ # Add cross-referenced articles not yet retrieved
+ for article_num in all_article_numbers:
+ if article_num not in seen_numbers and article_num in self.article_map:
+ article_data = self.article_map[article_num]
+ cross_ref_text = ""
+ cross_refs = article_data.get("cross_references")
+ if not isinstance(cross_refs, list):
+ cross_refs = []
+ if cross_refs:
+ cross_ref_text = "\nالمواد ذات الصلة: " + ", ".join(
+ [f"المادة {ref}" for ref in cross_refs]
+ )
+
+ page_content = f"""
+ رقم المادة: {article_data.get('article_number', '')}
+ النص الأصلي: {article_data.get('original_text', '')}
+ الشرح المبسط: {article_data.get('simplified_summary', '')}{cross_ref_text}
+ """
+
+ enhanced_doc = Document(
+ page_content=page_content,
+ metadata={
+ "article_id": article_data.get("article_id") or str(article_data.get("article_number", "")),
+ "article_number": str(article_data.get("article_number", "")),
+ "legal_nature": article_data.get("legal_nature", ""),
+ "keywords": ", ".join(article_data.get("keywords", []) or []),
+ "cross_references": ", ".join([str(ref) for ref in cross_refs])
+ }
+ )
+ enhanced_docs.append(enhanced_doc)
+ seen_numbers.add(article_num)
+
+ return enhanced_docs
+
+ async def _aget_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ return self._get_relevant_documents(query, run_manager=run_manager)
+
+ cross_ref_retriever = CrossReferenceRetriever(
+ base_retriever=hybrid_retriever,
+ article_map=article_map
+ )
+ print("✅ Cross-reference retriever ready (using hybrid RRF base)")
+
+ # 9. Reranker
+ print("Loading reranker model...")
+ local_model_path = r"D:\FOE\Senior 2\Graduation Project\Chatbot_me\reranker"
+
+ if not os.path.exists(local_model_path):
+ raise FileNotFoundError(f"Reranker path not found: {local_model_path}")
+
+ model = HuggingFaceCrossEncoder(model_name=local_model_path)
+ compressor = CrossEncoderReranker(model=model, top_n=5)
+
+ compression_retriever = ContextualCompressionRetriever(
+ base_compressor=compressor,
+ base_retriever=cross_ref_retriever
+ )
+ print("✅ Reranker model ready")
+
+ # 10. LLM Configuration (balanced for consistency with slight creativity)
+ llm = ChatGroq(
+ groq_api_key=os.getenv("GROQ_API_KEY"),
+ model_name="llama-3.1-8b-instant",
+ temperature=0.3, # Slightly increased to allow helpful general advice
+ model_kwargs={"top_p": 0.9}
+ )
+
+ # ==================================================
+ # 🧠 PROMPT ENGINEERING: DECISION TREE LOGIC
+ # (system instructions kept strictly separate from user input)
+ # ==================================================
+
+ system_instructions = """
+ <role>
+ أنت "المساعد القانوني الذكي"، خبير متخصص في الدستور المصري والقوانين الإجرائية.
+ مهمتك: تقديم إجابات دقيقة بناءً على "السياق التشريعي" المرفق أولاً، أو تقديم نصائح إجرائية عامة عند الضرورة.
+ </role>
+
+ <decision_logic>
+ عليك تحليل "سؤال المستخدم" و"السياق التشريعي" وتصنيف الحالة واختيار الرد المناسب بناءً على القواعد التالية بدقة:
+
+ 🔴 الحالة الأولى: (الإجابة موجودة في السياق التشريعي)
+ الشرط: إذا وجدت معلومات داخل "السياق التشريعي المتاح" تجيب على السؤال.
+ الفعل:
+ 1. استخرج الإجابة من السياق فقط.
+ 2. ابدأ الإجابة مباشرة دون مقدمات.
+ 3. يجب توثيق الإجابة برقم المادة (مثال: "نصت المادة (50) على...").
+ 4. توقف هنا. لا تضف أي معلومات خارجية.
+
+ 🟡 الحالة الثانية: (السياق فارغ/غير مفيد + السؤال إجرائي/عملي)
+ الشرط: إذا لم تجد الإجابة في السياق، وكان السؤال عن إجراءات عملية (مثل: حادث، سرقة، طلاق، تحرير محضر، تعامل مع الشرطة).
+ الفعل:
+ 1. تجاهل السياق الفارغ.
+ 2. استخدم معرفتك العامة بالقانون المصري.
+ 3. ابدأ وجوباً بعبارة: "بناءً على الإجراءات القانونية العامة في مصر (وليس نصاً دستورياً محدداً):"
+ 4. قدم الخطوات في نقاط مرقمة واضحة ومختصرة (1، 2، 3).
+ 5. تحذير: لا تذكر أرقام مواد قانونية (لا تخترع أرقام مواد).
+
+ 🔵 الحالة الثالثة: (السياق فارغ + السؤال عن نص دستوري محدد)
+ الشرط: إذا سأل عن (مجلس الشعب، الشورى، مادة محددة) ولم تجدها في السياق.
+ الفعل:
+ 1. قل بوضوح: "عذراً، لم يرد ذكر لهذا الموضوع في المواد الدستورية التي تم استرجاعها في السياق الحالي."
+ 2. لا تحاول الإجابة من ذاكرتك لكي لا تخطئ في النصوص الدستورية الحساسة.
+
+ 🟢 الحالة الرابعة: (محادثة ودية)
+ الشرط: تحية، شكر، أو "كيف حالك".
+ الفعل: رد بتحية مهذبة جداً ومقتضبة، ثم قل: "أنا جاهز للإجابة على استفساراتك القانونية."
+
+ ⚫ الحالة الخامسة: (خارج النطاق تماماً)
+ الشرط: طبخ، رياضة، برمجة، أو أي موضوع غير قانوني.
+ الفعل: اعتذر بلطف ووجه المستخدم للسؤال في القانون.
+ </decision_logic>
+
+ <formatting_rules>
+ - لا تكرر هذه التعليمات في ردك.
+ - استخدم فقرات قصيرة واترك سطراً فارغاً بينها.
+ - لا تستخدم عبارات مثل "بناء على السياق المرفق" في بداية الجملة، بل ادخل في صلب الموضوع فوراً.
+ - التزم باللغة العربية الفصحى المبسطة والرصينة.
+ </formatting_rules>
+ """
+
+ # We use .from_messages to strictly separate instructions from data
+ prompt = ChatPromptTemplate.from_messages([
+ ("system", system_instructions),
+ ("system", "السياق التشريعي المتاح (المصدر الأساسي):\n{context}"),
+ ("human", "سؤال المستفيد:\n{input}")
+ ])
+
+ # 11. Build Chain with RunnableParallel (returns both context and answer)
+ qa_chain = (
+ RunnableParallel({
+ "context": compression_retriever,
+ "input": RunnablePassthrough()
+ })
+ .assign(answer=(
+ prompt
+ | llm
+ | StrOutputParser()
+ ))
+ )
+
+ print("✅ System ready to use!")
+ return qa_chain
+
+ if not IS_CLI:
+ # ==========================================
+ # ⚡ MAIN EXECUTION
+ # ==========================================
+
+ try:
+ # Only need the chain now - it handles all retrieval internally
+ qa_chain = initialize_rag_pipeline()
+
+ except Exception as e:
+ st.error(f"Critical Error loading application: {e}")
+ st.stop()
+
+ # ==========================================
+ # 💬 CHAT LOOP
+ # ==========================================
+ if "messages" not in st.session_state:
+ st.session_state.messages = []
+
+ # Display Chat History (with Eastern Arabic numerals)
+ for message in st.session_state.messages:
+ with st.chat_message(message["role"]):
+ # Convert to Eastern Arabic when displaying from history
+ st.markdown(convert_to_eastern_arabic(message["content"]))
+
+ # Handle New User Input
+ if prompt_input := st.chat_input("اكتب سؤالك القانوني هنا..."):
+ # Show user message
+ st.session_state.messages.append({"role": "user", "content": prompt_input})
+ with st.chat_message("user"):
+ st.markdown(prompt_input)
+
+ # Generate Response
+ with st.chat_message("assistant"):
+ with st.spinner("جاري التحليل القانوني..."):
+ try:
+ # Invoke chain ONCE - returns Dict with 'context', 'input', and 'answer'
+ result = qa_chain.invoke(prompt_input)
+
+ # Extract answer and context from result
+ response_text = result["answer"]
+ source_docs = result["context"] # Context is already in the result!
+
+ # Display Answer
+ response_text_arabic = convert_to_eastern_arabic(response_text)
+ st.markdown(response_text_arabic)
+
+ # Display Sources
+ if source_docs and len(source_docs) > 0:
661
+ print(f"✅ Found {len(source_docs)} documents")
662
+ # Deduplicate documents by article_number
663
+ seen_articles = set()
664
+ unique_docs = []
665
+
666
+ for doc in source_docs:
667
+ article_num = str(doc.metadata.get('article_number', '')).strip()
668
+ if article_num and article_num not in seen_articles:
669
+ seen_articles.add(article_num)
670
+ unique_docs.append(doc)
671
+
672
+ st.markdown("---") # Separator before sources
673
+
674
+ if unique_docs:
675
+ with st.expander(f"📚 المصادر المستخدمة ({len(unique_docs)} مادة)"):
676
+ st.markdown("### المواد الدستورية المستخدمة في التحليل:")
677
+ st.markdown("---")
678
+
679
+ for idx, doc in enumerate(unique_docs, 1):
680
+ article_num = str(doc.metadata.get('article_number', '')).strip()
681
+ legal_nature = doc.metadata.get('legal_nature', '')
682
+
683
+ if article_num:
684
+ st.markdown(f"**المادة رقم {convert_to_eastern_arabic(article_num)}**")
685
+ if legal_nature:
686
+ st.markdown(f"*الطبيعة القانونية: {legal_nature}*")
687
+
688
+ # Display article content
689
+ content_lines = doc.page_content.strip().split('\n')
690
+ for line in content_lines:
691
+ line = line.strip()
692
+ if line:
693
+ st.markdown(convert_to_eastern_arabic(line))
694
+
695
+ st.markdown("---")
696
+ else:
697
+ st.info("📌 لم يتم العثور على مصادر")
698
+ else:
699
+ st.info("📌 لم يتم العثور على مصادر")
700
+
701
+ # Persist the raw answer to avoid double conversion glitches on rerun
702
+ st.session_state.messages.append({"role": "assistant", "content": response_text})
703
+ except Exception as e:
704
+ st.error(f"حدث خطأ: {e}")
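The chat loop above relies on `convert_to_eastern_arabic`, which is defined earlier in the app (outside this hunk). A minimal sketch of such a helper, assuming it only maps Western digits to Eastern Arabic digits via `str.translate`:

```python
# Hypothetical sketch of convert_to_eastern_arabic; the real helper is
# defined earlier in the app and may handle additional cases.
EASTERN_DIGITS = str.maketrans("0123456789", "٠١٢٣٤٥٦٧٨٩")

def convert_to_eastern_arabic(text: str) -> str:
    """Replace Western digits (0-9) with Eastern Arabic digits (٠-٩)."""
    return text.translate(EASTERN_DIGITS)
```

Because the translation table only touches digit code points, Arabic text and punctuation pass through unchanged.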
evaluate.py ADDED
@@ -0,0 +1,620 @@
1
+ # -*- coding: utf-8 -*-
2
+ """
3
+ RAGAS Evaluation Script for Constitutional Legal Assistant
4
+ Evaluates: faithfulness, answer_relevancy, context_precision, context_recall
5
+ """
6
+
7
+ import os
8
+ import json
9
+ from dotenv import load_dotenv
10
+ import logging
11
+ import warnings
12
+
13
+ # Suppress progress bars
14
+ os.environ['TRANSFORMERS_NO_PROGRESS_BAR'] = '1'
15
+ warnings.filterwarnings('ignore')
16
+
17
+ # Core imports
18
+ from langchain_core.documents import Document
19
+ from langchain_core.retrievers import BaseRetriever
20
+ from langchain_core.callbacks import CallbackManagerForRetrieverRun
21
+ from typing import List
22
+ from rank_bm25 import BM25Okapi
23
+ import numpy as np
24
+
25
+ # Vector Store & Embeddings
26
+ from langchain_chroma import Chroma
27
+ from langchain_huggingface import HuggingFaceEmbeddings
28
+
29
+ # Reranker
30
+ from langchain_classic.retrievers.document_compressors import CrossEncoderReranker
31
+ from langchain_classic.retrievers import ContextualCompressionRetriever
32
+ from langchain_community.cross_encoders import HuggingFaceCrossEncoder
33
+
34
+ # LLM
35
+ from langchain_groq import ChatGroq
36
+ from langchain_core.prompts import ChatPromptTemplate
37
+ from langchain_core.output_parsers import StrOutputParser
38
+ from langchain_core.runnables import RunnablePassthrough, RunnableParallel
39
+
40
+ # Evaluation
41
+ from datasets import Dataset
42
+ from ragas import evaluate
43
+ from ragas.metrics import (
44
+ faithfulness,
45
+ answer_relevancy,
46
+ context_precision,
47
+ context_recall,
48
+ )
49
+ from ragas.llms import LangchainLLMWrapper
50
+ from ragas.embeddings import LangchainEmbeddingsWrapper
51
+
52
+ # Configure logging
53
+ logging.basicConfig(level=logging.INFO)
54
+ logger = logging.getLogger(__name__)
55
+
56
+ load_dotenv()
57
+
58
+ # ==========================================
59
+ # 🚀 RAG PIPELINE INITIALIZATION
60
+ # ==========================================
61
+
62
+ def initialize_rag_pipeline():
63
+ """Initialize the RAG pipeline for constitutional legal questions"""
64
+ print("🔄 Initializing RAG pipeline...")
65
+ print("📥 Loading data...")
66
+
67
+ # 1. Load JSON
68
+ json_path = "Egyptian_Constitution_legalnature_only.json"
69
+ if not os.path.exists(json_path):
70
+ raise FileNotFoundError(f"File not found: {json_path}")
71
+
72
+ with open(json_path, "r", encoding="utf-8") as f:
73
+ data = json.load(f)
74
+
75
+ # Create article mapping for cross-references
76
+ article_map = {str(item['article_number']): item for item in data}
77
+
78
+ docs = []
79
+ for item in data:
80
+ # Build cross-reference section
81
+ cross_ref_text = ""
82
+ if item.get('cross_references') and len(item['cross_references']) > 0:
83
+ cross_ref_text = "\nالمواد ذات الصلة (المراجع المتقاطعة): " + ", ".join(
84
+ [f"المادة {ref}" for ref in item['cross_references']]
85
+ )
86
+
87
+ # Construct document content
88
+ page_content = f"""
89
+ رقم المادة: {item['article_number']}
90
+ النص الأصلي: {item['original_text']}
91
+ الشرح المبسط: {item['simplified_summary']}{cross_ref_text}
92
+ """
93
+
94
+ metadata = {
95
+ "article_id": item['article_id'],
96
+ "article_number": str(item['article_number']),
97
+ "legal_nature": item['legal_nature'],
98
+ "keywords": ", ".join(item['keywords']),
99
+ "part": item.get('part (Bab)', ''),
100
+ "chapter": item.get('chapter (Fasl)', ''),
101
+ "cross_references": ", ".join([str(ref) for ref in item.get('cross_references', [])])
102
+ }
103
+ docs.append(Document(page_content=page_content, metadata=metadata))
104
+
105
+ print(f"✅ Loaded {len(docs)} constitutional articles")
106
+
107
+ # 2. Embeddings
108
+ print("Loading embeddings model...")
109
+ embeddings = HuggingFaceEmbeddings(
110
+ model_name="Omartificial-Intelligence-Space/GATE-AraBert-v1"
111
+ )
112
+ print("✅ Embeddings ready")
113
+
114
+ # 3. Vector Store
115
+ print("Building vector database...")
116
+ vectorstore = Chroma.from_documents(
117
+ docs,
118
+ embeddings,
119
+ persist_directory="chroma_db"
120
+ )
121
+ base_retriever = vectorstore.as_retriever(search_kwargs={"k": 15})
122
+ print("✅ Vector database ready")
123
+
124
+ # 4. BM25 Keyword Retriever
125
+ class BM25Retriever(BaseRetriever):
126
+ """BM25-based keyword retriever"""
127
+ corpus_docs: List[Document]
128
+ bm25: BM25Okapi = None
129
+ k: int = 15
130
+
131
+ class Config:
132
+ arbitrary_types_allowed = True
133
+
134
+ def __init__(self, **data):
135
+ super().__init__(**data)
136
+ tokenized_corpus = [doc.page_content.split() for doc in self.corpus_docs]
137
+ self.bm25 = BM25Okapi(tokenized_corpus)
138
+
139
+ def _get_relevant_documents(
140
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
141
+ ) -> List[Document]:
142
+ tokenized_query = query.split()
143
+ scores = self.bm25.get_scores(tokenized_query)
144
+ top_indices = np.argsort(scores)[::-1][:self.k]
145
+ return [self.corpus_docs[i] for i in top_indices if scores[i] > 0]
146
+
147
+ async def _aget_relevant_documents(self, query: str, **kwargs) -> List[Document]:
148
+ return self._get_relevant_documents(query)
149
+
150
+ bm25_retriever = BM25Retriever(corpus_docs=docs, k=15)
151
+ print("✅ BM25 retriever ready")
152
+
153
+ # 5. Metadata Filter Retriever
154
+ class MetadataFilterRetriever(BaseRetriever):
155
+ """Metadata-based filtering retriever"""
156
+ corpus_docs: List[Document]
157
+ k: int = 15
158
+
159
+ class Config:
160
+ arbitrary_types_allowed = True
161
+
162
+ def _get_relevant_documents(
163
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
164
+ ) -> List[Document]:
165
+ query_lower = query.lower()
166
+ scored_docs = []
167
+
168
+ for doc in self.corpus_docs:
169
+ score = 0
170
+ keywords = doc.metadata.get('keywords', '').lower()
171
+ if any(word in keywords for word in query_lower.split()):
172
+ score += 3
173
+
174
+ legal_nature = doc.metadata.get('legal_nature', '').lower()
175
+ if any(word in legal_nature for word in query_lower.split()):
176
+ score += 2
177
+
178
+ part = doc.metadata.get('part', '').lower()
179
+ chapter = doc.metadata.get('chapter', '').lower()
180
+ if any(word in part or word in chapter for word in query_lower.split()):
181
+ score += 1
182
+
183
+ if any(word in doc.page_content.lower() for word in query_lower.split()):
184
+ score += 1
185
+
186
+ if score > 0:
187
+ scored_docs.append((doc, score))
188
+
189
+ scored_docs.sort(key=lambda x: x[1], reverse=True)
190
+ return [doc for doc, _ in scored_docs[:self.k]]
191
+
192
+ async def _aget_relevant_documents(self, query: str, **kwargs) -> List[Document]:
193
+ return self._get_relevant_documents(query)
194
+
195
+ metadata_retriever = MetadataFilterRetriever(corpus_docs=docs, k=15)
196
+ print("✅ Metadata retriever ready")
197
+
198
+ # 6. Hybrid RRF Retriever
199
+ class HybridRRFRetriever(BaseRetriever):
200
+ """Combines semantic, BM25, and metadata using Reciprocal Rank Fusion"""
201
+ semantic_retriever: BaseRetriever
202
+ bm25_retriever: BM25Retriever
203
+ metadata_retriever: MetadataFilterRetriever
204
+ beta_semantic: float = 0.5
205
+ beta_keyword: float = 0.3
206
+ beta_metadata: float = 0.2
207
+ k: int = 60
208
+ top_k: int = 15
209
+
210
+ class Config:
211
+ arbitrary_types_allowed = True
212
+
213
+ def _get_relevant_documents(
214
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
215
+ ) -> List[Document]:
216
+ semantic_docs = self.semantic_retriever.invoke(query)
217
+ bm25_docs = self.bm25_retriever.invoke(query)
218
+ metadata_docs = self.metadata_retriever.invoke(query)
219
+
220
+ rrf_scores = {}
221
+
222
+ for rank, doc in enumerate(semantic_docs, start=1):
223
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
224
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_semantic / (self.k + rank)
225
+
226
+ for rank, doc in enumerate(bm25_docs, start=1):
227
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
228
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_keyword / (self.k + rank)
229
+
230
+ for rank, doc in enumerate(metadata_docs, start=1):
231
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
232
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_metadata / (self.k + rank)
233
+
234
+ all_docs = {}
235
+ for doc in semantic_docs + bm25_docs + metadata_docs:
236
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
237
+ if doc_id not in all_docs:
238
+ all_docs[doc_id] = doc
239
+
240
+ sorted_doc_ids = sorted(rrf_scores.items(), key=lambda x: x[1], reverse=True)
241
+ result_docs = []
242
+ for doc_id, score in sorted_doc_ids[:self.top_k]:
243
+ if doc_id in all_docs:
244
+ result_docs.append(all_docs[doc_id])
245
+
246
+ return result_docs
247
+
248
+ async def _aget_relevant_documents(self, query: str, **kwargs) -> List[Document]:
249
+ return self._get_relevant_documents(query)
250
+
251
+ hybrid_retriever = HybridRRFRetriever(
252
+ semantic_retriever=base_retriever,
253
+ bm25_retriever=bm25_retriever,
254
+ metadata_retriever=metadata_retriever,
255
+ beta_semantic=0.5,
256
+ beta_keyword=0.3,
257
+ beta_metadata=0.2,
258
+ k=60,
259
+ top_k=20
260
+ )
261
+ print("✅ Hybrid RRF retriever ready (β: semantic=0.5, keyword=0.3, metadata=0.2)")
262
+
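The hybrid retriever above scores each document as Σᵢ βᵢ / (k + rankᵢ), i.e. weighted Reciprocal Rank Fusion across the three ranked lists. A self-contained sketch of the same scoring on hypothetical article ids (same β weights and k=60 default):

```python
def weighted_rrf(ranked_lists, betas, k=60):
    """Fuse several ranked lists of doc ids with weighted Reciprocal Rank Fusion."""
    scores = {}
    for docs, beta in zip(ranked_lists, betas):
        for rank, doc_id in enumerate(docs, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + beta / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical per-retriever rankings (article ids are illustrative)
semantic = ["a12", "a7", "a3"]
keyword = ["a7", "a3", "a9"]
metadata = ["a3", "a12"]
fused = weighted_rrf([semantic, keyword, metadata], betas=[0.5, 0.3, 0.2])
# fused == ["a3", "a7", "a12", "a9"]
```

Documents that appear in several lists accumulate score from each, which is why `a3` outranks `a12` despite `a12` topping the semantic list.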
263
+ # 7. Cross-Reference Retriever
264
+ class CrossReferenceRetriever(BaseRetriever):
265
+ """Enhances retrieval by fetching cross-referenced articles"""
266
+ base_retriever: BaseRetriever
267
+ article_map: dict
268
+
269
+ class Config:
270
+ arbitrary_types_allowed = True
271
+
272
+ def _get_relevant_documents(
273
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
274
+ ) -> List[Document]:
275
+ initial_docs = self.base_retriever.invoke(query)
276
+
277
+ all_article_numbers = set()
278
+ for doc in initial_docs:
279
+ if 'article_number' in doc.metadata:
280
+ all_article_numbers.add(doc.metadata['article_number'])
281
+ cross_refs_str = doc.metadata.get('cross_references', '')
282
+ if cross_refs_str:
283
+ cross_refs = [ref.strip() for ref in cross_refs_str.split(',')]
284
+ for ref in cross_refs:
285
+ if ref:
286
+ all_article_numbers.add(str(ref))
287
+
288
+ enhanced_docs = []
289
+ seen_numbers = set()
290
+
291
+ for doc in initial_docs:
292
+ enhanced_docs.append(doc)
293
+ seen_numbers.add(doc.metadata.get('article_number'))
294
+
295
+ for article_num in all_article_numbers:
296
+ if article_num not in seen_numbers and article_num in self.article_map:
297
+ article_data = self.article_map[article_num]
298
+ cross_ref_text = ""
299
+ if article_data.get('cross_references'):
300
+ cross_ref_text = "\nالمواد ذات الصلة: " + ", ".join(
301
+ [f"المادة {ref}" for ref in article_data['cross_references']]
302
+ )
303
+
304
+ page_content = f"""
305
+ رقم المادة: {article_data['article_number']}
306
+ النص الأصلي: {article_data['original_text']}
307
+ الشرح المبسط: {article_data['simplified_summary']}{cross_ref_text}
308
+ """
309
+
310
+ enhanced_doc = Document(
311
+ page_content=page_content,
312
+ metadata={
313
+ "article_id": article_data['article_id'],
314
+ "article_number": str(article_data['article_number']),
315
+ "legal_nature": article_data['legal_nature'],
316
+ "keywords": ", ".join(article_data['keywords']),
317
+ "cross_references": ", ".join([str(ref) for ref in article_data.get('cross_references', [])])
318
+ }
319
+ )
320
+ enhanced_docs.append(enhanced_doc)
321
+ seen_numbers.add(article_num)
322
+
323
+ return enhanced_docs
324
+
325
+ async def _aget_relevant_documents(self, query: str, **kwargs) -> List[Document]:
326
+ return self._get_relevant_documents(query)
327
+
328
+ cross_ref_retriever = CrossReferenceRetriever(
329
+ base_retriever=hybrid_retriever,
330
+ article_map=article_map
331
+ )
332
+ print("✅ Cross-reference retriever ready")
333
+
334
+ # 8. Reranker
335
+ print("Loading reranker model...")
336
+ local_model_path = r"D:\FOE\Senior 2\Graduation Project\Chatbot_me\reranker"  # machine-specific path; adjust before running elsewhere
337
+
338
+ if not os.path.exists(local_model_path):
339
+ raise FileNotFoundError(f"Reranker path not found: {local_model_path}")
340
+
341
+ model = HuggingFaceCrossEncoder(model_name=local_model_path)
342
+ compressor = CrossEncoderReranker(model=model, top_n=5)
343
+
344
+ compression_retriever = ContextualCompressionRetriever(
345
+ base_compressor=compressor,
346
+ base_retriever=cross_ref_retriever
347
+ )
348
+ print("✅ Reranker ready (top_n=5)")
349
+
350
+ # 9. LLM Configuration
351
+ llm = ChatGroq(
352
+ groq_api_key=os.getenv("GROQ_API_KEY"),
353
+ model_name="llama-3.1-8b-instant",
354
+ temperature=0.3,
355
+ model_kwargs={"top_p": 0.9}
356
+ )
357
+
358
+ # 10. Prompt Template
359
+ system_instructions = """
360
+ <role>
361
+ أنت "المساعد القانوني الذكي"، خبير متخصص في الدستور المصري والقوانين الإجرائية.
362
+ مهمتك: تقديم إجابات دقيقة بناءً على "السياق التشريعي" المرفق أولاً، أو تقديم نصائح إجرائية عامة عند الضرورة.
363
+ </role>
364
+
365
+ <decision_logic>
366
+ عليك تحليل "سؤال المستخدم" و"السياق التشريعي" وتصنيف الحالة واختيار الرد المناسب:
367
+
368
+ 🔴 الحالة الأولى: (الإجابة موجودة في السياق التشريعي)
369
+ - استخرج الإجابة من السياق فقط
370
+ - ابدأ الإجابة مباشرة دون مقدمات
371
+ - وثق الإجابة برقم المادة
372
+ - توقف، لا تضف معلومات خارجية
373
+
374
+ 🟡 الحالة ��لثانية: (السياق فارغ + السؤال إجرائي/عملي)
375
+ - استخدم معرفتك العامة بالقانون المصري
376
+ - ابدأ بـ: "بناءً على الإجراءات القانونية العامة في مصر:"
377
+ - قدم الخطوات في نقاط مرقمة
378
+
379
+ 🔵 الحالة الثالثة: (السياق فارغ + سؤال دستوري)
380
+ - قل: "عذراً، لم يرد ذكر لهذا في المواد المسترجاعة"
381
+ - لا تخترع نصوصاً دستورية
382
+
383
+ 🟢 الحالة الرابعة: (تحية/شكر)
384
+ - رد بتحية مهذبة مختصرة
385
+
386
+ ⚫ الحالة الخامسة: (خارج النطاق)
387
+ - اعتذر بلطف ووجه للقانون
388
+ </decision_logic>
389
+
390
+ <formatting_rules>
391
+ - استخدم فقرات قصيرة واترك سطراً فارغاً بينها
392
+ - التزم باللغة العربية الفصحى المبسطة
393
+ </formatting_rules>
394
+ """
395
+
396
+ prompt = ChatPromptTemplate.from_messages([
397
+ ("system", system_instructions),
398
+ ("system", "السياق التشريعي المتاح:\n{context}"),
399
+ ("human", "السؤال:\n{input}")
400
+ ])
401
+
402
+ # 11. Build QA Chain
403
+ qa_chain = (
404
+ RunnableParallel({
405
+ "context": compression_retriever,
406
+ "input": RunnablePassthrough()
407
+ })
408
+ .assign(answer=(
409
+ prompt
410
+ | llm
411
+ | StrOutputParser()
412
+ ))
413
+ )
414
+
415
+ print("✅ RAG pipeline initialized!\n")
416
+ return qa_chain
417
+
418
+ # ==========================================
419
+ # 📊 RAGAS EVALUATION
420
+ # ==========================================
421
+
422
+ def run_evaluation(test_file="test_dataset.json", output_file="evaluation_results.json"):
423
+ """Run RAGAS evaluation on test dataset"""
424
+
425
+ print("\n" + "="*60)
426
+ print("📊 RAGAS EVALUATION")
427
+ print("="*60)
428
+
429
+ # Load test dataset
430
+ print(f"\n📂 Loading test dataset: {test_file}")
431
+ with open(test_file, "r", encoding="utf-8") as f:
432
+ test_questions = json.load(f)
433
+ print(f"✅ Loaded {len(test_questions)} test questions")
434
+
435
+ # Initialize RAG pipeline
436
+ print("\n📥 Initializing RAG pipeline...")
437
+ qa_chain = initialize_rag_pipeline()
438
+
439
+ # Generate answers
440
+ print("\n🤖 Generating answers for evaluation...")
441
+ results = {
442
+ "question": [],
443
+ "answer": [],
444
+ "contexts": [],
445
+ "ground_truth": []
446
+ }
447
+
448
+ for idx, item in enumerate(test_questions, 1):
449
+ question = item["question"]
450
+ ground_truth = item.get("ground_truth", "")
451
+
452
+ print(f" [{idx}/{len(test_questions)}] Processing question {idx}...")
453
+
454
+ try:
455
+ result = qa_chain.invoke(question)
456
+ answer = result["answer"]
457
+ contexts = [doc.page_content for doc in result["context"]]
458
+
459
+ results["question"].append(question)
460
+ results["answer"].append(answer)
461
+ results["contexts"].append(contexts)
462
+ results["ground_truth"].append(ground_truth)
463
+
464
+ except Exception as e:
465
+ print(f" ❌ Error: {str(e)[:100]}")
466
+ results["question"].append(question)
467
+ results["answer"].append("Error generating answer")
468
+ results["contexts"].append([])
469
+ results["ground_truth"].append(ground_truth)
470
+
471
+ # Run Ragas evaluation
472
+ print("\n⚙️ Running RAGAS metrics...")
473
+ dataset = Dataset.from_dict(results)
474
+
475
+ # Configure evaluation LLM (same as main app)
476
+ print(" 📌 Using Groq (llama-3.1-8b-instant, temp=0.3, top_p=0.9)")
477
+ evaluator_llm = LangchainLLMWrapper(ChatGroq(
478
+ groq_api_key=os.getenv("GROQ_API_KEY"),
479
+ model_name="llama-3.1-8b-instant",
480
+ temperature=0.3,
481
+ model_kwargs={"top_p": 0.9},
482
+ max_retries=2
483
+ ))
484
+
485
+ # Configure evaluation embeddings (same as main app)
486
+ print(" 📌 Using HuggingFace (Omartificial-Intelligence-Space/GATE-AraBert-v1)")
487
+ evaluator_embeddings = LangchainEmbeddingsWrapper(HuggingFaceEmbeddings(
488
+ model_name="Omartificial-Intelligence-Space/GATE-AraBert-v1"
489
+ ))
490
+
491
+ try:
492
+ import time
493
+ print("\n ⏳ Evaluating each question separately with all metrics...")
494
+ print(" ⚠️ This will take ~10-15 minutes (120 sec delay between questions)\n")
495
+
496
+ # Evaluate each question separately to see results immediately
497
+ all_scores = {
498
+ "faithfulness": [],
499
+ "answer_relevancy": [],
500
+ "context_precision": [],
501
+ "context_recall": []
502
+ }
503
+
504
+ for q_idx in range(len(results["question"])):
505
+ print(f"\n 📋 Question {q_idx + 1}/{len(results['question'])}: {results['question'][q_idx][:60]}...")
506
+
507
+ # Create single-question dataset
508
+ single_q_data = {
509
+ "question": [results["question"][q_idx]],
510
+ "answer": [results["answer"][q_idx]],
511
+ "contexts": [results["contexts"][q_idx]],
512
+ "ground_truth": [results["ground_truth"][q_idx]]
513
+ }
514
+ single_dataset = Dataset.from_dict(single_q_data)
515
+
516
+ # Evaluate all metrics for this question
517
+ try:
518
+ q_result = evaluate(
519
+ single_dataset,
520
+ metrics=[faithfulness, answer_relevancy, context_precision, context_recall],
521
+ llm=evaluator_llm,
522
+ embeddings=evaluator_embeddings,
523
+ raise_exceptions=False
524
+ )
525
+
526
+ # Extract scores (handle if they're lists or single values)
527
+ def get_score(value):
528
+ if isinstance(value, list):
529
+ return value[0] if len(value) > 0 else 0.0
530
+ return float(value) if value is not None else 0.0
531
+
532
+ f_score = get_score(q_result['faithfulness'])
533
+ a_score = get_score(q_result['answer_relevancy'])
534
+ cp_score = get_score(q_result['context_precision'])
535
+ cr_score = get_score(q_result['context_recall'])
536
+
537
+ # Display scores for this question
538
+ print(f" Faithfulness : {f_score:.4f}")
539
+ print(f" Answer Relevancy : {a_score:.4f}")
540
+ print(f" Context Precision : {cp_score:.4f}")
541
+ print(f" Context Recall : {cr_score:.4f}")
542
+
543
+ all_scores["faithfulness"].append(f_score)
544
+ all_scores["answer_relevancy"].append(a_score)
545
+ all_scores["context_precision"].append(cp_score)
546
+ all_scores["context_recall"].append(cr_score)
547
+
548
+ except Exception as e:
549
+ print(f" ❌ Error evaluating this question: {str(e)[:80]}")
550
+ all_scores["faithfulness"].append(0.0)
551
+ all_scores["answer_relevancy"].append(0.0)
552
+ all_scores["context_precision"].append(0.0)
553
+ all_scores["context_recall"].append(0.0)
554
+
555
+ # Wait between questions to avoid rate limits
556
+ if q_idx < len(results["question"]) - 1:
557
+ print(f"\n ⏳ Waiting 120 seconds (2 min) before next question...")
558
+ time.sleep(120)
559
+
560
+ # Calculate average scores
561
+ eval_results = {
562
+ "faithfulness": sum(all_scores["faithfulness"]) / len(all_scores["faithfulness"]) if all_scores["faithfulness"] else 0.0,
563
+ "answer_relevancy": sum(all_scores["answer_relevancy"]) / len(all_scores["answer_relevancy"]) if all_scores["answer_relevancy"] else 0.0,
564
+ "context_precision": sum(all_scores["context_precision"]) / len(all_scores["context_precision"]) if all_scores["context_precision"] else 0.0,
565
+ "context_recall": sum(all_scores["context_recall"]) / len(all_scores["context_recall"]) if all_scores["context_recall"] else 0.0
566
+ }
567
+
568
+ # Display results
569
+ print("\n" + "="*60)
570
+ print("📈 EVALUATION RESULTS")
571
+ print("="*60)
572
+
573
+ for metric, score in eval_results.items():
574
+ if isinstance(score, (int, float)):
575
+ print(f" {metric:28s}: {score:.4f}")
576
+
577
+ # Save results to JSON
578
+ with open(output_file, "w", encoding="utf-8") as f:
579
+ results_dict = {
580
+ "metrics": {k: float(v) if isinstance(v, (int, float)) else str(v)
581
+ for k, v in eval_results.items()},
582
+ "test_samples": len(dataset),
583
+ "test_file": test_file
584
+ }
585
+ json.dump(results_dict, f, ensure_ascii=False, indent=2)
586
+
587
+ print(f"\n💾 Results saved to: {output_file}")
588
+ print("="*60 + "\n")
589
+
590
+ return eval_results
591
+
592
+ except Exception as e:
593
+ print(f"\n❌ Evaluation failed: {e}")
594
+ print("\n⚠️ Make sure:")
595
+ print(" 1. GROQ_API_KEY is set in .env")
596
+ print(" 2. You have valid Groq API credits")
597
+ print(" 3. Internet connection is available")
598
+ return None
599
+
600
+ # ==========================================
601
+ # 🎯 MAIN EXECUTION
602
+ # ==========================================
603
+
604
+ if __name__ == "__main__":
605
+ import sys
606
+
607
+ test_file = "test_dataset.json"
608
+ output_file = "evaluation_results.json"
609
+
610
+ # Check for command line arguments
611
+ if len(sys.argv) > 1:
612
+ test_file = sys.argv[1]
613
+ if len(sys.argv) > 2:
614
+ output_file = sys.argv[2]
615
+
616
+ print("\n" + "="*60)
617
+ print("🚀 Constitutional Legal Assistant - RAGAS Evaluation")
618
+ print("="*60)
619
+
620
+ run_evaluation(test_file, output_file)
evaluate_rag.py ADDED
@@ -0,0 +1,535 @@
1
+ # -*- coding: utf-8 -*-
2
+ """
3
+ RAG Evaluation Script using Ragas Metrics
4
+ ==========================================
5
+ Evaluates the Constitutional Legal Assistant using:
6
+ - faithfulness
7
+ - answer_relevancy
8
+ - context_precision
9
+ - context_recall
11
+
12
+ USAGE:
13
+ ------
14
+ 1. Command line: python evaluate_rag.py path/to/questions.json
15
+ 2. Environment variable: set QA_FILE_PATH=path/to/questions.json
16
+ 3. Default: Place 'test_dataset_5_questions.json' in same directory
17
+
18
+ JSON FORMAT:
19
+ -----------
20
+ List format: [{"question": "...", "ground_truth": "..."}, ...]
21
+ OR dict format: {"data": [...]} or {"questions": [...]}
22
+
23
+ RATE LIMITS:
24
+ -----------
25
+ - 60 second delay between questions to avoid API timeouts
26
+ - 60 second delay before evaluation starts
27
+ - 10 second initial cooldown after pipeline load
28
+ """
29
+
30
+ import os
31
+ import sys
32
+ import json
33
+ import time
34
+ from dotenv import load_dotenv
35
+ from datasets import Dataset
36
+ from ragas import evaluate
37
+ from ragas.metrics import (
38
+ faithfulness,
39
+ answer_relevancy,
40
+ context_precision,
41
+ context_recall,
42
+ )
43
+ from ragas.llms import LangchainLLMWrapper
44
+ from ragas.embeddings import LangchainEmbeddingsWrapper
45
+ from langchain_groq import ChatGroq
46
+ from langchain_huggingface import HuggingFaceEmbeddings
47
+ import logging
48
+
49
+ # Import the RAG pipeline initialization
50
+ from app_final_updated import initialize_rag_pipeline
51
+
52
+ # Suppress verbose API logging
53
+ logging.getLogger("httpx").setLevel(logging.WARNING)
54
+ logging.getLogger("groq").setLevel(logging.WARNING)
55
+
56
+ load_dotenv()
57
+ model_name = "Omartificial-Intelligence-Space/GATE-AraBert-v1"  # Arabic embeddings model (same as the RAG pipeline)
58
+ # ==========================================
59
+ # ⏱️ RATE LIMITING / DELAYS (GROQ LIMITS)
60
+ # ==========================================
61
+ RPM_LIMIT = 30
62
+ TPM_LIMIT = 6000
63
+ RPD_LIMIT = 14400
64
+ TPD_LIMIT = 500000
65
+
66
+ # Use a conservative delay to stay within RPM limits.
67
+ # Increased delays to prevent API timeouts
68
+ MIN_DELAY_SECONDS = 60.0 / RPM_LIMIT
69
+ REQUEST_DELAY_SECONDS = 60.0 # 1 minute between each question to avoid timeouts
70
+ EVALUATION_DELAY_SECONDS = 60.0 # 60 seconds before starting evaluation
71
+ INITIAL_COOLDOWN = 10.0 # 10 seconds after loading pipeline
72
+ PER_METRIC_DELAY = 60.0 # 60 seconds between evaluating each question's metrics
73
+
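`MIN_DELAY_SECONDS = 60.0 / RPM_LIMIT` turns the requests-per-minute cap into a per-request spacing floor (60 / 30 = 2 s). A hypothetical pacing helper built on the same idea (a sketch, not part of this script):

```python
import time

def paced(iterable, delay_seconds):
    """Yield items, sleeping `delay_seconds` between consecutive ones."""
    for i, item in enumerate(iterable):
        if i > 0:
            time.sleep(delay_seconds)
        yield item

# 60 / RPM_LIMIT gives the minimum spacing that respects the cap
assert 60.0 / 30 == 2.0
# Tiny delay here just to demonstrate the generator
processed = [q.upper() for q in paced(["q1", "q2"], delay_seconds=0.01)]
```

Sleeping between items rather than after every item avoids a pointless wait after the final request.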
74
+ # ==========================================
75
+ # 📝 TEST DATASET
76
+ # ==========================================
77
+ # Default test questions (used when no file is provided)
78
+ DEFAULT_TEST_QUESTIONS = [
79
+ {
80
+ "question": "ما هي شروط الترشح لرئاسة الجمهورية؟",
81
+ "ground_truth": "يجب أن يكون المرشح مصرياً من أبوين مصريين، وألا تكون له جنسية أخرى، وأن يكون متمتعاً بحقوقه المدنية والسياسية، وأن يكون قد أدى الخدمة العسكرية أو أعفي منها قانوناً، وألا تقل سنه يوم فتح باب الترشح عن أربعين سنة ميلادية."
82
+ },
83
+ {
84
+ "question": "ما هي مدة ولاية رئيس الجمهورية؟",
85
+ "ground_truth": "مدة الرئاسة ست سنوات ميلادية، تبدأ من اليوم التالي لانتهاء مدة سلفه، ولا يجوز إعادة انتخابه إلا لمرة واحدة."
86
+ },
87
+ {
88
+ "question": "ما هي حقوق المواطن في الحصول على المعلومات؟",
89
+ "ground_truth": "المعلومات والبيانات والإحصاءات والوثائق الرسمية ملك للشعب، والإفصاح عنها من مصادرها المختلفة حق تكفله الدولة لكل مواطن."
90
+ },
91
+ {
92
+ "question": "ما هو دور مجلس الشيوخ؟",
93
+ "ground_truth": "يختص مجلس الشيوخ بدراسة واقتراح ما يراه كفيلاً بدعم الوحدة الوطنية والسلام الاجتماعي والحفاظ على المقومات الأساسية للمجتمع، ودراسة مشروعات القوانين المكملة للدستور."
94
+ },
95
+ {
96
+ "question": "كيف يتم تعديل الدستور؟",
97
+ "ground_truth": "لرئيس الجمهورية أو لخمس أعضاء مجلس النواب طلب تعديل مادة أو أكثر من الدستور، ويجب الموافقة على التعديل بأغلبية ثلثي أعضاء المجلس، ثم يعرض على الشعب في استفتاء."
98
+ }
99
+ ]
100
+
101
+ def load_test_questions(file_path: str):
102
+ """Load test questions from JSON file"""
103
+ try:
104
+ with open(file_path, "r", encoding="utf-8") as f:
105
+ obj = json.load(f)
106
+
107
+ if isinstance(obj, list):
108
+ return obj
109
+ if isinstance(obj, dict):
110
+ if "data" in obj and isinstance(obj["data"], list):
111
+ return obj["data"]
112
+ if "questions" in obj and isinstance(obj["questions"], list):
113
+ return obj["questions"]
114
+ raise ValueError("Unsupported QA JSON format; expected a list or dict with 'data' or 'questions'.")
115
+ except FileNotFoundError:
116
+ raise FileNotFoundError(f"❌ QA file not found: {file_path}")
117
+ except json.JSONDecodeError as e:
118
+ raise ValueError(f"❌ Invalid JSON format in {file_path}: {e}")
119
+ except Exception as e:
120
+ raise Exception(f"❌ Error loading QA file {file_path}: {e}")
121
+
122
+
123
+ # Load QA file path from environment variable or command line
124
+ qa_file_path = os.getenv("QA_FILE_PATH")
125
+ if not qa_file_path and len(sys.argv) > 1:
126
+ qa_file_path = sys.argv[1]
127
+
128
+ # If still not provided, try default file
129
+ if not qa_file_path:
130
+ default_path = "test_dataset_5_questions.json"
131
+ if os.path.exists(default_path):
132
+ qa_file_path = default_path
133
+ print(f"📂 Using default dataset: {default_path}")
134
+
135
+ if qa_file_path and os.path.exists(qa_file_path):
136
+ print(f"📂 Loading questions from: {qa_file_path}")
137
+ try:
138
+ test_questions = load_test_questions(qa_file_path)
139
+ print(f"✅ Loaded {len(test_questions)} questions from file")
140
+ except Exception as e:
141
+ print(f"❌ Error loading file: {e}")
142
+ print("📝 Using default inline test questions instead")
143
+ test_questions = DEFAULT_TEST_QUESTIONS
144
+ else:
145
+ if qa_file_path:
146
+ print(f"⚠️ File not found: {qa_file_path}")
147
+ print("📝 Using default inline test questions")
148
+ test_questions = DEFAULT_TEST_QUESTIONS
149
+
150
+ # ==========================================
151
+ # 🔄 RUN EVALUATION
152
+ # ==========================================
153
+
154
+ def run_evaluation():
155
+ print("="*60)
156
+ print("🚀 Starting RAG Evaluation with Ragas")
157
+ print("="*60)
158
+
159
+ print(f"\n📊 Configuration:")
160
+ print(f" Questions to evaluate: {len(test_questions)}")
161
+ print(f" Delay per question (generation): {REQUEST_DELAY_SECONDS}s")
162
+ print(f" Delay per question (evaluation): {PER_METRIC_DELAY}s")
163
+
164
+ total_gen_time = len(test_questions) * REQUEST_DELAY_SECONDS / 60.0
165
+ total_eval_time = len(test_questions) * PER_METRIC_DELAY / 60.0
166
+ total_time = total_gen_time + total_eval_time + INITIAL_COOLDOWN / 60.0 + EVALUATION_DELAY_SECONDS / 60.0
167
+
168
+ print(f"\n⏱️ Estimated total time:")
169
+ print(f" Question generation: ~{total_gen_time:.1f} minutes")
170
+ print(f" Evaluation phase: ~{total_eval_time:.1f} minutes")
171
+ print(f" Total: ~{total_time:.1f} minutes ({total_time/60:.1f} hours)\n")
172
+
173
+ # 1. Initialize RAG Pipeline
174
+ print("\n📥 Loading RAG pipeline...")
175
+ qa_chain = initialize_rag_pipeline()
176
+ print("✅ Pipeline loaded successfully")
177
+
178
+ # Let the service cool down before starting requests
179
+ print(f"⏳ Cooling down for {INITIAL_COOLDOWN} seconds...")
180
+ time.sleep(INITIAL_COOLDOWN)
181
+
182
+ # 2. Generate answers and collect context
183
+ print("\n🤖 Generating answers for test questions...\n")
184
+
185
+ results = {
186
+ "question": [],
187
+ "answer": [],
188
+ "contexts": [],
189
+ "ground_truth": []
190
+ }
191
+
192
+ for idx, item in enumerate(test_questions, 1):
193
+ question = item["question"]
194
+ ground_truth = item.get("ground_truth", "")
195
+
196
+ print(f"\n{'='*60}")
197
+ print(f"[{idx}/{len(test_questions)}] Generating answer ({idx / len(test_questions) * 100:.0f}% complete)")
198
+ print(f"{'='*60}")
199
+ print(f"Q: {question[:80]}...")
200
+ print(f"{'-'*60}")
201
+
202
+ try:
203
+ # Invoke the chain
204
+ result = qa_chain.invoke(question)
205
+
206
+ answer = result["answer"]
207
+ context_docs = result["context"]
208
+
209
+ # Extract context text from documents
210
+ contexts = [doc.page_content for doc in context_docs]
211
+
212
+ # Store results
213
+ results["question"].append(question)
214
+ results["answer"].append(answer)
215
+ results["contexts"].append(contexts)
216
+ results["ground_truth"].append(ground_truth)
217
+
218
+ print(f"✅ Generated answer ({len(answer)} chars)")
219
+ print(f"✅ Retrieved {len(contexts)} context documents")
220
+
221
+ # Delay between requests to avoid hitting RPM limits
222
+ if idx < len(test_questions):
223
+ print(f"⏳ Waiting {REQUEST_DELAY_SECONDS} seconds before next question...")
224
+ time.sleep(REQUEST_DELAY_SECONDS)
225
+
226
+ except Exception as e:
227
+ print(f"❌ Error: {e}")
228
+ # Add placeholder to keep dataset aligned
229
+ results["question"].append(question)
230
+ results["answer"].append("Error generating answer")
231
+ results["contexts"].append([])
232
+ results["ground_truth"].append(ground_truth)
233
+
234
+ # 3. Convert to Ragas Dataset format
235
+ print("\n📊 Creating evaluation dataset...")
236
+ dataset = Dataset.from_dict(results)
237
+ print(f"✅ Dataset created with {len(dataset)} samples")
238
+
239
+ # 4. Run Ragas Evaluation
240
+ print("\n⚙️ Running Ragas evaluation...")
241
+ print("This may take a few minutes...")
242
+ print("Using Groq API (Llama 3.1 8B Instant) for evaluation...")
243
+
244
+ # Add a larger delay before evaluation to avoid back-to-back bursts
245
+ print(f"⏳ Waiting {EVALUATION_DELAY_SECONDS} seconds before evaluation...")
246
+ time.sleep(EVALUATION_DELAY_SECONDS)
247
+
248
+ # Configure Groq LLM for evaluation (same as app_final.py)
249
+ evaluator_llm = LangchainLLMWrapper(ChatGroq(
250
+ model="llama-3.1-8b-instant", # Same as app_final.py
251
+ temperature=0.3, # Same as app_final.py
252
+ model_kwargs={"top_p": 0.9}, # Same as app_final.py
253
+ max_retries=3 # Add retries for robustness
254
+ ))
255
+
256
+ # Configure embeddings (same as app_final.py)
257
+ print("Configuring HuggingFace embeddings (same as app_final.py)...")
258
+ evaluator_embeddings = LangchainEmbeddingsWrapper(HuggingFaceEmbeddings(
259
+ model_name=model_name
260
+ ))
261
+
262
+ try:
263
+ # Evaluate each question separately with delays to avoid rate limits
264
+ print(f"\n⚠️ Evaluating each question separately with {PER_METRIC_DELAY:.0f}-second delays...")
265
+ print(f"⏱️ Estimated time: ~{len(results['question']) * PER_METRIC_DELAY / 60:.1f} minutes\n")
266
+
267
+ all_scores = {
268
+ "faithfulness": [],
269
+ "answer_relevancy": [],
270
+ "context_precision": [],
271
+ "context_recall": []
272
+ }
273
+
274
+ for q_idx in range(len(results["question"])):
275
+ print(f"\n{'='*60}")
276
+ print(f"📋 Question {q_idx + 1}/{len(results['question'])} ({(q_idx + 1) / len(results['question']) * 100:.0f}% complete)")
277
+ print(f"{'='*60}")
278
+ print(f"Q: {results['question'][q_idx][:80]}...")
279
+ print("-" * 60)
280
+
281
+ # Create single-question dataset
282
+ single_q_data = {
283
+ "question": [results["question"][q_idx]],
284
+ "answer": [results["answer"][q_idx]],
285
+ "contexts": [results["contexts"][q_idx]],
286
+ "ground_truth": [results["ground_truth"][q_idx]]
287
+ }
288
+ single_dataset = Dataset.from_dict(single_q_data)
289
+
290
+ # Evaluate all metrics for this question
291
+ try:
292
+ q_result = evaluate(
293
+ single_dataset,
294
+ metrics=[faithfulness, answer_relevancy, context_precision, context_recall],
295
+ llm=evaluator_llm,
296
+ embeddings=evaluator_embeddings,
297
+ raise_exceptions=False
298
+ )
299
+
300
+ # Convert EvaluationResult to dict if needed
301
+ if hasattr(q_result, 'to_pandas'):
302
+ # Convert to pandas and then to dict
303
+ result_df = q_result.to_pandas()
304
+ result_dict = result_df.to_dict('records')[0] if len(result_df) > 0 else {}
305
+ elif isinstance(q_result, dict):
306
+ result_dict = q_result
307
+ else:
308
+ # Try to access as attributes
309
+ result_dict = {
310
+ 'faithfulness': getattr(q_result, 'faithfulness', 0.0),
311
+ 'answer_relevancy': getattr(q_result, 'answer_relevancy', 0.0),
312
+ 'context_precision': getattr(q_result, 'context_precision', 0.0),
313
+ 'context_recall': getattr(q_result, 'context_recall', 0.0)
314
+ }
315
+
316
+ # Extract scores (handle if they're lists or single values)
317
+ def get_score(value):
318
+ if isinstance(value, list):
319
+ return value[0] if len(value) > 0 else 0.0
320
+ score = float(value) if value is not None else 0.0
+ return 0.0 if score != score else score  # NaN guard: Ragas yields NaN when a metric fails
321
+
322
+ f_score = get_score(result_dict.get('faithfulness', 0.0))
323
+ a_score = get_score(result_dict.get('answer_relevancy', 0.0))
324
+ cp_score = get_score(result_dict.get('context_precision', 0.0))
325
+ cr_score = get_score(result_dict.get('context_recall', 0.0))
326
+
327
+ # Display scores for this question
328
+ print(f"\n📊 Results for Question {q_idx + 1}:")
329
+ print(f" Faithfulness : {f_score:.4f}")
330
+ print(f" Answer Relevancy : {a_score:.4f}")
331
+ print(f" Context Precision : {cp_score:.4f}")
332
+ print(f" Context Recall : {cr_score:.4f}")
333
+
334
+ all_scores["faithfulness"].append(f_score)
335
+ all_scores["answer_relevancy"].append(a_score)
336
+ all_scores["context_precision"].append(cp_score)
337
+ all_scores["context_recall"].append(cr_score)
338
+
339
+ except Exception as e:
340
+ print(f"\n❌ Error evaluating question {q_idx + 1}: {str(e)}")
341
+ print(f" Error type: {type(e).__name__}")
342
+ # Print a truncated traceback to aid debugging
343
+ import traceback
344
+ print(f" Traceback: {traceback.format_exc()[:200]}...")
345
+ all_scores["faithfulness"].append(0.0)
346
+ all_scores["answer_relevancy"].append(0.0)
347
+ all_scores["context_precision"].append(0.0)
348
+ all_scores["context_recall"].append(0.0)
349
+
350
+ # Wait between questions to avoid rate limits
351
+ if q_idx < len(results["question"]) - 1:
352
+ print(f"\n⏳ Waiting {PER_METRIC_DELAY} seconds before next question...")
353
+ time.sleep(PER_METRIC_DELAY)
354
+
355
+ # Calculate average scores
356
+ print("\n" + "="*60)
357
+ print("📊 CALCULATING AVERAGE SCORES")
358
+ print("="*60)
359
+
360
+ evaluation_results = {
361
+ "faithfulness": sum(all_scores["faithfulness"]) / len(all_scores["faithfulness"]) if all_scores["faithfulness"] else 0.0,
362
+ "answer_relevancy": sum(all_scores["answer_relevancy"]) / len(all_scores["answer_relevancy"]) if all_scores["answer_relevancy"] else 0.0,
363
+ "context_precision": sum(all_scores["context_precision"]) / len(all_scores["context_precision"]) if all_scores["context_precision"] else 0.0,
364
+ "context_recall": sum(all_scores["context_recall"]) / len(all_scores["context_recall"]) if all_scores["context_recall"] else 0.0
365
+ }
366
+
367
+ print("\n" + "="*60)
368
+ print("📈 FINAL AVERAGE RESULTS")
369
+ print("="*60)
370
+
371
+ # Display average results
372
+ for metric_name, score in evaluation_results.items():
373
+ if isinstance(score, (int, float)):
374
+ print(f" {metric_name:28s}: {score:.4f}")
375
+
376
+ overall_avg = sum(evaluation_results.values()) / len(evaluation_results)
377
+ print(f"\n {'Overall Average':28s}: {overall_avg:.4f}")
378
+
379
+ # Save results to JSON
380
+ results_file = "evaluation_results.json"
381
+ with open(results_file, "w", encoding="utf-8") as f:
382
+ results_dict = {
383
+ "metrics": {k: float(v) if isinstance(v, (int, float)) else str(v)
384
+ for k, v in evaluation_results.items()},
385
+ "individual_scores": all_scores,
386
+ "test_samples": len(dataset),
387
+ "overall_average": overall_avg,
388
+ "evaluation_details": {
389
+ "delay_per_question": f"{REQUEST_DELAY_SECONDS}s",
390
+ "delay_per_metric": f"{PER_METRIC_DELAY}s",
391
+ "model": "llama-3.1-8b-instant",
392
+ "embeddings": model_name
393
+ }
394
+ }
395
+ json.dump(results_dict, f, ensure_ascii=False, indent=2)
396
+
397
+ print(f"\n💾 Results saved to: {results_file}")
398
+
399
+ # Save individual question breakdown
400
+ breakdown_file = "evaluation_breakdown.json"
401
+ breakdown_data = []
402
+ for q_idx in range(len(results["question"])):
403
+ # Calculate average score for this question across all metrics
404
+ question_score = (
405
+ all_scores["faithfulness"][q_idx] +
406
+ all_scores["answer_relevancy"][q_idx] +
407
+ all_scores["context_precision"][q_idx] +
408
+ all_scores["context_recall"][q_idx]
409
+ ) / 4.0
410
+
411
+ breakdown_data.append({
412
+ "question": results["question"][q_idx],
413
+ "ground_truth": results["ground_truth"][q_idx],
414
+ "actual_answer": results["answer"][q_idx],
415
+ "score": round(question_score, 4)
416
+ })
417
+
418
+ # Calculate average score of all questions
419
+ total_avg_score = sum(item["score"] for item in breakdown_data) / len(breakdown_data) if breakdown_data else 0.0
420
+
421
+ # Create simplified results structure
422
+ simplified_results = {
423
+ "questions": breakdown_data,
424
+ "average_score": round(total_avg_score, 4)
425
+ }
426
+
427
+ with open(breakdown_file, "w", encoding="utf-8") as f:
428
+ json.dump(simplified_results, f, ensure_ascii=False, indent=2)
429
+
430
+ print(f"💾 Question breakdown saved to: {breakdown_file}")
431
+ print(f"📊 Average score across all questions: {total_avg_score:.4f}")
432
+
433
+ # Save detailed results
434
+ detailed_file = "evaluation_detailed.json"
435
+ with open(detailed_file, "w", encoding="utf-8") as f:
436
+ json.dump(results, f, ensure_ascii=False, indent=2)
437
+
438
+ print(f"💾 Detailed results saved to: {detailed_file}")
439
+
440
+ print("\n" + "="*60)
441
+ print("✅ Evaluation Complete!")
442
+ print("="*60)
443
+
444
+ return evaluation_results
445
+
446
+ except Exception as e:
447
+ print(f"\n❌ Evaluation failed: {e}")
448
+ print("\n⚠️ Troubleshooting:")
449
+ print(" 1. Check GROQ_API_KEY is set in .env file")
450
+ print(" 2. Verify you have valid Groq API credits")
451
+ print(" 3. Ensure internet connection is stable")
452
+ print(" 4. Try increasing PER_METRIC_DELAY in the script")
453
+ print(" 5. Reduce the number of test questions")
454
+ import traceback
455
+ traceback.print_exc()
456
+ return None
457
+
458
+ # ==========================================
459
+ # 📊 METRIC EXPLANATIONS
460
+ # ==========================================
461
+
462
+ def print_metric_explanations():
463
+ """Print what each metric measures"""
464
+ print("\n" + "="*60)
465
+ print("📖 RAGAS METRICS EXPLANATION")
466
+ print("="*60)
467
+
468
+ explanations = {
469
+ "faithfulness": "Is the answer grounded in the context? (0-1, higher is better)\n"
470
+ "Measures if the answer contains only information from the retrieved context.",
471
+
472
+ "answer_relevancy": "Does the answer relate to the question? (0-1, higher is better)\n"
473
+ "Measures how well the answer addresses the question asked.",
474
+
475
+ "context_precision": "How much retrieved context was relevant? (0-1, higher is better)\n"
476
+ "Measures the signal-to-noise ratio in retrieved documents.",
477
+
478
+ "context_recall": "Did we retrieve all needed information? (0-1, higher is better)\n"
479
+ "Measures if all ground truth information is in the context."
483
+ }
484
+
485
+ for metric, explanation in explanations.items():
486
+ print(f"\n{metric.upper()}:")
487
+ print(f" {explanation}")
488
+
489
+ print("\n" + "="*60)
490
+
491
+ # ==========================================
492
+ # 🎯 MAIN EXECUTION
493
+ # ==========================================
494
+
495
+ if __name__ == "__main__":
496
+ from datetime import datetime
497
+
498
+ start_time = datetime.now()
499
+
500
+ print("\n" + "="*60)
501
+ print("🎯 RAG EVALUATION SYSTEM")
502
+ print(" Constitutional Legal Assistant - Egyptian Constitution")
503
+ print("="*60)
504
+ print(f"\n⏰ Started at: {start_time.strftime('%Y-%m-%d %H:%M:%S')}")
505
+
506
+ # Print what metrics mean
507
+ print_metric_explanations()
508
+
509
+ # Run evaluation
510
+ input("\nPress ENTER to start evaluation...")
511
+
512
+ results = run_evaluation()
513
+
514
+ end_time = datetime.now()
515
+ duration = end_time - start_time
516
+
517
+ print("\n" + "="*60)
518
+ print("📊 EVALUATION SUMMARY")
519
+ print("="*60)
520
+ print(f"⏰ Started: {start_time.strftime('%Y-%m-%d %H:%M:%S')}")
521
+ print(f"⏰ Finished: {end_time.strftime('%Y-%m-%d %H:%M:%S')}")
522
+ print(f"⏱️ Duration: {duration.total_seconds() / 60:.1f} minutes")
523
+ print(f"📝 Questions evaluated: {len(test_questions)}")
524
+
525
+ if results:
526
+ print(f"\n✅ Evaluation completed successfully!")
527
+ print(f"\n📂 Output files:")
528
+ print(f" - evaluation_results.json (average metrics & config)")
529
+ print(f" - evaluation_breakdown.json (per-question scores)")
530
+ print(f" - evaluation_detailed.json (full Q&A data)")
531
+ else:
532
+ print(f"\n⚠️ Evaluation could not be completed.")
533
+ print(f" Check the error messages above for troubleshooting.")
534
+
535
+ print("\n" + "="*60)
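The `load_test_questions` loader above accepts three payload shapes: a bare list, or a dict wrapping the list under `"data"` or `"questions"`. A minimal sketch of the same dispatch logic (the helper name `parse_qa_payload` is illustrative, not part of the repo):

```python
import json

# Illustrative helper mirroring load_test_questions' shape dispatch,
# applied to an in-memory string instead of a file path.
def parse_qa_payload(text: str):
    obj = json.loads(text)
    if isinstance(obj, list):
        return obj
    if isinstance(obj, dict):
        for key in ("data", "questions"):
            if key in obj and isinstance(obj[key], list):
                return obj[key]
    raise ValueError("Unsupported QA JSON format; expected a list "
                     "or a dict with 'data' or 'questions'.")

flat = '[{"question": "q", "ground_truth": "a"}]'
# All three shapes resolve to the same list of QA items.
assert parse_qa_payload(flat) == parse_qa_payload('{"data": %s}' % flat)
assert parse_qa_payload(flat) == parse_qa_payload('{"questions": %s}' % flat)
```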
requirements.txt ADDED
@@ -0,0 +1,57 @@
1
+ # ===========================================
2
+ # Constitutional Legal Assistant - Requirements
3
+ # ===========================================
4
+
5
+ # Core Python
6
+ python-dotenv>=1.0.0
7
+
8
+ # Streamlit UI
9
+ streamlit>=1.28.0
10
+
11
+ # LangChain Core
12
+ langchain>=0.2.0
13
+ langchain-core>=0.2.0
14
+ langchain-text-splitters>=0.2.0
15
+ langchain-community>=0.2.0
16
+ langchain-classic>=0.0.1
17
+
18
+ # Vector Store
19
+ langchain-chroma>=0.1.0
20
+ chromadb>=0.4.0
21
+
22
+ # Embeddings & Reranker
23
+ langchain-huggingface>=0.0.3
24
+ sentence-transformers>=2.2.0
25
+ transformers>=4.35.0
26
+ torch>=2.0.0
27
+
28
+ # LLM Provider (Groq)
29
+ langchain-groq>=0.1.0
30
+
31
+ # BM25 Keyword Search
32
+ rank-bm25>=0.2.2
33
+
34
+ # Numerical
35
+ numpy>=1.24.0
36
+
37
+ # ===========================================
38
+ # EVALUATION (RAGAS)
39
+ # ===========================================
40
+ ragas>=0.1.0
41
+ datasets>=2.14.0
42
+
43
+ # ===========================================
44
+ # PHOENIX OBSERVABILITY (Optional)
45
+ # ===========================================
46
+ # For app_final_pheonix.py tracing
47
+ opentelemetry-api>=1.20.0
48
+ opentelemetry-sdk>=1.20.0
49
+ opentelemetry-exporter-otlp-proto-http>=1.20.0
50
+ arize-phoenix>=4.0.0
51
+
52
+ # ===========================================
53
+ # LOCAL WHEEL PACKAGES (Install Separately)
54
+ # ===========================================
55
+ # Install these manually with:
56
+ # pip install openinference_instrumentation_langchain-0.1.56-py3-none-any.whl
57
+ # pip install openinference_instrumentation_openai-0.1.41-py3-none-any.whl
test_dataset_5_questions.json ADDED
@@ -0,0 +1,22 @@
1
+ [
2
+ {
3
+ "question": "ما الطبيعة القانونية لحق العمل في الدستور المصري؟",
4
+ "ground_truth": "حق أساسي/حرية: العمل حق وواجب تكفله الدولة. يُمنع العمل الجبري إلا بقانون ولخدمة عامة وبمقابل عادل."
5
+ },
6
+ {
7
+ "question": "ما حكم التحرش أو التنمر أو العنف ضد العامل في مكان العمل وفق قانون العمل؟",
8
+ "ground_truth": "حظر السخرة والعمل الجبري والتحرش والتنمر والعنف بكافة أشكاله (اللفظي والجسدي والنفسي) ضد العمال، مع تحديد جزاءات تأديبية في لوائح المنشأة."
9
+ },
10
+ {
11
+ "question": "ما المقصود بالتلبس وما أثره الإجرائي بشكل عام؟",
12
+ "ground_truth": "لمأمور الضبط القضائي في التلبس منع الحاضرين من المغادرة حتى تحرير المحضر واستدعاء من يفيد في التحقيق."
13
+ },
14
+ {
15
+ "question": "ما حكم نشر صور أو معلومات تنتهك خصوصية شخص دون رضاه عبر الإنترنت؟",
16
+ "ground_truth": "تجرم المادة الاعتداء على القيم الأسرية أو الخصوصية عبر الرسائل الكثيفة دون موافقة، أو تسليم بيانات للترويج دون موافقة، أو نشر محتوى ينتهك الخصوصية سواء كان صحيحًا أو غير صحيح."
17
+ },
18
+ {
19
+ "question": "ما الشروط العامة لاستحقاق الزوجة النفقة وفق قانون الأحوال الشخصية؟",
20
+ "ground_truth": "تجب النفقة للزوجة من تاريخ العقد الصحيح وتشمل الغذاء والكسوة والمسكن والعلاج. لا تجب النفقة إذا ارتدت أو امتنعت عن تسليم نفسها أو خرجت بدون إذن. نفقة الزوجة دين على الزوج ولها امتياز على أمواله."
21
+ }
22
+ ]
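Each entry in a dataset file like this one must expose the two keys that the evaluation script reads: `"question"` and `"ground_truth"`. A quick schema check (the two-item payload below is a stand-in, not the real dataset):

```python
import json

# Stand-in payload with the same shape as test_dataset_5_questions.json.
sample = json.loads("""
[
  {"question": "q1", "ground_truth": "a1"},
  {"question": "q2", "ground_truth": "a2"}
]
""")

# The eval script expects a top-level list whose items carry both keys.
assert isinstance(sample, list)
assert all({"question", "ground_truth"} <= set(item) for item in sample)
```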