Ahmd1 committed e37d541 (0 parents)

Legal Assistant with RAG evaluation
.gitignore ADDED
@@ -0,0 +1,51 @@
```
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python

# Virtual Environment
venv/
env/
ENV/

# Environment variables
.env
.env.local

# Reranker model files
reranker/

# Vector database
chroma_db/

# Data files - CSV
*.csv

# Data files - JSON (exclude all except specific test file)
*.json
!test_dataset_5_questions.json

# Markdown files (exclude all except README)
*.md
!README.md

# Wheel files
*.whl

# IDE
.vscode/
.idea/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# Logs
*.log

# Jupyter Notebook checkpoints
.ipynb_checkpoints/
```
README.md ADDED
@@ -0,0 +1,383 @@
# ⚖️ Constitutional Legal Assistant - Egyptian Constitution Chatbot

An intelligent RAG-based chatbot for answering questions about the Egyptian Constitution in Arabic.

---

## 📁 Project Structure

```
Chatbot_me/
├── app_final.py                    # Main Streamlit app (v1 - basic)
├── app_final_pheonix.py            # Streamlit app with Phoenix tracing
├── app_final_updated.py            # Latest production version with improvements
├── evaluate_rag.py                 # RAG evaluation with RAGAS metrics (simplified output)
├── evaluate.py                     # Full standalone evaluation script
├── requirements.txt                # Python dependencies
├── .env                            # Environment variables (create this - NOT in repo)
├── .gitignore                      # Git ignore rules
├── test_dataset_5_questions.json   # Test dataset (5 questions from different categories)
├── data/                           # Legal documents (NOT in repo)
│   ├── Egyptian_Constitution_legalnature_only.json
│   ├── Egyptian_Civil.json
│   ├── Egyptian_Labour_Law.json
│   ├── Egyptian_Personal Status Laws.json
│   ├── Technology Crimes Law.json
│   └── قانون_الإجراءات_الجنائية.json
├── chroma_db/                      # Vector database (auto-generated - NOT in repo)
├── reranker/                       # Arabic reranker model files (NOT in repo)
│   ├── model.safetensors
│   ├── config.json
│   └── ...
└── *.whl                           # Local wheel packages for Phoenix (NOT in repo)
```

---

## 🚀 Quick Start

### Step 1: Create a Virtual Environment (Recommended)

```powershell
# Create virtual environment
python -m venv venv

# Activate it (Windows PowerShell)
.\venv\Scripts\Activate.ps1

# Or (Windows CMD)
.\venv\Scripts\activate.bat
```

### Step 2: Install Dependencies

```powershell
# Install all requirements
pip install -r requirements.txt
```

### Step 3: Install Local Wheel Packages (For Phoenix Tracing)

```powershell
# Install OpenInference instrumentation packages
pip install openinference_instrumentation_langchain-0.1.56-py3-none-any.whl
pip install openinference_instrumentation_openai-0.1.41-py3-none-any.whl
```

### Step 4: Create the `.env` File

Create a `.env` file in the project root with:

```env
# Required: Groq API key (get one from https://console.groq.com)
GROQ_API_KEY=gsk_your_groq_api_key_here

# Optional: for Phoenix tracing
PHOENIX_OTLP_ENDPOINT=http://localhost:6006/v1/traces
PHOENIX_SERVICE_NAME=constitutional-assistant
```

---

## 🏃 Running the Applications

### 1. Run the Latest Production App (`app_final_updated.py`) ⭐ RECOMMENDED

The most recent version, with improved prompt engineering and decision-tree logic:

```powershell
streamlit run app_final_updated.py
```

Then open: **http://localhost:8501**

**Features:**
- Enhanced Arabic RTL support
- Improved decision tree for handling different question types
- Better handling of procedural vs. constitutional questions
- Cleaner response formatting

---

### 2. Run the Basic App (`app_final.py`)

The original version:

```powershell
streamlit run app_final.py
```

Then open: **http://localhost:8501**

---

### 3. Run the App with Phoenix Tracing (`app_final_pheonix.py`)

This version includes observability/tracing with Phoenix.

#### Step A: Start the Phoenix Server First

```powershell
# In a separate terminal
python -m phoenix.server.main serve
```

The Phoenix UI will be at: **http://localhost:6006**

#### Step B: Run the App

```powershell
streamlit run app_final_pheonix.py
```

Then open:
- **App**: http://localhost:8501
- **Phoenix traces**: http://localhost:6006

---

### 4. Run the Evaluation (`evaluate_rag.py`) ⭐ NEW SIMPLIFIED FORMAT

Evaluate the RAG system with simplified output showing only the essential information:

```powershell
# Uses the default test dataset (test_dataset_5_questions.json)
python evaluate_rag.py

# With a custom test file
python evaluate_rag.py path/to/your_test.json

# Set via environment variable
set QA_FILE_PATH=test_dataset_5_questions.json
python evaluate_rag.py
```

**Output files:**
- `evaluation_breakdown.json` - **simplified format** with:
  - Question
  - Ground truth
  - Actual answer
  - Score (average of all metrics per question)
  - Average score across all questions
- `evaluation_results.json` - detailed metrics breakdown
- `evaluation_detailed.json` - full raw evaluation data

**Sample output format:**
```json
{
  "questions": [
    {
      "question": "ما الطبيعة القانونية لحق العمل في الدستور المصري؟",
      "ground_truth": "حق أساسي/حرية: العمل حق وواجب...",
      "actual_answer": "حسب المادة (12) من الدستور المصري...",
      "score": 0.8542
    }
  ],
  "average_score": 0.8542
}
```
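The reported `average_score` is simply the mean of the per-question `score` values. A small sanity-check sketch that recomputes it from a breakdown dict shaped like the sample above (the sample values here are made up):

```python
def summarize_breakdown(breakdown):
    """Recompute the average score from an evaluation_breakdown.json-style dict."""
    scores = [q["score"] for q in breakdown["questions"]]
    return round(sum(scores) / len(scores), 4) if scores else 0.0

sample = {
    "questions": [
        {"question": "q1", "ground_truth": "g1", "actual_answer": "a1", "score": 0.9},
        {"question": "q2", "ground_truth": "g2", "actual_answer": "a2", "score": 0.7},
    ],
    "average_score": 0.8,
}
print(summarize_breakdown(sample))  # → 0.8
```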

**⚠️ Note:** This script waits **60 seconds** between questions to avoid Groq API rate limits.

---

### 5. Run the Full Evaluation (`evaluate.py`)

A more comprehensive evaluation with an external test dataset and rate limiting:

```powershell
# Basic run (uses test_dataset.json)
python evaluate.py

# With a custom test file
python evaluate.py test_dataset_small.json

# With custom test and output files
python evaluate.py test_dataset_small.json my_results.json
```

**⚠️ Note:** This script waits **2 minutes** between questions to avoid Groq API rate limits.

---

## 📊 Test Dataset

The project includes a curated test dataset with 5 questions covering different legal categories.

**`test_dataset_5_questions.json`** includes:
1. **الدستور (Constitution)** - constitutional rights and principles
2. **قانون العمل (Labour Law)** - workplace rights and regulations
3. **الإجراءات الجنائية (Criminal Procedures)** - criminal law procedures
4. **جرائم تقنية المعلومات (Technology Crimes)** - cybercrime laws
5. **الأحوال الشخصية (Personal Status Laws)** - family law matters

This diverse dataset ensures testing across all major legal domains covered by the system.

---

## 📊 Understanding RAGAS Metrics

The evaluation system uses RAGAS metrics to assess the quality of the RAG pipeline. The simplified output combines these into a single score per question:

| Metric | Description | Good score |
|--------|-------------|------------|
| **faithfulness** | Is the answer grounded in the retrieved context? | > 0.7 |
| **answer_relevancy** | Does the answer address the question? | > 0.8 |
| **context_precision** | How much of the retrieved context was useful? | > 0.6 |
| **context_recall** | Was all the needed information retrieved? | > 0.7 |

**Question Score** = average of all four metrics (0-1 scale)

**Overall Score** = average of all question scores
233
+ ---
234
+
235
+ ## � Repository Structure & Git
236
+
237
+ ### Files NOT Included in Repository (via `.gitignore`)
238
+
239
+ The following files are excluded from version control for security, size, or privacy reasons:
240
+
241
+ 1. **`reranker/`** - Large model files (download separately or train locally)
242
+ 2. **`__pycache__/`** - Python compiled bytecode
243
+ 3. **`chroma_db/`** - Vector database (auto-generated on first run)
244
+ 4. **`.env`** - Environment variables with API keys (NEVER commit this!)
245
+ 5. **`*.json`** - All JSON files EXCEPT `test_dataset_5_questions.json`
246
+ 6. **`*.csv`** - CSV data files
247
+ 7. **`*.md`** - All markdown files EXCEPT `README.md`
248
+ 8. **`*.whl`** - Wheel package files
249
+
250
+ ### First-Time Setup
251
+
252
+ When cloning this repository, you'll need to:
253
+
254
+ 1. **Create `.env` file** with your API keys
255
+ 2. **Download/prepare data files** in the `data/` folder
256
+ 3. **Download reranker model** to `reranker/` folder
257
+ 4. **Install dependencies** from `requirements.txt`
258
+ 5. **Run the app** - ChromaDB will auto-generate on first run
259
+
260
+ ---
261
+
262
+ ## �🔧 Troubleshooting
263
+
264
+ ### "GROQ_API_KEY not found"
265
+ Make sure your `.env` file exists and contains:
266
+ ```env
267
+ GROQ_API_KEY=gsk_your_key_here
268
+ ```
269
+
270
+ ### "Reranker path not found"
271
+ Ensure the `reranker/` folder exists with model files:
272
+ ```
273
+ reranker/
274
+ ├── model.safetensors
275
+ ├── config.json
276
+ ├── tokenizer.json
277
+ └── ...
278
+ ```
279
+
280
+ ### "Phoenix connection refused"
281
+ Start Phoenix server first:
282
+ ```powershell
283
+ python -m phoenix.server.main serve
284
+ ```
285
+
286
+ ### Rate Limit Errors (Groq)
287
+ - Wait a few minutes and try again
288
+ - Use `test_dataset_small.json` for fewer questions
289
+ - The `evaluate.py` script has built-in 2-minute delays
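
The built-in delays avoid the limits preemptively; if you still hit rate-limit errors, a generic retry-with-backoff wrapper is another option. This is a sketch only — the exception type to catch depends on the Groq client, and the function you wrap is your own call, not part of this repo:

```python
import time

def with_retries(fn, max_attempts=4, base_delay=5.0):
    """Call fn(); on a rate-limit error, wait with exponential backoff and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RuntimeError as err:  # substitute the Groq client's rate-limit exception
            if attempt == max_attempts:
                raise
            delay = base_delay * (2 ** (attempt - 1))  # 5s, 10s, 20s, ...
            print(f"Rate limited ({err}); retrying in {delay:.0f}s")
            time.sleep(delay)
```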

### Import Errors
```powershell
# Reinstall all dependencies
pip install -r requirements.txt --force-reinstall
```

---

## 📝 API Keys Required

| Service | Purpose | Get a key from |
|---------|---------|----------------|
| **Groq** | LLM (Llama 3.1 8B) | https://console.groq.com |
| **HuggingFace** | Embeddings (auto-download) | No key needed |

---

## 🔄 How the System Works

```
User Question (Arabic)
        ↓
┌─────────────────────────────────┐
│  Decision Tree Logic            │
│  (app_final_updated.py)         │
│  ├── Constitutional questions   │
│  ├── Procedural questions       │
│  ├── General legal advice       │
│  └── Out-of-scope filtering     │
└─────────────────────────────────┘
        ↓
┌─────────────────────────────────┐
│  Hybrid Retrieval (RRF)         │
│  ├── Semantic Search (50%)      │
│  ├── BM25 Keyword (30%)         │
│  └── Metadata Filter (20%)      │
└─────────────────────────────────┘
        ↓
┌─────────────────────────────────┐
│  Cross-Reference Expansion      │
│  (Fetch related articles)       │
└─────────────────────────────────┘
        ↓
┌─────────────────────────────────┐
│  Arabic Reranker (ARM-V1)       │
│  (Select top 5 most relevant)   │
└─────────────────────────────────┘
        ↓
┌─────────────────────────────────┐
│  LLM (Llama 3.1 via Groq)       │
│  (Generate Arabic answer)       │
│  - Separate system/user prompts │
│  - Citation with article numbers│
│  - Temperature: 0.3             │
└─────────────────────────────────┘
        ↓
    Final Answer
```
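The retrieval stage merges three ranked lists with weighted Reciprocal Rank Fusion: each list contributes `weight / (k + rank)` per document, and documents are re-sorted by the summed score. A minimal sketch using the 50/30/20 weights (the document IDs here are illustrative):

```python
def weighted_rrf(ranked_lists, weights, k=60, top_k=5):
    """Fuse ranked lists of doc IDs: score(d) = sum over lists of weight / (k + rank)."""
    scores = {}
    for docs, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(docs, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first
    return [d for d, _ in sorted(scores.items(), key=lambda x: x[1], reverse=True)][:top_k]

semantic = ["art_53", "art_12", "art_9"]
bm25     = ["art_12", "art_53", "art_4"]
metadata = ["art_12", "art_9"]
print(weighted_rrf([semantic, bm25, metadata], weights=[0.5, 0.3, 0.2]))
# → ['art_12', 'art_53', 'art_9', 'art_4']
```

`art_12` wins because it appears in all three lists; the constant `k=60` dampens the advantage of the very top ranks, which is the standard RRF setting.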

---

## 📋 Version History

### Latest Updates (Feb 2026)
- ✅ Added `app_final_updated.py` with improved decision-tree logic
- ✅ Simplified evaluation output (question, ground_truth, answer, score)
- ✅ Created a curated 5-question test dataset covering 5 legal categories
- ✅ Added a comprehensive `.gitignore` for repository management
- ✅ Updated documentation with all recent changes
- ✅ Improved Arabic RTL support and number formatting

### Previous Features
- Multi-source legal document support (Constitution, Civil, Labour, etc.)
- Hybrid retrieval with RRF (Reciprocal Rank Fusion)
- Arabic-specific reranker integration
- Phoenix tracing for observability
- RAGAS-based evaluation system

---

## 📞 Support

For issues, check that:
1. The `.env` file has the correct API keys
2. All dependencies are installed
3. The `reranker/` folder exists and contains the model files
4. You have an internet connection for API calls

---

## 📄 License

This project is for educational purposes - Egyptian Constitution Legal Assistant.
app_final.py ADDED
@@ -0,0 +1,625 @@
```python
# -*- coding: utf-8 -*-
import os
import sys
import json
from dotenv import load_dotenv
import streamlit as st
import logging
import warnings

# Suppress progress bars from transformers/tqdm
os.environ['TRANSFORMERS_NO_PROGRESS_BAR'] = '1'
warnings.filterwarnings('ignore')

# 1. Loaders & Splitters
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.retrievers import BaseRetriever
from langchain_core.callbacks import CallbackManagerForRetrieverRun
from typing import List
from rank_bm25 import BM25Okapi
import numpy as np

# 2. Vector Store & Embeddings
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings

# 3. Reranker Imports
from langchain_classic.retrievers.document_compressors import CrossEncoderReranker
from langchain_classic.retrievers import ContextualCompressionRetriever
from langchain_community.cross_encoders import HuggingFaceCrossEncoder

# 4. LLM
from langchain_groq import ChatGroq
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableParallel

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

load_dotenv()

# ==========================================
# 🎨 UI SETUP (CSS FOR ARABIC & RTL)
# ==========================================
st.set_page_config(page_title="المساعد القانوني", page_icon="⚖️")

# This CSS block fixes the "001" number issue and right alignment
st.markdown("""
<style>
/* Force the main app container to be Right-to-Left */
.stApp {
    direction: rtl;
    text-align: right;
}

/* Fix input fields to type from the right */
.stTextInput input {
    direction: rtl;
    text-align: right;
}

/* Fix chat message alignment */
.stChatMessage {
    direction: rtl;
    text-align: right;
}

/* Ensure proper paragraph spacing */
.stMarkdown p {
    margin: 0.5em 0 !important;
    line-height: 1.6;
    word-spacing: 0.1em;
}

/* Ensure numbers display correctly in RTL */
p, div, span, label {
    unicode-bidi: embed;
    direction: inherit;
    white-space: normal;
    word-wrap: break-word;
}

/* Force all content to respect RTL */
* {
    direction: rtl !important;
}

/* Preserve line breaks and spacing */
.stMarkdown pre {
    direction: rtl;
    text-align: right;
    white-space: pre-wrap;
    word-wrap: break-word;
}

/* Hide the "Deploy" button and standard menu for a cleaner look */
#MainMenu {visibility: hidden;}
footer {visibility: hidden;}

</style>
""", unsafe_allow_html=True)
```
```python
# Helper: convert Western digits to Eastern Arabic digits
def convert_to_eastern_arabic(text):
    """Converts 0123456789 to ٠١٢٣٤٥٦٧٨٩"""
    if not isinstance(text, str):
        return text
    western_numerals = '0123456789'
    eastern_numerals = '٠١٢٣٤٥٦٧٨٩'
    translation_table = str.maketrans(western_numerals, eastern_numerals)
    return text.translate(translation_table)

st.title("⚖️ المساعد القانوني الذكي (دستور مصر)")

# ==========================================
# 🚀 CACHED RESOURCE LOADING
# ==========================================
# This decorator tells Streamlit: "Run this ONCE and cache the result."
@st.cache_resource
def initialize_rag_pipeline():
    print("🔄 Initializing system...")
    print("📥 Loading data...")

    # 1. Load JSON
    json_path = "Egyptian_Constitution_legalnature_only.json"
    if not os.path.exists(json_path):
        raise FileNotFoundError(f"File not found: {json_path}")

    with open(json_path, "r", encoding="utf-8") as f:
        data = json.load(f)

    # Create a mapping of article numbers for cross-reference lookup
    article_map = {str(item['article_number']): item for item in data}

    docs = []
    for item in data:
        # Build the cross-reference section
        cross_ref_text = ""
        if item.get('cross_references') and len(item['cross_references']) > 0:
            cross_ref_text = "\nالمواد ذات الصلة (المراجع المتقاطعة): " + ", ".join(
                [f"المادة {ref}" for ref in item['cross_references']]
            )

        # Construct the page content
        page_content = f"""
رقم المادة: {item['article_number']}
النص الأصلي: {item['original_text']}
الشرح المبسط: {item['simplified_summary']}{cross_ref_text}
"""
        metadata = {
            "article_id": item['article_id'],
            "article_number": str(item['article_number']),
            "legal_nature": item['legal_nature'],
            "keywords": ", ".join(item['keywords']),
            "part": item.get('part (Bab)', ''),
            "chapter": item.get('chapter (Fasl)', ''),
            # Chroma metadata must be scalar, so the list is stored as a string
            "cross_references": ", ".join([str(ref) for ref in item.get('cross_references', [])])
        }
        docs.append(Document(page_content=page_content, metadata=metadata))

    print(f"✅ Loaded {len(docs)} constitutional articles")

    # 2. Embeddings
    print("Loading embeddings model...")
    embeddings = HuggingFaceEmbeddings(
        model_name="Omartificial-Intelligence-Space/GATE-AraBert-v1"
    )
    print("✅ Embeddings model ready")

    # 3. No splitting - keep articles as complete units
    chunks = docs

    # 4. Vector Store
    print("Building vector database...")
    vectorstore = Chroma.from_documents(
        chunks,
        embeddings,
        persist_directory="chroma_db"
    )
    base_retriever = vectorstore.as_retriever(search_kwargs={"k": 15})
    print("✅ Vector database ready")
```
```python
    # 5. BM25 keyword retriever
    class BM25Retriever(BaseRetriever):
        """BM25-based keyword retriever for constitutional articles"""
        corpus_docs: List[Document]
        bm25: BM25Okapi = None
        k: int = 15

        class Config:
            arbitrary_types_allowed = True

        def __init__(self, **data):
            super().__init__(**data)
            # Tokenize the corpus for BM25
            tokenized_corpus = [doc.page_content.split() for doc in self.corpus_docs]
            self.bm25 = BM25Okapi(tokenized_corpus)

        def _get_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            # Tokenize the query and score the corpus
            tokenized_query = query.split()
            scores = self.bm25.get_scores(tokenized_query)
            # Return the top-k documents with a positive score
            top_indices = np.argsort(scores)[::-1][:self.k]
            return [self.corpus_docs[i] for i in top_indices if scores[i] > 0]

        async def _aget_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            return self._get_relevant_documents(query, run_manager=run_manager)

    bm25_retriever = BM25Retriever(corpus_docs=docs, k=15)
    print("✅ BM25 keyword retriever ready")

    # 6. Metadata filter retriever
    class MetadataFilterRetriever(BaseRetriever):
        """Metadata-based filtering retriever"""
        corpus_docs: List[Document]
        k: int = 15

        class Config:
            arbitrary_types_allowed = True

        def _get_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            query_lower = query.lower()
            scored_docs = []

            for doc in self.corpus_docs:
                score = 0
                # Match keywords
                keywords = doc.metadata.get('keywords', '').lower()
                if any(word in keywords for word in query_lower.split()):
                    score += 3

                # Match legal nature
                legal_nature = doc.metadata.get('legal_nature', '').lower()
                if any(word in legal_nature for word in query_lower.split()):
                    score += 2

                # Match part/chapter
                part = doc.metadata.get('part', '').lower()
                chapter = doc.metadata.get('chapter', '').lower()
                if any(word in part or word in chapter for word in query_lower.split()):
                    score += 1

                # Match in content
                if any(word in doc.page_content.lower() for word in query_lower.split()):
                    score += 1

                if score > 0:
                    scored_docs.append((doc, score))

            # Sort by score and return the top k
            scored_docs.sort(key=lambda x: x[1], reverse=True)
            return [doc for doc, _ in scored_docs[:self.k]]

        async def _aget_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            return self._get_relevant_documents(query, run_manager=run_manager)

    metadata_retriever = MetadataFilterRetriever(corpus_docs=docs, k=15)
    print("✅ Metadata filter retriever ready")
```
```python
    # 7. Hybrid RRF retriever
    class HybridRRFRetriever(BaseRetriever):
        """Combines semantic, BM25, and metadata retrievers using Reciprocal Rank Fusion"""
        semantic_retriever: BaseRetriever
        bm25_retriever: BM25Retriever
        metadata_retriever: MetadataFilterRetriever
        beta_semantic: float = 0.6   # Weight for semantic search
        beta_keyword: float = 0.2    # Weight for BM25 keyword search
        beta_metadata: float = 0.2   # Weight for metadata filtering
        k: int = 60                  # RRF constant (typically 60)
        top_k: int = 15

        class Config:
            arbitrary_types_allowed = True

        def _get_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            # Get results from all three retrievers
            semantic_docs = self.semantic_retriever.invoke(query)
            bm25_docs = self.bm25_retriever.invoke(query)
            metadata_docs = self.metadata_retriever.invoke(query)

            # Apply weighted Reciprocal Rank Fusion
            rrf_scores = {}

            # Process semantic results
            for rank, doc in enumerate(semantic_docs, start=1):
                doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
                rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_semantic / (self.k + rank)

            # Process BM25 results
            for rank, doc in enumerate(bm25_docs, start=1):
                doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
                rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_keyword / (self.k + rank)

            # Process metadata results
            for rank, doc in enumerate(metadata_docs, start=1):
                doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
                rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_metadata / (self.k + rank)

            # Create a document lookup
            all_docs = {}
            for doc in semantic_docs + bm25_docs + metadata_docs:
                doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
                if doc_id not in all_docs:
                    all_docs[doc_id] = doc

            # Sort by RRF score
            sorted_doc_ids = sorted(rrf_scores.items(), key=lambda x: x[1], reverse=True)

            # Return the top-k documents
            result_docs = []
            for doc_id, score in sorted_doc_ids[:self.top_k]:
                if doc_id in all_docs:
                    result_docs.append(all_docs[doc_id])

            return result_docs

        async def _aget_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            return self._get_relevant_documents(query, run_manager=run_manager)

    # Create the hybrid retriever with tuned beta weights
    hybrid_retriever = HybridRRFRetriever(
        semantic_retriever=base_retriever,
        bm25_retriever=bm25_retriever,
        metadata_retriever=metadata_retriever,
        beta_semantic=0.5,   # Semantic search gets the highest weight (most reliable)
        beta_keyword=0.3,    # BM25 keyword search (good for exact term matches)
        beta_metadata=0.2,   # Metadata filtering (supporting role)
        k=60,
        top_k=20
    )
    print("✅ Hybrid RRF retriever ready with β weights: semantic=0.5, keyword=0.3, metadata=0.2")
```
```python
    # 8. Cross-reference enhanced retriever
    class CrossReferenceRetriever(BaseRetriever):
        """Enhances retrieval by automatically fetching cross-referenced articles"""
        base_retriever: BaseRetriever
        article_map: dict

        class Config:
            arbitrary_types_allowed = True

        def _get_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            # Get the initial results
            initial_docs = self.base_retriever.invoke(query)

            # Collect all related article numbers
            all_article_numbers = set()
            for doc in initial_docs:
                if 'article_number' in doc.metadata:
                    all_article_numbers.add(doc.metadata['article_number'])
                # Parse cross_references (stored as a comma-separated string)
                cross_refs_str = doc.metadata.get('cross_references', '')
                if cross_refs_str:
                    cross_refs = [ref.strip() for ref in cross_refs_str.split(',')]
                    for ref in cross_refs:
                        if ref:  # Skip empty strings
                            all_article_numbers.add(str(ref))

            # Build the enhanced document list
            enhanced_docs = []
            seen_numbers = set()

            # Add the initially retrieved documents
            for doc in initial_docs:
                enhanced_docs.append(doc)
                seen_numbers.add(doc.metadata.get('article_number'))

            # Add cross-referenced articles not yet retrieved
            for article_num in all_article_numbers:
                if article_num not in seen_numbers and article_num in self.article_map:
                    article_data = self.article_map[article_num]
                    cross_ref_text = ""
                    if article_data.get('cross_references'):
                        cross_ref_text = "\nالمواد ذات الصلة: " + ", ".join(
                            [f"المادة {ref}" for ref in article_data['cross_references']]
                        )

                    page_content = f"""
رقم المادة: {article_data['article_number']}
النص الأصلي: {article_data['original_text']}
الشرح المبسط: {article_data['simplified_summary']}{cross_ref_text}
"""

                    enhanced_doc = Document(
                        page_content=page_content,
                        metadata={
                            "article_id": article_data['article_id'],
                            "article_number": str(article_data['article_number']),
                            "legal_nature": article_data['legal_nature'],
                            "keywords": ", ".join(article_data['keywords']),
                            "cross_references": ", ".join([str(ref) for ref in article_data.get('cross_references', [])])
                        }
                    )
                    enhanced_docs.append(enhanced_doc)
                    seen_numbers.add(article_num)

            return enhanced_docs

        async def _aget_relevant_documents(
            self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
        ) -> List[Document]:
            return self._get_relevant_documents(query, run_manager=run_manager)

    cross_ref_retriever = CrossReferenceRetriever(
        base_retriever=hybrid_retriever,
        article_map=article_map
    )
    print("✅ Cross-reference retriever ready (using hybrid RRF base)")
```
428
+
429
+ # 9. Reranker
430
+ print("Loading reranker model...")
431
+ local_model_path = r"D:\FOE\Senior 2\Graduation Project\Chatbot_me\reranker"
432
+
433
+ if not os.path.exists(local_model_path):
434
+ raise FileNotFoundError(f"Reranker path not found: {local_model_path}")
435
+
436
+ model = HuggingFaceCrossEncoder(model_name=local_model_path)
437
+ compressor = CrossEncoderReranker(model=model, top_n=5)
438
+
439
+ compression_retriever = ContextualCompressionRetriever(
440
+ base_compressor=compressor,
441
+ base_retriever=cross_ref_retriever
442
+ )
443
+ print("✅ Reranker model ready")
444
+
445
+ # 7. LLM - Balanced for consistency with slight creativity
446
+ # 7. LLM Configuration
447
+ llm = ChatGroq(
448
+ groq_api_key=os.getenv("GROQ_API_KEY"),
449
+ model_name="llama-3.1-8b-instant",
450
+ temperature=0.3, # Slightly increased to allow helpful general advice
451
+ model_kwargs={"top_p": 0.9}
452
+ )
453
+
454
+ # ==================================================
455
+ # 🛠️ THE FIX: SEPARATE SYSTEM INSTRUCTIONS FROM USER INPUT
456
+ # ==================================================
457
+
458
+ # ==================================================
459
+ # 🧠 PROMPT ENGINEERING: DECISION TREE LOGIC
460
+ # ==================================================
461
+
462
+ system_instructions = """
463
+ <role>
464
+ أنت "المساعد القانوني الذكي"، خبير متخصص في الدستور المصري والقوانين الإجرائية.
465
+ مهمتك: تقديم إجابات دقيقة بناءً على "السياق التشريعي" المرفق أولاً، أو تقديم نصائح إجرائية عامة عند الضرورة.
466
+ </role>
467
+
468
+ <decision_logic>
469
+ عليك تحليل "سؤال المستخدم" و"السياق التشريعي" وتصنيف الحالة واختيار الرد المناسب بناءً على القواعد التالية بدقة:
470
+
471
+ 🔴 الحالة الأولى: (الإجابة موجودة في السياق التشريعي)
472
+ الشرط: إذا وجدت معلومات داخل "السياق التشريعي المتاح" تجيب على السؤال.
473
+ الفعل:
474
+ 1. استخرج الإجابة من السياق فقط.
475
+ 2. ابدأ الإجابة مباشرة دون مقدمات.
476
+ 3. يجب توثيق الإجابة برقم المادة (مثال: "نصت المادة (50) على...").
477
+ 4. توقف هنا. لا تضف أي معلومات خارجية.
478
+
479
+ 🟡 الحالة الثانية: (السياق فارغ/غير مفيد + السؤال إجرائي/عملي)
480
+ الشرط: إذا لم تجد الإجابة في السياق، وكان السؤال عن إجراءات عملية (مثل: حادث، سرقة، طلاق، تحرير محضر، تعامل مع الشرطة).
481
+ الفعل:
482
+ 1. تجاهل السياق الفارغ.
483
+ 2. استخدم معرفتك العامة بالقانون المصري.
484
+ 3. ابدأ وجوباً بعبارة: "بناءً على الإجراءات القانونية العامة في مصر (وليس نصاً دستورياً محدداً):"
485
+ 4. قدم الخطوات في نقاط مرقمة واضحة ومختصرة (1، 2، 3).
486
+ 5. تحذير: لا تذكر أرقام مواد قانونية (لا تخترع أرقام مواد).
487
+
488
+ 🔵 الحالة الثالثة: (السياق فارغ + السؤال عن نص دستوري محدد)
489
+ الشرط: إذا سأل عن (مجلس الشعب، الشورى، مادة محددة) ولم تجدها في السياق.
490
+ الفعل:
491
+ 1. قل بوضوح: "عذراً، لم يرد ذكر لهذا الموضوع في المواد الدستورية التي تم استرجاعها في السياق الحالي."
492
+ 2. لا تحاول الإجابة من ذاكرتك لكي لا تخطئ في النصوص الدستورية الحساسة.
493
+
494
+ 🟢 الحالة الرابعة: (محادثة ودية)
495
+ الشرط: تحية، شكر، أو "كيف حالك".
496
+ الفعل: رد بتحية مهذبة جداً ومقتضبة، ثم قل: "أنا جاهز للإجابة على استفساراتك القانونية."
497
+
498
+ ⚫ الحالة الخامسة: (خارج النطاق تماماً)
499
+ الشرط: طبخ، رياضة، برمجة، أو أي موضوع غير قانوني.
500
+ الفعل: اعتذر بلطف ووجه المستخدم للسؤال في القانون.
501
+ </decision_logic>
502
+
503
+ <formatting_rules>
504
+ - لا تكرر هذه التعليمات في ردك.
505
+ - استخدم فقرات قصيرة واترك سطراً فارغاً بينها.
506
+ - لا تستخدم عبارات مثل "بناء على السياق المرفق" في بداية الجملة، بل ادخل في صلب الموضوع فوراً.
507
+ - التزم باللغة العربية الفصحى المبسطة والرصينة.
508
+ </formatting_rules>
509
+ """
510
+
511
+ # We use .from_messages to strictly separate instructions from data
512
+ prompt = ChatPromptTemplate.from_messages([
513
+ ("system", system_instructions),
514
+ ("system", "السياق التشريعي المتاح (المصدر الأساسي):\n{context}"),
515
+ ("human", "سؤال المستفيد:\n{input}")
516
+ ])
517
+
518
+ # 11. Build Chain with RunnableParallel (returns both context and answer)
519
+ qa_chain = (
520
+ RunnableParallel({
521
+ "context": compression_retriever,
522
+ "input": RunnablePassthrough()
523
+ })
524
+ .assign(answer=(
525
+ prompt
526
+ | llm
527
+ | StrOutputParser()
528
+ ))
529
+ )
530
+
531
+ print("✅ System ready to use!")
532
+ return qa_chain
533
+
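The `RunnableParallel` + `.assign` pattern above first materializes a dict of `{"context", "input"}`, then adds the generated `answer` key, which is why the UI can read both `result["answer"]` and `result["context"]` from a single `invoke`. A minimal pure-Python sketch of that data flow (no LangChain; `retriever` and `generate` are placeholder callables, not names from the source):

```python
# Pure-Python sketch of the RunnableParallel + .assign data flow.
def run_chain(question, retriever, generate):
    state = {"context": retriever(question), "input": question}  # RunnableParallel step
    state["answer"] = generate(state)                            # .assign(answer=...) step
    return state

result = run_chain(
    "ما هي مدة الرئاسة؟",
    retriever=lambda q: ["المادة 140: ..."],
    generate=lambda s: f"استناداً إلى {len(s['context'])} مادة",
)
# result now holds 'context', 'input', and 'answer' keys
```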
534
+ # ==========================================
535
+ # ⚡ MAIN EXECUTION
536
+ # ==========================================
537
+
538
+ try:
539
+ # Only need the chain now - it handles all retrieval internally
540
+ qa_chain = initialize_rag_pipeline()
541
+
542
+ except Exception as e:
543
+ st.error(f"Critical Error loading application: {e}")
544
+ st.stop()
545
+
546
+ # ==========================================
547
+ # 💬 CHAT LOOP
548
+ # ==========================================
549
+ if "messages" not in st.session_state:
550
+ st.session_state.messages = []
551
+
552
+ # Display Chat History (with Eastern Arabic numerals)
553
+ for message in st.session_state.messages:
554
+ with st.chat_message(message["role"]):
555
+ # Convert to Eastern Arabic when displaying from history
556
+ st.markdown(convert_to_eastern_arabic(message["content"]))
557
+
558
+ # Handle New User Input
559
+ if prompt_input := st.chat_input("اكتب سؤالك القانوني هنا..."):
560
+ # Show user message
561
+ st.session_state.messages.append({"role": "user", "content": prompt_input})
562
+ with st.chat_message("user"):
563
+ st.markdown(prompt_input)
564
+
565
+ # Generate Response
566
+ with st.chat_message("assistant"):
567
+ with st.spinner("جاري التحليل القانوني..."):
568
+ try:
569
+ # Invoke chain ONCE - returns Dict with 'context', 'input', and 'answer'
570
+ result = qa_chain.invoke(prompt_input)
571
+
572
+ # Extract answer and context from result
573
+ response_text = result["answer"]
574
+ source_docs = result["context"] # Context is already in the result!
575
+
576
+ # Display Answer
577
+ response_text_arabic = convert_to_eastern_arabic(response_text)
578
+ st.markdown(response_text_arabic)
579
+
580
+ # Display Sources
581
+ if source_docs and len(source_docs) > 0:
582
+ print(f"✅ Found {len(source_docs)} documents")
583
+ # Deduplicate documents by article_number
584
+ seen_articles = set()
585
+ unique_docs = []
586
+
587
+ for doc in source_docs:
588
+ article_num = str(doc.metadata.get('article_number', '')).strip()
589
+ if article_num and article_num not in seen_articles:
590
+ seen_articles.add(article_num)
591
+ unique_docs.append(doc)
592
+
593
+ st.markdown("---") # Separator before sources
594
+
595
+ if unique_docs:
596
+ with st.expander(f"📚 المصادر المستخدمة ({len(unique_docs)} مادة)"):
597
+ st.markdown("### المواد الدستورية المستخدمة في التحليل:")
598
+ st.markdown("---")
599
+
600
+ for idx, doc in enumerate(unique_docs, 1):
601
+ article_num = str(doc.metadata.get('article_number', '')).strip()
602
+ legal_nature = doc.metadata.get('legal_nature', '')
603
+
604
+ if article_num:
605
+ st.markdown(f"**المادة رقم {convert_to_eastern_arabic(article_num)}**")
606
+ if legal_nature:
607
+ st.markdown(f"*الطبيعة القانونية: {legal_nature}*")
608
+
609
+ # Display article content
610
+ content_lines = doc.page_content.strip().split('\n')
611
+ for line in content_lines:
612
+ line = line.strip()
613
+ if line:
614
+ st.markdown(convert_to_eastern_arabic(line))
615
+
616
+ st.markdown("---")
617
+ else:
618
+ st.info("📌 لم يتم العثور على مصادر")
619
+ else:
620
+ st.info("📌 لم يتم العثور على مصادر")
621
+
622
+ # Persist the raw answer to avoid double conversion glitches on rerun
623
+ st.session_state.messages.append({"role": "assistant", "content": response_text})
624
+ except Exception as e:
625
+ st.error(f"حدث خطأ: {e}")
app_final_pheonix.py ADDED
@@ -0,0 +1,838 @@
1
+ # === Phoenix Observability Setup ===
2
+ import os
3
+ from datetime import datetime
4
+
5
+ try:
6
+ # OpenTelemetry SDK + OTLP exporter (Phoenix consumes OTLP)
7
+ from opentelemetry import trace
8
+ from opentelemetry.sdk.resources import SERVICE_NAME, Resource
9
+ from opentelemetry.sdk.trace import TracerProvider
10
+ from opentelemetry.sdk.trace.export import BatchSpanProcessor
11
+ from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
12
+ PHOENIX_AVAILABLE = True
13
+ except Exception:
14
+ PHOENIX_AVAILABLE = False
15
+
16
+
17
+ def setup_phoenix_tracing():
18
+ """Configure OTLP tracing for Phoenix. Uses PHOENIX_OTLP_ENDPOINT env if set."""
19
+ if not PHOENIX_AVAILABLE:
20
+ return None
21
+
22
+ service_name = os.getenv("PHOENIX_SERVICE_NAME", "constitutional-assistant")
23
+ otlp_endpoint = os.getenv("PHOENIX_OTLP_ENDPOINT", "http://localhost:6006/v1/traces")
24
+
25
+ resource = Resource(attributes={SERVICE_NAME: service_name})
26
+ provider = TracerProvider(resource=resource)
27
+ exporter = OTLPSpanExporter(endpoint=otlp_endpoint)
28
+ span_processor = BatchSpanProcessor(exporter)
29
+ provider.add_span_processor(span_processor)
30
+ trace.set_tracer_provider(provider)
31
+ return trace.get_tracer(service_name)
32
+
33
+
34
+ # Create a module-level tracer
35
+ _phoenix_tracer = setup_phoenix_tracing()
36
+
37
+
38
+ class PhoenixSpan:
39
+ """Context manager helper to create spans with proper parent-child hierarchy."""
40
+ def __init__(self, name: str, attributes: dict | None = None, kind: str = "INTERNAL"):
41
+ self.name = name
42
+ self.attributes = attributes or {}
43
+ self.kind = kind
44
+ self._span_context = None
45
+ self._span = None
46
+ self._start_time = None
47
+
48
+ def __enter__(self):
49
+ if _phoenix_tracer:
50
+ from opentelemetry.trace import SpanKind
51
+ import time
52
+ self._start_time = time.time()
53
+
54
+ # Map string kind to SpanKind enum
55
+ kind_map = {
56
+ "CLIENT": SpanKind.CLIENT,
57
+ "SERVER": SpanKind.SERVER,
58
+ "INTERNAL": SpanKind.INTERNAL,
59
+ }
60
+ span_kind = kind_map.get(self.kind, SpanKind.INTERNAL)
61
+
62
+ # Use start_as_current_span to establish parent-child relationships
63
+ self._span_context = _phoenix_tracer.start_as_current_span(
64
+ self.name,
65
+ kind=span_kind
66
+ )
67
+ self._span = self._span_context.__enter__()
68
+ for k, v in self.attributes.items():
69
+ try:
70
+ self._span.set_attribute(k, v)
71
+ except Exception:
72
+ pass
73
+ return self
74
+
75
+ def set_attr(self, key: str, value):
76
+ if self._span:
77
+ try:
78
+ self._span.set_attribute(key, value)
79
+ except Exception:
80
+ pass
81
+
82
+ def __exit__(self, exc_type, exc, tb):
83
+ if self._span_context:
84
+ try:
85
+ if exc_type:
86
+ self._span.record_exception(exc)
87
+ from opentelemetry.trace import Status, StatusCode
88
+ self._span.set_status(Status(StatusCode.ERROR, str(exc)))
89
+ else:
90
+ # Add duration as attribute
91
+ if self._start_time:
92
+ import time
93
+ duration = time.time() - self._start_time
94
+ self._span.set_attribute("duration_ms", round(duration * 1000, 2))
95
+ from opentelemetry.trace import Status, StatusCode
96
+ self._span.set_status(Status(StatusCode.OK))
97
+ self._span_context.__exit__(exc_type, exc, tb)
98
+ except Exception:
99
+ pass
100
+
101
+ # -*- coding: utf-8 -*-
102
+ import os
103
+ import sys
104
+ import json
105
+ from dotenv import load_dotenv
106
+ import streamlit as st
107
+ import logging
108
+ import warnings
109
+
110
+ # Suppress progress bars from transformers/tqdm
111
+ os.environ['TRANSFORMERS_NO_PROGRESS_BAR'] = '1'
112
+ warnings.filterwarnings('ignore')
113
+
114
+ # 1. Loaders & Splitters
115
+ from langchain_core.documents import Document
116
+ from langchain_text_splitters import RecursiveCharacterTextSplitter
117
+ from langchain_core.retrievers import BaseRetriever
118
+ from langchain_core.callbacks import CallbackManagerForRetrieverRun
119
+ from typing import List
120
+ from rank_bm25 import BM25Okapi
121
+ import numpy as np
122
+
123
+ # 2. Vector Store & Embeddings
124
+ from langchain_chroma import Chroma
125
+ from langchain_huggingface import HuggingFaceEmbeddings
126
+
127
+ # 3. Reranker Imports
128
+ from langchain_classic.retrievers.document_compressors import CrossEncoderReranker
129
+ from langchain_classic.retrievers import ContextualCompressionRetriever
130
+ from langchain_community.cross_encoders import HuggingFaceCrossEncoder
131
+
132
+ # 4. LLM
133
+ from langchain_groq import ChatGroq
134
+ from langchain_core.prompts import ChatPromptTemplate
135
+ from langchain_core.output_parsers import StrOutputParser
136
+ from langchain_core.runnables import RunnablePassthrough, RunnableParallel
137
+
138
+ # Configure logging
139
+ logging.basicConfig(level=logging.INFO)
140
+ logger = logging.getLogger(__name__)
141
+
142
+ load_dotenv()
143
+
144
+ # ==========================================
145
+ # 🎨 UI SETUP (CSS FOR ARABIC & RTL)
146
+ # ==========================================
147
+ st.set_page_config(page_title="المساعد القانوني", page_icon="⚖️")
148
+
149
+ # This CSS block fixes the "001" number issue and right alignment
150
+ st.markdown("""
151
+ <style>
152
+ /* Force the main app container to be Right-to-Left */
153
+ .stApp {
154
+ direction: rtl;
155
+ text-align: right;
156
+ }
157
+
158
+ /* Fix input fields to type from right */
159
+ .stTextInput input {
160
+ direction: rtl;
161
+ text-align: right;
162
+ }
163
+
164
+ /* Fix chat messages alignment */
165
+ .stChatMessage {
166
+ direction: rtl;
167
+ text-align: right;
168
+ }
169
+
170
+ /* Ensure proper paragraph spacing */
171
+ .stMarkdown p {
172
+ margin: 0.5em 0 !important;
173
+ line-height: 1.6;
174
+ word-spacing: 0.1em;
175
+ }
176
+
177
+ /* Ensure numbers display correctly in RTL */
178
+ p, div, span, label {
179
+ unicode-bidi: embed;
180
+ direction: inherit;
181
+ white-space: normal;
182
+ word-wrap: break-word;
183
+ }
184
+
185
+ /* Force all content to respect RTL */
186
+ * {
187
+ direction: rtl !important;
188
+ }
189
+
190
+ /* Preserve line breaks and spacing */
191
+ .stMarkdown pre {
192
+ direction: rtl;
193
+ text-align: right;
194
+ white-space: pre-wrap;
195
+ word-wrap: break-word;
196
+ }
197
+
198
+ /* Hide the "Deploy" button and standard menu for cleaner look */
199
+ #MainMenu {visibility: hidden;}
200
+ footer {visibility: hidden;}
201
+
202
+ </style>
203
+ """, unsafe_allow_html=True)
204
+
205
+ # Put this at the top of your code
206
+ def convert_to_eastern_arabic(text):
207
+ """Converts 0123456789 to ٠١٢٣٤٥٦٧٨٩"""
208
+ if not isinstance(text, str):
209
+ return text
210
+ western_numerals = '0123456789'
211
+ eastern_numerals = '٠١٢٣٤٥٦٧٨٩'
212
+ translation_table = str.maketrans(western_numerals, eastern_numerals)
213
+ return text.translate(translation_table)
214
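As a standalone illustration of the translation-table approach used by `convert_to_eastern_arabic`:

```python
# str.maketrans builds a per-character mapping; translate applies it
# while leaving non-digit characters untouched.
table = str.maketrans('0123456789', '٠١٢٣٤٥٦٧٨٩')
converted = "نصت المادة 50 لسنة 2014".translate(table)
# → "نصت المادة ٥٠ لسنة ٢٠١٤"
```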
+
215
+ st.title("⚖️ المساعد القانوني الذكي (دستور مصر)")
216
+
217
+ # ==========================================
218
+ # 🚀 CACHED RESOURCE LOADING (THE FIX)
219
+ # ==========================================
220
+ # This decorator tells Streamlit: "Run this ONCE and save the result."
221
+ @st.cache_resource
222
+ def initialize_rag_pipeline():
223
+ print("🔄 Initializing system...")
224
+ print("📥 Loading data...")
225
+
226
+ # 1. Load JSON
227
+ json_path = "Egyptian_Constitution_legalnature_only.json"
228
+ if not os.path.exists(json_path):
229
+ raise FileNotFoundError(f"File not found: {json_path}")
230
+
231
+ with open(json_path, "r", encoding="utf-8") as f:
232
+ data = json.load(f)
233
+
234
+ # Create a mapping of article numbers for cross-reference lookup
235
+ article_map = {str(item['article_number']): item for item in data}
236
+
237
+ docs = []
238
+ for item in data:
239
+ # Build cross-reference section
240
+ cross_ref_text = ""
241
+ if item.get('cross_references') and len(item['cross_references']) > 0:
242
+ cross_ref_text = "\nالمواد ذات الصلة (المراجع المتقاطعة): " + ", ".join(
243
+ [f"المادة {ref}" for ref in item['cross_references']]
244
+ )
245
+
246
+ # Construct content
247
+ page_content = f"""
248
+ رقم المادة: {item['article_number']}
249
+ النص الأصلي: {item['original_text']}
250
+ الشرح المبسط: {item['simplified_summary']}{cross_ref_text}
251
+ """
252
+ metadata = {
253
+ "article_id": item['article_id'],
254
+ "article_number": str(item['article_number']),
255
+ "legal_nature": item['legal_nature'],
256
+ "keywords": ", ".join(item['keywords']),
257
+ "part": item.get('part (Bab)', ''),
258
+ "chapter": item.get('chapter (Fasl)', ''),
259
+ "cross_references": ", ".join([str(ref) for ref in item.get('cross_references', [])]) # Convert list to string
260
+ }
261
+ docs.append(Document(page_content=page_content, metadata=metadata))
262
+
263
+ print(f"✅ Loaded {len(docs)} constitutional articles")
264
+
265
+ # 2. Embeddings
266
+ print("Loading embeddings model...")
267
+ embeddings = HuggingFaceEmbeddings(
268
+ model_name="Omartificial-Intelligence-Space/GATE-AraBert-v1"
269
+ )
270
+ print("✅ Embeddings model ready")
271
+
272
+ # 3. No splitting - keep articles as complete units
273
+ chunks = docs
274
+
275
+ # 4. Vector Store
276
+ print("Building vector database...")
277
+ vectorstore = Chroma.from_documents(
278
+ chunks,
279
+ embeddings,
280
+ persist_directory="chroma_db"
281
+ )
282
+ base_retriever = vectorstore.as_retriever(search_kwargs={"k": 15})
283
+ print("✅ Vector database ready")
284
+
285
+ # 5. Create BM25 Keyword Retriever
286
+ class BM25Retriever(BaseRetriever):
287
+ """BM25-based keyword retriever for constitutional articles"""
288
+ corpus_docs: List[Document]
289
+ bm25: BM25Okapi = None
290
+ k: int = 15
291
+
292
+ class Config:
293
+ arbitrary_types_allowed = True
294
+
295
+ def __init__(self, **data):
296
+ super().__init__(**data)
297
+ # Tokenize corpus for BM25
298
+ tokenized_corpus = [doc.page_content.split() for doc in self.corpus_docs]
299
+ self.bm25 = BM25Okapi(tokenized_corpus)
300
+
301
+ def _get_relevant_documents(
302
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
303
+ ) -> List[Document]:
304
+ # Tokenize query
305
+ tokenized_query = query.split()
306
+ # Get BM25 scores
307
+ scores = self.bm25.get_scores(tokenized_query)
308
+ # Get top k indices
309
+ top_indices = np.argsort(scores)[::-1][:self.k]
310
+ # Return documents
311
+ return [self.corpus_docs[i] for i in top_indices if scores[i] > 0]
312
+
313
+ async def _aget_relevant_documents(
314
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
315
+ ) -> List[Document]:
316
+ return self._get_relevant_documents(query, run_manager=run_manager)
317
+
318
+ bm25_retriever = BM25Retriever(corpus_docs=docs, k=15)
319
+ print("✅ BM25 keyword retriever ready")
320
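The top-k selection inside `_get_relevant_documents` can be sketched in isolation: sort the BM25 scores descending via `argsort`, truncate to `k`, and drop zero-score documents (the score values here are illustrative):

```python
import numpy as np

# Rank scores descending, keep the k best, filter out zero-score hits.
scores = np.array([0.0, 2.3, 0.4, 1.1])
k = 2
top_indices = np.argsort(scores)[::-1][:k]      # indices of the two highest scores
kept = [int(i) for i in top_indices if scores[i] > 0]
# → [1, 3]
```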
+
321
+ # 6. Create Metadata Filter Retriever
322
+ class MetadataFilterRetriever(BaseRetriever):
323
+ """Metadata-based filtering retriever"""
324
+ corpus_docs: List[Document]
325
+ k: int = 15
326
+
327
+ class Config:
328
+ arbitrary_types_allowed = True
329
+
330
+ def _get_relevant_documents(
331
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
332
+ ) -> List[Document]:
333
+ query_lower = query.lower()
334
+ scored_docs = []
335
+
336
+ for doc in self.corpus_docs:
337
+ score = 0
338
+ # Match keywords (boosted)
339
+ keywords = doc.metadata.get('keywords', '').lower()
340
+ if any(word in keywords for word in query_lower.split()):
341
+ score += 4
342
+
343
+ # Match legal nature (boosted)
344
+ legal_nature = doc.metadata.get('legal_nature', '').lower()
345
+ if any(word in legal_nature for word in query_lower.split()):
346
+ score += 3
347
+
348
+ # Match part/chapter
349
+ part = doc.metadata.get('part', '').lower()
350
+ chapter = doc.metadata.get('chapter', '').lower()
351
+ if any(word in part or word in chapter for word in query_lower.split()):
352
+ score += 1
353
+
354
+ # Match in content
355
+ if any(word in doc.page_content.lower() for word in query_lower.split()):
356
+ score += 1
357
+
358
+ if score > 0:
359
+ scored_docs.append((doc, score))
360
+
361
+ # Sort by score and return top k
362
+ scored_docs.sort(key=lambda x: x[1], reverse=True)
363
+ return [doc for doc, _ in scored_docs[:self.k]]
364
+
365
+ async def _aget_relevant_documents(
366
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
367
+ ) -> List[Document]:
368
+ return self._get_relevant_documents(query, run_manager=run_manager)
369
+
370
+ metadata_retriever = MetadataFilterRetriever(corpus_docs=docs, k=15)
371
+ print("✅ Metadata filter retriever ready")
372
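A self-contained sketch of the additive scoring above, simplified to three of the four terms (keyword hits weigh 4, legal-nature hits 3, body-text hits 1; the part/chapter term is omitted, and the plain dict stands in for `Document` metadata):

```python
def metadata_score(query_words, doc):
    score = 0
    if any(w in doc["keywords"] for w in query_words):      # boosted keyword match
        score += 4
    if any(w in doc["legal_nature"] for w in query_words):  # boosted legal-nature match
        score += 3
    if any(w in doc["text"] for w in query_words):          # plain content match
        score += 1
    return score

doc = {"keywords": "حرية, تعبير", "legal_nature": "حريات عامة", "text": "حرية الرأي مكفولة"}
metadata_score(["حرية"], doc)  # keywords +4, text +1 → 5
```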
+
373
+ # 7. Create Hybrid RRF Retriever
374
+ class HybridRRFRetriever(BaseRetriever):
375
+ """Combines semantic, BM25, and metadata retrievers using Reciprocal Rank Fusion"""
376
+ semantic_retriever: BaseRetriever
377
+ bm25_retriever: BM25Retriever
378
+ metadata_retriever: MetadataFilterRetriever
379
+ beta_semantic: float = 0.6 # Weight for semantic search
380
+ beta_keyword: float = 0.25 # Weight for BM25 keyword search
381
+ beta_metadata: float = 0.15 # Weight for metadata filtering
382
+ k: int = 60 # RRF constant (typically 60)
383
+ top_k: int = 25
384
+
385
+ class Config:
386
+ arbitrary_types_allowed = True
387
+
388
+ def _get_relevant_documents(
389
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
390
+ ) -> List[Document]:
391
+ # Get results from all three retrievers (no separate spans - details logged in hybrid_retrieval span)
392
+ semantic_docs = self.semantic_retriever.invoke(query)
393
+ bm25_docs = self.bm25_retriever.invoke(query)
394
+ metadata_docs = self.metadata_retriever.invoke(query)
395
+
396
+ # Apply Reciprocal Rank Fusion
397
+ rrf_scores = {}
398
+
399
+ # Process semantic results
400
+ for rank, doc in enumerate(semantic_docs, start=1):
401
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
402
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_semantic / (self.k + rank)
403
+
404
+ # Process BM25 results
405
+ for rank, doc in enumerate(bm25_docs, start=1):
406
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
407
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_keyword / (self.k + rank)
408
+
409
+ # Process metadata results
410
+ for rank, doc in enumerate(metadata_docs, start=1):
411
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
412
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_metadata / (self.k + rank)
413
+
414
+ # Create document lookup
415
+ all_docs = {}
416
+ for doc in semantic_docs + bm25_docs + metadata_docs:
417
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
418
+ if doc_id not in all_docs:
419
+ all_docs[doc_id] = doc
420
+
421
+ # Sort by RRF score
422
+ sorted_doc_ids = sorted(rrf_scores.items(), key=lambda x: x[1], reverse=True)
423
+
424
+ # Return top k documents
425
+ result_docs = []
426
+ for doc_id, score in sorted_doc_ids[:self.top_k]:
427
+ if doc_id in all_docs:
428
+ result_docs.append(all_docs[doc_id])
429
+
430
+ # Log all retrieval details in one place (no nested spans to avoid hierarchy issues)
431
+ try:
432
+ with PhoenixSpan("hybrid_retrieval", {
433
+ "query": query[:200],
434
+ "beta_semantic": self.beta_semantic,
435
+ "beta_keyword": self.beta_keyword,
436
+ "beta_metadata": self.beta_metadata,
437
+ "rrf_k_constant": self.k,
438
+ "top_k_limit": self.top_k
439
+ }, kind="INTERNAL") as fusion_span:
440
+ # Semantic retrieval details
441
+ fusion_span.set_attr("semantic_input_count", len(semantic_docs))
442
+ if semantic_docs:
443
+ fusion_span.set_attr("semantic_top_5", ", ".join([d.metadata.get('article_number', 'N/A') for d in semantic_docs[:5]]))
444
+
445
+ # BM25 retrieval details
446
+ fusion_span.set_attr("bm25_input_count", len(bm25_docs))
447
+ if bm25_docs:
448
+ fusion_span.set_attr("bm25_top_5", ", ".join([d.metadata.get('article_number', 'N/A') for d in bm25_docs[:5]]))
449
+
450
+ # Metadata retrieval details
451
+ fusion_span.set_attr("metadata_input_count", len(metadata_docs))
452
+ if metadata_docs:
453
+ fusion_span.set_attr("metadata_top_5", ", ".join([d.metadata.get('article_number', 'N/A') for d in metadata_docs[:5]]))
454
+
455
+ # Fusion results
456
+ fusion_span.set_attr("unique_docs_before_fusion", len(all_docs))
457
+ fusion_span.set_attr("final_doc_count", len(result_docs))
458
+ if result_docs:
459
+ top_article_nums = [d.metadata.get('article_number', 'N/A') for d in result_docs[:10]]
460
+ fusion_span.set_attr("fused_top_10_articles", ", ".join(map(str, top_article_nums)))
461
+ # Show top 5 RRF scores
462
+ top_scores = [(doc_id, f"{score:.4f}") for doc_id, score in sorted_doc_ids[:5]]
463
+ fusion_span.set_attr("top_5_rrf_scores", str(top_scores))
464
+ fusion_span.set_attr("top_doc_preview", result_docs[0].page_content[:300])
465
+ except Exception:
466
+ pass
467
+
468
+ return result_docs
469
+
470
+ async def _aget_relevant_documents(
471
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
472
+ ) -> List[Document]:
473
+ return self._get_relevant_documents(query, run_manager=run_manager)
474
+
475
+ # Create hybrid retriever with tuned beta weights
476
+ hybrid_retriever = HybridRRFRetriever(
477
+ semantic_retriever=base_retriever,
478
+ bm25_retriever=bm25_retriever,
479
+ metadata_retriever=metadata_retriever,
480
+ beta_semantic=0.6, # Semantic search gets highest weight (most reliable)
481
+ beta_keyword=0.25, # BM25 keyword search (good for exact term matches)
482
+ beta_metadata=0.15, # Metadata filtering (supporting role)
483
+ k=60,
484
+ top_k=25
485
+ )
486
+ print("✅ Hybrid RRF retriever ready with β weights: semantic=0.6, keyword=0.25, metadata=0.15, top_k=25")
487
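The fusion math above reduces to a few lines. This sketch uses the same β weights and RRF constant k=60; the article numbers and ranked lists are illustrative, not from the source:

```python
# Weighted Reciprocal Rank Fusion: each retriever contributes
# beta / (k + rank) per document, and documents are sorted by total score.
def weighted_rrf(ranked_lists, betas, k=60):
    scores = {}
    for docs, beta in zip(ranked_lists, betas):
        for rank, doc_id in enumerate(docs, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + beta / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["50", "62", "14"]   # ranked outputs of the three retrievers
bm25     = ["62", "7"]
meta     = ["50"]
fused = weighted_rrf([semantic, bm25, meta], [0.6, 0.25, 0.15])
# "62" outranks "50": a top BM25 hit plus a #2 semantic hit beats
# a #1 semantic hit plus a low-weight metadata hit
```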
+
488
+ # 8. Create Cross-Reference Enhanced Retriever
489
+ class CrossReferenceRetriever(BaseRetriever):
490
+ """Enhances retrieval by automatically fetching cross-referenced articles"""
491
+ base_retriever: BaseRetriever
492
+ article_map: dict
493
+
494
+ class Config:
495
+ arbitrary_types_allowed = True
496
+
497
+ def _get_relevant_documents(
498
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
499
+ ) -> List[Document]:
500
+ with PhoenixSpan("cross_reference_expansion", {"query": query[:200]}, kind="INTERNAL") as xref_span:
501
+ # Get initial results
502
+ initial_docs = self.base_retriever.invoke(query)
503
+ xref_span.set_attr("initial_doc_count", len(initial_docs))
504
+
505
+ # Collect all related article numbers
506
+ all_article_numbers = set()
507
+ for doc in initial_docs:
508
+ if 'article_number' in doc.metadata:
509
+ all_article_numbers.add(doc.metadata['article_number'])
510
+ # Parse cross_references (now stored as comma-separated string)
511
+ cross_refs_str = doc.metadata.get('cross_references', '')
512
+ if cross_refs_str:
513
+ cross_refs = [ref.strip() for ref in cross_refs_str.split(',')]
514
+ for ref in cross_refs:
515
+ if ref: # Skip empty strings
516
+ all_article_numbers.add(str(ref))
517
+
518
+ # Build enhanced document list
519
+ enhanced_docs = []
520
+ seen_numbers = set()
521
+
522
+ # Add initially retrieved documents
523
+ for doc in initial_docs:
524
+ enhanced_docs.append(doc)
525
+ seen_numbers.add(doc.metadata.get('article_number'))
526
+
527
+ # Add cross-referenced articles not yet retrieved
528
+ for article_num in all_article_numbers:
529
+ if article_num not in seen_numbers and article_num in self.article_map:
530
+ article_data = self.article_map[article_num]
531
+ cross_ref_text = ""
532
+ if article_data.get('cross_references'):
533
+ cross_ref_text = "\nالمواد ذات الصلة: " + ", ".join(
534
+ [f"المادة {ref}" for ref in article_data['cross_references']]
535
+ )
536
+
537
+ page_content = f"""
538
+ رقم المادة: {article_data['article_number']}
539
+ النص الأصلي: {article_data['original_text']}
540
+ الشرح المبسط: {article_data['simplified_summary']}{cross_ref_text}
541
+ """
542
+
543
+ enhanced_doc = Document(
544
+ page_content=page_content,
545
+ metadata={
546
+ "article_id": article_data['article_id'],
547
+ "article_number": str(article_data['article_number']),
548
+ "legal_nature": article_data['legal_nature'],
549
+ "keywords": ", ".join(article_data['keywords']),
550
+ "cross_references": ", ".join([str(ref) for ref in article_data.get('cross_references', [])])
551
+ }
552
+ )
553
+ enhanced_docs.append(enhanced_doc)
554
+ seen_numbers.add(article_num)
555
+
556
+ # Record expansion stats (OUTSIDE the loop, at the end)
557
+ expanded_articles = [doc.metadata.get('article_number') for doc in enhanced_docs if doc not in initial_docs]
558
+ xref_span.set_attr("cross_refs_added", len(expanded_articles))
559
+ xref_span.set_attr("final_doc_count", len(enhanced_docs))
560
+ if expanded_articles:
561
+ xref_span.set_attr("expanded_article_numbers", ", ".join(map(str, expanded_articles[:15])))
562
+
563
+ return enhanced_docs
564
+
565
+ async def _aget_relevant_documents(
566
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
567
+ ) -> List[Document]:
568
+ return self._get_relevant_documents(query, run_manager=run_manager)
569
+
570
+ cross_ref_retriever = CrossReferenceRetriever(
571
+ base_retriever=hybrid_retriever,
572
+ article_map=article_map
573
+ )
574
+ print("✅ Cross-reference retriever ready (using hybrid RRF base)")
575
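Chroma metadata fields must be scalar values, which is why `cross_references` is serialized to a comma-separated string at index time and parsed back during expansion, as above. The round-trip looks like this (example values are illustrative):

```python
# Serialize at index time, parse back at retrieval time.
refs_list = [5, 12, 99]
serialized = ", ".join(str(r) for r in refs_list)                 # stored in metadata
parsed = [r.strip() for r in serialized.split(',') if r.strip()]  # read back
# parsed == ['5', '12', '99']; an empty string parses to []
```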
+
576
+ # 9. Reranker
577
+ print("Loading reranker model...")
578
+ local_model_path = os.getenv("RERANKER_PATH", "reranker")  # RERANKER_PATH is an assumed env var; defaults to the local ./reranker folder instead of a machine-specific absolute path
579
+
580
+ if not os.path.exists(local_model_path):
581
+ raise FileNotFoundError(f"Reranker path not found: {local_model_path}")
582
+
583
+ model = HuggingFaceCrossEncoder(model_name=local_model_path)
584
+ compressor = CrossEncoderReranker(model=model, top_n=10)
585
+
586
+ # Wrap compression retriever to add Phoenix spans
587
+ class InstrumentedCompressionRetriever(BaseRetriever):
588
+ base_retriever: ContextualCompressionRetriever
589
+
590
+ class Config:
591
+ arbitrary_types_allowed = True
592
+
593
+ def _get_relevant_documents(
594
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
595
+ ) -> List[Document]:
596
+ with PhoenixSpan("reranker_compression", {
597
+ "query": query[:200],
598
+ "model": "HuggingFaceCrossEncoder",
599
+ "top_n": 10
600
+ }, kind="INTERNAL") as rerank_span:
601
+ # Apply reranking (this will call cross_ref_retriever internally)
602
+ reranked_docs = self.base_retriever.invoke(query)
603
+
604
+ rerank_span.set_attr("output_doc_count", len(reranked_docs))
605
+ if reranked_docs:
606
+ output_articles = [d.metadata.get('article_number', 'N/A') for d in reranked_docs]
607
+ rerank_span.set_attr("reranked_articles", ", ".join(map(str, output_articles)))
608
+ rerank_span.set_attr("top_doc_preview", reranked_docs[0].page_content[:400] if reranked_docs else "")
609
+
610
+ return reranked_docs
611
+
612
+ async def _aget_relevant_documents(
613
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
614
+ ) -> List[Document]:
615
+ return self._get_relevant_documents(query, run_manager=run_manager)
616
+
617
+ base_compression_retriever = ContextualCompressionRetriever(
618
+ base_compressor=compressor,
619
+ base_retriever=cross_ref_retriever
620
+ )
621
+ compression_retriever = InstrumentedCompressionRetriever(base_retriever=base_compression_retriever)
622
+ print("✅ Reranker model ready")
623
+
624
+ # 10. LLM - more deterministic for relevance
626
+ llm = ChatGroq(
627
+ groq_api_key=os.getenv("GROQ_API_KEY"),
628
+ model_name="llama-3.1-8b-instant",
629
+ temperature=0.1,
630
+ model_kwargs={"top_p": 0.9}
631
+ )
632
+
633
+ # ==================================================
634
+ # 🛠️ THE FIX: SEPARATE SYSTEM INSTRUCTIONS FROM USER INPUT
635
+ # ==================================================
636
+
637
+ # ==================================================
638
+ # 🧠 PROMPT ENGINEERING: DECISION TREE LOGIC
639
+ # ==================================================
640
+
641
+ system_instructions = """
642
+ <role>
643
+ أنت "المساعد القانوني الذكي"، خبير متخصص في الدستور المصري والقوانين الإجرائية.
644
+ مهمتك: تقديم إجابات دقيقة بناءً على "السياق التشريعي" المرفق أولاً، أو تقديم نصائح إجرائية عامة عند الضرورة.
645
+ </role>
646
+
647
+ <decision_logic>
648
+ عليك تحليل "سؤال المستخدم" و"السياق التشريعي" وتصنيف الحالة واختيار الرد المناسب بناءً على القواعد التالية بدقة:
649
+
650
+ 🔴 الحالة الأولى: (الإجابة موجودة في السياق التشريعي)
651
+ الشرط: إذا وجدت معلومات داخل "السياق التشريعي المتاح" تجيب على السؤال.
652
+ الفعل:
653
+ 1. استخرج الإجابة من السياق فقط.
654
+ 2. ابدأ الإجابة مباشرة دون مقدمات.
655
+ 3. يجب توثيق الإجابة برقم المادة (مثال: "نصت المادة (50) على...").
656
+ 4. توقف هنا. لا تضف أي معلومات خارجية.
657
+
658
+ 🟡 الحالة الثانية: (السياق فارغ/غير مفيد + السؤال إجرائي/عملي)
659
+ الشرط: إذا لم تجد الإجابة في السياق، وكان السؤال عن إجراءات عملية (مثل: حادث، سرقة، طلاق، تحرير محضر، تعامل مع الشرطة).
660
+ الفعل:
661
+ 1. تجاهل السياق الفارغ.
662
+ 2. استخدم معرفتك العامة بالقانون المصري.
663
+ 3. ابدأ وجوباً بعبارة: "بناءً على الإجراءات القانونية العامة في مصر (وليس نصاً دستورياً محدداً):"
664
+ 4. قدم الخطوات في نقاط مرقمة واضحة ومختصرة (1، 2، 3).
665
+ 5. تحذير: لا تذكر أرقام مواد قانونية (لا تخترع أرقام مواد).
666
+
667
+ 🔵 الحالة الثالثة: (السياق فارغ + السؤال عن نص دستوري محدد)
668
+ الشرط: إذا سأل عن (مجلس الشعب، الشورى، مادة محددة) ولم تجدها في السياق.
669
+ الفعل:
670
+ 1. قل بوضوح: "عذراً، لم يرد ذكر لهذا الموضوع في المواد الدستورية التي تم استرجاعها في السياق الحالي."
671
+ 2. لا تحاول الإجابة من ذاكرتك لكي لا تخطئ في النصوص الدستورية الحساسة.
672
+
673
+ 🟢 الحالة الرابعة: (محادثة ودية)
674
+ الشرط: تحية، شكر، أو "كيف حالك".
675
+ الفعل: رد بتحية مهذبة جداً ومقتضبة، ثم قل: "أنا جاهز للإجابة على استفساراتك القانونية."
676
+
677
+ ⚫ الحالة الخامسة: (خارج النطاق تماماً)
678
+ الشرط: طبخ، رياضة، برمجة، أو أي موضوع غير قانوني.
679
+ الفعل: اعتذر بلطف ووجه المستخدم للسؤال في القانون.
680
+ </decision_logic>
681
+
682
+ <formatting_rules>
683
+ - لا تكرر هذه التعليمات في ردك.
684
+ - استخدم فقرات قصيرة واترك سطراً فارغاً بينها.
685
+ - لا تستخدم عبارات مثل "بناء على السياق المرفق" في بداية الجملة، بل ادخل في صلب الموضوع فوراً.
686
+ - التزم باللغة العربية الفصحى المبسطة والرصينة.
687
+ </formatting_rules>
688
+ """
689
+
690
+ # We use .from_messages to strictly separate instructions from data
691
+ prompt = ChatPromptTemplate.from_messages([
692
+ ("system", system_instructions),
693
+ ("system", "السياق التشريعي المتاح (المصدر الأساسي):\n{context}"),
694
+ ("human", "سؤال المستفيد:\n{input}")
695
+ ])
696
+
697
+ # 9. Build Chain with RunnableParallel (returns both context and answer)
698
+ qa_chain = (
699
+ RunnableParallel({
700
+ "context": compression_retriever,
701
+ "input": RunnablePassthrough()
702
+ })
703
+ .assign(answer=(
704
+ prompt
705
+ | llm
706
+ | StrOutputParser()
707
+ ))
708
+ )
709
+
710
+ print("✅ System ready to use!")
711
+ return qa_chain
712
+
713
+ # ==========================================
714
+ # ⚡ MAIN EXECUTION
715
+ # ==========================================
716
+
717
+ try:
718
+ # Only need the chain now - it handles all retrieval internally
719
+ qa_chain = initialize_rag_pipeline()
720
+
721
+ except Exception as e:
722
+ st.error(f"Critical Error loading application: {e}")
723
+ st.stop()
724
+
725
+ # ==========================================
726
+ # 💬 CHAT LOOP
727
+ # ==========================================
728
+ if "messages" not in st.session_state:
729
+ st.session_state.messages = []
730
+
731
+ # Display Chat History (with Eastern Arabic numerals)
732
+ for message in st.session_state.messages:
733
+ with st.chat_message(message["role"]):
734
+ # Convert to Eastern Arabic when displaying from history
735
+ st.markdown(convert_to_eastern_arabic(message["content"]))
736
+
737
+ # Handle New User Input
738
+ if prompt_input := st.chat_input("اكتب سؤالك القانوني هنا..."):
739
+ # Show user message
740
+ st.session_state.messages.append({"role": "user", "content": prompt_input})
741
+ with st.chat_message("user"):
742
+ st.markdown(prompt_input)
743
+
744
+ # Generate Response
745
+ with st.chat_message("assistant"):
746
+ with st.spinner("جاري التحليل القانوني..."):
747
+ try:
748
+ # Invoke chain ONCE - returns Dict with 'context', 'input', and 'answer'
749
+ with PhoenixSpan("chat_request", {
750
+ "question": prompt_input,
751
+ "question_len": len(prompt_input or ""),
752
+ "timestamp": datetime.utcnow().isoformat(),
753
+ }, kind="SERVER") as span:
754
+ result = qa_chain.invoke(prompt_input)
755
+
756
+ # Extract answer and context from result
757
+ response_text = result["answer"]
758
+ source_docs = result["context"]
759
+
760
+ # Attach detailed context attributes
761
+ try:
762
+ ctx_list = result.get("context", []) or []
763
+ ctx_count = len(ctx_list)
764
+ span.set_attr("context_count", ctx_count)
765
+ if ctx_count:
766
+ # Record all article numbers
767
+ article_nums = [doc.metadata.get("article_number", "N/A") for doc in ctx_list]
768
+ span.set_attr("context_articles", ", ".join(map(str, article_nums)))
769
+ # Record legal natures
770
+ legal_natures = [doc.metadata.get("legal_nature", "N/A") for doc in ctx_list]
771
+ span.set_attr("legal_natures", ", ".join(legal_natures[:5]))
772
+ # Add context preview (first doc)
773
+ span.set_attr("context_preview", ctx_list[0].page_content[:500])
774
+ except Exception:
775
+ pass
776
+
777
+ # Log LLM generation as a nested span (properly nested under chat_request)
778
+ with PhoenixSpan("llm_generation", {
779
+ "model": "llama-3.1-8b-instant",
780
+ "temperature": 0.1,
781
+ "top_p": 0.9,
782
+ "prompt_preview": prompt_input[:300]
783
+ }, kind="CLIENT") as llm_span:
784
+ llm_span.set_attr("response", response_text)
785
+ llm_span.set_attr("response_len", len(response_text))
786
+ llm_span.set_attr("response_preview", response_text[:500])
787
+ llm_span.set_attr("context_docs_used", len(source_docs))
788
+
789
+ # Display Answer
790
+ response_text_arabic = convert_to_eastern_arabic(response_text)
791
+ st.markdown(response_text_arabic)
792
+
793
+ # Display Sources
794
+ if source_docs and len(source_docs) > 0:
795
+ print(f"✅ Found {len(source_docs)} documents")
796
+ # Deduplicate documents by article_number
797
+ seen_articles = set()
798
+ unique_docs = []
799
+
800
+ for doc in source_docs:
801
+ article_num = str(doc.metadata.get('article_number', '')).strip()
802
+ if article_num and article_num not in seen_articles:
803
+ seen_articles.add(article_num)
804
+ unique_docs.append(doc)
805
+
806
+ st.markdown("---") # Separator before sources
807
+
808
+ if unique_docs:
809
+ with st.expander(f"📚 المصادر المستخدمة ({len(unique_docs)} مادة)"):
810
+ st.markdown("### المواد الدستورية المستخدمة في التحليل:")
811
+ st.markdown("---")
812
+
813
+ for idx, doc in enumerate(unique_docs, 1):
814
+ article_num = str(doc.metadata.get('article_number', '')).strip()
815
+ legal_nature = doc.metadata.get('legal_nature', '')
816
+
817
+ if article_num:
818
+ st.markdown(f"**المادة رقم {convert_to_eastern_arabic(article_num)}**")
819
+ if legal_nature:
820
+ st.markdown(f"*الطبيعة القانونية: {legal_nature}*")
821
+
822
+ # Display article content
823
+ content_lines = doc.page_content.strip().split('\n')
824
+ for line in content_lines:
825
+ line = line.strip()
826
+ if line:
827
+ st.markdown(convert_to_eastern_arabic(line))
828
+
829
+ st.markdown("---")
830
+ else:
831
+ st.info("📌 لم يتم العثور على مصادر")
832
+ else:
833
+ st.info("📌 لم يتم العثور على مصادر")
834
+
835
+ # Persist the raw answer to avoid double conversion glitches on rerun
836
+ st.session_state.messages.append({"role": "assistant", "content": response_text})
837
+ except Exception as e:
838
+ st.error(f"حدث خطأ: {e}")
app_final_updated.py ADDED
@@ -0,0 +1,704 @@
+ # -*- coding: utf-8 -*-
+ import os
+ import sys
+ import json
+ from dotenv import load_dotenv
+ import logging
+ import warnings
+
+ # Suppress progress bars from transformers/tqdm
+ os.environ['TRANSFORMERS_NO_PROGRESS_BAR'] = '1'
+ warnings.filterwarnings('ignore')
+
+ # 1. Documents & Retriever Interfaces
+ from langchain_core.documents import Document
+ from langchain_core.retrievers import BaseRetriever
+ from langchain_core.callbacks import CallbackManagerForRetrieverRun
+ from typing import List
+ from rank_bm25 import BM25Okapi
+ import numpy as np
+
+ # 2. Vector Store & Embeddings
+ from langchain_chroma import Chroma
+ from langchain_huggingface import HuggingFaceEmbeddings
+
+ # 3. Reranker Imports
+ from langchain_classic.retrievers.document_compressors import CrossEncoderReranker
+ from langchain_classic.retrievers import ContextualCompressionRetriever
+ from langchain_community.cross_encoders import HuggingFaceCrossEncoder
+
+ # 4. LLM
+ from langchain_groq import ChatGroq
+ from langchain_core.prompts import ChatPromptTemplate
+ from langchain_core.output_parsers import StrOutputParser
+ from langchain_core.runnables import RunnablePassthrough, RunnableParallel
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ load_dotenv()
+
+ # ==========================================
+ # 🧭 RUNTIME MODE (UI vs CLI)
+ # ==========================================
+ RUN_MODE = os.getenv("RUN_MODE", "").strip().lower()
+ IS_CLI = RUN_MODE in {"cli", "terminal", "eval", "evaluation"}
+
+ if IS_CLI:
+ class _DummyStreamlit:
+ @staticmethod
+ def cache_resource(func=None, **_kwargs):
+ if func is None:
+ def decorator(f):
+ return f
+ return decorator
+ return func
+
+ st = _DummyStreamlit()
+ else:
+ import streamlit as st
+
+ # ==========================================
+ # 📁 PATHS (use project-relative folders)
+ # ==========================================
+ BASE_DIR = os.path.dirname(os.path.abspath(__file__))
+ DATA_DIR = os.path.join(BASE_DIR, "data")
+ CHROMA_DIR = os.path.join(BASE_DIR, "chroma_db")
+
+ if not IS_CLI:
+ # ==========================================
+ # 🎨 UI SETUP (CSS FOR ARABIC & RTL)
+ # ==========================================
+ st.set_page_config(page_title="المساعد القانوني", page_icon="⚖️")
+
+ # This CSS block fixes the "001" number issue and right alignment
+ st.markdown("""
+ <style>
+ /* Force the main app container to be Right-to-Left */
+ .stApp {
+ direction: rtl;
+ text-align: right;
+ }
+
+ /* Fix input fields to type from right */
+ .stTextInput input {
+ direction: rtl;
+ text-align: right;
+ }
+
+ /* Fix chat messages alignment */
+ .stChatMessage {
+ direction: rtl;
+ text-align: right;
+ }
+
+ /* Ensure proper paragraph spacing */
+ .stMarkdown p {
+ margin: 0.5em 0 !important;
+ line-height: 1.6;
+ word-spacing: 0.1em;
+ }
+
+ /* Ensure numbers display correctly in RTL */
+ p, div, span, label {
+ unicode-bidi: embed;
+ direction: inherit;
+ white-space: normal;
+ word-wrap: break-word;
+ }
+
+ /* Force all content to respect RTL */
+ * {
+ direction: rtl !important;
+ }
+
+ /* Preserve line breaks and spacing */
+ .stMarkdown pre {
+ direction: rtl;
+ text-align: right;
+ white-space: pre-wrap;
+ word-wrap: break-word;
+ }
+
+ /* Hide the "Deploy" button and standard menu for cleaner look */
+ #MainMenu {visibility: hidden;}
+ footer {visibility: hidden;}
+
+ </style>
+ """, unsafe_allow_html=True)
+
+ # Numeral conversion helper (shared by the UI rendering below)
+ def convert_to_eastern_arabic(text):
+ """Converts 0123456789 to ٠١٢٣٤٥٦٧٨٩"""
+ if not isinstance(text, str):
+ return text
+ western_numerals = '0123456789'
+ eastern_numerals = '٠١٢٣٤٥٦٧٨٩'
+ translation_table = str.maketrans(western_numerals, eastern_numerals)
+ return text.translate(translation_table)
+
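The numeral helper above is a straight character-for-character mapping via `str.maketrans`, so it is safe to call on any string (non-strings pass through unchanged). A self-contained usage sketch:

```python
# Standalone copy of the app's numeral converter, for illustration
western_numerals = '0123456789'
eastern_numerals = '٠١٢٣٤٥٦٧٨٩'
translation_table = str.maketrans(western_numerals, eastern_numerals)

def convert_to_eastern_arabic(text):
    """Map Western digits to Eastern Arabic digits; pass non-strings through."""
    if not isinstance(text, str):
        return text
    return text.translate(translation_table)

print(convert_to_eastern_arabic("المادة 123"))  # المادة ١٢٣
print(convert_to_eastern_arabic(42))            # 42 (unchanged, not a str)
```

Because `str.translate` works per code point, mixed Arabic/Latin text and punctuation are left untouched.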
+ if not IS_CLI:
+ st.title("⚖️ المساعد القانوني الذكي (دستور مصر)")
+
+ # ==========================================
+ # 🚀 CACHED RESOURCE LOADING (THE FIX)
+ # ==========================================
+ # This decorator tells Streamlit: "Run this ONCE and save the result."
+ @st.cache_resource
+ def initialize_rag_pipeline():
+ print("🔄 Initializing system...")
+ print("📥 Loading data...")
+ # 1. Load JSONs from ./data (supports multiple files)
+ def load_json_folder(folder_path: str):
+ all_items = []
+ for filename in os.listdir(folder_path):
+ if not filename.lower().endswith(".json"):
+ continue
+ file_path = os.path.join(folder_path, filename)
+ with open(file_path, "r", encoding="utf-8") as f:
+ obj = json.load(f)
+
+ # Support: list of articles, or dict with 'data'/'articles', or single dict article
+ if isinstance(obj, list):
+ all_items.extend(obj)
+ elif isinstance(obj, dict):
+ if "data" in obj and isinstance(obj["data"], list):
+ all_items.extend(obj["data"])
+ elif "articles" in obj and isinstance(obj["articles"], list):
+ all_items.extend(obj["articles"])
+ else:
+ all_items.append(obj)
+ else:
+ logger.warning(f"Unsupported JSON format in: {file_path}")
+ return all_items
+
+ if not os.path.exists(DATA_DIR):
+ raise FileNotFoundError(f"Data folder not found: {DATA_DIR}")
+
+ data = load_json_folder(DATA_DIR)
+
+ # Optional: de-duplicate (article_id preferred, fallback to article_number)
+ unique = {}
+ for item in data:
+ key = str(item.get("article_id") or item.get("article_number") or hash(json.dumps(item, ensure_ascii=False)))
+ unique[key] = item
+ data = list(unique.values())
+
+ # Create a mapping of article numbers for cross-reference lookup
+ article_map = {str(item['article_number']): item for item in data if 'article_number' in item}
+
+ docs = []
+ for item in data:
+ article_number = item.get("article_number")
+ original_text = item.get("original_text")
+ simplified_summary = item.get("simplified_summary")
+
+ if not article_number or not original_text or not simplified_summary:
+ logger.warning("Skipping item with missing fields (article_number/original_text/simplified_summary)")
+ continue
+
+ cross_refs = item.get("cross_references")
+ if not isinstance(cross_refs, list):
+ cross_refs = []
+
+ # Build cross-reference section
+ cross_ref_text = ""
+ if cross_refs:
+ cross_ref_text = "\nالمواد ذات الصلة (المراجع المتقاطعة): " + ", ".join(
+ [f"المادة {ref}" for ref in cross_refs]
+ )
+
+ # Construct content
+ page_content = f"""
+ رقم المادة: {article_number}
+ النص الأصلي: {original_text}
+ الشرح المبسط: {simplified_summary}{cross_ref_text}
+ """
+
+ metadata = {
+ "article_id": item.get("article_id") or str(article_number),
+ "article_number": str(article_number),
+ "legal_nature": item.get("legal_nature", ""),
+ "keywords": ", ".join(item.get("keywords", []) or []),
+ "part": item.get("part (Bab)", ""),
+ "chapter": item.get("chapter (Fasl)", ""),
+ "cross_references": ", ".join([str(ref) for ref in cross_refs])
+ }
+ docs.append(Document(page_content=page_content, metadata=metadata))
+
+ print(f"✅ Loaded {len(docs)} constitutional articles")
+
+ # 2. Embeddings
+ print("Loading embeddings model...")
+ embeddings = HuggingFaceEmbeddings(
+ model_name="Omartificial-Intelligence-Space/GATE-AraBert-v1"
+ )
+ print("✅ Embeddings model ready")
+
+ # 3. No splitting - keep articles as complete units
+ chunks = docs
+
+ # 4. Vector Store (persist once, load on next runs)
+ if os.path.exists(CHROMA_DIR) and os.listdir(CHROMA_DIR):
+ print("📦 Loading existing vector database...")
+ vectorstore = Chroma(
+ persist_directory=CHROMA_DIR,
+ embedding_function=embeddings
+ )
+ print("✅ Loaded existing Chroma DB (no re-embedding)")
+ else:
+ print("🧱 Building vector database for the first time (this will create embeddings)...")
+ vectorstore = Chroma.from_documents(
+ chunks,
+ embeddings,
+ persist_directory=CHROMA_DIR
+ )
+ print("✅ Built Chroma DB and persisted to disk")
+
+ base_retriever = vectorstore.as_retriever(search_kwargs={"k": 15})
+
+ # 5. Create BM25 Keyword Retriever
+ class BM25Retriever(BaseRetriever):
+ """BM25-based keyword retriever for constitutional articles"""
+ corpus_docs: List[Document]
+ bm25: BM25Okapi = None
+ k: int = 15
+
+ class Config:
+ arbitrary_types_allowed = True
+
+ def __init__(self, **data):
+ super().__init__(**data)
+ # Tokenize corpus for BM25
+ tokenized_corpus = [doc.page_content.split() for doc in self.corpus_docs]
+ self.bm25 = BM25Okapi(tokenized_corpus)
+
+ def _get_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ # Tokenize query
+ tokenized_query = query.split()
+ # Get BM25 scores
+ scores = self.bm25.get_scores(tokenized_query)
+ # Get top k indices
+ top_indices = np.argsort(scores)[::-1][:self.k]
+ # Return documents
+ return [self.corpus_docs[i] for i in top_indices if scores[i] > 0]
+
+ async def _aget_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ return self._get_relevant_documents(query, run_manager=run_manager)
+
+ bm25_retriever = BM25Retriever(corpus_docs=docs, k=15)
+ print("✅ BM25 keyword retriever ready")
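The top-k selection in `_get_relevant_documents` sorts BM25 scores descending with `np.argsort(...)[::-1]` and then drops zero-score documents. A tiny standalone illustration of that pattern (toy scores, no `rank_bm25` dependency):

```python
import numpy as np

# Toy relevance scores for five documents (as BM25 might return them)
scores = np.array([0.0, 2.5, 0.7, 3.1, 0.0])
docs = ["d0", "d1", "d2", "d3", "d4"]
k = 3

# Indices of the k highest scores, best first
top_indices = np.argsort(scores)[::-1][:k]

# Keep only documents that actually matched (score > 0)
result = [docs[i] for i in top_indices if scores[i] > 0]
print(result)  # ['d3', 'd1', 'd2']
```

The `scores[i] > 0` filter matters: with fewer than `k` matching documents, BM25 would otherwise pad the result with irrelevant zero-score entries.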
+
+ # 6. Create Metadata Filter Retriever
+ class MetadataFilterRetriever(BaseRetriever):
+ """Metadata-based filtering retriever"""
+ corpus_docs: List[Document]
+ k: int = 15
+
+ class Config:
+ arbitrary_types_allowed = True
+
+ def _get_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ query_lower = query.lower()
+ scored_docs = []
+
+ for doc in self.corpus_docs:
+ score = 0
+ # Match keywords
+ keywords = doc.metadata.get('keywords', '').lower()
+ if any(word in keywords for word in query_lower.split()):
+ score += 3
+
+ # Match legal nature
+ legal_nature = doc.metadata.get('legal_nature', '').lower()
+ if any(word in legal_nature for word in query_lower.split()):
+ score += 2
+
+ # Match part/chapter
+ part = doc.metadata.get('part', '').lower()
+ chapter = doc.metadata.get('chapter', '').lower()
+ if any(word in part or word in chapter for word in query_lower.split()):
+ score += 1
+
+ # Match in content
+ if any(word in doc.page_content.lower() for word in query_lower.split()):
+ score += 1
+
+ if score > 0:
+ scored_docs.append((doc, score))
+
+ # Sort by score and return top k
+ scored_docs.sort(key=lambda x: x[1], reverse=True)
+ return [doc for doc, _ in scored_docs[:self.k]]
+
+ async def _aget_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ return self._get_relevant_documents(query, run_manager=run_manager)
+
+ metadata_retriever = MetadataFilterRetriever(corpus_docs=docs, k=15)
+ print("✅ Metadata filter retriever ready")
+
+ # 7. Create Hybrid RRF Retriever
+ class HybridRRFRetriever(BaseRetriever):
+ """Combines semantic, BM25, and metadata retrievers using Reciprocal Rank Fusion"""
+ semantic_retriever: BaseRetriever
+ bm25_retriever: BM25Retriever
+ metadata_retriever: MetadataFilterRetriever
+ beta_semantic: float = 0.6 # Weight for semantic search
+ beta_keyword: float = 0.2 # Weight for BM25 keyword search
+ beta_metadata: float = 0.2 # Weight for metadata filtering
+ k: int = 60 # RRF constant (typically 60)
+ top_k: int = 15
+
+ class Config:
+ arbitrary_types_allowed = True
+
+ def _get_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ # Get results from all three retrievers
+ semantic_docs = self.semantic_retriever.invoke(query)
+ bm25_docs = self.bm25_retriever.invoke(query)
+ metadata_docs = self.metadata_retriever.invoke(query)
+
+ # Apply Reciprocal Rank Fusion
+ rrf_scores = {}
+
+ # Process semantic results
+ for rank, doc in enumerate(semantic_docs, start=1):
+ doc_id = (doc.metadata.get('article_id') or doc.metadata.get('article_number') or str(hash(doc.page_content)))
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_semantic / (self.k + rank)
+
+ # Process BM25 results
+ for rank, doc in enumerate(bm25_docs, start=1):
+ doc_id = (doc.metadata.get('article_id') or doc.metadata.get('article_number') or str(hash(doc.page_content)))
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_keyword / (self.k + rank)
+
+ # Process metadata results
+ for rank, doc in enumerate(metadata_docs, start=1):
+ doc_id = (doc.metadata.get('article_id') or doc.metadata.get('article_number') or str(hash(doc.page_content)))
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_metadata / (self.k + rank)
+
+ # Create document lookup
+ all_docs = {}
+ for doc in semantic_docs + bm25_docs + metadata_docs:
+ doc_id = (doc.metadata.get('article_id') or doc.metadata.get('article_number') or str(hash(doc.page_content)))
+ if doc_id not in all_docs:
+ all_docs[doc_id] = doc
+
+ # Sort by RRF score
+ sorted_doc_ids = sorted(rrf_scores.items(), key=lambda x: x[1], reverse=True)
+
+ # Return top k documents
+ result_docs = []
+ for doc_id, score in sorted_doc_ids[:self.top_k]:
+ if doc_id in all_docs:
+ result_docs.append(all_docs[doc_id])
+
+ return result_docs
+
+ async def _aget_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ return self._get_relevant_documents(query, run_manager=run_manager)
+
+ # Create hybrid retriever with tuned beta weights
+ hybrid_retriever = HybridRRFRetriever(
+ semantic_retriever=base_retriever,
+ bm25_retriever=bm25_retriever,
+ metadata_retriever=metadata_retriever,
+ beta_semantic=0.5, # Semantic search gets highest weight (most reliable)
+ beta_keyword=0.3, # BM25 keyword search (good for exact term matches)
+ beta_metadata=0.2, # Metadata filtering (supporting role)
+ k=60,
+ top_k=20
+ )
+ print("✅ Hybrid RRF retriever ready with β weights: semantic=0.5, keyword=0.3, metadata=0.2")
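The fusion above computes, for each document, score(d) = Σ over retrievers of β_r / (k + rank_r(d)) with k = 60, so a document ranked well by several retrievers beats one ranked first by a single retriever. A minimal sketch of that arithmetic with toy document IDs:

```python
def rrf(rankings, betas, k=60):
    """Weighted Reciprocal Rank Fusion over several ranked lists of doc IDs."""
    scores = {}
    for ranked, beta in zip(rankings, betas):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + beta / (k + rank)
    # Doc IDs sorted by fused score, best first
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["A", "B", "C"]   # semantic retriever's ranking
keyword  = ["B", "D"]        # BM25 ranking
metadata = ["A", "D"]        # metadata-filter ranking

fused = rrf([semantic, keyword, metadata], betas=[0.5, 0.3, 0.2])
print(fused)  # ['B', 'A', 'D', 'C']
```

Here "B" wins despite never being ranked first by the dominant retriever, because it scores in both the semantic and keyword lists; this cross-retriever agreement is exactly what RRF rewards.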
+
+ # 8. Create Cross-Reference Enhanced Retriever
+ class CrossReferenceRetriever(BaseRetriever):
+ """Enhances retrieval by automatically fetching cross-referenced articles"""
+ base_retriever: BaseRetriever
+ article_map: dict
+
+ class Config:
+ arbitrary_types_allowed = True
+
+ def _get_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ # Get initial results
+ initial_docs = self.base_retriever.invoke(query)
+
+ # Collect all related article numbers
+ all_article_numbers = set()
+ for doc in initial_docs:
+ if 'article_number' in doc.metadata:
+ all_article_numbers.add(doc.metadata['article_number'])
+ # Parse cross_references (now stored as comma-separated string)
+ cross_refs_str = doc.metadata.get('cross_references', '')
+ if cross_refs_str:
+ cross_refs = [ref.strip() for ref in cross_refs_str.split(',')]
+ for ref in cross_refs:
+ if ref: # Skip empty strings
+ all_article_numbers.add(str(ref))
+
+ # Build enhanced document list
+ enhanced_docs = []
+ seen_numbers = set()
+
+ # Add initially retrieved documents
+ for doc in initial_docs:
+ enhanced_docs.append(doc)
+ seen_numbers.add(doc.metadata.get('article_number'))
+
+ # Add cross-referenced articles not yet retrieved
+ for article_num in all_article_numbers:
+ if article_num not in seen_numbers and article_num in self.article_map:
+ article_data = self.article_map[article_num]
+ cross_ref_text = ""
+ cross_refs = article_data.get("cross_references")
+ if not isinstance(cross_refs, list):
+ cross_refs = []
+ if cross_refs:
+ cross_ref_text = "\nالمواد ذات الصلة: " + ", ".join(
+ [f"المادة {ref}" for ref in cross_refs]
+ )
+
+ page_content = f"""
+ رقم المادة: {article_data.get('article_number', '')}
+ النص الأصلي: {article_data.get('original_text', '')}
+ الشرح المبسط: {article_data.get('simplified_summary', '')}{cross_ref_text}
+ """
+
+ enhanced_doc = Document(
+ page_content=page_content,
+ metadata={
+ "article_id": article_data.get("article_id") or str(article_data.get("article_number", "")),
+ "article_number": str(article_data.get("article_number", "")),
+ "legal_nature": article_data.get("legal_nature", ""),
+ "keywords": ", ".join(article_data.get("keywords", []) or []),
+ "cross_references": ", ".join([str(ref) for ref in cross_refs])
+ }
+ )
+ enhanced_docs.append(enhanced_doc)
+ seen_numbers.add(article_num)
+
+ return enhanced_docs
+
+ async def _aget_relevant_documents(
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
+ ) -> List[Document]:
+ return self._get_relevant_documents(query, run_manager=run_manager)
+
+ cross_ref_retriever = CrossReferenceRetriever(
+ base_retriever=hybrid_retriever,
+ article_map=article_map
+ )
+ print("✅ Cross-reference retriever ready (using hybrid RRF base)")
+
+ # 9. Reranker
+ print("Loading reranker model...")
+ local_model_path = r"D:\FOE\Senior 2\Graduation Project\Chatbot_me\reranker"
+
+ if not os.path.exists(local_model_path):
+ raise FileNotFoundError(f"Reranker path not found: {local_model_path}")
+
+ model = HuggingFaceCrossEncoder(model_name=local_model_path)
+ compressor = CrossEncoderReranker(model=model, top_n=5)
+
+ compression_retriever = ContextualCompressionRetriever(
+ base_compressor=compressor,
+ base_retriever=cross_ref_retriever
+ )
+ print("✅ Reranker model ready")
+
+ # 10. LLM Configuration (balanced for consistency with slight creativity)
+ llm = ChatGroq(
+ groq_api_key=os.getenv("GROQ_API_KEY"),
+ model_name="llama-3.1-8b-instant",
+ temperature=0.3, # Slightly increased to allow helpful general advice
+ model_kwargs={"top_p": 0.9}
+ )
+
+ # ==================================================
+ # 🧠 PROMPT ENGINEERING: DECISION TREE LOGIC
+ # (system instructions kept strictly separate from user input)
+ # ==================================================
+
+ system_instructions = """
+ <role>
+ أنت "المساعد القانوني الذكي"، خبير متخصص في الدستور المصري والقوانين الإجرائية.
+ مهمتك: تقديم إجابات دقيقة بناءً على "السياق التشريعي" المرفق أولاً، أو تقديم نصائح إجرائية عامة عند الضرورة.
+ </role>
+
+ <decision_logic>
+ عليك تحليل "سؤال المستخدم" و"السياق التشريعي" وتصنيف الحالة واختيار الرد المناسب بناءً على القواعد التالية بدقة:
+
+ 🔴 الحالة الأولى: (الإجابة موجودة في السياق التشريعي)
+ الشرط: إذا وجدت معلومات داخل "السياق التشريعي المتاح" تجيب على السؤال.
+ الفعل:
+ 1. استخرج الإجابة من السياق فقط.
+ 2. ابدأ الإجابة مباشرة دون مقدمات.
+ 3. يجب توثيق الإجابة برقم المادة (مثال: "نصت المادة (50) على...").
+ 4. توقف هنا. لا تضف أي معلومات خارجية.
+
+ 🟡 الحالة الثانية: (السياق فارغ/غير مفيد + السؤال إجرائي/عملي)
+ الشرط: إذا لم تجد الإجابة في السياق، وكان السؤال عن إجراءات عملية (مثل: حادث، سرقة، طلاق، تحرير محضر، تعامل مع الشرطة).
+ الفعل:
+ 1. تجاهل السياق الفارغ.
+ 2. استخدم معرفتك العامة بالقانون المصري.
+ 3. ابدأ وجوباً بعبارة: "بناءً على الإجراءات القانونية العامة في مصر (وليس نصاً دستورياً محدداً):"
+ 4. قدم الخطوات في نقاط مرقمة واضحة ومختصرة (1، 2، 3).
+ 5. تحذير: لا تذكر أرقام مواد قانونية (لا تخترع أرقام مواد).
+
+ 🔵 الحالة الثالثة: (السياق فارغ + السؤال عن نص دستوري محدد)
+ الشرط: إذا سأل عن (مجلس الشعب، الشورى، مادة محددة) ولم تجدها في السياق.
+ الفعل:
+ 1. قل بوضوح: "عذراً، لم يرد ذكر لهذا الموضوع في المواد الدستورية التي تم استرجاعها في السياق الحالي."
+ 2. لا تحاول الإجابة من ذاكرتك لكي لا تخطئ في النصوص الدستورية الحساسة.
+
+ 🟢 الحالة الرابعة: (محادثة ودية)
+ الشرط: تحية، شكر، أو "كيف حالك".
+ الفعل: رد بتحية مهذبة جداً ومقتضبة، ثم قل: "أنا جاهز للإجابة على استفساراتك القانونية."
+
+ ⚫ الحالة الخامسة: (خارج النطاق تماماً)
+ الشرط: طبخ، رياضة، برمجة، أو أي موضوع غير قانوني.
+ الفعل: اعتذر بلطف ووجه المستخدم للسؤال في القانون.
+ </decision_logic>
+
+ <formatting_rules>
+ - لا تكرر هذه التعليمات في ردك.
+ - استخدم فقرات قصيرة واترك سطراً فارغاً بينها.
+ - لا تستخدم عبارات مثل "بناء على السياق المرفق" في بداية الجملة، بل ادخل في صلب الموضوع فوراً.
+ - التزم باللغة العربية الفصحى المبسطة والرصينة.
+ </formatting_rules>
+ """
+
+ # We use .from_messages to strictly separate instructions from data
+ prompt = ChatPromptTemplate.from_messages([
+ ("system", system_instructions),
+ ("system", "السياق التشريعي المتاح (المصدر الأساسي):\n{context}"),
+ ("human", "سؤال المستفيد:\n{input}")
+ ])
+
+ # 11. Build Chain with RunnableParallel (returns both context and answer)
+ qa_chain = (
+ RunnableParallel({
+ "context": compression_retriever,
+ "input": RunnablePassthrough()
+ })
+ .assign(answer=(
+ prompt
+ | llm
+ | StrOutputParser()
+ ))
+ )
+
+ print("✅ System ready to use!")
+ return qa_chain
+
+ if not IS_CLI:
+ # ==========================================
+ # ⚡ MAIN EXECUTION
+ # ==========================================
+
+ try:
+ # Only need the chain now - it handles all retrieval internally
+ qa_chain = initialize_rag_pipeline()
+
+ except Exception as e:
+ st.error(f"Critical Error loading application: {e}")
+ st.stop()
+
+ # ==========================================
+ # 💬 CHAT LOOP
+ # ==========================================
+ if "messages" not in st.session_state:
+ st.session_state.messages = []
+
+ # Display Chat History (with Eastern Arabic numerals)
+ for message in st.session_state.messages:
+ with st.chat_message(message["role"]):
+ # Convert to Eastern Arabic when displaying from history
+ st.markdown(convert_to_eastern_arabic(message["content"]))
+
+ # Handle New User Input
+ if prompt_input := st.chat_input("اكتب سؤالك القانوني هنا..."):
+ # Show user message
+ st.session_state.messages.append({"role": "user", "content": prompt_input})
+ with st.chat_message("user"):
+ st.markdown(prompt_input)
+
+ # Generate Response
+ with st.chat_message("assistant"):
+ with st.spinner("جاري التحليل القانوني..."):
+ try:
+ # Invoke chain ONCE - returns Dict with 'context', 'input', and 'answer'
+ result = qa_chain.invoke(prompt_input)
+
+ # Extract answer and context from result
+ response_text = result["answer"]
+ source_docs = result["context"] # Context is already in the result!
+
+ # Display Answer
+ response_text_arabic = convert_to_eastern_arabic(response_text)
+ st.markdown(response_text_arabic)
+
+ # Display Sources
+ if source_docs and len(source_docs) > 0:
661
+ print(f"✅ Found {len(source_docs)} documents")
662
+ # Deduplicate documents by article_number
663
+ seen_articles = set()
664
+ unique_docs = []
665
+
666
+ for doc in source_docs:
667
+ article_num = str(doc.metadata.get('article_number', '')).strip()
668
+ if article_num and article_num not in seen_articles:
669
+ seen_articles.add(article_num)
670
+ unique_docs.append(doc)
671
+
672
+ st.markdown("---") # Separator before sources
673
+
674
+ if unique_docs:
675
+ with st.expander(f"📚 المصادر المستخدمة ({len(unique_docs)} مادة)"):
676
+ st.markdown("### المواد الدستورية المستخدمة في التحليل:")
677
+ st.markdown("---")
678
+
679
+ for idx, doc in enumerate(unique_docs, 1):
680
+ article_num = str(doc.metadata.get('article_number', '')).strip()
681
+ legal_nature = doc.metadata.get('legal_nature', '')
682
+
683
+ if article_num:
684
+ st.markdown(f"**المادة رقم {convert_to_eastern_arabic(article_num)}**")
685
+ if legal_nature:
686
+ st.markdown(f"*الطبيعة القانونية: {legal_nature}*")
687
+
688
+ # Display article content
689
+ content_lines = doc.page_content.strip().split('\n')
690
+ for line in content_lines:
691
+ line = line.strip()
692
+ if line:
693
+ st.markdown(convert_to_eastern_arabic(line))
694
+
695
+ st.markdown("---")
696
+ else:
697
+ st.info("📌 لم يتم العثور على مصادر")
698
+ else:
699
+ st.info("📌 لم يتم العثور على مصادر")
700
+
701
+ # Persist the raw answer to avoid double conversion glitches on rerun
702
+ st.session_state.messages.append({"role": "assistant", "content": response_text})
703
+ except Exception as e:
704
+ st.error(f"حدث خطأ: {e}")
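The chat loop above relies on `convert_to_eastern_arabic`, which is defined earlier in the app (outside this hunk). A minimal sketch of such a helper, assuming it only maps Western digits to Eastern Arabic digits via `str.translate`:

```python
# Hypothetical sketch of convert_to_eastern_arabic; the real helper is
# defined earlier in the app and may handle additional cases.
EASTERN_DIGITS = str.maketrans("0123456789", "٠١٢٣٤٥٦٧٨٩")

def convert_to_eastern_arabic(text: str) -> str:
    """Replace Western digits (0-9) with Eastern Arabic digits (٠-٩)."""
    return text.translate(EASTERN_DIGITS)
```

Because the translation table only touches digit code points, Arabic text and punctuation pass through unchanged.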
evaluate.py ADDED
@@ -0,0 +1,620 @@
1
+ # -*- coding: utf-8 -*-
2
+ """
3
+ RAGAS Evaluation Script for Constitutional Legal Assistant
4
+ Evaluates: faithfulness, answer_relevancy, context_precision, context_recall
5
+ """
6
+
7
+ import os
8
+ import json
9
+ from dotenv import load_dotenv
10
+ import logging
11
+ import warnings
12
+
13
+ # Suppress progress bars
14
+ os.environ['TRANSFORMERS_NO_PROGRESS_BAR'] = '1'
15
+ warnings.filterwarnings('ignore')
16
+
17
+ # Core imports
18
+ from langchain_core.documents import Document
19
+ from langchain_core.retrievers import BaseRetriever
20
+ from langchain_core.callbacks import CallbackManagerForRetrieverRun
21
+ from typing import List
22
+ from rank_bm25 import BM25Okapi
23
+ import numpy as np
24
+
25
+ # Vector Store & Embeddings
26
+ from langchain_chroma import Chroma
27
+ from langchain_huggingface import HuggingFaceEmbeddings
28
+
29
+ # Reranker
30
+ from langchain_classic.retrievers.document_compressors import CrossEncoderReranker
31
+ from langchain_classic.retrievers import ContextualCompressionRetriever
32
+ from langchain_community.cross_encoders import HuggingFaceCrossEncoder
33
+
34
+ # LLM
35
+ from langchain_groq import ChatGroq
36
+ from langchain_core.prompts import ChatPromptTemplate
37
+ from langchain_core.output_parsers import StrOutputParser
38
+ from langchain_core.runnables import RunnablePassthrough, RunnableParallel
39
+
40
+ # Evaluation
41
+ from datasets import Dataset
42
+ from ragas import evaluate
43
+ from ragas.metrics import (
44
+ faithfulness,
45
+ answer_relevancy,
46
+ context_precision,
47
+ context_recall,
48
+ )
49
+ from ragas.llms import LangchainLLMWrapper
50
+ from ragas.embeddings import LangchainEmbeddingsWrapper
51
+
52
+ # Configure logging
53
+ logging.basicConfig(level=logging.INFO)
54
+ logger = logging.getLogger(__name__)
55
+
56
+ load_dotenv()
57
+
58
+ # ==========================================
59
+ # 🚀 RAG PIPELINE INITIALIZATION
60
+ # ==========================================
61
+
62
+ def initialize_rag_pipeline():
63
+ """Initialize the RAG pipeline for constitutional legal questions"""
64
+ print("🔄 Initializing RAG pipeline...")
65
+ print("📥 Loading data...")
66
+
67
+ # 1. Load JSON
68
+ json_path = "Egyptian_Constitution_legalnature_only.json"
69
+ if not os.path.exists(json_path):
70
+ raise FileNotFoundError(f"File not found: {json_path}")
71
+
72
+ with open(json_path, "r", encoding="utf-8") as f:
73
+ data = json.load(f)
74
+
75
+ # Create article mapping for cross-references
76
+ article_map = {str(item['article_number']): item for item in data}
77
+
78
+ docs = []
79
+ for item in data:
80
+ # Build cross-reference section
81
+ cross_ref_text = ""
82
+ if item.get('cross_references') and len(item['cross_references']) > 0:
83
+ cross_ref_text = "\nالمواد ذات الصلة (المراجع المتقاطعة): " + ", ".join(
84
+ [f"المادة {ref}" for ref in item['cross_references']]
85
+ )
86
+
87
+ # Construct document content
88
+ page_content = f"""
89
+ رقم المادة: {item['article_number']}
90
+ النص الأصلي: {item['original_text']}
91
+ الشرح المبسط: {item['simplified_summary']}{cross_ref_text}
92
+ """
93
+
94
+ metadata = {
95
+ "article_id": item['article_id'],
96
+ "article_number": str(item['article_number']),
97
+ "legal_nature": item['legal_nature'],
98
+ "keywords": ", ".join(item['keywords']),
99
+ "part": item.get('part (Bab)', ''),
100
+ "chapter": item.get('chapter (Fasl)', ''),
101
+ "cross_references": ", ".join([str(ref) for ref in item.get('cross_references', [])])
102
+ }
103
+ docs.append(Document(page_content=page_content, metadata=metadata))
104
+
105
+ print(f"✅ Loaded {len(docs)} constitutional articles")
106
+
107
+ # 2. Embeddings
108
+ print("Loading embeddings model...")
109
+ embeddings = HuggingFaceEmbeddings(
110
+ model_name="Omartificial-Intelligence-Space/GATE-AraBert-v1"
111
+ )
112
+ print("✅ Embeddings ready")
113
+
114
+ # 3. Vector Store
115
+ print("Building vector database...")
116
+ vectorstore = Chroma.from_documents(
117
+ docs,
118
+ embeddings,
119
+ persist_directory="chroma_db"
120
+ )
121
+ base_retriever = vectorstore.as_retriever(search_kwargs={"k": 15})
122
+ print("✅ Vector database ready")
123
+
124
+ # 4. BM25 Keyword Retriever
125
+ class BM25Retriever(BaseRetriever):
126
+ """BM25-based keyword retriever"""
127
+ corpus_docs: List[Document]
128
+ bm25: BM25Okapi = None
129
+ k: int = 15
130
+
131
+ class Config:
132
+ arbitrary_types_allowed = True
133
+
134
+ def __init__(self, **data):
135
+ super().__init__(**data)
136
+ tokenized_corpus = [doc.page_content.split() for doc in self.corpus_docs]
137
+ self.bm25 = BM25Okapi(tokenized_corpus)
138
+
139
+ def _get_relevant_documents(
140
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
141
+ ) -> List[Document]:
142
+ tokenized_query = query.split()
143
+ scores = self.bm25.get_scores(tokenized_query)
144
+ top_indices = np.argsort(scores)[::-1][:self.k]
145
+ return [self.corpus_docs[i] for i in top_indices if scores[i] > 0]
146
+
147
+ async def _aget_relevant_documents(self, query: str, **kwargs) -> List[Document]:
148
+ return self._get_relevant_documents(query)
149
+
150
+ bm25_retriever = BM25Retriever(corpus_docs=docs, k=15)
151
+ print("✅ BM25 retriever ready")
152
+
153
+ # 5. Metadata Filter Retriever
154
+ class MetadataFilterRetriever(BaseRetriever):
155
+ """Metadata-based filtering retriever"""
156
+ corpus_docs: List[Document]
157
+ k: int = 15
158
+
159
+ class Config:
160
+ arbitrary_types_allowed = True
161
+
162
+ def _get_relevant_documents(
163
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
164
+ ) -> List[Document]:
165
+ query_lower = query.lower()
166
+ scored_docs = []
167
+
168
+ for doc in self.corpus_docs:
169
+ score = 0
170
+ keywords = doc.metadata.get('keywords', '').lower()
171
+ if any(word in keywords for word in query_lower.split()):
172
+ score += 3
173
+
174
+ legal_nature = doc.metadata.get('legal_nature', '').lower()
175
+ if any(word in legal_nature for word in query_lower.split()):
176
+ score += 2
177
+
178
+ part = doc.metadata.get('part', '').lower()
179
+ chapter = doc.metadata.get('chapter', '').lower()
180
+ if any(word in part or word in chapter for word in query_lower.split()):
181
+ score += 1
182
+
183
+ if any(word in doc.page_content.lower() for word in query_lower.split()):
184
+ score += 1
185
+
186
+ if score > 0:
187
+ scored_docs.append((doc, score))
188
+
189
+ scored_docs.sort(key=lambda x: x[1], reverse=True)
190
+ return [doc for doc, _ in scored_docs[:self.k]]
191
+
192
+ async def _aget_relevant_documents(self, query: str, **kwargs) -> List[Document]:
193
+ return self._get_relevant_documents(query)
194
+
195
+ metadata_retriever = MetadataFilterRetriever(corpus_docs=docs, k=15)
196
+ print("✅ Metadata retriever ready")
197
+
198
+ # 6. Hybrid RRF Retriever
199
+ class HybridRRFRetriever(BaseRetriever):
200
+ """Combines semantic, BM25, and metadata using Reciprocal Rank Fusion"""
201
+ semantic_retriever: BaseRetriever
202
+ bm25_retriever: BM25Retriever
203
+ metadata_retriever: MetadataFilterRetriever
204
+ beta_semantic: float = 0.5
205
+ beta_keyword: float = 0.3
206
+ beta_metadata: float = 0.2
207
+ k: int = 60
208
+ top_k: int = 15
209
+
210
+ class Config:
211
+ arbitrary_types_allowed = True
212
+
213
+ def _get_relevant_documents(
214
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
215
+ ) -> List[Document]:
216
+ semantic_docs = self.semantic_retriever.invoke(query)
217
+ bm25_docs = self.bm25_retriever.invoke(query)
218
+ metadata_docs = self.metadata_retriever.invoke(query)
219
+
220
+ rrf_scores = {}
221
+
222
+ for rank, doc in enumerate(semantic_docs, start=1):
223
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
224
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_semantic / (self.k + rank)
225
+
226
+ for rank, doc in enumerate(bm25_docs, start=1):
227
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
228
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_keyword / (self.k + rank)
229
+
230
+ for rank, doc in enumerate(metadata_docs, start=1):
231
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
232
+ rrf_scores[doc_id] = rrf_scores.get(doc_id, 0) + self.beta_metadata / (self.k + rank)
233
+
234
+ all_docs = {}
235
+ for doc in semantic_docs + bm25_docs + metadata_docs:
236
+ doc_id = doc.metadata.get('article_number', str(hash(doc.page_content)))
237
+ if doc_id not in all_docs:
238
+ all_docs[doc_id] = doc
239
+
240
+ sorted_doc_ids = sorted(rrf_scores.items(), key=lambda x: x[1], reverse=True)
241
+ result_docs = []
242
+ for doc_id, score in sorted_doc_ids[:self.top_k]:
243
+ if doc_id in all_docs:
244
+ result_docs.append(all_docs[doc_id])
245
+
246
+ return result_docs
247
+
248
+ async def _aget_relevant_documents(self, query: str, **kwargs) -> List[Document]:
249
+ return self._get_relevant_documents(query)
250
+
251
+ hybrid_retriever = HybridRRFRetriever(
252
+ semantic_retriever=base_retriever,
253
+ bm25_retriever=bm25_retriever,
254
+ metadata_retriever=metadata_retriever,
255
+ beta_semantic=0.5,
256
+ beta_keyword=0.3,
257
+ beta_metadata=0.2,
258
+ k=60,
259
+ top_k=20
260
+ )
261
+ print("✅ Hybrid RRF retriever ready (β: semantic=0.5, keyword=0.3, metadata=0.2)")
262
+
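The hybrid retriever above scores each document as Σᵢ βᵢ / (k + rankᵢ), i.e. weighted Reciprocal Rank Fusion across the three ranked lists. A self-contained sketch of the same scoring on hypothetical article ids (same β weights and k=60 default):

```python
def weighted_rrf(ranked_lists, betas, k=60):
    """Fuse several ranked lists of doc ids with weighted Reciprocal Rank Fusion."""
    scores = {}
    for docs, beta in zip(ranked_lists, betas):
        for rank, doc_id in enumerate(docs, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + beta / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical per-retriever rankings (article ids are illustrative)
semantic = ["a12", "a7", "a3"]
keyword = ["a7", "a3", "a9"]
metadata = ["a3", "a12"]
fused = weighted_rrf([semantic, keyword, metadata], betas=[0.5, 0.3, 0.2])
# fused == ["a3", "a7", "a12", "a9"]
```

Documents that appear in several lists accumulate score from each, which is why `a3` outranks `a12` despite `a12` topping the semantic list.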
263
+ # 7. Cross-Reference Retriever
264
+ class CrossReferenceRetriever(BaseRetriever):
265
+ """Enhances retrieval by fetching cross-referenced articles"""
266
+ base_retriever: BaseRetriever
267
+ article_map: dict
268
+
269
+ class Config:
270
+ arbitrary_types_allowed = True
271
+
272
+ def _get_relevant_documents(
273
+ self, query: str, *, run_manager: CallbackManagerForRetrieverRun = None
274
+ ) -> List[Document]:
275
+ initial_docs = self.base_retriever.invoke(query)
276
+
277
+ all_article_numbers = set()
278
+ for doc in initial_docs:
279
+ if 'article_number' in doc.metadata:
280
+ all_article_numbers.add(doc.metadata['article_number'])
281
+ cross_refs_str = doc.metadata.get('cross_references', '')
282
+ if cross_refs_str:
283
+ cross_refs = [ref.strip() for ref in cross_refs_str.split(',')]
284
+ for ref in cross_refs:
285
+ if ref:
286
+ all_article_numbers.add(str(ref))
287
+
288
+ enhanced_docs = []
289
+ seen_numbers = set()
290
+
291
+ for doc in initial_docs:
292
+ enhanced_docs.append(doc)
293
+ seen_numbers.add(doc.metadata.get('article_number'))
294
+
295
+ for article_num in all_article_numbers:
296
+ if article_num not in seen_numbers and article_num in self.article_map:
297
+ article_data = self.article_map[article_num]
298
+ cross_ref_text = ""
299
+ if article_data.get('cross_references'):
300
+ cross_ref_text = "\nالمواد ذات الصلة: " + ", ".join(
301
+ [f"المادة {ref}" for ref in article_data['cross_references']]
302
+ )
303
+
304
+ page_content = f"""
305
+ رقم المادة: {article_data['article_number']}
306
+ النص الأصلي: {article_data['original_text']}
307
+ الشرح المبسط: {article_data['simplified_summary']}{cross_ref_text}
308
+ """
309
+
310
+ enhanced_doc = Document(
311
+ page_content=page_content,
312
+ metadata={
313
+ "article_id": article_data['article_id'],
314
+ "article_number": str(article_data['article_number']),
315
+ "legal_nature": article_data['legal_nature'],
316
+ "keywords": ", ".join(article_data['keywords']),
317
+ "cross_references": ", ".join([str(ref) for ref in article_data.get('cross_references', [])])
318
+ }
319
+ )
320
+ enhanced_docs.append(enhanced_doc)
321
+ seen_numbers.add(article_num)
322
+
323
+ return enhanced_docs
324
+
325
+ async def _aget_relevant_documents(self, query: str, **kwargs) -> List[Document]:
326
+ return self._get_relevant_documents(query)
327
+
328
+ cross_ref_retriever = CrossReferenceRetriever(
329
+ base_retriever=hybrid_retriever,
330
+ article_map=article_map
331
+ )
332
+ print("✅ Cross-reference retriever ready")
333
+
334
+ # 8. Reranker
335
+ print("Loading reranker model...")
336
+ local_model_path = r"D:\FOE\Senior 2\Graduation Project\Chatbot_me\reranker"  # machine-specific path; adjust before running elsewhere
337
+
338
+ if not os.path.exists(local_model_path):
339
+ raise FileNotFoundError(f"Reranker path not found: {local_model_path}")
340
+
341
+ model = HuggingFaceCrossEncoder(model_name=local_model_path)
342
+ compressor = CrossEncoderReranker(model=model, top_n=5)
343
+
344
+ compression_retriever = ContextualCompressionRetriever(
345
+ base_compressor=compressor,
346
+ base_retriever=cross_ref_retriever
347
+ )
348
+ print("✅ Reranker ready (top_n=5)")
349
+
350
+ # 9. LLM Configuration
351
+ llm = ChatGroq(
352
+ groq_api_key=os.getenv("GROQ_API_KEY"),
353
+ model_name="llama-3.1-8b-instant",
354
+ temperature=0.3,
355
+ model_kwargs={"top_p": 0.9}
356
+ )
357
+
358
+ # 10. Prompt Template
359
+ system_instructions = """
360
+ <role>
361
+ أنت "المساعد القانوني الذكي"، خبير متخصص في الدستور المصري والقوانين الإجرائية.
362
+ مهمتك: تقديم إجابات دقيقة بناءً على "السياق التشريعي" المرفق أولاً، أو تقديم نصائح إجرائية عامة عند الضرورة.
363
+ </role>
364
+
365
+ <decision_logic>
366
+ عليك تحليل "سؤال المستخدم" و"السياق التشريعي" وتصنيف الحالة واختيار الرد المناسب:
367
+
368
+ 🔴 الحالة الأولى: (الإجابة موجودة في السياق التشريعي)
369
+ - استخرج الإجابة من السياق فقط
370
+ - ابدأ الإجابة مباشرة دون مقدمات
371
+ - وثق الإجابة برقم المادة
372
+ - توقف، لا تضف معلومات خارجية
373
+
374
+ 🟡 الحالة ��لثانية: (السياق فارغ + السؤال إجرائي/عملي)
375
+ - استخدم معرفتك العامة بالقانون المصري
376
+ - ابدأ بـ: "بناءً على الإجراءات القانونية العامة في مصر:"
377
+ - قدم الخطوات في نقاط مرقمة
378
+
379
+ 🔵 الحالة الثالثة: (السياق فارغ + سؤال دستوري)
380
+ - قل: "عذراً، لم يرد ذكر لهذا في المواد المسترجاعة"
381
+ - لا تخترع نصوصاً دستورية
382
+
383
+ 🟢 الحالة الرابعة: (تحية/شكر)
384
+ - رد بتحية مهذبة مختصرة
385
+
386
+ ⚫ الحالة الخامسة: (خارج النطاق)
387
+ - اعتذر بلطف ووجه للقانون
388
+ </decision_logic>
389
+
390
+ <formatting_rules>
391
+ - استخدم فقرات قصيرة واترك سطراً فارغاً بينها
392
+ - التزم باللغة العربية الفصحى المبسطة
393
+ </formatting_rules>
394
+ """
395
+
396
+ prompt = ChatPromptTemplate.from_messages([
397
+ ("system", system_instructions),
398
+ ("system", "السياق التشريعي المتاح:\n{context}"),
399
+ ("human", "السؤال:\n{input}")
400
+ ])
401
+
402
+ # 11. Build QA Chain
403
+ qa_chain = (
404
+ RunnableParallel({
405
+ "context": compression_retriever,
406
+ "input": RunnablePassthrough()
407
+ })
408
+ .assign(answer=(
409
+ prompt
410
+ | llm
411
+ | StrOutputParser()
412
+ ))
413
+ )
414
+
415
+ print("✅ RAG pipeline initialized!\n")
416
+ return qa_chain
417
+
418
+ # ==========================================
419
+ # 📊 RAGAS EVALUATION
420
+ # ==========================================
421
+
422
+ def run_evaluation(test_file="test_dataset.json", output_file="evaluation_results.json"):
423
+ """Run RAGAS evaluation on test dataset"""
424
+
425
+ print("\n" + "="*60)
426
+ print("📊 RAGAS EVALUATION")
427
+ print("="*60)
428
+
429
+ # Load test dataset
430
+ print(f"\n📂 Loading test dataset: {test_file}")
431
+ with open(test_file, "r", encoding="utf-8") as f:
432
+ test_questions = json.load(f)
433
+ print(f"✅ Loaded {len(test_questions)} test questions")
434
+
435
+ # Initialize RAG pipeline
436
+ print("\n📥 Initializing RAG pipeline...")
437
+ qa_chain = initialize_rag_pipeline()
438
+
439
+ # Generate answers
440
+ print("\n🤖 Generating answers for evaluation...")
441
+ results = {
442
+ "question": [],
443
+ "answer": [],
444
+ "contexts": [],
445
+ "ground_truth": []
446
+ }
447
+
448
+ for idx, item in enumerate(test_questions, 1):
449
+ question = item["question"]
450
+ ground_truth = item.get("ground_truth", "")
451
+
452
+ print(f" [{idx}/{len(test_questions)}] Processing question {idx}...")
453
+
454
+ try:
455
+ result = qa_chain.invoke(question)
456
+ answer = result["answer"]
457
+ contexts = [doc.page_content for doc in result["context"]]
458
+
459
+ results["question"].append(question)
460
+ results["answer"].append(answer)
461
+ results["contexts"].append(contexts)
462
+ results["ground_truth"].append(ground_truth)
463
+
464
+ except Exception as e:
465
+ print(f" ❌ Error: {str(e)[:100]}")
466
+ results["question"].append(question)
467
+ results["answer"].append("Error generating answer")
468
+ results["contexts"].append([])
469
+ results["ground_truth"].append(ground_truth)
470
+
471
+ # Run Ragas evaluation
472
+ print("\n⚙️ Running RAGAS metrics...")
473
+ dataset = Dataset.from_dict(results)
474
+
475
+ # Configure evaluation LLM (same as main app)
476
+ print(" 📌 Using Groq (llama-3.1-8b-instant, temp=0.3, top_p=0.9)")
477
+ evaluator_llm = LangchainLLMWrapper(ChatGroq(
478
+ groq_api_key=os.getenv("GROQ_API_KEY"),
479
+ model_name="llama-3.1-8b-instant",
480
+ temperature=0.3,
481
+ model_kwargs={"top_p": 0.9},
482
+ max_retries=2
483
+ ))
484
+
485
+ # Configure evaluation embeddings (same as main app)
486
+ print(" 📌 Using HuggingFace (Omartificial-Intelligence-Space/GATE-AraBert-v1)")
487
+ evaluator_embeddings = LangchainEmbeddingsWrapper(HuggingFaceEmbeddings(
488
+ model_name="Omartificial-Intelligence-Space/GATE-AraBert-v1"
489
+ ))
490
+
491
+ try:
492
+ import time
493
+ print("\n ⏳ Evaluating each question separately with all metrics...")
494
+ print(" ⚠️ This will take ~10-15 minutes (120 sec delay between questions)\n")
495
+
496
+ # Evaluate each question separately to see results immediately
497
+ all_scores = {
498
+ "faithfulness": [],
499
+ "answer_relevancy": [],
500
+ "context_precision": [],
501
+ "context_recall": []
502
+ }
503
+
504
+ for q_idx in range(len(results["question"])):
505
+ print(f"\n 📋 Question {q_idx + 1}/{len(results['question'])}: {results['question'][q_idx][:60]}...")
506
+
507
+ # Create single-question dataset
508
+ single_q_data = {
509
+ "question": [results["question"][q_idx]],
510
+ "answer": [results["answer"][q_idx]],
511
+ "contexts": [results["contexts"][q_idx]],
512
+ "ground_truth": [results["ground_truth"][q_idx]]
513
+ }
514
+ single_dataset = Dataset.from_dict(single_q_data)
515
+
516
+ # Evaluate all metrics for this question
517
+ try:
518
+ q_result = evaluate(
519
+ single_dataset,
520
+ metrics=[faithfulness, answer_relevancy, context_precision, context_recall],
521
+ llm=evaluator_llm,
522
+ embeddings=evaluator_embeddings,
523
+ raise_exceptions=False
524
+ )
525
+
526
+ # Extract scores (handle if they're lists or single values)
527
+ def get_score(value):
528
+ if isinstance(value, list):
529
+ return value[0] if len(value) > 0 else 0.0
530
+ return float(value) if value is not None else 0.0
531
+
532
+ f_score = get_score(q_result['faithfulness'])
533
+ a_score = get_score(q_result['answer_relevancy'])
534
+ cp_score = get_score(q_result['context_precision'])
535
+ cr_score = get_score(q_result['context_recall'])
536
+
537
+ # Display scores for this question
538
+ print(f" Faithfulness : {f_score:.4f}")
539
+ print(f" Answer Relevancy : {a_score:.4f}")
540
+ print(f" Context Precision : {cp_score:.4f}")
541
+ print(f" Context Recall : {cr_score:.4f}")
542
+
543
+ all_scores["faithfulness"].append(f_score)
544
+ all_scores["answer_relevancy"].append(a_score)
545
+ all_scores["context_precision"].append(cp_score)
546
+ all_scores["context_recall"].append(cr_score)
547
+
548
+ except Exception as e:
549
+ print(f" ❌ Error evaluating this question: {str(e)[:80]}")
550
+ all_scores["faithfulness"].append(0.0)
551
+ all_scores["answer_relevancy"].append(0.0)
552
+ all_scores["context_precision"].append(0.0)
553
+ all_scores["context_recall"].append(0.0)
554
+
555
+ # Wait between questions to avoid rate limits
556
+ if q_idx < len(results["question"]) - 1:
557
+ print(f"\n ⏳ Waiting 120 seconds (2 min) before next question...")
558
+ time.sleep(120)
559
+
560
+ # Calculate average scores
561
+ eval_results = {
562
+ "faithfulness": sum(all_scores["faithfulness"]) / len(all_scores["faithfulness"]) if all_scores["faithfulness"] else 0.0,
563
+ "answer_relevancy": sum(all_scores["answer_relevancy"]) / len(all_scores["answer_relevancy"]) if all_scores["answer_relevancy"] else 0.0,
564
+ "context_precision": sum(all_scores["context_precision"]) / len(all_scores["context_precision"]) if all_scores["context_precision"] else 0.0,
565
+ "context_recall": sum(all_scores["context_recall"]) / len(all_scores["context_recall"]) if all_scores["context_recall"] else 0.0
566
+ }
567
+
568
+ # Display results
569
+ print("\n" + "="*60)
570
+ print("📈 EVALUATION RESULTS")
571
+ print("="*60)
572
+
573
+ for metric, score in eval_results.items():
574
+ if isinstance(score, (int, float)):
575
+ print(f" {metric:28s}: {score:.4f}")
576
+
577
+ # Save results to JSON
578
+ with open(output_file, "w", encoding="utf-8") as f:
579
+ results_dict = {
580
+ "metrics": {k: float(v) if isinstance(v, (int, float)) else str(v)
581
+ for k, v in eval_results.items()},
582
+ "test_samples": len(dataset),
583
+ "test_file": test_file
584
+ }
585
+ json.dump(results_dict, f, ensure_ascii=False, indent=2)
586
+
587
+ print(f"\n💾 Results saved to: {output_file}")
588
+ print("="*60 + "\n")
589
+
590
+ return eval_results
591
+
592
+ except Exception as e:
593
+ print(f"\n❌ Evaluation failed: {e}")
594
+ print("\n⚠️ Make sure:")
595
+ print(" 1. GROQ_API_KEY is set in .env")
596
+ print(" 2. You have valid Groq API credits")
597
+ print(" 3. Internet connection is available")
598
+ return None
599
+
600
+ # ==========================================
601
+ # 🎯 MAIN EXECUTION
602
+ # ==========================================
603
+
604
+ if __name__ == "__main__":
605
+ import sys
606
+
607
+ test_file = "test_dataset.json"
608
+ output_file = "evaluation_results.json"
609
+
610
+ # Check for command line arguments
611
+ if len(sys.argv) > 1:
612
+ test_file = sys.argv[1]
613
+ if len(sys.argv) > 2:
614
+ output_file = sys.argv[2]
615
+
616
+ print("\n" + "="*60)
617
+ print("🚀 Constitutional Legal Assistant - RAGAS Evaluation")
618
+ print("="*60)
619
+
620
+ run_evaluation(test_file, output_file)
evaluate_rag.py ADDED
@@ -0,0 +1,535 @@
1
+ # -*- coding: utf-8 -*-
2
+ """
3
+ RAG Evaluation Script using Ragas Metrics
4
+ ==========================================
5
+ Evaluates the Constitutional Legal Assistant using:
6
+ - faithfulness
7
+ - answer_relevancy
8
+ - context_precision
9
+ - context_recall
11
+
12
+ USAGE:
13
+ ------
14
+ 1. Command line: python evaluate_rag.py path/to/questions.json
15
+ 2. Environment variable: set QA_FILE_PATH=path/to/questions.json
16
+ 3. Default: Place 'test_dataset_5_questions.json' in same directory
17
+
18
+ JSON FORMAT:
19
+ -----------
20
+ List format: [{"question": "...", "ground_truth": "..."}, ...]
21
+ OR dict format: {"data": [...]} or {"questions": [...]}
22
+
23
+ RATE LIMITS:
24
+ -----------
25
+ - 60 second delay between questions to avoid API timeouts
26
+ - 60 second delay before evaluation starts
27
+ - 10 second initial cooldown after pipeline load
28
+ """
29
+
30
+ import os
31
+ import sys
32
+ import json
33
+ import time
34
+ from dotenv import load_dotenv
35
+ from datasets import Dataset
36
+ from ragas import evaluate
37
+ from ragas.metrics import (
38
+ faithfulness,
39
+ answer_relevancy,
40
+ context_precision,
41
+ context_recall,
42
+ )
43
+ from ragas.llms import LangchainLLMWrapper
44
+ from ragas.embeddings import LangchainEmbeddingsWrapper
45
+ from langchain_groq import ChatGroq
46
+ from langchain_huggingface import HuggingFaceEmbeddings
47
+ import logging
48
+
49
+ # Import the RAG pipeline initialization
50
+ from app_final_updated import initialize_rag_pipeline
51
+
52
+ # Suppress verbose API logging
53
+ logging.getLogger("httpx").setLevel(logging.WARNING)
54
+ logging.getLogger("groq").setLevel(logging.WARNING)
55
+
56
+ load_dotenv()
57
+ model_name = "Omartificial-Intelligence-Space/GATE-AraBert-v1"  # Arabic embeddings model (same as the RAG pipeline)
58
+ # ==========================================
59
+ # ⏱️ RATE LIMITING / DELAYS (GROQ LIMITS)
60
+ # ==========================================
61
+ RPM_LIMIT = 30
62
+ TPM_LIMIT = 6000
63
+ RPD_LIMIT = 14400
64
+ TPD_LIMIT = 500000
65
+
66
+ # Use a conservative delay to stay within RPM limits.
67
+ # Increased delays to prevent API timeouts
68
+ MIN_DELAY_SECONDS = 60.0 / RPM_LIMIT
69
+ REQUEST_DELAY_SECONDS = 60.0 # 1 minute between each question to avoid timeouts
70
+ EVALUATION_DELAY_SECONDS = 60.0 # 60 seconds before starting evaluation
71
+ INITIAL_COOLDOWN = 10.0 # 10 seconds after loading pipeline
72
+ PER_METRIC_DELAY = 60.0 # 60 seconds between evaluating each question's metrics
73
+
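`MIN_DELAY_SECONDS = 60.0 / RPM_LIMIT` turns the requests-per-minute cap into a per-request spacing floor (60 / 30 = 2 s). A hypothetical pacing helper built on the same idea (a sketch, not part of this script):

```python
import time

def paced(iterable, delay_seconds):
    """Yield items, sleeping `delay_seconds` between consecutive ones."""
    for i, item in enumerate(iterable):
        if i > 0:
            time.sleep(delay_seconds)
        yield item

# 60 / RPM_LIMIT gives the minimum spacing that respects the cap
assert 60.0 / 30 == 2.0
# Tiny delay here just to demonstrate the generator
processed = [q.upper() for q in paced(["q1", "q2"], delay_seconds=0.01)]
```

Sleeping between items rather than after every item avoids a pointless wait after the final request.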
74
+ # ==========================================
75
+ # 📝 TEST DATASET
76
+ # ==========================================
77
+ # Default test questions (used when no file is provided)
78
+ DEFAULT_TEST_QUESTIONS = [
79
+ {
80
+ "question": "ما هي شروط الترشح لرئاسة الجمهورية؟",
81
+ "ground_truth": "يجب أن يكون المرشح مصرياً من أبوين مصريين، وألا تكون له جنسية أخرى، وأن يكون متمتعاً بحقوقه المدنية والسياسية، وأن يكون قد أدى الخدمة العسكرية أو أعفي منها قانوناً، وألا تقل سنه يوم فتح باب الترشح عن أربعين سنة ميلادية."
82
+ },
83
+ {
84
+ "question": "ما هي مدة ولاية رئيس الجمهورية؟",
85
+ "ground_truth": "مدة الرئاسة ست سنوات ميلادية، تبدأ من اليوم التالي لانتهاء مدة سلفه، ولا يجوز إعادة انتخابه إلا لمرة واحدة."
86
+ },
87
+ {
88
+ "question": "ما هي حقوق المواطن في الحصول على المعلومات؟",
89
+ "ground_truth": "المعلومات والبيانات والإحصاءات والوثائق الرسمية ملك للشعب، والإفصاح عنها من مصادرها المختلفة حق تكفله الدولة لكل مواطن."
90
+ },
91
+ {
92
+ "question": "ما هو دور مجلس الشيوخ؟",
93
+ "ground_truth": "يختص مجلس الشيوخ بدراسة واقتراح ما يراه كفيلاً بدعم الوحدة الوطنية والسلام الاجتماعي والحفاظ على المقومات الأساسية للمجتمع، ودراسة مشروعات القوانين المكملة للدستور."
94
+ },
95
+ {
96
+ "question": "كيف يتم تعديل الدستور؟",
97
+ "ground_truth": "لرئيس الجمهورية أو لخمس أعضاء مجلس النواب طلب تعديل مادة أو أكثر من الدستور، ويجب الموافقة على التعديل بأغلبية ثلثي أعضاء المجلس، ثم يعرض على الشعب في استفتاء."
98
+ }
99
+ ]
100
+
101
+ def load_test_questions(file_path: str):
102
+ """Load test questions from JSON file"""
103
+ try:
104
+ with open(file_path, "r", encoding="utf-8") as f:
105
+ obj = json.load(f)
106
+
107
+ if isinstance(obj, list):
108
+ return obj
109
+ if isinstance(obj, dict):
110
+ if "data" in obj and isinstance(obj["data"], list):
111
+ return obj["data"]
112
+ if "questions" in obj and isinstance(obj["questions"], list):
113
+ return obj["questions"]
114
+ raise ValueError("Unsupported QA JSON format; expected a list or dict with 'data' or 'questions'.")
115
+ except FileNotFoundError:
116
+ raise FileNotFoundError(f"❌ QA file not found: {file_path}")
117
+ except json.JSONDecodeError as e:
118
+ raise ValueError(f"❌ Invalid JSON format in {file_path}: {e}")
119
+ except Exception as e:
120
+ raise Exception(f"❌ Error loading QA file {file_path}: {e}")
121
+
122
+
123
+ # Load QA file path from environment variable or command line
124
+ qa_file_path = os.getenv("QA_FILE_PATH")
125
+ if not qa_file_path and len(sys.argv) > 1:
126
+ qa_file_path = sys.argv[1]
127
+
128
+ # If still not provided, try default file
129
+ if not qa_file_path:
130
+ default_path = "test_dataset_5_questions.json"
131
+ if os.path.exists(default_path):
132
+ qa_file_path = default_path
133
+ print(f"📂 Using default dataset: {default_path}")
134
+
135
+ if qa_file_path and os.path.exists(qa_file_path):
136
+ print(f"📂 Loading questions from: {qa_file_path}")
137
+ try:
138
+ test_questions = load_test_questions(qa_file_path)
139
+ print(f"✅ Loaded {len(test_questions)} questions from file")
140
+ except Exception as e:
141
+ print(f"❌ Error loading file: {e}")
142
+ print("📝 Using default inline test questions instead")
143
+ test_questions = DEFAULT_TEST_QUESTIONS
144
+ else:
145
+ if qa_file_path:
146
+ print(f"⚠️ File not found: {qa_file_path}")
147
+ print("📝 Using default inline test questions")
148
+ test_questions = DEFAULT_TEST_QUESTIONS
149
+
150
+ # ==========================================
151
+ # 🔄 RUN EVALUATION
152
+ # ==========================================
153
+
154
+ def run_evaluation():
155
+ print("="*60)
156
+ print("🚀 Starting RAG Evaluation with Ragas")
157
+ print("="*60)
158
+
159
+ print(f"\n📊 Configuration:")
160
+ print(f" Questions to evaluate: {len(test_questions)}")
161
+ print(f" Delay per question (generation): {REQUEST_DELAY_SECONDS}s")
162
+ print(f" Delay per question (evaluation): {PER_METRIC_DELAY}s")
163
+
164
+ total_gen_time = len(test_questions) * REQUEST_DELAY_SECONDS / 60.0
165
+ total_eval_time = len(test_questions) * PER_METRIC_DELAY / 60.0
166
+ total_time = total_gen_time + total_eval_time + INITIAL_COOLDOWN / 60.0 + EVALUATION_DELAY_SECONDS / 60.0
167
+
168
+ print(f"\n⏱️ Estimated total time:")
169
+ print(f" Question generation: ~{total_gen_time:.1f} minutes")
170
+ print(f" Evaluation phase: ~{total_eval_time:.1f} minutes")
171
+ print(f" Total: ~{total_time:.1f} minutes ({total_time/60:.1f} hours)\n")
172
+
173
+ # 1. Initialize RAG Pipeline
174
+ print("\n📥 Loading RAG pipeline...")
175
+ qa_chain = initialize_rag_pipeline()
176
+ print("✅ Pipeline loaded successfully")
177
+
178
+ # Let the service cool down before starting requests
179
+ print(f"⏳ Cooling down for {INITIAL_COOLDOWN} seconds...")
180
+ time.sleep(INITIAL_COOLDOWN)
181
+
182
+ # 2. Generate answers and collect context
183
+ print("\n🤖 Generating answers for test questions...\n")
184
+
185
+ results = {
186
+ "question": [],
187
+ "answer": [],
188
+ "contexts": [],
189
+ "ground_truth": []
190
+ }
191
+
192
+ for idx, item in enumerate(test_questions, 1):
193
+ question = item["question"]
194
+ ground_truth = item.get("ground_truth", "")
195
+
196
+ print(f"\n{'='*60}")
197
+ print(f"[{idx}/{len(test_questions)}] Generating answer ({idx / len(test_questions) * 100:.0f}% complete)")
198
+ print(f"{'='*60}")
199
+ print(f"Q: {question[:80]}...")
200
+ print(f"{'-'*60}")
201
+
202
+ try:
203
+ # Invoke the chain
204
+ result = qa_chain.invoke(question)
205
+
206
+ answer = result["answer"]
207
+ context_docs = result["context"]
208
+
209
+ # Extract context text from documents
210
+ contexts = [doc.page_content for doc in context_docs]
211
+
212
+ # Store results
213
+ results["question"].append(question)
214
+ results["answer"].append(answer)
215
+ results["contexts"].append(contexts)
216
+ results["ground_truth"].append(ground_truth)
217
+
218
+ print(f"✅ Generated answer ({len(answer)} chars)")
219
+ print(f"✅ Retrieved {len(contexts)} context documents")
220
+
221
+ # Delay between requests to avoid hitting RPM limits
222
+ if idx < len(test_questions):
223
+ print(f"⏳ Waiting {REQUEST_DELAY_SECONDS} seconds before next question...")
224
+ time.sleep(REQUEST_DELAY_SECONDS)
225
+
226
+ except Exception as e:
227
+ print(f"❌ Error: {e}")
228
+ # Add placeholder to keep dataset aligned
229
+ results["question"].append(question)
230
+ results["answer"].append("Error generating answer")
231
+ results["contexts"].append([])
232
+ results["ground_truth"].append(ground_truth)
233
+
234
+ # 3. Convert to Ragas Dataset format
235
+ print("\n📊 Creating evaluation dataset...")
236
+ dataset = Dataset.from_dict(results)
237
+ print(f"✅ Dataset created with {len(dataset)} samples")
238
+
239
+ # 4. Run Ragas Evaluation
240
+ print("\n⚙️ Running Ragas evaluation...")
241
+ print("This may take a few minutes...")
242
+ print("Using Groq API (Llama 3.1 8B Instant) for evaluation...")
243
+
244
+ # Add a larger delay before evaluation to avoid back-to-back bursts
245
+ print(f"⏳ Waiting {EVALUATION_DELAY_SECONDS} seconds before evaluation...")
246
+ time.sleep(EVALUATION_DELAY_SECONDS)
247
+
248
+ # Configure Groq LLM for evaluation (same as app_final.py)
249
+ evaluator_llm = LangchainLLMWrapper(ChatGroq(
250
+ model="llama-3.1-8b-instant", # Same as app_final.py
251
+ temperature=0.3, # Same as app_final.py
252
+ model_kwargs={"top_p": 0.9}, # Same as app_final.py
253
+ max_retries=3 # Add retries for robustness
254
+ ))
255
+
256
+ # Configure embeddings (same as app_final.py)
257
+ print("Configuring HuggingFace embeddings (same as app_final.py)...")
258
+ evaluator_embeddings = LangchainEmbeddingsWrapper(HuggingFaceEmbeddings(
259
+ model_name=model_name
260
+ ))
261
+
262
+ try:
263
+ # Evaluate each question separately with delays to avoid rate limits
264
+ print(f"\n⚠️ Evaluating each question separately with {PER_METRIC_DELAY:.0f}-second delays...")
265
+ print(f"⏱️ Estimated time: ~{len(results['question']) * PER_METRIC_DELAY / 60:.1f} minutes\n")
266
+
267
+ all_scores = {
268
+ "faithfulness": [],
269
+ "answer_relevancy": [],
270
+ "context_precision": [],
271
+ "context_recall": []
272
+ }
273
+
274
+ for q_idx in range(len(results["question"])):
275
+ print(f"\n{'='*60}")
276
+ print(f"📋 Question {q_idx + 1}/{len(results['question'])} ({(q_idx + 1) / len(results['question']) * 100:.0f}% complete)")
277
+ print(f"{'='*60}")
278
+ print(f"Q: {results['question'][q_idx][:80]}...")
279
+ print("-" * 60)
280
+
281
+ # Create single-question dataset
282
+ single_q_data = {
283
+ "question": [results["question"][q_idx]],
284
+ "answer": [results["answer"][q_idx]],
285
+ "contexts": [results["contexts"][q_idx]],
286
+ "ground_truth": [results["ground_truth"][q_idx]]
287
+ }
288
+ single_dataset = Dataset.from_dict(single_q_data)
289
+
290
+ # Evaluate all metrics for this question
291
+ try:
292
+ q_result = evaluate(
293
+ single_dataset,
294
+ metrics=[faithfulness, answer_relevancy, context_precision, context_recall],
295
+ llm=evaluator_llm,
296
+ embeddings=evaluator_embeddings,
297
+ raise_exceptions=False
298
+ )
299
+
300
+ # Convert EvaluationResult to dict if needed
301
+ if hasattr(q_result, 'to_pandas'):
302
+ # Convert to pandas and then to dict
303
+ result_df = q_result.to_pandas()
304
+ result_dict = result_df.to_dict('records')[0] if len(result_df) > 0 else {}
305
+ elif isinstance(q_result, dict):
306
+ result_dict = q_result
307
+ else:
308
+ # Try to access as attributes
309
+ result_dict = {
310
+ 'faithfulness': getattr(q_result, 'faithfulness', 0.0),
311
+ 'answer_relevancy': getattr(q_result, 'answer_relevancy', 0.0),
312
+ 'context_precision': getattr(q_result, 'context_precision', 0.0),
313
+ 'context_recall': getattr(q_result, 'context_recall', 0.0)
314
+ }
315
+
316
+ # Extract scores (handle if they're lists or single values)
317
+ def get_score(value):
318
+ if isinstance(value, list):
319
+ return value[0] if len(value) > 0 else 0.0
320
+ score = float(value) if value is not None else 0.0
+ return 0.0 if score != score else score  # NaN guard: Ragas yields NaN when a metric fails
321
+
322
+ f_score = get_score(result_dict.get('faithfulness', 0.0))
323
+ a_score = get_score(result_dict.get('answer_relevancy', 0.0))
324
+ cp_score = get_score(result_dict.get('context_precision', 0.0))
325
+ cr_score = get_score(result_dict.get('context_recall', 0.0))
326
+
327
+ # Display scores for this question
328
+ print(f"\n📊 Results for Question {q_idx + 1}:")
329
+ print(f" Faithfulness : {f_score:.4f}")
330
+ print(f" Answer Relevancy : {a_score:.4f}")
331
+ print(f" Context Precision : {cp_score:.4f}")
332
+ print(f" Context Recall : {cr_score:.4f}")
333
+
334
+ all_scores["faithfulness"].append(f_score)
335
+ all_scores["answer_relevancy"].append(a_score)
336
+ all_scores["context_precision"].append(cp_score)
337
+ all_scores["context_recall"].append(cr_score)
338
+
339
+ except Exception as e:
340
+ print(f"\n❌ Error evaluating question {q_idx + 1}: {str(e)}")
341
+ print(f" Error type: {type(e).__name__}")
342
+ # Print a truncated traceback to aid debugging
343
+ import traceback
344
+ print(f" Traceback: {traceback.format_exc()[:200]}...")
345
+ all_scores["faithfulness"].append(0.0)
346
+ all_scores["answer_relevancy"].append(0.0)
347
+ all_scores["context_precision"].append(0.0)
348
+ all_scores["context_recall"].append(0.0)
349
+
350
+ # Wait between questions to avoid rate limits
351
+ if q_idx < len(results["question"]) - 1:
352
+ print(f"\n⏳ Waiting {PER_METRIC_DELAY} seconds before next question...")
353
+ time.sleep(PER_METRIC_DELAY)
354
+
355
+ # Calculate average scores
356
+ print("\n" + "="*60)
357
+ print("📊 CALCULATING AVERAGE SCORES")
358
+ print("="*60)
359
+
360
+ evaluation_results = {
361
+ "faithfulness": sum(all_scores["faithfulness"]) / len(all_scores["faithfulness"]) if all_scores["faithfulness"] else 0.0,
362
+ "answer_relevancy": sum(all_scores["answer_relevancy"]) / len(all_scores["answer_relevancy"]) if all_scores["answer_relevancy"] else 0.0,
363
+ "context_precision": sum(all_scores["context_precision"]) / len(all_scores["context_precision"]) if all_scores["context_precision"] else 0.0,
364
+ "context_recall": sum(all_scores["context_recall"]) / len(all_scores["context_recall"]) if all_scores["context_recall"] else 0.0
365
+ }
366
+
367
+ print("\n" + "="*60)
368
+ print("📈 FINAL AVERAGE RESULTS")
369
+ print("="*60)
370
+
371
+ # Display average results
372
+ for metric_name, score in evaluation_results.items():
373
+ if isinstance(score, (int, float)):
374
+ print(f" {metric_name:28s}: {score:.4f}")
375
+
376
+ overall_avg = sum(evaluation_results.values()) / len(evaluation_results)
377
+ print(f"\n {'Overall Average':28s}: {overall_avg:.4f}")
378
+
379
+ # Save results to JSON
380
+ results_file = "evaluation_results.json"
381
+ with open(results_file, "w", encoding="utf-8") as f:
382
+ results_dict = {
383
+ "metrics": {k: float(v) if isinstance(v, (int, float)) else str(v)
384
+ for k, v in evaluation_results.items()},
385
+ "individual_scores": all_scores,
386
+ "test_samples": len(dataset),
387
+ "overall_average": overall_avg,
388
+ "evaluation_details": {
389
+ "delay_per_question": f"{REQUEST_DELAY_SECONDS}s",
390
+ "delay_per_metric": f"{PER_METRIC_DELAY}s",
391
+ "model": "llama-3.1-8b-instant",
392
+ "embeddings": model_name
393
+ }
394
+ }
395
+ json.dump(results_dict, f, ensure_ascii=False, indent=2)
396
+
397
+ print(f"\n💾 Results saved to: {results_file}")
398
+
399
+ # Save individual question breakdown
400
+ breakdown_file = "evaluation_breakdown.json"
401
+ breakdown_data = []
402
+ for q_idx in range(len(results["question"])):
403
+ # Calculate average score for this question across all metrics
404
+ question_score = (
405
+ all_scores["faithfulness"][q_idx] +
406
+ all_scores["answer_relevancy"][q_idx] +
407
+ all_scores["context_precision"][q_idx] +
408
+ all_scores["context_recall"][q_idx]
409
+ ) / 4.0
410
+
411
+ breakdown_data.append({
412
+ "question": results["question"][q_idx],
413
+ "ground_truth": results["ground_truth"][q_idx],
414
+ "actual_answer": results["answer"][q_idx],
415
+ "score": round(question_score, 4)
416
+ })
417
+
418
+ # Calculate average score of all questions
419
+ total_avg_score = sum(item["score"] for item in breakdown_data) / len(breakdown_data) if breakdown_data else 0.0
420
+
421
+ # Create simplified results structure
422
+ simplified_results = {
423
+ "questions": breakdown_data,
424
+ "average_score": round(total_avg_score, 4)
425
+ }
426
+
427
+ with open(breakdown_file, "w", encoding="utf-8") as f:
428
+ json.dump(simplified_results, f, ensure_ascii=False, indent=2)
429
+
430
+ print(f"💾 Question breakdown saved to: {breakdown_file}")
431
+ print(f"📊 Average score across all questions: {total_avg_score:.4f}")
432
+
433
+ # Save detailed results
434
+ detailed_file = "evaluation_detailed.json"
435
+ with open(detailed_file, "w", encoding="utf-8") as f:
436
+ json.dump(results, f, ensure_ascii=False, indent=2)
437
+
438
+ print(f"💾 Detailed results saved to: {detailed_file}")
439
+
440
+ print("\n" + "="*60)
441
+ print("✅ Evaluation Complete!")
442
+ print("="*60)
443
+
444
+ return evaluation_results
445
+
446
+ except Exception as e:
447
+ print(f"\n❌ Evaluation failed: {e}")
448
+ print("\n⚠️ Troubleshooting:")
449
+ print(" 1. Check GROQ_API_KEY is set in .env file")
450
+ print(" 2. Verify you have valid Groq API credits")
451
+ print(" 3. Ensure internet connection is stable")
452
+ print(" 4. Try increasing PER_METRIC_DELAY in the script")
453
+ print(" 5. Reduce the number of test questions")
454
+ import traceback
455
+ traceback.print_exc()
456
+ return None
457
+
458
+ # ==========================================
459
+ # 📊 METRIC EXPLANATIONS
460
+ # ==========================================
461
+
462
+ def print_metric_explanations():
463
+ """Print what each metric measures"""
464
+ print("\n" + "="*60)
465
+ print("📖 RAGAS METRICS EXPLANATION")
466
+ print("="*60)
467
+
468
+ explanations = {
469
+ "faithfulness": "Is the answer grounded in the context? (0-1, higher is better)\n"
470
+ "Measures if the answer contains only information from the retrieved context.",
471
+
472
+ "answer_relevancy": "Does the answer relate to the question? (0-1, higher is better)\n"
473
+ "Measures how well the answer addresses the question asked.",
474
+
475
+ "context_precision": "How much retrieved context was relevant? (0-1, higher is better)\n"
476
+ "Measures the signal-to-noise ratio in retrieved documents.",
477
+
478
+ "context_recall": "Did we retrieve all needed information? (0-1, higher is better)\n"
479
+ "Measures if all ground truth information is in the context."
483
+ }
484
+
485
+ for metric, explanation in explanations.items():
486
+ print(f"\n{metric.upper()}:")
487
+ print(f" {explanation}")
488
+
489
+ print("\n" + "="*60)
490
+
491
+ # ==========================================
492
+ # 🎯 MAIN EXECUTION
493
+ # ==========================================
494
+
495
+ if __name__ == "__main__":
496
+ from datetime import datetime
497
+
498
+ start_time = datetime.now()
499
+
500
+ print("\n" + "="*60)
501
+ print("🎯 RAG EVALUATION SYSTEM")
502
+ print(" Constitutional Legal Assistant - Egyptian Constitution")
503
+ print("="*60)
504
+ print(f"\n⏰ Started at: {start_time.strftime('%Y-%m-%d %H:%M:%S')}")
505
+
506
+ # Print what metrics mean
507
+ print_metric_explanations()
508
+
509
+ # Run evaluation
510
+ input("\nPress ENTER to start evaluation...")
511
+
512
+ results = run_evaluation()
513
+
514
+ end_time = datetime.now()
515
+ duration = end_time - start_time
516
+
517
+ print("\n" + "="*60)
518
+ print("📊 EVALUATION SUMMARY")
519
+ print("="*60)
520
+ print(f"⏰ Started: {start_time.strftime('%Y-%m-%d %H:%M:%S')}")
521
+ print(f"⏰ Finished: {end_time.strftime('%Y-%m-%d %H:%M:%S')}")
522
+ print(f"⏱️ Duration: {duration.total_seconds() / 60:.1f} minutes")
523
+ print(f"📝 Questions evaluated: {len(test_questions)}")
524
+
525
+ if results:
526
+ print(f"\n✅ Evaluation completed successfully!")
527
+ print(f"\n📂 Output files:")
528
+ print(f" - evaluation_results.json (average metrics & config)")
529
+ print(f" - evaluation_breakdown.json (per-question scores)")
530
+ print(f" - evaluation_detailed.json (full Q&A data)")
531
+ else:
532
+ print(f"\n⚠️ Evaluation could not be completed.")
533
+ print(f" Check the error messages above for troubleshooting.")
534
+
535
+ print("\n" + "="*60)
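The `load_test_questions` loader above accepts three payload shapes: a bare list, or a dict wrapping the list under `"data"` or `"questions"`. A minimal sketch of the same dispatch logic (the helper name `parse_qa_payload` is illustrative, not part of the repo):

```python
import json

# Illustrative helper mirroring load_test_questions' shape dispatch,
# applied to an in-memory string instead of a file path.
def parse_qa_payload(text: str):
    obj = json.loads(text)
    if isinstance(obj, list):
        return obj
    if isinstance(obj, dict):
        for key in ("data", "questions"):
            if key in obj and isinstance(obj[key], list):
                return obj[key]
    raise ValueError("Unsupported QA JSON format; expected a list "
                     "or a dict with 'data' or 'questions'.")

flat = '[{"question": "q", "ground_truth": "a"}]'
# All three shapes resolve to the same list of QA items.
assert parse_qa_payload(flat) == parse_qa_payload('{"data": %s}' % flat)
assert parse_qa_payload(flat) == parse_qa_payload('{"questions": %s}' % flat)
```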
requirements.txt ADDED
@@ -0,0 +1,57 @@
1
+ # ===========================================
2
+ # Constitutional Legal Assistant - Requirements
3
+ # ===========================================
4
+
5
+ # Core Python
6
+ python-dotenv>=1.0.0
7
+
8
+ # Streamlit UI
9
+ streamlit>=1.28.0
10
+
11
+ # LangChain Core
12
+ langchain>=0.2.0
13
+ langchain-core>=0.2.0
14
+ langchain-text-splitters>=0.2.0
15
+ langchain-community>=0.2.0
16
+ langchain-classic>=0.0.1
17
+
18
+ # Vector Store
19
+ langchain-chroma>=0.1.0
20
+ chromadb>=0.4.0
21
+
22
+ # Embeddings & Reranker
23
+ langchain-huggingface>=0.0.3
24
+ sentence-transformers>=2.2.0
25
+ transformers>=4.35.0
26
+ torch>=2.0.0
27
+
28
+ # LLM Provider (Groq)
29
+ langchain-groq>=0.1.0
30
+
31
+ # BM25 Keyword Search
32
+ rank-bm25>=0.2.2
33
+
34
+ # Numerical
35
+ numpy>=1.24.0
36
+
37
+ # ===========================================
38
+ # EVALUATION (RAGAS)
39
+ # ===========================================
40
+ ragas>=0.1.0
41
+ datasets>=2.14.0
42
+
43
+ # ===========================================
44
+ # PHOENIX OBSERVABILITY (Optional)
45
+ # ===========================================
46
+ # For app_final_pheonix.py tracing
47
+ opentelemetry-api>=1.20.0
48
+ opentelemetry-sdk>=1.20.0
49
+ opentelemetry-exporter-otlp-proto-http>=1.20.0
50
+ arize-phoenix>=4.0.0
51
+
52
+ # ===========================================
53
+ # LOCAL WHEEL PACKAGES (Install Separately)
54
+ # ===========================================
55
+ # Install these manually with:
56
+ # pip install openinference_instrumentation_langchain-0.1.56-py3-none-any.whl
57
+ # pip install openinference_instrumentation_openai-0.1.41-py3-none-any.whl
test_dataset_5_questions.json ADDED
@@ -0,0 +1,22 @@
1
+ [
2
+ {
3
+ "question": "ما الطبيعة القانونية لحق العمل في الدستور المصري؟",
4
+ "ground_truth": "حق أساسي/حرية: العمل حق وواجب تكفله الدولة. يُمنع العمل الجبري إلا بقانون ولخدمة عامة وبمقابل عادل."
5
+ },
6
+ {
7
+ "question": "ما حكم التحرش أو التنمر أو العنف ضد العامل في مكان العمل وفق قانون العمل؟",
8
+ "ground_truth": "حظر السخرة والعمل الجبري والتحرش والتنمر والعنف بكافة أشكاله (اللفظي والجسدي والنفسي) ضد العمال، مع تحديد جزاءات تأديبية في لوائح المنشأة."
9
+ },
10
+ {
11
+ "question": "ما المقصود بالتلبس وما أثره الإجرائي بشكل عام؟",
12
+ "ground_truth": "لمأمور الضبط القضائي في التلبس منع الحاضرين من المغادرة حتى تحرير المحضر واستدعاء من يفيد في التحقيق."
13
+ },
14
+ {
15
+ "question": "ما حكم نشر صور أو معلومات تنتهك خصوصية شخص دون رضاه عبر الإنترنت؟",
16
+ "ground_truth": "تجرم المادة الاعتداء على القيم الأسرية أو الخصوصية عبر الرسائل الكثيفة دون موافقة، أو تسليم بيانات للترويج دون موافقة، أو نشر محتوى ينتهك الخصوصية سواء كان صحيحًا أو غير صحيح."
17
+ },
18
+ {
19
+ "question": "ما الشروط العامة لاستحقاق الزوجة النفقة وفق قانون الأحوال الشخصية؟",
20
+ "ground_truth": "تجب النفقة للزوجة من تاريخ العقد الصحيح وتشمل الغذاء والكسوة والمسكن والعلاج. لا تجب النفقة إذا ارتدت أو امتنعت عن تسليم نفسها أو خرجت بدون إذن. نفقة الزوجة دين على الزوج ولها امتياز على أمواله."
21
+ }
22
+ ]
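Each entry in a dataset file like this one must expose the two keys that the evaluation script reads: `"question"` and `"ground_truth"`. A quick schema check (the two-item payload below is a stand-in, not the real dataset):

```python
import json

# Stand-in payload with the same shape as test_dataset_5_questions.json.
sample = json.loads("""
[
  {"question": "q1", "ground_truth": "a1"},
  {"question": "q2", "ground_truth": "a2"}
]
""")

# The eval script expects a top-level list whose items carry both keys.
assert isinstance(sample, list)
assert all({"question", "ground_truth"} <= set(item) for item in sample)
```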