Charles Grandjean committed on
Commit 695b33f · 1 Parent(s): 9a9d495

reorganizing the project

Files changed (38)
  1. CL_LAST_BK.png +0 -0
  2. DEPLOYMENT_GUIDE.md +0 -300
  3. agent_api.py +20 -52
  4. agent_states/agent_state.py +26 -0
  5. pdf_analyzer_state.py → agent_states/pdf_analyzer_state.py +0 -0
  6. data/rag_storage/graph_chunk_entity_relation.graphml +0 -0
  7. data/rag_storage/kv_store_doc_status.json +0 -3
  8. data/rag_storage/kv_store_entity_chunks.json +0 -3
  9. data/rag_storage/kv_store_full_docs.json +0 -3
  10. data/rag_storage/kv_store_full_entities.json +0 -3
  11. data/rag_storage/kv_store_full_relations.json +0 -3
  12. data/rag_storage/kv_store_llm_response_cache.json +0 -3
  13. data/rag_storage/kv_store_relation_chunks.json +0 -3
  14. data/rag_storage/kv_store_text_chunks.json +0 -3
  15. data/rag_storage/vdb_entities.json +0 -3
  16. data/rag_storage/vdb_relationships.json +0 -3
  17. docker-compose.yml +2 -0
  18. langraph_agent.py +5 -5
  19. lightrag.log +0 -51
  20. prompts/__init__.py +1 -0
  21. prompts_lawyer_selector.py → prompts/lawyer_selector.py +32 -1
  22. prompts.py → prompts/main.py +21 -1
  23. prompts_pdf_analyzer.py → prompts/pdf_analyzer.py +0 -0
  24. requirements.txt +1 -0
  25. startup.sh +35 -25
  26. structured_outputs/__init__.py +1 -0
  27. structured_outputs/api_models.py +64 -0
  28. structured_outputs/lawyer_selector.py +19 -0
  29. subagents/__init__.py +1 -0
  30. lawyer_selector.py → subagents/lawyer_selector.py +48 -26
  31. pdf_analyzer.py → subagents/pdf_analyzer.py +2 -2
  32. test_agent.ipynb +0 -152
  33. test_openai_key.ipynb +0 -155
  34. test_tool_calling_demo.ipynb +0 -676
  35. agent_state.py → utils/conversation_manager.py +61 -21
  36. utils.py → utils/lightrag_client.py +4 -144
  37. tools.py → utils/tools.py +27 -38
  38. utils/utils.py +92 -0
CL_LAST_BK.png DELETED
Binary file (11.8 kB)
 
DEPLOYMENT_GUIDE.md DELETED
@@ -1,300 +0,0 @@
- # 🚀 Free Deployment Guide - CyberLegal AI
-
- ## 📋 Free Deployment Options
-
- ### 1. **Render.com** (Recommended for beginners)
- **✅ Advantages**: Simple, free for small apps, automatic deployment
- **📦 Plan**: Free tier (750h/month)
-
- #### Steps:
- ```bash
- # 1. Create a render.yaml file
- cat > render.yaml << 'EOF'
- services:
-   - type: web
-     name: cyberlegal-ai
-     env: docker
-     plan: free
-     dockerfilePath: ./Dockerfile
-     dockerContext: .
-     envVars:
-       - key: OPENAI_API_KEY
-         sync: false
-       - key: LIGHTRAG_HOST
-         value: 0.0.0.0
-       - key: LIGHTRAG_PORT
-         value: 9621
-     healthCheckPath: /health
-     autoDeploy: true
- EOF
-
- # 2. Push to GitHub
- git add .
- git commit -m "Deploy to Render"
- git push origin main
-
- # 3. Connect GitHub to Render.com
- # → New Web Service → Connect GitHub → Select repo
- ```
-
- ### 2. **Railway.app** (Very popular)
- **✅ Advantages**: $5 free credit, native Docker, free database
- **📦 Plan**: $5 credit/month (enough for moderate use)
-
- #### Steps:
- ```bash
- # 1. Install the Railway CLI
- npm install -g @railway/cli
-
- # 2. Log in
- railway login
-
- # 3. Deploy
- railway up
-
- # 4. Set environment variables
- railway variables set OPENAI_API_KEY=your_key_here
- railway variables set LIGHTRAG_HOST=0.0.0.0
- railway variables set LIGHTRAG_PORT=9621
- ```
-
- ### 3. **Fly.io** (For advanced users)
- **✅ Advantages**: 160 free hours/month, Docker, worldwide deployment
- **📦 Plan**: Free tier with shared CPU
-
- #### Steps:
- ```bash
- # 1. Install the Fly CLI
- curl -L https://fly.io/install.sh | sh
-
- # 2. Log in
- fly auth login
-
- # 3. Initialize
- fly launch
-
- # 4. Deploy
- fly deploy
-
- # 5. Set secrets
- fly secrets set OPENAI_API_KEY=your_key_here
- ```
-
- ### 4. **Vercel + Docker** (Alternative)
- **✅ Advantages**: Excellent frontend, easy to use
- **📦 Plan**: Free Hobby plan
-
- #### Steps:
- ```bash
- # 1. Create vercel.json
- cat > vercel.json << 'EOF'
- {
-   "version": 2,
-   "builds": [
-     {
-       "src": "Dockerfile",
-       "use": "@vercel/docker"
-     }
-   ],
-   "routes": [
-     {
-       "src": "/(.*)",
-       "dest": "/"
-     }
-   ]
- }
- EOF
-
- # 2. Deploy
- vercel --prod
- ```
-
- ## 🔧 Required Configuration
-
- ### Essential Environment Variables:
- ```bash
- OPENAI_API_KEY=sk-proj-your_full_key
- LIGHTRAG_HOST=0.0.0.0
- LIGHTRAG_PORT=9621
- ```
-
- ### Changes for the Cloud:
-
- #### 1. **Adapt the Dockerfile for the cloud**
- ```dockerfile
- # Add at the top of the Dockerfile
- FROM python:3.11-slim
-
- # Health check for cloud platforms
- HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
-   CMD curl -f http://localhost:8000/health || exit 1
-
- # Expose the right port
- EXPOSE 8000
- ```
-
- #### 2. **Adapt docker-compose.yml**
- ```yaml
- version: '3.8'
- services:
-   app:
-     build: .
-     ports:
-       - "8000:8000"
-     environment:
-       - PORT=8000
-       - OPENAI_API_KEY=${OPENAI_API_KEY}
- ```
-
- ## 📊 Optimizations for Free Tiers
-
- ### 1. **Reduce resource consumption**
- ```python
- # In langraph_agent.py
- # Limit concurrency
- MAX_ASYNC = 1
- MAX_PARALLEL_INSERT = 1
-
- # Shorter timeout
- LLM_TIMEOUT = 60
- ```
-
- ### 2. **Smart caching**
- ```python
- # Enable the LLM cache
- ENABLE_LLM_CACHE = true
- ```
-
- ### 3. **Docker optimizations**
- ```dockerfile
- # Lighter image
- FROM python:3.11-slim
-
- # Multi-stage build
- COPY requirements.txt .
- RUN pip install --no-cache-dir -r requirements.txt
-
- # Clean up after installation
- RUN apt-get clean && rm -rf /var/lib/apt/lists/*
- ```
-
- ## 🌐 Deploying on Render (Full Tutorial)
-
- ### Step 1: Prepare the project
- ```bash
- # 1. Create render.yaml
- cat > render.yaml << 'EOF'
- services:
-   - type: web
-     name: cyberlegal-ai
-     runtime: docker
-     plan: free
-     dockerfilePath: ./Dockerfile
-     dockerContext: .
-     healthCheckPath: /health
-     envVars:
-       - key: OPENAI_API_KEY
-         sync: false
-       - key: PORT
-         value: 8000
-     autoDeploy: true
- EOF
-
- # 2. Add to git
- git add render.yaml
- git commit -m "Add Render deployment config"
- ```
-
- ### Step 2: Configuration on Render
- 1. **Create an account**: https://render.com
- 2. **Connect GitHub**: Dashboard → New Web Service
- 3. **Select repository**: Your GitHub repo
- 4. **Configure**:
-    - Name: `cyberlegal-ai`
-    - Environment: `Docker`
-    - Plan: `Free`
-    - Health Check: `/health`
-
- ### Step 3: Environment Variables
- In Dashboard → Service → Environment:
- ```
- OPENAI_API_KEY = sk-proj-your_full_key_here
- PORT = 8000
- ```
-
- ### Step 4: Automatic deployment
- - Render detects GitHub pushes
- - Rebuilds automatically
- - Deploys to: `https://cyberlegal-ai.onrender.com`
-
- ## 🔍 Deployment Tests
-
- ### Verify the deployment:
- ```bash
- # Health check
- curl https://your-app.onrender.com/health
-
- # Test chat
- curl -X POST "https://your-app.onrender.com/chat" \
-   -H "Content-Type: application/json" \
-   -d '{"message": "Test question", "role": "user", "jurisdiction": "EU"}'
- ```
-
- ## ⚠️ Free-Tier Limitations
-
- ### Render.com:
- - ✅ 750h/month (enough for 24/7)
- - ✅ 512MB RAM
- - ✅ Docker support
- - ❌ Times out after 15 min of inactivity
-
- ### Railway.app:
- - ✅ $5 credit/month
- - ✅ 512MB RAM
- - ✅ Free database
- - ❌ Sleeps after inactivity
-
- ### Fly.io:
- - ✅ 160h/month
- - ✅ Global deployment
- - ✅ Native Docker
- - ❌ Credit card required (not charged)
-
- ## 🚀 Final Recommendation
-
- **To get started**: Render.com - the simplest and most reliable
-
- **For production**: Railway.app - more features
-
- **For advanced users**: Fly.io - more control
-
- ---
-
- ## 📞 Support and Monitoring
-
- ### Add basic monitoring:
- ```python
- # In agent_api.py
- @app.get("/metrics")
- async def metrics():
-     return {
-         "status": "healthy",
-         "timestamp": datetime.now().isoformat(),
-         "version": "1.0.0"
-     }
- ```
-
- ### Logs with Render:
- - Dashboard → Logs → Your service
- - Real-time debugging
-
- ---
-
- 🎯 **Next steps**:
- 1. Choose your platform (Render recommended)
- 2. Follow the platform-specific tutorial
- 3. Configure the environment variables
- 4. Test the deployment
- 5. Monitor performance
-
- Good luck with your free deployment! 🚀
agent_api.py CHANGED
@@ -17,13 +17,18 @@ from fastapi import Depends
 from fastapi.security import APIKeyHeader
 import secrets
 
- from langraph_agent import CyberLegalAgent
- from agent_state import ConversationManager
- from utils import validate_query, LightRAGClient
- import tools
- from lawyer_selector import LawyerSelectorAgent
- from prompts import SYSTEM_PROMPT_CLIENT, SYSTEM_PROMPT_LAWYER
- from pdf_analyzer import PDFAnalyzerAgent
+ from structured_outputs.api_models import (
+     Message, DocumentAnalysis, ChatRequest, ChatResponse,
+     HealthResponse, AnalyzePDFRequest, AnalyzePDFResponse
+ )
+ from langgraph_agent import CyberLegalAgent
+ from utils.conversation_manager import ConversationManager
+ from utils.utils import validate_query
+ from utils.lightrag_client import LightRAGClient
+ from utils import tools
+ from subagents.lawyer_selector import LawyerSelectorAgent
+ from prompts.main import SYSTEM_PROMPT_CLIENT, SYSTEM_PROMPT_LAWYER
+ from subagents.pdf_analyzer import PDFAnalyzerAgent
 from langchain_openai import ChatOpenAI
 from mistralai import Mistral
 import logging
@@ -32,6 +37,7 @@ import base64
 import tempfile
 import os as pathlib
 from langchain_tavily import TavilySearch
+ import resend
 
 # Load environment variables
 load_dotenv(dotenv_path=".env", override=False)
@@ -62,47 +68,6 @@ def require_password(x_api_key: str = Depends(api_key_header)):
     if x_api_key and secrets.compare_digest(x_api_key, API_PASSWORD):
         return
     raise HTTPException(status_code=401, detail="Unauthorized")
- # Pydantic models for request/response
- class Message(BaseModel):
-     role: str = Field(..., description="Role: 'user' or 'assistant'")
-     content: str = Field(..., description="Message content")
- class DocumentAnalysis(BaseModel):
-     file_name: str
-     summary: Optional[str]
-     actors: Optional[str]
-     key_details: Optional[str]
- class ChatRequest(BaseModel):
-     message: str = Field(..., description="User's question")
-     conversationHistory: Optional[List[Message]] = Field(default=[], description="Previous conversation messages")
-     userType: Optional[str] = Field(default="client", description="User type: 'client' for general users or 'lawyer' for legal professionals")
-     jurisdiction: Optional[str] = Field(default="Romania", description="Jurisdiction of the user")
-     documentAnalyses: Optional[List[DocumentAnalysis]] = Field(default=None, description="Lawyer's document analyses")
-
- class ChatResponse(BaseModel):
-     response: str = Field(..., description="Assistant's response")
-     processing_time: float = Field(..., description="Processing time in seconds")
-     references: List[str] = Field(default=[], description="Referenced documents")
-     timestamp: str = Field(..., description="Response timestamp")
-     error: Optional[str] = Field(None, description="Error message if any")
-
- class HealthResponse(BaseModel):
-     status: str = Field(..., description="Health status")
-     agent_ready: bool = Field(..., description="Whether agent is ready")
-     lightrag_healthy: bool = Field(..., description="Whether LightRAG is healthy")
-     timestamp: str = Field(..., description="Health check timestamp")
-
- class AnalyzePDFRequest(BaseModel):
-     pdf_content: str = Field(..., description="Base64 encoded document content (PDF or image)")
-     filename: Optional[str] = Field(default="document.pdf", description="Original filename")
-
- class AnalyzePDFResponse(BaseModel):
-     actors: str = Field(..., description="Extracted actors")
-     key_details: str = Field(..., description="Key details extracted")
-     summary: str = Field(..., description="High-level summary")
-     processing_status: str = Field(..., description="Processing status")
-     processing_time: float = Field(..., description="Processing time in seconds")
-     timestamp: str = Field(..., description="Analysis timestamp")
-     error: Optional[str] = Field(None, description="Error message if any")
 
 # Global agent instance
 agent_instance = None
@@ -149,6 +114,10 @@ class CyberLegalAPI:
         )
         tools.tavily_search = tavily_search
         logger.info("✅ Tavily search client initialized")
+
+         # Initialize Resend
+         resend.api_key = os.getenv("RESEND_API_KEY")
+         logger.info("✅ Resend client initialized")
 
         self.agent_client = CyberLegalAgent(llm=llm, system_prompt=SYSTEM_PROMPT_CLIENT, tools=tools.tools_for_client)
         self.agent_lawyer = CyberLegalAgent(llm=llm, system_prompt=SYSTEM_PROMPT_LAWYER, tools=tools.tools_for_lawyer)
@@ -258,7 +227,7 @@ class CyberLegalAPI:
         Check health status of the API and dependencies
         """
         try:
-             from utils import LightRAGClient
+             from utils.lightrag_client import LightRAGClient
             lightrag_client = LightRAGClient()
             lightrag_healthy = lightrag_client.health_check()
 
@@ -411,8 +380,7 @@ async def root():
     """
     llm_provider = os.getenv("LLM_PROVIDER", "openai").upper()
     technology_map = {
-         "OPENAI": "LangGraph + LightRAG + GPT-5-Nano",
-         "GEMINI": "LangGraph + LightRAG + Gemini 1.5 Flash"
+         "OPENAI": "LangGraph + RAG + Cerebras (GPT-5-Nano)"
     }
 
     return {
@@ -420,7 +388,7 @@ async def root():
         "version": "1.0.0",
         "description": "LangGraph-powered cyber-legal assistant API",
         "llm_provider": llm_provider,
-         "technology": technology_map.get(llm_provider, "LangGraph + LightRAG"),
+         "technology": technology_map.get(llm_provider, "LangGraph + RAG + Cerebras"),
         "endpoints": {
             "chat": "POST /chat - Chat with the assistant",
            "analyze-pdf": "POST /analyze-pdf - Analyze PDF document",
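The `require_password` dependency kept in this diff authenticates with `secrets.compare_digest`, which compares keys in constant time. A minimal stdlib sketch of the same check (the password value here is hypothetical; in the API it comes from the environment):

```python
import secrets

API_PASSWORD = "example-password"  # hypothetical value for illustration only

def is_authorized(x_api_key: str) -> bool:
    # Constant-time comparison avoids leaking information about the
    # expected key through timing differences.
    return bool(x_api_key) and secrets.compare_digest(x_api_key, API_PASSWORD)

print(is_authorized("example-password"))  # True
print(is_authorized("wrong-key"))         # False
```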
agent_states/agent_state.py ADDED
@@ -0,0 +1,26 @@
+ #!/usr/bin/env python3
+ """
+ Agent state management for the LangGraph cyber-legal assistant
+ """
+
+ from typing import TypedDict, List, Dict, Any, Optional
+ from datetime import datetime
+
+
+ class AgentState(TypedDict):
+     """
+     State definition for the LangGraph agent workflow
+     """
+     # User interaction
+     user_query: str
+     conversation_history: List[Dict[str, str]]
+     intermediate_steps: List[Dict[str, Any]]
+     system_prompt: Optional[str]
+
+     # Context processing
+     relevant_documents: List[str]
+
+     # Metadata
+     query_timestamp: str
+     processing_time: Optional[float]
+     jurisdiction: Optional[str]
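Because `AgentState` is a `TypedDict`, instances are plain dicts that type checkers can verify against the declared fields. A quick sketch of constructing one (the field values are illustrative, not taken from the repo):

```python
from datetime import datetime
from typing import TypedDict, List, Dict, Any, Optional

# Same shape as the AgentState added in this commit
class AgentState(TypedDict):
    user_query: str
    conversation_history: List[Dict[str, str]]
    intermediate_steps: List[Dict[str, Any]]
    system_prompt: Optional[str]
    relevant_documents: List[str]
    query_timestamp: str
    processing_time: Optional[float]
    jurisdiction: Optional[str]

# At runtime this is just a dict; the annotation documents the schema
state: AgentState = {
    "user_query": "Does GDPR apply to my startup?",
    "conversation_history": [],
    "intermediate_steps": [],
    "system_prompt": None,
    "relevant_documents": [],
    "query_timestamp": datetime.now().isoformat(),
    "processing_time": None,
    "jurisdiction": "Romania",
}
print(state["jurisdiction"])  # Romania
```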
pdf_analyzer_state.py → agent_states/pdf_analyzer_state.py RENAMED
File without changes
data/rag_storage/graph_chunk_entity_relation.graphml DELETED
The diff for this file is too large to render. See raw diff
 
data/rag_storage/kv_store_doc_status.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:309b48238b8853a5e1380e8202534eb896da93561462b947a1ee648adc7cf73d
- size 35812

data/rag_storage/kv_store_entity_chunks.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:515c0e6f05ae32ab87d73d8c09b30324c7601b6dcf994f78ab6dca24a2c468e6
- size 1289498

data/rag_storage/kv_store_full_docs.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:7a2ce58b26969dae661a9808be1fa20bd972ce18d15bcaabedd1219b878d7ba2
- size 2864044

data/rag_storage/kv_store_full_entities.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:f5c4388472db8b3f3519c4bf78f08dbedf94a44abafed2517fa2cca73b8350b5
- size 175853

data/rag_storage/kv_store_full_relations.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:8ac82d39371f490247f2f9b961624c2d72568b7087a5477f927ca756e83359b4
- size 350564

data/rag_storage/kv_store_llm_response_cache.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:2b530175acd7dd02605748a846c988bdec8651b0bb6090d6af808c15824dacce
- size 31129737

data/rag_storage/kv_store_relation_chunks.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:d9986b500dfb603ad44c42d9d5d88bff416e3d4eca47dea0e9c8748cef5ad1a4
- size 1184276

data/rag_storage/kv_store_text_chunks.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:66a70cbdef477b1ab7b66b29fbf156fce65e8dd50de8c5698d177e01690b1edb
- size 3428391

data/rag_storage/vdb_entities.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:fa886e92b33a43033ce138121f12574e10d35e3cd2acc48b9bcda8bdab412258
- size 124984482

data/rag_storage/vdb_relationships.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:dc4dad69ba410410554fcebaa3595ced6b520e26a301029a72d31d432d08bde9
- size 109748156
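The deleted `data/rag_storage` files above are Git LFS pointer stubs: three `key value` lines (`version`, `oid`, `size`) standing in for the real binary content. A small sketch of parsing that format, assuming the file follows the LFS pointer layout shown in the diff:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", split on the first space
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:309b48238b8853a5e1380e8202534eb896da93561462b947a1ee648adc7cf73d
size 35812"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # 35812
```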
docker-compose.yml CHANGED
@@ -9,6 +9,8 @@ services:
     env_file:
       - .env # Load environment variables from .env file
     environment:
+       - LIGHTRAG_GRAPHS=romania:9621,bahrain:9622
+       - LIGHTRAG_STORAGE_ROOT=data/rag_storage
       - LIGHTRAG_HOST=127.0.0.1
       - LIGHTRAG_PORT=9621
       - API_PORT=8000
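The new `LIGHTRAG_GRAPHS=romania:9621,bahrain:9622` variable appears to map jurisdictions to per-graph LightRAG ports. A hypothetical parser for that comma-separated `name:port` format (the variable name and value come from the diff; the parsing logic itself is an assumption about how the app consumes it):

```python
import os

def parse_graphs(value: str) -> dict:
    """Parse 'name:port,name:port' into a jurisdiction -> port mapping."""
    graphs = {}
    for entry in value.split(","):
        name, _, port = entry.strip().partition(":")
        graphs[name] = int(port)
    return graphs

os.environ.setdefault("LIGHTRAG_GRAPHS", "romania:9621,bahrain:9622")
print(parse_graphs(os.environ["LIGHTRAG_GRAPHS"]))
# {'romania': 9621, 'bahrain': 9622}
```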
langraph_agent.py CHANGED
@@ -15,14 +15,14 @@ from langchain_google_genai import ChatGoogleGenerativeAI
 from langchain_core.messages import HumanMessage, SystemMessage, AIMessage, ToolMessage
 
 logger = logging.getLogger(__name__)
- from agent_state import AgentState
- from prompts import SYSTEM_PROMPT, SYSTEM_PROMPT_CLIENT, SYSTEM_PROMPT_LAWYER
- from utils import LightRAGClient, PerformanceMonitor
- from tools import tools, tools_for_client, tools_for_lawyer
+ from agent_states.agent_state import AgentState
+ from utils.utils import PerformanceMonitor
+ from utils.lightrag_client import LightRAGClient
+ from utils.tools import tools, tools_for_client, tools_for_lawyer
 
 
 class CyberLegalAgent:
-     def __init__(self, llm, system_prompt: str = SYSTEM_PROMPT_CLIENT, tools: List[Any] = tools):
+     def __init__(self, llm, tools: List[Any] = tools):
         self.tools = tools
         self.llm = llm
         self.performance_monitor = PerformanceMonitor()
lightrag.log DELETED
@@ -1,51 +0,0 @@
- 2025-12-16 00:49:52,478 - lightrag - INFO - OpenAI LLM Options: {'max_completion_tokens': 9000}
- 2025-12-16 00:49:52,478 - lightrag - INFO - Reranking is disabled
- 2025-12-16 00:49:52,769 - lightrag - INFO - [_] Created new empty graph file: /Users/cgrdj/Documents/Code/Cyberlgl/test_minimal/rag_storage/graph_chunk_entity_relation.graphml
- 2025-12-16 00:49:52,862 - uvicorn.error - INFO - Started server process [43560]
- 2025-12-16 00:49:52,862 - uvicorn.error - INFO - Waiting for application startup.
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load full_docs with 7 records
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load text_chunks with 0 records
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load full_entities with 0 records
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load full_relations with 0 records
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load entity_chunks with 0 records
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load relation_chunks with 0 records
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load llm_response_cache with 0 records
- 2025-12-16 00:49:52,872 - lightrag - INFO - [_] Process 43560 doc status load doc_status with 7 records
- 2025-12-16 00:49:52,872 - uvicorn.error - INFO - Application startup complete.
- 2025-12-16 00:49:52,872 - uvicorn.error - INFO - Uvicorn running on http://127.0.0.1:9621 (Press CTRL+C to quit)
- 2025-12-16 01:46:32,747 - uvicorn.access - INFO - 127.0.0.1:50864 - "GET / HTTP/1.1" 307
- 2025-12-16 01:46:32,749 - uvicorn.access - INFO - 127.0.0.1:50864 - "GET /webui HTTP/1.1" 307
- 2025-12-16 01:46:32,771 - uvicorn.access - INFO - 127.0.0.1:50864 - "GET /webui/assets/index-CRtuqff2.js HTTP/1.1" 200
- 2025-12-16 01:46:32,777 - uvicorn.access - INFO - 127.0.0.1:50865 - "GET /webui/assets/index-C8dNBpcg.css HTTP/1.1" 200
- 2025-12-16 01:46:33,013 - uvicorn.access - INFO - 127.0.0.1:50865 - "GET /auth-status HTTP/1.1" 200
- 2025-12-16 01:46:33,252 - uvicorn.access - INFO - 127.0.0.1:50865 - "GET /docs HTTP/1.1" 200
- 2025-12-16 01:46:33,268 - lightrag - INFO - [_] Subgraph query successful | Node count: 0 | Edge count: 0
- 2025-12-16 01:46:33,269 - uvicorn.access - INFO - 127.0.0.1:50864 - "GET /graphs?label=*&max_depth=3&max_nodes=1000 HTTP/1.1" 200
- 2025-12-16 01:46:33,298 - uvicorn.access - INFO - 127.0.0.1:50865 - "GET /static/swagger-ui/swagger-ui.css HTTP/1.1" 200
- 2025-12-16 01:46:33,298 - uvicorn.access - INFO - 127.0.0.1:50864 - "GET /static/swagger-ui/swagger-ui-bundle.js HTTP/1.1" 200
- 2025-12-16 01:46:33,462 - uvicorn.access - INFO - 127.0.0.1:50864 - "GET /openapi.json HTTP/1.1" 200
- 2025-12-16 01:46:40,237 - uvicorn.error - INFO - Shutting down
- 2025-12-16 01:46:40,338 - uvicorn.error - INFO - Waiting for application shutdown.
- 2025-12-16 01:46:40,339 - lightrag - INFO - Successfully finalized 12 storages
- 2025-12-16 01:46:40,339 - uvicorn.error - INFO - Application shutdown complete.
- 2025-12-16 01:46:40,339 - uvicorn.error - INFO - Finished server process [43560]
- 2025-12-16 01:46:50,846 - lightrag - INFO - OpenAI LLM Options: {'max_completion_tokens': 9000}
- 2025-12-16 01:46:50,847 - lightrag - INFO - Reranking is disabled
- 2025-12-16 01:46:51,158 - lightrag - INFO - [_] Created new empty graph file: /Users/cgrdj/Documents/Code/Cyberlgl/test_minimal/rag_storage/graph_chunk_entity_relation.graphml
- 2025-12-16 01:46:51,271 - uvicorn.error - INFO - Started server process [12197]
- 2025-12-16 01:46:51,271 - uvicorn.error - INFO - Waiting for application startup.
- 2025-12-16 01:46:51,280 - lightrag - INFO - [_] Process 12197 KV load full_docs with 7 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 KV load text_chunks with 0 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 KV load full_entities with 0 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 KV load full_relations with 0 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 KV load entity_chunks with 0 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 KV load relation_chunks with 0 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 KV load llm_response_cache with 0 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 doc status load doc_status with 7 records
- 2025-12-16 01:46:51,282 - uvicorn.error - INFO - Application startup complete.
- 2025-12-16 01:46:51,282 - uvicorn.error - INFO - Uvicorn running on http://127.0.0.1:9621 (Press CTRL+C to quit)
- 2025-12-16 01:47:02,388 - uvicorn.error - INFO - Shutting down
- 2025-12-16 01:47:02,490 - uvicorn.error - INFO - Waiting for application shutdown.
- 2025-12-16 01:47:02,491 - lightrag - INFO - Successfully finalized 12 storages
- 2025-12-16 01:47:02,491 - uvicorn.error - INFO - Application shutdown complete.
- 2025-12-16 01:47:02,491 - uvicorn.error - INFO - Finished server process [12197]
prompts/__init__.py ADDED
@@ -0,0 +1 @@
+ """Prompt templates for agents"""
prompts_lawyer_selector.py → prompts/lawyer_selector.py RENAMED
@@ -4,7 +4,7 @@ Prompts for lawyer selection agent
 """
 
 LAWYER_SELECTION_PROMPT = """### Task
- Based on the conversation above, select the TOP 3 lawyers who are most suitable to handle this case.
+ Based on the conversation above, select 0 to 3 lawyers who are most suitable to handle this case.
 
 ### Available Lawyers
 {lawyers}
@@ -15,5 +15,36 @@ Consider:
 2. Experience level
 3. Presentation and expertise description
 
+ ### Important Rules
+ - Select 0 lawyers if the legal issue doesn't match any available lawyer's expertise
+ - Select up to 3 lawyers, ranked by relevance (1 = most suitable)
+ - Only select lawyers whose areas of practice clearly align with the case
+ - Do not hallucinate lawyers that don't exist, just output a list of ids of the lawyers you selected
+
+ ### Response Format
+ Return a structured response with:
+ - rankings: list of 0-3 lawyer selections, each containing:
+   - reasoning: client-friendly explanation of why this lawyer is suitable
+   - lawyer_index: the unique index of the lawyer
+   - rank: position in the recommendation list (1, 2, 3)
+
+ Example response format:
+ ```json
+ {{
+   "rankings": [
+     {{
+       "reasoning": "This lawyer specializes in data protection and has experience with GDPR compliance...",
+       "lawyer_index": 3,
+       "rank": 1
+     }},
+     {{
+       "reasoning": "This lawyer has expertise in cyber law and can help with data breach incidents...",
+       "lawyer_index": 1,
+       "rank": 2
+     }}
+   ]
+ }}
+ ```
+
 ### Important Note
 For each lawyer selected, explain in CLIENT-FRIENDLY language how they can help with the specific legal problem. Focus on benefits to the client, not just technical details. Use clear, accessible language that a non-lawyer can understand."""
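The rankings format described in the updated prompt can be checked with the stdlib `json` module alone; a sketch of such a validator (the example payload mirrors the prompt's format, while the specific validation rules are assumptions drawn from the "Important Rules" section):

```python
import json

def validate_rankings(payload: str, num_lawyers: int) -> list:
    """Check that a model response matches the 0-3 ranked-selection format."""
    data = json.loads(payload)
    rankings = data["rankings"]
    assert 0 <= len(rankings) <= 3, "at most 3 selections allowed"
    for i, item in enumerate(rankings, start=1):
        assert item["rank"] == i, "ranks must be 1..n in order"
        assert 0 <= item["lawyer_index"] < num_lawyers, "index must refer to a real lawyer"
        assert item["reasoning"], "each selection needs a client-facing explanation"
    return rankings

response = '{"rankings": [{"reasoning": "GDPR specialist...", "lawyer_index": 3, "rank": 1}]}'
print(len(validate_rankings(response, num_lawyers=5)))  # 1
```

An empty `rankings` list is valid here, matching the prompt's allowance for selecting 0 lawyers.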
prompts.py → prompts/main.py RENAMED
@@ -13,6 +13,7 @@ Client Jurisdiction: {jurisdiction}
13
  1. **query_knowledge_graph**: Search legal documents (GDPR, NIS2, DORA, etc.) to answer questions about EU cyber regulations and directives.
14
  2. **find_lawyers**: Recommend suitable lawyers based on the user's legal issue and conversation context.
15
  3. **search_web**: Search the web for current information, recent legal updates, court decisions, or news that may not be in the knowledge graph.
 
16
 
17
  ### Tool-Calling Process
18
  You operate in an iterative loop:
@@ -27,12 +28,31 @@ You operate in an iterative loop:
27
  1. Your responses should be clear, friendly, and provide practical, actionable answers to the user's question.
28
  2. Use simple language and avoid excessive legal jargon. When legal terms are necessary, explain them in plain terms.
29
  3. When answering legal questions, use the query_knowledge_graph tool to provide accurate, up-to-date information from official EU legal sources.
30
- 4. If you use specific knowledge from a regulation or directive, reference it in your response and explain what it means in practical terms. Create a section at the end of your response called "References" that lists the source documents used to answer the user's question.
31
  5. When users ask for lawyer recommendations or legal representation, use the find_lawyers tool to provide suitable lawyer suggestions.
32
  6. Before calling find_lawyers, ask enough details about the case to understand the problem and provide context for the lawyer selection process if needed.
33
  7. If the user's question can be answered with your general knowledge, respond directly without calling tools.
34
  8. Remember: Your final response is sent to the user when you stop calling tools.
35
36
  ### Tone
37
  - Approachable and supportive
38
  - Focus on practical implications for the user
 
13
  1. **query_knowledge_graph**: Search legal documents (GDPR, NIS2, DORA, etc.) to answer questions about EU cyber regulations and directives.
14
  2. **find_lawyers**: Recommend suitable lawyers based on the user's legal issue and conversation context.
15
  3. **search_web**: Search the web for current information, recent legal updates, court decisions, or news that may not be in the knowledge graph.
16
+ 4. **send_email**: Send an email to a recipient. This tool has STRICT usage requirements (see below).
17
 
18
  ### Tool-Calling Process
19
  You operate in an iterative loop:
 
28
  1. Your responses should be clear, friendly, and provide practical, actionable answers to the user's question.
29
  2. Use simple language and avoid excessive legal jargon. When legal terms are necessary, explain them in plain terms.
30
  3. When answering legal questions, use the query_knowledge_graph tool to provide accurate, up-to-date information from official EU legal sources.
31
+ 4. If you use specific knowledge from a regulation or directive, reference it in your response and explain what it means in practical terms. Create a section at the end of your response called "References" that lists the source documents used to answer the user's question.
32
  5. When users ask for lawyer recommendations or legal representation, use the find_lawyers tool to provide suitable lawyer suggestions.
33
  6. Before calling find_lawyers, ask enough details about the case to understand the problem and provide context for the lawyer selection process if needed.
34
  7. If the user's question can be answered with your general knowledge, respond directly without calling tools.
35
  8. Remember: Your final response is sent to the user when you stop calling tools.
36
 
37
+ ### Email Tool Usage Requirements
38
+ **CRITICAL**: The send_email tool MUST ONLY be used in the following specific workflow:
39
+
40
+ 1. User asks for a lawyer's help
41
+ 2. Agent uses the find_lawyers tool to recommend lawyers, and asks the user if they want to contact one of the recommended lawyers
42
+ 3. User confirms they want to contact a specific lawyer
43
+ 4. Agent proposes an email draft to contact that lawyer
44
+ 5. User agrees to send the email
45
+ 6. Agent can call the send_email tool
46
+
47
+ **Prohibited usage:**
48
+ - DO NOT use send_email in any other context
49
+ - DO NOT use send_email before having a list of lawyer recommendations
50
+ - DO NOT use send_email without user agreement
51
+ - DO NOT use send_email to send emails to anyone other than lawyers from the recommendations
52
+ - DO NOT use send_email proactively or spontaneously
53
+
54
+ The send_email tool is exclusively for contacting lawyers after the user explicitly requests and approves the action.
55
+
56
  ### Tone
57
  - Approachable and supportive
58
  - Focus on practical implications for the user
prompts_pdf_analyzer.py → prompts/pdf_analyzer.py RENAMED
File without changes
requirements.txt CHANGED
@@ -22,3 +22,4 @@ uvicorn[standard]>=0.24.0
22
  pydantic>=2.0.0
23
  typing-extensions>=4.0.0
24
  langchain-tavily>=0.2.16
 
 
22
  pydantic>=2.0.0
23
  typing-extensions>=4.0.0
24
  langchain-tavily>=0.2.16
25
+ resend>=0.8.0
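The new `resend` dependency backs the send_email tool declared in prompts/main.py. As a hedged sketch (the helper name and sender address are hypothetical, not from this commit), the tool could assemble a payload in the shape `resend.Emails.send()` accepts:

```python
# Hypothetical helper for the send_email tool. The sender address is a
# placeholder; the dict keys mirror the payload shape documented for
# resend's Python SDK (resend.Emails.send()).
def build_email_params(to: str, subject: str, body: str) -> dict:
    return {
        "from": "noreply@example.com",  # placeholder sender domain
        "to": [to],
        "subject": subject,
        "html": f"<p>{body}</p>",
    }
```

The actual send would set `resend.api_key` from the environment and pass this dict to `resend.Emails.send()`, gated behind the confirmation workflow described in the prompt.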
startup.sh CHANGED
@@ -1,31 +1,41 @@
1
  #!/usr/bin/env bash
2
  set -euo pipefail
3
 
4
- LIGHTRAG_HOST="${LIGHTRAG_HOST:-127.0.0.1}"
5
- LIGHTRAG_PORT="${LIGHTRAG_PORT:-9621}"
6
- PUBLIC_PORT="${PORT:-${API_PORT:-8000}}"
7
-
8
- echo "🚀 Starting CyberLegal AI Stack..."
9
- echo "Step 1: Starting LightRAG server on ${LIGHTRAG_HOST}:${LIGHTRAG_PORT} ..."
10
-
11
- lightrag-server --host "${LIGHTRAG_HOST}" --port "${LIGHTRAG_PORT}" &
12
- LIGHTRAG_PID=$!
13
-
14
- cleanup() {
15
- kill -TERM "${LIGHTRAG_PID}" 2>/dev/null || true
16
- wait "${LIGHTRAG_PID}" 2>/dev/null || true
17
- }
18
- trap cleanup EXIT INT TERM
19
-
20
- echo "Waiting for LightRAG server to be ready..."
21
- for i in {1..30}; do
22
- if curl -fsS "http://${LIGHTRAG_HOST}:${LIGHTRAG_PORT}/health" >/dev/null 2>&1; then
23
- echo " LightRAG is ready!"
24
- break
25
- fi
26
- sleep 2
 
27
  done
28
 
29
- export PORT="${PUBLIC_PORT}"
30
- echo "Step 2: Starting API on 0.0.0.0:${PORT} ..."
 
 
 
31
  python agent_api.py
 
1
  #!/usr/bin/env bash
2
  set -euo pipefail
3
 
4
+ HOST="${LIGHTRAG_HOST:-127.0.0.1}"
5
+ ROOT="${LIGHTRAG_STORAGE_ROOT:-data/rag_storage}"
6
+ GRAPHS="${LIGHTRAG_GRAPHS:-romania:9621}"
7
+ API_PORT="${PORT:-${API_PORT:-8000}}"
8
+
9
+ echo "🚀 LIGHTRAG_GRAPHS=${GRAPHS}"
10
+ echo "📁 Storage root: ${ROOT}"
11
+
12
+ PIDS=()
13
+ trap 'kill -TERM ${PIDS[@]:-} 2>/dev/null || true; wait ${PIDS[@]:-} 2>/dev/null || true' EXIT INT TERM
14
+
15
+ ENDPOINTS=()
16
+
17
+ IFS=',' read -r -a items <<<"${GRAPHS}"
18
+ for item in "${items[@]}"; do
19
+ IFS=':' read -r id port <<<"${item}"
20
+ dir="${ROOT}/${id}"
21
+ mkdir -p "${dir}"
22
+
23
+ echo "➡️ Start LightRAG '${id}' on ${HOST}:${port} (dir=${dir})"
24
+ lightrag-server --host "${HOST}" --port "${port}" --working-dir "${dir}" &
25
+ PIDS+=("$!")
26
+
27
+ echo "   ⏳ Waiting for health check..."
28
+ for _ in {1..30}; do
29
+ curl -fsS "http://${HOST}:${port}/health" >/dev/null 2>&1 && { echo " ✅ ${id} ready"; break; }
30
+ sleep 2
31
+ done
32
+
33
+ ENDPOINTS+=("${id}=http://${HOST}:${port}")
34
  done
35
 
36
+ export LIGHTRAG_ENDPOINTS="$(IFS=,; echo "${ENDPOINTS[*]}")"
37
+ export PORT="${API_PORT}"
38
+
39
+ echo "✅ LIGHTRAG_ENDPOINTS=${LIGHTRAG_ENDPOINTS}"
40
+ echo "🚀 Starting API on 0.0.0.0:${PORT} ..."
41
  python agent_api.py
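On the Python side, agent_api.py can recover the per-graph servers from the `LIGHTRAG_ENDPOINTS` variable exported above (`id=url,id=url`). A minimal sketch, assuming a hypothetical parser function (not part of this commit):

```python
import os
from typing import Dict, Optional

def parse_lightrag_endpoints(raw: Optional[str] = None) -> Dict[str, str]:
    """Parse 'romania=http://127.0.0.1:9621,...' into {graph_id: base_url}."""
    if raw is None:
        raw = os.environ.get("LIGHTRAG_ENDPOINTS", "")
    endpoints: Dict[str, str] = {}
    for item in filter(None, raw.split(",")):
        # partition splits on the first '=' only, so URLs stay intact
        graph_id, _, url = item.partition("=")
        endpoints[graph_id] = url
    return endpoints
```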
structured_outputs/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Structured output definitions for agents"""
structured_outputs/api_models.py ADDED
@@ -0,0 +1,64 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Structured outputs for Agent API
4
+ """
5
+
6
+ from typing import List, Optional
7
+ from pydantic import BaseModel, Field
8
+
9
+
10
+ class Message(BaseModel):
11
+ """Chat message"""
12
+ role: str = Field(..., description="Role: 'user' or 'assistant'")
13
+ content: str = Field(..., description="Message content")
14
+
15
+
16
+ class DocumentAnalysis(BaseModel):
17
+ """Document analysis result"""
18
+ file_name: str
19
+ summary: Optional[str] = None
20
+ actors: Optional[str] = None
21
+ key_details: Optional[str] = None
22
+
23
+
24
+ class ChatRequest(BaseModel):
25
+ """Chat request model"""
26
+ message: str = Field(..., description="User's question")
27
+ conversationHistory: Optional[List[Message]] = Field(default=[], description="Previous conversation messages")
28
+ userType: Optional[str] = Field(default="client", description="User type: 'client' for general users or 'lawyer' for legal professionals")
29
+ jurisdiction: Optional[str] = Field(default="Romania", description="Jurisdiction of the user")
30
+ documentAnalyses: Optional[List[DocumentAnalysis]] = Field(default=None, description="Lawyer's document analyses")
31
+
32
+
33
+ class ChatResponse(BaseModel):
34
+ """Chat response model"""
35
+ response: str = Field(..., description="Assistant's response")
36
+ processing_time: float = Field(..., description="Processing time in seconds")
37
+ references: List[str] = Field(default=[], description="Referenced documents")
38
+ timestamp: str = Field(..., description="Response timestamp")
39
+ error: Optional[str] = Field(None, description="Error message if any")
40
+
41
+
42
+ class HealthResponse(BaseModel):
43
+ """Health check response model"""
44
+ status: str = Field(..., description="Health status")
45
+ agent_ready: bool = Field(..., description="Whether agent is ready")
46
+ lightrag_healthy: bool = Field(..., description="Whether LightRAG is healthy")
47
+ timestamp: str = Field(..., description="Health check timestamp")
48
+
49
+
50
+ class AnalyzePDFRequest(BaseModel):
51
+ """PDF analysis request model"""
52
+ pdf_content: str = Field(..., description="Base64 encoded document content (PDF or image)")
53
+ filename: Optional[str] = Field(default="document.pdf", description="Original filename")
54
+
55
+
56
+ class AnalyzePDFResponse(BaseModel):
57
+ """PDF analysis response model"""
58
+ actors: str = Field(..., description="Extracted actors")
59
+ key_details: str = Field(..., description="Key details extracted")
60
+ summary: str = Field(..., description="High-level summary")
61
+ processing_status: str = Field(..., description="Processing status")
62
+ processing_time: float = Field(..., description="Processing time in seconds")
63
+ timestamp: str = Field(..., description="Analysis timestamp")
64
+ error: Optional[str] = Field(None, description="Error message if any")
structured_outputs/lawyer_selector.py ADDED
@@ -0,0 +1,19 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Structured outputs for Lawyer Selector Agent
4
+ """
5
+
6
+ from typing import List
7
+ from pydantic import BaseModel, Field
8
+
9
+
10
+ class LawyerRanking(BaseModel):
11
+ """Individual lawyer ranking"""
12
+ reasoning: str = Field(description="Client-friendly explanation of how this lawyer can help with their specific legal problem")
13
+ rank: int = Field(description="1, 2, or 3")
14
+ lawyer_index: int = Field(description="Lawyer number from 1 to N")
15
+
16
+
17
+ class LawyerRankings(BaseModel):
18
+ """Collection of lawyer rankings"""
19
+ rankings: List[LawyerRanking] = Field(description="List of 0 to 3 lawyer rankings", min_length=0, max_length=3)
subagents/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Subagent implementations for the main agent"""
lawyer_selector.py → subagents/lawyer_selector.py RENAMED
@@ -14,30 +14,17 @@ from langchain_core.messages import HumanMessage, SystemMessage, AIMessage
14
  from langchain_core.output_parsers import PydanticOutputParser
15
  from pydantic import BaseModel, Field
16
 
17
- from prompts_lawyer_selector import LAWYER_SELECTION_PROMPT
 
18
 
19
  load_dotenv()
20
 
21
 
22
- class LawyerRanking(BaseModel):
23
- reasoning: str = Field(description="Client-friendly explanation of how this lawyer can help with their specific legal problem")
24
- rank: int = Field(description="1, 2, or 3")
25
- lawyer_index: int = Field(description="Lawyer number from 1 to N")
26
-
27
-
28
- class LawyerRankings(BaseModel):
29
- rankings: List[LawyerRanking] = Field(description="List of top 3 lawyer rankings")
30
-
31
-
32
  class LawyerSelectorAgent:
33
  """Simple agent that analyzes conversations and selects top 3 lawyers"""
34
 
35
  def __init__(self, llm):
36
  self.llm = llm
37
-
38
- with open("data/lawyers.json", "r", encoding="utf-8") as f:
39
- self.lawyers = json.load(f)
40
-
41
  self.parser = PydanticOutputParser(pydantic_object=LawyerRankings)
42
  self.workflow = self._build_workflow()
43
 
@@ -48,14 +35,34 @@ class LawyerSelectorAgent:
48
  workflow.add_edge("select_lawyers", END)
49
  return workflow.compile()
50
 
51
- def _format_lawyers(self) -> str:
52
  return "\n\n".join([
53
- f"Lawyer {i}:\n- Name: {l['name']}\n- Specialty: {l['specialty']}\n- Experience: {l['experience_years']} years\n- Areas: {', '.join(l['areas_of_practice'])}"
54
- for i, l in enumerate(self.lawyers, 1)
55
  ])
56
 
57
  async def _select_lawyers(self, state: dict) -> dict:
58
- lawyers_text = self._format_lawyers()
 
59
  prompt = LAWYER_SELECTION_PROMPT.format(lawyers=lawyers_text)
60
 
61
  # Convert message dicts to Message objects
@@ -78,15 +85,30 @@ class LawyerSelectorAgent:
78
  result = self.parser.parse(response.content)
79
  rankings = result.rankings
80
 
81
- state["top_lawyers"] = [
82
- {**self.lawyers[r.lawyer_index - 1], **r.model_dump()}
83
- for r in rankings
84
- ]
85
  return state
86
 
87
- async def select_lawyers(self, conversation_history: List[dict]) -> dict:
88
- result = await self.workflow.ainvoke({"messages": conversation_history})
89
- return {"top_lawyers": result["top_lawyers"]}
90
 
91
 
92
  async def main():
 
14
  from langchain_core.output_parsers import PydanticOutputParser
15
  from pydantic import BaseModel, Field
16
 
17
+ from prompts.lawyer_selector import LAWYER_SELECTION_PROMPT
18
+ from structured_outputs.lawyer_selector import LawyerRanking, LawyerRankings
19
 
20
  load_dotenv()
21
 
22
 
23
  class LawyerSelectorAgent:
24
  """Simple agent that analyzes conversations and selects top 3 lawyers"""
25
 
26
  def __init__(self, llm):
27
  self.llm = llm
28
  self.parser = PydanticOutputParser(pydantic_object=LawyerRankings)
29
  self.workflow = self._build_workflow()
30
 
 
35
  workflow.add_edge("select_lawyers", END)
36
  return workflow.compile()
37
 
38
+ def _format_lawyers(self, lawyers: List[dict]) -> str:
39
  return "\n\n".join([
40
+ f"Lawyer index:{i}\n\n- Name: {l['name']}\n- Specialty: {l['specialty']}\n- Experience: {l['experience_years']} years\n- Areas: {', '.join(l['areas_of_practice'])}"
41
+ for i, l in enumerate(lawyers, 1)
42
  ])
43
 
44
+ def _format_lawyer_profile(self, lawyer: dict, rank: int, reasoning: str) -> str:
45
+ """Format a single lawyer profile for the result output"""
46
+ lines = [
47
+ "\n" + "─" * 80,
48
+ f"RECOMMENDATION #{rank}",
49
+ "─" * 80,
50
+ f"\n👤 {lawyer['name']}",
51
+ f" {lawyer['presentation']}",
52
+ f"\n📊 Experience: {lawyer['experience_years']} years",
53
+ f"🎯 Specialty: {lawyer['specialty']}",
54
+ f"\n✅ Why this lawyer matches your case:",
55
+ f" {reasoning}",
56
+ f"\n📚 Areas of Practice:"
57
+ ]
58
+ for area in lawyer['areas_of_practice']:
59
+ lines.append(f" • {area}")
60
+ lines.append("")
61
+ return "\n".join(lines)
62
+
63
  async def _select_lawyers(self, state: dict) -> dict:
64
+ lawyers = state["lawyers"]
65
+ lawyers_text = self._format_lawyers(lawyers)
66
  prompt = LAWYER_SELECTION_PROMPT.format(lawyers=lawyers_text)
67
 
68
  # Convert message dicts to Message objects
 
85
  result = self.parser.parse(response.content)
86
  rankings = result.rankings
87
 
88
+ # Retrieve and concatenate lawyer profiles
89
+ if not rankings:
90
+ output = ["=" * 80, "LAWYER RECOMMENDATIONS", "=" * 80]
91
+ output.append("\n❌ No lawyers available for this particular case.")
92
+ output.append("Your legal issue may fall outside our current areas of expertise.")
93
+ output.append("Please consider refining your request or contacting a general legal service.")
94
+ else:
95
+ output = ["=" * 80, f"{len(rankings)} RECOMMENDED LAWYERS FOR YOUR CASE", "=" * 80]
96
+ for r in rankings:
97
+ lawyer = lawyers[r.lawyer_index - 1]
98
+ output.append(self._format_lawyer_profile(lawyer, r.rank, r.reasoning))
99
+
100
+ state["result"] = "\n".join(output)
101
  return state
102
 
103
+ async def select_lawyers(self, conversation_history: List[dict]) -> str:
104
+ with open("data/lawyers.json", "r", encoding="utf-8") as f:
105
+ lawyers = json.load(f)
106
+
107
+ result = await self.workflow.ainvoke({
108
+ "messages": conversation_history,
109
+ "lawyers": lawyers
110
+ })
111
+ return result["result"]
112
 
113
 
114
  async def main():
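One detail worth flagging in the rewritten selector: the prompt renders lawyers with a 1-based `Lawyer index:i`, and the lookup `lawyers[r.lawyer_index - 1]` converts that back to the 0-based Python list. A stdlib-only illustration with made-up data:

```python
# Made-up lawyer data; illustrates the 1-based prompt index -> 0-based
# list mapping used in _select_lawyers.
lawyers = [
    {"name": "Ana Pop"},      # rendered in the prompt as "Lawyer index:1"
    {"name": "Ion Dascalu"},  # rendered as "Lawyer index:2"
]

def lookup(lawyer_index: int) -> dict:
    """Map an LLM-returned 1-based index back onto the list."""
    return lawyers[lawyer_index - 1]
```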
pdf_analyzer.py → subagents/pdf_analyzer.py RENAMED
@@ -13,8 +13,8 @@ from langchain_google_genai import ChatGoogleGenerativeAI
13
  from langchain_core.messages import HumanMessage, SystemMessage
14
  from mistralai import Mistral
15
 
16
- from pdf_analyzer_state import PDFAnalyzerState
17
- from prompts_pdf_analyzer import SYSTEM_PROMPT, EXTRACT_ACTORS_PROMPT, EXTRACT_KEY_DETAILS_PROMPT, GENERATE_SUMMARY_PROMPT
18
 
19
  logger = logging.getLogger(__name__)
20
 
 
13
  from langchain_core.messages import HumanMessage, SystemMessage
14
  from mistralai import Mistral
15
 
16
+ from agent_states.pdf_analyzer_state import PDFAnalyzerState
17
+ from prompts.pdf_analyzer import SYSTEM_PROMPT, EXTRACT_ACTORS_PROMPT, EXTRACT_KEY_DETAILS_PROMPT, GENERATE_SUMMARY_PROMPT
18
 
19
  logger = logging.getLogger(__name__)
20
 
test_agent.ipynb DELETED
@@ -1,152 +0,0 @@
1
- {
2
- "cells": [
3
- {
4
- "cell_type": "code",
5
- "execution_count": 1,
6
- "id": "9fc74685",
7
- "metadata": {},
8
- "outputs": [
9
- {
10
- "data": {
11
- "text/plain": [
12
- "True"
13
- ]
14
- },
15
- "execution_count": 1,
16
- "metadata": {},
17
- "output_type": "execute_result"
18
- }
19
- ],
20
- "source": [
21
- "import json\n",
22
- "from langraph_agent import CyberLegalAgent\n",
23
- "from dotenv import load_dotenv\n",
24
- "load_dotenv()"
25
- ]
26
- },
27
- {
28
- "cell_type": "code",
29
- "execution_count": 7,
30
- "id": "d3ad35b4",
31
- "metadata": {},
32
- "outputs": [],
33
- "source": [
34
- "history=[{'role': 'user', 'content': 'I need help with a data breach issue'}, {'role': 'assistant', 'content': 'I can help with that. Can you tell me more about the breach?'}, {'role': 'user', 'content': \"My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven't notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.\"}, {'role': 'assistant', 'content': \"I understand. To find the best lawyer for you, could you tell me: Are you looking for a Romanian-based firm or international? What language do you prefer? What's your budget range?\"}]\n",
35
- "user_query='I prefer a Romanian-based firm, English language is not fine, and my budget is around 2000-5000 EUR for this incident. I need help immediately with breach notification.'\n",
36
- "\n"
37
- ]
38
- },
39
- {
40
- "cell_type": "code",
41
- "execution_count": 8,
42
- "id": "a45517e7",
43
- "metadata": {},
44
- "outputs": [
45
- {
46
- "name": "stderr",
47
- "output_type": "stream",
48
- "text": [
49
- "INFO:langraph_agent:🤖 Initialized with OpenAI (gpt-5-nano)\n"
50
- ]
51
- }
52
- ],
53
- "source": [
54
- "agent=CyberLegalAgent()"
55
- ]
56
- },
57
- {
58
- "cell_type": "code",
59
- "execution_count": 11,
60
- "id": "589dc2c3",
61
- "metadata": {},
62
- "outputs": [
63
- {
64
- "name": "stderr",
65
- "output_type": "stream",
66
- "text": [
67
- "INFO:langraph_agent:� Starting query processing: I prefer a Romanian-based firm, English language is not fine, and my budget is around 2000-5000 EUR ...\n",
68
- "INFO:langraph_agent:� Conversation history length: 4 messages\n",
69
- "WARNING:utils:Query timeout, attempt 1\n",
70
- "INFO:utils:Query successful;{'response': 'Thanks for the details. Given the breach happened in Romania and involves sensitive customer data (names, addresses, SSNs) affecting about 500 people, you should act quickly to align with GDPR obligations and the EU cyber-notification framework. Here’s a practical, lawyer-focused plan you can use right away.\\n\\nImmediate next steps (urgent)\\n- Engage a Romanian-based GDPR/legal firm now (you’ve indicated a preference for a Romanian-language, Romania-based firm within a 2,000–5,000 EUR range). A local lawyer can lead the breach-notification process, coordinate with authorities, and help prepare communications to customers.\\n- Prepare a rapid internal factsheet for the lawyer and the incident response team (what happened, when it started, what data were affected, which systems were involved, what containment steps have been taken, and who is affected). This will support the legal review and notifications.\\n- Initiate coordination with the relevant authorities and incident-response bodies. In the EU framework you’re operating under, important entities handle incident notifications and information sharing (Single Points of Contact and CSIRTs). Your Romanian-based counsel can establish the right communication channel immediately.\\n- Do not delay notifications if advised by your lawyer. The directive framework requires timely reporting to the appropriate authorities and, where required, to the data subjects.\\n\\nKey regulatory obligations to address with your lawyer (based on the context you provided)\\n- Early warning and incident notification timing: The Directive (EU) 2022/2555 envisions an early warning within 24 hours and an incident notification within 72 hours of becoming aware of a significant incident, with a final report not later than one month after notification. 
Your lawyer can help determine if the breach qualifies as a significant incident and ensure the required timelines are met through the proper channels. [2]\\n- Personal data breach notification to supervisory authorities: If the breach involves personal data, authorities emphasize informing the supervisory authority without undue delay and within the established timeframes when applicable. Your lawyer will assess the specifics of your breach under GDPR and initiate the correct notification process. [2]\\n- Cooperation channels (SPOCs and CSIRTs): In the EU context, Single Points of Contact (SPOCs) and Computer Security Incident Response Teams (CSIRTs) play a central role in coordinating notifications and information sharing across competent authorities. Your Romanian counsel can engage these channels to ensure fast, compliant communication and support. [2]\\n- Civil-law liability context in Romania: Under Romanian Civil Code provisions, contractual and other damages liabilities arise if obligations are not fulfilled, which underscores why timely breach notification and remediation are important from a legal-arbitration risk perspective. This helps explain why a formal notification and remediation plan are essential. 
[1]\\n\\nWhat to expect from a Romanian-based GDPR lawyer (typical scope, aligned with your budget)\\n- Immediate breach-notification package: assessment of whether the breach triggers GDPR notification to the supervisory authority and/or to data subjects; drafting and coordinating the actual notices; ensuring timelines are met under the directive and Romanian law.\\n- Coordination with authorities and incident-response bodies: establishing contact with the applicable SPOC/CSIRT and managing liaison with the supervisory authority.\\n- Data-subject communications: drafting clear, compliant notices to affected customers, including information to help them mitigate risk (this is standard practice to minimize harm and regulatory risk).\\n- Documentation and follow-up plan: producing a formal incident report outline, a remediation plan, and a schedule for final reporting as required.\\n\\nPractical steps you can start today (to share with the lawyer)\\n- Share the breach details: approximate time of discovery, systems involved, data categories (names, addresses, SSNs), estimated number of affected individuals, current containment actions, and any steps already taken to secure data.\\n- Confirm the fastest permissible notification path: whether to file an early warning to the SPOC/CSIRT, followed by a 72-hour incident notification to the supervisory authority and affected individuals (if required), plus a final monthly report.\\n- Establish who will draft and approve all communications (legal, privacy/compliance lead, and senior management sign-off).\\n- Set clear expectations on cost and deliverables within your 2,000–5,000 EUR budget for this incident, with scope defined (not just advisory, but practical drafting, notifications, and coordination).\\n\\nNotes on grounding in the Context\\n- The described 24-hour/72-hour/one-month timelines and the roles of SPOCs/CSIRTs come from the EU Directive material you provided. 
They guide the notification cadence and the cooperation framework your lawyer will implement. [2]\\n- Romanian civil-law background underscores the importance of timely and proper remediation and communications to limit liability in contractual or civil-fault contexts. This supports using early, formal notifications as part of the overall risk management. [1]\\n\\nIf you’d like, I can help you draft a quick briefing for a Romanian GDPR lawyer (including the key questions to ask and the documents to provide) to accelerate starting the engagement and the breach-notification process.\\n\\n### References\\n\\n- [1] romanian_civil_code_2009.txt\\n- [2] gdpr_2022_2555.txt', 'references': [{'reference_id': '1', 'file_path': 'romanian_civil_code_2009.txt', 'content': None}, {'reference_id': '2', 'file_path': 'gdpr_2022_2555.txt', 'content': None}]}\n",
71
- "INFO:langraph_agent:🔍 LightRAG response received:\n",
72
- "INFO:langraph_agent:📄 Context length: 5594 characters\n",
73
- "INFO:langraph_agent:📚 References found: 2\n",
74
- "INFO:langraph_agent:📝 Context preview: Thanks for the details. Given the breach happened in Romania and involves sensitive customer data (names, addresses, SSNs) affecting about 500 people, you should act quickly to align with GDPR obligations and the EU cyber-notification framework. Here’s a practical, lawyer-focused plan you can use right away.\n",
75
- "\n",
76
- "Immediate next steps (urgent)\n",
77
- "- Engage a Romanian-based GDPR/legal firm now (you’ve indicated a preference for a Romanian-language, Romania-based firm within a 2,000–5,000 EUR range). A loc...\n",
78
- "INFO:langraph_agent:⏱️ LightRAG query processing time: 64.635s\n",
79
- "INFO:langraph_agent:📝 Conversation history: [{'role': 'user', 'content': 'I need help with a data breach issue'}, {'role': 'assistant', 'content': 'I can help with that. Can you tell me more about the breach?'}, {'role': 'user', 'content': \"My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven't notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.\"}, {'role': 'assistant', 'content': \"I understand. To find the best lawyer for you, could you tell me: Are you looking for a Romanian-based firm or international? What language do you prefer? What's your budget range?\"}]\n",
80
- "INFO:langraph_agent:📝 Message: {'role': 'user', 'content': 'I need help with a data breach issue'}\n",
81
- "INFO:langraph_agent:📝 Message type: <class 'dict'>\n",
82
- "INFO:langraph_agent:📝 Message keys: dict_keys(['role', 'content'])\n",
83
- "INFO:langraph_agent:📝 Message: {'role': 'assistant', 'content': 'I can help with that. Can you tell me more about the breach?'}\n",
84
- "INFO:langraph_agent:📝 Message type: <class 'dict'>\n",
85
- "INFO:langraph_agent:📝 Message keys: dict_keys(['role', 'content'])\n",
86
- "INFO:langraph_agent:📝 Message: {'role': 'user', 'content': \"My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven't notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.\"}\n",
87
- "INFO:langraph_agent:📝 Message type: <class 'dict'>\n",
88
- "INFO:langraph_agent:📝 Message keys: dict_keys(['role', 'content'])\n",
89
- "INFO:langraph_agent:📝 Message: {'role': 'assistant', 'content': \"I understand. To find the best lawyer for you, could you tell me: Are you looking for a Romanian-based firm or international? What language do you prefer? What's your budget range?\"}\n",
90
- "INFO:langraph_agent:📝 Message type: <class 'dict'>\n",
91
- "INFO:langraph_agent:📝 Message keys: dict_keys(['role', 'content'])\n",
92
- "INFO:langraph_agent:📝 Created Messages stack: [HumanMessage(content='I need help with a data breach issue', additional_kwargs={}, response_metadata={}), AIMessage(content='I can help with that. Can you tell me more about the breach?', additional_kwargs={}, response_metadata={}, tool_calls=[], invalid_tool_calls=[]), HumanMessage(content=\"My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven't notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.\", additional_kwargs={}, response_metadata={}), AIMessage(content=\"I understand. To find the best lawyer for you, could you tell me: Are you looking for a Romanian-based firm or international? What language do you prefer? What's your budget range?\", additional_kwargs={}, response_metadata={}, tool_calls=[], invalid_tool_calls=[])]\n",
93
- "INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
94
- "INFO:langraph_agent:🤖 LLM Response type: <class 'langchain_core.messages.ai.AIMessage'>\n",
95
- "INFO:langraph_agent:🤖 LLM Response content length: 0\n",
96
- "INFO:langraph_agent:🤖 LLM Response has tool_calls: True\n",
97
- "INFO:langraph_agent:🤖 LLM Response tool_calls: [{'name': 'find_lawyers', 'args': {'query': 'Romanian-based GDPR breach-notification lawyer for a data breach in Romania involving 500 customers with names, addresses, and SSNs; urgent breach notification within GDPR timelines; Romanian language; budget 2000-5000 EUR; immediate help with notification to authorities and data subjects; experienced with SPOCs/CSIRTs and ANSPDCP coordination.'}, 'id': 'call_DYoaaEVoh5tCVoKjlnD3r2dy', 'type': 'tool_call'}]\n",
98
- "INFO:langraph_agent:🤖 LLM Response has invalid_tool_calls: True\n",
99
- "INFO:langraph_agent:🤖 LLM Response invalid_tool_calls: []\n",
100
- "INFO:langraph_agent:⏱️ Answer generation processing time: 9.758s\n",
101
- "INFO:langraph_agent:⏱️ Total query processing time: 74.393s\n",
102
- "INFO:langraph_agent:📚 References found: 2\n",
103
- "INFO:langraph_agent:🔍 Checking for tool calls in response_message...\n",
104
- "INFO:langraph_agent: Response type: <class 'NoneType'>\n",
105
- "INFO:langraph_agent: No response_message found in state\n",
106
- "INFO:langraph_agent: No tool calls, routing to end\n",
107
- "INFO:langraph_agent:final state: {'user_query': 'I prefer a Romanian-based firm, English language is not fine, and my budget is around 2000-5000 EUR for this incident. I need help immediately with breach notification.', 'conversation_history': [{'role': 'user', 'content': 'I need help with a data breach issue'}, {'role': 'assistant', 'content': 'I can help with that. Can you tell me more about the breach?'}, {'role': 'user', 'content': \"My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven't notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.\"}, {'role': 'assistant', 'content': \"I understand. To find the best lawyer for you, could you tell me: Are you looking for a Romanian-based firm or international? What language do you prefer? What's your budget range?\"}], 'lightrag_response': {'response': 'Thanks for the details. Given the breach happened in Romania and involves sensitive customer data (names, addresses, SSNs) affecting about 500 people, you should act quickly to align with GDPR obligations and the EU cyber-notification framework. Here’s a practical, lawyer-focused plan you can use right away.\\n\\nImmediate next steps (urgent)\\n- Engage a Romanian-based GDPR/legal firm now (you’ve indicated a preference for a Romanian-language, Romania-based firm within a 2,000–5,000 EUR range). A local lawyer can lead the breach-notification process, coordinate with authorities, and help prepare communications to customers.\\n- Prepare a rapid internal factsheet for the lawyer and the incident response team (what happened, when it started, what data were affected, which systems were involved, what containment steps have been taken, and who is affected). This will support the legal review and notifications.\\n- Initiate coordination with the relevant authorities and incident-response bodies. In the EU framework you’re operating under, important entities handle incident notifications and information sharing (Single Points of Contact and CSIRTs). Your Romanian-based counsel can establish the right communication channel immediately.\\n- Do not delay notifications if advised by your lawyer. The directive framework requires timely reporting to the appropriate authorities and, where required, to the data subjects.\\n\\nKey regulatory obligations to address with your lawyer (based on the context you provided)\\n- Early warning and incident notification timing: The Directive (EU) 2022/2555 envisions an early warning within 24 hours and an incident notification within 72 hours of becoming aware of a significant incident, with a final report not later than one month after notification. Your lawyer can help determine if the breach qualifies as a significant incident and ensure the required timelines are met through the proper channels. [2]\\n- Personal data breach notification to supervisory authorities: If the breach involves personal data, authorities emphasize informing the supervisory authority without undue delay and within the established timeframes when applicable. Your lawyer will assess the specifics of your breach under GDPR and initiate the correct notification process. [2]\\n- Cooperation channels (SPOCs and CSIRTs): In the EU context, Single Points of Contact (SPOCs) and Computer Security Incident Response Teams (CSIRTs) play a central role in coordinating notifications and information sharing across competent authorities. Your Romanian counsel can engage these channels to ensure fast, compliant communication and support. [2]\\n- Civil-law liability context in Romania: Under Romanian Civil Code provisions, contractual and other damages liabilities arise if obligations are not fulfilled, which underscores why timely breach notification and remediation are important from a legal-arbitration risk perspective. This helps explain why a formal notification and remediation plan are essential. [1]\\n\\nWhat to expect from a Romanian-based GDPR lawyer (typical scope, aligned with your budget)\\n- Immediate breach-notification package: assessment of whether the breach triggers GDPR notification to the supervisory authority and/or to data subjects; drafting and coordinating the actual notices; ensuring timelines are met under the directive and Romanian law.\\n- Coordination with authorities and incident-response bodies: establishing contact with the applicable SPOC/CSIRT and managing liaison with the supervisory authority.\\n- Data-subject communications: drafting clear, compliant notices to affected customers, including information to help them mitigate risk (this is standard practice to minimize harm and regulatory risk).\\n- Documentation and follow-up plan: producing a formal incident report outline, a remediation plan, and a schedule for final reporting as required.\\n\\nPractical steps you can start today (to share with the lawyer)\\n- Share the breach details: approximate time of discovery, systems involved, data categories (names, addresses, SSNs), estimated number of affected individuals, current containment actions, and any steps already taken to secure data.\\n- Confirm the fastest permissible notification path: whether to file an early warning to the SPOC/CSIRT, followed by a 72-hour incident notification to the supervisory authority and affected individuals (if required), plus a final monthly report.\\n- Establish who will draft and approve all communications (legal, privacy/compliance lead, and senior management sign-off).\\n- Set clear expectations on cost and deliverables within your 2,000–5,000 EUR budget for this incident, with scope defined (not just advisory, but practical drafting, notifications, and coordination).\\n\\nNotes on grounding in the Context\\n- The described 24-hour/72-hour/one-month timelines and the roles of SPOCs/CSIRTs come from the EU Directive material you provided. They guide the notification cadence and the cooperation framework your lawyer will implement. [2]\\n- Romanian civil-law background underscores the importance of timely and proper remediation and communications to limit liability in contractual or civil-fault contexts. This supports using early, formal notifications as part of the overall risk management. [1]\\n\\nIf you’d like, I can help you draft a quick briefing for a Romanian GDPR lawyer (including the key questions to ask and the documents to provide) to accelerate starting the engagement and the breach-notification process.\\n\\n### References\\n\\n- [1] romanian_civil_code_2009.txt\\n- [2] gdpr_2022_2555.txt', 'references': [{'reference_id': '1', 'file_path': 'romanian_civil_code_2009.txt', 'content': None}, {'reference_id': '2', 'file_path': 'gdpr_2022_2555.txt', 'content': None}]}, 'processed_context': None, 'relevant_documents': ['romanian_civil_code_2009.txt', 'gdpr_2022_2555.txt'], 'analysis_thoughts': None, 'final_response': '', 'query_timestamp': '2026-01-03T18:32:20.045214', 'processing_time': 74.39322590827942}\n",
108
- "INFO:langraph_agent:✅ Query processing completed successfully\n",
109
- "INFO:langraph_agent:📄 Response length: 0 characters\n"
110
- ]
111
- },
112
- {
113
- "data": {
114
- "text/plain": [
115
- "{'response': '',\n",
116
- " 'processing_time': 74.39322590827942,\n",
117
- " 'references': ['romanian_civil_code_2009.txt', 'gdpr_2022_2555.txt'],\n",
118
- " 'timestamp': '2026-01-03T18:32:20.045214'}"
119
- ]
120
- },
121
- "execution_count": 11,
122
- "metadata": {},
123
- "output_type": "execute_result"
124
- }
125
- ],
126
- "source": [
127
- "await agent.process_query(conversation_history=history,user_query=user_query)"
128
- ]
129
- }
130
- ],
131
- "metadata": {
132
- "kernelspec": {
133
- "display_name": "cyberlgl",
134
- "language": "python",
135
- "name": "python3"
136
- },
137
- "language_info": {
138
- "codemirror_mode": {
139
- "name": "ipython",
140
- "version": 3
141
- },
142
- "file_extension": ".py",
143
- "mimetype": "text/x-python",
144
- "name": "python",
145
- "nbconvert_exporter": "python",
146
- "pygments_lexer": "ipython3",
147
- "version": "3.12.12"
148
- }
149
- },
150
- "nbformat": 4,
151
- "nbformat_minor": 5
152
- }
 
test_openai_key.ipynb DELETED
@@ -1,155 +0,0 @@
1
- {
2
- "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {},
6
- "source": [
7
- "# Test OpenAI API Key\n",
8
- "\n",
9
- "This notebook tests if the OpenAI API key is working correctly."
10
- ]
11
- },
12
- {
13
- "cell_type": "code",
14
- "execution_count": 1,
15
- "metadata": {},
16
- "outputs": [
17
- {
18
- "name": "stdout",
19
- "output_type": "stream",
20
- "text": [
21
- "API Key loaded: sk-proj-zF...6Cny4xuHsA\n"
22
- ]
23
- }
24
- ],
25
- "source": [
26
- "# Load environment variables\n",
27
- "import os\n",
28
- "from dotenv import load_dotenv\n",
29
- "\n",
30
- "# Load the .env file\n",
31
- "load_dotenv('.env')\n",
32
- "\n",
33
- "# Get the API key\n",
34
- "api_key = os.getenv('OPENAI_API_KEY')\n",
35
- "print(f\"API Key loaded: {api_key[:10]}...{api_key[-10:] if api_key else 'None'}\")"
36
- ]
37
- },
38
- {
39
- "cell_type": "code",
40
- "execution_count": 2,
41
- "metadata": {},
42
- "outputs": [
43
- {
44
- "name": "stdout",
45
- "output_type": "stream",
46
- "text": [
47
- "❌ OpenAI API Test Failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}\n"
48
- ]
49
- }
50
- ],
51
- "source": [
52
- "# Test OpenAI API\n",
53
- "from openai import OpenAI\n",
54
- "\n",
55
- "try:\n",
56
- " client = OpenAI(api_key=api_key)\n",
57
- " \n",
58
- " # Simple test request\n",
59
- " response = client.chat.completions.create(\n",
60
- " model=\"gpt-5-nano-2025-08-07\",\n",
61
- " messages=[\n",
62
- " {\"role\": \"user\", \"content\": \"Hello, can you respond with just 'API key works!'?\"}\n",
63
- " ],\n",
64
- "\n",
65
- " )\n",
66
- " \n",
67
- " print(\"✅ OpenAI API Test Successful!\")\n",
68
- " print(f\"Response: {response.choices[0].message.content}\")\n",
69
- " \n",
70
- "except Exception as e:\n",
71
- " print(f\"❌ OpenAI API Test Failed: {e}\")"
72
- ]
73
- },
74
- {
75
- "cell_type": "code",
76
- "execution_count": null,
77
- "metadata": {},
78
- "outputs": [],
79
- "source": [
80
- "# Test embedding API\n",
81
- "try:\n",
82
- " response = client.embeddings.create(\n",
83
- " model=\"text-embedding-3-large\",\n",
84
- " input=\"Test embedding for cyber legal regulations\"\n",
85
- " )\n",
86
- " \n",
87
- " print(\"✅ OpenAI Embedding API Test Successful!\")\n",
88
- " print(f\"Embedding dimension: {len(response.data[0].embedding)}\")\n",
89
- " print(f\"First 5 values: {response.data[0].embedding[:5]}\")\n",
90
- " \n",
91
- "except Exception as e:\n",
92
- " print(f\"❌ OpenAI Embedding API Test Failed: {e}\")"
93
- ]
94
- },
95
- {
96
- "cell_type": "code",
97
- "execution_count": null,
98
- "metadata": {},
99
- "outputs": [],
100
- "source": [
101
- "# Test LightRAG connection with the API key\n",
102
- "try:\n",
103
- " import sys\n",
104
- " sys.path.append('../LightRAG')\n",
105
- " \n",
106
- " from lightrag import LightRAG, QueryParam\n",
107
- " from lightrag.llm import openai_complete_if_cache, openai_embedding\n",
108
- " \n",
109
- " rag = LightRAG(\n",
110
- " working_dir=\"./rag_storage\",\n",
111
- " llm_model_func=openai_complete_if_cache,\n",
112
- " llm_model_name=\"gpt-4o\",\n",
113
- " llm_model_kwargs={\"api_key\": api_key, \"model\": \"gpt-4o\"},\n",
114
- " embedding_func=openai_embedding,\n",
115
- " embedding_model_name=\"text-embedding-3-large\",\n",
116
- " embedding_model_kwargs={\"api_key\": api_key, \"model\": \"text-embedding-3-large\"}\n",
117
- " )\n",
118
- " \n",
119
- " # Test a simple query\n",
120
- " result = rag.query(\n",
121
- " \"What is NIS2?\",\n",
122
- " param=QueryParam(mode=\"naive\")\n",
123
- " )\n",
124
- " \n",
125
- " print(\"✅ LightRAG API Test Successful!\")\n",
126
- " print(f\"Response length: {len(result)} characters\")\n",
127
- " print(f\"Response preview: {result[:200]}...\")\n",
128
- " \n",
129
- "except Exception as e:\n",
130
- " print(f\"❌ LightRAG API Test Failed: {e}\")"
131
- ]
132
- }
133
- ],
134
- "metadata": {
135
- "kernelspec": {
136
- "display_name": "cyberlgl",
137
- "language": "python",
138
- "name": "python3"
139
- },
140
- "language_info": {
141
- "codemirror_mode": {
142
- "name": "ipython",
143
- "version": 3
144
- },
145
- "file_extension": ".py",
146
- "mimetype": "text/x-python",
147
- "name": "python",
148
- "nbconvert_exporter": "python",
149
- "pygments_lexer": "ipython3",
150
- "version": "3.12.12"
151
- }
152
- },
153
- "nbformat": 4,
154
- "nbformat_minor": 4
155
- }
 
 
test_tool_calling_demo.ipynb DELETED
@@ -1,676 +0,0 @@
1
- {
2
- "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {},
6
- "source": [
7
- "# CyberLegalAI - Tool Calling Demo\n",
8
- "\n",
9
- "This notebook demonstrates the new flexible LangGraph agent with tool-calling capabilities.\n",
10
- "\n",
11
- "## What you'll see:\n",
12
- "1. Agent initialization with different configurations\n",
13
- "2. Tool calling scenarios (knowledge graph, lawyer finder)\n",
14
- "3. Direct answer scenarios (no tools)\n",
15
- "4. Multiple tool calls in sequence"
16
- ]
17
- },
18
- {
19
- "cell_type": "code",
20
- "execution_count": 1,
21
- "metadata": {},
22
- "outputs": [],
23
- "source": [
24
- "# Install required packages if needed\n",
25
- "# !pip install -r requirements.txt"
26
- ]
27
- },
28
- {
29
- "cell_type": "code",
30
- "execution_count": 2,
31
- "metadata": {},
32
- "outputs": [
33
- {
34
- "data": {
35
- "text/plain": [
36
- "True"
37
- ]
38
- },
39
- "execution_count": 2,
40
- "metadata": {},
41
- "output_type": "execute_result"
42
- }
43
- ],
44
- "source": [
45
- "import asyncio\n",
46
- "import json\n",
47
- "from dotenv import load_dotenv\n",
48
- "\n",
49
- "from langraph_agent import CyberLegalAgent\n",
50
- "from tools import tools_for_client, tools_for_lawyer\n",
51
- "from prompts import SYSTEM_PROMPT_CLIENT, SYSTEM_PROMPT_LAWYER\n",
52
- "\n",
53
- "load_dotenv()"
54
- ]
55
- },
56
- {
57
- "cell_type": "markdown",
58
- "metadata": {},
59
- "source": [
60
- "## 1. Initialize Agents\n",
61
- "\n",
62
- "We'll create two agents with different configurations:\n",
63
- "- **Client Agent**: Friendly tone, can find lawyers + query knowledge graph\n",
64
- "- **Lawyer Agent**: Professional tone, only queries knowledge graph"
65
- ]
66
- },
67
- {
68
- "cell_type": "code",
69
- "execution_count": 3,
70
- "metadata": {},
71
- "outputs": [
72
- {
73
- "name": "stdout",
74
- "output_type": "stream",
75
- "text": [
76
- "✅ Agents initialized successfully!\n",
77
- "\n",
78
- "Client Agent Tools: ['query_knowledge_graph', 'find_lawyers']\n",
79
- "Lawyer Agent Tools: ['query_knowledge_graph']\n"
80
- ]
81
- }
82
- ],
83
- "source": [
84
- "# Initialize client agent\n",
85
- "client_agent = CyberLegalAgent(\n",
86
- " llm_provider=\"openai\",\n",
87
- " system_prompt=SYSTEM_PROMPT_CLIENT,\n",
88
- " tools=tools_for_client\n",
89
- ")\n",
90
- "\n",
91
- "# Initialize lawyer agent\n",
92
- "lawyer_agent = CyberLegalAgent(\n",
93
- " llm_provider=\"openai\",\n",
94
- " system_prompt=SYSTEM_PROMPT_LAWYER,\n",
95
- " tools=tools_for_lawyer\n",
96
- ")\n",
97
- "\n",
98
- "print(\"✅ Agents initialized successfully!\")\n",
99
- "print(f\"\\nClient Agent Tools: {[t.name for t in client_agent.tools]}\")\n",
100
- "print(f\"Lawyer Agent Tools: {[t.name for t in lawyer_agent.tools]}\")"
101
- ]
102
- },
103
- {
104
- "cell_type": "markdown",
105
- "metadata": {},
106
- "source": [
107
- "## 2. Test 1: Direct Answer (No Tool Call)\n",
108
- "\n",
109
- "**Scenario**: User asks a general question that the LLM can answer with its knowledge.\n",
110
- "\n",
111
- "**Expected Behavior**: Agent responds directly without calling any tools."
112
- ]
113
- },
114
- {
115
- "cell_type": "code",
116
- "execution_count": 4,
117
- "metadata": {},
118
- "outputs": [
119
- {
120
- "name": "stdout",
121
- "output_type": "stream",
122
- "text": [
123
- "================================================================================\n",
124
- "TEST 1: Direct Answer (No Tool Call)\n",
125
- "================================================================================\n",
126
- "\n",
127
- "👤 User Query: What is GDPR and what are the main principles?\n",
128
- "\n",
129
- "Processing...\n",
130
- "\n"
131
- ]
132
- },
133
- {
134
- "name": "stderr",
135
- "output_type": "stream",
136
- "text": [
137
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
138
- "INFO:langraph_agent:🔧 Calling tools: ['query_knowledge_graph']\n"
139
- ]
140
- },
141
- {
142
- "name": "stdout",
143
- "output_type": "stream",
144
- "text": [
145
- "[{'name': 'query_knowledge_graph', 'args': {'query': 'What is GDPR and what are the main principles', 'conversation_history': []}, 'id': 'd156ff401', 'type': 'tool_call'}]\n"
146
- ]
147
- },
148
- {
149
- "name": "stderr",
150
- "output_type": "stream",
151
- "text": [
152
- "INFO:utils:Query successful;{'response': '**What is the GDPR?** \\n\\nThe **General Data Protection Regulation (GDPR)**, formally **Regulation (EU) 2016/679**, is a European Union regulation that establishes a comprehensive legal framework for the protection of personal data and privacy of natural persons within the EU and the European Economic Area. It sets out rules for how personal data must be processed, stored, and erased, granting individuals rights such as access to their data, the right to be forgotten, and the right to data portability. The regulation also obliges organisations to be transparent about their data‑handling practices and to embed data‑protection safeguards into the design of their systems【1】. \\n\\n**Main principles of the GDPR** \\n\\n| Principle | What it means (as reflected in the GDPR) |\\n|-----------|-------------------------------------------|\\n| **Lawfulness, fairness and transparency** | Personal data must be processed on a lawful basis, in a fair manner, and organisations must provide clear information to data subjects about how their data are used. |\\n| **Purpose limitation** | Data may only be collected for specified, explicit, and legitimate purposes and must not be further processed in a way that is incompatible with those purposes【2】. |\\n| **Data minimisation** | Only the minimum amount of personal data necessary for the intended purpose may be collected and processed【3】. |\\n| **Accuracy** | Personal data must be kept accurate and up‑to‑date; inaccurate data should be corrected or erased without delay. |\\n| **Storage limitation** | Data should be retained only for as long as necessary for the purposes for which it was collected. |\\n| **Integrity and confidentiality (security)** | Appropriate technical and organisational measures must protect data against unauthorised access, loss, or destruction. 
|\\n| **Accountability** | Controllers are responsible for, and must be able to demonstrate, compliance with all other principles. |\\n| **Data‑protection‑by‑design and by‑default** | Systems and processes must incorporate data‑protection measures from the outset and ensure that, by default, only the data necessary for a specific purpose are processed【4】. |\\n\\nThese principles together create a high‑level standard for data‑protection compliance across the Union, shaping how organisations handle personal information and reinforcing individuals’ control over their own data【5】. \\n\\n---\\n\\n### References\\n\\n- [1] gdpr_2022_2555.txt \\n- [2] cyber_resilience_act_2024_2847.txt \\n- [3] cyber_resilience_act_2024_2847.txt \\n- [4] cyber_resilience_act_2024_2847.txt \\n- [5] nis2_2022_2555.txt ', 'references': [{'reference_id': '1', 'file_path': 'gdpr_2022_2555.txt', 'content': None}, {'reference_id': '2', 'file_path': 'cyber_resilience_act_2024_2847.txt', 'content': None}, {'reference_id': '3', 'file_path': 'nis2_2022_2555.txt', 'content': None}]}\n",
153
- "INFO:langraph_agent:🔧 Tool query_knowledge_graph returned: **What is the GDPR?** \n",
154
- "\n",
155
- "The **General Data Protection Regulation (GDPR)**, formally **Regulation (EU) 2016/679**, is a European Union regulation that establishes a comprehensive legal framework for the protection of personal data and privacy of natural persons within the EU and the European Economic Area. It sets out rules for how personal data must be processed, stored, and erased, granting individuals rights such as access to their data, the right to be forgotten, and the right to data portability. The regulation also obliges organisations to be transparent about their data‑handling practices and to embed data‑protection safeguards into the design of their systems【1】. \n",
156
- "\n",
157
- "**Main principles of the GDPR** \n",
158
- "\n",
159
- "| Principle | What it means (as reflected in the GDPR) |\n",
160
- "|-----------|-------------------------------------------|\n",
161
- "| **Lawfulness, fairness and transparency** | Personal data must be processed on a lawful basis, in a fair manner, and organisations must provide clear information to data subjects about how their data are used. |\n",
162
- "| **Purpose limitation** | Data may only be collected for specified, explicit, and legitimate purposes and must not be further processed in a way that is incompatible with those purposes【2】. |\n",
163
- "| **Data minimisation** | Only the minimum amount of personal data necessary for the intended purpose may be collected and processed【3】. |\n",
164
- "| **Accuracy** | Personal data must be kept accurate and up‑to‑date; inaccurate data should be corrected or erased without delay. |\n",
165
- "| **Storage limitation** | Data should be retained only for as long as necessary for the purposes for which it was collected. |\n",
166
- "| **Integrity and confidentiality (security)** | Appropriate technical and organisational measures must protect data against unauthorised access, loss, or destruction. |\n",
167
- "| **Accountability** | Controllers are responsible for, and must be able to demonstrate, compliance with all other principles. |\n",
168
- "| **Data‑protection‑by‑design and by‑default** | Systems and processes must incorporate data‑protection measures from the outset and ensure that, by default, only the data necessary for a specific purpose are processed【4】. |\n",
169
- "\n",
170
- "These principles together create a high‑level standard for data‑protection compliance across the Union, shaping how organisations handle personal information and reinforcing individuals’ control over their own data【5】. \n",
171
- "\n",
172
- "---\n",
173
- "\n",
174
- "### References\n",
175
- "\n",
176
- "- [1] gdpr_2022_2555.txt \n",
177
- "- [2] cyber_resilience_act_2024_2847.txt \n",
178
- "- [3] cyber_resilience_act_2024_2847.txt \n",
179
- "- [4] cyber_resilience_act_2024_2847.txt \n",
180
- "- [5] nis2_2022_2555.txt \n",
181
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
182
- ]
183
- },
184
- {
185
- "name": "stdout",
186
- "output_type": "stream",
187
- "text": [
188
- "--------------------------------------------------------------------------------\n",
189
- "🤖 Agent Response:\n",
190
- "--------------------------------------------------------------------------------\n",
191
- "**What is the GDPR?** \n",
192
- "The **General Data Protection Regulation (GDPR)** – formally **Regulation (EU) 2016/679** – is the EU’s main law on data protection. It sets a single, EU‑wide framework for how personal data (any information that can identify a person) must be handled. The goal is to give individuals more control over their data while requiring organisations to be transparent, secure, and accountable when they collect, store, or use that data.\n",
193
- "\n",
194
- "**The main GDPR principles** \n",
195
- "These are the building blocks that every data‑controller (the entity that decides why and how data is processed) must follow:\n",
196
- "\n",
197
- "| Principle | Plain‑English meaning |\n",
198
- "|-----------|-----------------------|\n",
199
- "| **Lawfulness, fairness & transparency** | You need a legal reason to process data, must treat people fairly,\n",
200
- "...\n",
201
- "\n",
202
- "⏱️ Processing Time: 0.00s\n",
203
- "📅 Timestamp: 2026-01-06T11:52:42.305179\n",
204
- "================================================================================\n"
205
- ]
206
- }
207
- ],
208
- "source": [
209
- "async def test_direct_answer():\n",
210
- " print(\"=\" * 80)\n",
211
- " print(\"TEST 1: Direct Answer (No Tool Call)\")\n",
212
- " print(\"=\" * 80)\n",
213
- " \n",
214
- " user_query = \"What is GDPR and what are the main principles?\"\n",
215
- " print(f\"\\n👤 User Query: {user_query}\\n\")\n",
216
- " print(\"Processing...\\n\")\n",
217
- " \n",
218
- " result = await client_agent.process_query(\n",
219
- " user_query=user_query,\n",
220
- " conversation_history=[]\n",
221
- " )\n",
222
- " \n",
223
- " print(\"-\" * 80)\n",
224
- " print(\"🤖 Agent Response:\")\n",
225
- " print(\"-\" * 80)\n",
226
- " print(result['response'][:800])\n",
227
- " print(\"...\" if len(result['response']) > 800 else \"\")\n",
228
- " \n",
229
- " print(f\"\\n⏱️ Processing Time: {result['processing_time']:.2f}s\")\n",
230
- " print(f\"📅 Timestamp: {result['timestamp']}\")\n",
231
- " print(\"=\" * 80)\n",
232
- "\n",
233
- "await test_direct_answer()"
234
- ]
235
- },
236
- {
237
- "cell_type": "markdown",
238
- "metadata": {},
239
- "source": [
240
- "## 3. Test 2: Tool Calling - Knowledge Graph\n",
241
- "\n",
242
- "**Scenario**: User asks a specific legal question requiring accurate information from EU regulations.\n",
243
- "\n",
244
- "**Expected Behavior**: Agent calls `query_knowledge_graph` tool, receives results, then formulates answer."
245
- ]
246
- },
247
- {
248
- "cell_type": "code",
249
- "execution_count": 5,
250
- "metadata": {},
251
- "outputs": [
252
- {
253
- "name": "stdout",
254
- "output_type": "stream",
255
- "text": [
256
- "================================================================================\n",
257
- "TEST 2: Tool Calling - Knowledge Graph Query\n",
258
- "================================================================================\n",
259
- "\n",
260
- "👤 User Query: What are the data breach notification requirements under GDPR?\n",
261
- "\n",
262
- "Processing...\n",
263
- "\n"
264
- ]
265
- },
266
- {
267
- "name": "stderr",
268
- "output_type": "stream",
269
- "text": [
270
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
271
- "INFO:langraph_agent:🔧 Calling tools: ['query_knowledge_graph']\n"
272
- ]
273
- },
274
- {
275
- "name": "stdout",
276
- "output_type": "stream",
277
- "text": [
278
- "[{'name': 'query_knowledge_graph', 'args': {'query': 'GDPR data breach notification requirements', 'conversation_history': []}, 'id': 'f8f7b9d4f', 'type': 'tool_call'}]\n"
279
- ]
280
- },
281
- {
282
- "name": "stderr",
283
- "output_type": "stream",
284
- "text": [
285
- "INFO:utils:Query successful;{'response': '**GDPR data‑breach notification requirements**\\n\\n- **Timing of the first notification** \\n The regulator requires that a manufacturer (or any data‑controller) submit an **initial incident notification within 72\\u202fhours** of becoming aware of a breach that could affect the security of a product with digital elements. If a severe incident is identified earlier, a **warning must be sent within 24\\u202fhours**\\u202f【1】. \\n\\n- **Content of the initial notice** \\n The notification must contain: \\n 1. A description of the incident and its **severity and impact**; \\n 2. The **type of threat or root cause** that triggered the breach; \\n 3. Any **corrective or mitigating measures** already taken and those that users can apply; \\n 4. An assessment of whether the breach is **suspected of being caused by unlawful or malicious acts**; \\n 5. An indication of the **sensitivity** of the information disclosed. \\n\\n- **Final report** \\n A **final report** – detailed, including the incident’s full description, the root cause, and ongoing mitigation actions – must be submitted **no later than one month** after the initial notification【1】. \\n\\n- **Use of a single reporting platform** \\n Member States may provide a **single entry point** (e.g., an electronic portal) for submitting breach notifications. This platform can be used for GDPR‑related incident reports, ensuring that the same technical means serve multiple legal obligations and reducing administrative burden【2】. \\n\\n- **Obligation to inform data subjects** \\n When the breach is likely to result in a **high risk to the rights and freedoms of natural persons**, the controller must **inform the affected individuals without undue delay**, providing clear, understandable information about the nature of the breach and recommended protective steps. 
\\n\\n- **Supervisory‑authority role** \\n Supervisory Authorities, established under the GDPR, receive the notifications, assess the risk, may request further information, and are responsible for enforcing any required remedial actions. \\n\\nThese requirements together ensure that personal‑data breaches are reported promptly, transparently, and with sufficient detail for authorities and affected individuals to respond effectively. \\n\\n### References\\n\\n- [1] cyber_resilience_act_2024_2847.txt \\n- [2] nis2_2022_2555.txt ', 'references': [{'reference_id': '1', 'file_path': 'cyber_resilience_act_2024_2847.txt', 'content': None}, {'reference_id': '2', 'file_path': 'nis2_2022_2555.txt', 'content': None}, {'reference_id': '3', 'file_path': 'gdpr_2022_2555.txt', 'content': None}]}\n",
286
- "INFO:langraph_agent:🔧 Tool query_knowledge_graph returned: **GDPR data‑breach notification requirements**\n",
287
- "\n",
288
- "- **Timing of the first notification** \n",
289
- " The regulator requires that a manufacturer (or any data‑controller) submit an **initial incident notification within 72 hours** of becoming aware of a breach that could affect the security of a product with digital elements. If a severe incident is identified earlier, a **warning must be sent within 24 hours** 【1】. \n",
290
- "\n",
291
- "- **Content of the initial notice** \n",
292
- " The notification must contain: \n",
293
- " 1. A description of the incident and its **severity and impact**; \n",
294
- " 2. The **type of threat or root cause** that triggered the breach; \n",
295
- " 3. Any **corrective or mitigating measures** already taken and those that users can apply; \n",
296
- " 4. An assessment of whether the breach is **suspected of being caused by unlawful or malicious acts**; \n",
297
- " 5. An indication of the **sensitivity** of the information disclosed. \n",
298
- "\n",
299
- "- **Final report** \n",
300
- " A **final report** – detailed, including the incident’s full description, the root cause, and ongoing mitigation actions – must be submitted **no later than one month** after the initial notification【1】. \n",
301
- "\n",
302
- "- **Use of a single reporting platform** \n",
303
- " Member States may provide a **single entry point** (e.g., an electronic portal) for submitting breach notifications. This platform can be used for GDPR‑related incident reports, ensuring that the same technical means serve multiple legal obligations and reducing administrative burden【2】. \n",
304
- "\n",
305
- "- **Obligation to inform data subjects** \n",
306
- " When the breach is likely to result in a **high risk to the rights and freedoms of natural persons**, the controller must **inform the affected individuals without undue delay**, providing clear, understandable information about the nature of the breach and recommended protective steps. \n",
307
- "\n",
308
- "- **Supervisory‑authority role** \n",
309
- " Supervisory Authorities, established under the GDPR, receive the notifications, assess the risk, may request further information, and are responsible for enforcing any required remedial actions. \n",
310
- "\n",
311
- "These requirements together ensure that personal‑data breaches are reported promptly, transparently, and with sufficient detail for authorities and affected individuals to respond effectively. \n",
312
- "\n",
313
- "### References\n",
314
- "\n",
315
- "- [1] cyber_resilience_act_2024_2847.txt \n",
316
- "- [2] nis2_2022_2555.txt \n",
317
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
318
- ]
319
- },
320
- {
321
- "name": "stdout",
322
- "output_type": "stream",
323
- "text": [
324
- "--------------------------------------------------------------------------------\n",
325
- "🤖 Agent Response:\n",
326
- "--------------------------------------------------------------------------------\n",
327
- "**GDPR data‑breach notification requirements – plain‑language summary**\n",
328
- "\n",
329
- "| What you must do | When you must do it | What you need to include |\n",
330
- "|------------------|---------------------|--------------------------|\n",
331
- "| **Notify the supervisory authority** (the data‑protection regulator) | **Within 72 hours** of becoming aware of a breach that could affect personal data. If the breach is clearly severe, you must send an early **warning within 24 hours**. | • A clear description of what happened <br>• The severity and potential impact <br>• The cause or type of threat (e.g., hacking, lost laptop) <br>• Measures already taken and any steps users should take <br>• Whether the breach appears to be caused by unlawful or malicious acts <br>• How sensitive the disclosed information is |\n",
332
- "| **Submit a f\n",
333
- "...\n",
334
- "\n",
335
- "⏱️ Processing Time: 0.00s\n",
336
- "📅 Timestamp: 2026-01-06T11:53:48.934270\n",
337
- "================================================================================\n"
338
- ]
339
- }
340
- ],
341
- "source": [
342
- "async def test_knowledge_graph_query():\n",
343
- " print(\"=\" * 80)\n",
344
- " print(\"TEST 2: Tool Calling - Knowledge Graph Query\")\n",
345
- " print(\"=\" * 80)\n",
346
- " \n",
347
- " user_query = \"What are the data breach notification requirements under GDPR?\"\n",
348
- " print(f\"\\n👤 User Query: {user_query}\\n\")\n",
349
- " print(\"Processing...\\n\")\n",
350
- " \n",
351
- " result = await client_agent.process_query(\n",
352
- " user_query=user_query,\n",
353
- " conversation_history=[]\n",
354
- " )\n",
355
- " \n",
356
- " print(\"-\" * 80)\n",
357
- " print(\"🤖 Agent Response:\")\n",
358
- " print(\"-\" * 80)\n",
359
- " print(result['response'][:800])\n",
360
- " print(\"...\" if len(result['response']) > 800 else \"\")\n",
361
- " \n",
362
- " print(f\"\\n⏱️ Processing Time: {result['processing_time']:.2f}s\")\n",
363
- " print(f\"📅 Timestamp: {result['timestamp']}\")\n",
364
- " print(\"=\" * 80)\n",
365
- "\n",
366
- "await test_knowledge_graph_query()"
367
- ]
368
- },
369
- {
370
- "cell_type": "markdown",
371
- "metadata": {},
372
- "source": [
373
- "## 4. Test 3: Tool Calling - Find Lawyers\n",
374
- "\n",
375
- "**Scenario**: User with a data breach issue needs lawyer recommendations.\n",
376
- "\n",
377
- "**Expected Behavior**: Agent calls `find_lawyers` tool, receives recommendations, then presents them."
378
- ]
379
- },
380
- {
381
- "cell_type": "code",
382
- "execution_count": null,
383
- "metadata": {},
384
- "outputs": [
385
- {
386
- "name": "stdout",
387
- "output_type": "stream",
388
- "text": [
389
- "================================================================================\n",
390
- "TEST 3: Tool Calling - Find Lawyers\n",
391
- "================================================================================\n",
392
- "\n",
393
- "👤 User Query: I prefer a Romanian-based firm, English language is not fine, and my budget is around 2000-5000 EUR for this incident. I need help immediately with breach notification.\n",
394
- "\n",
395
- "Processing...\n",
396
- "\n"
397
- ]
398
- },
399
- {
400
- "name": "stderr",
401
- "output_type": "stream",
402
- "text": [
403
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
404
- "INFO:langraph_agent:🔧 Calling tools: ['find_lawyers']\n"
405
- ]
406
- },
407
- {
408
- "name": "stdout",
409
- "output_type": "stream",
410
- "text": [
411
- "[{'name': 'find_lawyers', 'args': {'query': 'Romanian law firm for GDPR data breach notification, budget 2000-5000 EUR, immediate help, English not required', 'conversation_history': [{'role': 'user', 'content': \"My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven't notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.\"}, {'role': 'assistant', 'content': 'I can help with that. Can you tell me more about the breach?'}, {'role': 'user', 'content': 'I prefer a Romanian-based firm, English language is not fine, and my budget is around 2000-5000 EUR for this incident. I need help immediately with breach notification.'}]}, 'id': 'c53ab7aa9', 'type': 'tool_call'}]\n"
412
- ]
413
- },
414
- {
415
- "name": "stderr",
416
- "output_type": "stream",
417
- "text": [
418
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 404 Not Found\"\n",
419
- "INFO:langraph_agent:🔧 Tool find_lawyers returned: Error finding lawyers: Error code: 404 - {'message': 'Model gpt-5-nano-2025-08-07 does not exist or you do not have access to it.', 'type': 'not_found_error', 'param': 'model', 'code': 'model_not_found'}\n",
420
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 429 Too Many Requests\"\n",
421
- "INFO:openai._base_client:Retrying request to /chat/completions in 59.000000 seconds\n"
422
- ]
423
- }
424
- ],
425
- "source": [
426
- "async def test_find_lawyers():\n",
427
- " print(\"=\" * 80)\n",
428
- " print(\"TEST 3: Tool Calling - Find Lawyers\")\n",
429
- " print(\"=\" * 80)\n",
430
- " \n",
431
- " # Create conversation history\n",
432
- " history = [\n",
433
- " {'role': 'user', 'content': 'I need help with a data breach issue'},\n",
434
- " {'role': 'assistant', 'content': 'I can help with that. Can you tell me more about the breach?'},\n",
435
- " {'role': 'user', 'content': 'My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven\\'t notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.'}\n",
436
- " ]\n",
437
- " \n",
438
- " user_query = \"I prefer a Romanian-based firm, English language is not fine, and my budget is around 2000-5000 EUR for this incident. I need help immediately with breach notification.\"\n",
439
- " print(f\"\\n👤 User Query: {user_query}\\n\")\n",
440
- " print(\"Processing...\\n\")\n",
441
- " \n",
442
- " result = await client_agent.process_query(\n",
443
- " user_query=user_query,\n",
444
- " conversation_history=history\n",
445
- " )\n",
446
- " \n",
447
- " print(\"-\" * 80)\n",
448
- " print(\"🤖 Agent Response:\")\n",
449
- " print(\"-\" * 80)\n",
450
- " print(result['response'][:1000])\n",
451
- " print(\"...\" if len(result['response']) > 1000 else \"\")\n",
452
- " \n",
453
- " print(f\"\\n⏱️ Processing Time: {result['processing_time']:.2f}s\")\n",
454
- " print(f\"📅 Timestamp: {result['timestamp']}\")\n",
455
- " print(\"=\" * 80)\n",
456
- "\n",
457
- "await test_find_lawyers()"
458
- ]
459
- },
460
- {
461
- "cell_type": "markdown",
462
- "metadata": {},
463
- "source": [
464
- "## 5. Test 4: Lawyer Agent - Professional Tone\n",
465
- "\n",
466
- "**Scenario**: Legal professional asks a technical question about NIS2.\n",
467
- "\n",
468
- "**Expected Behavior**: Lawyer agent responds with professional, technical language using knowledge graph."
469
- ]
470
- },
471
- {
472
- "cell_type": "code",
473
- "execution_count": null,
474
- "metadata": {},
475
- "outputs": [],
476
- "source": [
477
- "async def test_lawyer_agent():\n",
478
- " print(\"=\" * 80)\n",
479
- " print(\"TEST 4: Lawyer Agent - Professional Technical Response\")\n",
480
- " print(\"=\" * 80)\n",
481
- " \n",
482
- " user_query = \"What are the data breach notification requirements under NIS2 Directive?\"\n",
483
- " print(f\"\\n👨‍⚖️ Lawyer Query: {user_query}\\n\")\n",
484
- " print(\"Processing...\\n\")\n",
485
- " \n",
486
- " result = await lawyer_agent.process_query(\n",
487
- " user_query=user_query,\n",
488
- " conversation_history=[]\n",
489
- " )\n",
490
- " \n",
491
- " print(\"-\" * 80)\n",
492
- " print(\"🤖 Agent Response:\")\n",
493
- " print(\"-\" * 80)\n",
494
- " print(result['response'][:800])\n",
495
- " print(\"...\" if len(result['response']) > 800 else \"\")\n",
496
- " \n",
497
- " print(f\"\\n⏱️ Processing Time: {result['processing_time']:.2f}s\")\n",
498
- " print(f\"📅 Timestamp: {result['timestamp']}\")\n",
499
- " print(\"=\" * 80)\n",
500
- "\n",
501
- "await test_lawyer_agent()"
502
- ]
503
- },
504
- {
505
- "cell_type": "markdown",
506
- "metadata": {},
507
- "source": [
508
- "## 6. Test 5: Tool Choice Decision\n",
509
- "\n",
510
- "**Scenario**: User asks something that could be answered either way.\n",
511
- "\n",
512
- "**Expected Behavior**: Agent decides whether to call tools or answer directly based on question complexity."
513
- ]
514
- },
515
- {
516
- "cell_type": "code",
517
- "execution_count": null,
518
- "metadata": {},
519
- "outputs": [],
520
- "source": [
521
- "async def test_tool_choice():\n",
522
- " print(\"=\" * 80)\n",
523
- " print(\"TEST 5: Tool Choice Decision\")\n",
524
- " print(\"=\" * 80)\n",
525
- " \n",
526
- " # Simple question - might answer directly\n",
527
- " simple_query = \"What is the purpose of the NIS2 Directive?\"\n",
528
- " print(f\"\\n👤 Simple Query: {simple_query}\\n\")\n",
529
- " print(\"Processing...\\n\")\n",
530
- " \n",
531
- " result = await client_agent.process_query(\n",
532
- " user_query=simple_query,\n",
533
- " conversation_history=[]\n",
534
- " )\n",
535
- " \n",
536
- " print(\"-\" * 80)\n",
537
- " print(\"🤖 Agent Response:\")\n",
538
- " print(\"-\" * 80)\n",
539
- " print(result['response'][:600])\n",
540
- " print(\"...\" if len(result['response']) > 600 else \"\")\n",
541
- " \n",
542
- " print(f\"\\n⏱️ Processing Time: {result['processing_time']:.2f}s\")\n",
543
- " print(\"=\" * 80)\n",
544
- "\n",
545
- "await test_tool_choice()"
546
- ]
547
- },
548
- {
549
- "cell_type": "markdown",
550
- "metadata": {},
551
- "source": [
552
- "## 7. Compare Processing Times\n",
553
- "\n",
554
- "Let's run all tests and compare their processing times to understand the performance impact of tool calling."
555
- ]
556
- },
557
- {
558
- "cell_type": "code",
559
- "execution_count": null,
560
- "metadata": {},
561
- "outputs": [],
562
- "source": [
563
- "import pandas as pd\n",
564
- "\n",
565
- "async def run_all_tests():\n",
566
- " results = []\n",
567
- " \n",
568
- " # Test 1: Direct answer\n",
569
- " result1 = await client_agent.process_query(\n",
570
- " user_query=\"What is GDPR?\",\n",
571
- " conversation_history=[]\n",
572
- " )\n",
573
- " results.append({\n",
574
- " 'Test': 'Direct Answer',\n",
575
- " 'Tools Called': 0,\n",
576
- " 'Processing Time (s)': result1['processing_time']\n",
577
- " })\n",
578
- " \n",
579
- " # Test 2: Knowledge graph\n",
580
- " result2 = await client_agent.process_query(\n",
581
- " user_query=\"What are GDPR breach notification requirements?\",\n",
582
- " conversation_history=[]\n",
583
- " )\n",
584
- " results.append({\n",
585
- " 'Test': 'Knowledge Graph Query',\n",
586
- " 'Tools Called': 1,\n",
587
- " 'Processing Time (s)': result2['processing_time']\n",
588
- " })\n",
589
- " \n",
590
- " # Test 3: Find lawyers\n",
591
- " result3 = await client_agent.process_query(\n",
592
- " user_query=\"I need a lawyer for a GDPR data breach in Romania\",\n",
593
- " conversation_history=[\n",
594
- " {'role': 'user', 'content': 'My company experienced a data breach in Romania with 500 affected customers.'}\n",
595
- " ]\n",
596
- " )\n",
597
- " results.append({\n",
598
- " 'Test': 'Find Lawyers',\n",
599
- " 'Tools Called': 1,\n",
600
- " 'Processing Time (s)': result3['processing_time']\n",
601
- " })\n",
602
- " \n",
603
- " # Test 4: Lawyer agent\n",
604
- " result4 = await lawyer_agent.process_query(\n",
605
- " user_query=\"What are NIS2 notification requirements?\",\n",
606
- " conversation_history=[]\n",
607
- " )\n",
608
- " results.append({\n",
609
- " 'Test': 'Lawyer Agent Query',\n",
610
- " 'Tools Called': 1,\n",
611
- " 'Processing Time (s)': result4['processing_time']\n",
612
- " })\n",
613
- " \n",
614
- " return pd.DataFrame(results)\n",
615
- "\n",
616
- "df = await run_all_tests()\n",
617
- "print(\"\\n\" + \"=\"*80)\n",
618
- "print(\"TEST RESULTS SUMMARY\")\n",
619
- "print(\"=\"*80)\n",
620
- "display(df)\n",
621
- "\n",
622
- "print(\"\\n💡 Insights:\")\n",
623
- "print(\"- Direct answers are fastest (no tool overhead)\")\n",
624
- "print(\"- Tool calls add processing time but provide accurate, sourced information\")\n",
625
- "print(\"- The agent intelligently chooses when to use tools based on query complexity\")"
626
- ]
627
- },
628
- {
629
- "cell_type": "markdown",
630
- "metadata": {},
631
- "source": [
632
- "## Summary\n",
633
- "\n",
634
- "### Key Takeaways:\n",
635
- "\n",
636
- "1. **Flexible Architecture**: The agent can be initialized with different system prompts and tool sets\n",
637
- "\n",
638
- "2. **Intelligent Tool Selection**: The LLM decides when to call tools based on the query\n",
639
- "\n",
640
- "3. **Iterative Process**: Tools can be called multiple times, with results fed back to the agent\n",
641
- "\n",
642
- "4. **User Type Specialization**: Different prompts and tools for clients vs lawyers\n",
643
- "\n",
644
- "5. **Performance Trade-off**: Direct answers are faster, but tool calls provide more accurate, sourced information\n",
645
- "\n",
646
- "### Architecture Benefits:\n",
647
- "- ✅ Modular design - easy to add new tools\n",
648
- "- ✅ Clear separation of concerns\n",
649
- "- ✅ Flexible configuration\n",
650
- "- ✅ Maintains conversation context\n",
651
- "- ✅ Suitable for API integration"
652
- ]
653
- }
654
- ],
655
- "metadata": {
656
- "kernelspec": {
657
- "display_name": "cyberlgl",
658
- "language": "python",
659
- "name": "python3"
660
- },
661
- "language_info": {
662
- "codemirror_mode": {
663
- "name": "ipython",
664
- "version": 3
665
- },
666
- "file_extension": ".py",
667
- "mimetype": "text/x-python",
668
- "name": "python",
669
- "nbconvert_exporter": "python",
670
- "pygments_lexer": "ipython3",
671
- "version": "3.12.12"
672
- }
673
- },
674
- "nbformat": 4,
675
- "nbformat_minor": 4
676
- }
agent_state.py → utils/conversation_manager.py RENAMED

@@ -1,31 +1,12 @@
  #!/usr/bin/env python3
  """
- Agent state management for the LangGraph cyber-legal assistant
  """

- from typing import TypedDict, List, Dict, Any, Optional
  from datetime import datetime


- class AgentState(TypedDict):
-     """
-     State definition for the LangGraph agent workflow
-     """
-     # User interaction
-     user_query: str
-     conversation_history: List[Dict[str, str]]
-     intermediate_steps: List[Dict[str, Any]]
-     system_prompt: Optional[str]
-
-     # Context processing
-     relevant_documents: List[str]
-
-     # Metadata
-     query_timestamp: str
-     processing_time: Optional[float]
-     jurisdiction: Optional[str]
-
-
  class ConversationManager:
      """
      Manages conversation history and context
@@ -87,3 +68,62 @@ class ConversationManager:
          context_parts.append(f"{role}: {exchange['content']}")

          return "\n".join(context_parts)

  #!/usr/bin/env python3
  """
+ Conversation management for the agent
  """

+ from typing import Dict, List, Any
  from datetime import datetime

  class ConversationManager:
      """
      Manages conversation history and context

          context_parts.append(f"{role}: {exchange['content']}")

          return "\n".join(context_parts)
+
+
+ class ConversationFormatter:
+     """
+     Format conversation data for different purposes
+     """
+
+     @staticmethod
+     def build_conversation_history(history: List[Dict[str, str]], max_turns: int = 10) -> List[Dict[str, str]]:
+         """
+         Build conversation history for LightRAG API
+         """
+         if not history:
+             return []
+
+         # Take last max_turns pairs (user + assistant)
+         recent_history = history[-max_turns*2:]
+         formatted = []
+
+         for exchange in recent_history:
+             # Handle both Message objects and dictionary formats
+             if hasattr(exchange, 'role'):
+                 role = exchange.role
+                 content = exchange.content
+             else:
+                 role = exchange["role"]
+                 content = exchange["content"]
+
+             formatted.append({
+                 "role": role,
+                 "content": content
+             })
+
+         return formatted
+
+     @staticmethod
+     def create_context_summary(history: List[Dict[str, str]]) -> str:
+         """
+         Create a summary of conversation context
+         """
+         if not history:
+             return "No previous conversation."
+
+         recent_exchanges = history[-4:]  # Last 2 exchanges
+         context_parts = []
+
+         for exchange in recent_exchanges:
+             # Handle both Message objects and dictionary formats
+             if hasattr(exchange, 'role'):
+                 role = "User" if exchange.role == "user" else "Assistant"
+                 content = exchange.content
+             else:
+                 role = "User" if exchange["role"] == "user" else "Assistant"
+                 content = exchange["content"]
+
+             content = content[:100] + "..." if len(content) > 100 else content
+             context_parts.append(f"{role}: {content}")
+
+         return "\n".join(context_parts)
utils.py → utils/lightrag_client.py RENAMED

@@ -1,14 +1,13 @@
  #!/usr/bin/env python3
  """
- Utility functions for LightRAG integration and agent operations
  """

  import os
  import requests
  import time
- from typing import Dict, List, Any, Optional, Tuple
  from dotenv import load_dotenv
- from datetime import datetime
  import logging

  # Load environment variables
@@ -24,6 +23,7 @@ LIGHTRAG_HOST = os.getenv("LIGHTRAG_HOST", "127.0.0.1")
  SERVER_URL = f"http://{LIGHTRAG_HOST}:{LIGHTRAG_PORT}"
  API_KEY = os.getenv("LIGHTRAG_API_KEY")

  class LightRAGClient:
      """
      Client for interacting with LightRAG server
@@ -77,9 +77,8 @@ class LightRAGClient:
      )

      if response.status_code == 200:
-         logger.info(f"Query successful;{response.json()}")
          return response.json()
-
      else:
          logger.warning(f"Query failed with status {response.status_code}, attempt {attempt + 1}")

@@ -150,142 +149,3 @@ class ResponseProcessor:
          legal_entities.append(reg)

      return list(set(legal_entities))  # Remove duplicates
-
-
- class ConversationFormatter:
-     """
-     Format conversation data for different purposes
-     """
-
-     @staticmethod
-     def build_conversation_history(history: List[Dict[str, str]], max_turns: int = 10) -> List[Dict[str, str]]:
-         """
-         Build conversation history for LightRAG API
-         """
-         if not history:
-             return []
-
-         # Take last max_turns pairs (user + assistant)
-         recent_history = history[-max_turns*2:]
-         formatted = []
-
-         for exchange in recent_history:
-             # Handle both Message objects and dictionary formats
-             if hasattr(exchange, 'role'):
-                 role = exchange.role
-                 content = exchange.content
-             else:
-                 role = exchange["role"]
-                 content = exchange["content"]
-
-             formatted.append({
-                 "role": role,
-                 "content": content
-             })
-
-         return formatted
-
-     @staticmethod
-     def create_context_summary(history: List[Dict[str, str]]) -> str:
-         """
-         Create a summary of conversation context
-         """
-         if not history:
-             return "No previous conversation."
-
-         recent_exchanges = history[-4:]  # Last 2 exchanges
-         context_parts = []
-
-         for exchange in recent_exchanges:
-             # Handle both Message objects and dictionary formats
-             if hasattr(exchange, 'role'):
-                 role = "User" if exchange.role == "user" else "Assistant"
-                 content = exchange.content
-             else:
-                 role = "User" if exchange["role"] == "user" else "Assistant"
-                 content = exchange["content"]
-
-             content = content[:100] + "..." if len(content) > 100 else content
-             context_parts.append(f"{role}: {content}")
-
-         return "\n".join(context_parts)
-
-
- class PerformanceMonitor:
-     """
-     Monitor agent performance and timing
-     """
-
-     def __init__(self):
-         self.metrics = {}
-
-     def start_timer(self, operation: str) -> None:
-         """
-         Start timing an operation
-         """
-         self.metrics[f"{operation}_start"] = time.time()
-
-     def end_timer(self, operation: str) -> float:
-         """
-         End timing an operation and return duration
-         """
-         start_time = self.metrics.get(f"{operation}_start")
-         if start_time:
-             duration = time.time() - start_time
-             self.metrics[f"{operation}_duration"] = duration
-             return duration
-         return 0.0
-
-     def get_metrics(self) -> Dict[str, Any]:
-         """
-         Get all collected metrics
-         """
-         return self.metrics.copy()
-
-     def reset(self) -> None:
-         """
-         Reset all metrics
-         """
-         self.metrics.clear()
-
-
- def validate_query(query: str) -> Tuple[bool, Optional[str]]:
-     """
-     Validate user query
-     """
-     if not query or not query.strip():
-         return False, "Query cannot be empty."
-
-     if len(query) > 2500:
-         return False, "Query is too long. Please keep it under 1000 characters."
-
-     return True, None
-
-
- def format_error_message(error: str) -> str:
-     """
-     Format error messages for user display
-     """
-     error_map = {
-         "Server unreachable": "❌ The legal database is currently unavailable. Please try again in a moment.",
-         "timeout": "❌ The request timed out. Please try again.",
-         "invalid json": "❌ There was an issue processing the response. Please try again.",
-         "health check failed": "❌ The system is initializing. Please wait a moment and try again."
-     }
-
-     for key, message in error_map.items():
-         if key.lower() in error.lower():
-             return message
-
-     return f"❌ An error occurred: {error}"
-
-
- def create_safe_filename(query: str, timestamp: str) -> str:
-     """
-     Create a safe filename for logging purposes
-     """
-     # Remove problematic characters
-     safe_query = "".join(c for c in query if c.isalnum() or c in (' ', '-', '_')).strip()
-     safe_query = safe_query[:50]  # Limit length
-
-     return f"{timestamp}_{safe_query}.log"

  #!/usr/bin/env python3
  """
+ LightRAG client for interacting with the RAG server
  """

  import os
  import requests
  import time
+ from typing import Dict, List, Any, Optional
  from dotenv import load_dotenv
  import logging

  # Load environment variables

  SERVER_URL = f"http://{LIGHTRAG_HOST}:{LIGHTRAG_PORT}"
  API_KEY = os.getenv("LIGHTRAG_API_KEY")

+
  class LightRAGClient:
      """
      Client for interacting with LightRAG server

      )

      if response.status_code == 200:
+         logger.info(f"Query successful")
          return response.json()
      else:
          logger.warning(f"Query failed with status {response.status_code}, attempt {attempt + 1}")

          legal_entities.append(reg)

      return list(set(legal_entities))  # Remove duplicates
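The `LightRAGClient.query` loop above retries on non-200 responses and logs a warning per attempt. The same retry pattern, sketched generically with a callable standing in for the HTTP request (names here are illustrative, not from the repo):

```python
import time
from typing import Callable, Optional

def query_with_retries(request: Callable[[], Optional[dict]],
                       max_retries: int = 3, delay: float = 0.0) -> Optional[dict]:
    """Call `request` until it returns a result or the attempts run out."""
    for attempt in range(max_retries):
        result = request()
        if result is not None:
            return result
        # The real client logs a warning here before the next attempt.
        if attempt < max_retries - 1:
            time.sleep(delay)
    return None

# Simulate a server that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    return {"response": "ok"} if calls["n"] >= 3 else None

r = query_with_retries(flaky)
print(r)           # {'response': 'ok'}
print(calls["n"])  # 3
```

Separating the retry policy from the request itself keeps the back-off behavior testable without a live LightRAG server.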
tools.py → utils/tools.py RENAMED

@@ -7,13 +7,15 @@ import os
  from typing import List, Dict, Any, Optional
  from langchain_core.tools import tool
  from langchain_tavily import TavilySearch
- from lawyer_selector import LawyerSelectorAgent
- from utils import LightRAGClient, ConversationFormatter

  # Global instances - will be initialized in agent_api.py
  lawyer_selector_agent: Optional[LawyerSelectorAgent] = None
  lightrag_client: Optional[LightRAGClient] = None
  tavily_search = None

  @tool
  async def query_knowledge_graph(query: str, conversation_history: List[Dict[str, str]]) -> str:
@@ -77,6 +79,26 @@ async def search_web(query: str) -> str:
      except Exception as e:
          return f"Error: {str(e)}"

  @tool
  async def find_lawyers(query: str, conversation_history: List[Dict[str, str]]) -> str:
      """
@@ -94,52 +116,19 @@ async def find_lawyers(query: str, conversation_history: List[Dict[str, str]]) -
      conversation_history: The full conversation history with the user (automatically provided by the agent)

      Returns:
-         A formatted string with the top 3 lawyer recommendations, including:
-         - Lawyer name and presentation
-         - Experience and specialty
-         - Client-friendly explanation of why they match the case
-         - Areas of practice
      """
      try:
-         # Use the globally initialized lawyer selector agent
          if lawyer_selector_agent is None:
              raise ValueError("LawyerSelectorAgent not initialized. Please initialize it in agent_api.py")

-         # Get lawyer recommendations using the conversation history
-         result = await lawyer_selector_agent.select_lawyers(conversation_history)
-         top_lawyers = result["top_lawyers"]
-
-         # Format the output for the user
-         output = ["=" * 80, "TOP 3 RECOMMENDED LAWYERS FOR YOUR CASE", "=" * 80]
-
-         for lawyer in top_lawyers:
-             output.append("\n" + "─" * 80)
-             output.append(f"RECOMMENDATION #{lawyer['rank']}")
-             output.append("─" * 80)
-             output.append(f"\n👤 {lawyer['name']}")
-             output.append(f"   {lawyer['presentation']}")
-             output.append(f"\n📊 Experience: {lawyer['experience_years']} years")
-             output.append(f"🎯 Specialty: {lawyer['specialty']}")
-             output.append(f"\n✅ Why this lawyer matches your case:")
-             output.append(f"   {lawyer['reasoning']}")
-             output.append(f"\n📚 Areas of Practice:")
-             for area in lawyer['areas_of_practice']:
-                 output.append(f"   • {area}")
-             output.append("")
-
-         return "\n".join(output)

      except Exception as e:
          return f"Error finding lawyers: {str(e)}"

  # Export tool sets for different user types
-
- # Tools available to general clients (knowledge graph + lawyer finder + web search)
  tools_for_client = [query_knowledge_graph, find_lawyers, search_web]
-
- # Tools available to lawyers (knowledge graph + web search for current legal updates)
  tools_for_lawyer = [query_knowledge_graph, search_web]
-
- # Default tools (backward compatibility - client tools)
  tools = tools_for_client

  from typing import List, Dict, Any, Optional
  from langchain_core.tools import tool
  from langchain_tavily import TavilySearch
+ from subagents.lawyer_selector import LawyerSelectorAgent
+ from utils.lightrag_client import LightRAGClient
+ import resend

  # Global instances - will be initialized in agent_api.py
  lawyer_selector_agent: Optional[LawyerSelectorAgent] = None
  lightrag_client: Optional[LightRAGClient] = None
  tavily_search = None
+ resend_api_key: Optional[str] = None

  @tool
  async def query_knowledge_graph(query: str, conversation_history: List[Dict[str, str]]) -> str:

      except Exception as e:
          return f"Error: {str(e)}"

+ @tool
+ async def send_email(to_email: str, subject: str, content: str) -> str:
+     """Send an email using Resend."""
+     try:
+         from_email = os.getenv("RESEND_FROM_EMAIL")
+         from_name = os.getenv("RESEND_FROM_NAME", "CyberLegalAI")
+
+         params = {
+             "from": f"{from_name} <{from_email}>",
+             "to": [to_email],
+             "subject": subject,
+             "text": content
+         }
+
+         response = resend.Emails.send(params)
+         return f"✅ Email sent to {to_email} (ID: {response.get('id', 'N/A')})"
+
+     except Exception as e:
+         return f"❌ Failed: {str(e)}"
+
  @tool
  async def find_lawyers(query: str, conversation_history: List[Dict[str, str]]) -> str:
      """

      conversation_history: The full conversation history with the user (automatically provided by the agent)

      Returns:
+         A formatted string with the top 3 lawyer recommendations
      """
      try:
          if lawyer_selector_agent is None:
              raise ValueError("LawyerSelectorAgent not initialized. Please initialize it in agent_api.py")

+         return await lawyer_selector_agent.select_lawyers(conversation_history)

      except Exception as e:
          return f"Error finding lawyers: {str(e)}"

  # Export tool sets for different user types
+ tools_for_client = [query_knowledge_graph, find_lawyers, search_web, send_email]
  tools_for_lawyer = [query_knowledge_graph, search_web]
  tools = tools_for_client
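The per-user-type tool sets exported at the bottom of `utils/tools.py` lend themselves to a small lookup helper. An illustrative sketch (the real lists hold LangChain tool objects; tool-name strings stand in here, and `tools_for` is a hypothetical helper, not in the repo):

```python
from typing import Dict, List

# Stand-ins for the LangChain tool objects exported by utils/tools.py.
tools_for_client: List[str] = ["query_knowledge_graph", "find_lawyers", "search_web", "send_email"]
tools_for_lawyer: List[str] = ["query_knowledge_graph", "search_web"]

TOOLSETS: Dict[str, List[str]] = {
    "client": tools_for_client,
    "lawyer": tools_for_lawyer,
}

def tools_for(user_type: str) -> List[str]:
    # Default to the client tool set, mirroring `tools = tools_for_client`.
    return TOOLSETS.get(user_type, tools_for_client)

print(tools_for("lawyer"))  # ['query_knowledge_graph', 'search_web']
```

Keeping the mapping in one place makes it easy to add a new user type without touching the agent wiring in agent_api.py.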
utils/utils.py ADDED

@@ -0,0 +1,92 @@
+ #!/usr/bin/env python3
+ """
+ Utility functions for agent operations
+ """
+
+ import time
+ from typing import Optional, Tuple
+ import logging
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+
+ class PerformanceMonitor:
+     """
+     Monitor agent performance and timing
+     """
+
+     def __init__(self):
+         self.metrics = {}
+
+     def start_timer(self, operation: str) -> None:
+         """
+         Start timing an operation
+         """
+         self.metrics[f"{operation}_start"] = time.time()
+
+     def end_timer(self, operation: str) -> float:
+         """
+         End timing an operation and return duration
+         """
+         start_time = self.metrics.get(f"{operation}_start")
+         if start_time:
+             duration = time.time() - start_time
+             self.metrics[f"{operation}_duration"] = duration
+             return duration
+         return 0.0
+
+     def get_metrics(self) -> dict:
+         """
+         Get all collected metrics
+         """
+         return self.metrics.copy()
+
+     def reset(self) -> None:
+         """
+         Reset all metrics
+         """
+         self.metrics.clear()
+
+
+ def validate_query(query: str) -> Tuple[bool, Optional[str]]:
+     """
+     Validate user query
+     """
+     if not query or not query.strip():
+         return False, "Query cannot be empty."
+
+     if len(query) > 2500:
+         return False, "Query is too long. Please keep it under 2500 characters."
+
+     return True, None
+
+
+ def format_error_message(error: str) -> str:
+     """
+     Format error messages for user display
+     """
+     error_map = {
+         "Server unreachable": "❌ The legal database is currently unavailable. Please try again in a moment.",
+         "timeout": "❌ The request timed out. Please try again.",
+         "invalid json": "❌ There was an issue processing the response. Please try again.",
+         "health check failed": "❌ The system is initializing. Please wait a moment and try again."
+     }
+
+     for key, message in error_map.items():
+         if key.lower() in error.lower():
+             return message
+
+     return f"❌ An error occurred: {error}"
+
+
+ def create_safe_filename(query: str, timestamp: str) -> str:
+     """
+     Create a safe filename for logging purposes
+     """
+     # Remove problematic characters
+     safe_query = "".join(c for c in query if c.isalnum() or c in (' ', '-', '_')).strip()
+     safe_query = safe_query[:50]  # Limit length
+
+     return f"{timestamp}_{safe_query}.log"
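The validation and filename helpers added in `utils/utils.py` in a quick usage sketch, with the two functions inlined so the snippet runs standalone:

```python
# Inlined from utils/utils.py for a self-contained demo.
def validate_query(query):
    """Reject empty or overlong queries; return (ok, error_message)."""
    if not query or not query.strip():
        return False, "Query cannot be empty."
    if len(query) > 2500:
        return False, "Query is too long."
    return True, None

def create_safe_filename(query, timestamp):
    """Keep only filesystem-safe characters and cap the length."""
    safe_query = "".join(c for c in query if c.isalnum() or c in (' ', '-', '_')).strip()
    safe_query = safe_query[:50]
    return f"{timestamp}_{safe_query}.log"

ok, err = validate_query("What is GDPR?")
print(ok, err)  # True None
print(create_safe_filename("What is GDPR?", "20250101"))  # 20250101_What is GDPR.log
```

Note that punctuation like `?` is dropped by the character filter, so log filenames stay portable across filesystems.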