Charles Grandjean committed on
Commit 695b33f · 1 Parent(s): 9a9d495

reorganizing the project

Files changed (38)
  1. CL_LAST_BK.png +0 -0
  2. DEPLOYMENT_GUIDE.md +0 -300
  3. agent_api.py +20 -52
  4. agent_states/agent_state.py +26 -0
  5. pdf_analyzer_state.py → agent_states/pdf_analyzer_state.py +0 -0
  6. data/rag_storage/graph_chunk_entity_relation.graphml +0 -0
  7. data/rag_storage/kv_store_doc_status.json +0 -3
  8. data/rag_storage/kv_store_entity_chunks.json +0 -3
  9. data/rag_storage/kv_store_full_docs.json +0 -3
  10. data/rag_storage/kv_store_full_entities.json +0 -3
  11. data/rag_storage/kv_store_full_relations.json +0 -3
  12. data/rag_storage/kv_store_llm_response_cache.json +0 -3
  13. data/rag_storage/kv_store_relation_chunks.json +0 -3
  14. data/rag_storage/kv_store_text_chunks.json +0 -3
  15. data/rag_storage/vdb_entities.json +0 -3
  16. data/rag_storage/vdb_relationships.json +0 -3
  17. docker-compose.yml +2 -0
  18. langraph_agent.py +5 -5
  19. lightrag.log +0 -51
  20. prompts/__init__.py +1 -0
  21. prompts_lawyer_selector.py → prompts/lawyer_selector.py +32 -1
  22. prompts.py → prompts/main.py +21 -1
  23. prompts_pdf_analyzer.py → prompts/pdf_analyzer.py +0 -0
  24. requirements.txt +1 -0
  25. startup.sh +35 -25
  26. structured_outputs/__init__.py +1 -0
  27. structured_outputs/api_models.py +64 -0
  28. structured_outputs/lawyer_selector.py +19 -0
  29. subagents/__init__.py +1 -0
  30. lawyer_selector.py → subagents/lawyer_selector.py +48 -26
  31. pdf_analyzer.py → subagents/pdf_analyzer.py +2 -2
  32. test_agent.ipynb +0 -152
  33. test_openai_key.ipynb +0 -155
  34. test_tool_calling_demo.ipynb +0 -676
  35. agent_state.py → utils/conversation_manager.py +61 -21
  36. utils.py → utils/lightrag_client.py +4 -144
  37. tools.py → utils/tools.py +27 -38
  38. utils/utils.py +92 -0
CL_LAST_BK.png DELETED
Binary file (11.8 kB)
 
DEPLOYMENT_GUIDE.md DELETED
@@ -1,300 +0,0 @@
- # 🚀 Free Deployment Guide - CyberLegal AI
-
- ## 📋 Free Deployment Options
-
- ### 1. **Render.com** (Recommended for beginners)
- **✅ Advantages**: Simple, free for small apps, automatic deployment
- **📦 Plan**: Free tier (750h/month)
-
- #### Steps:
- ```bash
- # 1. Create a render.yaml file
- cat > render.yaml << 'EOF'
- services:
-   - type: web
-     name: cyberlegal-ai
-     env: docker
-     plan: free
-     dockerfilePath: ./Dockerfile
-     dockerContext: .
-     envVars:
-       - key: OPENAI_API_KEY
-         sync: false
-       - key: LIGHTRAG_HOST
-         value: 0.0.0.0
-       - key: LIGHTRAG_PORT
-         value: 9621
-     healthCheckPath: /health
-     autoDeploy: true
- EOF
-
- # 2. Push to GitHub
- git add .
- git commit -m "Deploy to Render"
- git push origin main
-
- # 3. Connect GitHub to Render.com
- # → New Web Service → Connect GitHub → Select repo
- ```
-
- ### 2. **Railway.app** (Very popular)
- **✅ Advantages**: $5 free credit, native Docker, free database
- **📦 Plan**: $5 credit/month (enough for moderate use)
-
- #### Steps:
- ```bash
- # 1. Install the Railway CLI
- npm install -g @railway/cli
-
- # 2. Log in
- railway login
-
- # 3. Deploy
- railway up
-
- # 4. Set environment variables
- railway variables set OPENAI_API_KEY=your_key_here
- railway variables set LIGHTRAG_HOST=0.0.0.0
- railway variables set LIGHTRAG_PORT=9621
- ```
-
- ### 3. **Fly.io** (For advanced users)
- **✅ Advantages**: 160 free hours/month, Docker, worldwide deployment
- **📦 Plan**: Free tier with shared CPU
-
- #### Steps:
- ```bash
- # 1. Install the Fly CLI
- curl -L https://fly.io/install.sh | sh
-
- # 2. Log in
- fly auth login
-
- # 3. Initialize
- fly launch
-
- # 4. Deploy
- fly deploy
-
- # 5. Set secrets
- fly secrets set OPENAI_API_KEY=your_key_here
- ```
-
- ### 4. **Vercel + Docker** (Alternative)
- **✅ Advantages**: Excellent frontend, easy to use
- **📦 Plan**: Free Hobby plan
-
- #### Steps:
- ```bash
- # 1. Create vercel.json
- cat > vercel.json << 'EOF'
- {
-   "version": 2,
-   "builds": [
-     {
-       "src": "Dockerfile",
-       "use": "@vercel/docker"
-     }
-   ],
-   "routes": [
-     {
-       "src": "/(.*)",
-       "dest": "/"
-     }
-   ]
- }
- EOF
-
- # 2. Deploy
- vercel --prod
- ```
-
- ## 🔧 Required Configuration
-
- ### Essential Environment Variables:
- ```bash
- OPENAI_API_KEY=sk-proj-your_full_key
- LIGHTRAG_HOST=0.0.0.0
- LIGHTRAG_PORT=9621
- ```
-
- ### Changes for the Cloud:
-
- #### 1. **Adapt the Dockerfile for the cloud**
- ```dockerfile
- # Add at the top of the Dockerfile
- FROM python:3.11-slim
-
- # Health check for cloud platforms
- HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
-   CMD curl -f http://localhost:8000/health || exit 1
-
- # Expose the right port
- EXPOSE 8000
- ```
-
- #### 2. **Adapt docker-compose.yml**
- ```yaml
- version: '3.8'
- services:
-   app:
-     build: .
-     ports:
-       - "8000:8000"
-     environment:
-       - PORT=8000
-       - OPENAI_API_KEY=${OPENAI_API_KEY}
- ```
-
- ## 📊 Optimizations for Free Tiers
-
- ### 1. **Reduce resource consumption**
- ```python
- # In langraph_agent.py
- # Limit concurrency
- MAX_ASYNC = 1
- MAX_PARALLEL_INSERT = 1
-
- # Shorter timeout
- LLM_TIMEOUT = 60
- ```
-
- ### 2. **Smart caching**
- ```python
- # Enable the LLM cache
- ENABLE_LLM_CACHE = true
- ```
-
- ### 3. **Docker optimizations**
- ```dockerfile
- # Lighter image
- FROM python:3.11-slim
-
- # Multi-stage build
- COPY requirements.txt .
- RUN pip install --no-cache-dir -r requirements.txt
-
- # Clean up after installation
- RUN apt-get clean && rm -rf /var/lib/apt/lists/*
- ```
-
- ## 🌐 Deploying on Render (Full Tutorial)
-
- ### Step 1: Prepare the project
- ```bash
- # 1. Create render.yaml
- cat > render.yaml << 'EOF'
- services:
-   - type: web
-     name: cyberlegal-ai
-     runtime: docker
-     plan: free
-     dockerfilePath: ./Dockerfile
-     dockerContext: .
-     healthCheckPath: /health
-     envVars:
-       - key: OPENAI_API_KEY
-         sync: false
-       - key: PORT
-         value: 8000
-     autoDeploy: true
- EOF
-
- # 2. Add to git
- git add render.yaml
- git commit -m "Add Render deployment config"
- ```
-
- ### Step 2: Configuration on Render
- 1. **Create an account**: https://render.com
- 2. **Connect GitHub**: Dashboard → New Web Service
- 3. **Select repository**: Your GitHub repo
- 4. **Configure**:
-    - Name: `cyberlegal-ai`
-    - Environment: `Docker`
-    - Plan: `Free`
-    - Health Check: `/health`
-
- ### Step 3: Environment Variables
- In Dashboard → Service → Environment:
- ```
- OPENAI_API_KEY = sk-proj-your_full_key_here
- PORT = 8000
- ```
-
- ### Step 4: Automatic deployment
- - Render detects GitHub pushes
- - Rebuilds automatically
- - Deploys to: `https://cyberlegal-ai.onrender.com`
-
- ## 🔍 Deployment Tests
-
- ### Verify the deployment:
- ```bash
- # Health check
- curl https://your-app.onrender.com/health
-
- # Test chat
- curl -X POST "https://your-app.onrender.com/chat" \
-   -H "Content-Type: application/json" \
-   -d '{"message": "Test question", "role": "user", "jurisdiction": "EU"}'
- ```
-
- ## ⚠️ Free-Tier Limitations
-
- ### Render.com:
- - ✅ 750h/month (enough for 24/7)
- - ✅ 512MB RAM
- - ✅ Docker support
- - ❌ Times out after 15 min of inactivity
-
- ### Railway.app:
- - ✅ $5 credit/month
- - ✅ 512MB RAM
- - ✅ Free database
- - ❌ Sleeps after inactivity
-
- ### Fly.io:
- - ✅ 160h/month
- - ✅ Global deployment
- - ✅ Native Docker
- - ❌ Credit card required (not charged)
-
- ## 🚀 Final Recommendation
-
- **To get started**: Render.com - the simplest and most reliable
-
- **For production**: Railway.app - more features
-
- **For advanced users**: Fly.io - more control
-
- ---
-
- ## 📞 Support and Monitoring
-
- ### Add basic monitoring:
- ```python
- # In agent_api.py
- @app.get("/metrics")
- async def metrics():
-     return {
-         "status": "healthy",
-         "timestamp": datetime.now().isoformat(),
-         "version": "1.0.0"
-     }
- ```
-
- ### Logs with Render:
- - Dashboard → Logs → Your service
- - Real-time debugging
-
- ---
-
- 🎯 **Next steps**:
- 1. Choose your platform (Render recommended)
- 2. Follow the platform-specific tutorial
- 3. Configure the environment variables
- 4. Test the deployment
- 5. Monitor performance
-
- Good luck with your free deployment! 🚀
agent_api.py CHANGED
@@ -17,13 +17,18 @@ from fastapi import Depends
 from fastapi.security import APIKeyHeader
 import secrets
 
- from langraph_agent import CyberLegalAgent
- from agent_state import ConversationManager
- from utils import validate_query, LightRAGClient
- import tools
- from lawyer_selector import LawyerSelectorAgent
- from prompts import SYSTEM_PROMPT_CLIENT, SYSTEM_PROMPT_LAWYER
- from pdf_analyzer import PDFAnalyzerAgent
+ from structured_outputs.api_models import (
+     Message, DocumentAnalysis, ChatRequest, ChatResponse,
+     HealthResponse, AnalyzePDFRequest, AnalyzePDFResponse
+ )
+ from langgraph_agent import CyberLegalAgent
+ from utils.conversation_manager import ConversationManager
+ from utils.utils import validate_query
+ from utils.lightrag_client import LightRAGClient
+ from utils import tools
+ from subagents.lawyer_selector import LawyerSelectorAgent
+ from prompts.main import SYSTEM_PROMPT_CLIENT, SYSTEM_PROMPT_LAWYER
+ from subagents.pdf_analyzer import PDFAnalyzerAgent
 from langchain_openai import ChatOpenAI
 from mistralai import Mistral
 import logging
@@ -32,6 +37,7 @@ import base64
 import tempfile
 import os as pathlib
 from langchain_tavily import TavilySearch
+ import resend
 
 # Load environment variables
 load_dotenv(dotenv_path=".env", override=False)
@@ -62,47 +68,6 @@ def require_password(x_api_key: str = Depends(api_key_header)):
     if x_api_key and secrets.compare_digest(x_api_key, API_PASSWORD):
         return
     raise HTTPException(status_code=401, detail="Unauthorized")
- # Pydantic models for request/response
- class Message(BaseModel):
-     role: str = Field(..., description="Role: 'user' or 'assistant'")
-     content: str = Field(..., description="Message content")
- class DocumentAnalysis(BaseModel):
-     file_name: str
-     summary: Optional[str]
-     actors: Optional[str]
-     key_details: Optional[str]
- class ChatRequest(BaseModel):
-     message: str = Field(..., description="User's question")
-     conversationHistory: Optional[List[Message]] = Field(default=[], description="Previous conversation messages")
-     userType: Optional[str] = Field(default="client", description="User type: 'client' for general users or 'lawyer' for legal professionals")
-     jurisdiction: Optional[str] = Field(default="Romania", description="Jurisdiction of the user")
-     documentAnalyses: Optional[List[DocumentAnalysis]] = Field(default=None, description="Lawyer's document analyses")
-
- class ChatResponse(BaseModel):
-     response: str = Field(..., description="Assistant's response")
-     processing_time: float = Field(..., description="Processing time in seconds")
-     references: List[str] = Field(default=[], description="Referenced documents")
-     timestamp: str = Field(..., description="Response timestamp")
-     error: Optional[str] = Field(None, description="Error message if any")
-
- class HealthResponse(BaseModel):
-     status: str = Field(..., description="Health status")
-     agent_ready: bool = Field(..., description="Whether agent is ready")
-     lightrag_healthy: bool = Field(..., description="Whether LightRAG is healthy")
-     timestamp: str = Field(..., description="Health check timestamp")
-
- class AnalyzePDFRequest(BaseModel):
-     pdf_content: str = Field(..., description="Base64 encoded document content (PDF or image)")
-     filename: Optional[str] = Field(default="document.pdf", description="Original filename")
-
- class AnalyzePDFResponse(BaseModel):
-     actors: str = Field(..., description="Extracted actors")
-     key_details: str = Field(..., description="Key details extracted")
-     summary: str = Field(..., description="High-level summary")
-     processing_status: str = Field(..., description="Processing status")
-     processing_time: float = Field(..., description="Processing time in seconds")
-     timestamp: str = Field(..., description="Analysis timestamp")
-     error: Optional[str] = Field(None, description="Error message if any")
 
 # Global agent instance
 agent_instance = None
@@ -149,6 +114,10 @@ class CyberLegalAPI:
         )
         tools.tavily_search = tavily_search
         logger.info("✅ Tavily search client initialized")
+
+         # Initialize Resend
+         resend.api_key = os.getenv("RESEND_API_KEY")
+         logger.info("✅ Resend client initialized")
 
         self.agent_client = CyberLegalAgent(llm=llm, system_prompt=SYSTEM_PROMPT_CLIENT, tools=tools.tools_for_client)
         self.agent_lawyer = CyberLegalAgent(llm=llm, system_prompt=SYSTEM_PROMPT_LAWYER, tools=tools.tools_for_lawyer)
@@ -258,7 +227,7 @@ class CyberLegalAPI:
         Check health status of the API and dependencies
         """
         try:
-             from utils import LightRAGClient
+             from utils.lightrag_client import LightRAGClient
             lightrag_client = LightRAGClient()
             lightrag_healthy = lightrag_client.health_check()
 
@@ -411,8 +380,7 @@ async def root():
     """
     llm_provider = os.getenv("LLM_PROVIDER", "openai").upper()
     technology_map = {
-         "OPENAI": "LangGraph + LightRAG + GPT-5-Nano",
-         "GEMINI": "LangGraph + LightRAG + Gemini 1.5 Flash"
+         "OPENAI": "LangGraph + RAG + Cerebras (GPT-5-Nano)"
     }
 
     return {
@@ -420,7 +388,7 @@ async def root():
         "version": "1.0.0",
         "description": "LangGraph-powered cyber-legal assistant API",
         "llm_provider": llm_provider,
-         "technology": technology_map.get(llm_provider, "LangGraph + LightRAG"),
+         "technology": technology_map.get(llm_provider, "LangGraph + RAG + Cerebras"),
         "endpoints": {
             "chat": "POST /chat - Chat with the assistant",
            "analyze-pdf": "POST /analyze-pdf - Analyze PDF document",
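The `require_password` dependency kept in this diff authenticates with `secrets.compare_digest`, which compares keys in constant time. A minimal stdlib sketch of the same check (the password value here is hypothetical; in the API it comes from the environment):

```python
import secrets

API_PASSWORD = "example-password"  # hypothetical value for illustration only

def is_authorized(x_api_key: str) -> bool:
    # Constant-time comparison avoids leaking information about the
    # expected key through timing differences.
    return bool(x_api_key) and secrets.compare_digest(x_api_key, API_PASSWORD)

print(is_authorized("example-password"))  # True
print(is_authorized("wrong-key"))         # False
```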
agent_states/agent_state.py ADDED
@@ -0,0 +1,26 @@
+ #!/usr/bin/env python3
+ """
+ Agent state management for the LangGraph cyber-legal assistant
+ """
+
+ from typing import TypedDict, List, Dict, Any, Optional
+ from datetime import datetime
+
+
+ class AgentState(TypedDict):
+     """
+     State definition for the LangGraph agent workflow
+     """
+     # User interaction
+     user_query: str
+     conversation_history: List[Dict[str, str]]
+     intermediate_steps: List[Dict[str, Any]]
+     system_prompt: Optional[str]
+
+     # Context processing
+     relevant_documents: List[str]
+
+     # Metadata
+     query_timestamp: str
+     processing_time: Optional[float]
+     jurisdiction: Optional[str]
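Because `AgentState` is a `TypedDict`, instances are plain dicts that type checkers can verify against the declared fields. A quick sketch of constructing one (the field values are illustrative, not taken from the repo):

```python
from datetime import datetime
from typing import TypedDict, List, Dict, Any, Optional

# Same shape as the AgentState added in this commit
class AgentState(TypedDict):
    user_query: str
    conversation_history: List[Dict[str, str]]
    intermediate_steps: List[Dict[str, Any]]
    system_prompt: Optional[str]
    relevant_documents: List[str]
    query_timestamp: str
    processing_time: Optional[float]
    jurisdiction: Optional[str]

# At runtime this is just a dict; the annotation documents the schema
state: AgentState = {
    "user_query": "Does GDPR apply to my startup?",
    "conversation_history": [],
    "intermediate_steps": [],
    "system_prompt": None,
    "relevant_documents": [],
    "query_timestamp": datetime.now().isoformat(),
    "processing_time": None,
    "jurisdiction": "Romania",
}
print(state["jurisdiction"])  # Romania
```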
pdf_analyzer_state.py → agent_states/pdf_analyzer_state.py RENAMED
File without changes
data/rag_storage/graph_chunk_entity_relation.graphml DELETED
The diff for this file is too large to render. See raw diff
 
data/rag_storage/kv_store_doc_status.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:309b48238b8853a5e1380e8202534eb896da93561462b947a1ee648adc7cf73d
- size 35812

data/rag_storage/kv_store_entity_chunks.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:515c0e6f05ae32ab87d73d8c09b30324c7601b6dcf994f78ab6dca24a2c468e6
- size 1289498

data/rag_storage/kv_store_full_docs.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:7a2ce58b26969dae661a9808be1fa20bd972ce18d15bcaabedd1219b878d7ba2
- size 2864044

data/rag_storage/kv_store_full_entities.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:f5c4388472db8b3f3519c4bf78f08dbedf94a44abafed2517fa2cca73b8350b5
- size 175853

data/rag_storage/kv_store_full_relations.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:8ac82d39371f490247f2f9b961624c2d72568b7087a5477f927ca756e83359b4
- size 350564

data/rag_storage/kv_store_llm_response_cache.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:2b530175acd7dd02605748a846c988bdec8651b0bb6090d6af808c15824dacce
- size 31129737

data/rag_storage/kv_store_relation_chunks.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:d9986b500dfb603ad44c42d9d5d88bff416e3d4eca47dea0e9c8748cef5ad1a4
- size 1184276

data/rag_storage/kv_store_text_chunks.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:66a70cbdef477b1ab7b66b29fbf156fce65e8dd50de8c5698d177e01690b1edb
- size 3428391

data/rag_storage/vdb_entities.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:fa886e92b33a43033ce138121f12574e10d35e3cd2acc48b9bcda8bdab412258
- size 124984482

data/rag_storage/vdb_relationships.json DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:dc4dad69ba410410554fcebaa3595ced6b520e26a301029a72d31d432d08bde9
- size 109748156
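The deleted `data/rag_storage` files above are Git LFS pointer stubs: three `key value` lines (`version`, `oid`, `size`) standing in for the real binary content. A small sketch of parsing that format, assuming the file follows the LFS pointer layout shown in the diff:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", split on the first space
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:309b48238b8853a5e1380e8202534eb896da93561462b947a1ee648adc7cf73d
size 35812"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # 35812
```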
docker-compose.yml CHANGED
@@ -9,6 +9,8 @@ services:
     env_file:
       - .env # Load environment variables from .env file
     environment:
+       - LIGHTRAG_GRAPHS=romania:9621,bahrain:9622
+       - LIGHTRAG_STORAGE_ROOT=data/rag_storage
       - LIGHTRAG_HOST=127.0.0.1
       - LIGHTRAG_PORT=9621
       - API_PORT=8000
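The new `LIGHTRAG_GRAPHS=romania:9621,bahrain:9622` variable appears to map jurisdictions to per-graph LightRAG ports. A hypothetical parser for that comma-separated `name:port` format (the variable name and value come from the diff; the parsing logic itself is an assumption about how the app consumes it):

```python
import os

def parse_graphs(value: str) -> dict:
    """Parse 'name:port,name:port' into a jurisdiction -> port mapping."""
    graphs = {}
    for entry in value.split(","):
        name, _, port = entry.strip().partition(":")
        graphs[name] = int(port)
    return graphs

os.environ.setdefault("LIGHTRAG_GRAPHS", "romania:9621,bahrain:9622")
print(parse_graphs(os.environ["LIGHTRAG_GRAPHS"]))
# {'romania': 9621, 'bahrain': 9622}
```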
langraph_agent.py CHANGED
@@ -15,14 +15,14 @@ from langchain_google_genai import ChatGoogleGenerativeAI
 from langchain_core.messages import HumanMessage, SystemMessage, AIMessage, ToolMessage
 
 logger = logging.getLogger(__name__)
- from agent_state import AgentState
- from prompts import SYSTEM_PROMPT, SYSTEM_PROMPT_CLIENT, SYSTEM_PROMPT_LAWYER
- from utils import LightRAGClient, PerformanceMonitor
- from tools import tools, tools_for_client, tools_for_lawyer
+ from agent_states.agent_state import AgentState
+ from utils.utils import PerformanceMonitor
+ from utils.lightrag_client import LightRAGClient
+ from utils.tools import tools, tools_for_client, tools_for_lawyer
 
 
 class CyberLegalAgent:
-     def __init__(self, llm, system_prompt: str = SYSTEM_PROMPT_CLIENT, tools: List[Any] = tools):
+     def __init__(self, llm, tools: List[Any] = tools):
         self.tools = tools
         self.llm = llm
         self.performance_monitor = PerformanceMonitor()
lightrag.log DELETED
@@ -1,51 +0,0 @@
- 2025-12-16 00:49:52,478 - lightrag - INFO - OpenAI LLM Options: {'max_completion_tokens': 9000}
- 2025-12-16 00:49:52,478 - lightrag - INFO - Reranking is disabled
- 2025-12-16 00:49:52,769 - lightrag - INFO - [_] Created new empty graph file: /Users/cgrdj/Documents/Code/Cyberlgl/test_minimal/rag_storage/graph_chunk_entity_relation.graphml
- 2025-12-16 00:49:52,862 - uvicorn.error - INFO - Started server process [43560]
- 2025-12-16 00:49:52,862 - uvicorn.error - INFO - Waiting for application startup.
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load full_docs with 7 records
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load text_chunks with 0 records
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load full_entities with 0 records
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load full_relations with 0 records
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load entity_chunks with 0 records
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load relation_chunks with 0 records
- 2025-12-16 00:49:52,871 - lightrag - INFO - [_] Process 43560 KV load llm_response_cache with 0 records
- 2025-12-16 00:49:52,872 - lightrag - INFO - [_] Process 43560 doc status load doc_status with 7 records
- 2025-12-16 00:49:52,872 - uvicorn.error - INFO - Application startup complete.
- 2025-12-16 00:49:52,872 - uvicorn.error - INFO - Uvicorn running on http://127.0.0.1:9621 (Press CTRL+C to quit)
- 2025-12-16 01:46:32,747 - uvicorn.access - INFO - 127.0.0.1:50864 - "GET / HTTP/1.1" 307
- 2025-12-16 01:46:32,749 - uvicorn.access - INFO - 127.0.0.1:50864 - "GET /webui HTTP/1.1" 307
- 2025-12-16 01:46:32,771 - uvicorn.access - INFO - 127.0.0.1:50864 - "GET /webui/assets/index-CRtuqff2.js HTTP/1.1" 200
- 2025-12-16 01:46:32,777 - uvicorn.access - INFO - 127.0.0.1:50865 - "GET /webui/assets/index-C8dNBpcg.css HTTP/1.1" 200
- 2025-12-16 01:46:33,013 - uvicorn.access - INFO - 127.0.0.1:50865 - "GET /auth-status HTTP/1.1" 200
- 2025-12-16 01:46:33,252 - uvicorn.access - INFO - 127.0.0.1:50865 - "GET /docs HTTP/1.1" 200
- 2025-12-16 01:46:33,268 - lightrag - INFO - [_] Subgraph query successful | Node count: 0 | Edge count: 0
- 2025-12-16 01:46:33,269 - uvicorn.access - INFO - 127.0.0.1:50864 - "GET /graphs?label=*&max_depth=3&max_nodes=1000 HTTP/1.1" 200
- 2025-12-16 01:46:33,298 - uvicorn.access - INFO - 127.0.0.1:50865 - "GET /static/swagger-ui/swagger-ui.css HTTP/1.1" 200
- 2025-12-16 01:46:33,298 - uvicorn.access - INFO - 127.0.0.1:50864 - "GET /static/swagger-ui/swagger-ui-bundle.js HTTP/1.1" 200
- 2025-12-16 01:46:33,462 - uvicorn.access - INFO - 127.0.0.1:50864 - "GET /openapi.json HTTP/1.1" 200
- 2025-12-16 01:46:40,237 - uvicorn.error - INFO - Shutting down
- 2025-12-16 01:46:40,338 - uvicorn.error - INFO - Waiting for application shutdown.
- 2025-12-16 01:46:40,339 - lightrag - INFO - Successfully finalized 12 storages
- 2025-12-16 01:46:40,339 - uvicorn.error - INFO - Application shutdown complete.
- 2025-12-16 01:46:40,339 - uvicorn.error - INFO - Finished server process [43560]
- 2025-12-16 01:46:50,846 - lightrag - INFO - OpenAI LLM Options: {'max_completion_tokens': 9000}
- 2025-12-16 01:46:50,847 - lightrag - INFO - Reranking is disabled
- 2025-12-16 01:46:51,158 - lightrag - INFO - [_] Created new empty graph file: /Users/cgrdj/Documents/Code/Cyberlgl/test_minimal/rag_storage/graph_chunk_entity_relation.graphml
- 2025-12-16 01:46:51,271 - uvicorn.error - INFO - Started server process [12197]
- 2025-12-16 01:46:51,271 - uvicorn.error - INFO - Waiting for application startup.
- 2025-12-16 01:46:51,280 - lightrag - INFO - [_] Process 12197 KV load full_docs with 7 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 KV load text_chunks with 0 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 KV load full_entities with 0 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 KV load full_relations with 0 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 KV load entity_chunks with 0 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 KV load relation_chunks with 0 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 KV load llm_response_cache with 0 records
- 2025-12-16 01:46:51,281 - lightrag - INFO - [_] Process 12197 doc status load doc_status with 7 records
- 2025-12-16 01:46:51,282 - uvicorn.error - INFO - Application startup complete.
- 2025-12-16 01:46:51,282 - uvicorn.error - INFO - Uvicorn running on http://127.0.0.1:9621 (Press CTRL+C to quit)
- 2025-12-16 01:47:02,388 - uvicorn.error - INFO - Shutting down
- 2025-12-16 01:47:02,490 - uvicorn.error - INFO - Waiting for application shutdown.
- 2025-12-16 01:47:02,491 - lightrag - INFO - Successfully finalized 12 storages
- 2025-12-16 01:47:02,491 - uvicorn.error - INFO - Application shutdown complete.
- 2025-12-16 01:47:02,491 - uvicorn.error - INFO - Finished server process [12197]
prompts/__init__.py ADDED
@@ -0,0 +1 @@
+ """Prompt templates for agents"""
prompts_lawyer_selector.py → prompts/lawyer_selector.py RENAMED
@@ -4,7 +4,7 @@ Prompts for lawyer selection agent
 """
 
 LAWYER_SELECTION_PROMPT = """### Task
- Based on the conversation above, select the TOP 3 lawyers who are most suitable to handle this case.
+ Based on the conversation above, select 0 to 3 lawyers who are most suitable to handle this case.
 
 ### Available Lawyers
 {lawyers}
@@ -15,5 +15,36 @@ Consider:
 2. Experience level
 3. Presentation and expertise description
 
+ ### Important Rules
+ - Select 0 lawyers if the legal issue doesn't match any available lawyer's expertise
+ - Select up to 3 lawyers, ranked by relevance (1 = most suitable)
+ - Only select lawyers whose areas of practice clearly align with the case
+ - Do not hallucinate lawyers that don't exist, just output a list of ids of the lawyers you selected
+
+ ### Response Format
+ Return a structured response with:
+ - rankings: list of 0-3 lawyer selections, each containing:
+   - reasoning: client-friendly explanation of why this lawyer is suitable
+   - lawyer_index: the unique index of the lawyer
+   - rank: position in the recommendation list (1, 2, 3)
+
+ Example response format:
+ ```json
+ {{
+   "rankings": [
+     {{
+       "reasoning": "This lawyer specializes in data protection and has experience with GDPR compliance...",
+       "lawyer_index": 3,
+       "rank": 1
+     }},
+     {{
+       "reasoning": "This lawyer has expertise in cyber law and can help with data breach incidents...",
+       "lawyer_index": 1,
+       "rank": 2
+     }}
+   ]
+ }}
+ ```
+
 ### Important Note
 For each lawyer selected, explain in CLIENT-FRIENDLY language how they can help with the specific legal problem. Focus on benefits to the client, not just technical details. Use clear, accessible language that a non-lawyer can understand."""
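The rankings format described in the updated prompt can be checked with the stdlib `json` module alone; a sketch of such a validator (the example payload mirrors the prompt's format, while the specific validation rules are assumptions drawn from the "Important Rules" section):

```python
import json

def validate_rankings(payload: str, num_lawyers: int) -> list:
    """Check that a model response matches the 0-3 ranked-selection format."""
    data = json.loads(payload)
    rankings = data["rankings"]
    assert 0 <= len(rankings) <= 3, "at most 3 selections allowed"
    for i, item in enumerate(rankings, start=1):
        assert item["rank"] == i, "ranks must be 1..n in order"
        assert 0 <= item["lawyer_index"] < num_lawyers, "index must refer to a real lawyer"
        assert item["reasoning"], "each selection needs a client-facing explanation"
    return rankings

response = '{"rankings": [{"reasoning": "GDPR specialist...", "lawyer_index": 3, "rank": 1}]}'
print(len(validate_rankings(response, num_lawyers=5)))  # 1
```

An empty `rankings` list is valid here, matching the prompt's allowance for selecting 0 lawyers.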
prompts.py → prompts/main.py RENAMED
@@ -13,6 +13,7 @@ Client Jurisdiction: {jurisdiction}
13
  1. **query_knowledge_graph**: Search legal documents (GDPR, NIS2, DORA, etc.) to answer questions about EU cyber regulations and directives.
14
  2. **find_lawyers**: Recommend suitable lawyers based on the user's legal issue and conversation context.
15
  3. **search_web**: Search the web for current information, recent legal updates, court decisions, or news that may not be in the knowledge graph.
 
16
 
17
  ### Tool-Calling Process
18
  You operate in an iterative loop:
@@ -27,12 +28,31 @@ You operate in an iterative loop:
27
  1. Your responses should be clear, friendly, and provide practical, actionable answers to the user's question.
28
  2. Use simple language and avoid excessive legal jargon. When legal terms are necessary, explain them in plain terms.
29
  3. When answering legal questions, use the query_knowledge_graph tool to provide accurate, up-to-date information from official EU legal sources.
30
- 4. If you use specific knowledge from a regulation or directive, reference it in your response and explain what it means in practical terms. Create a section at the end of your response called "References" that lists the source documents used to answer the user's question.
31
  5. When users ask for lawyer recommendations or legal representation, use the find_lawyers tool to provide suitable lawyer suggestions.
32
  6. Before calling find_lawyers, ask enough details about the case to understand the problem and provide context for the lawyer selection process if needed.
33
  7. If the user's question can be answered with your general knowledge, respond directly without calling tools.
34
  8. Remember: Your final response is sent to the user when you stop calling tools.
35
36
  ### Tone
37
  - Approachable and supportive
38
  - Focus on practical implications for the user
 
13
  1. **query_knowledge_graph**: Search legal documents (GDPR, NIS2, DORA, etc.) to answer questions about EU cyber regulations and directives.
14
  2. **find_lawyers**: Recommend suitable lawyers based on the user's legal issue and conversation context.
15
  3. **search_web**: Search the web for current information, recent legal updates, court decisions, or news that may not be in the knowledge graph.
16
+ 4. **send_email**: Send an email to a recipient. This tool has STRICT usage requirements (see below).
17
 
18
  ### Tool-Calling Process
19
  You operate in an iterative loop:
 
28
  1. Your responses should be clear, friendly, and provide practical, actionable answers to the user's question.
29
  2. Use simple language and avoid excessive legal jargon. When legal terms are necessary, explain them in plain terms.
30
  3. When answering legal questions, use the query_knowledge_graph tool to provide accurate, up-to-date information from official EU legal sources.
31
+ 4. If you use specific knowledge from a regulation or directive, reference it in your response and explain what it means in practical terms. Create a section at the end of your response called "References" that lists the source documents used to answer the user's question.
32
  5. When users ask for lawyer recommendations or legal representation, use the find_lawyers tool to provide suitable lawyer suggestions.
33
  6. Before calling find_lawyers, ask enough details about the case to understand the problem and provide context for the lawyer selection process if needed.
34
  7. If the user's question can be answered with your general knowledge, respond directly without calling tools.
35
  8. Remember: Your final response is sent to the user when you stop calling tools.
36
 
37
+ ### Email Tool Usage Requirements
38
+ **CRITICAL**: The send_email tool MUST ONLY be used in the following specific workflow:
39
+
40
+ 1. User asks for a lawyer's help
41
+ 2. Agent uses the find_lawyers tool to recommend lawyers, and asks the user if they want to contact one of the recommended lawyers
42
+ 3. User confirms they want to contact a specific lawyer
43
+ 4. Agent proposes an email draft to contact that lawyer
44
+ 5. User agrees to send the email
45
+ 6. Agent can call the send_email tool
46
+
47
+ **Prohibited usage:**
48
+ - DO NOT use send_email in any other context
49
+ - DO NOT use send_email before having a list of lawyer recommendations
50
+ - DO NOT use send_email without user agreement
51
+ - DO NOT use send_email to send emails to anyone other than lawyers from the recommendations
52
+ - DO NOT use send_email proactively or spontaneously
53
+
54
+ The send_email tool is exclusively for contacting lawyers after the user explicitly requests and approves the action.
55
+
56
  ### Tone
57
  - Approachable and supportive
58
  - Focus on practical implications for the user
prompts_pdf_analyzer.py → prompts/pdf_analyzer.py RENAMED
File without changes
requirements.txt CHANGED
@@ -22,3 +22,4 @@ uvicorn[standard]>=0.24.0
22
  pydantic>=2.0.0
23
  typing-extensions>=4.0.0
24
  langchain-tavily>=0.2.16
 
 
22
  pydantic>=2.0.0
23
  typing-extensions>=4.0.0
24
  langchain-tavily>=0.2.16
25
+ resend>=0.8.0
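The new `resend` dependency backs the send_email tool declared in prompts/main.py. As a hedged sketch (the helper name and sender address are hypothetical, not from this commit), the tool could assemble a payload in the shape `resend.Emails.send()` accepts:

```python
# Hypothetical helper for the send_email tool. The sender address is a
# placeholder; the dict keys mirror the payload shape documented for
# resend's Python SDK (resend.Emails.send()).
def build_email_params(to: str, subject: str, body: str) -> dict:
    return {
        "from": "noreply@example.com",  # placeholder sender domain
        "to": [to],
        "subject": subject,
        "html": f"<p>{body}</p>",
    }
```

The actual send would set `resend.api_key` from the environment and pass this dict to `resend.Emails.send()`, gated behind the confirmation workflow described in the prompt.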
startup.sh CHANGED
@@ -1,31 +1,41 @@
1
  #!/usr/bin/env bash
2
  set -euo pipefail
3
 
4
- LIGHTRAG_HOST="${LIGHTRAG_HOST:-127.0.0.1}"
5
- LIGHTRAG_PORT="${LIGHTRAG_PORT:-9621}"
6
- PUBLIC_PORT="${PORT:-${API_PORT:-8000}}"
7
-
8
- echo "🚀 Starting CyberLegal AI Stack..."
9
- echo "Step 1: Starting LightRAG server on ${LIGHTRAG_HOST}:${LIGHTRAG_PORT} ..."
10
-
11
- lightrag-server --host "${LIGHTRAG_HOST}" --port "${LIGHTRAG_PORT}" &
12
- LIGHTRAG_PID=$!
13
-
14
- cleanup() {
15
- kill -TERM "${LIGHTRAG_PID}" 2>/dev/null || true
16
- wait "${LIGHTRAG_PID}" 2>/dev/null || true
17
- }
18
- trap cleanup EXIT INT TERM
19
-
20
- echo "Waiting for LightRAG server to be ready..."
21
- for i in {1..30}; do
22
- if curl -fsS "http://${LIGHTRAG_HOST}:${LIGHTRAG_PORT}/health" >/dev/null 2>&1; then
23
- echo " LightRAG is ready!"
24
- break
25
- fi
26
- sleep 2
 
27
  done
28
 
29
- export PORT="${PUBLIC_PORT}"
30
- echo "Step 2: Starting API on 0.0.0.0:${PORT} ..."
 
 
 
31
  python agent_api.py
 
1
  #!/usr/bin/env bash
2
  set -euo pipefail
3
 
4
+ HOST="${LIGHTRAG_HOST:-127.0.0.1}"
5
+ ROOT="${LIGHTRAG_STORAGE_ROOT:-data/rag_storage}"
6
+ GRAPHS="${LIGHTRAG_GRAPHS:-romania:9621}"
7
+ API_PORT="${PORT:-${API_PORT:-8000}}"
8
+
9
+ echo "🚀 LIGHTRAG_GRAPHS=${GRAPHS}"
10
+ echo "📁 Storage root: ${ROOT}"
11
+
12
+ PIDS=()
13
+ trap 'kill -TERM ${PIDS[@]:-} 2>/dev/null || true; wait ${PIDS[@]:-} 2>/dev/null || true' EXIT INT TERM
14
+
15
+ ENDPOINTS=()
16
+
17
+ IFS=',' read -r -a items <<<"${GRAPHS}"
18
+ for item in "${items[@]}"; do
19
+ IFS=':' read -r id port <<<"${item}"
20
+ dir="${ROOT}/${id}"
21
+ mkdir -p "${dir}"
22
+
23
+ echo "➡️ Start LightRAG '${id}' on ${HOST}:${port} (dir=${dir})"
24
+ lightrag-server --host "${HOST}" --port "${port}" --working-dir "${dir}" &
25
+ PIDS+=("$!")
26
+
27
+ echo "   ⏳ Waiting for health check..."
28
+ for _ in {1..30}; do
29
+ curl -fsS "http://${HOST}:${port}/health" >/dev/null 2>&1 && { echo " ✅ ${id} ready"; break; }
30
+ sleep 2
31
+ done
32
+
33
+ ENDPOINTS+=("${id}=http://${HOST}:${port}")
34
  done
35
 
36
+ export LIGHTRAG_ENDPOINTS="$(IFS=,; echo "${ENDPOINTS[*]}")"
37
+ export PORT="${API_PORT}"
38
+
39
+ echo "✅ LIGHTRAG_ENDPOINTS=${LIGHTRAG_ENDPOINTS}"
40
+ echo "🚀 Starting API on 0.0.0.0:${PORT} ..."
41
  python agent_api.py
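On the Python side, agent_api.py can recover the per-graph servers from the `LIGHTRAG_ENDPOINTS` variable exported above (`id=url,id=url`). A minimal sketch, assuming a hypothetical parser function (not part of this commit):

```python
import os
from typing import Dict, Optional

def parse_lightrag_endpoints(raw: Optional[str] = None) -> Dict[str, str]:
    """Parse 'romania=http://127.0.0.1:9621,...' into {graph_id: base_url}."""
    if raw is None:
        raw = os.environ.get("LIGHTRAG_ENDPOINTS", "")
    endpoints: Dict[str, str] = {}
    for item in filter(None, raw.split(",")):
        # partition splits on the first '=' only, so URLs stay intact
        graph_id, _, url = item.partition("=")
        endpoints[graph_id] = url
    return endpoints
```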
structured_outputs/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Structured output definitions for agents"""
structured_outputs/api_models.py ADDED
@@ -0,0 +1,64 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Structured outputs for Agent API
4
+ """
5
+
6
+ from typing import List, Optional
7
+ from pydantic import BaseModel, Field
8
+
9
+
10
+ class Message(BaseModel):
11
+ """Chat message"""
12
+ role: str = Field(..., description="Role: 'user' or 'assistant'")
13
+ content: str = Field(..., description="Message content")
14
+
15
+
16
+ class DocumentAnalysis(BaseModel):
17
+ """Document analysis result"""
18
+ file_name: str
19
+ summary: Optional[str] = None
20
+ actors: Optional[str] = None
21
+ key_details: Optional[str] = None
22
+
23
+
24
+ class ChatRequest(BaseModel):
25
+ """Chat request model"""
26
+ message: str = Field(..., description="User's question")
27
+ conversationHistory: Optional[List[Message]] = Field(default=[], description="Previous conversation messages")
28
+ userType: Optional[str] = Field(default="client", description="User type: 'client' for general users or 'lawyer' for legal professionals")
29
+ jurisdiction: Optional[str] = Field(default="Romania", description="Jurisdiction of the user")
30
+ documentAnalyses: Optional[List[DocumentAnalysis]] = Field(default=None, description="Lawyer's document analyses")
31
+
32
+
33
+ class ChatResponse(BaseModel):
34
+ """Chat response model"""
35
+ response: str = Field(..., description="Assistant's response")
36
+ processing_time: float = Field(..., description="Processing time in seconds")
37
+ references: List[str] = Field(default=[], description="Referenced documents")
38
+ timestamp: str = Field(..., description="Response timestamp")
39
+ error: Optional[str] = Field(None, description="Error message if any")
40
+
41
+
42
+ class HealthResponse(BaseModel):
43
+ """Health check response model"""
44
+ status: str = Field(..., description="Health status")
45
+ agent_ready: bool = Field(..., description="Whether agent is ready")
46
+ lightrag_healthy: bool = Field(..., description="Whether LightRAG is healthy")
47
+ timestamp: str = Field(..., description="Health check timestamp")
48
+
49
+
50
+ class AnalyzePDFRequest(BaseModel):
51
+ """PDF analysis request model"""
52
+ pdf_content: str = Field(..., description="Base64 encoded document content (PDF or image)")
53
+ filename: Optional[str] = Field(default="document.pdf", description="Original filename")
54
+
55
+
56
+ class AnalyzePDFResponse(BaseModel):
57
+ """PDF analysis response model"""
58
+ actors: str = Field(..., description="Extracted actors")
59
+ key_details: str = Field(..., description="Key details extracted")
60
+ summary: str = Field(..., description="High-level summary")
61
+ processing_status: str = Field(..., description="Processing status")
62
+ processing_time: float = Field(..., description="Processing time in seconds")
63
+ timestamp: str = Field(..., description="Analysis timestamp")
64
+ error: Optional[str] = Field(None, description="Error message if any")
structured_outputs/lawyer_selector.py ADDED
@@ -0,0 +1,19 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Structured outputs for Lawyer Selector Agent
4
+ """
5
+
6
+ from typing import List
7
+ from pydantic import BaseModel, Field
8
+
9
+
10
+ class LawyerRanking(BaseModel):
11
+ """Individual lawyer ranking"""
12
+ reasoning: str = Field(description="Client-friendly explanation of how this lawyer can help with their specific legal problem")
13
+ rank: int = Field(description="1, 2, or 3")
14
+ lawyer_index: int = Field(description="Lawyer number from 1 to N")
15
+
16
+
17
+ class LawyerRankings(BaseModel):
18
+ """Collection of lawyer rankings"""
19
+ rankings: List[LawyerRanking] = Field(description="List of 0 to 3 lawyer rankings", min_length=0, max_length=3)
subagents/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ """Subagent implementations for the main agent"""
lawyer_selector.py → subagents/lawyer_selector.py RENAMED
@@ -14,30 +14,17 @@ from langchain_core.messages import HumanMessage, SystemMessage, AIMessage
14
  from langchain_core.output_parsers import PydanticOutputParser
15
  from pydantic import BaseModel, Field
16
 
17
- from prompts_lawyer_selector import LAWYER_SELECTION_PROMPT
 
18
 
19
  load_dotenv()
20
 
21
 
22
- class LawyerRanking(BaseModel):
23
- reasoning: str = Field(description="Client-friendly explanation of how this lawyer can help with their specific legal problem")
24
- rank: int = Field(description="1, 2, or 3")
25
- lawyer_index: int = Field(description="Lawyer number from 1 to N")
26
-
27
-
28
- class LawyerRankings(BaseModel):
29
- rankings: List[LawyerRanking] = Field(description="List of top 3 lawyer rankings")
30
-
31
-
32
  class LawyerSelectorAgent:
33
  """Simple agent that analyzes conversations and selects top 3 lawyers"""
34
 
35
  def __init__(self, llm):
36
  self.llm = llm
37
-
38
- with open("data/lawyers.json", "r", encoding="utf-8") as f:
39
- self.lawyers = json.load(f)
40
-
41
  self.parser = PydanticOutputParser(pydantic_object=LawyerRankings)
42
  self.workflow = self._build_workflow()
43
 
@@ -48,14 +35,34 @@ class LawyerSelectorAgent:
48
  workflow.add_edge("select_lawyers", END)
49
  return workflow.compile()
50
 
51
- def _format_lawyers(self) -> str:
52
  return "\n\n".join([
53
- f"Lawyer {i}:\n- Name: {l['name']}\n- Specialty: {l['specialty']}\n- Experience: {l['experience_years']} years\n- Areas: {', '.join(l['areas_of_practice'])}"
54
- for i, l in enumerate(self.lawyers, 1)
55
  ])
56
 
57
  async def _select_lawyers(self, state: dict) -> dict:
58
- lawyers_text = self._format_lawyers()
 
59
  prompt = LAWYER_SELECTION_PROMPT.format(lawyers=lawyers_text)
60
 
61
  # Convert message dicts to Message objects
@@ -78,15 +85,30 @@ class LawyerSelectorAgent:
78
  result = self.parser.parse(response.content)
79
  rankings = result.rankings
80
 
81
- state["top_lawyers"] = [
82
- {**self.lawyers[r.lawyer_index - 1], **r.model_dump()}
83
- for r in rankings
84
- ]
85
  return state
86
 
87
- async def select_lawyers(self, conversation_history: List[dict]) -> dict:
88
- result = await self.workflow.ainvoke({"messages": conversation_history})
89
- return {"top_lawyers": result["top_lawyers"]}
90
 
91
 
92
  async def main():
 
14
  from langchain_core.output_parsers import PydanticOutputParser
15
  from pydantic import BaseModel, Field
16
 
17
+ from prompts.lawyer_selector import LAWYER_SELECTION_PROMPT
18
+ from structured_outputs.lawyer_selector import LawyerRanking, LawyerRankings
19
 
20
  load_dotenv()
21
 
22
 
23
  class LawyerSelectorAgent:
24
  """Simple agent that analyzes conversations and selects top 3 lawyers"""
25
 
26
  def __init__(self, llm):
27
  self.llm = llm
28
  self.parser = PydanticOutputParser(pydantic_object=LawyerRankings)
29
  self.workflow = self._build_workflow()
30
 
 
35
  workflow.add_edge("select_lawyers", END)
36
  return workflow.compile()
37
 
38
+ def _format_lawyers(self, lawyers: List[dict]) -> str:
39
  return "\n\n".join([
40
+ f"Lawyer index:{i}\n\n- Name: {l['name']}\n- Specialty: {l['specialty']}\n- Experience: {l['experience_years']} years\n- Areas: {', '.join(l['areas_of_practice'])}"
41
+ for i, l in enumerate(lawyers, 1)
42
  ])
43
 
44
+ def _format_lawyer_profile(self, lawyer: dict, rank: int, reasoning: str) -> str:
45
+ """Format a single lawyer profile for the result output"""
46
+ lines = [
47
+ "\n" + "─" * 80,
48
+ f"RECOMMENDATION #{rank}",
49
+ "─" * 80,
50
+ f"\n👤 {lawyer['name']}",
51
+ f" {lawyer['presentation']}",
52
+ f"\n📊 Experience: {lawyer['experience_years']} years",
53
+ f"🎯 Specialty: {lawyer['specialty']}",
54
+ f"\n✅ Why this lawyer matches your case:",
55
+ f" {reasoning}",
56
+ f"\n📚 Areas of Practice:"
57
+ ]
58
+ for area in lawyer['areas_of_practice']:
59
+ lines.append(f" • {area}")
60
+ lines.append("")
61
+ return "\n".join(lines)
62
+
63
  async def _select_lawyers(self, state: dict) -> dict:
64
+ lawyers = state["lawyers"]
65
+ lawyers_text = self._format_lawyers(lawyers)
66
  prompt = LAWYER_SELECTION_PROMPT.format(lawyers=lawyers_text)
67
 
68
  # Convert message dicts to Message objects
 
85
  result = self.parser.parse(response.content)
86
  rankings = result.rankings
87
 
88
+ # Retrieve and concatenate lawyer profiles
89
+ if not rankings:
90
+ output = ["=" * 80, "LAWYER RECOMMENDATIONS", "=" * 80]
91
+ output.append("\n❌ No lawyers available for this particular case.")
92
+ output.append("Your legal issue may fall outside our current areas of expertise.")
93
+ output.append("Please consider refining your request or contacting a general legal service.")
94
+ else:
95
+ output = ["=" * 80, f"{len(rankings)} RECOMMENDED LAWYERS FOR YOUR CASE", "=" * 80]
96
+ for r in rankings:
97
+ lawyer = lawyers[r.lawyer_index - 1]
98
+ output.append(self._format_lawyer_profile(lawyer, r.rank, r.reasoning))
99
+
100
+ state["result"] = "\n".join(output)
101
  return state
102
 
103
+ async def select_lawyers(self, conversation_history: List[dict]) -> str:
104
+ with open("data/lawyers.json", "r", encoding="utf-8") as f:
105
+ lawyers = json.load(f)
106
+
107
+ result = await self.workflow.ainvoke({
108
+ "messages": conversation_history,
109
+ "lawyers": lawyers
110
+ })
111
+ return result["result"]
112
 
113
 
114
  async def main():
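One detail worth flagging in the rewritten selector: the prompt renders lawyers with a 1-based `Lawyer index:i`, and the lookup `lawyers[r.lawyer_index - 1]` converts that back to the 0-based Python list. A stdlib-only illustration with made-up data:

```python
# Made-up lawyer data; illustrates the 1-based prompt index -> 0-based
# list mapping used in _select_lawyers.
lawyers = [
    {"name": "Ana Pop"},      # rendered in the prompt as "Lawyer index:1"
    {"name": "Ion Dascalu"},  # rendered as "Lawyer index:2"
]

def lookup(lawyer_index: int) -> dict:
    """Map an LLM-returned 1-based index back onto the list."""
    return lawyers[lawyer_index - 1]
```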
pdf_analyzer.py → subagents/pdf_analyzer.py RENAMED
@@ -13,8 +13,8 @@ from langchain_google_genai import ChatGoogleGenerativeAI
13
  from langchain_core.messages import HumanMessage, SystemMessage
14
  from mistralai import Mistral
15
 
16
- from pdf_analyzer_state import PDFAnalyzerState
17
- from prompts_pdf_analyzer import SYSTEM_PROMPT, EXTRACT_ACTORS_PROMPT, EXTRACT_KEY_DETAILS_PROMPT, GENERATE_SUMMARY_PROMPT
18
 
19
  logger = logging.getLogger(__name__)
20
 
 
13
  from langchain_core.messages import HumanMessage, SystemMessage
14
  from mistralai import Mistral
15
 
16
+ from agent_states.pdf_analyzer_state import PDFAnalyzerState
17
+ from prompts.pdf_analyzer import SYSTEM_PROMPT, EXTRACT_ACTORS_PROMPT, EXTRACT_KEY_DETAILS_PROMPT, GENERATE_SUMMARY_PROMPT
18
 
19
  logger = logging.getLogger(__name__)
20
 
test_agent.ipynb DELETED
@@ -1,152 +0,0 @@
1
- {
2
- "cells": [
3
- {
4
- "cell_type": "code",
5
- "execution_count": 1,
6
- "id": "9fc74685",
7
- "metadata": {},
8
- "outputs": [
9
- {
10
- "data": {
11
- "text/plain": [
12
- "True"
13
- ]
14
- },
15
- "execution_count": 1,
16
- "metadata": {},
17
- "output_type": "execute_result"
18
- }
19
- ],
20
- "source": [
21
- "import json\n",
22
- "from langraph_agent import CyberLegalAgent\n",
23
- "from dotenv import load_dotenv\n",
24
- "load_dotenv()"
25
- ]
26
- },
27
- {
28
- "cell_type": "code",
29
- "execution_count": 7,
30
- "id": "d3ad35b4",
31
- "metadata": {},
32
- "outputs": [],
33
- "source": [
34
- "history=[{'role': 'user', 'content': 'I need help with a data breach issue'}, {'role': 'assistant', 'content': 'I can help with that. Can you tell me more about the breach?'}, {'role': 'user', 'content': \"My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven't notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.\"}, {'role': 'assistant', 'content': \"I understand. To find the best lawyer for you, could you tell me: Are you looking for a Romanian-based firm or international? What language do you prefer? What's your budget range?\"}]\n",
35
- "user_query='I prefer a Romanian-based firm, English language is not fine, and my budget is around 2000-5000 EUR for this incident. I need help immediately with breach notification.'\n",
36
- "\n"
37
- ]
38
- },
39
- {
40
- "cell_type": "code",
41
- "execution_count": 8,
42
- "id": "a45517e7",
43
- "metadata": {},
44
- "outputs": [
45
- {
46
- "name": "stderr",
47
- "output_type": "stream",
48
- "text": [
49
- "INFO:langraph_agent:🤖 Initialized with OpenAI (gpt-5-nano)\n"
50
- ]
51
- }
52
- ],
53
- "source": [
54
- "agent=CyberLegalAgent()"
55
- ]
56
- },
57
- {
58
- "cell_type": "code",
59
- "execution_count": 11,
60
- "id": "589dc2c3",
61
- "metadata": {},
62
- "outputs": [
63
- {
64
- "name": "stderr",
65
- "output_type": "stream",
66
- "text": [
67
- "INFO:langraph_agent:� Starting query processing: I prefer a Romanian-based firm, English language is not fine, and my budget is around 2000-5000 EUR ...\n",
68
- "INFO:langraph_agent:� Conversation history length: 4 messages\n",
69
- "WARNING:utils:Query timeout, attempt 1\n",
70
- "INFO:utils:Query successful;{'response': 'Thanks for the details. Given the breach happened in Romania and involves sensitive customer data (names, addresses, SSNs) affecting about 500 people, you should act quickly to align with GDPR obligations and the EU cyber-notification framework. Here’s a practical, lawyer-focused plan you can use right away.\\n\\nImmediate next steps (urgent)\\n- Engage a Romanian-based GDPR/legal firm now (you’ve indicated a preference for a Romanian-language, Romania-based firm within a 2,000–5,000 EUR range). A local lawyer can lead the breach-notification process, coordinate with authorities, and help prepare communications to customers.\\n- Prepare a rapid internal factsheet for the lawyer and the incident response team (what happened, when it started, what data were affected, which systems were involved, what containment steps have been taken, and who is affected). This will support the legal review and notifications.\\n- Initiate coordination with the relevant authorities and incident-response bodies. In the EU framework you’re operating under, important entities handle incident notifications and information sharing (Single Points of Contact and CSIRTs). Your Romanian-based counsel can establish the right communication channel immediately.\\n- Do not delay notifications if advised by your lawyer. The directive framework requires timely reporting to the appropriate authorities and, where required, to the data subjects.\\n\\nKey regulatory obligations to address with your lawyer (based on the context you provided)\\n- Early warning and incident notification timing: The Directive (EU) 2022/2555 envisions an early warning within 24 hours and an incident notification within 72 hours of becoming aware of a significant incident, with a final report not later than one month after notification. 
Your lawyer can help determine if the breach qualifies as a significant incident and ensure the required timelines are met through the proper channels. [2]\\n- Personal data breach notification to supervisory authorities: If the breach involves personal data, authorities emphasize informing the supervisory authority without undue delay and within the established timeframes when applicable. Your lawyer will assess the specifics of your breach under GDPR and initiate the correct notification process. [2]\\n- Cooperation channels (SPOCs and CSIRTs): In the EU context, Single Points of Contact (SPOCs) and Computer Security Incident Response Teams (CSIRTs) play a central role in coordinating notifications and information sharing across competent authorities. Your Romanian counsel can engage these channels to ensure fast, compliant communication and support. [2]\\n- Civil-law liability context in Romania: Under Romanian Civil Code provisions, contractual and other damages liabilities arise if obligations are not fulfilled, which underscores why timely breach notification and remediation are important from a legal-arbitration risk perspective. This helps explain why a formal notification and remediation plan are essential. 
[1]\\n\\nWhat to expect from a Romanian-based GDPR lawyer (typical scope, aligned with your budget)\\n- Immediate breach-notification package: assessment of whether the breach triggers GDPR notification to the supervisory authority and/or to data subjects; drafting and coordinating the actual notices; ensuring timelines are met under the directive and Romanian law.\\n- Coordination with authorities and incident-response bodies: establishing contact with the applicable SPOC/CSIRT and managing liaison with the supervisory authority.\\n- Data-subject communications: drafting clear, compliant notices to affected customers, including information to help them mitigate risk (this is standard practice to minimize harm and regulatory risk).\\n- Documentation and follow-up plan: producing a formal incident report outline, a remediation plan, and a schedule for final reporting as required.\\n\\nPractical steps you can start today (to share with the lawyer)\\n- Share the breach details: approximate time of discovery, systems involved, data categories (names, addresses, SSNs), estimated number of affected individuals, current containment actions, and any steps already taken to secure data.\\n- Confirm the fastest permissible notification path: whether to file an early warning to the SPOC/CSIRT, followed by a 72-hour incident notification to the supervisory authority and affected individuals (if required), plus a final monthly report.\\n- Establish who will draft and approve all communications (legal, privacy/compliance lead, and senior management sign-off).\\n- Set clear expectations on cost and deliverables within your 2,000–5,000 EUR budget for this incident, with scope defined (not just advisory, but practical drafting, notifications, and coordination).\\n\\nNotes on grounding in the Context\\n- The described 24-hour/72-hour/one-month timelines and the roles of SPOCs/CSIRTs come from the EU Directive material you provided. 
They guide the notification cadence and the cooperation framework your lawyer will implement. [2]\\n- Romanian civil-law background underscores the importance of timely and proper remediation and communications to limit liability in contractual or civil-fault contexts. This supports using early, formal notifications as part of the overall risk management. [1]\\n\\nIf you’d like, I can help you draft a quick briefing for a Romanian GDPR lawyer (including the key questions to ask and the documents to provide) to accelerate starting the engagement and the breach-notification process.\\n\\n### References\\n\\n- [1] romanian_civil_code_2009.txt\\n- [2] gdpr_2022_2555.txt', 'references': [{'reference_id': '1', 'file_path': 'romanian_civil_code_2009.txt', 'content': None}, {'reference_id': '2', 'file_path': 'gdpr_2022_2555.txt', 'content': None}]}\n",
71
- "INFO:langraph_agent:🔍 LightRAG response received:\n",
72
- "INFO:langraph_agent:📄 Context length: 5594 characters\n",
73
- "INFO:langraph_agent:📚 References found: 2\n",
74
- "INFO:langraph_agent:📝 Context preview: Thanks for the details. Given the breach happened in Romania and involves sensitive customer data (names, addresses, SSNs) affecting about 500 people, you should act quickly to align with GDPR obligations and the EU cyber-notification framework. Here’s a practical, lawyer-focused plan you can use right away.\n",
75
- "\n",
76
- "Immediate next steps (urgent)\n",
77
- "- Engage a Romanian-based GDPR/legal firm now (you’ve indicated a preference for a Romanian-language, Romania-based firm within a 2,000–5,000 EUR range). A loc...\n",
78
- "INFO:langraph_agent:⏱️ LightRAG query processing time: 64.635s\n",
79
- "INFO:langraph_agent:📝 Conversation history: [{'role': 'user', 'content': 'I need help with a data breach issue'}, {'role': 'assistant', 'content': 'I can help with that. Can you tell me more about the breach?'}, {'role': 'user', 'content': \"My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven't notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.\"}, {'role': 'assistant', 'content': \"I understand. To find the best lawyer for you, could you tell me: Are you looking for a Romanian-based firm or international? What language do you prefer? What's your budget range?\"}]\n",
80
- "INFO:langraph_agent:📝 Message: {'role': 'user', 'content': 'I need help with a data breach issue'}\n",
81
- "INFO:langraph_agent:📝 Message type: <class 'dict'>\n",
82
- "INFO:langraph_agent:📝 Message keys: dict_keys(['role', 'content'])\n",
83
- "INFO:langraph_agent:📝 Message: {'role': 'assistant', 'content': 'I can help with that. Can you tell me more about the breach?'}\n",
84
- "INFO:langraph_agent:📝 Message type: <class 'dict'>\n",
85
- "INFO:langraph_agent:📝 Message keys: dict_keys(['role', 'content'])\n",
86
- "INFO:langraph_agent:📝 Message: {'role': 'user', 'content': \"My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven't notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.\"}\n",
87
- "INFO:langraph_agent:📝 Message type: <class 'dict'>\n",
88
- "INFO:langraph_agent:📝 Message keys: dict_keys(['role', 'content'])\n",
89
- "INFO:langraph_agent:📝 Message: {'role': 'assistant', 'content': \"I understand. To find the best lawyer for you, could you tell me: Are you looking for a Romanian-based firm or international? What language do you prefer? What's your budget range?\"}\n",
90
- "INFO:langraph_agent:📝 Message type: <class 'dict'>\n",
91
- "INFO:langraph_agent:📝 Message keys: dict_keys(['role', 'content'])\n",
92
- "INFO:langraph_agent:📝 Created Messages stack: [HumanMessage(content='I need help with a data breach issue', additional_kwargs={}, response_metadata={}), AIMessage(content='I can help with that. Can you tell me more about the breach?', additional_kwargs={}, response_metadata={}, tool_calls=[], invalid_tool_calls=[]), HumanMessage(content=\"My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven't notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.\", additional_kwargs={}, response_metadata={}), AIMessage(content=\"I understand. To find the best lawyer for you, could you tell me: Are you looking for a Romanian-based firm or international? What language do you prefer? What's your budget range?\", additional_kwargs={}, response_metadata={}, tool_calls=[], invalid_tool_calls=[])]\n",
93
- "INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
94
- "INFO:langraph_agent:🤖 LLM Response type: <class 'langchain_core.messages.ai.AIMessage'>\n",
95
- "INFO:langraph_agent:🤖 LLM Response content length: 0\n",
96
- "INFO:langraph_agent:🤖 LLM Response has tool_calls: True\n",
97
- "INFO:langraph_agent:🤖 LLM Response tool_calls: [{'name': 'find_lawyers', 'args': {'query': 'Romanian-based GDPR breach-notification lawyer for a data breach in Romania involving 500 customers with names, addresses, and SSNs; urgent breach notification within GDPR timelines; Romanian language; budget 2000-5000 EUR; immediate help with notification to authorities and data subjects; experienced with SPOCs/CSIRTs and ANSPDCP coordination.'}, 'id': 'call_DYoaaEVoh5tCVoKjlnD3r2dy', 'type': 'tool_call'}]\n",
98
- "INFO:langraph_agent:🤖 LLM Response has invalid_tool_calls: True\n",
99
- "INFO:langraph_agent:🤖 LLM Response invalid_tool_calls: []\n",
100
- "INFO:langraph_agent:⏱️ Answer generation processing time: 9.758s\n",
101
- "INFO:langraph_agent:⏱️ Total query processing time: 74.393s\n",
102
- "INFO:langraph_agent:📚 References found: 2\n",
103
- "INFO:langraph_agent:🔍 Checking for tool calls in response_message...\n",
104
- "INFO:langraph_agent: Response type: <class 'NoneType'>\n",
105
- "INFO:langraph_agent: No response_message found in state\n",
106
- "INFO:langraph_agent: No tool calls, routing to end\n",
107
- "INFO:langraph_agent:final state: {'user_query': 'I prefer a Romanian-based firm, English language is not fine, and my budget is around 2000-5000 EUR for this incident. I need help immediately with breach notification.', 'conversation_history': [{'role': 'user', 'content': 'I need help with a data breach issue'}, {'role': 'assistant', 'content': 'I can help with that. Can you tell me more about the breach?'}, {'role': 'user', 'content': \"My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven't notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.\"}, {'role': 'assistant', 'content': \"I understand. To find the best lawyer for you, could you tell me: Are you looking for a Romanian-based firm or international? What language do you prefer? What's your budget range?\"}], 'lightrag_response': {'response': 'Thanks for the details. Given the breach happened in Romania and involves sensitive customer data (names, addresses, SSNs) affecting about 500 people, you should act quickly to align with GDPR obligations and the EU cyber-notification framework. Here’s a practical, lawyer-focused plan you can use right away.\\n\\nImmediate next steps (urgent)\\n- Engage a Romanian-based GDPR/legal firm now (you’ve indicated a preference for a Romanian-language, Romania-based firm within a 2,000–5,000 EUR range). A local lawyer can lead the breach-notification process, coordinate with authorities, and help prepare communications to customers.\\n- Prepare a rapid internal factsheet for the lawyer and the incident response team (what happened, when it started, what data were affected, which systems were involved, what containment steps have been taken, and who is affected). This will support the legal review and notifications.\\n- Initiate coordination with the relevant authorities and incident-response bodies. In the EU framework you’re operating under, important entities handle incident notifications and information sharing (Single Points of Contact and CSIRTs). Your Romanian-based counsel can establish the right communication channel immediately.\\n- Do not delay notifications if advised by your lawyer. The directive framework requires timely reporting to the appropriate authorities and, where required, to the data subjects.\\n\\nKey regulatory obligations to address with your lawyer (based on the context you provided)\\n- Early warning and incident notification timing: The Directive (EU) 2022/2555 envisions an early warning within 24 hours and an incident notification within 72 hours of becoming aware of a significant incident, with a final report not later than one month after notification. Your lawyer can help determine if the breach qualifies as a significant incident and ensure the required timelines are met through the proper channels. [2]\\n- Personal data breach notification to supervisory authorities: If the breach involves personal data, authorities emphasize informing the supervisory authority without undue delay and within the established timeframes when applicable. Your lawyer will assess the specifics of your breach under GDPR and initiate the correct notification process. [2]\\n- Cooperation channels (SPOCs and CSIRTs): In the EU context, Single Points of Contact (SPOCs) and Computer Security Incident Response Teams (CSIRTs) play a central role in coordinating notifications and information sharing across competent authorities. Your Romanian counsel can engage these channels to ensure fast, compliant communication and support. [2]\\n- Civil-law liability context in Romania: Under Romanian Civil Code provisions, contractual and other damages liabilities arise if obligations are not fulfilled, which underscores why timely breach notification and remediation are important from a legal-arbitration risk perspective. This helps explain why a formal notification and remediation plan are essential. [1]\\n\\nWhat to expect from a Romanian-based GDPR lawyer (typical scope, aligned with your budget)\\n- Immediate breach-notification package: assessment of whether the breach triggers GDPR notification to the supervisory authority and/or to data subjects; drafting and coordinating the actual notices; ensuring timelines are met under the directive and Romanian law.\\n- Coordination with authorities and incident-response bodies: establishing contact with the applicable SPOC/CSIRT and managing liaison with the supervisory authority.\\n- Data-subject communications: drafting clear, compliant notices to affected customers, including information to help them mitigate risk (this is standard practice to minimize harm and regulatory risk).\\n- Documentation and follow-up plan: producing a formal incident report outline, a remediation plan, and a schedule for final reporting as required.\\n\\nPractical steps you can start today (to share with the lawyer)\\n- Share the breach details: approximate time of discovery, systems involved, data categories (names, addresses, SSNs), estimated number of affected individuals, current containment actions, and any steps already taken to secure data.\\n- Confirm the fastest permissible notification path: whether to file an early warning to the SPOC/CSIRT, followed by a 72-hour incident notification to the supervisory authority and affected individuals (if required), plus a final monthly report.\\n- Establish who will draft and approve all communications (legal, privacy/compliance lead, and senior management sign-off).\\n- Set clear expectations on cost and deliverables within your 2,000–5,000 EUR budget for this incident, with scope defined (not just advisory, but practical drafting, notifications, and coordination).\\n\\nNotes on grounding in the Context\\n- The described 24-hour/72-hour/one-month timelines and the roles of SPOCs/CSIRTs come from the EU Directive material you provided. They guide the notification cadence and the cooperation framework your lawyer will implement. [2]\\n- Romanian civil-law background underscores the importance of timely and proper remediation and communications to limit liability in contractual or civil-fault contexts. This supports using early, formal notifications as part of the overall risk management. [1]\\n\\nIf you’d like, I can help you draft a quick briefing for a Romanian GDPR lawyer (including the key questions to ask and the documents to provide) to accelerate starting the engagement and the breach-notification process.\\n\\n### References\\n\\n- [1] romanian_civil_code_2009.txt\\n- [2] gdpr_2022_2555.txt', 'references': [{'reference_id': '1', 'file_path': 'romanian_civil_code_2009.txt', 'content': None}, {'reference_id': '2', 'file_path': 'gdpr_2022_2555.txt', 'content': None}]}, 'processed_context': None, 'relevant_documents': ['romanian_civil_code_2009.txt', 'gdpr_2022_2555.txt'], 'analysis_thoughts': None, 'final_response': '', 'query_timestamp': '2026-01-03T18:32:20.045214', 'processing_time': 74.39322590827942}\n",
108
- "INFO:langraph_agent:✅ Query processing completed successfully\n",
109
- "INFO:langraph_agent:📄 Response length: 0 characters\n"
110
- ]
111
- },
112
- {
113
- "data": {
114
- "text/plain": [
115
- "{'response': '',\n",
116
- " 'processing_time': 74.39322590827942,\n",
117
- " 'references': ['romanian_civil_code_2009.txt', 'gdpr_2022_2555.txt'],\n",
118
- " 'timestamp': '2026-01-03T18:32:20.045214'}"
119
- ]
120
- },
121
- "execution_count": 11,
122
- "metadata": {},
123
- "output_type": "execute_result"
124
- }
125
- ],
126
- "source": [
127
- "await agent.process_query(conversation_history=history,user_query=user_query)"
128
- ]
129
- }
130
- ],
131
- "metadata": {
132
- "kernelspec": {
133
- "display_name": "cyberlgl",
134
- "language": "python",
135
- "name": "python3"
136
- },
137
- "language_info": {
138
- "codemirror_mode": {
139
- "name": "ipython",
140
- "version": 3
141
- },
142
- "file_extension": ".py",
143
- "mimetype": "text/x-python",
144
- "name": "python",
145
- "nbconvert_exporter": "python",
146
- "pygments_lexer": "ipython3",
147
- "version": "3.12.12"
148
- }
149
- },
150
- "nbformat": 4,
151
- "nbformat_minor": 5
152
- }
 
test_openai_key.ipynb DELETED
@@ -1,155 +0,0 @@
1
- {
2
- "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {},
6
- "source": [
7
- "# Test OpenAI API Key\n",
8
- "\n",
9
- "This notebook tests if the OpenAI API key is working correctly."
10
- ]
11
- },
12
- {
13
- "cell_type": "code",
14
- "execution_count": 1,
15
- "metadata": {},
16
- "outputs": [
17
- {
18
- "name": "stdout",
19
- "output_type": "stream",
20
- "text": [
21
- "API Key loaded: sk-proj-zF...6Cny4xuHsA\n"
22
- ]
23
- }
24
- ],
25
- "source": [
26
- "# Load environment variables\n",
27
- "import os\n",
28
- "from dotenv import load_dotenv\n",
29
- "\n",
30
- "# Load the .env file\n",
31
- "load_dotenv('.env')\n",
32
- "\n",
33
- "# Get the API key\n",
34
- "api_key = os.getenv('OPENAI_API_KEY')\n",
35
- "print(f\"API Key loaded: {api_key[:10]}...{api_key[-10:] if api_key else 'None'}\")"
36
- ]
37
- },
38
- {
39
- "cell_type": "code",
40
- "execution_count": 2,
41
- "metadata": {},
42
- "outputs": [
43
- {
44
- "name": "stdout",
45
- "output_type": "stream",
46
- "text": [
47
- "❌ OpenAI API Test Failed: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}\n"
48
- ]
49
- }
50
- ],
51
- "source": [
52
- "# Test OpenAI API\n",
53
- "from openai import OpenAI\n",
54
- "\n",
55
- "try:\n",
56
- " client = OpenAI(api_key=api_key)\n",
57
- " \n",
58
- " # Simple test request\n",
59
- " response = client.chat.completions.create(\n",
60
- " model=\"gpt-5-nano-2025-08-07\",\n",
61
- " messages=[\n",
62
- " {\"role\": \"user\", \"content\": \"Hello, can you respond with just 'API key works!'?\"}\n",
63
- " ],\n",
64
- "\n",
65
- " )\n",
66
- " \n",
67
- " print(\"✅ OpenAI API Test Successful!\")\n",
68
- " print(f\"Response: {response.choices[0].message.content}\")\n",
69
- " \n",
70
- "except Exception as e:\n",
71
- " print(f\"❌ OpenAI API Test Failed: {e}\")"
72
- ]
73
- },
74
- {
75
- "cell_type": "code",
76
- "execution_count": null,
77
- "metadata": {},
78
- "outputs": [],
79
- "source": [
80
- "# Test embedding API\n",
81
- "try:\n",
82
- " response = client.embeddings.create(\n",
83
- " model=\"text-embedding-3-large\",\n",
84
- " input=\"Test embedding for cyber legal regulations\"\n",
85
- " )\n",
86
- " \n",
87
- " print(\"✅ OpenAI Embedding API Test Successful!\")\n",
88
- " print(f\"Embedding dimension: {len(response.data[0].embedding)}\")\n",
89
- " print(f\"First 5 values: {response.data[0].embedding[:5]}\")\n",
90
- " \n",
91
- "except Exception as e:\n",
92
- " print(f\"❌ OpenAI Embedding API Test Failed: {e}\")"
93
- ]
94
- },
95
- {
96
- "cell_type": "code",
97
- "execution_count": null,
98
- "metadata": {},
99
- "outputs": [],
100
- "source": [
101
- "# Test LightRAG connection with the API key\n",
102
- "try:\n",
103
- " import sys\n",
104
- " sys.path.append('../LightRAG')\n",
105
- " \n",
106
- " from lightrag import LightRAG, QueryParam\n",
107
- " from lightrag.llm import openai_complete_if_cache, openai_embedding\n",
108
- " \n",
109
- " rag = LightRAG(\n",
110
- " working_dir=\"./rag_storage\",\n",
111
- " llm_model_func=openai_complete_if_cache,\n",
112
- " llm_model_name=\"gpt-4o\",\n",
113
- " llm_model_kwargs={\"api_key\": api_key, \"model\": \"gpt-4o\"},\n",
114
- " embedding_func=openai_embedding,\n",
115
- " embedding_model_name=\"text-embedding-3-large\",\n",
116
- " embedding_model_kwargs={\"api_key\": api_key, \"model\": \"text-embedding-3-large\"}\n",
117
- " )\n",
118
- " \n",
119
- " # Test a simple query\n",
120
- " result = rag.query(\n",
121
- " \"What is NIS2?\",\n",
122
- " param=QueryParam(mode=\"naive\")\n",
123
- " )\n",
124
- " \n",
125
- " print(\"✅ LightRAG API Test Successful!\")\n",
126
- " print(f\"Response length: {len(result)} characters\")\n",
127
- " print(f\"Response preview: {result[:200]}...\")\n",
128
- " \n",
129
- "except Exception as e:\n",
130
- " print(f\"❌ LightRAG API Test Failed: {e}\")"
131
- ]
132
- }
133
- ],
134
- "metadata": {
135
- "kernelspec": {
136
- "display_name": "cyberlgl",
137
- "language": "python",
138
- "name": "python3"
139
- },
140
- "language_info": {
141
- "codemirror_mode": {
142
- "name": "ipython",
143
- "version": 3
144
- },
145
- "file_extension": ".py",
146
- "mimetype": "text/x-python",
147
- "name": "python",
148
- "nbconvert_exporter": "python",
149
- "pygments_lexer": "ipython3",
150
- "version": "3.12.12"
151
- }
152
- },
153
- "nbformat": 4,
154
- "nbformat_minor": 4
155
- }
 
 
test_tool_calling_demo.ipynb DELETED
@@ -1,676 +0,0 @@
1
- {
2
- "cells": [
3
- {
4
- "cell_type": "markdown",
5
- "metadata": {},
6
- "source": [
7
- "# CyberLegalAI - Tool Calling Demo\n",
8
- "\n",
9
- "This notebook demonstrates the new flexible LangGraph agent with tool-calling capabilities.\n",
10
- "\n",
11
- "## What you'll see:\n",
12
- "1. Agent initialization with different configurations\n",
13
- "2. Tool calling scenarios (knowledge graph, lawyer finder)\n",
14
- "3. Direct answer scenarios (no tools)\n",
15
- "4. Multiple tool calls in sequence"
16
- ]
17
- },
18
- {
19
- "cell_type": "code",
20
- "execution_count": 1,
21
- "metadata": {},
22
- "outputs": [],
23
- "source": [
24
- "# Install required packages if needed\n",
25
- "# !pip install -r requirements.txt"
26
- ]
27
- },
28
- {
29
- "cell_type": "code",
30
- "execution_count": 2,
31
- "metadata": {},
32
- "outputs": [
33
- {
34
- "data": {
35
- "text/plain": [
36
- "True"
37
- ]
38
- },
39
- "execution_count": 2,
40
- "metadata": {},
41
- "output_type": "execute_result"
42
- }
43
- ],
44
- "source": [
45
- "import asyncio\n",
46
- "import json\n",
47
- "from dotenv import load_dotenv\n",
48
- "\n",
49
- "from langraph_agent import CyberLegalAgent\n",
50
- "from tools import tools_for_client, tools_for_lawyer\n",
51
- "from prompts import SYSTEM_PROMPT_CLIENT, SYSTEM_PROMPT_LAWYER\n",
52
- "\n",
53
- "load_dotenv()"
54
- ]
55
- },
56
- {
57
- "cell_type": "markdown",
58
- "metadata": {},
59
- "source": [
60
- "## 1. Initialize Agents\n",
61
- "\n",
62
- "We'll create two agents with different configurations:\n",
63
- "- **Client Agent**: Friendly tone, can find lawyers + query knowledge graph\n",
64
- "- **Lawyer Agent**: Professional tone, only queries knowledge graph"
65
- ]
66
- },
67
- {
68
- "cell_type": "code",
69
- "execution_count": 3,
70
- "metadata": {},
71
- "outputs": [
72
- {
73
- "name": "stdout",
74
- "output_type": "stream",
75
- "text": [
76
- "✅ Agents initialized successfully!\n",
77
- "\n",
78
- "Client Agent Tools: ['query_knowledge_graph', 'find_lawyers']\n",
79
- "Lawyer Agent Tools: ['query_knowledge_graph']\n"
80
- ]
81
- }
82
- ],
83
- "source": [
84
- "# Initialize client agent\n",
85
- "client_agent = CyberLegalAgent(\n",
86
- " llm_provider=\"openai\",\n",
87
- " system_prompt=SYSTEM_PROMPT_CLIENT,\n",
88
- " tools=tools_for_client\n",
89
- ")\n",
90
- "\n",
91
- "# Initialize lawyer agent\n",
92
- "lawyer_agent = CyberLegalAgent(\n",
93
- " llm_provider=\"openai\",\n",
94
- " system_prompt=SYSTEM_PROMPT_LAWYER,\n",
95
- " tools=tools_for_lawyer\n",
96
- ")\n",
97
- "\n",
98
- "print(\"✅ Agents initialized successfully!\")\n",
99
- "print(f\"\\nClient Agent Tools: {[t.name for t in client_agent.tools]}\")\n",
100
- "print(f\"Lawyer Agent Tools: {[t.name for t in lawyer_agent.tools]}\")"
101
- ]
102
- },
103
- {
104
- "cell_type": "markdown",
105
- "metadata": {},
106
- "source": [
107
- "## 2. Test 1: Direct Answer (No Tool Call)\n",
108
- "\n",
109
- "**Scenario**: User asks a general question that the LLM can answer with its knowledge.\n",
110
- "\n",
111
- "**Expected Behavior**: Agent responds directly without calling any tools."
112
- ]
113
- },
114
- {
115
- "cell_type": "code",
116
- "execution_count": 4,
117
- "metadata": {},
118
- "outputs": [
119
- {
120
- "name": "stdout",
121
- "output_type": "stream",
122
- "text": [
123
- "================================================================================\n",
124
- "TEST 1: Direct Answer (No Tool Call)\n",
125
- "================================================================================\n",
126
- "\n",
127
- "👤 User Query: What is GDPR and what are the main principles?\n",
128
- "\n",
129
- "Processing...\n",
130
- "\n"
131
- ]
132
- },
133
- {
134
- "name": "stderr",
135
- "output_type": "stream",
136
- "text": [
137
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
138
- "INFO:langraph_agent:🔧 Calling tools: ['query_knowledge_graph']\n"
139
- ]
140
- },
141
- {
142
- "name": "stdout",
143
- "output_type": "stream",
144
- "text": [
145
- "[{'name': 'query_knowledge_graph', 'args': {'query': 'What is GDPR and what are the main principles', 'conversation_history': []}, 'id': 'd156ff401', 'type': 'tool_call'}]\n"
146
- ]
147
- },
148
- {
149
- "name": "stderr",
150
- "output_type": "stream",
151
- "text": [
152
- "INFO:utils:Query successful;{'response': '**What is the GDPR?** \\n\\nThe **General Data Protection Regulation (GDPR)**, formally **Regulation (EU) 2016/679**, is a European Union regulation that establishes a comprehensive legal framework for the protection of personal data and privacy of natural persons within the EU and the European Economic Area. It sets out rules for how personal data must be processed, stored, and erased, granting individuals rights such as access to their data, the right to be forgotten, and the right to data portability. The regulation also obliges organisations to be transparent about their data‑handling practices and to embed data‑protection safeguards into the design of their systems【1】. \\n\\n**Main principles of the GDPR** \\n\\n| Principle | What it means (as reflected in the GDPR) |\\n|-----------|-------------------------------------------|\\n| **Lawfulness, fairness and transparency** | Personal data must be processed on a lawful basis, in a fair manner, and organisations must provide clear information to data subjects about how their data are used. |\\n| **Purpose limitation** | Data may only be collected for specified, explicit, and legitimate purposes and must not be further processed in a way that is incompatible with those purposes【2】. |\\n| **Data minimisation** | Only the minimum amount of personal data necessary for the intended purpose may be collected and processed【3】. |\\n| **Accuracy** | Personal data must be kept accurate and up‑to‑date; inaccurate data should be corrected or erased without delay. |\\n| **Storage limitation** | Data should be retained only for as long as necessary for the purposes for which it was collected. |\\n| **Integrity and confidentiality (security)** | Appropriate technical and organisational measures must protect data against unauthorised access, loss, or destruction. 
|\\n| **Accountability** | Controllers are responsible for, and must be able to demonstrate, compliance with all other principles. |\\n| **Data‑protection‑by‑design and by‑default** | Systems and processes must incorporate data‑protection measures from the outset and ensure that, by default, only the data necessary for a specific purpose are processed【4】. |\\n\\nThese principles together create a high‑level standard for data‑protection compliance across the Union, shaping how organisations handle personal information and reinforcing individuals’ control over their own data【5】. \\n\\n---\\n\\n### References\\n\\n- [1] gdpr_2022_2555.txt \\n- [2] cyber_resilience_act_2024_2847.txt \\n- [3] cyber_resilience_act_2024_2847.txt \\n- [4] cyber_resilience_act_2024_2847.txt \\n- [5] nis2_2022_2555.txt ', 'references': [{'reference_id': '1', 'file_path': 'gdpr_2022_2555.txt', 'content': None}, {'reference_id': '2', 'file_path': 'cyber_resilience_act_2024_2847.txt', 'content': None}, {'reference_id': '3', 'file_path': 'nis2_2022_2555.txt', 'content': None}]}\n",
153
- "INFO:langraph_agent:🔧 Tool query_knowledge_graph returned: **What is the GDPR?** \n",
154
- "\n",
155
- "The **General Data Protection Regulation (GDPR)**, formally **Regulation (EU) 2016/679**, is a European Union regulation that establishes a comprehensive legal framework for the protection of personal data and privacy of natural persons within the EU and the European Economic Area. It sets out rules for how personal data must be processed, stored, and erased, granting individuals rights such as access to their data, the right to be forgotten, and the right to data portability. The regulation also obliges organisations to be transparent about their data‑handling practices and to embed data‑protection safeguards into the design of their systems【1】. \n",
156
- "\n",
157
- "**Main principles of the GDPR** \n",
158
- "\n",
159
- "| Principle | What it means (as reflected in the GDPR) |\n",
160
- "|-----------|-------------------------------------------|\n",
161
- "| **Lawfulness, fairness and transparency** | Personal data must be processed on a lawful basis, in a fair manner, and organisations must provide clear information to data subjects about how their data are used. |\n",
162
- "| **Purpose limitation** | Data may only be collected for specified, explicit, and legitimate purposes and must not be further processed in a way that is incompatible with those purposes【2】. |\n",
163
- "| **Data minimisation** | Only the minimum amount of personal data necessary for the intended purpose may be collected and processed【3】. |\n",
164
- "| **Accuracy** | Personal data must be kept accurate and up‑to‑date; inaccurate data should be corrected or erased without delay. |\n",
165
- "| **Storage limitation** | Data should be retained only for as long as necessary for the purposes for which it was collected. |\n",
166
- "| **Integrity and confidentiality (security)** | Appropriate technical and organisational measures must protect data against unauthorised access, loss, or destruction. |\n",
167
- "| **Accountability** | Controllers are responsible for, and must be able to demonstrate, compliance with all other principles. |\n",
168
- "| **Data‑protection‑by‑design and by‑default** | Systems and processes must incorporate data‑protection measures from the outset and ensure that, by default, only the data necessary for a specific purpose are processed【4】. |\n",
169
- "\n",
170
- "These principles together create a high‑level standard for data‑protection compliance across the Union, shaping how organisations handle personal information and reinforcing individuals’ control over their own data【5】. \n",
171
- "\n",
172
- "---\n",
173
- "\n",
174
- "### References\n",
175
- "\n",
176
- "- [1] gdpr_2022_2555.txt \n",
177
- "- [2] cyber_resilience_act_2024_2847.txt \n",
178
- "- [3] cyber_resilience_act_2024_2847.txt \n",
179
- "- [4] cyber_resilience_act_2024_2847.txt \n",
180
- "- [5] nis2_2022_2555.txt \n",
181
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
182
- ]
183
- },
184
- {
185
- "name": "stdout",
186
- "output_type": "stream",
187
- "text": [
188
- "--------------------------------------------------------------------------------\n",
189
- "🤖 Agent Response:\n",
190
- "--------------------------------------------------------------------------------\n",
191
- "**What is the GDPR?** \n",
192
- "The **General Data Protection Regulation (GDPR)** – formally **Regulation (EU) 2016/679** – is the EU’s main law on data protection. It sets a single, EU‑wide framework for how personal data (any information that can identify a person) must be handled. The goal is to give individuals more control over their data while requiring organisations to be transparent, secure, and accountable when they collect, store, or use that data.\n",
193
- "\n",
194
- "**The main GDPR principles** \n",
195
- "These are the building blocks that every data‑controller (the entity that decides why and how data is processed) must follow:\n",
196
- "\n",
197
- "| Principle | Plain‑English meaning |\n",
198
- "|-----------|-----------------------|\n",
199
- "| **Lawfulness, fairness & transparency** | You need a legal reason to process data, must treat people fairly,\n",
200
- "...\n",
201
- "\n",
202
- "⏱️ Processing Time: 0.00s\n",
203
- "📅 Timestamp: 2026-01-06T11:52:42.305179\n",
204
- "================================================================================\n"
205
- ]
206
- }
207
- ],
208
- "source": [
209
- "async def test_direct_answer():\n",
210
- " print(\"=\" * 80)\n",
211
- " print(\"TEST 1: Direct Answer (No Tool Call)\")\n",
212
- " print(\"=\" * 80)\n",
213
- " \n",
214
- " user_query = \"What is GDPR and what are the main principles?\"\n",
215
- " print(f\"\\n👤 User Query: {user_query}\\n\")\n",
216
- " print(\"Processing...\\n\")\n",
217
- " \n",
218
- " result = await client_agent.process_query(\n",
219
- " user_query=user_query,\n",
220
- " conversation_history=[]\n",
221
- " )\n",
222
- " \n",
223
- " print(\"-\" * 80)\n",
224
- " print(\"🤖 Agent Response:\")\n",
225
- " print(\"-\" * 80)\n",
226
- " print(result['response'][:800])\n",
227
- " print(\"...\" if len(result['response']) > 800 else \"\")\n",
228
- " \n",
229
- " print(f\"\\n⏱️ Processing Time: {result['processing_time']:.2f}s\")\n",
230
- " print(f\"📅 Timestamp: {result['timestamp']}\")\n",
231
- " print(\"=\" * 80)\n",
232
- "\n",
233
- "await test_direct_answer()"
234
- ]
235
- },
236
- {
237
- "cell_type": "markdown",
238
- "metadata": {},
239
- "source": [
240
- "## 3. Test 2: Tool Calling - Knowledge Graph\n",
241
- "\n",
242
- "**Scenario**: User asks a specific legal question requiring accurate information from EU regulations.\n",
243
- "\n",
244
- "**Expected Behavior**: Agent calls `query_knowledge_graph` tool, receives results, then formulates answer."
245
- ]
246
- },
247
- {
248
- "cell_type": "code",
249
- "execution_count": 5,
250
- "metadata": {},
251
- "outputs": [
252
- {
253
- "name": "stdout",
254
- "output_type": "stream",
255
- "text": [
256
- "================================================================================\n",
257
- "TEST 2: Tool Calling - Knowledge Graph Query\n",
258
- "================================================================================\n",
259
- "\n",
260
- "👤 User Query: What are the data breach notification requirements under GDPR?\n",
261
- "\n",
262
- "Processing...\n",
263
- "\n"
264
- ]
265
- },
266
- {
267
- "name": "stderr",
268
- "output_type": "stream",
269
- "text": [
270
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
271
- "INFO:langraph_agent:🔧 Calling tools: ['query_knowledge_graph']\n"
272
- ]
273
- },
274
- {
275
- "name": "stdout",
276
- "output_type": "stream",
277
- "text": [
278
- "[{'name': 'query_knowledge_graph', 'args': {'query': 'GDPR data breach notification requirements', 'conversation_history': []}, 'id': 'f8f7b9d4f', 'type': 'tool_call'}]\n"
279
- ]
280
- },
281
- {
282
- "name": "stderr",
283
- "output_type": "stream",
284
- "text": [
285
- "INFO:utils:Query successful;{'response': '**GDPR data‑breach notification requirements**\\n\\n- **Timing of the first notification** \\n The regulator requires that a manufacturer (or any data‑controller) submit an **initial incident notification within 72\\u202fhours** of becoming aware of a breach that could affect the security of a product with digital elements. If a severe incident is identified earlier, a **warning must be sent within 24\\u202fhours**\\u202f【1】. \\n\\n- **Content of the initial notice** \\n The notification must contain: \\n 1. A description of the incident and its **severity and impact**; \\n 2. The **type of threat or root cause** that triggered the breach; \\n 3. Any **corrective or mitigating measures** already taken and those that users can apply; \\n 4. An assessment of whether the breach is **suspected of being caused by unlawful or malicious acts**; \\n 5. An indication of the **sensitivity** of the information disclosed. \\n\\n- **Final report** \\n A **final report** – detailed, including the incident’s full description, the root cause, and ongoing mitigation actions – must be submitted **no later than one month** after the initial notification【1】. \\n\\n- **Use of a single reporting platform** \\n Member States may provide a **single entry point** (e.g., an electronic portal) for submitting breach notifications. This platform can be used for GDPR‑related incident reports, ensuring that the same technical means serve multiple legal obligations and reducing administrative burden【2】. \\n\\n- **Obligation to inform data subjects** \\n When the breach is likely to result in a **high risk to the rights and freedoms of natural persons**, the controller must **inform the affected individuals without undue delay**, providing clear, understandable information about the nature of the breach and recommended protective steps. 
\\n\\n- **Supervisory‑authority role** \\n Supervisory Authorities, established under the GDPR, receive the notifications, assess the risk, may request further information, and are responsible for enforcing any required remedial actions. \\n\\nThese requirements together ensure that personal‑data breaches are reported promptly, transparently, and with sufficient detail for authorities and affected individuals to respond effectively. \\n\\n### References\\n\\n- [1] cyber_resilience_act_2024_2847.txt \\n- [2] nis2_2022_2555.txt ', 'references': [{'reference_id': '1', 'file_path': 'cyber_resilience_act_2024_2847.txt', 'content': None}, {'reference_id': '2', 'file_path': 'nis2_2022_2555.txt', 'content': None}, {'reference_id': '3', 'file_path': 'gdpr_2022_2555.txt', 'content': None}]}\n",
286
- "INFO:langraph_agent:🔧 Tool query_knowledge_graph returned: **GDPR data‑breach notification requirements**\n",
287
- "\n",
288
- "- **Timing of the first notification** \n",
289
- " The regulator requires that a manufacturer (or any data‑controller) submit an **initial incident notification within 72 hours** of becoming aware of a breach that could affect the security of a product with digital elements. If a severe incident is identified earlier, a **warning must be sent within 24 hours** 【1】. \n",
290
- "\n",
291
- "- **Content of the initial notice** \n",
292
- " The notification must contain: \n",
293
- " 1. A description of the incident and its **severity and impact**; \n",
294
- " 2. The **type of threat or root cause** that triggered the breach; \n",
295
- " 3. Any **corrective or mitigating measures** already taken and those that users can apply; \n",
296
- " 4. An assessment of whether the breach is **suspected of being caused by unlawful or malicious acts**; \n",
297
- " 5. An indication of the **sensitivity** of the information disclosed. \n",
298
- "\n",
299
- "- **Final report** \n",
300
- " A **final report** – detailed, including the incident’s full description, the root cause, and ongoing mitigation actions – must be submitted **no later than one month** after the initial notification【1】. \n",
301
- "\n",
302
- "- **Use of a single reporting platform** \n",
303
- " Member States may provide a **single entry point** (e.g., an electronic portal) for submitting breach notifications. This platform can be used for GDPR‑related incident reports, ensuring that the same technical means serve multiple legal obligations and reducing administrative burden【2】. \n",
304
- "\n",
305
- "- **Obligation to inform data subjects** \n",
306
- " When the breach is likely to result in a **high risk to the rights and freedoms of natural persons**, the controller must **inform the affected individuals without undue delay**, providing clear, understandable information about the nature of the breach and recommended protective steps. \n",
307
- "\n",
308
- "- **Supervisory‑authority role** \n",
309
- " Supervisory Authorities, established under the GDPR, receive the notifications, assess the risk, may request further information, and are responsible for enforcing any required remedial actions. \n",
310
- "\n",
311
- "These requirements together ensure that personal‑data breaches are reported promptly, transparently, and with sufficient detail for authorities and affected individuals to respond effectively. \n",
312
- "\n",
313
- "### References\n",
314
- "\n",
315
- "- [1] cyber_resilience_act_2024_2847.txt \n",
316
- "- [2] nis2_2022_2555.txt \n",
317
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
318
- ]
319
- },
320
- {
321
- "name": "stdout",
322
- "output_type": "stream",
323
- "text": [
324
- "--------------------------------------------------------------------------------\n",
325
- "🤖 Agent Response:\n",
326
- "--------------------------------------------------------------------------------\n",
327
- "**GDPR data‑breach notification requirements – plain‑language summary**\n",
328
- "\n",
329
- "| What you must do | When you must do it | What you need to include |\n",
330
- "|------------------|---------------------|--------------------------|\n",
331
- "| **Notify the supervisory authority** (the data‑protection regulator) | **Within 72 hours** of becoming aware of a breach that could affect personal data. If the breach is clearly severe, you must send an early **warning within 24 hours**. | • A clear description of what happened <br>• The severity and potential impact <br>• The cause or type of threat (e.g., hacking, lost laptop) <br>• Measures already taken and any steps users should take <br>• Whether the breach appears to be caused by unlawful or malicious acts <br>• How sensitive the disclosed information is |\n",
332
- "| **Submit a f\n",
333
- "...\n",
334
- "\n",
335
- "⏱️ Processing Time: 0.00s\n",
336
- "📅 Timestamp: 2026-01-06T11:53:48.934270\n",
337
- "================================================================================\n"
338
- ]
339
- }
340
- ],
341
- "source": [
342
- "async def test_knowledge_graph_query():\n",
343
- " print(\"=\" * 80)\n",
344
- " print(\"TEST 2: Tool Calling - Knowledge Graph Query\")\n",
345
- " print(\"=\" * 80)\n",
346
- " \n",
347
- " user_query = \"What are the data breach notification requirements under GDPR?\"\n",
348
- " print(f\"\\n👤 User Query: {user_query}\\n\")\n",
349
- " print(\"Processing...\\n\")\n",
350
- " \n",
351
- " result = await client_agent.process_query(\n",
352
- " user_query=user_query,\n",
353
- " conversation_history=[]\n",
354
- " )\n",
355
- " \n",
356
- " print(\"-\" * 80)\n",
357
- " print(\"🤖 Agent Response:\")\n",
358
- " print(\"-\" * 80)\n",
359
- " print(result['response'][:800])\n",
360
- " print(\"...\" if len(result['response']) > 800 else \"\")\n",
361
- " \n",
362
- " print(f\"\\n⏱️ Processing Time: {result['processing_time']:.2f}s\")\n",
363
- " print(f\"📅 Timestamp: {result['timestamp']}\")\n",
364
- " print(\"=\" * 80)\n",
365
- "\n",
366
- "await test_knowledge_graph_query()"
367
- ]
368
- },
369
- {
370
- "cell_type": "markdown",
371
- "metadata": {},
372
- "source": [
373
- "## 4. Test 3: Tool Calling - Find Lawyers\n",
374
- "\n",
375
- "**Scenario**: User with a data breach issue needs lawyer recommendations.\n",
376
- "\n",
377
- "**Expected Behavior**: Agent calls `find_lawyers` tool, receives recommendations, then presents them."
378
- ]
379
- },
380
- {
381
- "cell_type": "code",
382
- "execution_count": null,
383
- "metadata": {},
384
- "outputs": [
385
- {
386
- "name": "stdout",
387
- "output_type": "stream",
388
- "text": [
389
- "================================================================================\n",
390
- "TEST 3: Tool Calling - Find Lawyers\n",
391
- "================================================================================\n",
392
- "\n",
393
- "👤 User Query: I prefer a Romanian-based firm, English language is not fine, and my budget is around 2000-5000 EUR for this incident. I need help immediately with breach notification.\n",
394
- "\n",
395
- "Processing...\n",
396
- "\n"
397
- ]
398
- },
399
- {
400
- "name": "stderr",
401
- "output_type": "stream",
402
- "text": [
403
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
404
- "INFO:langraph_agent:🔧 Calling tools: ['find_lawyers']\n"
405
- ]
406
- },
407
- {
408
- "name": "stdout",
409
- "output_type": "stream",
410
- "text": [
411
- "[{'name': 'find_lawyers', 'args': {'query': 'Romanian law firm for GDPR data breach notification, budget 2000-5000 EUR, immediate help, English not required', 'conversation_history': [{'role': 'user', 'content': \"My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven't notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.\"}, {'role': 'assistant', 'content': 'I can help with that. Can you tell me more about the breach?'}, {'role': 'user', 'content': 'I prefer a Romanian-based firm, English language is not fine, and my budget is around 2000-5000 EUR for this incident. I need help immediately with breach notification.'}]}, 'id': 'c53ab7aa9', 'type': 'tool_call'}]\n"
412
- ]
413
- },
414
- {
415
- "name": "stderr",
416
- "output_type": "stream",
417
- "text": [
418
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 404 Not Found\"\n",
419
- "INFO:langraph_agent:🔧 Tool find_lawyers returned: Error finding lawyers: Error code: 404 - {'message': 'Model gpt-5-nano-2025-08-07 does not exist or you do not have access to it.', 'type': 'not_found_error', 'param': 'model', 'code': 'model_not_found'}\n",
420
- "INFO:httpx:HTTP Request: POST https://api.cerebras.ai/v1/chat/completions \"HTTP/1.1 429 Too Many Requests\"\n",
421
- "INFO:openai._base_client:Retrying request to /chat/completions in 59.000000 seconds\n"
422
- ]
423
- }
424
- ],
425
- "source": [
426
- "async def test_find_lawyers():\n",
427
- " print(\"=\" * 80)\n",
428
- " print(\"TEST 3: Tool Calling - Find Lawyers\")\n",
429
- " print(\"=\" * 80)\n",
430
- " \n",
431
- " # Create conversation history\n",
432
- " history = [\n",
433
- " {'role': 'user', 'content': 'I need help with a data breach issue'},\n",
434
- " {'role': 'assistant', 'content': 'I can help with that. Can you tell me more about the breach?'},\n",
435
- " {'role': 'user', 'content': 'My company is in Romania and experienced a data breach. Customer names, addresses, and SSNs were stolen. We have about 500 affected customers. The breach occurred 2 days ago. We haven\\'t notified authorities yet. I need a lawyer to help with GDPR compliance and breach notification.'}\n",
436
- " ]\n",
437
- " \n",
438
- " user_query = \"I prefer a Romanian-based firm, English language is not fine, and my budget is around 2000-5000 EUR for this incident. I need help immediately with breach notification.\"\n",
439
- " print(f\"\\n👤 User Query: {user_query}\\n\")\n",
440
- " print(\"Processing...\\n\")\n",
441
- " \n",
442
- " result = await client_agent.process_query(\n",
443
- " user_query=user_query,\n",
444
- " conversation_history=history\n",
445
- " )\n",
446
- " \n",
447
- " print(\"-\" * 80)\n",
448
- " print(\"🤖 Agent Response:\")\n",
449
- " print(\"-\" * 80)\n",
450
- " print(result['response'][:1000])\n",
451
- " print(\"...\" if len(result['response']) > 1000 else \"\")\n",
452
- " \n",
453
- " print(f\"\\n⏱️ Processing Time: {result['processing_time']:.2f}s\")\n",
454
- " print(f\"📅 Timestamp: {result['timestamp']}\")\n",
455
- " print(\"=\" * 80)\n",
456
- "\n",
457
- "await test_find_lawyers()"
458
- ]
459
- },
460
- {
461
- "cell_type": "markdown",
462
- "metadata": {},
463
- "source": [
464
- "## 5. Test 4: Lawyer Agent - Professional Tone\n",
465
- "\n",
466
- "**Scenario**: Legal professional asks a technical question about NIS2.\n",
467
- "\n",
468
- "**Expected Behavior**: Lawyer agent responds with professional, technical language using knowledge graph."
469
- ]
470
- },
471
- {
472
- "cell_type": "code",
473
- "execution_count": null,
474
- "metadata": {},
475
- "outputs": [],
476
- "source": [
477
- "async def test_lawyer_agent():\n",
478
- " print(\"=\" * 80)\n",
479
- " print(\"TEST 4: Lawyer Agent - Professional Technical Response\")\n",
480
- " print(\"=\" * 80)\n",
481
- " \n",
482
- " user_query = \"What are the data breach notification requirements under NIS2 Directive?\"\n",
483
- " print(f\"\\n👨‍⚖️ Lawyer Query: {user_query}\\n\")\n",
484
- " print(\"Processing...\\n\")\n",
485
- " \n",
486
- " result = await lawyer_agent.process_query(\n",
487
- " user_query=user_query,\n",
488
- " conversation_history=[]\n",
489
- " )\n",
490
- " \n",
491
- " print(\"-\" * 80)\n",
492
- " print(\"🤖 Agent Response:\")\n",
493
- " print(\"-\" * 80)\n",
494
- " print(result['response'][:800])\n",
495
- " print(\"...\" if len(result['response']) > 800 else \"\")\n",
496
- " \n",
497
- " print(f\"\\n⏱️ Processing Time: {result['processing_time']:.2f}s\")\n",
498
- " print(f\"📅 Timestamp: {result['timestamp']}\")\n",
499
- " print(\"=\" * 80)\n",
500
- "\n",
501
- "await test_lawyer_agent()"
502
- ]
503
- },
504
- {
505
- "cell_type": "markdown",
506
- "metadata": {},
507
- "source": [
508
- "## 6. Test 5: Tool Choice Decision\n",
509
- "\n",
510
- "**Scenario**: User asks something that could be answered either way.\n",
511
- "\n",
512
- "**Expected Behavior**: Agent decides whether to call tools or answer directly based on question complexity."
513
- ]
514
- },
515
- {
516
- "cell_type": "code",
517
- "execution_count": null,
518
- "metadata": {},
519
- "outputs": [],
520
- "source": [
521
- "async def test_tool_choice():\n",
522
- " print(\"=\" * 80)\n",
523
- " print(\"TEST 5: Tool Choice Decision\")\n",
524
- " print(\"=\" * 80)\n",
525
- " \n",
526
- " # Simple question - might answer directly\n",
527
- " simple_query = \"What is the purpose of the NIS2 Directive?\"\n",
528
- " print(f\"\\n👤 Simple Query: {simple_query}\\n\")\n",
529
- " print(\"Processing...\\n\")\n",
530
- " \n",
531
- " result = await client_agent.process_query(\n",
532
- " user_query=simple_query,\n",
533
- " conversation_history=[]\n",
534
- " )\n",
535
- " \n",
536
- " print(\"-\" * 80)\n",
537
- " print(\"🤖 Agent Response:\")\n",
538
- " print(\"-\" * 80)\n",
539
- " print(result['response'][:600])\n",
540
- " print(\"...\" if len(result['response']) > 600 else \"\")\n",
541
- " \n",
542
- " print(f\"\\n⏱️ Processing Time: {result['processing_time']:.2f}s\")\n",
543
- " print(\"=\" * 80)\n",
544
- "\n",
545
- "await test_tool_choice()"
546
- ]
547
- },
548
- {
549
- "cell_type": "markdown",
550
- "metadata": {},
551
- "source": [
552
- "## 7. Compare Processing Times\n",
553
- "\n",
554
- "Let's run all tests and compare their processing times to understand the performance impact of tool calling."
555
- ]
556
- },
557
- {
558
- "cell_type": "code",
559
- "execution_count": null,
560
- "metadata": {},
561
- "outputs": [],
562
- "source": [
563
- "import pandas as pd\n",
564
- "\n",
565
- "async def run_all_tests():\n",
566
- " results = []\n",
567
- " \n",
568
- " # Test 1: Direct answer\n",
569
- " result1 = await client_agent.process_query(\n",
570
- " user_query=\"What is GDPR?\",\n",
571
- " conversation_history=[]\n",
572
- " )\n",
573
- " results.append({\n",
574
- " 'Test': 'Direct Answer',\n",
575
- " 'Tools Called': 0,\n",
576
- " 'Processing Time (s)': result1['processing_time']\n",
577
- " })\n",
578
- " \n",
579
- " # Test 2: Knowledge graph\n",
580
- " result2 = await client_agent.process_query(\n",
581
- " user_query=\"What are GDPR breach notification requirements?\",\n",
582
- " conversation_history=[]\n",
583
- " )\n",
584
- " results.append({\n",
585
- " 'Test': 'Knowledge Graph Query',\n",
586
- " 'Tools Called': 1,\n",
587
- " 'Processing Time (s)': result2['processing_time']\n",
588
- " })\n",
589
- " \n",
590
- " # Test 3: Find lawyers\n",
591
- " result3 = await client_agent.process_query(\n",
592
- " user_query=\"I need a lawyer for a GDPR data breach in Romania\",\n",
593
- " conversation_history=[\n",
594
- " {'role': 'user', 'content': 'My company experienced a data breach in Romania with 500 affected customers.'}\n",
595
- " ]\n",
596
- " )\n",
597
- " results.append({\n",
598
- " 'Test': 'Find Lawyers',\n",
599
- " 'Tools Called': 1,\n",
600
- " 'Processing Time (s)': result3['processing_time']\n",
601
- " })\n",
602
- " \n",
603
- " # Test 4: Lawyer agent\n",
604
- " result4 = await lawyer_agent.process_query(\n",
605
- " user_query=\"What are NIS2 notification requirements?\",\n",
606
- " conversation_history=[]\n",
607
- " )\n",
608
- " results.append({\n",
609
- " 'Test': 'Lawyer Agent Query',\n",
610
- " 'Tools Called': 1,\n",
611
- " 'Processing Time (s)': result4['processing_time']\n",
612
- " })\n",
613
- " \n",
614
- " return pd.DataFrame(results)\n",
615
- "\n",
616
- "df = await run_all_tests()\n",
617
- "print(\"\\n\" + \"=\"*80)\n",
618
- "print(\"TEST RESULTS SUMMARY\")\n",
619
- "print(\"=\"*80)\n",
620
- "display(df)\n",
621
- "\n",
622
- "print(\"\\n💡 Insights:\")\n",
623
- "print(\"- Direct answers are fastest (no tool overhead)\")\n",
624
- "print(\"- Tool calls add processing time but provide accurate, sourced information\")\n",
625
- "print(\"- The agent intelligently chooses when to use tools based on query complexity\")"
626
- ]
627
- },
628
- {
629
- "cell_type": "markdown",
630
- "metadata": {},
631
- "source": [
632
- "## Summary\n",
633
- "\n",
634
- "### Key Takeaways:\n",
635
- "\n",
636
- "1. **Flexible Architecture**: The agent can be initialized with different system prompts and tool sets\n",
637
- "\n",
638
- "2. **Intelligent Tool Selection**: The LLM decides when to call tools based on the query\n",
639
- "\n",
640
- "3. **Iterative Process**: Tools can be called multiple times, with results fed back to the agent\n",
641
- "\n",
642
- "4. **User Type Specialization**: Different prompts and tools for clients vs lawyers\n",
643
- "\n",
644
- "5. **Performance Trade-off**: Direct answers are faster, but tool calls provide more accurate, sourced information\n",
645
- "\n",
646
- "### Architecture Benefits:\n",
647
- "- ✅ Modular design - easy to add new tools\n",
648
- "- ✅ Clear separation of concerns\n",
649
- "- ✅ Flexible configuration\n",
650
- "- ✅ Maintains conversation context\n",
651
- "- ✅ Suitable for API integration"
652
- ]
653
- }
654
- ],
655
- "metadata": {
656
- "kernelspec": {
657
- "display_name": "cyberlgl",
658
- "language": "python",
659
- "name": "python3"
660
- },
661
- "language_info": {
662
- "codemirror_mode": {
663
- "name": "ipython",
664
- "version": 3
665
- },
666
- "file_extension": ".py",
667
- "mimetype": "text/x-python",
668
- "name": "python",
669
- "nbconvert_exporter": "python",
670
- "pygments_lexer": "ipython3",
671
- "version": "3.12.12"
672
- }
673
- },
674
- "nbformat": 4,
675
- "nbformat_minor": 4
676
- }
agent_state.py → utils/conversation_manager.py RENAMED

@@ -1,31 +1,12 @@
  #!/usr/bin/env python3
  """
- Agent state management for the LangGraph cyber-legal assistant
  """

- from typing import TypedDict, List, Dict, Any, Optional
  from datetime import datetime


- class AgentState(TypedDict):
-     """
-     State definition for the LangGraph agent workflow
-     """
-     # User interaction
-     user_query: str
-     conversation_history: List[Dict[str, str]]
-     intermediate_steps: List[Dict[str, Any]]
-     system_prompt: Optional[str]
-
-     # Context processing
-     relevant_documents: List[str]
-
-     # Metadata
-     query_timestamp: str
-     processing_time: Optional[float]
-     jurisdiction: Optional[str]
-
-
  class ConversationManager:
      """
      Manages conversation history and context
@@ -87,3 +68,62 @@ class ConversationManager:
          context_parts.append(f"{role}: {exchange['content']}")

          return "\n".join(context_parts)

  #!/usr/bin/env python3
  """
+ Conversation management for the agent
  """

+ from typing import Dict, List, Any
  from datetime import datetime

  class ConversationManager:
      """
      Manages conversation history and context

          context_parts.append(f"{role}: {exchange['content']}")

          return "\n".join(context_parts)
+
+
+ class ConversationFormatter:
+     """
+     Format conversation data for different purposes
+     """
+
+     @staticmethod
+     def build_conversation_history(history: List[Dict[str, str]], max_turns: int = 10) -> List[Dict[str, str]]:
+         """
+         Build conversation history for LightRAG API
+         """
+         if not history:
+             return []
+
+         # Take last max_turns pairs (user + assistant)
+         recent_history = history[-max_turns*2:]
+         formatted = []
+
+         for exchange in recent_history:
+             # Handle both Message objects and dictionary formats
+             if hasattr(exchange, 'role'):
+                 role = exchange.role
+                 content = exchange.content
+             else:
+                 role = exchange["role"]
+                 content = exchange["content"]
+
+             formatted.append({
+                 "role": role,
+                 "content": content
+             })
+
+         return formatted
+
+     @staticmethod
+     def create_context_summary(history: List[Dict[str, str]]) -> str:
+         """
+         Create a summary of conversation context
+         """
+         if not history:
+             return "No previous conversation."
+
+         recent_exchanges = history[-4:]  # Last 2 exchanges
+         context_parts = []
+
+         for exchange in recent_exchanges:
+             # Handle both Message objects and dictionary formats
+             if hasattr(exchange, 'role'):
+                 role = "User" if exchange.role == "user" else "Assistant"
+                 content = exchange.content
+             else:
+                 role = "User" if exchange["role"] == "user" else "Assistant"
+                 content = exchange["content"]
+
+             content = content[:100] + "..." if len(content) > 100 else content
+             context_parts.append(f"{role}: {content}")
+
+         return "\n".join(context_parts)
utils.py → utils/lightrag_client.py RENAMED

@@ -1,14 +1,13 @@
  #!/usr/bin/env python3
  """
- Utility functions for LightRAG integration and agent operations
  """

  import os
  import requests
  import time
- from typing import Dict, List, Any, Optional, Tuple
  from dotenv import load_dotenv
- from datetime import datetime
  import logging

  # Load environment variables
@@ -24,6 +23,7 @@ LIGHTRAG_HOST = os.getenv("LIGHTRAG_HOST", "127.0.0.1")
  SERVER_URL = f"http://{LIGHTRAG_HOST}:{LIGHTRAG_PORT}"
  API_KEY = os.getenv("LIGHTRAG_API_KEY")

  class LightRAGClient:
      """
      Client for interacting with LightRAG server
@@ -77,9 +77,8 @@ class LightRAGClient:
      )

      if response.status_code == 200:
-         logger.info(f"Query successful;{response.json()}")
          return response.json()
-
      else:
          logger.warning(f"Query failed with status {response.status_code}, attempt {attempt + 1}")

@@ -150,142 +149,3 @@ class ResponseProcessor:
          legal_entities.append(reg)

      return list(set(legal_entities))  # Remove duplicates
-
-
- class ConversationFormatter:
-     """
-     Format conversation data for different purposes
-     """
-
-     @staticmethod
-     def build_conversation_history(history: List[Dict[str, str]], max_turns: int = 10) -> List[Dict[str, str]]:
-         """
-         Build conversation history for LightRAG API
-         """
-         if not history:
-             return []
-
-         # Take last max_turns pairs (user + assistant)
-         recent_history = history[-max_turns*2:]
-         formatted = []
-
-         for exchange in recent_history:
-             # Handle both Message objects and dictionary formats
-             if hasattr(exchange, 'role'):
-                 role = exchange.role
-                 content = exchange.content
-             else:
-                 role = exchange["role"]
-                 content = exchange["content"]
-
-             formatted.append({
-                 "role": role,
-                 "content": content
-             })
-
-         return formatted
-
-     @staticmethod
-     def create_context_summary(history: List[Dict[str, str]]) -> str:
-         """
-         Create a summary of conversation context
-         """
-         if not history:
-             return "No previous conversation."
-
-         recent_exchanges = history[-4:]  # Last 2 exchanges
-         context_parts = []
-
-         for exchange in recent_exchanges:
-             # Handle both Message objects and dictionary formats
-             if hasattr(exchange, 'role'):
-                 role = "User" if exchange.role == "user" else "Assistant"
-                 content = exchange.content
-             else:
-                 role = "User" if exchange["role"] == "user" else "Assistant"
-                 content = exchange["content"]
-
-             content = content[:100] + "..." if len(content) > 100 else content
-             context_parts.append(f"{role}: {content}")
-
-         return "\n".join(context_parts)
-
-
- class PerformanceMonitor:
-     """
-     Monitor agent performance and timing
-     """
-
-     def __init__(self):
-         self.metrics = {}
-
-     def start_timer(self, operation: str) -> None:
-         """
-         Start timing an operation
-         """
-         self.metrics[f"{operation}_start"] = time.time()
-
-     def end_timer(self, operation: str) -> float:
-         """
-         End timing an operation and return duration
-         """
-         start_time = self.metrics.get(f"{operation}_start")
-         if start_time:
-             duration = time.time() - start_time
-             self.metrics[f"{operation}_duration"] = duration
-             return duration
-         return 0.0
-
-     def get_metrics(self) -> Dict[str, Any]:
-         """
-         Get all collected metrics
-         """
-         return self.metrics.copy()
-
-     def reset(self) -> None:
-         """
-         Reset all metrics
-         """
-         self.metrics.clear()
-
-
- def validate_query(query: str) -> Tuple[bool, Optional[str]]:
-     """
-     Validate user query
-     """
-     if not query or not query.strip():
-         return False, "Query cannot be empty."
-
-     if len(query) > 2500:
-         return False, "Query is too long. Please keep it under 1000 characters."
-
-     return True, None
-
-
- def format_error_message(error: str) -> str:
-     """
-     Format error messages for user display
-     """
-     error_map = {
-         "Server unreachable": "❌ The legal database is currently unavailable. Please try again in a moment.",
-         "timeout": "❌ The request timed out. Please try again.",
-         "invalid json": "❌ There was an issue processing the response. Please try again.",
-         "health check failed": "❌ The system is initializing. Please wait a moment and try again."
-     }
-
-     for key, message in error_map.items():
-         if key.lower() in error.lower():
-             return message
-
-     return f"❌ An error occurred: {error}"
-
-
- def create_safe_filename(query: str, timestamp: str) -> str:
-     """
-     Create a safe filename for logging purposes
-     """
-     # Remove problematic characters
-     safe_query = "".join(c for c in query if c.isalnum() or c in (' ', '-', '_')).strip()
-     safe_query = safe_query[:50]  # Limit length
-
-     return f"{timestamp}_{safe_query}.log"

  #!/usr/bin/env python3
  """
+ LightRAG client for interacting with the RAG server
  """

  import os
  import requests
  import time
+ from typing import Dict, List, Any, Optional
  from dotenv import load_dotenv
  import logging

  # Load environment variables

  SERVER_URL = f"http://{LIGHTRAG_HOST}:{LIGHTRAG_PORT}"
  API_KEY = os.getenv("LIGHTRAG_API_KEY")

+
  class LightRAGClient:
      """
      Client for interacting with LightRAG server

      )

      if response.status_code == 200:
+         logger.info(f"Query successful")
          return response.json()
      else:
          logger.warning(f"Query failed with status {response.status_code}, attempt {attempt + 1}")

          legal_entities.append(reg)

      return list(set(legal_entities))  # Remove duplicates
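The `LightRAGClient.query` loop above retries on non-200 responses and logs a warning per attempt. The same retry pattern, sketched generically with a callable standing in for the HTTP request (names here are illustrative, not from the repo):

```python
import time
from typing import Callable, Optional

def query_with_retries(request: Callable[[], Optional[dict]],
                       max_retries: int = 3, delay: float = 0.0) -> Optional[dict]:
    """Call `request` until it returns a result or the attempts run out."""
    for attempt in range(max_retries):
        result = request()
        if result is not None:
            return result
        # The real client logs a warning here before the next attempt.
        if attempt < max_retries - 1:
            time.sleep(delay)
    return None

# Simulate a server that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    return {"response": "ok"} if calls["n"] >= 3 else None

r = query_with_retries(flaky)
print(r)           # {'response': 'ok'}
print(calls["n"])  # 3
```

Separating the retry policy from the request itself keeps the back-off behavior testable without a live LightRAG server.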
tools.py → utils/tools.py RENAMED

@@ -7,13 +7,15 @@ import os
  from typing import List, Dict, Any, Optional
  from langchain_core.tools import tool
  from langchain_tavily import TavilySearch
- from lawyer_selector import LawyerSelectorAgent
- from utils import LightRAGClient, ConversationFormatter

  # Global instances - will be initialized in agent_api.py
  lawyer_selector_agent: Optional[LawyerSelectorAgent] = None
  lightrag_client: Optional[LightRAGClient] = None
  tavily_search = None

  @tool
  async def query_knowledge_graph(query: str, conversation_history: List[Dict[str, str]]) -> str:
@@ -77,6 +79,26 @@ async def search_web(query: str) -> str:
      except Exception as e:
          return f"Error: {str(e)}"

  @tool
  async def find_lawyers(query: str, conversation_history: List[Dict[str, str]]) -> str:
      """
@@ -94,52 +116,19 @@ async def find_lawyers(query: str, conversation_history: List[Dict[str, str]]) -
      conversation_history: The full conversation history with the user (automatically provided by the agent)

      Returns:
-         A formatted string with the top 3 lawyer recommendations, including:
-         - Lawyer name and presentation
-         - Experience and specialty
-         - Client-friendly explanation of why they match the case
-         - Areas of practice
      """
      try:
-         # Use the globally initialized lawyer selector agent
          if lawyer_selector_agent is None:
              raise ValueError("LawyerSelectorAgent not initialized. Please initialize it in agent_api.py")

-         # Get lawyer recommendations using the conversation history
-         result = await lawyer_selector_agent.select_lawyers(conversation_history)
-         top_lawyers = result["top_lawyers"]
-
-         # Format the output for the user
-         output = ["=" * 80, "TOP 3 RECOMMENDED LAWYERS FOR YOUR CASE", "=" * 80]
-
-         for lawyer in top_lawyers:
-             output.append("\n" + "─" * 80)
-             output.append(f"RECOMMENDATION #{lawyer['rank']}")
-             output.append("─" * 80)
-             output.append(f"\n👤 {lawyer['name']}")
-             output.append(f"   {lawyer['presentation']}")
-             output.append(f"\n📊 Experience: {lawyer['experience_years']} years")
-             output.append(f"🎯 Specialty: {lawyer['specialty']}")
-             output.append(f"\n✅ Why this lawyer matches your case:")
-             output.append(f"   {lawyer['reasoning']}")
-             output.append(f"\n📚 Areas of Practice:")
-             for area in lawyer['areas_of_practice']:
-                 output.append(f"   • {area}")
-             output.append("")
-
-         return "\n".join(output)

      except Exception as e:
          return f"Error finding lawyers: {str(e)}"

  # Export tool sets for different user types
-
- # Tools available to general clients (knowledge graph + lawyer finder + web search)
  tools_for_client = [query_knowledge_graph, find_lawyers, search_web]
-
- # Tools available to lawyers (knowledge graph + web search for current legal updates)
  tools_for_lawyer = [query_knowledge_graph, search_web]
-
- # Default tools (backward compatibility - client tools)
  tools = tools_for_client

  from typing import List, Dict, Any, Optional
  from langchain_core.tools import tool
  from langchain_tavily import TavilySearch
+ from subagents.lawyer_selector import LawyerSelectorAgent
+ from utils.lightrag_client import LightRAGClient
+ import resend

  # Global instances - will be initialized in agent_api.py
  lawyer_selector_agent: Optional[LawyerSelectorAgent] = None
  lightrag_client: Optional[LightRAGClient] = None
  tavily_search = None
+ resend_api_key: Optional[str] = None

  @tool
  async def query_knowledge_graph(query: str, conversation_history: List[Dict[str, str]]) -> str:

      except Exception as e:
          return f"Error: {str(e)}"

+ @tool
+ async def send_email(to_email: str, subject: str, content: str) -> str:
+     """Send an email using Resend."""
+     try:
+         from_email = os.getenv("RESEND_FROM_EMAIL")
+         from_name = os.getenv("RESEND_FROM_NAME", "CyberLegalAI")
+
+         params = {
+             "from": f"{from_name} <{from_email}>",
+             "to": [to_email],
+             "subject": subject,
+             "text": content
+         }
+
+         response = resend.Emails.send(params)
+         return f"✅ Email sent to {to_email} (ID: {response.get('id', 'N/A')})"
+
+     except Exception as e:
+         return f"❌ Failed: {str(e)}"
+
  @tool
  async def find_lawyers(query: str, conversation_history: List[Dict[str, str]]) -> str:
      """

      conversation_history: The full conversation history with the user (automatically provided by the agent)

      Returns:
+         A formatted string with the top 3 lawyer recommendations
      """
      try:
          if lawyer_selector_agent is None:
              raise ValueError("LawyerSelectorAgent not initialized. Please initialize it in agent_api.py")

+         return await lawyer_selector_agent.select_lawyers(conversation_history)

      except Exception as e:
          return f"Error finding lawyers: {str(e)}"

  # Export tool sets for different user types
+ tools_for_client = [query_knowledge_graph, find_lawyers, search_web, send_email]
  tools_for_lawyer = [query_knowledge_graph, search_web]
  tools = tools_for_client
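The per-user-type tool sets exported at the bottom of `utils/tools.py` lend themselves to a small lookup helper. An illustrative sketch (the real lists hold LangChain tool objects; tool-name strings stand in here, and `tools_for` is a hypothetical helper, not in the repo):

```python
from typing import Dict, List

# Stand-ins for the LangChain tool objects exported by utils/tools.py.
tools_for_client: List[str] = ["query_knowledge_graph", "find_lawyers", "search_web", "send_email"]
tools_for_lawyer: List[str] = ["query_knowledge_graph", "search_web"]

TOOLSETS: Dict[str, List[str]] = {
    "client": tools_for_client,
    "lawyer": tools_for_lawyer,
}

def tools_for(user_type: str) -> List[str]:
    # Default to the client tool set, mirroring `tools = tools_for_client`.
    return TOOLSETS.get(user_type, tools_for_client)

print(tools_for("lawyer"))  # ['query_knowledge_graph', 'search_web']
```

Keeping the mapping in one place makes it easy to add a new user type without touching the agent wiring in agent_api.py.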
utils/utils.py ADDED

@@ -0,0 +1,92 @@
+ #!/usr/bin/env python3
+ """
+ Utility functions for agent operations
+ """
+
+ import time
+ from typing import Optional, Tuple
+ import logging
+
+ # Configure logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+
+ class PerformanceMonitor:
+     """
+     Monitor agent performance and timing
+     """
+
+     def __init__(self):
+         self.metrics = {}
+
+     def start_timer(self, operation: str) -> None:
+         """
+         Start timing an operation
+         """
+         self.metrics[f"{operation}_start"] = time.time()
+
+     def end_timer(self, operation: str) -> float:
+         """
+         End timing an operation and return duration
+         """
+         start_time = self.metrics.get(f"{operation}_start")
+         if start_time:
+             duration = time.time() - start_time
+             self.metrics[f"{operation}_duration"] = duration
+             return duration
+         return 0.0
+
+     def get_metrics(self) -> dict:
+         """
+         Get all collected metrics
+         """
+         return self.metrics.copy()
+
+     def reset(self) -> None:
+         """
+         Reset all metrics
+         """
+         self.metrics.clear()
+
+
+ def validate_query(query: str) -> Tuple[bool, Optional[str]]:
+     """
+     Validate user query
+     """
+     if not query or not query.strip():
+         return False, "Query cannot be empty."
+
+     if len(query) > 2500:
+         return False, "Query is too long. Please keep it under 2500 characters."
+
+     return True, None
+
+
+ def format_error_message(error: str) -> str:
+     """
+     Format error messages for user display
+     """
+     error_map = {
+         "Server unreachable": "❌ The legal database is currently unavailable. Please try again in a moment.",
+         "timeout": "❌ The request timed out. Please try again.",
+         "invalid json": "❌ There was an issue processing the response. Please try again.",
+         "health check failed": "❌ The system is initializing. Please wait a moment and try again."
+     }
+
+     for key, message in error_map.items():
+         if key.lower() in error.lower():
+             return message
+
+     return f"❌ An error occurred: {error}"
+
+
+ def create_safe_filename(query: str, timestamp: str) -> str:
+     """
+     Create a safe filename for logging purposes
+     """
+     # Remove problematic characters
+     safe_query = "".join(c for c in query if c.isalnum() or c in (' ', '-', '_')).strip()
+     safe_query = safe_query[:50]  # Limit length
+
+     return f"{timestamp}_{safe_query}.log"
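The validation and filename helpers added in `utils/utils.py` in a quick usage sketch, with the two functions inlined so the snippet runs standalone:

```python
# Inlined from utils/utils.py for a self-contained demo.
def validate_query(query):
    """Reject empty or overlong queries; return (ok, error_message)."""
    if not query or not query.strip():
        return False, "Query cannot be empty."
    if len(query) > 2500:
        return False, "Query is too long."
    return True, None

def create_safe_filename(query, timestamp):
    """Keep only filesystem-safe characters and cap the length."""
    safe_query = "".join(c for c in query if c.isalnum() or c in (' ', '-', '_')).strip()
    safe_query = safe_query[:50]
    return f"{timestamp}_{safe_query}.log"

ok, err = validate_query("What is GDPR?")
print(ok, err)  # True None
print(create_safe_filename("What is GDPR?", "20250101"))  # 20250101_What is GDPR.log
```

Note that punctuation like `?` is dropped by the character filter, so log filenames stay portable across filesystems.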