destinyebuka committed on
Commit bc0cd92 · 1 Parent(s): ee3d3d4
IMPLEMENTATION_SUMMARY.md DELETED
@@ -1,319 +0,0 @@
# 🎯 Vision AI Listing Feature - Implementation Summary

## What Was Built

A **smart AI-powered property listing feature** that intelligently handles THREE different listing methods and produces a unified result.

---

## Key Features Implemented

### 1. ✅ Smart Listing Method Detection

The system knows HOW the user is listing and behaves accordingly:

**TEXT Method** (User provides details via chat)
- User says: "3-bed, 2-bath in Lagos, 500k/month, has WiFi, AC"
- Uploads photos for VALIDATION (not re-extraction)
- Backend: Validates images are property-related, uploads to Cloudflare
- Result: Text data + validated photos

**IMAGE Method** (User uploads photos only)
- User just uploads photos (no text details)
- Backend: EXTRACTS all details from images (bedrooms, bathrooms, amenities)
- Generates: SHORT title (max 2 sentences) + full description
- Result: Complete listing data extracted from photos

**VIDEO Method** (User uploads video + photos)
- User uploads a video walkthrough
- Backend: Uploads it to Cloudinary, suggests adding photos
- User uploads photos for analysis
- Backend: Extracts details from photos (same as IMAGE method)
- Result: Full data from photos + video URL

---
### 2. ✅ Intelligent Title & Description Generation

**Title Requirements:**
- ✅ SHORT - Maximum 2 sentences
- ✅ Example: "Modern 3-bed apartment. Great location!"
- ❌ NOT: Long descriptions with many details

**Description:**
- Full 2-3 sentence description of the property
- Professional tone
- Highlights key features

**Both generated by Vision AI** for image/video methods.

---
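The two-sentence cap on titles can be enforced mechanically after generation. A minimal sketch; the helper name and the naive sentence splitter are assumptions for illustration, not the actual `_generate_title()` logic:

```python
import re

def enforce_short_title(text, max_sentences=2):
    """Trim an AI-generated title to at most two sentences.

    Uses a naive regex sentence splitter (split after ., !, or ?);
    illustrative only, not the production implementation.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    return " ".join(sentences[:max_sentences])

print(enforce_short_title("Modern 3-bed apartment. Great location! Near the beach."))
# Modern 3-bed apartment. Great location!
```

A guard like this keeps a verbose model output within the "max 2 sentences" requirement even when the prompt alone is not obeyed.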
### 3. ✅ Smart File Naming Strategy

**Pattern:** `{location}_{title}_{timestamp}_{index}.jpg`

**Example filenames:**
- `Lagos_Modern_Apartment_2025_01_31_0.jpg`
- `Victoria_Island_3_Bed_Luxury_2025_01_31_1.jpg`
- `Cotonou_Cozy_Studio_2025_01_31_0.jpg`

**Benefits:**
- Easy to identify the property in storage
- Shows when it was listed (timestamp)
- Automatically indexed for multiple photos
- Cloudflare worker detects duplicates and appends numbers

---
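The naming pattern above can be sketched as a small helper. The function and slug logic here are assumptions for illustration, not the backend's actual code:

```python
import re
from datetime import datetime

def build_image_filename(location, title, index, now=None):
    """Sketch of the {location}_{title}_{timestamp}_{index}.jpg pattern."""
    now = now or datetime.utcnow()

    def slug(text):
        # Replace non-alphanumeric runs with a single underscore
        return re.sub(r"_+", "_", re.sub(r"[^A-Za-z0-9]+", "_", text)).strip("_")

    timestamp = now.strftime("%Y_%m_%d")
    # Title is capped at 20 characters, matching the algorithm described later
    return f"{slug(location)}_{slug(title)[:20]}_{timestamp}_{index}.jpg"

print(build_image_filename("Lagos", "Modern Apartment", 0, datetime(2025, 1, 31)))
# Lagos_Modern_Apartment_2025_01_31_0.jpg
```

The index suffix gives each photo of a batch a distinct name before the Cloudflare worker's deduplication ever runs.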
### 4. ✅ Unified Response Format

**All three methods return the SAME structure:**

```json
{
  "success": true,
  "listing_method": "text|image|video",
  "extracted_fields": {
    "bedrooms": 3,
    "bathrooms": 2,
    "amenities": ["WiFi", "Parking", "AC"],
    "description": "Beautiful apartment...",
    "title": "Modern 3-Bed Apartment. Great location!"
  },
  "confidence": {
    "bedrooms": 0.95,
    "bathrooms": 0.88,
    "amenities": 0.72,
    "title": 0.85
  },
  "image_urls": ["url1", "url2"],
  "video_url": "https://cloudinary..." // Only if video method
}
```

**The frontend shows the same UI** regardless of how the user listed → same draft card, same editing experience.

---
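For type-checked frontends or backend handlers, the unified shape above can be modeled explicitly. A sketch using `TypedDict`; the type names are assumptions, not types that exist in the codebase:

```python
from typing import List, Optional, TypedDict

class ExtractedFields(TypedDict, total=False):
    bedrooms: int
    bathrooms: int
    amenities: List[str]
    description: str
    title: str

class ListingAnalysisResponse(TypedDict, total=False):
    success: bool
    listing_method: str               # "text" | "image" | "video"
    extracted_fields: ExtractedFields
    confidence: dict                  # per-field score in [0, 1]
    image_urls: List[str]
    video_url: Optional[str]          # present only for the video method

resp: ListingAnalysisResponse = {
    "success": True,
    "listing_method": "image",
    "extracted_fields": {"bedrooms": 3, "title": "Modern 3-Bed Apartment."},
    "confidence": {"bedrooms": 0.95},
    "image_urls": ["url1"],
}
print(resp["listing_method"])  # image
```

Because every method emits this one shape, the draft UI needs a single rendering path.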
### 5. ✅ Property Validation BEFORE Upload

**Critical feature for saving storage space:**

```
Image Upload Flow:
1. Receive image from frontend
2. Check: "Is this a property image?"
3. If NO → Reject with message, no upload
4. If YES → Upload to Cloudflare with smart filename
```

This prevents non-property images from consuming Cloudflare storage.

---
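The validate-then-upload gate above can be sketched as follows. The `vision` and `storage` objects stand in for the real vision service and Cloudflare client; all names here are illustrative assumptions:

```python
def process_upload(image_bytes, vision, storage, min_confidence=0.6):
    """Reject non-property images before any storage cost is incurred."""
    is_property, confidence, reason = vision.validate_property_image(image_bytes)
    if not is_property or confidence < min_confidence:
        return {"success": False, "error": reason or "Not a property photo"}
    url = storage.upload(image_bytes)
    return {"success": True, "url": url}

# Tiny fakes to demonstrate the flow without network calls
class FakeVision:
    def validate_property_image(self, b):
        ok = b == b"house"
        return (ok, 0.9 if ok else 0.1, None if ok else "Not a property photo")

class FakeStorage:
    def upload(self, b):
        return "https://example.invalid/img.jpg"

rejected = process_upload(b"cat", FakeVision(), FakeStorage())
accepted = process_upload(b"house", FakeVision(), FakeStorage())
print(rejected["success"], accepted["success"])  # False True
```

Note that the rejection branch returns before `storage.upload` is ever reached, which is the whole point of the ordering.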
### 6. ✅ Vision Service Enhancements

**New capabilities in `vision_service.py`:**

- `extract_property_fields()` - Now generates title + description
- `_generate_title()` - Creates SHORT titles (max 2 sentences)
- `_extract_room_count()` - Counts bedrooms/bathrooms
- `_detect_amenities()` - Finds amenities in images
- `_generate_description()` - Creates full descriptions
- `merge_multiple_image_results()` - Combines results from multiple images
- Confidence scoring for each field

---
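One plausible way to combine per-image extractions, sketched below: counts take the highest-confidence value and amenities are unioned. This is an illustrative stand-in, not the real `merge_multiple_image_results()`:

```python
def merge_image_results(results):
    """Merge per-image extractions: for numeric fields keep the value with
    the highest confidence; collect the union of all detected amenities."""
    merged = {"amenities": set()}
    confidence = {}
    for r in results:
        for field in ("bedrooms", "bathrooms"):
            c = r["confidence"].get(field, 0.0)
            if c > confidence.get(field, 0.0):
                merged[field] = r["fields"][field]
                confidence[field] = c
        merged["amenities"] |= set(r["fields"].get("amenities", []))
    merged["amenities"] = sorted(merged["amenities"])
    return merged, confidence

results = [
    {"fields": {"bedrooms": 3, "bathrooms": 2, "amenities": ["WiFi"]},
     "confidence": {"bedrooms": 0.95, "bathrooms": 0.88}},
    {"fields": {"bedrooms": 2, "bathrooms": 2, "amenities": ["AC", "WiFi"]},
     "confidence": {"bedrooms": 0.60, "bathrooms": 0.70}},
]
merged, conf = merge_image_results(results)
print(merged)  # bedrooms=3 wins at 0.95; amenities unioned
```

Taking the max-confidence value (rather than, say, averaging) matches the intuition that one clear bedroom shot beats two ambiguous ones.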
### 7. ✅ Enhanced Media Upload Routes

**Updated endpoints:**

`POST /listings/analyze-images`
- Accepts a `listing_method` parameter ("text", "image", "video")
- Accepts an optional `location` parameter for context
- Returns: Complete extracted fields + image URLs + confidence scores
- Generates intelligent filenames during upload

`POST /listings/analyze-video`
- Uploads the video to Cloudinary with smart naming
- Returns: Video URL + a suggestion to upload photos
- Recommends photos for better accuracy

---
## Files Modified/Created

### Created Files:
1. **`app/ai/services/vision_service.py`** - Vision AI analysis service
2. **`app/routes/media_upload.py`** - Image/video upload endpoints
3. **`VISION_FEATURE_INTEGRATION_GUIDE.md`** - Complete integration guide
4. **`IMPLEMENTATION_SUMMARY.md`** - This file

### Modified Files:
1. **`app/config.py`** - Added Cloudinary + Vision settings
2. **`requirements.txt`** - Added cloudinary + ffmpeg-python
3. **`app/ai/agent/nodes/listing_collect.py`** - Added `initialize_from_vision_analysis()` function
4. **`main.py`** - Registered media_upload routes

---
## Configuration Required

Add to `.env`:

```bash
# Cloudinary (Video Storage)
CLOUDINARY_CLOUD_NAME=your_cloud_name
CLOUDINARY_API_KEY=your_api_key
CLOUDINARY_API_SECRET=your_api_secret

# Hugging Face Vision Model
HF_TOKEN=your_hf_token
HF_VISION_MODEL=vikhyatk/moondream2
HF_VISION_API_ENABLED=true
PROPERTY_IMAGE_MIN_CONFIDENCE=0.6
```

---
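A minimal sketch of reading these settings from the environment; the real `app/config.py` may be structured differently (the function name and return shape are assumptions, only the env keys come from the section above):

```python
import os

def load_vision_settings(env=os.environ):
    """Read the Cloudinary + Vision env vars with the documented defaults."""
    return {
        "cloudinary_cloud_name": env.get("CLOUDINARY_CLOUD_NAME", ""),
        "hf_token": env.get("HF_TOKEN", ""),
        "hf_vision_model": env.get("HF_VISION_MODEL", "vikhyatk/moondream2"),
        "hf_vision_api_enabled": env.get("HF_VISION_API_ENABLED", "true").lower() == "true",
        "property_image_min_confidence": float(env.get("PROPERTY_IMAGE_MIN_CONFIDENCE", "0.6")),
    }

settings = load_vision_settings({"HF_TOKEN": "x", "PROPERTY_IMAGE_MIN_CONFIDENCE": "0.7"})
print(settings["property_image_min_confidence"])  # 0.7
```

Parsing the confidence threshold to `float` up front avoids string comparisons sneaking into the validation path later.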
## Frontend Changes Required

### Update Image Upload Flow

**OLD (Direct to Cloudflare):**
```javascript
// Upload directly to Cloudflare
const url = await uploadToCloudflare(image)
```

**NEW (Via Backend with Validation):**
```javascript
// listing_method and location travel as form fields alongside the files
// (build a fresh FormData with the image files for each call)

// Method 1: Text listing (chat + photos)
formData.append('listing_method', 'text')
formData.append('location', chatLocation)
const textResult = await fetch('/listings/analyze-images', { method: 'POST', body: formData })

// Method 2: Image listing (photos only)
formData.append('listing_method', 'image')
const imageResult = await fetch('/listings/analyze-images', { method: 'POST', body: formData })

// Method 3: Video listing
const videoResult = await fetch('/listings/analyze-video', { method: 'POST', body: formData })
```

---
## User Experience Flow

### For the Image Listing Method:

```
User clicks "List with Photos" → Uploads 2-3 images
    ↓
Backend validates images are property-related
    ↓
AI extracts:
- Bedrooms: 3 (confidence: 95%)
- Bathrooms: 2 (confidence: 88%)
- Amenities: WiFi, AC, Parking, Pool
- Title: "Modern 3-Bed Apartment. Great location!" (SHORT)
- Description: "Beautiful 3-bed with modern furnishings..."
    ↓
Shows draft UI with:
- Photos with smart names (Lagos_Modern_Apartment_2025_01_31_0.jpg)
- Extracted fields
- Confidence indicators
    ↓
User is asked: "What's the location, address, and price?"
    ↓
User provides: "Lagos, Victoria Island, 500,000 per month"
    ↓
AI infers listing_type: "rent" (from price context)
    ↓
User edits via text:
- "Change amenities to WiFi, gym, and pool"
- "Update title to something catchier"
    ↓
User publishes: "Publish this listing"
    ↓
Listing created with all auto-detected + user-provided data
```

---
## Key Differences from Previous Design

| Aspect | Before | Now |
|--------|--------|-----|
| **File naming** | Random/original names | Smart names (location_title_date) |
| **Title generation** | Not generated for images | AI generates SHORT titles (max 2 sentences) |
| **Listing methods** | Only text-based | Three methods: text, image, video |
| **Method detection** | N/A | AI knows how the user is listing |
| **Video storage** | N/A | Cloudinary for videos |
| **Upload strategy** | Direct to Cloudflare | Backend validates first (saves space) |
| **Confidence scores** | Not implemented | Per-field confidence for each extraction |

---
## Performance Notes

**Vision API Response Times:**
- Image validation: 2-3 seconds (first image), +1s per additional image
- Field extraction: 2-4 seconds per image
- Title generation: 1-2 seconds per image
- Video upload: 5-10 seconds (depends on file size)

**Cost Optimization:**
- Only valid property images are uploaded (non-property images rejected early)
- Smaller file sizes with smart naming
- Cloudflare worker deduplicates files
- Hugging Face Inference API used (cheaper than self-hosting)

---
## Testing Checklist

- [ ] Test TEXT method: chat + upload images
- [ ] Test IMAGE method: upload images only
- [ ] Test VIDEO method: upload video + photos
- [ ] Verify short titles are generated (max 2 sentences)
- [ ] Verify descriptions are generated (full, not short)
- [ ] Verify file naming is intelligent (location_title_date)
- [ ] Verify property validation rejects non-property images
- [ ] Verify confidence scores are returned
- [ ] Verify all three methods produce the same draft UI
- [ ] Test editing via natural-language commands
- [ ] Test publishing with all three methods

---
## Next Steps

1. **Frontend Integration** - Update image/video upload flows
2. **Test All Three Methods** - Verify each method works end-to-end
3. **Monitor Accuracy** - Track field extraction accuracy metrics
4. **Optimize Prompts** - Fine-tune Vision AI prompts based on real data
5. **User Feedback** - Gather feedback on titles/descriptions
6. **Enhance Features** - Add OCR for address extraction, price suggestions, etc.

---
## Support

See `VISION_FEATURE_INTEGRATION_GUIDE.md` for:
- Detailed API documentation
- Complete example code
- Error handling
- Troubleshooting
- Future enhancements
VISION_FEATURE_INTEGRATION_GUIDE.md DELETED
@@ -1,794 +0,0 @@
# 🤖 AI-Powered Property Listing with Image/Video Analysis
## Integration Guide

---

## Overview

This document explains how to integrate the new **Vision AI feature** that allows users to list properties by uploading images or videos. The AI automatically detects property details (bedrooms, bathrooms, amenities) and fills the listing fields.

---

## Architecture

### Flow Diagram

```
USER UPLOADS IMAGES/VIDEO
    ↓
[BACKEND IMAGE VALIDATION]
- Check if image is property-related (BEFORE upload)
- Reject non-property images (saves Cloudflare space)
    ↓
[VISION AI ANALYSIS] (Hugging Face Inference API)
- Extract bedrooms, bathrooms
- Detect amenities
- Generate description
- Return confidence scores
    ↓
[UPLOAD TO CLOUD STORAGE]
- Images → Cloudflare (only if validated)
- Videos → Cloudinary
    ↓
[INITIALIZE LISTING]
- Pre-fill extracted fields
- Ask user for uncertain/missing fields (price, location, address)
    ↓
[DRAFT UI]
- Show preview card like the text-based flow
    ↓
[USER REVIEWS & EDITS]
- Edit via natural-language commands
    ↓
[PUBLISH]
- Same as the text-based flow
```

---
## New Files Created

### 1. **Vision Service** - `app/ai/services/vision_service.py`

**Purpose**: Analyzes images/videos using the Hugging Face Inference API

**Key Classes**:
```python
class VisionService:
    def validate_property_image(image_bytes) -> (bool, float, str)
    def extract_property_fields(image_bytes) -> Dict
    def merge_multiple_image_results(results_list) -> Dict
```

**Functions**:
- `validate_property_image()` - Check if an image is property-related (BEFORE upload)
- `extract_property_fields()` - Extract bedrooms, bathrooms, amenities, description
- `_extract_room_count()` - Count rooms
- `_detect_amenities()` - Find amenities
- `_generate_description()` - Create a property description
- `merge_multiple_image_results()` - Combine results from multiple images

---
### 2. **Media Upload Routes** - `app/routes/media_upload.py`

**Purpose**: Handle image/video uploads with validation

**Endpoints**:

#### `POST /listings/analyze-images`
```
Request:
- files: List of image files (max 10, max 10MB each)
- listing_method: "text" | "image" | "video" (how the user is listing)
- location: Optional string (context from the text method)

Process:
1. Validate image format (JPEG, PNG, WebP)
2. Validate image is property-related (BEFORE upload)
3. Extract property fields
4. Upload to Cloudflare (only if validated)
5. Return extracted fields + image URLs

Response:
{
  "success": true,
  "images_processed": 2,
  "images_validated": ["image1.jpg", "image2.jpg"],
  "image_urls": [
    "https://cloudflare.../image1.jpg",
    "https://cloudflare.../image2.jpg"
  ],
  "extracted_fields": {
    "bedrooms": 3,
    "bathrooms": 2,
    "amenities": ["WiFi", "Parking", "AC"],
    "description": "Spacious modern apartment..."
  },
  "confidence": {
    "bedrooms": 0.95,
    "bathrooms": 0.88,
    "amenities": 0.72,
    "description": 0.91
  },
  "validation_errors": [],
  "suggestions": ["Verify bedroom count", "..."]
}
```

#### `POST /listings/analyze-video`
```
Request:
- video: Single video file (max 100MB)

Response:
{
  "success": true,
  "video_url": "https://res.cloudinary.com/.../video.mp4",
  "message": "Video uploaded. Photos recommended for better accuracy.",
  "extracted_fields": {...},
  "suggestions": ["Upload property photos for better detection"]
}
```

#### `POST /listings/validate-media`
```
Quick validation without uploading
Returns: Validation results for each file
```

---
### 3. **Listing Collection Integration** - `app/ai/agent/nodes/listing_collect.py`

**New Function**: `initialize_from_vision_analysis(state, vision_data)`

**Purpose**: Pre-populate the listing state with AI-detected fields

**Usage**:
```python
# After the user uploads images and the AI analyzes them
state = await initialize_from_vision_analysis(state, vision_data)
# State now has bedrooms, bathrooms, amenities, images, and description pre-filled
```

---
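The pre-fill step can be sketched as below. Note this is a simplified synchronous stand-in (the real function is async and richer); the 0.7 auto-fill threshold mirrors the confidence buckets described later in this guide:

```python
def prefill_from_vision(state, vision_data, auto_fill_threshold=0.7):
    """Copy high-confidence vision fields into the listing state.

    Illustrative sketch, not the actual initialize_from_vision_analysis():
    fields below the threshold are left for the user to confirm.
    """
    fields = vision_data.get("extracted_fields", {})
    confidence = vision_data.get("confidence", {})
    for name, value in fields.items():
        if confidence.get(name, 1.0) >= auto_fill_threshold:
            state[name] = value
    state["images"] = vision_data.get("image_urls", [])
    return state

state = prefill_from_vision(
    {},
    {"extracted_fields": {"bedrooms": 3, "amenities": ["WiFi"]},
     "confidence": {"bedrooms": 0.95, "amenities": 0.5},
     "image_urls": ["url1"]},
)
print(state)  # amenities skipped: 0.5 is below the 0.7 threshold
```

Fields the sketch skips (like the low-confidence amenities here) are exactly the ones the agent should ask the user about.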
## Configuration

### Add to `.env`

```bash
# Cloudinary (Video Storage)
CLOUDINARY_CLOUD_NAME=your_cloud_name
CLOUDINARY_API_KEY=your_api_key
CLOUDINARY_API_SECRET=your_api_secret

# Hugging Face Vision Model
HF_TOKEN=your_hf_token
HF_VISION_MODEL=vikhyatk/moondream2
HF_VISION_API_ENABLED=true
PROPERTY_IMAGE_MIN_CONFIDENCE=0.6
```

### Update `app/config.py` ✅ (Already Done)

Added:
- `CLOUDINARY_CLOUD_NAME`
- `CLOUDINARY_API_KEY`
- `CLOUDINARY_API_SECRET`
- `HF_VISION_MODEL`
- `HF_VISION_API_ENABLED`
- `PROPERTY_IMAGE_MIN_CONFIDENCE`

### Update `requirements.txt` ✅ (Already Done)

Added:
- `cloudinary>=1.40.0`
- `ffmpeg-python>=0.2.1`

---
## Frontend Integration

### Frontend Responsibilities

**IMPORTANT**: Images must now be uploaded to the **backend** (not directly to Cloudflare).

#### 1. **Image Upload Flow**

```typescript
// OLD (Direct to Cloudflare) - DEPRECATED
POST to Cloudflare directly

// NEW (Via Backend with Validation) - REQUIRED
POST /listings/analyze-images
Headers: Authorization: Bearer {token}
Body: FormData with files
Response: Extracted fields + image URLs
```

#### 2. **Example Frontend Code**

**For the TEXT method** (user provided details via chat):
```typescript
async function uploadImagesForTextListing(files: File[], location: string) {
  const formData = new FormData()
  files.forEach(file => formData.append('images', file))
  formData.append('listing_method', 'text')
  formData.append('location', location) // Context from the text conversation

  const response = await fetch('/listings/analyze-images', {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${token}` },
    body: formData
  })

  const result = await response.json()

  if (!result.success) {
    result.validation_errors.forEach(err => {
      alert(`${err.image}: ${err.error}`)
    })
    return
  }

  // Images validated against the text-provided data
  showListingDraft({
    // Use data from CHAT (text-provided); images serve as validation
    bedrooms: result.extracted_fields.bedrooms,
    bathrooms: result.extracted_fields.bathrooms,
    images: result.image_urls,
  })
}
```

**For the IMAGE method** (user uploads photos only):
```typescript
async function uploadImagesForPhotoListing(files: File[]) {
  const formData = new FormData()
  files.forEach(file => formData.append('images', file))
  formData.append('listing_method', 'image')
  // No location - we'll extract everything from the images

  const response = await fetch('/listings/analyze-images', {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${token}` },
    body: formData
  })

  const result = await response.json()

  if (!result.success) {
    result.validation_errors.forEach(err => {
      alert(`${err.image}: ${err.error}`)
    })
    return
  }

  // Show extracted fields (AI analyzed the images)
  showListingDraft({
    title: result.extracted_fields.title, // AI-generated SHORT title
    description: result.extracted_fields.description, // AI-generated description
    bedrooms: result.extracted_fields.bedrooms,
    bathrooms: result.extracted_fields.bathrooms,
    amenities: result.extracted_fields.amenities,
    images: result.image_urls,
    confidence: result.confidence
  })
}
```

**For the VIDEO method**:
```typescript
async function uploadVideoForListing(videoFile: File, location?: string) {
  const formData = new FormData()
  formData.append('video', videoFile)
  if (location) formData.append('location', location)

  const response = await fetch('/listings/analyze-video', {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${token}` },
    body: formData
  })

  const result = await response.json()

  // Suggest uploading photos
  alert(result.message)
  // Then call uploadImagesForPhotoListing with the photos
}
```

#### 3. **Video Upload Flow**

```typescript
POST /listings/analyze-video
Headers: Authorization: Bearer {token}
Body: FormData with video file
Response: Video URL + suggestions
```

---
## Three Listing Methods (Smart Differentiation)

The system intelligently handles THREE different listing creation methods:

### 1️⃣ Text-Based Listing (Existing - User provides details via text)

```
User says: "I have a 3-bed, 2-bath in Lagos for 500k per month.
It has WiFi, AC, and parking."

FLOW:
1. AI extracts fields from text (bedrooms, bathrooms, price, etc.)
2. User uploads photos to validate
3. Backend:
   - Validates images are property-related
   - Just checks they match (no re-extraction needed)
   - Uploads to Cloudflare with smart naming
4. Shows draft UI with text-provided data + validated photos
5. User edits via text: "change price to 450k"
6. AI infers listing_type from price: "rent"
7. User publishes: "publish this listing"

METHOD CONTEXT: listing_method="text"
```

### 2️⃣ Image-Based Listing (NEW - User uploads photos only)

```
User clicks "List with Photos"

FLOW:
1. User uploads 1-5 photos (no text details provided)
2. Backend:
   - Validates images are property-related
   - EXTRACTS ALL DETAILS: bedrooms, bathrooms, amenities
   - GENERATES TITLE (short, max 2 sentences)
   - GENERATES DESCRIPTION (full description)
   - Creates intelligent filenames (location_title_date.jpg)
   - Uploads to Cloudflare
3. Shows draft UI with AI-extracted fields
4. User is prompted: "What's the location, address, and price?"
5. User provides: "Lagos, Victoria Island, 500,000 per month"
6. AIDA auto-infers:
   - Currency from location: Lagos → NGN (via CurrencyManager API)
   - listing_type from price_type: "per month" → "rent" ✓
7. User can edit via text: "add gym to amenities", "change title"
8. User publishes: "publish this listing"

METHOD CONTEXT: listing_method="image"
AI EXTRACTS: bedrooms, bathrooms, amenities, description, title
AUTO-INFERRED: currency (from location), listing_type (from price_type)
```

### 3️⃣ Video-Based Listing (NEW - User uploads video, optionally photos)

```
User clicks "List with Video"

FLOW:
1. User uploads a video (walkthrough)
2. Backend:
   - Uploads it to Cloudinary
   - Creates an intelligent filename
3. System suggests: "Video uploaded! Upload 2-3 photos for better detection."
4. User uploads photos
5. Backend:
   - Validates images are property-related
   - EXTRACTS ALL DETAILS from photos
   - GENERATES TITLE and DESCRIPTION
6. Shows draft UI with extracted fields + video URL
7. Same flow as image-based from step 5 onwards:
   - User prompted for: location, address, price (with price_type)
   - AIDA auto-infers: currency (from location), listing_type (from price_type)

METHOD CONTEXT: listing_method="video"
AI EXTRACTS: From photos (not the video)
AUTO-INFERRED: currency (from location), listing_type (from price_type)
VIDEO STORAGE: Cloudinary
PHOTO STORAGE: Cloudflare
```

### Unified Draft UI Result

**All three methods produce the SAME final result:**

```json
{
  "success": true,
  "listing_method": "text|image|video",
  "extracted_fields": {
    "bedrooms": 3,
    "bathrooms": 2,
    "amenities": ["WiFi", "Parking", "AC"],
    "description": "Beautiful apartment with modern amenities.",
    "title": "3-Bed Modern Apartment. Great location!"
  },
  "confidence": { ... },
  "image_urls": [ ... ],
  "video_url": "..." // Only if video method
}
```

The **frontend shows the same UI** regardless of listing method - the user sees:
- Property images
- Extracted details
- Ability to edit via text commands
- Publish button

---
## Data Flow Example

### Request

```bash
curl -X POST http://localhost:8000/listings/analyze-images \
  -H "Authorization: Bearer {token}" \
  -F "images=@bedroom.jpg" \
  -F "images=@kitchen.jpg" \
  -F "images=@bathroom.jpg"
```

### Response

```json
{
  "success": true,
  "images_processed": 3,
  "images_validated": ["bedroom.jpg", "kitchen.jpg", "bathroom.jpg"],
  "image_urls": [
    "https://imagedelivery.net/lojiz/bedroom_hash/public",
    "https://imagedelivery.net/lojiz/kitchen_hash/public",
    "https://imagedelivery.net/lojiz/bathroom_hash/public"
  ],
  "extracted_fields": {
    "bedrooms": 3,
    "bathrooms": 2,
    "amenities": ["WiFi Router", "AC Unit", "Furniture", "Balcony"],
    "description": "Beautiful 3-bedroom, 2-bathroom modern apartment with contemporary furnishings and excellent amenities."
  },
  "confidence": {
    "bedrooms": 0.95,
    "bathrooms": 0.88,
    "amenities": 0.72,
    "description": 0.91
  },
  "validation_errors": [],
  "suggestions": [
    "Verify bedroom and bathroom counts are accurate",
    "You'll need to provide location, address, and price information"
  ]
}
```

---
## API Endpoints Summary

| Endpoint | Method | Purpose | Auth |
|----------|--------|---------|------|
| `/listings/analyze-images` | POST | Upload & analyze images | Required |
| `/listings/analyze-video` | POST | Upload & analyze video | Required |
| `/listings/validate-media` | POST | Quick file validation | Required |

---
## Important Notes

### Image Validation

- **Property validation happens BEFORE upload** - non-property images are rejected, saving Cloudflare storage
- **Confidence threshold**: Default 0.6 (60%); adjustable via `PROPERTY_IMAGE_MIN_CONFIDENCE`
- **High-confidence fields** (>0.7): Auto-filled in the listing form
- **Medium-confidence fields** (0.5-0.7): Shown as suggestions; user confirms
- **Low-confidence fields** (<0.5): User must provide manually

### Video Processing

- Videos are uploaded to **Cloudinary** (not Cloudflare)
- Frame extraction is available for future frame-by-frame analysis
- Users are encouraged to upload photos alongside the video for better accuracy

### Listing Type Inference

After the user provides a **price**, the system infers listing_type:

```
Price Input → Listing Type
- High monthly (e.g., 500,000/month) → "rent"
- Low nightly (e.g., 5,000/night) → "short-stay"
- Very high one-time (e.g., 50,000,000) → "sale"
- "Looking for roommate" context → "roommate"
```

---
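The inference table above could be coded roughly as follows. This is an illustrative heuristic under the assumptions stated in the table (the production logic lives in the agent and may weigh more signals):

```python
def infer_listing_type(price, price_type=None, context=""):
    """Map price signals to a listing_type per the table above (sketch)."""
    if "roommate" in context.lower():
        return "roommate"
    if price_type in ("per month", "monthly"):
        return "rent"
    if price_type in ("per night", "nightly"):
        return "short-stay"
    # One-time price with a very high magnitude suggests a sale
    if price_type is None and price >= 10_000_000:
        return "sale"
    return "unknown"

print(infer_listing_type(500_000, "per month"))  # rent
print(infer_listing_type(5_000, "per night"))    # short-stay
print(infer_listing_type(50_000_000))            # sale
```

The 10,000,000 cutoff for "sale" is an arbitrary placeholder threshold; the real system infers from richer context.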
## Testing

### Test 1: TEXT Method (User provided text details + uploads images)

```bash
# User already provided details via chat
# Now uploading images to validate

curl -X POST /listings/analyze-images \
  -H "Authorization: Bearer {token}" \
  -F "images=@bedroom.jpg" \
  -F "images=@kitchen.jpg" \
  -F "listing_method=text" \
  -F "location=Lagos"

# Response:
# - Images validated as property-related ✓
# - Details preserved from the text conversation
# - Returns the same format with extracted fields + image URLs
```

### Test 2: IMAGE Method (User uploads photos only)

```bash
# User has no text details - AI extracts everything

curl -X POST /listings/analyze-images \
  -H "Authorization: Bearer {token}" \
  -F "images=@bedroom.jpg" \
  -F "images=@kitchen.jpg" \
  -F "images=@bathroom.jpg" \
  -F "listing_method=image"

# Response:
# - bedrooms: 3 (extracted from images)
# - bathrooms: 2 (extracted from images)
# - title: "Modern 3-Bed Apartment. Great Location!" (AI-generated, SHORT)
# - description: "Beautiful apartment with..." (AI-generated, full)
# - amenities: ["WiFi", "AC", "Parking"] (extracted)
# - confidence: { bedrooms: 0.95, bathrooms: 0.88, ... }
```

### Test 3: VIDEO Method (User uploads video + photos)

```bash
# Step 1: Upload video
curl -X POST /listings/analyze-video \
  -H "Authorization: Bearer {token}" \
  -F "video=@walkthrough.mp4" \
  -F "location=Lagos"

# Response: video_url, suggestions to upload photos

# Step 2: Upload photos for analysis
curl -X POST /listings/analyze-images \
  -H "Authorization: Bearer {token}" \
  -F "images=@photo1.jpg" \
  -F "images=@photo2.jpg" \
  -F "listing_method=video" \
  -F "location=Lagos"

# Response: Same as IMAGE method + video_url in the final listing
```

### Test 4: File Naming

```bash
# Upload images with location context
curl -X POST /listings/analyze-images \
  -F "images=@IMG_1234.jpg" \
  -F "images=@IMG_5678.jpg" \
  -F "listing_method=image" \
  -F "location=Lagos"

# Backend generates:
# - Lagos_Modern_Apartment_2025_01_31_0.jpg
# - Lagos_Modern_Apartment_2025_01_31_1.jpg
# (AI extracts a title from the images and uses it in the filename)

# Cloudflare stores with these intelligent names
# If duplicate: Lagos_Modern_Apartment_2025_01_31_0_1.jpg (worker appends _1)
```

### Test 5: Short Title Validation

```
# Verify the title is SHORT (max 2 sentences)

Acceptable response:
{
  "extracted_fields": {
    "title": "Modern 3-Bed Apartment. Great location!",  ✓ SHORT
    "description": "Beautiful 3-bedroom, 2-bathroom modern apartment..."  ✓ FULL
  }
}

NOT acceptable:
{
  "title": "This is a beautiful 3-bedroom, 2-bathroom modern apartment..."  ❌ TOO LONG
}
```

---
## Error Handling

### Common Errors

| Error | Cause | Solution |
|-------|-------|----------|
| `Not a property photo` | Image rejected by the vision AI | Upload actual property photos |
| `Image size exceeds 10MB` | File too large | Compress the image or use a smaller file |
| `Invalid image type` | Wrong file format | Use JPEG, PNG, or WebP |
| `Cloudinary upload failed` | Credentials not set | Check `.env` variables |
| `HF API timeout` | Vision model slow | Retry or use the Cloudinary-hosted fallback |

---
- ## Smart File Naming & Storage
629
-
630
- ### Intelligent Filename Generation
631
-
632
- **Backend generates meaningful filenames instead of using random names:**
633
-
634
- ```python
635
- Pattern: {location}_{title}_{timestamp}_{index}.jpg
636
-
637
- Examples:
638
- - Lagos_Modern_Apartment_2025_01_31_1.jpg
639
- - Victoria_Island_3_Bed_Luxury_2025_01_31_0.jpg
640
- - Cotonou_Cozy_Studio_2025_01_31_0.jpg
641
- ```
642
-
643
- **Algorithm:**
644
- 1. Extract location (if available)
645
- 2. Extract title (first 20 chars, AI-generated if image/video method)
646
- 3. Add timestamp (YYYY_MM_DD_HHMMSS)
647
- 4. Add index for multiple images (0, 1, 2...)
648
-
649
- **Benefits:**
650
- - Easy to identify property in storage
651
- - Date shows when listed
652
- - Cloudflare worker can detect duplicates
653
- - Organized file structure
654
-
655
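The four-step algorithm above can be sketched as a small Python helper (a minimal illustration; `build_image_filename` and the exact slug rules are assumptions, not the production code):

```python
import re
from datetime import datetime
from typing import Optional

def build_image_filename(location: Optional[str], title: str, index: int,
                         now: Optional[datetime] = None) -> str:
    """Build a {location}_{title}_{timestamp}_{index}.jpg filename."""
    now = now or datetime.utcnow()

    def slug(text: str) -> str:
        # Keep letters and digits; collapse everything else into underscores
        return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")

    parts = []
    if location:                                    # 1. location (if available)
        parts.append(slug(location))
    parts.append(slug(title)[:20])                  # 2. first 20 chars of the title
    parts.append(now.strftime("%Y_%m_%d_%H%M%S"))   # 3. timestamp
    parts.append(str(index))                        # 4. index for multiple images
    return "_".join(parts) + ".jpg"
```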
- ### Cloudflare Worker Deduplication
-
- When an image reaches Cloudflare:
- ```
- 1. Check if the filename exists
- 2. If NEW β†’ store as-is
- 3. If DUPLICATE β†’ append a counter
-    - first duplicate: {name}_1.jpg
-    - second: {name}_2.jpg
- ```
-
- **Example:**
- ```
- Scenario: Same user uploads "Lagos_Apartment.jpg" twice
- 1st upload β†’ Lagos_Apartment.jpg
- 2nd upload β†’ Lagos_Apartment_1.jpg (worker auto-appended)
- ```
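The counter-appending rule can be sketched as follows (the real deduplication runs inside the Cloudflare Worker; this is a hypothetical Python illustration that assumes filenames always carry an extension such as `.jpg`):

```python
def dedupe_filename(name: str, existing: set) -> str:
    """Return a unique filename, appending _1, _2, ... before the extension."""
    if name not in existing:
        return name                       # NEW -> store as-is
    stem, _, ext = name.rpartition(".")   # split "Lagos_Apartment.jpg" -> stem/ext
    counter = 1
    while f"{stem}_{counter}.{ext}" in existing:
        counter += 1                      # keep counting past _1, _2, ...
    return f"{stem}_{counter}.{ext}"
```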
-
- ---
-
- ## Title & Description Generation
-
- ### Title Requirements
-
- **MUST BE SHORT:**
- βœ… "Modern 3-bed apartment. Great location!"
- βœ… "Spacious family home with garden."
- ❌ "This is a beautiful 3-bedroom, 2-bathroom modern apartment with contemporary furnishings, located in a prime area of the city with excellent amenities and facilities"
-
- **Maximum:** 2 sentences (not a full description)
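A server-side guard for the two-sentence limit could look like this (a hypothetical helper, not part of the current codebase; the 80-character cap is an assumed backstop for run-on titles without sentence punctuation):

```python
import re

def is_short_title(title: str, max_sentences: int = 2, max_chars: int = 80) -> bool:
    """True if the title stays within the sentence and length limits."""
    text = title.strip()
    if not text or len(text) > max_chars:
        return False
    # Count sentences by splitting on ., !, ? followed by a space or end of string
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s]
    return len(sentences) <= max_sentences
```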
-
- **Generated by Vision AI for the image/video methods:**
- ```
- Example prompt:
- "Generate a SHORT, catchy real estate listing title for this property (3 bed, 2 bath) in Lagos.
- Maximum 2 sentences. Must be concise and appealing.
- Example: 'Modern 2-bed apartment with balcony. Great location!'"
- ```
-
- ### Description Generation
-
- **Full property description (2-3 sentences):**
- - Generated from images/video
- - Professional tone
- - Highlights key features
- - Stored in `extracted_fields.description`
-
- **Example:**
- ```
- "Beautiful 3-bedroom, 2-bathroom modern apartment featuring contemporary
- furnishings, air conditioning, WiFi, and private balcony overlooking the
- city. Located in a secure, gated community with excellent amenities."
- ```
-
- ---
-
- ## Performance Optimization
-
- ### Recommended for Production
-
- 1. **Implement caching**: Cache results for similar property images to reduce API calls
- 2. **Batch processing**: Process multiple images in parallel
- 3. **Frame extraction**: For videos, extract key frames instead of analyzing every frame
- 4. **Model optimization**: Consider a smaller model variant for faster inference
- 5. **Async processing**: Run long tasks (e.g., video analysis) as async jobs
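Point 2 (batch processing) can be sketched with `asyncio.gather` (a minimal sketch; `analyze_image` here is a placeholder for the real vision call, not the production function):

```python
import asyncio

async def analyze_image(url: str) -> dict:
    # Placeholder for the real per-image vision analysis call (hypothetical)
    await asyncio.sleep(0.1)
    return {"url": url, "ok": True}

async def analyze_batch(urls: list) -> list:
    """Analyze all images concurrently instead of one by one."""
    return await asyncio.gather(*(analyze_image(u) for u in urls))
```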
-
- ### Estimated Response Times
-
- - Image validation: **2-3 seconds** (first image), **+1s per additional image**
- - Video upload: **5-10 seconds**, depending on file size
- - Vision analysis: **2-4 seconds** per image
-
- ---
-
- ## Success Metrics
-
- Track these to measure feature adoption:
-
- 1. **Adoption Rate**: % of new listings created via image/video upload
- 2. **Time Saved**: Average creation time (image-based vs. text-based)
- 3. **Accuracy**: % of auto-detected fields accepted by users
- 4. **Field Coverage**: Which fields have the highest accuracy
- 5. **Error Rate**: % of images rejected as non-property
-
- ---
-
- ## Future Enhancements
-
- 1. **Multi-frame video analysis**: Extract key frames from the video, analyze each
- 2. **OCR for signs**: Extract property addresses from signs visible in photos
- 3. **Furniture detection**: Count furniture items, estimate age
- 4. **Damage detection**: Identify needed repairs
- 5. **Neighborhood analysis**: Analyze the background (street view, buildings)
- 6. **Price estimation**: AI suggests a price based on similar listings
- 7. **Virtual tour generation**: Automatically create a walkthrough from photos
-
- ---
-
- ## Support & Troubleshooting
-
- ### Check Vision Service Status
-
- ```bash
- GET /health
- # Returns: vision_service: "healthy" | "unavailable"
- ```
-
- ### View Logs
-
- ```bash
- # Backend logs for vision analysis
- grep "Vision Service" logs/app.log
- grep "Hugging Face API" logs/app.log
- ```
-
- ### Reset Vision Service Cache
-
- ```bash
- # Clear vision service cache (if implemented)
- DELETE /admin/cache/vision
- ```
-
- ---
-
- ## Summary
-
- βœ… **Phase 1 Complete:**
- - Vision service created (Hugging Face integration)
- - Media upload endpoints ready
- - Property validation implemented
- - Listing collection integration done
- - Image/video storage configured
-
- **Next Steps:**
- 1. Update frontend to use `/listings/analyze-images` endpoint
- 2. Update frontend to use `/listings/analyze-video` endpoint
- 3. Add vision results to chat UI
- 4. Test end-to-end flow
- 5. Monitor accuracy metrics
- 6. Optimize based on user feedback
app/__pycache__/config.cpython-313.pyc CHANGED
Binary files a/app/__pycache__/config.cpython-313.pyc and b/app/__pycache__/config.cpython-313.pyc differ
 
app/ai/agent/__pycache__/graph.cpython-313.pyc CHANGED
Binary files a/app/ai/agent/__pycache__/graph.cpython-313.pyc and b/app/ai/agent/__pycache__/graph.cpython-313.pyc differ
 
app/ai/agent/brain.py CHANGED
@@ -15,6 +15,8 @@ from langchain_core.messages import SystemMessage, HumanMessage
 from app.ai.agent.state import AgentState, FlowState
 from app.ai.agent.schema import get_schema_for_llm, get_draft_summary, get_missing_fields
 from app.config import settings
+from app.ai.lightning.rewards import log_field_reward, log_reward, log_negative_reward, REWARD_SEARCH_COMPLETED, REWARD_ALERT_CREATED
+from app.ai.lightning.tracer import log_trajectory_step
 
 logger = get_logger(__name__)
 
@@ -851,21 +853,25 @@ async def execute_tool(tool_name: str, params: Dict[str, Any], state: AgentState
         else:
             # No draft_ui yet - AIDA will ask for images
             state.temp_data["action"] = "respond"
-
+
+        # Log reward for field extraction (Agent Lightning)
+        await log_field_reward(state.session_id, list(fields.keys()))
+
         return True, f"Updated: {list(fields.keys())}", state.provided_fields
 
     elif tool_name == "search_properties":
         # Import and call search service
        from app.ai.services.search_extractor import extract_search_params
        from app.ai.services.search_service import search_listings_hybrid, search_mongodb
+        from app.ai.services.search_strategy_selector import select_search_strategy, SearchStrategy
+
         # SMART UI: Clear old my_listings when doing new search
        state.my_listings = []
        state.temp_data.pop("my_listings", None)
+
         # Step 1: Extract params from the full user message (LLM is smart)
        search_params = await extract_search_params(state.last_user_message)
+
         # Step 2: Merge with Brain-extracted params (these have priority if present)
        if params.get("location"):
            search_params["location"] = params["location"]
@@ -875,81 +881,142 @@ async def execute_tool(tool_name: str, params: Dict[str, Any], state: AgentState
            search_params["max_price"] = params["max_price"]
        if params.get("beds"):
            search_params["bedrooms"] = params["beds"]
-
+
        is_suggestion = False
-
-        # Step 3: STRICT SEARCH FIRST (only use what user specified)
-        results = await search_mongodb(search_params, limit=10)
-
-        # Step 4: If 0 results, try RELAXED SUGGESTION search
-        if not results:
-            logger.info("Strict search yielded 0 results, trying suggestion search...")
-            suggestion_results, currency = await search_listings_hybrid(
-                user_query=state.last_user_message,
-                search_params=search_params,
-                limit=10,
-                mode="relaxed"
+        rlm_used = False
+
+        # ================================================================
+        # Step 2.5: CHECK IF RLM SHOULD BE USED (NEW!)
+        # ================================================================
+        strategy_result = await select_search_strategy(state.last_user_message, search_params)
+
+        if strategy_result.get("use_rlm"):
+            # Use RLM for complex queries
+            logger.info(
+                "🧠 RLM activated for search",
+                strategy=strategy_result["strategy"].value,
+                reasoning=strategy_result["reasoning"][:50]
            )
-
-        # Step 4.5: Filter suggestions - only keep results from the requested location
-        # If user asked for "New York" but we get results from "Lagos", discard them
-        requested_location = (search_params.get("location") or "").lower()
-
-        if requested_location and suggestion_results:
-            relevant_suggestions = []
-            for listing in suggestion_results:
-                listing_location = (listing.get("location") or "").lower()
-                # Check if the listing is from the requested location (or nearby)
-                if requested_location in listing_location or listing_location in requested_location:
-                    relevant_suggestions.append(listing)
-
-            if relevant_suggestions:
-                results = relevant_suggestions
-                is_suggestion = True
-                logger.info(f"Found {len(relevant_suggestions)} relevant suggestions for {requested_location}")
+
+            try:
+                from app.ai.services.rlm_search_service import rlm_search
+
+                rlm_result = await rlm_search(
+                    query=state.last_user_message,
+                    context={
+                        "user_location": state.user_location,
+                        "search_params": search_params
+                    }
+                )
+
+                results = rlm_result.get("results", [])
+                rlm_used = True
+
+                # Store RLM metadata
+                state.temp_data["rlm_strategy"] = rlm_result.get("strategy_used")
+                state.temp_data["rlm_reasoning_steps"] = rlm_result.get("reasoning_steps")
+                state.temp_data["rlm_call_count"] = rlm_result.get("call_count")
+
+                # Use RLM-generated message if available
+                if rlm_result.get("message"):
+                    state.temp_data["response_text"] = rlm_result["message"]
+
+                # Store comparison data if available
+                if rlm_result.get("comparison_data"):
+                    state.temp_data["comparison_data"] = rlm_result["comparison_data"]
+
+                # Store aggregation result if available
+                if rlm_result.get("aggregation_result"):
+                    state.temp_data["aggregation_result"] = rlm_result["aggregation_result"]
+
+                logger.info(
+                    f"🧠 RLM search complete",
+                    result_count=len(results),
+                    strategy=rlm_result.get("strategy_used"),
+                    calls=rlm_result.get("call_count")
+                )
+
+            except Exception as rlm_error:
+                logger.error(f"RLM search failed, falling back to standard: {rlm_error}")
+                rlm_used = False
+                results = []
+
+        # ================================================================
+        # Standard search path (if RLM not used or failed)
+        # ================================================================
+        if not rlm_used:
+            # Step 3: STRICT SEARCH FIRST (only use what user specified)
+            results = await search_mongodb(search_params, limit=10)
+
+            # Step 4: If 0 results, try RELAXED SUGGESTION search
+            if not results:
+                logger.info("Strict search yielded 0 results, trying suggestion search...")
+                suggestion_results, currency = await search_listings_hybrid(
+                    user_query=state.last_user_message,
+                    search_params=search_params,
+                    limit=10,
+                    mode="relaxed"
+                )
+
+                # Step 4.5: Filter suggestions - only keep results from the requested location
+                requested_location = (search_params.get("location") or "").lower()
+
+                if requested_location and suggestion_results:
+                    relevant_suggestions = []
+                    for listing in suggestion_results:
+                        listing_location = (listing.get("location") or "").lower()
+                        if requested_location in listing_location or listing_location in requested_location:
+                            relevant_suggestions.append(listing)
+
+                    if relevant_suggestions:
+                        results = relevant_suggestions
+                        is_suggestion = True
+                        logger.info(f"Found {len(relevant_suggestions)} relevant suggestions for {requested_location}")
+                    else:
+                        results = []
+                        is_suggestion = False
+                        logger.info(f"No relevant suggestions found for {requested_location}")
                else:
-                    # No relevant suggestions - results stay empty for "Notify me" prompt
-                    results = []
-                    is_suggestion = False
-                    logger.info(f"No relevant suggestions found for {requested_location}, will prompt for notification")
-        else:
-            # No location filter specified, use all suggestions
-            results = suggestion_results
-            is_suggestion = True
-
+                    results = suggestion_results
+                    is_suggestion = True
+
        # Step 5: Enrich results with owner/review data (same as listings API)
        if results:
            from app.database import get_db
            from app.services.listing_service import enrich_listings_batch
-
+
            db = await get_db()
-            # Convert to dicts if needed and stringify _id
            formatted_results = []
            for doc in results:
                if "_id" in doc and not isinstance(doc["_id"], str):
                    doc["_id"] = str(doc["_id"])
                formatted_results.append(doc)
-
+
            results = await enrich_listings_batch(formatted_results, db)
            logger.info(f"Enriched {len(results)} search results with owner/review data")
-
        # Step 6: Store results and flags
        state.search_results = results
        state.temp_data["search_results"] = results
        state.temp_data["action"] = "search_results"
        state.temp_data["is_suggestion"] = is_suggestion
-        state.temp_data["search_params"] = search_params  # For "Notify me" feature
-
+        state.temp_data["search_params"] = search_params
+        state.temp_data["search_strategy"] = strategy_result["strategy"].value if hasattr(strategy_result["strategy"], "value") else str(strategy_result["strategy"])
+
        # Always save last search params for "Notify me" feature
        state.temp_data["last_search_params"] = search_params
        state.temp_data["last_search_query"] = state.last_user_message
+
        # If no results found, flag to propose alert
        if len(results) == 0:
            state.temp_data["propose_alert"] = True
            state.temp_data["response_text"] = f"I couldn't find any properties matching your search right now. Would you like me to notify you when something becomes available? πŸ””"
-
-        return True, f"Found {len(results)} properties", results
+
+        # Log reward for search completion (Agent Lightning)
+        if len(results) > 0:
+            await log_reward(state.session_id, REWARD_SEARCH_COMPLETED, "search_completed", {"result_count": len(results)})
+
+        return True, f"Found {len(results)} properties" + (" (via RLM)" if rlm_used else ""), results
 
     elif tool_name == "get_my_listings":
         # Get user's listings
@@ -1147,6 +1214,9 @@ async def execute_tool(tool_name: str, params: Dict[str, Any], state: AgentState
        state.temp_data["response_text"] = f"Got it! πŸ”” I'll keep watching for properties in {location} and notify you the moment something becomes available!"
        state.temp_data["action"] = "alert_created"
 
+        # Log reward for alert creation (Agent Lightning)
+        await log_reward(state.session_id, REWARD_ALERT_CREATED, "alert_created", {"alert_id": str(alert.id)})
+
        return True, f"Alert created: {alert.id} (found {len(current_results)} current matches)", {
            "alert_id": str(alert.id),
            "current_match_count": len(current_results)
@@ -1297,6 +1367,8 @@ async def execute_tool(tool_name: str, params: Dict[str, Any], state: AgentState
 
    except Exception as e:
        logger.error("Tool execution error", tool=tool_name, exc_info=e)
+        # Log negative reward for tool execution error (Agent Lightning)
+        await log_negative_reward(state.session_id, "error", f"Tool {tool_name} failed: {str(e)}")
        return False, str(e), None
 
 
@@ -1305,11 +1377,31 @@ async def agent_think(state: AgentState) -> AgentState:
    Main agent thinking loop.
    LLM reasons β†’ decides tool β†’ executes β†’ generates response.
    """
-
+
    logger.info("Agent thinking started", user_id=state.user_id)
-
+
+    # Log user input trajectory (Agent Lightning)
+    await log_trajectory_step(
+        state.session_id,
+        "user_input",
+        {"message": state.last_user_message[:500] if state.last_user_message else ""},
+        state.user_id
+    )
+
    # Step 1: Brain decides what to do
    decision = await brain_decide(state)
+
+    # Log brain decision trajectory (Agent Lightning)
+    await log_trajectory_step(
+        state.session_id,
+        "brain_decision",
+        {
+            "thinking": decision.thinking[:200] if decision.thinking else "",
+            "tool": decision.tool,
+            "is_final": decision.is_final
+        },
+        state.user_id
+    )
 
    # Store thinking for debugging
    state.temp_data["brain_thinking"] = decision.thinking
@@ -1359,8 +1451,19 @@ async def agent_think(state: AgentState) -> AgentState:
    else:
        state.temp_data["action"] = "respond"  # Just text, no data cards
 
+    # Log response trajectory (Agent Lightning)
+    await log_trajectory_step(
+        state.session_id,
+        "response",
+        {
+            "response": state.temp_data.get("response_text", "")[:500],
+            "action": state.temp_data.get("action", "respond")
+        },
+        state.user_id
+    )
+
    logger.info("Agent thinking complete", action=decision.tool, show_data=decision.show_data)
-
+
    return state
 
 
app/ai/agent/graph.py CHANGED
@@ -15,6 +15,7 @@ from langgraph.checkpoint.memory import MemorySaver
 from structlog import get_logger
 
 from app.ai.agent.state import AgentState, FlowState
+from app.ai.lightning.tracer import wrap_graph_if_enabled
 from app.ai.agent.nodes.authenticate import authenticate
 from app.ai.agent.brain import agent_think
 from app.ai.agent.nodes.validate_output import validate_output_node
@@ -117,9 +118,12 @@ def build_aida_graph():
 
    checkpointer = MemorySaver()
    compiled_graph = graph.compile(checkpointer=checkpointer)
-
+
+    # Wrap with Agent Lightning tracer (if enabled)
+    compiled_graph = wrap_graph_if_enabled(compiled_graph)
+
    logger.info("βœ… LangGraph V2 compiled (Brain-Based)")
-
+
    return compiled_graph
 
 
app/ai/agent/nodes/__pycache__/listing_publish.cpython-313.pyc CHANGED
Binary files a/app/ai/agent/nodes/__pycache__/listing_publish.cpython-313.pyc and b/app/ai/agent/nodes/__pycache__/listing_publish.cpython-313.pyc differ
 
app/ai/agent/nodes/listing_collect.py CHANGED
@@ -30,80 +30,19 @@ llm = ChatOpenAI(
    temperature=0.7,
 )
 
-async def initialize_from_vision_analysis(
-    state: AgentState,
-    vision_data: Dict
-) -> AgentState:
-    """
-    Initialize listing from AI vision analysis (images/video)
-
-    Populates state with auto-detected fields and sets up for user confirmation.
-    User will be prompted for required fields: location, address, price (with price_type).
-
-    Auto-inferred fields:
-    - Currency: Auto-detected from location via external API
-    - Listing type: Auto-inferred from price_type (per month β†’ rent, once β†’ sale, etc.)
-
-    Args:
-        state: Current agent state
-        vision_data: Dict with extracted fields from vision service
-
-    Returns:
-        Updated state ready for collection
-    """
-    try:
-        # Extract vision analysis results
-        extracted_fields = vision_data.get("extracted_fields", {})
-        confidence = vision_data.get("confidence", {})
-        image_urls = vision_data.get("image_urls", [])
-
-        logger.info("πŸ€– Initializing listing from vision analysis",
-                    bedrooms=extracted_fields.get("bedrooms"),
-                    bathrooms=extracted_fields.get("bathrooms"),
-                    amenities_count=len(extracted_fields.get("amenities", [])))
-
-        # Pre-fill detected fields with high confidence (>0.7)
-        high_confidence_threshold = 0.7
-
-        # Always add images (they were validated)
-        if image_urls:
-            state.update_listing_progress("images", image_urls)
-            logger.info(f"βœ… Added {len(image_urls)} validated images")
-
-        # Bedrooms (high confidence)
-        if extracted_fields.get("bedrooms") is not None and confidence.get("bedrooms", 0) > high_confidence_threshold:
-            state.update_listing_progress("bedrooms", extracted_fields["bedrooms"])
-            logger.info(f"βœ… Auto-filled bedrooms: {extracted_fields['bedrooms']}")
-
-        # Bathrooms (high confidence)
-        if extracted_fields.get("bathrooms") is not None and confidence.get("bathrooms", 0) > high_confidence_threshold:
-            state.update_listing_progress("bathrooms", extracted_fields["bathrooms"])
-            logger.info(f"βœ… Auto-filled bathrooms: {extracted_fields['bathrooms']}")
-
-        # Amenities (even medium confidence is good for amenities)
-        if extracted_fields.get("amenities") and confidence.get("amenities", 0) > 0.5:
-            state.update_listing_progress("amenities", extracted_fields["amenities"])
-            logger.info(f"βœ… Auto-filled amenities: {extracted_fields['amenities']}")
-
-        # Description (if high confidence)
-        if extracted_fields.get("description") and confidence.get("description", 0) > high_confidence_threshold:
-            state.update_listing_progress("description", extracted_fields["description"])
-            logger.info("βœ… Auto-filled description")
-
-        # Store vision confidence scores in temp_data for reference
-        state.temp_data["vision_confidence"] = confidence
-        state.temp_data["from_vision_analysis"] = True
-
-        # Set user message to indicate vision analysis was done
-        state.last_user_message = "[Vision analysis completed - awaiting user confirmation]"
-
-        logger.info("βœ… Vision analysis initialization complete")
-        return state
-
-    except Exception as e:
-        logger.error("Error initializing from vision analysis", exc_info=e)
-        state.set_error(f"Error initializing from vision: {str(e)}", should_retry=True)
-        return state
+# ============================================================
+# VISION ANALYSIS - DISABLED
+# ============================================================
+# NOTE: Vision analysis is NOT in use. Image uploads are handled
+# directly by Cloudflare Worker (frontend upload).
+# This function is kept for future reference only.
+# ============================================================
+# async def initialize_from_vision_analysis(
+#     state: AgentState,
+#     vision_data: Dict
+# ) -> AgentState:
+#     """Initialize listing from AI vision analysis (images/video) - DISABLED"""
+#     pass
 
 
 async def generate_contextual_question(state: AgentState, next_field: str = None) -> str:
app/ai/agent/nodes/listing_publish.py CHANGED
@@ -11,6 +11,7 @@ from app.ai.agent.state import AgentState, FlowState
 from app.ai.agent.schemas import ListingDraft
 from app.database import get_db
 from app.ai.services.vector_service import upsert_listing_to_vector_db
+from app.ai.lightning.rewards import log_reward
 
 logger = get_logger(__name__)
 
@@ -282,6 +283,15 @@ async def listing_publish_handler(state: AgentState) -> AgentState:
            logger.info("Proactive alerts processed for new listing", listing_id=listing_id)
        except Exception as notify_err:
            logger.warning("Proactive notification check failed", error=str(notify_err))
+
+        # βœ… STEP 3.3: Log reward for successful publish (Agent Lightning)
+        is_update = bool(state.temp_data.get("editing_listing_id"))
+        await log_reward(
+            state.session_id,
+            1.0,  # Primary success signal
+            "listing_published",
+            {"listing_id": listing_id, "is_update": is_update}
+        )
 
    except Exception as e:
        logger.error("MongoDB save failed", exc_info=e)
app/ai/lightning/__init__.py ADDED
@@ -0,0 +1,38 @@
+# app/ai/lightning/__init__.py
+"""
+Agent Lightning - RL Trajectory Capture for AIDA
+
+This module implements Agent Lightning-inspired reinforcement learning
+trajectory capture for training AIDA to improve listing completion rates.
+
+Components:
+- tracer.py: Captures state transitions, tool calls, and outcomes
+- rewards.py: Logs reward signals at key events
+- config.py: Lightning-specific configuration
+
+Usage:
+    from app.ai.lightning import log_reward, log_trajectory
+
+    # Log a reward signal
+    await log_reward(session_id, 1.0, "listing_published", {"listing_id": "..."})
+
+    # Trajectories are captured automatically when LIGHTNING_ENABLED=true
+"""
+
+from app.ai.lightning.rewards import log_reward, log_field_reward, log_negative_reward
+from app.ai.lightning.tracer import (
+    wrap_graph_if_enabled,
+    log_trajectory_step,
+    get_session_trajectory,
+    export_trajectories_for_training
+)
+
+__all__ = [
+    "log_reward",
+    "log_field_reward",
+    "log_negative_reward",
+    "wrap_graph_if_enabled",
+    "log_trajectory_step",
+    "get_session_trajectory",
+    "export_trajectories_for_training"
+]
app/ai/lightning/rewards.py ADDED
@@ -0,0 +1,249 @@
+# app/ai/lightning/rewards.py
+"""
+Agent Lightning Reward Signals
+
+Logs reward signals at key events for RL training.
+Rewards are associated with session trajectories.
+
+Reward Definitions:
+- listing_published: +1.0 (primary success signal)
+- field_extracted: +0.1 per field (incremental progress)
+- search_completed: +0.3 (user found what they wanted)
+- alert_created: +0.2 (user engaged with notifications)
+- conversation_error: -0.5 (negative signal for failures)
+- conversation_abandoned: -0.3 (user left mid-flow)
+"""
+
+import json
+from datetime import datetime
+from typing import Any, Dict, List, Optional
+
+from structlog import get_logger
+
+logger = get_logger(__name__)
+
+# Reward value constants
+REWARD_LISTING_PUBLISHED = 1.0
+REWARD_FIELD_EXTRACTED = 0.1
+REWARD_SEARCH_COMPLETED = 0.3
+REWARD_ALERT_CREATED = 0.2
+REWARD_CONVERSATION_ERROR = -0.5
+REWARD_CONVERSATION_ABANDONED = -0.3
+
+# Redis connection (lazy initialization)
+_redis_client = None
+
+
+async def _get_redis():
+    """Get or create the Redis client."""
+    global _redis_client
+    if _redis_client is None:
+        try:
+            from app.ai.memory.redis_memory import get_redis_client
+            _redis_client = await get_redis_client()
+        except Exception as e:
+            logger.warning("Lightning Rewards: Redis connection failed", error=str(e))
+            return None
+    return _redis_client
+
+
+def _is_lightning_enabled() -> bool:
+    """Check whether Lightning is enabled."""
+    try:
+        from app.config import settings
+        return getattr(settings, "LIGHTNING_ENABLED", False)
+    except Exception:
+        return False
+
+
+def _get_reward_ttl() -> int:
+    """Get the TTL for rewards in seconds (same as trajectories)."""
+    try:
+        from app.config import settings
+        days = getattr(settings, "LIGHTNING_TRAJECTORY_TTL_DAYS", 30)
+        return days * 24 * 60 * 60
+    except Exception:
+        return 30 * 24 * 60 * 60
+
+
+async def log_reward(
+    session_id: str,
+    reward: float,
+    event_type: str,
+    metadata: Optional[Dict[str, Any]] = None
+) -> bool:
+    """
+    Log a reward signal for a session.
+
+    Args:
+        session_id: Session identifier
+        reward: Reward value (positive or negative)
+        event_type: Type of event that triggered the reward
+        metadata: Optional additional context
+
+    Returns:
+        True if logged successfully
+    """
+    if not _is_lightning_enabled():
+        return False
+
+    try:
+        redis = await _get_redis()
+        if not redis:
+            return False
+
+        reward_entry = {
+            "timestamp": datetime.utcnow().isoformat(),
+            "reward": reward,
+            "event_type": event_type,
+            "metadata": metadata or {}
+        }
+
+        key = f"lightning:rewards:{session_id}"
+        await redis.rpush(key, json.dumps(reward_entry))
+        await redis.expire(key, _get_reward_ttl())
+
+        # Also increment global counters for monitoring
+        counter_key = f"lightning:stats:{event_type}"
+        await redis.incr(counter_key)
+
+        logger.info("Lightning: Reward logged",
+                    session_id=session_id[:8],
+                    reward=reward,
+                    event_type=event_type)
+        return True
+
+    except Exception as e:
+        logger.warning("Lightning: Failed to log reward", error=str(e))
+        return False
+
+
+async def log_field_reward(
+    session_id: str,
+    fields: List[str]
+) -> bool:
+    """
+    Log a reward for successfully extracted listing fields.
+
+    Args:
+        session_id: Session identifier
+        fields: List of field names that were extracted
+
+    Returns:
+        True if logged successfully
+    """
+    if not fields:
+        return False
+
+    # Calculate reward: 0.1 per field
+    reward = REWARD_FIELD_EXTRACTED * len(fields)
+
+    return await log_reward(
+        session_id,
+        reward,
+        "field_extracted",
+        {"fields": fields, "field_count": len(fields)}
+    )
+
+
+async def log_negative_reward(
+    session_id: str,
+    event_type: str,
+    reason: str
+) -> bool:
+    """
+    Log a negative reward for errors or abandonment.
+
+    Args:
+        session_id: Session identifier
+        event_type: "error" or "abandoned"
+        reason: Description of what went wrong
+
+    Returns:
+        True if logged successfully
+    """
+    if event_type == "error":
+        reward = REWARD_CONVERSATION_ERROR
+    elif event_type == "abandoned":
+        reward = REWARD_CONVERSATION_ABANDONED
+    else:
+        reward = -0.1  # Generic negative
+
+    return await log_reward(
+        session_id,
+        reward,
+        f"conversation_{event_type}",
+        {"reason": reason}
+    )
+
+
+async def get_session_rewards(session_id: str) -> List[Dict[str, Any]]:
+    """
+    Get all rewards for a session.
+
+    Args:
+        session_id: Session identifier
+
+    Returns:
+        List of reward entries
+    """
+    if not _is_lightning_enabled():
+        return []
+
+    try:
+        redis = await _get_redis()
+        if not redis:
+            return []
+
+        key = f"lightning:rewards:{session_id}"
+        raw_rewards = await redis.lrange(key, 0, -1)
+
+        return [json.loads(r) for r in raw_rewards]
+
+    except Exception as e:
+        logger.warning("Lightning: Failed to get rewards", error=str(e))
+        return []
+
+
+async def get_total_session_reward(session_id: str) -> float:
+    """
+    Calculate the total reward for a session.
+
+    Args:
+        session_id: Session identifier
+
+    Returns:
+        Sum of all rewards
+    """
+    rewards = await get_session_rewards(session_id)
+    return sum(r.get("reward", 0) for r in rewards)
+
+
+async def get_lightning_stats() -> Dict[str, int]:
+    """
+    Get global Lightning statistics.
+
+    Returns:
+        Dict of event_type -> count
+    """
+    if not _is_lightning_enabled():
+        return {}
+
+    try:
+        redis = await _get_redis()
+        if not redis:
+            return {}
+
+        # Get all stats keys
+        stats_keys = await redis.keys("lightning:stats:*")
+
+        stats = {}
+        for key in stats_keys:
+            event_type = key.decode().split(":")[-1]
+            count = await redis.get(key)
+            stats[event_type] = int(count) if count else 0
+
+        return stats
+
+    except Exception as e:
+        logger.warning("Lightning: Failed to get stats", error=str(e))
+        return {}
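The storage model above is simple: each session's rewards live in a Redis list under `lightning:rewards:{session_id}` as JSON strings, and `get_total_session_reward` just decodes and sums them. A minimal sketch of that aggregation, with an in-memory list standing in for Redis (the entries are illustrative, not real data):

```python
import json

# Hypothetical reward entries as they would sit in the Redis list (JSON strings)
raw_rewards = [
    json.dumps({"reward": 0.1, "event_type": "field_extracted"}),
    json.dumps({"reward": 0.1, "event_type": "field_extracted"}),
    json.dumps({"reward": 1.0, "event_type": "listing_published"}),
]

# Mirrors get_session_rewards + get_total_session_reward: decode, then sum
rewards = [json.loads(r) for r in raw_rewards]
total = sum(r.get("reward", 0) for r in rewards)
print(round(total, 2))  # 1.2
```

Because rewards are per-event rather than per-session, partial progress (two extracted fields) and final success (publish) both contribute to the trajectory's total.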
app/ai/lightning/tracer.py ADDED
@@ -0,0 +1,326 @@
+# app/ai/lightning/tracer.py
+"""
+Agent Lightning Trajectory Tracer
+
+Captures state transitions and tool calls for RL training.
+Uses Redis for trajectory storage with automatic TTL cleanup.
+
+Design Principles:
+1. Zero overhead when disabled (LIGHTNING_ENABLED=false)
+2. Non-blocking async operations
+3. Graceful degradation on errors
+4. Compatible with the existing LangGraph architecture
+"""
+
+import json
+from datetime import datetime
+from typing import Any, Dict, List, Optional
+
+from structlog import get_logger
+
+logger = get_logger(__name__)
+
+# Redis connection (lazy initialization)
+_redis_client = None
+
+
+async def _get_redis():
+    """Get or create the Redis client for Lightning storage."""
+    global _redis_client
+    if _redis_client is None:
+        try:
+            from app.ai.memory.redis_memory import get_redis_client
+            _redis_client = await get_redis_client()
+        except Exception as e:
+            logger.warning("Lightning: Redis connection failed, trajectories will not be stored", error=str(e))
+            return None
+    return _redis_client
+
+
+def _is_lightning_enabled() -> bool:
+    """Check whether Lightning is enabled via config."""
+    try:
+        from app.config import settings
+        return getattr(settings, "LIGHTNING_ENABLED", False)
+    except Exception:
+        return False
+
+
+def _get_trajectory_ttl() -> int:
+    """Get the TTL for trajectories in seconds."""
+    try:
+        from app.config import settings
+        days = getattr(settings, "LIGHTNING_TRAJECTORY_TTL_DAYS", 30)
+        return days * 24 * 60 * 60  # Convert to seconds
+    except Exception:
+        return 30 * 24 * 60 * 60  # Default 30 days
+
+
+async def log_trajectory_step(
+    session_id: str,
+    step_type: str,
+    data: Dict[str, Any],
+    user_id: Optional[str] = None
+) -> bool:
+    """
+    Log a single trajectory step to Redis.
+
+    Args:
+        session_id: Unique session identifier
+        step_type: Type of step (user_input, brain_decision, tool_call, tool_result, response)
+        data: Step data (varies by type)
+        user_id: Optional user ID for filtering
+
+    Returns:
+        True if logged successfully, False otherwise
+    """
+    if not _is_lightning_enabled():
+        return False
+
+    try:
+        redis = await _get_redis()
+        if not redis:
+            return False
+
+        step = {
+            "timestamp": datetime.utcnow().isoformat(),
+            "step_type": step_type,
+            "data": data,
+            "user_id": user_id
+        }
+
+        # Store as a list under the session key
+        key = f"lightning:trajectory:{session_id}"
+        await redis.rpush(key, json.dumps(step))
+        await redis.expire(key, _get_trajectory_ttl())
+
+        logger.debug("Lightning: Trajectory step logged",
+                     session_id=session_id[:8],
+                     step_type=step_type)
+        return True
+
+    except Exception as e:
+        logger.warning("Lightning: Failed to log trajectory step", error=str(e))
+        return False
+
+
+async def get_session_trajectory(session_id: str) -> List[Dict[str, Any]]:
+    """
+    Retrieve the full trajectory for a session.
+
+    Args:
+        session_id: Session identifier
+
+    Returns:
+        List of trajectory steps
+    """
+    if not _is_lightning_enabled():
+        return []
+
+    try:
+        redis = await _get_redis()
+        if not redis:
+            return []
+
+        key = f"lightning:trajectory:{session_id}"
+        raw_steps = await redis.lrange(key, 0, -1)
+
+        return [json.loads(step) for step in raw_steps]
+
+    except Exception as e:
+        logger.warning("Lightning: Failed to get trajectory", error=str(e))
+        return []
+
+
+async def export_trajectories_for_training(
+    min_steps: int = 3,
+    max_trajectories: int = 1000,
+    only_completed: bool = True
+) -> List[Dict[str, Any]]:
+    """
+    Export trajectories for RL training.
+
+    Args:
+        min_steps: Minimum steps required per trajectory
+        max_trajectories: Maximum number to export
+        only_completed: Only include trajectories with rewards
+
+    Returns:
+        List of trajectories with their rewards
+    """
+    if not _is_lightning_enabled():
+        logger.warning("Lightning: Cannot export - Lightning not enabled")
+        return []
+
+    try:
+        redis = await _get_redis()
+        if not redis:
+            return []
+
+        # Get all trajectory keys
+        trajectory_keys = await redis.keys("lightning:trajectory:*")
+
+        trajectories = []
+        for key in trajectory_keys[:max_trajectories * 2]:  # Get extra to filter
+            session_id = key.decode().split(":")[-1]
+
+            # Get trajectory
+            raw_steps = await redis.lrange(key, 0, -1)
+            steps = [json.loads(step) for step in raw_steps]
+
+            if len(steps) < min_steps:
+                continue
+
+            # Get rewards for this session
+            reward_key = f"lightning:rewards:{session_id}"
+            raw_rewards = await redis.lrange(reward_key, 0, -1)
+            rewards = [json.loads(r) for r in raw_rewards]
+
+            if only_completed and not rewards:
+                continue
+
+            # Calculate total reward
+            total_reward = sum(r.get("reward", 0) for r in rewards)
+
+            trajectories.append({
+                "session_id": session_id,
+                "steps": steps,
+                "rewards": rewards,
+                "total_reward": total_reward,
+                "step_count": len(steps)
+            })
+
+            if len(trajectories) >= max_trajectories:
+                break
+
+        logger.info("Lightning: Exported trajectories for training", count=len(trajectories))
+        return trajectories
+
+    except Exception as e:
+        logger.error("Lightning: Failed to export trajectories", error=str(e))
+        return []
+
+
+def wrap_graph_if_enabled(compiled_graph):
+    """
+    Wrap a compiled LangGraph with trajectory logging.
+
+    This is a passthrough wrapper that logs trajectory steps
+    without modifying the graph's behavior.
+
+    Args:
+        compiled_graph: The compiled LangGraph
+
+    Returns:
+        Wrapped graph (or the original if Lightning is disabled)
+    """
+    if not _is_lightning_enabled():
+        logger.info("Lightning: Disabled - returning unwrapped graph")
+        return compiled_graph
+
+    logger.info("Lightning: Wrapping graph with trajectory capture")
+
+    # For now, return the original graph.
+    # Trajectory logging is done at the brain.py level;
+    # this wrapper is a hook for future enhancements.
+    return compiled_graph
+
+
+class TrajectoryContext:
+    """
+    Context manager for tracking a complete conversation trajectory.
+
+    Usage:
+        async with TrajectoryContext(session_id, user_id) as ctx:
+            await ctx.log_user_input(message)
+            await ctx.log_brain_decision(thinking, tool, params)
+            await ctx.log_tool_call(tool, success, message)
+            await ctx.log_response(response)
+    """
+
+    def __init__(self, session_id: str, user_id: Optional[str] = None):
+        self.session_id = session_id
+        self.user_id = user_id
+        self.start_time = None
+        self.enabled = _is_lightning_enabled()
+
+    async def __aenter__(self):
+        self.start_time = datetime.utcnow()
+        if self.enabled:
+            await log_trajectory_step(
+                self.session_id,
+                "session_start",
+                {"timestamp": self.start_time.isoformat()},
+                self.user_id
+            )
+        return self
+
+    async def __aexit__(self, exc_type, exc_val, exc_tb):
+        if self.enabled:
+            end_time = datetime.utcnow()
+            duration = (end_time - self.start_time).total_seconds()
+            await log_trajectory_step(
+                self.session_id,
+                "session_end",
+                {
+                    "duration_seconds": duration,
+                    "error": str(exc_val) if exc_val else None
+                },
+                self.user_id
+            )
+        return False  # Don't suppress exceptions
+
+    async def log_user_input(self, message: str, is_voice: bool = False):
+        """Log a user input step."""
+        if self.enabled:
+            await log_trajectory_step(
+                self.session_id,
+                "user_input",
+                {
+                    "message": message[:500],  # Truncate long messages
+                    "is_voice": is_voice
+                },
+                self.user_id
+            )
+
+    async def log_brain_decision(self, thinking: str, tool: Optional[str], params: Dict):
+        """Log a brain decision step."""
+        if self.enabled:
+            await log_trajectory_step(
+                self.session_id,
+                "brain_decision",
+                {
+                    "thinking": thinking[:200],  # Truncate
+                    "tool": tool,
+                    "params": {k: str(v)[:100] for k, v in params.items()} if params else {}
+                },
+                self.user_id
+            )
+
+    async def log_tool_call(self, tool: str, success: bool, message: str):
+        """Log a tool execution step."""
+        if self.enabled:
+            await log_trajectory_step(
+                self.session_id,
+                "tool_result",
+                {
+                    "tool": tool,
+                    "success": success,
+                    "message": message[:200]
+                },
+                self.user_id
+            )
+
+    async def log_response(self, response: str, action: Optional[str] = None):
+        """Log an AI response step."""
+        if self.enabled:
+            await log_trajectory_step(
+                self.session_id,
+                "response",
+                {
+                    "response": response[:500],
+                    "action": action
+                },
+                self.user_id
+            )
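The `TrajectoryContext` above relies on the async context-manager protocol to bracket every conversation with `session_start` / `session_end` steps, even when an exception escapes. The shape of that pattern can be sketched with an in-memory list standing in for the Redis-backed `log_trajectory_step` (class and names here are illustrative only):

```python
import asyncio

class InMemoryTrajectory:
    """Illustrative stand-in: appends steps to a list instead of Redis."""

    def __init__(self, session_id):
        self.session_id = session_id
        self.steps = []

    async def __aenter__(self):
        # Bracket the conversation: first step is always session_start
        self.steps.append({"step_type": "session_start"})
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        # Last step is always session_end, recording any escaping error
        self.steps.append({"step_type": "session_end",
                           "error": str(exc_val) if exc_val else None})
        return False  # don't suppress exceptions

    async def log_user_input(self, message):
        self.steps.append({"step_type": "user_input", "data": message[:500]})

async def main():
    async with InMemoryTrajectory("sess-123") as ctx:
        await ctx.log_user_input("3-bed in Lagos")
    return ctx.steps

steps = asyncio.run(main())
print([s["step_type"] for s in steps])
# ['session_start', 'user_input', 'session_end']
```

Returning `False` from `__aexit__` matters: the tracer records the failure in the trajectory but still lets the exception propagate to the caller.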
app/ai/services/__init__.py CHANGED
@@ -0,0 +1,52 @@
+# app/ai/services/__init__.py
+"""
+AI Services for AIDA
+
+Includes:
+- Search services (hybrid, MongoDB, Qdrant)
+- RLM (Recursive Language Model) for complex queries
+- Strategy selection
+- Intent classification
+- OpenStreetMap POI service for proximity searches
+"""
+
+from app.ai.services.rlm_query_analyzer import (
+    QueryComplexity,
+    QueryAnalysis,
+    analyze_query_complexity,
+    should_use_rlm
+)
+
+from app.ai.services.rlm_search_service import (
+    RLMSearchAgent,
+    get_rlm_agent,
+    rlm_search
+)
+
+from app.ai.services.osm_poi_service import (
+    find_pois,
+    find_pois_overpass,
+    geocode_location,
+    find_multiple_poi_types,
+    calculate_distance_km
+)
+
+__all__ = [
+    # RLM Query Analyzer
+    "QueryComplexity",
+    "QueryAnalysis",
+    "analyze_query_complexity",
+    "should_use_rlm",
+
+    # RLM Search Service
+    "RLMSearchAgent",
+    "get_rlm_agent",
+    "rlm_search",
+
+    # OpenStreetMap POI Service
+    "find_pois",
+    "find_pois_overpass",
+    "geocode_location",
+    "find_multiple_poi_types",
+    "calculate_distance_km",
+]
app/ai/services/osm_poi_service.py ADDED
@@ -0,0 +1,499 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # app/ai/services/osm_poi_service.py
2
+ """
3
+ OpenStreetMap POI (Point of Interest) Service for AIDA RLM.
4
+
5
+ Uses FREE OpenStreetMap APIs:
6
+ - Nominatim: Geocoding (location name β†’ coordinates)
7
+ - Overpass: POI search (find schools, hospitals, parks near a location)
8
+
9
+ No API key required! Just respect rate limits (1 request/second for Nominatim).
10
+
11
+ Supports:
12
+ - Schools, universities, colleges
13
+ - Hospitals, clinics, pharmacies
14
+ - Parks, gardens, beaches
15
+ - Markets, supermarkets, malls
16
+ - Airports, bus stations
17
+ - Mosques, churches
18
+ - And more...
19
+ """
20
+
21
+ import asyncio
22
+ import httpx
23
+ from typing import List, Dict, Optional, Tuple
24
+ from structlog import get_logger
25
+
26
+ logger = get_logger(__name__)
27
+
28
+ # Rate limiting: Nominatim requires max 1 request/second
29
+ _last_nominatim_request = 0
30
+
31
+
32
+ # =============================================================================
33
+ # OSM Tag Mappings
34
+ # =============================================================================
35
+
36
+ OSM_POI_TAGS = {
37
+ # Education
38
+ "school": "amenity=school",
39
+ "schools": "amenity=school",
40
+ "primary school": "amenity=school",
41
+ "secondary school": "amenity=school",
42
+ "high school": "amenity=school",
43
+ "university": "amenity=university",
44
+ "college": "amenity=college",
45
+ "kindergarten": "amenity=kindergarten",
46
+
47
+ # Healthcare
48
+ "hospital": "amenity=hospital",
49
+ "clinic": "amenity=clinic",
50
+ "pharmacy": "amenity=pharmacy",
51
+ "doctor": "amenity=doctors",
52
+
53
+ # Recreation & Nature
54
+ "beach": "natural=beach",
55
+ "park": "leisure=park",
56
+ "garden": "leisure=garden",
57
+ "playground": "leisure=playground",
58
+ "sports": "leisure=sports_centre",
59
+ "gym": "leisure=fitness_centre",
60
+ "swimming pool": "leisure=swimming_pool",
61
+ "stadium": "leisure=stadium",
62
+
63
+ # Shopping
64
+ "market": "amenity=marketplace",
65
+ "supermarket": "shop=supermarket",
66
+ "mall": "shop=mall",
67
+ "shopping center": "shop=mall",
68
+ "shop": "shop=supermarket",
69
+
70
+ # Transport
71
+ "airport": "aeroway=aerodrome",
72
+ "bus station": "amenity=bus_station",
73
+ "bus stop": "highway=bus_stop",
74
+ "train station": "railway=station",
75
+ "port": "amenity=ferry_terminal",
76
+
77
+ # Religious
78
+ "mosque": 'amenity=place_of_worship"][religion=muslim',
79
+ "church": 'amenity=place_of_worship"][religion=christian',
80
+ "cathedral": "building=cathedral",
81
+
82
+ # Food & Drink
83
+ "restaurant": "amenity=restaurant",
84
+ "cafe": "amenity=cafe",
85
+ "bar": "amenity=bar",
86
+
87
+ # Business & Services
88
+ "bank": "amenity=bank",
89
+ "atm": "amenity=atm",
90
+ "police": "amenity=police",
91
+ "post office": "amenity=post_office",
92
+ "embassy": "amenity=embassy",
93
+
94
+ # Landmarks
95
+ "downtown": "place=city_centre",
96
+ "city center": "place=city_centre",
97
+ "city centre": "place=city_centre",
98
+ }
99
+
100
+ # French translations
101
+ OSM_POI_TAGS_FR = {
102
+ "Γ©cole": "amenity=school",
103
+ "ecole": "amenity=school",
104
+ "lycΓ©e": "amenity=school",
105
+ "lycee": "amenity=school",
106
+ "collège": "amenity=school",
107
+ "college": "amenity=college",
108
+ "universitΓ©": "amenity=university",
109
+ "universite": "amenity=university",
110
+ "hΓ΄pital": "amenity=hospital",
111
+ "hopital": "amenity=hospital",
112
+ "clinique": "amenity=clinic",
113
+ "pharmacie": "amenity=pharmacy",
114
+ "plage": "natural=beach",
115
+ "parc": "leisure=park",
116
+ "jardin": "leisure=garden",
117
+ "marchΓ©": "amenity=marketplace",
118
+ "marche": "amenity=marketplace",
119
+ "supermarchΓ©": "shop=supermarket",
120
+ "aΓ©roport": "aeroway=aerodrome",
121
+ "aeroport": "aeroway=aerodrome",
122
+ "gare": "railway=station",
123
+ "mosquΓ©e": 'amenity=place_of_worship"][religion=muslim',
124
+ "mosquee": 'amenity=place_of_worship"][religion=muslim',
125
+ "Γ©glise": 'amenity=place_of_worship"][religion=christian',
126
+ "eglise": 'amenity=place_of_worship"][religion=christian',
127
+ "centre-ville": "place=city_centre",
128
+ }
129
+
130
+ # Merge all tags
131
+ ALL_POI_TAGS = {**OSM_POI_TAGS, **OSM_POI_TAGS_FR}
132
+
133
+
134
+ # =============================================================================
135
+ # Nominatim Geocoding
136
+ # =============================================================================
137
+
138
+ async def geocode_location(location: str) -> Optional[Tuple[float, float]]:
139
+ """
140
+ Convert location name to coordinates using Nominatim.
141
+
142
+ Args:
143
+ location: Location name (e.g., "Cotonou, Benin")
144
+
145
+ Returns:
146
+ Tuple of (latitude, longitude) or None if not found
147
+ """
148
+ global _last_nominatim_request
149
+
150
+ # Rate limiting: wait if needed
151
+ import time
152
+ now = time.time()
153
+ if now - _last_nominatim_request < 1:
154
+ await asyncio.sleep(1 - (now - _last_nominatim_request))
155
+ _last_nominatim_request = time.time()
156
+
157
+ try:
158
+ async with httpx.AsyncClient(timeout=15) as client:
159
+ response = await client.get(
160
+ "https://nominatim.openstreetmap.org/search",
161
+ params={
162
+ "q": location,
163
+ "format": "json",
164
+ "limit": 1,
165
+ "addressdetails": 1
166
+ },
167
+ headers={
168
+ "User-Agent": "AIDA-RealEstate/1.0 (contact@lojiz.com)"
169
+ }
170
+ )
171
+
172
+ if response.status_code != 200:
173
+ logger.error(f"Nominatim error: {response.status_code}")
174
+ return None
175
+
176
+ data = response.json()
177
+
178
+ if not data:
179
+ logger.warning(f"Location not found: {location}")
180
+ return None
181
+
182
+ lat = float(data[0]["lat"])
183
+ lon = float(data[0]["lon"])
184
+
185
+ logger.info(
186
+ "Geocoded location",
187
+ location=location,
188
+ lat=lat,
189
+ lon=lon
190
+ )
191
+
192
+ return (lat, lon)
193
+
194
+ except Exception as e:
195
+ logger.error(f"Geocoding failed: {e}")
196
+ return None
197
+
198
+
199
+ # =============================================================================
200
+ # Overpass POI Search
201
+ # =============================================================================
202
+
203
+ async def find_pois_overpass(
204
+ poi_type: str,
205
+ center_lat: float,
206
+ center_lon: float,
207
+ radius_km: float = 5
208
+ ) -> List[Dict]:
209
+ """
210
+ Find POIs near a location using Overpass API.
211
+
212
+ Args:
213
+ poi_type: Type of POI (school, hospital, beach, etc.)
214
+ center_lat: Center latitude
215
+ center_lon: Center longitude
216
+ radius_km: Search radius in kilometers
217
+
218
+ Returns:
219
+ List of POI dicts with name, lat, lon, type
220
+ """
221
+ # Get OSM tag for this POI type
222
+ poi_lower = poi_type.lower().strip()
223
+ osm_tag = ALL_POI_TAGS.get(poi_lower)
224
+
225
+ if not osm_tag:
226
+ # Try partial matching
227
+ for key, tag in ALL_POI_TAGS.items():
228
+ if poi_lower in key or key in poi_lower:
229
+ osm_tag = tag
230
+ break
231
+
232
+ if not osm_tag:
233
+ # Default to amenity search
234
+ osm_tag = f"amenity={poi_lower}"
235
+ logger.warning(f"Unknown POI type '{poi_type}', using default: {osm_tag}")
236
+
237
+ # Build Overpass QL query
238
+ radius_meters = radius_km * 1000
239
+
240
+ query = f"""
241
+ [out:json][timeout:25];
242
+ (
243
+ node[{osm_tag}](around:{radius_meters},{center_lat},{center_lon});
244
+ way[{osm_tag}](around:{radius_meters},{center_lat},{center_lon});
245
+ relation[{osm_tag}](around:{radius_meters},{center_lat},{center_lon});
246
+ );
247
+ out center tags;
248
+ """
249
+
250
+ try:
251
+ async with httpx.AsyncClient(timeout=30) as client:
252
+ response = await client.post(
253
+ "https://overpass-api.de/api/interpreter",
254
+ data={"data": query},
255
+ headers={
256
+ "User-Agent": "AIDA-RealEstate/1.0"
257
+ }
258
+ )
259
+
260
+ if response.status_code != 200:
261
+ logger.error(f"Overpass error: {response.status_code}")
262
+ return []
263
+
264
+ data = response.json()
265
+
266
+ except Exception as e:
267
+ logger.error(f"Overpass query failed: {e}")
268
+ return []
269
+
270
+ # Parse results
271
+ pois = []
272
+ for element in data.get("elements", []):
273
+ # Get coordinates
274
+ if element["type"] == "node":
275
+ lat = element.get("lat")
276
+ lon = element.get("lon")
277
+ else:
278
+ # For ways/relations, use center
279
+ center = element.get("center", {})
280
+ lat = center.get("lat")
281
+ lon = center.get("lon")
282
+
283
+ if not lat or not lon:
284
+ continue
285
+
286
+ tags = element.get("tags", {})
287
+
288
+ # Build POI entry
289
+ poi = {
290
+ "name": tags.get("name", f"{poi_type.title()} (unnamed)"),
291
+ "lat": lat,
292
+ "lon": lon,
293
+ "type": poi_type,
294
+ "osm_id": element.get("id"),
295
+ "osm_type": element.get("type"),
296
+ }
297
+
298
+ # Add extra info if available
299
+ if tags.get("addr:street"):
300
+ poi["address"] = f"{tags.get('addr:housenumber', '')} {tags['addr:street']}".strip()
301
+ if tags.get("website"):
302
+ poi["website"] = tags["website"]
303
+ if tags.get("phone"):
304
+ poi["phone"] = tags["phone"]
305
+
306
+ pois.append(poi)
307
+
308
+ logger.info(
309
+ "Found POIs",
310
+ poi_type=poi_type,
311
+ count=len(pois),
312
+ radius_km=radius_km
313
+ )
314
+
315
+ return pois
316
+
317
+
318
+ # =============================================================================
319
+ # Main Function: Find POIs by Location Name
320
+ # =============================================================================
321
+
322
+ async def find_pois(
323
+ poi_type: str,
324
+ location: str,
325
+ radius_km: float = 5,
326
+ limit: int = 10
327
+ ) -> List[Dict]:
328
+ """
329
+ Find POIs near a location (main entry point).
330
+
331
+    Args:
+        poi_type: Type of POI (school, hospital, beach, park, etc.)
+        location: Location name (e.g., "Cotonou", "Calavi, Benin")
+        radius_km: Search radius in kilometers (default 5)
+        limit: Maximum number of results (default 10)
+
+    Returns:
+        List of POI dicts:
+        [
+            {
+                "name": "Collège Père Aupiais",
+                "lat": 6.3654,
+                "lon": 2.4183,
+                "type": "school",
+                "osm_id": 12345678,
+                "address": "Rue de l'École"
+            },
+            ...
+        ]
+
+    Example:
+        pois = await find_pois("school", "Cotonou, Benin", radius_km=3)
+    """
+    # Step 1: Geocode the location
+    coords = await geocode_location(location)
+
+    if not coords:
+        logger.warning(f"Could not geocode location: {location}")
+        return []
+
+    center_lat, center_lon = coords
+
+    # Step 2: Find POIs near those coordinates
+    pois = await find_pois_overpass(
+        poi_type=poi_type,
+        center_lat=center_lat,
+        center_lon=center_lon,
+        radius_km=radius_km
+    )
+
+    # Limit results
+    return pois[:limit]
+
+
+ # =============================================================================
+ # Batch POI Search
+ # =============================================================================
+
+ async def find_multiple_poi_types(
+     poi_types: List[str],
+     location: str,
+     radius_km: float = 5
+ ) -> Dict[str, List[Dict]]:
+     """
+     Find multiple types of POIs at once.
+
+     Args:
+         poi_types: List of POI types (e.g., ["school", "hospital", "park"])
+         location: Location name
+         radius_km: Search radius in kilometers (default 5)
+
+     Returns:
+         Dict mapping POI type to list of POIs:
+         {
+             "school": [...],
+             "hospital": [...],
+             "park": [...]
+         }
+     """
+     # Geocode once
+     coords = await geocode_location(location)
+
+     if not coords:
+         return {poi_type: [] for poi_type in poi_types}
+
+     center_lat, center_lon = coords
+
+     # Search each POI type in parallel
+     async def search_poi(poi_type: str):
+         return poi_type, await find_pois_overpass(
+             poi_type, center_lat, center_lon, radius_km
+         )
+
+     results = await asyncio.gather(*[search_poi(pt) for pt in poi_types])
+
+     return {poi_type: pois for poi_type, pois in results}
+
+
+ # =============================================================================
+ # Utility: Calculate Distance
+ # =============================================================================
+
+ def calculate_distance_km(
+     lat1: float,
+     lon1: float,
+     lat2: float,
+     lon2: float
+ ) -> float:
+     """
+     Calculate distance between two points using the Haversine formula.
+
+     Returns distance in kilometers.
+     """
+     import math
+
+     R = 6371  # Earth's radius in km
+
+     lat1_rad = math.radians(lat1)
+     lat2_rad = math.radians(lat2)
+     delta_lat = math.radians(lat2 - lat1)
+     delta_lon = math.radians(lon2 - lon1)
+
+     a = (math.sin(delta_lat / 2) ** 2 +
+          math.cos(lat1_rad) * math.cos(lat2_rad) * math.sin(delta_lon / 2) ** 2)
+     c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
+
+     return R * c
+
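As a quick sanity check, the Haversine helper can be reproduced standalone (same formula as `calculate_distance_km` above, stdlib `math` only, no app imports):

```python
import math

def calculate_distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km via the Haversine formula."""
    R = 6371  # Earth's mean radius in km
    lat1_rad, lat2_rad = math.radians(lat1), math.radians(lat2)
    delta_lat = math.radians(lat2 - lat1)
    delta_lon = math.radians(lon2 - lon1)
    a = (math.sin(delta_lat / 2) ** 2
         + math.cos(lat1_rad) * math.cos(lat2_rad) * math.sin(delta_lon / 2) ** 2)
    return R * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))

# One degree of latitude is ~111.19 km at this Earth radius
print(round(calculate_distance_km(0.0, 0.0, 1.0, 0.0), 2))  # 111.19
```

With R = 6371, one degree of latitude works out to about 111.19 km, which is a convenient spot check for the formula.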
+
+ # =============================================================================
+ # Test Function
+ # =============================================================================
+
+ async def test_osm_service():
+     """Test the OSM POI service."""
+     print("\n" + "=" * 60)
+     print("Testing OpenStreetMap POI Service")
+     print("=" * 60 + "\n")
+
+     # Test 1: Geocoding
+     print("Test 1: Geocoding 'Cotonou, Benin'")
+     coords = await geocode_location("Cotonou, Benin")
+     if coords:
+         print(f"   ✅ Found: {coords}")
+     else:
+         print("   ❌ Failed")
+
+     # Test 2: Find schools
+     print("\nTest 2: Find schools in Cotonou")
+     schools = await find_pois("school", "Cotonou, Benin", radius_km=3)
+     print(f"   Found {len(schools)} schools:")
+     for school in schools[:5]:
+         print(f"   - {school['name']} ({school['lat']:.4f}, {school['lon']:.4f})")
+
+     # Test 3: Find hospitals
+     print("\nTest 3: Find hospitals in Cotonou")
+     hospitals = await find_pois("hospital", "Cotonou, Benin", radius_km=5)
+     print(f"   Found {len(hospitals)} hospitals:")
+     for hospital in hospitals[:3]:
+         print(f"   - {hospital['name']} ({hospital['lat']:.4f}, {hospital['lon']:.4f})")
+
+     # Test 4: Find markets
+     print("\nTest 4: Find markets in Cotonou")
+     markets = await find_pois("market", "Cotonou, Benin", radius_km=3)
+     print(f"   Found {len(markets)} markets:")
+     for market in markets[:3]:
+         print(f"   - {market['name']} ({market['lat']:.4f}, {market['lon']:.4f})")
+
+     # Test 5: French POI type
+     print("\nTest 5: Find 'école' (French) in Cotonou")
+     ecoles = await find_pois("école", "Cotonou, Benin", radius_km=3)
+     print(f"   Found {len(ecoles)} écoles")
+
+     print("\n" + "=" * 60)
+     print("OSM Service Tests Complete!")
+     print("=" * 60 + "\n")
+
+
+ if __name__ == "__main__":
+     asyncio.run(test_osm_service())
app/ai/services/rlm_query_analyzer.py ADDED
@@ -0,0 +1,287 @@
+ # app/ai/services/rlm_query_analyzer.py
+ """
+ RLM Query Analyzer - Detects complex queries that need recursive reasoning.
+
+ Identifies:
+ - Multi-hop queries: "near schools", "close to beach"
+ - Boolean OR queries: "under 500k OR has pool"
+ - Comparative queries: "compare Cotonou vs Calavi"
+ - Aggregation queries: "average price", "how many"
+ - Multi-factor queries: "best family apartment near schools and parks"
+ """
+
+ import re
+ from typing import Dict, List, Literal, Optional
+ from enum import Enum
+ from structlog import get_logger
+ from pydantic import BaseModel
+
+ logger = get_logger(__name__)
+
+
+ class QueryComplexity(str, Enum):
+     """Types of complex queries that RLM can handle"""
+     SIMPLE = "simple"              # Standard single-hop search
+     MULTI_HOP = "multi_hop"        # "near X", "close to Y"
+     BOOLEAN_OR = "boolean_or"      # "A OR B"
+     COMPARATIVE = "comparative"    # "compare A vs B"
+     AGGREGATION = "aggregation"    # "average", "total", "count"
+     MULTI_FACTOR = "multi_factor"  # Multiple ranking criteria
+
+
+ class QueryAnalysis(BaseModel):
+     """Result of query analysis"""
+     complexity: QueryComplexity
+     confidence: float  # 0.0 to 1.0
+     reasoning: str
+     detected_patterns: List[str]
+     sub_query_hints: List[str]  # Hints for decomposition
+     use_rlm: bool
+
+
+ # Pattern definitions for each complexity type
+ MULTI_HOP_PATTERNS = [
+     r"\bnear\b",
+     r"\bclose to\b",
+     r"\bnearby\b",
+     r"\bwalking distance\b",
+     r"\bwithin \d+ ?(?:km|m|meters|miles|minutes)\b",
+     r"\baround\b",
+     r"\bproximity\b",
+     r"\bnext to\b",
+     r"\bbeside\b",
+     r"\bopposite\b",
+     r"\bacross from\b",
+     # French equivalents
+     r"\bprès de\b",
+     r"\bà côté de\b",
+     r"\bproche de\b",
+     r"\baux alentours\b",
+ ]
+
+ BOOLEAN_OR_PATTERNS = [
+     r"\bor\b",
+     r"\beither\b",
+     r"\balternatively\b",
+     r"\botherwise\b",
+     # French
+     r"\bou\b",
+     r"\bsoit\b",
+ ]
+
+ COMPARATIVE_PATTERNS = [
+     r"\bcompare\b",
+     r"\bvs\.?\b",
+     r"\bversus\b",
+     r"\bdifference between\b",
+     r"\bcheaper\b",
+     r"\bmore expensive\b",
+     r"\bbetter\b",
+     r"\bwhich is\b",
+     # French
+     r"\bcomparer\b",
+     r"\bentre\b",
+     r"\bmoins cher\b",
+     r"\bplus cher\b",
+ ]
+
+ AGGREGATION_PATTERNS = [
+     r"\baverage\b",
+     r"\bmean\b",
+     r"\btotal\b",
+     r"\bcount\b",
+     r"\bhow many\b",
+     r"\bsum\b",
+     r"\bstatistics\b",
+     r"\brange\b",
+     r"\bmin(?:imum)?\b",
+     r"\bmax(?:imum)?\b",
+     # French
+     r"\bmoyenne\b",
+     r"\bcombien\b",
+     r"\btotal\b",
+ ]
+
+ MULTI_FACTOR_PATTERNS = [
+     r"\bbest\b",
+     r"\btop\b",
+     r"\bideal\b",
+     r"\bperfect\b",
+     r"\brecommend\b",
+     r"\bsuitable\b",
+     r"\bfamily.?friendly\b",
+     r"\bsafe\b",
+     r"\bquiet\b",
+     r"\bpeaceful\b",
+     # Combined criteria indicators
+     r"\band\b.*\band\b",  # Multiple ANDs suggest multi-factor
+     # French
+     r"\bmeilleur\b",
+     r"\bidéal\b",
+     r"\brecommandé\b",
+     r"\bfamilial\b",
+     r"\bsécurisé\b",
+ ]
+
+ # Points of Interest that trigger multi-hop search
+ POI_KEYWORDS = [
+     # Education
+     "school", "university", "college", "campus", "école", "université",
+     # Health
+     "hospital", "clinic", "pharmacy", "hôpital", "clinique",
+     # Recreation
+     "beach", "park", "garden", "gym", "plage", "parc", "jardin",
+     # Shopping
+     "mall", "market", "supermarket", "marché", "supermarché",
+     # Transport
+     "airport", "station", "bus stop", "aéroport", "gare",
+     # Business
+     "downtown", "city center", "business district", "centre-ville",
+     # Landmarks
+     "mosque", "church", "cathedral", "mosquée", "église",
+ ]
+
+
+ def analyze_query_complexity(query: str) -> QueryAnalysis:
+     """
+     Analyze a search query to determine if it needs RLM processing.
+
+     Args:
+         query: User's search query
+
+     Returns:
+         QueryAnalysis with complexity type and recommendations
+     """
+     query_lower = query.lower()
+     detected_patterns = []
+     sub_query_hints = []
+     scores = {
+         QueryComplexity.MULTI_HOP: 0.0,
+         QueryComplexity.BOOLEAN_OR: 0.0,
+         QueryComplexity.COMPARATIVE: 0.0,
+         QueryComplexity.AGGREGATION: 0.0,
+         QueryComplexity.MULTI_FACTOR: 0.0,
+     }
+
+     # Check for multi-hop patterns
+     for pattern in MULTI_HOP_PATTERNS:
+         if re.search(pattern, query_lower, re.IGNORECASE):
+             scores[QueryComplexity.MULTI_HOP] += 0.4
+             detected_patterns.append(f"proximity: {pattern}")
+
+     # Check for POI keywords (boost multi-hop if found with proximity)
+     poi_found = []
+     for poi in POI_KEYWORDS:
+         if poi.lower() in query_lower:
+             poi_found.append(poi)
+             if scores[QueryComplexity.MULTI_HOP] > 0:
+                 scores[QueryComplexity.MULTI_HOP] += 0.3
+                 sub_query_hints.append(f"Find {poi} locations first")
+
+     if poi_found:
+         detected_patterns.append(f"POI: {', '.join(poi_found)}")
+
+     # Check for boolean OR patterns
+     for pattern in BOOLEAN_OR_PATTERNS:
+         if re.search(pattern, query_lower, re.IGNORECASE):
+             scores[QueryComplexity.BOOLEAN_OR] += 0.5
+             detected_patterns.append(f"boolean: {pattern}")
+
+     # Try to extract OR branches
+     parts = re.split(r'\bor\b|\bou\b', query_lower, flags=re.IGNORECASE)
+     if len(parts) > 1:
+         for i, part in enumerate(parts):
+             sub_query_hints.append(f"Branch {i+1}: {part.strip()}")
+
+     # Check for comparative patterns
+     for pattern in COMPARATIVE_PATTERNS:
+         if re.search(pattern, query_lower, re.IGNORECASE):
+             scores[QueryComplexity.COMPARATIVE] += 0.5
+             detected_patterns.append(f"comparative: {pattern}")
+
+     # Try to extract comparison subjects
+     vs_match = re.search(r'(\w+)\s+(?:vs\.?|versus|or)\s+(\w+)', query_lower)
+     if vs_match:
+         sub_query_hints.append(f"Compare: {vs_match.group(1)} vs {vs_match.group(2)}")
+
+     # Check for aggregation patterns
+     for pattern in AGGREGATION_PATTERNS:
+         if re.search(pattern, query_lower, re.IGNORECASE):
+             scores[QueryComplexity.AGGREGATION] += 0.5
+             detected_patterns.append(f"aggregation: {pattern}")
+             sub_query_hints.append("Fetch all matching listings, then aggregate")
+
+     # Check for multi-factor patterns
+     multi_factor_count = 0
+     for pattern in MULTI_FACTOR_PATTERNS:
+         if re.search(pattern, query_lower, re.IGNORECASE):
+             multi_factor_count += 1
+             detected_patterns.append(f"multi-factor: {pattern}")
+
+     # If 2+ factors detected, it's multi-factor
+     if multi_factor_count >= 2:
+         scores[QueryComplexity.MULTI_FACTOR] += 0.3 * multi_factor_count
+         sub_query_hints.append("Evaluate each factor separately, then combine scores")
+
+     # Determine dominant complexity type
+     max_score = max(scores.values())
+
+     if max_score < 0.3:
+         # Simple query - no RLM needed
+         return QueryAnalysis(
+             complexity=QueryComplexity.SIMPLE,
+             confidence=1.0 - max_score,
+             reasoning="No complex patterns detected, standard search sufficient",
+             detected_patterns=detected_patterns,
+             sub_query_hints=[],
+             use_rlm=False
+         )
+
+     # Find the complexity type with highest score
+     dominant_type = max(scores, key=scores.get)
+     confidence = min(scores[dominant_type], 1.0)
+
+     # Build reasoning
+     reasoning_map = {
+         QueryComplexity.MULTI_HOP: f"Query requires finding POI locations first, then searching nearby. POIs: {poi_found}",
+         QueryComplexity.BOOLEAN_OR: "Query has OR logic requiring separate searches and union",
+         QueryComplexity.COMPARATIVE: "Query requires searching multiple locations and comparing results",
+         QueryComplexity.AGGREGATION: "Query requires aggregating data across listings",
+         QueryComplexity.MULTI_FACTOR: f"Query has {multi_factor_count} ranking factors requiring weighted scoring",
+     }
+
+     logger.info(
+         "Query analyzed",
+         complexity=dominant_type.value,
+         confidence=confidence,
+         patterns=len(detected_patterns),
+         use_rlm=True
+     )
+
+     return QueryAnalysis(
+         complexity=dominant_type,
+         confidence=confidence,
+         reasoning=reasoning_map.get(dominant_type, "Complex query detected"),
+         detected_patterns=detected_patterns,
+         sub_query_hints=sub_query_hints,
+         use_rlm=True
+     )
+
+
+ async def should_use_rlm(query: str) -> bool:
+     """
+     Quick check if query should use RLM.
+
+     Returns True if query is complex enough for RLM.
+     """
+     analysis = analyze_query_complexity(query)
+     return analysis.use_rlm
+
+
+ # Export for use in other modules
+ __all__ = [
+     "QueryComplexity",
+     "QueryAnalysis",
+     "analyze_query_complexity",
+     "should_use_rlm"
+ ]
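The scoring idea behind `analyze_query_complexity` can be sketched in isolation. This is a trimmed toy version with only two pattern groups and the same 0.3 threshold; the real module adds French patterns, POI boosting, and a Pydantic result model:

```python
import re

# Two of the pattern groups from the analyzer, abbreviated
MULTI_HOP = [r"\bnear\b", r"\bclose to\b", r"\bwithin \d+ ?(?:km|m|minutes)\b"]
BOOLEAN_OR = [r"\bor\b", r"\beither\b"]

def classify(query: str) -> str:
    """Score each complexity type by matched patterns; pick the dominant one."""
    q = query.lower()
    scores = {
        "multi_hop": sum(0.4 for p in MULTI_HOP if re.search(p, q)),
        "boolean_or": sum(0.5 for p in BOOLEAN_OR if re.search(p, q)),
    }
    best = max(scores, key=scores.get)
    # Below the 0.3 threshold the query falls through to a standard search
    return best if scores[best] >= 0.3 else "simple"

print(classify("3-bed apartment near schools"))    # multi_hop
print(classify("under 500k or has a pool"))        # boolean_or
print(classify("2-bedroom apartment in Cotonou"))  # simple
```

The word boundaries (`\b`) matter here: without them, `\bor\b` would fire on words like "floor" or "bedroom" and misroute ordinary queries into the OR handler.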
app/ai/services/rlm_search_service.py ADDED
@@ -0,0 +1,1202 @@
+ # app/ai/services/rlm_search_service.py
+ """
+ RLM (Recursive Language Model) Search Service for AIDA.
+
+ Implements multi-hop reasoning for complex search queries using
+ recursive decomposition and aggregation.
+
+ Key Features:
+ - Multi-hop proximity search ("near schools", "close to beach")
+ - Boolean OR query handling ("under 500k OR has pool")
+ - Comparative analysis ("compare Cotonou vs Calavi")
+ - Aggregation queries ("average price in Cotonou")
+ - Multi-factor ranking ("best family apartment near schools and parks")
+
+ Uses existing DeepSeek LLM (brain_llm) - no additional infrastructure needed.
+ """
+
+ import json
+ import asyncio
+ from typing import Dict, List, Any, Optional, Tuple
+ from structlog import get_logger
+ from langchain_openai import ChatOpenAI
+ from langchain_core.messages import SystemMessage, HumanMessage
+
+ from app.config import settings
+ from app.ai.services.rlm_query_analyzer import (
+     QueryComplexity,
+     QueryAnalysis,
+     analyze_query_complexity
+ )
+
+ logger = get_logger(__name__)
+
+
+ # Use existing DeepSeek LLM configuration
+ rlm_llm = ChatOpenAI(
+     api_key=settings.DEEPSEEK_API_KEY,
+     base_url=settings.DEEPSEEK_BASE_URL,
+     model="deepseek-chat",
+     temperature=0.3,  # Lower temp for more deterministic decomposition
+ )
+
+
+ # =============================================================================
+ # RLM CORE: Recursive Search Agent
+ # =============================================================================
+
+ class RLMSearchAgent:
+     """
+     Recursive Language Model Search Agent.
+
+     Decomposes complex queries into sub-queries, executes them recursively,
+     and aggregates results using LLM reasoning.
+
+     Example:
+         Query: "3-bed apartment near international schools in Cotonou under 500k"
+
+         RLM Flow:
+         1. Decompose: ["Find schools in Cotonou", "Find 3-bed under 500k near schools"]
+         2. Execute: Search schools → Get coordinates → Search apartments nearby
+         3. Aggregate: Rank by proximity to schools
+     """
+
+     def __init__(self):
+         self.llm = rlm_llm
+         self.max_depth = 3
+         self.call_count = 0
+         self.search_cache = {}  # Cache sub-query results
+
+     async def search(
+         self,
+         query: str,
+         context: Optional[Dict] = None,
+         analysis: Optional[QueryAnalysis] = None
+     ) -> Dict[str, Any]:
+         """
+         Main entry point for RLM search.
+
+         Args:
+             query: User's search query
+             context: Optional context (user location, previous results, etc.)
+             analysis: Optional pre-computed query analysis
+
+         Returns:
+             Dict with:
+             - results: List of matching listings
+             - strategy_used: RLM strategy name
+             - reasoning_steps: List of reasoning steps taken
+             - call_count: Number of LLM calls made
+         """
+         self.call_count = 0
+
+         # Analyze query if not provided
+         if analysis is None:
+             analysis = analyze_query_complexity(query)
+
+         logger.info(
+             "RLM search started",
+             query=query[:50],
+             complexity=analysis.complexity.value,
+             confidence=analysis.confidence
+         )
+
+         # Route to appropriate handler based on complexity
+         handler_map = {
+             QueryComplexity.MULTI_HOP: self._handle_multi_hop,
+             QueryComplexity.BOOLEAN_OR: self._handle_boolean_or,
+             QueryComplexity.COMPARATIVE: self._handle_comparative,
+             QueryComplexity.AGGREGATION: self._handle_aggregation,
+             QueryComplexity.MULTI_FACTOR: self._handle_multi_factor,
+             QueryComplexity.SIMPLE: self._handle_simple,
+         }
+
+         handler = handler_map.get(analysis.complexity, self._handle_simple)
+
+         try:
+             results = await handler(query, context or {}, analysis)
+
+             logger.info(
+                 "RLM search complete",
+                 query=query[:50],
+                 result_count=len(results.get("results", [])),
+                 call_count=self.call_count
+             )
+
+             return {
+                 **results,
+                 "strategy_used": f"RLM_{analysis.complexity.value.upper()}",
+                 "call_count": self.call_count,
+                 "analysis": analysis.model_dump()
+             }
+
+         except Exception as e:
+             logger.error("RLM search failed", error=str(e), query=query[:50])
+             # Fallback to simple search
+             return await self._handle_simple(query, context or {}, analysis)
+
+     # =========================================================================
+     # Handler: Multi-hop Queries ("near X", "close to Y")
+     # =========================================================================
+
+     async def _handle_multi_hop(
+         self,
+         query: str,
+         context: Dict,
+         analysis: QueryAnalysis
+     ) -> Dict[str, Any]:
+         """
+         Handle multi-hop proximity queries.
+
+         Example: "3-bed apartment near international schools in Cotonou"
+
+         Steps:
+         1. Extract POI type (schools) and location (Cotonou)
+         2. Find POI coordinates (schools in Cotonou)
+         3. Search listings near POI coordinates
+         4. Rank by proximity
+         """
+         reasoning_steps = []
+
+         # Step 1: Decompose query to extract POI and criteria
+         decomposition_prompt = f"""
+ Analyze this real estate search query and extract the proximity components:
+
+ Query: "{query}"
+
+ Extract:
+ 1. POI (Point of Interest) type: What the user wants to be near (school, beach, park, etc.)
+ 2. Location: The city/area being searched
+ 3. Listing criteria: bedrooms, price, amenities, etc.
+
+ Return JSON:
+ {{
+     "poi_type": "school" or "beach" or "park" or "hospital" or "market" or "airport" or null,
+     "poi_name": "specific name if mentioned" or null,
+     "location": "city or area name",
+     "listing_criteria": {{
+         "bedrooms": number or null,
+         "max_price": number or null,
+         "min_price": number or null,
+         "amenities": ["list"] or [],
+         "listing_type": "rent" or "sale" or null
+     }},
+     "proximity_km": 2  // default proximity radius in km
+ }}
+ """
+         self.call_count += 1
+         decomp_response = await self.llm.ainvoke([
+             HumanMessage(content=decomposition_prompt)
+         ])
+
+         try:
+             decomposition = self._extract_json(decomp_response.content)
+         except Exception:
+             logger.error("Failed to parse decomposition, falling back to simple search")
+             return await self._handle_simple(query, context, analysis)
+
+         reasoning_steps.append({
+             "step": "decomposition",
+             "result": decomposition
+         })
+
+         poi_type = decomposition.get("poi_type")
+         location = decomposition.get("location")
+         criteria = decomposition.get("listing_criteria", {})
+         proximity_km = decomposition.get("proximity_km", 2)
+
+         # Step 2: Find POI coordinates
+         poi_locations = []
+         if poi_type and location:
+             poi_locations = await self._find_poi_locations(
+                 poi_type,
+                 location,
+                 decomposition.get("poi_name")
+             )
+             reasoning_steps.append({
+                 "step": "find_poi",
+                 "poi_type": poi_type,
+                 "location": location,
+                 "found": len(poi_locations)
+             })
+
+         # Step 3: Search listings near POI locations
+         if poi_locations:
+             # Search near each POI and aggregate
+             all_listings = []
+             for poi in poi_locations[:3]:  # Limit to top 3 POIs
+                 nearby_listings = await self._search_near_coordinates(
+                     lat=poi["lat"],
+                     lon=poi["lon"],
+                     radius_km=proximity_km,
+                     criteria=criteria,
+                     location=location
+                 )
+                 # Add distance info to each listing
+                 for listing in nearby_listings:
+                     listing["_poi_name"] = poi.get("name", poi_type)
+                     listing["_distance_km"] = self._calculate_distance(
+                         poi["lat"], poi["lon"],
+                         listing.get("latitude"), listing.get("longitude")
+                     )
+                 all_listings.extend(nearby_listings)
+
+             # Deduplicate by listing ID
+             seen_ids = set()
+             unique_listings = []
+             for listing in all_listings:
+                 lid = str(listing.get("_id") or listing.get("mongo_id"))
+                 if lid not in seen_ids:
+                     seen_ids.add(lid)
+                     unique_listings.append(listing)
+
+             # Sort by distance
+             unique_listings.sort(key=lambda x: x.get("_distance_km", 999))
+
+             reasoning_steps.append({
+                 "step": "proximity_search",
+                 "poi_count": len(poi_locations),
+                 "listings_found": len(unique_listings)
+             })
+
+             return {
+                 "results": unique_listings[:10],
+                 "reasoning_steps": reasoning_steps,
+                 "message": f"Found {len(unique_listings)} listings near {poi_type}s in {location}"
+             }
+
+         else:
+             # No POI found, fall back to semantic search with location
+             logger.warning("No POI locations found, using semantic search")
+             return await self._semantic_search_with_criteria(query, location, criteria)
+
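The deduplicate-and-rank step at the end of the multi-hop handler can be exercised on its own. The listing dicts below are hypothetical stand-ins for the MongoDB documents the real code receives:

```python
def dedupe_and_rank(listings: list[dict]) -> list[dict]:
    """Deduplicate listings by id, then sort by distance to the POI."""
    seen, unique = set(), []
    for listing in listings:
        lid = str(listing.get("_id") or listing.get("mongo_id"))
        if lid not in seen:
            seen.add(lid)
            unique.append(listing)
    # Listings without a computed distance sink to the bottom (999 sentinel)
    unique.sort(key=lambda x: x.get("_distance_km", 999))
    return unique

listings = [
    {"_id": "a", "_distance_km": 1.8},
    {"_id": "b", "_distance_km": 0.4},
    {"_id": "a", "_distance_km": 1.8},  # duplicate from a second POI search
    {"_id": "c"},                       # no coordinates, so no distance
]
print([l["_id"] for l in dedupe_and_rank(listings)])  # ['b', 'a', 'c']
```

Deduplication matters here because the handler searches near up to three POIs, so the same listing can appear in several of the per-POI result sets.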
273
+ # =========================================================================
274
+ # Handler: Boolean OR Queries
275
+ # =========================================================================
276
+
277
+ async def _handle_boolean_or(
278
+ self,
279
+ query: str,
280
+ context: Dict,
281
+ analysis: QueryAnalysis
282
+ ) -> Dict[str, Any]:
283
+ """
284
+ Handle queries with OR logic.
285
+
286
+ Example: "Under 500k XOF OR (2-bedroom AND has pool)"
287
+
288
+ Steps:
289
+ 1. Parse OR branches
290
+ 2. Execute each branch in parallel
291
+ 3. Union results
292
+ """
293
+ reasoning_steps = []
294
+
295
+ # Step 1: Parse OR branches
296
+ parse_prompt = f"""
297
+ Parse this real estate query into separate OR branches:
298
+
299
+ Query: "{query}"
300
+
301
+ Return JSON:
302
+ {{
303
+ "branches": [
304
+ {{
305
+ "description": "human-readable description",
306
+ "criteria": {{
307
+ "location": "city" or null,
308
+ "max_price": number or null,
309
+ "min_price": number or null,
310
+ "bedrooms": number or null,
311
+ "amenities": ["list"] or [],
312
+ "listing_type": "rent" or "sale" or null
313
+ }}
314
+ }}
315
+ ],
316
+ "shared_criteria": {{
317
+ // Criteria that apply to ALL branches (e.g., location)
318
+ "location": "city" or null
319
+ }}
320
+ }}
321
+
322
+ Example for "Under 500k OR (2-bed AND pool) in Cotonou":
323
+ {{
324
+ "branches": [
325
+ {{"description": "Under 500k", "criteria": {{"max_price": 500000}}}},
326
+ {{"description": "2-bed with pool", "criteria": {{"bedrooms": 2, "amenities": ["pool"]}}}}
327
+ ],
328
+ "shared_criteria": {{"location": "Cotonou"}}
329
+ }}
330
+ """
331
+ self.call_count += 1
332
+ parse_response = await self.llm.ainvoke([
333
+ HumanMessage(content=parse_prompt)
334
+ ])
335
+
336
+ try:
337
+ parsed = self._extract_json(parse_response.content)
338
+ except Exception:
339
+ logger.error("Failed to parse OR branches")
340
+ return await self._handle_simple(query, context, analysis)
341
+
342
+ branches = parsed.get("branches", [])
343
+ shared = parsed.get("shared_criteria", {})
344
+
345
+ reasoning_steps.append({
346
+ "step": "parse_or_branches",
347
+ "branch_count": len(branches),
348
+ "shared_criteria": shared
349
+ })
350
+
351
+ # Step 2: Execute each branch in parallel
352
+ async def execute_branch(branch: Dict) -> List[Dict]:
353
+ criteria = {**shared, **branch.get("criteria", {})}
354
+ return await self._execute_criteria_search(criteria)
355
+
356
+ branch_results = await asyncio.gather(
357
+ *[execute_branch(b) for b in branches]
358
+ )
359
+
360
+ # Step 3: Union results (deduplicate)
361
+ seen_ids = set()
362
+ union_results = []
363
+ for i, results in enumerate(branch_results):
364
+ reasoning_steps.append({
365
+ "step": f"branch_{i+1}",
366
+ "description": branches[i].get("description"),
367
+ "results_count": len(results)
368
+ })
369
+ for listing in results:
370
+ lid = str(listing.get("_id") or listing.get("mongo_id"))
371
+ if lid not in seen_ids:
372
+ seen_ids.add(lid)
373
+ listing["_matched_branch"] = branches[i].get("description")
374
+ union_results.append(listing)
375
+
376
+ reasoning_steps.append({
377
+ "step": "union",
378
+ "total_unique": len(union_results)
379
+ })
380
+
381
+ return {
382
+ "results": union_results[:10],
383
+ "reasoning_steps": reasoning_steps,
384
+ "message": f"Found {len(union_results)} listings matching any of {len(branches)} criteria"
385
+ }
386
+
387
+ # =========================================================================
388
+ # Handler: Comparative Queries
389
+ # =========================================================================
390
+
391
+ async def _handle_comparative(
392
+ self,
393
+ query: str,
394
+ context: Dict,
395
+ analysis: QueryAnalysis
396
+ ) -> Dict[str, Any]:
397
+ """
398
+ Handle comparative queries.
399
+
400
+ Example: "Compare average prices in Cotonou vs Calavi"
401
+
402
+ Steps:
403
+ 1. Extract comparison subjects and metrics
404
+ 2. Search each subject
405
+ 3. Calculate and compare metrics
406
+ """
407
+ reasoning_steps = []
408
+
409
+ # Step 1: Parse comparison
410
+ compare_prompt = f"""
411
+ Parse this comparative real estate query:
412
+
413
+ Query: "{query}"
414
+
415
+ Return JSON:
416
+ {{
417
+ "subjects": [
418
+ {{"name": "Cotonou", "type": "location"}},
419
+ {{"name": "Calavi", "type": "location"}}
420
+ ],
421
+ "metric": "average_price" or "count" or "price_range",
422
+ "listing_criteria": {{
423
+ "bedrooms": number or null,
424
+ "listing_type": "rent" or "sale" or null
425
+ }}
426
+ }}
427
+ """
428
+ self.call_count += 1
429
+ compare_response = await self.llm.ainvoke([
430
+ HumanMessage(content=compare_prompt)
431
+ ])
432
+
433
+ try:
434
+ comparison = self._extract_json(compare_response.content)
435
+ except Exception:
436
+ return await self._handle_simple(query, context, analysis)
437
+
438
+ subjects = comparison.get("subjects", [])
439
+ metric = comparison.get("metric", "average_price")
440
+ criteria = comparison.get("listing_criteria", {})
441
+
442
+ reasoning_steps.append({
443
+ "step": "parse_comparison",
444
+ "subjects": [s["name"] for s in subjects],
445
+ "metric": metric
446
+ })
447
+
448
+ # Step 2: Search each subject
449
+ subject_results = []
450
+ for subject in subjects:
451
+ search_criteria = {**criteria, "location": subject["name"]}
452
+ listings = await self._execute_criteria_search(search_criteria, limit=50)
453
+ subject_results.append({
454
+ "name": subject["name"],
455
+ "listings": listings,
456
+ "count": len(listings)
457
+ })
458
+
459
+ # Step 3: Calculate metrics
460
+ for result in subject_results:
461
+ listings = result["listings"]
462
+ if listings:
463
+ prices = [l.get("price", 0) for l in listings if l.get("price")]
464
+ result["avg_price"] = sum(prices) / len(prices) if prices else 0
465
+ result["min_price"] = min(prices) if prices else 0
466
+ result["max_price"] = max(prices) if prices else 0
467
+ else:
468
+ result["avg_price"] = 0
469
+ result["min_price"] = 0
470
+ result["max_price"] = 0
471
+
472
+ reasoning_steps.append({
473
+ "step": f"metrics_{result['name']}",
474
+ "count": result["count"],
475
+ "avg_price": result["avg_price"]
476
+ })
477
+
478
+ # Step 4: Generate comparison summary
479
+ summary = await self._generate_comparison_summary(subject_results, metric)
480
+
481
+ # Return top listings from each subject
482
+ combined_results = []
483
+ for result in subject_results:
484
+ for listing in result["listings"][:5]:
485
+ listing["_comparison_group"] = result["name"]
486
+ combined_results.append(listing)
487
+
488
+ return {
489
+ "results": combined_results[:10],
490
+ "reasoning_steps": reasoning_steps,
491
+ "comparison_data": subject_results,
492
+ "message": summary
493
+ }
494
+
+    # =========================================================================
+    # Handler: Aggregation Queries
+    # =========================================================================
+
+    async def _handle_aggregation(
+        self,
+        query: str,
+        context: Dict,
+        analysis: QueryAnalysis
+    ) -> Dict[str, Any]:
+        """
+        Handle aggregation queries (average, count, etc.)
+        """
+        reasoning_steps = []
+
+        # Parse aggregation request
+        agg_prompt = f"""
+        Parse this aggregation query:
+
+        Query: "{query}"
+
+        Return JSON:
+        {{
+            "aggregation_type": "average" or "count" or "sum" or "min" or "max",
+            "field": "price" or "bedrooms",
+            "filters": {{
+                "location": "city" or null,
+                "listing_type": "rent" or "sale" or null
+            }}
+        }}
+        """
+        self.call_count += 1
+        agg_response = await self.llm.ainvoke([
+            HumanMessage(content=agg_prompt)
+        ])
+
+        try:
+            aggregation = self._extract_json(agg_response.content)
+        except Exception:
+            return await self._handle_simple(query, context, analysis)
+
+        agg_type = aggregation.get("aggregation_type", "count")
+        field = aggregation.get("field", "price")
+        filters = aggregation.get("filters", {})
+
+        # Fetch listings
+        listings = await self._execute_criteria_search(filters, limit=100)
+
+        # Calculate aggregation
+        values = [l.get(field, 0) for l in listings if l.get(field) is not None]
+
+        result = 0
+        if agg_type == "count":
+            result = len(listings)
+        elif agg_type == "average" and values:
+            result = sum(values) / len(values)
+        elif agg_type == "sum":
+            result = sum(values)
+        elif agg_type == "min" and values:
+            result = min(values)
+        elif agg_type == "max" and values:
+            result = max(values)
+
+        reasoning_steps.append({
+            "step": "aggregation",
+            "type": agg_type,
+            "field": field,
+            "sample_size": len(listings),
+            "result": result
+        })
+
+        location = filters.get("location", "all areas")
+        message = f"The {agg_type} {field} in {location} is {result:,.0f}"
+
+        return {
+            "results": listings[:10],
+            "reasoning_steps": reasoning_steps,
+            "aggregation_result": {
+                "type": agg_type,
+                "field": field,
+                "value": result,
+                "sample_size": len(listings)
+            },
+            "message": message
+        }
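The aggregation dispatch in the handler above can be isolated into a small function for clarity. This is a sketch mirroring the diff's branch logic (the helper name `aggregate` is illustrative); note that `count` counts all fetched listings, while the other operations only consider listings where the field is non-null:

```python
def aggregate(listings, agg_type, field):
    """Dispatch one aggregation over a list of listing dicts."""
    values = [l.get(field, 0) for l in listings if l.get(field) is not None]
    if agg_type == "count":
        return len(listings)          # counts every listing, even without the field
    if agg_type == "average" and values:
        return sum(values) / len(values)
    if agg_type == "sum":
        return sum(values)
    if agg_type == "min" and values:
        return min(values)
    if agg_type == "max" and values:
        return max(values)
    return 0                          # empty sample or unknown type

rows = [{"price": 100}, {"price": 300}, {"bedrooms": 2}]
```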
+
+    # =========================================================================
+    # Handler: Multi-Factor Queries
+    # =========================================================================
+
+    async def _handle_multi_factor(
+        self,
+        query: str,
+        context: Dict,
+        analysis: QueryAnalysis
+    ) -> Dict[str, Any]:
+        """
+        Handle multi-factor ranking queries.
+
+        Example: "Best family apartment near schools and parks, safe area"
+
+        Steps:
+        1. Extract ranking factors
+        2. Score each factor
+        3. Combine scores with weights
+        """
+        reasoning_steps = []
+
+        # Parse factors
+        factor_prompt = f"""
+        Extract ranking factors from this query:
+
+        Query: "{query}"
+
+        Return JSON:
+        {{
+            "location": "city" or null,
+            "base_criteria": {{
+                "bedrooms": number or null,
+                "max_price": number or null
+            }},
+            "ranking_factors": [
+                {{"factor": "school_proximity", "weight": 0.3}},
+                {{"factor": "park_proximity", "weight": 0.2}},
+                {{"factor": "safety", "weight": 0.3}},
+                {{"factor": "family_friendly", "weight": 0.2}}
+            ]
+        }}
+
+        Available factors:
+        - school_proximity: Near schools
+        - park_proximity: Near parks
+        - beach_proximity: Near beach
+        - safety: Safe neighborhood
+        - family_friendly: Family-friendly amenities
+        - luxury: Luxury amenities
+        - modern: Modern/renovated
+        - quiet: Quiet/peaceful area
+        """
+        self.call_count += 1
+        factor_response = await self.llm.ainvoke([
+            HumanMessage(content=factor_prompt)
+        ])
+
+        try:
+            factors = self._extract_json(factor_response.content)
+        except Exception:
+            return await self._handle_simple(query, context, analysis)
+
+        location = factors.get("location")
+        base_criteria = factors.get("base_criteria", {})
+        ranking_factors = factors.get("ranking_factors", [])
+
+        reasoning_steps.append({
+            "step": "extract_factors",
+            "location": location,
+            "factor_count": len(ranking_factors)
+        })
+
+        # Get base listings
+        search_criteria = {**base_criteria}
+        if location:
+            search_criteria["location"] = location
+
+        listings = await self._execute_criteria_search(search_criteria, limit=30)
+
+        if not listings:
+            return {
+                "results": [],
+                "reasoning_steps": reasoning_steps,
+                "message": f"No listings found in {location}"
+            }
+
+        # Score each listing on each factor
+        for listing in listings:
+            total_score = 0
+            factor_scores = {}
+
+            for factor_info in ranking_factors:
+                factor = factor_info["factor"]
+                weight = factor_info.get("weight", 0.25)
+
+                score = await self._score_factor(listing, factor, location)
+                factor_scores[factor] = score
+                total_score += score * weight
+
+            listing["_factor_scores"] = factor_scores
+            listing["_total_score"] = total_score
+
+        # Sort by total score
+        listings.sort(key=lambda x: x.get("_total_score", 0), reverse=True)
+
+        reasoning_steps.append({
+            "step": "scoring",
+            "listings_scored": len(listings),
+            "top_score": listings[0].get("_total_score") if listings else 0
+        })
+
+        return {
+            "results": listings[:10],
+            "reasoning_steps": reasoning_steps,
+            "message": f"Found {len(listings)} listings ranked by {len(ranking_factors)} factors"
+        }
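The score-combination step above is a plain weighted sum. A minimal sketch, assuming per-factor scores in [0, 1] (the helper name `combine_scores` is illustrative):

```python
def combine_scores(factor_scores, ranking_factors):
    """Weighted sum of per-factor scores; missing factors default to a neutral 0.5,
    missing weights to 0.25, matching the handler's defaults."""
    total = 0.0
    for info in ranking_factors:
        weight = info.get("weight", 0.25)
        total += factor_scores.get(info["factor"], 0.5) * weight
    return total

factors = [{"factor": "safety", "weight": 0.6}, {"factor": "modern", "weight": 0.4}]
score = combine_scores({"safety": 1.0, "modern": 0.5}, factors)
```

Because the weights in the prompt are asked to sum to roughly 1.0, the combined score stays in [0, 1] and listings can be sorted on it directly.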
+
+    # =========================================================================
+    # Handler: Simple Queries (Fallback)
+    # =========================================================================
+
+    async def _handle_simple(
+        self,
+        query: str,
+        context: Dict,
+        analysis: QueryAnalysis
+    ) -> Dict[str, Any]:
+        """
+        Fallback handler for simple queries - uses existing hybrid search.
+        """
+        from app.ai.services.search_service import hybrid_search
+        from app.ai.services.search_extractor import extract_search_params
+
+        params = await extract_search_params(query)
+        results = await hybrid_search(
+            query_text=query,
+            search_params=params,
+            limit=10
+        )
+
+        return {
+            "results": results,
+            "reasoning_steps": [{"step": "simple_search", "params": params}],
+            "message": f"Found {len(results)} listings"
+        }
+
+    # =========================================================================
+    # Helper Methods
+    # =========================================================================
+
+    async def _find_poi_locations(
+        self,
+        poi_type: str,
+        location: str,
+        specific_name: Optional[str] = None
+    ) -> List[Dict]:
+        """
+        Find POI (Point of Interest) locations using OpenStreetMap.
+
+        Uses FREE OpenStreetMap APIs:
+        - Nominatim: Geocoding (location name → coordinates)
+        - Overpass: POI search (find schools, hospitals, parks near a location)
+
+        Args:
+            poi_type: Type of POI (school, hospital, beach, park, etc.)
+            location: City/area name (e.g., "Cotonou, Benin")
+            specific_name: Optional specific POI name to search for
+
+        Returns:
+            List of POI dicts with name, lat, lon, type
+        """
+        try:
+            from app.ai.services.osm_poi_service import find_pois
+
+            # Use OSM to get real POI locations
+            search_type = specific_name if specific_name else poi_type
+
+            logger.info(
+                "Finding POIs via OpenStreetMap",
+                poi_type=search_type,
+                location=location
+            )
+
+            pois = await find_pois(
+                poi_type=search_type,
+                location=location,
+                radius_km=5,  # Search within 5km of location center
+                limit=5  # Get top 5 POIs
+            )
+
+            if pois:
+                logger.info(
+                    "OSM POIs found",
+                    count=len(pois),
+                    poi_type=poi_type,
+                    location=location
+                )
+                return pois
+
+            logger.warning(
+                "No OSM POIs found, trying with broader search",
+                poi_type=poi_type,
+                location=location
+            )
+
+            # Try with just the POI type if specific name returned nothing
+            if specific_name:
+                pois = await find_pois(
+                    poi_type=poi_type,
+                    location=location,
+                    radius_km=10,  # Expand radius
+                    limit=5
+                )
+                if pois:
+                    return pois
+
+        except ImportError:
+            logger.error("OSM POI service not available")
+        except Exception as e:
+            logger.error(f"OSM POI search failed: {e}")
+
+        # Fallback: Use LLM to estimate coordinates (less accurate)
+        logger.warning("Falling back to LLM POI estimation")
+        return await self._fallback_llm_poi_locations(poi_type, location, specific_name)
+
+    async def _fallback_llm_poi_locations(
+        self,
+        poi_type: str,
+        location: str,
+        specific_name: Optional[str] = None
+    ) -> List[Dict]:
+        """
+        Fallback: Use LLM to estimate POI coordinates when OSM fails.
+
+        Note: This is less accurate than OSM data and should only be used as fallback.
+        """
+        poi_prompt = f"""
+        You are a geolocation assistant. Provide approximate coordinates for {poi_type}s in {location}.
+
+        {f"Specifically looking for: {specific_name}" if specific_name else ""}
+
+        Return JSON array of up to 3 POIs:
+        [
+            {{"name": "POI name", "lat": 6.3654, "lon": 2.4183, "type": "{poi_type}"}}
+        ]
+
+        Use realistic coordinates for {location}. If you don't know exact coordinates,
+        provide approximate city center coordinates for {location}.
+        """
+        self.call_count += 1
+        poi_response = await self.llm.ainvoke([
+            HumanMessage(content=poi_prompt)
+        ])
+
+        try:
+            pois = self._extract_json(poi_response.content)
+            return pois if isinstance(pois, list) else []
+        except Exception:
+            logger.error("Failed to get POI locations from LLM fallback")
+            return []
+
+    async def _search_near_coordinates(
+        self,
+        lat: float,
+        lon: float,
+        radius_km: float,
+        criteria: Dict,
+        location: str
+    ) -> List[Dict]:
+        """
+        Search listings near specific coordinates.
+
+        Uses MongoDB geospatial query if listings have lat/lon,
+        otherwise falls back to location-based search.
+        """
+        from app.database import get_db
+
+        try:
+            db = await get_db()
+
+            # Build geo query
+            # Note: This requires a 2dsphere index on listings collection
+            # db.listings.create_index([("location_geo", "2dsphere")])
+
+            geo_query = {
+                "status": "active"
+            }
+
+            # Add criteria filters
+            if criteria.get("bedrooms"):
+                geo_query["bedrooms"] = {"$gte": criteria["bedrooms"]}
+            if criteria.get("max_price"):
+                geo_query["price"] = {"$lte": criteria["max_price"]}
+            if criteria.get("min_price"):
+                if "price" in geo_query:
+                    geo_query["price"]["$gte"] = criteria["min_price"]
+                else:
+                    geo_query["price"] = {"$gte": criteria["min_price"]}
+            if criteria.get("listing_type"):
+                geo_query["listing_type"] = {"$regex": criteria["listing_type"], "$options": "i"}
+
+            # Try geospatial query first
+            if lat and lon:
+                # Convert km to meters for MongoDB
+                radius_meters = radius_km * 1000
+
+                geo_query["$or"] = [
+                    # Check if listing has coordinates
+                    {
+                        "latitude": {"$exists": True, "$ne": None},
+                        "longitude": {"$exists": True, "$ne": None}
+                    }
+                ]
+
+                # Fetch listings and filter by distance in Python
+                # (More flexible than requiring 2dsphere index)
+                cursor = db.listings.find(geo_query).limit(50)
+                listings = await cursor.to_list(length=50)
+
+                # Filter by distance
+                nearby = []
+                for listing in listings:
+                    if listing.get("latitude") and listing.get("longitude"):
+                        dist = self._calculate_distance(
+                            lat, lon,
+                            listing["latitude"], listing["longitude"]
+                        )
+                        if dist <= radius_km:
+                            listing["_id"] = str(listing["_id"])
+                            nearby.append(listing)
+                    # Also include listings in the same location (fallback)
+                    elif location and location.lower() in str(listing.get("location", "")).lower():
+                        listing["_id"] = str(listing["_id"])
+                        nearby.append(listing)
+
+                return nearby
+
+            else:
+                # No coordinates, search by location name
+                if location:
+                    geo_query["location"] = {"$regex": location, "$options": "i"}
+
+                cursor = db.listings.find(geo_query).limit(20)
+                listings = await cursor.to_list(length=20)
+
+                for listing in listings:
+                    listing["_id"] = str(listing["_id"])
+
+                return listings
+
+        except Exception as e:
+            logger.error("Geo search failed", error=str(e))
+            return []
+
+    async def _execute_criteria_search(
+        self,
+        criteria: Dict,
+        limit: int = 20
+    ) -> List[Dict]:
+        """
+        Execute a search with given criteria using existing search infrastructure.
+        """
+        from app.ai.services.search_service import search_mongodb
+
+        results = await search_mongodb(criteria, limit=limit)
+        return results
+
+    async def _semantic_search_with_criteria(
+        self,
+        query: str,
+        location: str,
+        criteria: Dict
+    ) -> Dict[str, Any]:
+        """
+        Semantic search with additional criteria.
+        """
+        from app.ai.services.search_service import hybrid_search
+
+        search_params = {**criteria}
+        if location:
+            search_params["location"] = location
+
+        results = await hybrid_search(
+            query_text=query,
+            search_params=search_params,
+            limit=10
+        )
+
+        return {
+            "results": results,
+            "reasoning_steps": [{"step": "semantic_fallback"}],
+            "message": f"Found {len(results)} listings in {location}"
+        }
+
+    async def _generate_comparison_summary(
+        self,
+        subject_results: List[Dict],
+        metric: str
+    ) -> str:
+        """
+        Generate a natural language comparison summary.
+        """
+        if len(subject_results) < 2:
+            return "Not enough data for comparison"
+
+        s1, s2 = subject_results[0], subject_results[1]
+
+        if metric == "average_price":
+            diff = abs(s1["avg_price"] - s2["avg_price"])
+            cheaper = s1["name"] if s1["avg_price"] < s2["avg_price"] else s2["name"]
+            pct = (diff / max(s1["avg_price"], s2["avg_price"]) * 100) if max(s1["avg_price"], s2["avg_price"]) > 0 else 0
+
+            return (
+                f"Average prices: {s1['name']}: {s1['avg_price']:,.0f} XOF | "
+                f"{s2['name']}: {s2['avg_price']:,.0f} XOF. "
+                f"{cheaper} is {pct:.0f}% cheaper."
+            )
+        else:
+            return f"Comparison: {s1['name']} ({s1['count']} listings) vs {s2['name']} ({s2['count']} listings)"
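The percentage computed in `_generate_comparison_summary` is the price gap relative to the more expensive side, with a guard against division by zero. A minimal sketch (the function name `price_gap_percent` is illustrative):

```python
def price_gap_percent(avg_a, avg_b):
    """Gap between two average prices, as a percentage of the higher one."""
    hi = max(avg_a, avg_b)
    if hi == 0:
        return 0.0  # no price data on either side
    return abs(avg_a - avg_b) / hi * 100

gap = price_gap_percent(400_000, 500_000)
```

Dividing by the larger average keeps the result in [0, 100] and matches the "X is N% cheaper" phrasing; dividing by the smaller one would instead express how much more expensive the other side is.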
+
+    async def _score_factor(
+        self,
+        listing: Dict,
+        factor: str,
+        location: str
+    ) -> float:
+        """
+        Score a listing on a specific factor (0-1).
+
+        Uses:
+        - OpenStreetMap for proximity calculations (school_proximity, park_proximity, etc.)
+        - Text analysis for non-proximity factors (safety, luxury, modern, etc.)
+        """
+        # Proximity factors - use OSM for actual distance calculation
+        proximity_factors = {
+            "school_proximity": "school",
+            "park_proximity": "park",
+            "beach_proximity": "beach",
+            "hospital_proximity": "hospital",
+            "market_proximity": "market",
+        }
+
+        # Check if this is a proximity factor and listing has coordinates
+        if factor in proximity_factors and listing.get("latitude") and listing.get("longitude"):
+            poi_type = proximity_factors[factor]
+            return await self._score_proximity_factor(
+                listing=listing,
+                poi_type=poi_type,
+                location=location
+            )
+
+        # Non-proximity factors - use text analysis
+        score = 0.5  # Default neutral score
+
+        title = str(listing.get("title", "")).lower()
+        description = str(listing.get("description", "")).lower()
+        amenities = [a.lower() for a in listing.get("amenities", [])]
+        text = f"{title} {description} {' '.join(amenities)}"
+
+        factor_keywords = {
+            "school_proximity": ["school", "école", "university", "campus", "education"],
+            "park_proximity": ["park", "garden", "parc", "jardin", "green"],
+            "beach_proximity": ["beach", "plage", "ocean", "sea", "waterfront"],
+            "safety": ["safe", "secure", "security", "sécurité", "gated", "guard"],
+            "family_friendly": ["family", "children", "kids", "playground", "familial"],
+            "luxury": ["luxury", "luxe", "premium", "high-end", "prestige", "elegant"],
+            "modern": ["modern", "new", "renovated", "contemporary", "neuf"],
+            "quiet": ["quiet", "peaceful", "calm", "tranquil", "calme"],
+        }
+
+        keywords = factor_keywords.get(factor, [])
+        matches = sum(1 for kw in keywords if kw in text)
+
+        if matches > 0:
+            score = min(0.5 + (matches * 0.2), 1.0)
+
+        return score
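The text-analysis branch above boils down to a keyword hit count mapped onto a score: 0.5 baseline, +0.2 per matching keyword, capped at 1.0. A minimal sketch (the helper name `keyword_score` is illustrative):

```python
def keyword_score(text, keywords):
    """Substring-based factor score: 0.5 neutral, +0.2 per keyword hit, capped at 1.0."""
    matches = sum(1 for kw in keywords if kw in text)
    return min(0.5 + matches * 0.2, 1.0) if matches else 0.5

s = keyword_score("modern renovated flat", ["modern", "new", "renovated"])
```

One caveat of plain substring matching: short keywords can fire inside unrelated words (e.g. "sea" inside "seasonal"), so word-boundary matching would be a safer refinement.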
+
+    async def _score_proximity_factor(
+        self,
+        listing: Dict,
+        poi_type: str,
+        location: str
+    ) -> float:
+        """
+        Score a listing based on actual proximity to POIs using OpenStreetMap.
+
+        Scoring:
+        - < 0.5 km: 1.0 (excellent)
+        - 0.5 - 1 km: 0.9 (very good)
+        - 1 - 2 km: 0.75 (good)
+        - 2 - 3 km: 0.5 (average)
+        - 3 - 5 km: 0.3 (below average)
+        - > 5 km: 0.1 (poor)
+        """
+        try:
+            from app.ai.services.osm_poi_service import find_pois_overpass
+
+            listing_lat = listing.get("latitude")
+            listing_lon = listing.get("longitude")
+
+            if not listing_lat or not listing_lon:
+                return 0.5  # No coordinates, return neutral score
+
+            # Find nearby POIs
+            pois = await find_pois_overpass(
+                poi_type=poi_type,
+                center_lat=listing_lat,
+                center_lon=listing_lon,
+                radius_km=5
+            )
+
+            if not pois:
+                return 0.3  # No POIs found nearby
+
+            # Find closest POI
+            min_distance = float('inf')
+            for poi in pois:
+                dist = self._calculate_distance(
+                    listing_lat, listing_lon,
+                    poi.get("lat"), poi.get("lon")
+                )
+                min_distance = min(min_distance, dist)
+
+            # Score based on distance
+            if min_distance < 0.5:
+                return 1.0
+            elif min_distance < 1:
+                return 0.9
+            elif min_distance < 2:
+                return 0.75
+            elif min_distance < 3:
+                return 0.5
+            elif min_distance < 5:
+                return 0.3
+            else:
+                return 0.1
+
+        except Exception as e:
+            logger.error(f"Proximity scoring failed: {e}")
+            return 0.5  # Return neutral on error
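The distance-to-score tiers in the docstring above can be expressed as a lookup table instead of an if/elif chain, which keeps the thresholds in one place. A minimal sketch (the function name `proximity_score` is illustrative):

```python
def proximity_score(min_distance_km):
    """Map distance to the closest POI (km) onto the tiered 0-1 score."""
    tiers = [
        (0.5, 1.0),   # excellent
        (1.0, 0.9),   # very good
        (2.0, 0.75),  # good
        (3.0, 0.5),   # average
        (5.0, 0.3),   # below average
    ]
    for limit, score in tiers:
        if min_distance_km < limit:
            return score
    return 0.1  # poor: beyond 5 km
```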
+
+    def _calculate_distance(
+        self,
+        lat1: float,
+        lon1: float,
+        lat2: Optional[float],
+        lon2: Optional[float]
+    ) -> float:
+        """
+        Calculate distance between two points using Haversine formula.
+        Returns distance in kilometers.
+        """
+        import math
+
+        if lat2 is None or lon2 is None:
+            return 999.0  # Return large distance for missing coordinates
+
+        R = 6371  # Earth's radius in km
+
+        lat1_rad = math.radians(lat1)
+        lat2_rad = math.radians(lat2)
+        delta_lat = math.radians(lat2 - lat1)
+        delta_lon = math.radians(lon2 - lon1)
+
+        a = (math.sin(delta_lat/2)**2 +
+             math.cos(lat1_rad) * math.cos(lat2_rad) * math.sin(delta_lon/2)**2)
+        c = 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))
+
+        return R * c
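The Haversine computation above is self-contained enough to sanity-check in isolation. A standalone sketch of the same formula (the function name `haversine_km` is illustrative):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points, degrees in."""
    R = 6371  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return R * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
```

Useful invariants: identical points give 0, the function is symmetric in its endpoints, and one degree of longitude at the equator is about 111.2 km, which is a quick way to confirm the radians conversion is in place.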
+
+    def _extract_json(self, text: str) -> Any:
+        """
+        Extract JSON from LLM response text.
+        """
+        import re
+
+        # Try to find JSON in the response
+        json_match = re.search(r'[\[{][\s\S]*[\]}]', text)
+        if json_match:
+            return json.loads(json_match.group())
+        raise ValueError("No JSON found in response")
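The extraction regex above is greedy: it spans from the first `[` or `{` to the last `]` or `}` in the reply, which tolerates surrounding prose but breaks if stray braces follow the JSON. A standalone sketch of the same approach (the function name `extract_json` is illustrative):

```python
import json
import re

def extract_json(text):
    """Pull the outermost JSON object/array out of an LLM reply and parse it."""
    m = re.search(r'[\[{][\s\S]*[\]}]', text)  # greedy: first opener to last closer
    if m:
        return json.loads(m.group())
    raise ValueError("No JSON found in response")

data = extract_json('Here is the result: {"strategy": "MONGO_ONLY", "score": 1}')
```

Because of the greedy match, callers should keep their prompts from producing trailing braces after the JSON, or the `json.loads` call will raise on the over-wide span.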
+
+
+# =============================================================================
+# Singleton Instance
+# =============================================================================
+
+_rlm_agent: Optional[RLMSearchAgent] = None
+
+
+def get_rlm_agent() -> RLMSearchAgent:
+    """Get or create the singleton RLM agent."""
+    global _rlm_agent
+    if _rlm_agent is None:
+        _rlm_agent = RLMSearchAgent()
+    return _rlm_agent
+
+
+# =============================================================================
+# Convenience Function
+# =============================================================================
+
+async def rlm_search(query: str, context: Optional[Dict] = None) -> Dict[str, Any]:
+    """
+    Convenience function for RLM search.
+
+    Usage:
+        from app.ai.services.rlm_search_service import rlm_search
+
+        results = await rlm_search("3-bed near schools in Cotonou")
+    """
+    agent = get_rlm_agent()
+    return await agent.search(query, context)
+
+
+__all__ = [
+    "RLMSearchAgent",
+    "get_rlm_agent",
+    "rlm_search"
+]
app/ai/services/search_strategy_selector.py CHANGED
@@ -7,6 +7,13 @@ Strategies:
7
  - QDRANT_ONLY: Pure semantic search (vague/descriptive queries)
8
  - MONGO_THEN_QDRANT: Filter by location/price in MongoDB, then semantic search within results
9
  - QDRANT_THEN_MONGO: Semantic search first, then apply MongoDB filters
 
 
 
 
 
 
 
10
  """
11
 
12
  import logging
@@ -22,11 +29,19 @@ logger = logging.getLogger(__name__)
22
 
23
  class SearchStrategy(str, Enum):
24
  """Available search strategies"""
 
25
  MONGO_ONLY = "MONGO_ONLY"
26
  QDRANT_ONLY = "QDRANT_ONLY"
27
  MONGO_THEN_QDRANT = "MONGO_THEN_QDRANT"
28
  QDRANT_THEN_MONGO = "QDRANT_THEN_MONGO"
29
 
 
 
 
 
 
 
 
30
 
31
  # LLM for strategy selection
32
  llm = ChatOpenAI(
@@ -81,19 +96,67 @@ Return ONLY valid JSON:
81
  async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
82
  """
83
  Select optimal search strategy based on query and extracted parameters.
84
-
 
 
 
 
85
  Args:
86
  user_query: Original user query
87
  search_params: Extracted search parameters
88
-
89
  Returns:
90
  Dict with:
91
  - strategy: SearchStrategy enum value
92
  - reasoning: str
93
  - has_semantic_features: bool
94
  - has_structured_filters: bool
 
95
  """
96
-
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
97
  # Quick heuristics for obvious cases
98
  has_location = bool(search_params.get("location"))
99
  has_price = bool(search_params.get("min_price") or search_params.get("max_price"))
@@ -101,9 +164,9 @@ async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
101
  has_bathrooms = bool(search_params.get("bathrooms"))
102
  has_listing_type = bool(search_params.get("listing_type"))
103
  has_amenities = bool(search_params.get("amenities") and len(search_params.get("amenities", [])) > 0)
104
-
105
  structured_count = sum([has_location, has_price, has_bedrooms, has_bathrooms, has_listing_type])
106
-
107
  # Detect semantic keywords in query
108
  semantic_keywords = [
109
  "close to", "near", "nearby", "walking distance",
@@ -117,10 +180,9 @@ async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
117
  "good vibes", "nice area", "good neighborhood",
118
  "beach", "school", "market", "downtown", "city center",
119
  ]
120
-
121
- query_lower = user_query.lower()
122
  has_semantic = any(keyword in query_lower for keyword in semantic_keywords)
123
-
124
  # Simple rule-based decision for clear cases
125
  if structured_count >= 2 and not has_semantic and not has_amenities:
126
  # Pure structured query
@@ -128,25 +190,28 @@ async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
128
  "strategy": SearchStrategy.MONGO_ONLY,
129
  "reasoning": "Query has multiple structured filters and no semantic features",
130
  "has_semantic_features": False,
131
- "has_structured_filters": True
 
132
  }
133
-
134
  if structured_count == 0 and (has_semantic or has_amenities):
135
  # Pure semantic query
136
  return {
137
  "strategy": SearchStrategy.QDRANT_ONLY,
138
  "reasoning": "Query is purely semantic/descriptive with no structured filters",
139
  "has_semantic_features": True,
140
- "has_structured_filters": False
 
141
  }
142
-
143
  if has_location and has_semantic:
144
  # Location + semantic features
145
  return {
146
  "strategy": SearchStrategy.MONGO_THEN_QDRANT,
147
  "reasoning": "Query has location filter and semantic features - filter by location first, then semantic search",
148
  "has_semantic_features": True,
149
- "has_structured_filters": True
 
150
  }
151
 
152
  # Use LLM for complex cases
@@ -171,13 +236,15 @@ async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
171
  "strategy": SearchStrategy.MONGO_ONLY,
172
  "reasoning": "Strategy selection failed, using MongoDB filters",
173
  "has_semantic_features": False,
174
- "has_structured_filters": True
 
175
  }
176
-
177
  result = validation.data
 
178
  logger.info(f"Strategy selected: {result.get('strategy')} - {result.get('reasoning')}")
179
  return result
180
-
181
  except Exception as e:
182
  logger.error(f"Strategy selection error: {e}")
183
  # Default to MONGO_ONLY on error
@@ -185,5 +252,6 @@ async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
185
  "strategy": SearchStrategy.MONGO_ONLY,
186
  "reasoning": "Strategy selection error, defaulting to MongoDB",
187
  "has_semantic_features": False,
188
- "has_structured_filters": True
 
189
  }
 
7
  - QDRANT_ONLY: Pure semantic search (vague/descriptive queries)
8
  - MONGO_THEN_QDRANT: Filter by location/price in MongoDB, then semantic search within results
9
  - QDRANT_THEN_MONGO: Semantic search first, then apply MongoDB filters
10
+
11
+ RLM Strategies (Recursive Language Model):
12
+ - RLM_MULTI_HOP: "near schools", "close to beach" - requires finding POI first
13
+ - RLM_BOOLEAN_OR: "under 500k OR has pool" - complex OR logic
14
+ - RLM_COMPARATIVE: "compare Cotonou vs Calavi" - multi-location comparison
15
+ - RLM_AGGREGATION: "average price", "how many" - data aggregation
16
+ - RLM_MULTI_FACTOR: "best family apartment" - multi-criteria ranking
17
  """
18
 
19
  import logging
 
29
 
30
  class SearchStrategy(str, Enum):
31
  """Available search strategies"""
32
+ # Traditional strategies
33
  MONGO_ONLY = "MONGO_ONLY"
34
  QDRANT_ONLY = "QDRANT_ONLY"
35
  MONGO_THEN_QDRANT = "MONGO_THEN_QDRANT"
36
  QDRANT_THEN_MONGO = "QDRANT_THEN_MONGO"
37
 
38
+ # RLM (Recursive Language Model) strategies
39
+ RLM_MULTI_HOP = "RLM_MULTI_HOP" # "near X", "close to Y"
40
+ RLM_BOOLEAN_OR = "RLM_BOOLEAN_OR" # "X OR Y"
41
+ RLM_COMPARATIVE = "RLM_COMPARATIVE" # "compare A vs B"
42
+ RLM_AGGREGATION = "RLM_AGGREGATION" # "average", "count"
43
+ RLM_MULTI_FACTOR = "RLM_MULTI_FACTOR" # multi-criteria ranking
44
+
45
 
46
  # LLM for strategy selection
47
  llm = ChatOpenAI(
 
96
  async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
97
  """
98
  Select optimal search strategy based on query and extracted parameters.
99
+
100
+ PRIORITY ORDER:
101
+ 1. Check for RLM-appropriate queries (complex multi-hop, OR, comparative)
102
+ 2. Fall back to traditional strategies for simple queries
103
+
104
  Args:
105
  user_query: Original user query
106
  search_params: Extracted search parameters
107
+
108
  Returns:
109
  Dict with:
110
  - strategy: SearchStrategy enum value
111
  - reasoning: str
112
  - has_semantic_features: bool
113
  - has_structured_filters: bool
114
+ - use_rlm: bool (NEW)
115
  """
116
+ query_lower = user_query.lower()
117
+
118
+ # =========================================================================
119
+ # STEP 1: Check for RLM-appropriate queries FIRST
120
+ # =========================================================================
121
+ try:
122
+ from app.ai.services.rlm_query_analyzer import analyze_query_complexity, QueryComplexity
123
+
124
+ rlm_analysis = analyze_query_complexity(user_query)
125
+
126
+ if rlm_analysis.use_rlm:
127
+ # Map QueryComplexity to SearchStrategy
128
+ rlm_strategy_map = {
129
+ QueryComplexity.MULTI_HOP: SearchStrategy.RLM_MULTI_HOP,
130
+ QueryComplexity.BOOLEAN_OR: SearchStrategy.RLM_BOOLEAN_OR,
131
+ QueryComplexity.COMPARATIVE: SearchStrategy.RLM_COMPARATIVE,
132
+ QueryComplexity.AGGREGATION: SearchStrategy.RLM_AGGREGATION,
133
+ QueryComplexity.MULTI_FACTOR: SearchStrategy.RLM_MULTI_FACTOR,
134
+ }
135
+
136
+ strategy = rlm_strategy_map.get(rlm_analysis.complexity)
137
+ if strategy:
138
+ logger.info(
139
+ f"RLM strategy selected: {strategy.value}",
140
+ query=user_query[:50],
141
+ confidence=rlm_analysis.confidence
142
+ )
143
+ return {
144
+ "strategy": strategy,
145
+ "reasoning": rlm_analysis.reasoning,
146
+ "has_semantic_features": True,
147
+ "has_structured_filters": True,
148
+ "use_rlm": True,
149
+ "rlm_analysis": rlm_analysis.model_dump()
150
+ }
151
+ except ImportError:
152
+ logger.warning("RLM module not available, using traditional strategies")
153
+ except Exception as e:
154
+ logger.error(f"RLM analysis failed: {e}, falling back to traditional")
155
+
156
+ # =========================================================================
157
+ # STEP 2: Traditional strategy selection (for simple queries)
158
+ # =========================================================================
159
+
160
  # Quick heuristics for obvious cases
161
  has_location = bool(search_params.get("location"))
162
  has_price = bool(search_params.get("min_price") or search_params.get("max_price"))
 
164
  has_bathrooms = bool(search_params.get("bathrooms"))
165
  has_listing_type = bool(search_params.get("listing_type"))
166
  has_amenities = bool(search_params.get("amenities") and len(search_params.get("amenities", [])) > 0)
167
+
168
  structured_count = sum([has_location, has_price, has_bedrooms, has_bathrooms, has_listing_type])
169
+
170
  # Detect semantic keywords in query
171
  semantic_keywords = [
172
  "close to", "near", "nearby", "walking distance",
 
180
  "good vibes", "nice area", "good neighborhood",
181
  "beach", "school", "market", "downtown", "city center",
182
  ]
183
+
 
184
  has_semantic = any(keyword in query_lower for keyword in semantic_keywords)
185
+
186
  # Simple rule-based decision for clear cases
187
  if structured_count >= 2 and not has_semantic and not has_amenities:
188
  # Pure structured query
 
190
  "strategy": SearchStrategy.MONGO_ONLY,
191
  "reasoning": "Query has multiple structured filters and no semantic features",
192
  "has_semantic_features": False,
193
+ "has_structured_filters": True,
194
+ "use_rlm": False
195
  }
196
+
197
  if structured_count == 0 and (has_semantic or has_amenities):
198
  # Pure semantic query
199
  return {
200
  "strategy": SearchStrategy.QDRANT_ONLY,
201
  "reasoning": "Query is purely semantic/descriptive with no structured filters",
202
  "has_semantic_features": True,
203
+ "has_structured_filters": False,
204
+ "use_rlm": False
205
  }
206
+
207
  if has_location and has_semantic:
208
  # Location + semantic features
209
  return {
210
  "strategy": SearchStrategy.MONGO_THEN_QDRANT,
211
  "reasoning": "Query has location filter and semantic features - filter by location first, then semantic search",
212
  "has_semantic_features": True,
213
+ "has_structured_filters": True,
214
+ "use_rlm": False
215
  }
216
 
217
  # Use LLM for complex cases
 
236
  "strategy": SearchStrategy.MONGO_ONLY,
237
  "reasoning": "Strategy selection failed, using MongoDB filters",
238
  "has_semantic_features": False,
239
+ "has_structured_filters": True,
240
+ "use_rlm": False
241
  }
242
+
243
  result = validation.data
244
+ result["use_rlm"] = False # LLM-selected strategies are not RLM
245
  logger.info(f"Strategy selected: {result.get('strategy')} - {result.get('reasoning')}")
246
  return result
247
+
248
  except Exception as e:
249
  logger.error(f"Strategy selection error: {e}")
250
  # Default to MONGO_ONLY on error
 
252
  "strategy": SearchStrategy.MONGO_ONLY,
253
  "reasoning": "Strategy selection error, defaulting to MongoDB",
254
  "has_semantic_features": False,
255
+ "has_structured_filters": True,
256
+ "use_rlm": False
257
  }
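The rule-based branching above reduces to a small pure function. A minimal sketch, assuming the `SearchStrategy` enum values and return keys mirror the surrounding module (the LLM fallback for complex cases is stubbed out):

```python
from enum import Enum

class SearchStrategy(str, Enum):
    MONGO_ONLY = "mongo_only"
    QDRANT_ONLY = "qdrant_only"
    MONGO_THEN_QDRANT = "mongo_then_qdrant"

def select_strategy(structured_count: int, has_location: bool,
                    has_semantic: bool, has_amenities: bool) -> dict:
    # Pure structured query: multiple filters, nothing descriptive.
    if structured_count >= 2 and not has_semantic and not has_amenities:
        return {"strategy": SearchStrategy.MONGO_ONLY,
                "has_semantic_features": False,
                "has_structured_filters": True, "use_rlm": False}
    # Pure semantic query: descriptive features only, no filters.
    if structured_count == 0 and (has_semantic or has_amenities):
        return {"strategy": SearchStrategy.QDRANT_ONLY,
                "has_semantic_features": True,
                "has_structured_filters": False, "use_rlm": False}
    # Location filter plus semantic features: filter first, then search.
    if has_location and has_semantic:
        return {"strategy": SearchStrategy.MONGO_THEN_QDRANT,
                "has_semantic_features": True,
                "has_structured_filters": True, "use_rlm": False}
    # Complex cases fall through to the LLM selector (not shown here);
    # on failure it defaults to MONGO_ONLY, as in the snippet above.
    return {"strategy": SearchStrategy.MONGO_ONLY,
            "has_semantic_features": False,
            "has_structured_filters": True, "use_rlm": False}
```

Because every branch defaults to MongoDB filtering, a failed or ambiguous classification degrades to the cheapest search path rather than erroring out.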
app/ai/services/vision_service.py DELETED
@@ -1,697 +0,0 @@
1
- # ============================================================
2
- # app/ai/services/vision_service.py
3
- # Vision AI Service for Property Image Analysis
4
- # Uses Hugging Face Inference API (Moondream2 model)
5
- # ============================================================
6
-
7
- import io
8
- import os
9
- import base64
10
- import logging
11
- from typing import Dict, List, Optional, Tuple
12
- from PIL import Image
13
- import requests
14
- import cv2
15
- import numpy as np
16
- import tempfile
17
- from app.config import settings
18
-
19
- logger = logging.getLogger(__name__)
20
-
21
-
22
- class VisionService:
23
- """Service for analyzing property images using HuggingFace Inference API (BLIP - FREE)"""
24
-
25
- def __init__(self):
26
- # BLIP image captioning works with HuggingFace FREE Inference API
27
- # No special providers needed - uses standard inference endpoint
28
- self.hf_token = settings.HF_TOKEN or settings.HUGGINGFACE_API_KEY
29
- self.model_id = settings.HF_VISION_MODEL # Salesforce/blip-image-captioning-large
30
- # Standard HuggingFace Inference API endpoint (works with BLIP!)
31
- self.api_url = f"https://api-inference.huggingface.co/models/{self.model_id}"
32
- self.headers = {
33
- "Authorization": f"Bearer {self.hf_token}",
34
- "Content-Type": "application/json"
35
- }
36
- self.property_confidence_threshold = settings.PROPERTY_IMAGE_MIN_CONFIDENCE
37
- logger.info(f"🔧 Vision Service initialized with HF Inference: {self.model_id}")
38
-
39
- # ============================================================
40
- # Core Image Validation & Analysis
41
- # ============================================================
42
-
43
- def validate_property_image(self, image_bytes: bytes) -> Tuple[bool, float, str]:
44
- """
45
- Validate if image is property-related before uploading
46
-
47
- Args:
48
- image_bytes: Raw image bytes
49
-
50
- Returns:
51
- Tuple of (is_valid, confidence, message)
52
- """
53
- try:
54
- # Check if image is readable
55
- image = Image.open(io.BytesIO(image_bytes))
56
- image_rgb = image.convert("RGB")
57
-
58
- # Query vision model to check if it's a property
59
- payload = {
60
- "inputs": image_rgb,
61
- "question": (
62
- "Is this image a photo of a real property (house, apartment, room, "
63
- "office, land, or commercial building)? Answer only yes or no."
64
- ),
65
- }
66
-
67
- response = self._query_hf_api(payload)
68
-
69
- if not response:
70
- return False, 0.0, "Failed to process image"
71
-
72
- answer = response.strip().lower()
73
- is_property = "yes" in answer or "this is a property" in answer
74
-
75
- # Assign confidence based on response clarity
76
- confidence = 0.95 if is_property else 0.5
77
-
78
- if is_property:
79
- return (
80
- True,
81
- confidence,
82
- "Property image validated successfully"
83
- )
84
- else:
85
- return (
86
- False,
87
- confidence,
88
- "This doesn't look like a property photo. Please upload images of "
89
- "actual properties (houses, apartments, rooms, offices, or land)."
90
- )
91
-
92
- except Exception as e:
93
- logger.error(f"Error validating property image: {str(e)}")
94
- return False, 0.0, f"Error processing image: {str(e)}"
95
-
96
- # ============================================================
97
- # Property Field Extraction
98
- # ============================================================
99
-
100
- def extract_property_fields(
101
- self,
102
- image_bytes: bytes,
103
- location: Optional[str] = None,
104
- fast_validate: bool = False
105
- ) -> Dict:
106
- """
107
- Extract property listing fields from image
108
-
109
- Args:
110
- image_bytes: Raw image bytes
111
- location: Optional location context (helps with accuracy)
112
- fast_validate: If True, only generate title (skip detailed extraction)
113
- Use when image is complementary to text listing
114
-
115
- Returns:
116
- Dict with extracted fields and confidence scores
117
- """
118
- try:
119
- image = Image.open(io.BytesIO(image_bytes))
120
- image_rgb = image.convert("RGB")
121
-
122
- extracted = {
123
- "bedrooms": None,
124
- "bathrooms": None,
125
- "amenities": [],
126
- "description": "",
127
- "title": "",
128
- "confidence": {}
129
- }
130
-
131
- # ============================================================
132
- # FAST VALIDATE MODE: Only generate title, skip extraction
133
- # Used when user has already provided details via text
134
- # ============================================================
135
-
136
- if fast_validate:
137
- logger.info("🚀 Fast validation mode: Generating title only")
138
- title_data = self._generate_title(
139
- image_rgb,
140
- bedrooms=None,
141
- bathrooms=None,
142
- location=location
143
- )
144
- extracted["title"] = title_data.get("title", "Property Image")
145
- extracted["confidence"]["title"] = title_data.get("confidence", 0.8)
146
- extracted["fast_validated"] = True
147
- return extracted
148
-
149
- # ============================================================
150
- # FULL EXTRACTION MODE: Extract all details for new listing
151
- # ============================================================
152
-
153
- # Query 1: Count rooms (bedrooms + bathrooms)
154
- rooms_data = self._extract_room_count(image_rgb)
155
- extracted["bedrooms"] = rooms_data.get("bedrooms")
156
- extracted["bathrooms"] = rooms_data.get("bathrooms")
157
- extracted["confidence"].update({
158
- "bedrooms": rooms_data.get("bedroom_confidence", 0.0),
159
- "bathrooms": rooms_data.get("bathroom_confidence", 0.0)
160
- })
161
-
162
- # Query 2: Detect amenities
163
- amenities_data = self._detect_amenities(image_rgb)
164
- extracted["amenities"] = amenities_data.get("amenities", [])
165
- extracted["confidence"]["amenities"] = amenities_data.get("confidence", 0.0)
166
-
167
- # Query 3: Generate description
168
- description_data = self._generate_description(image_rgb)
169
- extracted["description"] = description_data.get("description", "")
170
- extracted["confidence"]["description"] = description_data.get("confidence", 0.0)
171
-
172
- # Query 4: Generate SHORT title (max 2 sentences)
173
- title_data = self._generate_title(
174
- image_rgb,
175
- bedrooms=extracted.get("bedrooms"),
176
- bathrooms=extracted.get("bathrooms"),
177
- location=location
178
- )
179
- extracted["title"] = title_data.get("title", "")
180
- extracted["confidence"]["title"] = title_data.get("confidence", 0.0)
181
-
182
- return extracted
183
-
184
- except Exception as e:
185
- logger.error(f"Error extracting property fields: {str(e)}")
186
- return {
187
- "bedrooms": None,
188
- "bathrooms": None,
189
- "amenities": [],
190
- "description": "",
191
- "title": "",
192
- "confidence": {},
193
- "error": str(e)
194
- }
195
-
196
- # ============================================================
197
- # Specific Field Extraction Methods
198
- # ============================================================
199
-
200
- def _extract_room_count(self, image: Image.Image) -> Dict:
201
- """
202
- Extract bedroom and bathroom count (matches listing schema)
203
-
204
- Returns bedrooms and bathrooms as integers (not property_type)
205
- """
206
- try:
207
- payload = {
208
- "inputs": image,
209
- "question": (
210
- "Count the number of bedrooms and bathrooms you can see in this property photo. "
211
- "Only count what you can clearly identify. "
212
- "Format: bedrooms: [number], bathrooms: [number]"
213
- ),
214
- }
215
-
216
- response = self._query_hf_api(payload)
217
-
218
- bedrooms = None
219
- bathrooms = None
220
- bedroom_conf = 0.0
221
- bathroom_conf = 0.0
222
-
223
- if response:
224
- response_lower = response.lower()
225
-
226
- # Extract bedrooms
227
- if "bedrooms:" in response_lower or "bedroom:" in response_lower:
228
- try:
229
- # Handle both "bedrooms:" and "bedroom:"
230
- if "bedrooms:" in response_lower:
231
- bed_str = response_lower.split("bedrooms:")[1].split(",")[0].strip()
232
- else:
233
- bed_str = response_lower.split("bedroom:")[1].split(",")[0].strip()
234
-
235
- # Extract first number found
236
- numbers = ''.join(filter(str.isdigit, bed_str))
237
- if numbers:
238
- bedrooms = int(numbers)
239
- bedroom_conf = 0.80 # Good confidence if extracted
240
- except Exception as e:
241
- logger.debug(f"Failed to parse bedrooms: {e}")
242
- bedroom_conf = 0.2
243
-
244
- # Extract bathrooms
245
- if "bathrooms:" in response_lower or "bathroom:" in response_lower:
246
- try:
247
- if "bathrooms:" in response_lower:
248
- bath_str = response_lower.split("bathrooms:")[1].strip()
249
- else:
250
- bath_str = response_lower.split("bathroom:")[1].strip()
251
-
252
- numbers = ''.join(filter(str.isdigit, bath_str))
253
- if numbers:
254
- bathrooms = int(numbers)
255
- bathroom_conf = 0.80
256
- except Exception as e:
257
- logger.debug(f"Failed to parse bathrooms: {e}")
258
- bathroom_conf = 0.2
259
-
260
- return {
261
- "bedrooms": bedrooms,
262
- "bathrooms": bathrooms,
263
- "bedroom_confidence": bedroom_conf,
264
- "bathroom_confidence": bathroom_conf
265
- }
266
-
267
- except Exception as e:
268
- logger.error(f"Error extracting room count: {str(e)}")
269
- return {
270
- "bedrooms": None,
271
- "bathrooms": None,
272
- "bedroom_confidence": 0.0,
273
- "bathroom_confidence": 0.0
274
- }
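The split-and-filter parsing above concatenates every digit in the substring, so an answer like "1 or 2" becomes 12. A regex alternative that captures only the first number after each label is more robust; the function name is hypothetical, and the expected input format ("bedrooms: N, bathrooms: N") is the one prompted for above:

```python
import re

def parse_room_counts(response: str) -> dict:
    """Parse 'bedrooms: 3, bathrooms: 2'-style model output.

    Tolerates singular/plural labels and extra surrounding text,
    and takes only the first number after each label instead of
    concatenating all digits in the substring.
    """
    text = response.lower()
    beds = re.search(r"bedrooms?\s*:\s*(\d+)", text)
    baths = re.search(r"bathrooms?\s*:\s*(\d+)", text)
    return {
        "bedrooms": int(beds.group(1)) if beds else None,
        "bathrooms": int(baths.group(1)) if baths else None,
    }
```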
275
-
276
- def _detect_amenities(self, image: Image.Image) -> Dict:
277
- """
278
- Detect amenities visible in property (matches listing schema)
279
-
280
- Amenities is a simple list of strings in the listing model.
281
- Common amenities: balcony, pool, parking, garden, gym, wifi, AC, security
282
- """
283
- try:
284
- payload = {
285
- "inputs": image,
286
- "question": (
287
- "What amenities can you see in this property? "
288
- "List only what is clearly visible. Examples: "
289
- "balcony, pool, parking, garden, gym, wifi router, AC unit, security gate, "
290
- "furnished, modern kitchen, etc. "
291
- "If nothing special, say 'none'."
292
- ),
293
- }
294
-
295
- response = self._query_hf_api(payload)
296
- amenities = []
297
- confidence = 0.5
298
-
299
- if response and response.lower().strip() not in ["none", "none."]:
300
- # Split by common separators (comma, and, newline)
301
- import re
302
- # Replace "and" with comma for easier splitting
303
- cleaned = response.replace(" and ", ", ")
304
- # Split by comma or newline
305
- parts = re.split(r'[,\n]', cleaned)
306
-
307
- # Clean and filter amenities
308
- for amenity in parts:
309
- amenity = amenity.strip().lower()
310
- # Remove numbers, bullets, dashes at start
311
- amenity = re.sub(r'^[\d\-•\*\.\)]+\s*', '', amenity)
312
- # Skip empty or too short
313
- if amenity and len(amenity) > 2:
314
- amenities.append(amenity)
315
-
316
- # Remove duplicates while preserving order
317
- amenities = list(dict.fromkeys(amenities))
318
- confidence = 0.70 if amenities else 0.3
319
-
320
- return {
321
- "amenities": amenities,
322
- "confidence": confidence
323
- }
324
-
325
- except Exception as e:
326
- logger.error(f"Error detecting amenities: {str(e)}")
327
- return {"amenities": [], "confidence": 0.0}
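The cleanup pipeline in `_detect_amenities` can be restated as a standalone helper (the name `clean_amenities` is hypothetical). One deliberate property worth noting: deduplication via `dict.fromkeys` keeps the model's original ordering, unlike a plain `set`:

```python
import re

def clean_amenities(response: str) -> list:
    """Normalize a free-text amenity answer into a deduplicated list.

    Mirrors the steps above: split on commas, ' and ', and newlines;
    strip leading bullets/numbering; drop tokens of 2 chars or fewer.
    """
    if not response or response.strip().lower() in {"none", "none."}:
        return []
    parts = re.split(r"[,\n]", response.replace(" and ", ", "))
    amenities = []
    for raw in parts:
        item = re.sub(r"^[\d\-•\*\.\)]+\s*", "", raw.strip().lower())
        if len(item) > 2:
            amenities.append(item)
    return list(dict.fromkeys(amenities))  # dedupe, preserve order
```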
328
-
329
- def _generate_description(self, image: Image.Image) -> Dict:
330
- """
331
- Generate brief property description (matches listing schema)
332
-
333
- Note: The listing flow will later use LLM to generate full title/description
334
- based on all provided fields. This is just initial extraction from image.
335
- """
336
- try:
337
- payload = {
338
- "inputs": image,
339
- "question": (
340
- "Describe what you see in this property photo in 1-2 sentences. "
341
- "Focus on visible features: room type, condition, style, notable features. "
342
- "Be factual and concise."
343
- ),
344
- }
345
-
346
- response = self._query_hf_api(payload)
347
-
348
- # Limit length
349
- if response and len(response) > 150:
350
- response = response[:147] + "..."
351
-
352
- return {
353
- "description": response or "",
354
- "confidence": 0.70 if response else 0.0
355
- }
356
-
357
- except Exception as e:
358
- logger.error(f"Error generating description: {str(e)}")
359
- return {"description": "", "confidence": 0.0}
360
-
361
- def _generate_title(self, image: Image.Image, bedrooms: int = None, bathrooms: int = None, location: str = None) -> Dict:
362
- """
363
- Generate simple property title (matches listing schema)
364
-
365
- Note: The listing flow will later generate SEO-optimized title using LLM
366
- based on all fields. This is just placeholder from image.
367
- """
368
- try:
369
- # Build basic context
370
- room_info = ""
371
- if bedrooms is not None:
372
- room_info = f"{bedrooms}-bedroom property"
373
- elif bathrooms is not None:
374
- room_info = "Property"
375
- else:
376
- room_info = "Property listing"
377
-
378
- payload = {
379
- "inputs": image,
380
- "question": (
381
- f"Generate a short title for this property photo. "
382
- f"It appears to be a {room_info}. "
383
- "Keep it under 50 characters. Example: 'Modern apartment with balcony'"
384
- ),
385
- }
386
-
387
- response = self._query_hf_api(payload)
388
-
389
- # Ensure it's short
390
- if response and len(response) > 80:
391
- response = response[:77] + "..."
392
-
393
- # Fallback if no response
394
- if not response:
395
- if bedrooms:
396
- response = f"{bedrooms}-Bedroom Property"
397
- else:
398
- response = "Property Listing"
399
-
400
- return {
401
- "title": response or "Property Listing",
402
- "confidence": 0.60 if response else 0.3
403
- }
404
-
405
- except Exception as e:
406
- logger.error(f"Error generating title: {str(e)}")
407
- return {"title": "Property Listing", "confidence": 0.3}
408
-
409
- # ============================================================
410
- # Hugging Face API Communication
411
- # ============================================================
412
-
413
- def _query_hf_api(self, payload: Dict) -> Optional[str]:
414
- """
415
- Query HuggingFace Inference API for image captioning (BLIP model).
416
- Works with FREE HuggingFace Inference API - no special providers needed!
417
-
418
- Args:
419
- payload: Dict with "inputs" (PIL Image) and "question" (str - optional prompt)
420
-
421
- Returns:
422
- Response text (caption) or None
423
- """
424
- try:
425
- if not self.hf_token:
426
- logger.error("HF_TOKEN not set!")
427
- return None
428
-
429
- # BLIP accepts raw image bytes directly
430
- if isinstance(payload.get("inputs"), Image.Image):
431
- # Convert PIL Image to bytes
432
- image_buffer = io.BytesIO()
433
- payload["inputs"].save(image_buffer, format="JPEG")
434
- image_bytes = image_buffer.getvalue()
435
-
436
- # Optional: Add question/prompt for conditional captioning
437
- # BLIP supports this via inputs parameter
438
- question = payload.get("question", "")
439
-
440
- # For BLIP, we send image bytes directly
441
- # The question can be sent as a query parameter or ignored
442
- response = requests.post(
443
- self.api_url,
444
- headers={"Authorization": f"Bearer {self.hf_token}"},
445
- data=image_bytes,
446
- timeout=60
447
- )
448
- else:
449
- # Text-only query not supported for image captioning
450
- logger.warning("BLIP requires an image input")
451
- return None
452
-
453
- if response.status_code == 200:
454
- result = response.json()
455
-
456
- # BLIP response format: [{"generated_text": "..."}]
457
- if isinstance(result, list) and len(result) > 0:
458
- return result[0].get("generated_text", "")
459
- elif isinstance(result, dict):
460
- return result.get("generated_text", "") or result.get("caption", "")
461
- elif isinstance(result, str):
462
- return result
463
- else:
464
- logger.warning(f"Unexpected BLIP response format: {result}")
465
- return str(result)
466
-
467
- elif response.status_code == 503:
468
- # Model loading - wait and retry
469
- logger.warning("Model loading (503). Retrying in 15s...")
470
- import time
471
- time.sleep(15)
472
- response = requests.post(
473
- self.api_url,
474
- headers={"Authorization": f"Bearer {self.hf_token}"},
475
- data=image_bytes,
476
- timeout=60
477
- )
478
- if response.status_code == 200:
479
- result = response.json()
480
- if isinstance(result, list) and len(result) > 0:
481
- return result[0].get("generated_text", "")
482
- return str(result)
483
- else:
484
- logger.error(f"HF API error after retry: {response.status_code}")
485
- return None
486
- else:
487
- logger.error(f"HF API error: {response.status_code} - {response.text[:200]}")
488
- return None
489
-
490
- except Exception as e:
491
- logger.error(f"Error querying HF API: {str(e)}")
492
- return None
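The response-shape handling above (list of dicts, single dict, bare string) can be factored into one normalizer so the retry path does not duplicate it. A sketch; the function name is hypothetical and the shapes are the ones the code above already handles:

```python
def extract_caption(result) -> str:
    """Normalize the captioning API response shapes handled above:
    [{"generated_text": ...}], {"generated_text"/"caption": ...}, or str.
    """
    if isinstance(result, list) and result:
        return result[0].get("generated_text", "")
    if isinstance(result, dict):
        return result.get("generated_text", "") or result.get("caption", "")
    if isinstance(result, str):
        return result
    return str(result)  # last-resort fallback for unexpected payloads
```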
493
-
494
- # ============================================================
495
- # Video Frame Extraction
496
- # ============================================================
497
-
498
- def extract_frames_from_video(self, video_bytes: bytes, max_frames: int = 8) -> List[Image.Image]:
499
- """
500
- Extract key frames from video for analysis
501
-
502
- Args:
503
- video_bytes: Raw video file bytes
504
- max_frames: Maximum number of frames to extract (default 8)
505
-
506
- Returns:
507
- List of PIL Images extracted from video
508
- """
509
- frames = []
510
- temp_video_path = None
511
-
512
- try:
513
- # Save video bytes to temp file (OpenCV needs a file path)
514
- with tempfile.NamedTemporaryFile(delete=False, suffix='.mp4') as temp_video:
515
- temp_video.write(video_bytes)
516
- temp_video_path = temp_video.name
517
-
518
- # Open video with OpenCV
519
- cap = cv2.VideoCapture(temp_video_path)
520
-
521
- if not cap.isOpened():
522
- logger.error("Failed to open video file")
523
- return frames
524
-
525
- # Get video properties
526
- total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
527
- fps = cap.get(cv2.CAP_PROP_FPS)
528
- duration = total_frames / fps if fps > 0 else 0
529
-
530
- logger.info(f"Video: {total_frames} frames, {fps:.2f} FPS, {duration:.2f}s duration")
531
-
532
- # Calculate frame interval to extract max_frames evenly distributed
533
- if total_frames <= max_frames:
534
- # Extract all frames if video has fewer frames than max_frames
535
- frame_indices = list(range(total_frames))
536
- else:
537
- # Extract frames at regular intervals
538
- interval = total_frames // max_frames
539
- frame_indices = [i * interval for i in range(max_frames)]
540
-
541
- # Extract frames
542
- for frame_idx in frame_indices:
543
- cap.set(cv2.CAP_PROP_POS_FRAMES, frame_idx)
544
- ret, frame = cap.read()
545
-
546
- if ret:
547
- # Convert BGR (OpenCV) to RGB (PIL)
548
- frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
549
-
550
- # Convert numpy array to PIL Image
551
- pil_image = Image.fromarray(frame_rgb)
552
-
553
- frames.append(pil_image)
554
- logger.info(f"Extracted frame {len(frames)}/{max_frames} at index {frame_idx}")
555
-
556
- cap.release()
557
- logger.info(f"✅ Successfully extracted {len(frames)} frames from video")
558
-
559
- except Exception as e:
560
- logger.error(f"Error extracting video frames: {str(e)}")
561
-
562
- finally:
563
- # Cleanup temp file
564
- if temp_video_path:
565
- try:
566
- import os
567
- os.unlink(temp_video_path)
568
- except OSError:
569
- pass
570
-
571
- return frames
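The frame-sampling arithmetic above is worth isolating, because it has a known bias: `interval = total // max_frames` means the last sampled index is `(max_frames - 1) * interval`, so the final stretch of the video is never reached. A standalone restatement of the logic as written:

```python
def frame_indices(total_frames: int, max_frames: int = 8) -> list:
    """Evenly spaced frame indices, as in extract_frames_from_video.

    Takes every frame when the video is short; otherwise samples at a
    fixed interval from frame 0 (the tail of the video is skipped when
    total_frames is not a multiple of the interval).
    """
    if total_frames <= max_frames:
        return list(range(total_frames))
    interval = total_frames // max_frames
    return [i * interval for i in range(max_frames)]
```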
572
-
573
- def analyze_video(self, video_bytes: bytes, location: str = None, max_frames: int = 8) -> Dict:
574
- """
575
- Analyze property video by extracting frames and analyzing them
576
-
577
- Args:
578
- video_bytes: Raw video file bytes
579
- location: Optional location for context
580
- max_frames: Maximum frames to extract (default 8)
581
-
582
- Returns:
583
- Dict with extracted property fields and confidence scores
584
- """
585
- try:
586
- # Step 1: Extract frames from video
587
- logger.info(f"🎬 Extracting up to {max_frames} frames from video...")
588
- frames = self.extract_frames_from_video(video_bytes, max_frames=max_frames)
589
-
590
- if not frames:
591
- logger.error("No frames extracted from video")
592
- return {
593
- "bedrooms": None,
594
- "bathrooms": None,
595
- "amenities": [],
596
- "description": "Unable to analyze video - no frames extracted",
597
- "title": "Property Video",
598
- "confidence": {},
599
- "error": "Failed to extract frames from video"
600
- }
601
-
602
- logger.info(f"✅ Extracted {len(frames)} frames, analyzing each frame...")
603
-
604
- # Step 2: Analyze each frame as an image
605
- frame_results = []
606
- for idx, frame in enumerate(frames):
607
- logger.info(f"Analyzing frame {idx + 1}/{len(frames)}...")
608
-
609
- # Convert PIL Image to bytes for analysis
610
- frame_bytes = io.BytesIO()
611
- frame.save(frame_bytes, format='JPEG')
612
- frame_bytes.seek(0)
613
-
614
- # Analyze this frame
615
- frame_data = self.extract_property_fields(frame_bytes.getvalue(), location=location)
616
- frame_results.append(frame_data)
617
-
618
- # Step 3: Merge results from all frames
619
- logger.info(f"Merging results from {len(frame_results)} analyzed frames...")
620
- consolidated = self.merge_multiple_image_results(frame_results)
621
-
622
- logger.info(f"✅ Video analysis complete: {consolidated.get('bedrooms')} beds, {consolidated.get('bathrooms')} baths, {len(consolidated.get('amenities', []))} amenities")
623
-
624
- return consolidated
625
-
626
- except Exception as e:
627
- logger.error(f"Error analyzing video: {str(e)}")
628
- return {
629
- "bedrooms": None,
630
- "bathrooms": None,
631
- "amenities": [],
632
- "description": "",
633
- "title": "",
634
- "confidence": {},
635
- "error": str(e)
636
- }
637
-
638
- # ============================================================
639
- # Utility Methods
640
- # ============================================================
641
-
642
- def merge_multiple_image_results(self, results_list: List[Dict]) -> Dict:
643
- """
644
- Merge results from multiple images into single listing data
645
-
646
- Args:
647
- results_list: List of extracted field dicts from different images
648
-
649
- Returns:
650
- Consolidated dict with most likely values
651
- """
652
- if not results_list:
653
- return {}
654
-
655
- consolidated = {
656
- "bedrooms": None,
657
- "bathrooms": None,
658
- "amenities": [],
659
- "description": "",
660
- "confidence": {}
661
- }
662
-
663
- # Bedrooms: take highest count mentioned
664
- bedrooms_list = [r.get("bedrooms") for r in results_list if r.get("bedrooms")]
665
- if bedrooms_list:
666
- consolidated["bedrooms"] = max(bedrooms_list)
667
- consolidated["confidence"]["bedrooms"] = sum(
668
- [r.get("confidence", {}).get("bedrooms", 0)
669
- for r in results_list]
670
- ) / len(results_list)
671
-
672
- # Bathrooms: take highest count mentioned
673
- bathrooms_list = [r.get("bathrooms") for r in results_list if r.get("bathrooms")]
674
- if bathrooms_list:
675
- consolidated["bathrooms"] = max(bathrooms_list)
676
- consolidated["confidence"]["bathrooms"] = sum(
677
- [r.get("confidence", {}).get("bathrooms", 0)
678
- for r in results_list]
679
- ) / len(results_list)
680
-
681
- # Amenities: deduplicate and combine
682
- all_amenities = set()
683
- for result in results_list:
684
- all_amenities.update(result.get("amenities", []))
685
- consolidated["amenities"] = list(all_amenities)
686
- consolidated["confidence"]["amenities"] = sum(
687
- [r.get("confidence", {}).get("amenities", 0)
688
- for r in results_list]
689
- ) / len(results_list)
690
-
691
- # Description: use longest one
692
- descriptions = [r.get("description", "") for r in results_list if r.get("description")]
693
- if descriptions:
694
- consolidated["description"] = max(descriptions, key=len)
695
- consolidated["confidence"]["description"] = 0.8
696
-
697
- return consolidated
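The merge policy above (max for room counts, union for amenities, longest description wins) can be sketched compactly. One small deviation, flagged here as a deliberate choice: an ordered dedupe replaces the `set` in the original so the merged amenity list is deterministic across runs:

```python
def merge_results(results: list) -> dict:
    """Consolidate per-image extractions, following the policy above:
    max for bedrooms/bathrooms, ordered union for amenities,
    longest non-empty description."""
    if not results:
        return {}
    beds = [r["bedrooms"] for r in results if r.get("bedrooms")]
    baths = [r["bathrooms"] for r in results if r.get("bathrooms")]
    amenities = []
    for r in results:
        for a in r.get("amenities", []):
            if a not in amenities:   # ordered dedupe (set is unordered)
                amenities.append(a)
    descriptions = [r.get("description", "") for r in results if r.get("description")]
    return {
        "bedrooms": max(beds) if beds else None,
        "bathrooms": max(baths) if baths else None,
        "amenities": amenities,
        "description": max(descriptions, key=len) if descriptions else "",
    }
```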
 
app/config.py CHANGED
@@ -114,13 +114,15 @@ class Settings(BaseSettings):
114
  HF_WHISPER_MODEL: str = os.getenv("HF_WHISPER_MODEL", "openai/whisper-large-v3")
115
 
116
  # ------------------------------------------------------------------
117
- # Vision AI (Property Analysis)
118
  # ------------------------------------------------------------------
119
- # BLIP - Works with HuggingFace FREE Inference API
120
- # Alternative: nlpconnect/vit-gpt2-image-captioning
121
- HF_VISION_MODEL: str = os.getenv("HF_VISION_MODEL", "Salesforce/blip-image-captioning-large")
122
- HF_VISION_API_ENABLED: bool = os.getenv("HF_VISION_API_ENABLED", "true").lower() == "true"
123
- PROPERTY_IMAGE_MIN_CONFIDENCE: float = float(os.getenv("PROPERTY_IMAGE_MIN_CONFIDENCE", "0.6"))
 
 
124
 
125
  # ------------------------------------------------------------------
126
  # LLM / Tooling keys
@@ -141,6 +143,12 @@ class Settings(BaseSettings):
141
  LANGCHAIN_TRACING_V2: bool = os.getenv("LANGCHAIN_TRACING_V2", "false").lower() == "true"
142
  LANGCHAIN_API_KEY: str = os.getenv("LANGCHAIN_API_KEY", "")
143
  LANGCHAIN_PROJECT: str = os.getenv("LANGCHAIN_PROJECT", "aida_agent")
 
144
 
145
  # ============ REDIS (SESSION & MEMORY) ============
146
  REDIS_URL: str = os.getenv("REDIS_URL", "redis://localhost:6379")
 
114
  HF_WHISPER_MODEL: str = os.getenv("HF_WHISPER_MODEL", "openai/whisper-large-v3")
115
 
116
  # ------------------------------------------------------------------
117
+ # Vision AI (Property Analysis) - DISABLED
118
  # ------------------------------------------------------------------
119
+ # NOTE: Vision analysis is NOT in use. Image uploads are handled
120
+ # directly by Cloudflare Worker (frontend upload).
121
+ # These settings are kept for future reference only.
122
+ # ------------------------------------------------------------------
123
+ # HF_VISION_MODEL: str = os.getenv("HF_VISION_MODEL", "Salesforce/blip-image-captioning-large")
124
+ # HF_VISION_API_ENABLED: bool = os.getenv("HF_VISION_API_ENABLED", "true").lower() == "true"
125
+ # PROPERTY_IMAGE_MIN_CONFIDENCE: float = float(os.getenv("PROPERTY_IMAGE_MIN_CONFIDENCE", "0.6"))
126
 
127
  # ------------------------------------------------------------------
128
  # LLM / Tooling keys
 
143
  LANGCHAIN_TRACING_V2: bool = os.getenv("LANGCHAIN_TRACING_V2", "false").lower() == "true"
144
  LANGCHAIN_API_KEY: str = os.getenv("LANGCHAIN_API_KEY", "")
145
  LANGCHAIN_PROJECT: str = os.getenv("LANGCHAIN_PROJECT", "aida_agent")
146
+
147
+ # ============ AGENT LIGHTNING (RL TRAINING) ============
148
+ # Enable trajectory capture for reinforcement learning
149
+ # Set LIGHTNING_ENABLED=true in .env to start collecting training data
150
+ LIGHTNING_ENABLED: bool = os.getenv("LIGHTNING_ENABLED", "false").lower() == "true"
151
+ LIGHTNING_TRAJECTORY_TTL_DAYS: int = int(os.getenv("LIGHTNING_TRAJECTORY_TTL_DAYS", "30"))
152
 
153
  # ============ REDIS (SESSION & MEMORY) ============
154
  REDIS_URL: str = os.getenv("REDIS_URL", "redis://localhost:6379")
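The settings above all parse boolean env vars with the same idiom, `os.getenv(NAME, "false").lower() == "true"`. A helper makes the quirk explicit: only the literal string `true` (any case) enables a flag, so values like `1` or `yes` silently read as `False`. The helper name is an assumption, not part of the codebase:

```python
import os

def env_bool(name: str, default: bool = False) -> bool:
    """Parse a boolean env var the way app/config.py does:
    only the literal string 'true' (case-insensitive) counts as True."""
    return os.getenv(name, "true" if default else "false").lower() == "true"
```

Under this scheme, `LIGHTNING_ENABLED=1` would leave trajectory capture off; operators must set the string `true` exactly.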
app/routes/auth.py CHANGED
@@ -11,6 +11,7 @@ from app.schemas.auth import (
11
  ResetPasswordDto,
12
  ResendOtpDto,
13
  )
 
14
  from app.services.auth_service import auth_service
15
  from app.services.user_service import user_service
16
  from app.services.otp_service import otp_service
@@ -162,6 +163,61 @@ async def get_current_user_profile(current_user: dict = Depends(get_current_user
162
  logger.info(f"Get current user profile: {current_user.get('user_id')}")
163
  return await user_service.get_current_user_profile(current_user.get("user_id"))
164
 
165
  # ============================================================
166
  # LOGOUT ENDPOINT
167
  # ============================================================
 
11
  ResetPasswordDto,
12
  ResendOtpDto,
13
  )
14
+ from app.schemas.user import ProfileUpdateRequest, ProfileUpdateResponse
15
  from app.services.auth_service import auth_service
16
  from app.services.user_service import user_service
17
  from app.services.otp_service import otp_service
 
163
  logger.info(f"Get current user profile: {current_user.get('user_id')}")
164
  return await user_service.get_current_user_profile(current_user.get("user_id"))
165
 
166
+
167
+ @router.patch("/profile", status_code=status.HTTP_200_OK, response_model=ProfileUpdateResponse)
168
+ async def update_current_user_profile(
169
+ profile_data: ProfileUpdateRequest,
170
+ current_user: dict = Depends(get_current_user)
171
+ ):
172
+ """
173
+ Update Current User Profile
174
+
175
+ Update the logged-in user's profile information.
176
+ All fields are optional - only include fields you want to update.
177
+
178
+ **Allowed fields:**
179
+ - `firstName`: User's first name (1-50 chars)
180
+ - `lastName`: User's last name (1-50 chars)
181
+ - `bio`: Short bio (max 150 chars)
182
+ - `location`: Location in "City, Country" format (max 100 chars)
183
+ - `languages`: Array of languages spoken (max 3)
184
+ - `profilePicture`: URL to profile picture
185
+
186
+ **Requires:** Bearer token in Authorization header
187
+
188
+ **Example request body:**
189
+ ```json
190
+ {
191
+ "firstName": "John",
192
+ "lastName": "Doe",
193
+ "bio": "Real estate enthusiast",
194
+ "location": "Cotonou, Benin",
195
+ "languages": ["English", "French"]
196
+ }
197
+ ```
198
+ """
199
+ user_id = current_user.get("user_id")
200
+ logger.info(f"Update user profile: {user_id}")
201
+
202
+ # Convert to dict and remove None values
203
+ update_data = {k: v for k, v in profile_data.model_dump().items() if v is not None}
204
+
205
+ if not update_data:
206
+ raise HTTPException(
207
+ status_code=status.HTTP_400_BAD_REQUEST,
208
+ detail="No fields to update. Please provide at least one field."
209
+ )
210
+
211
+ # Validate languages count
212
+ if "languages" in update_data and len(update_data["languages"]) > 3:
213
+ raise HTTPException(
214
+ status_code=status.HTTP_400_BAD_REQUEST,
215
+ detail="Maximum 3 languages allowed"
216
+ )
217
+
218
+ return await user_service.update_user_profile(user_id, update_data)
219
+
220
+
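The PATCH handler above builds its update dict by dropping `None` fields, then validates it. That core can be expressed as a pure function for testing outside FastAPI; `build_update` is a hypothetical name, and `ValueError` stands in for the route's `HTTPException`:

```python
def build_update(payload: dict, max_languages: int = 3) -> dict:
    """Filter a PATCH body as the /profile route does: drop None
    fields, reject empty updates, and enforce the languages limit."""
    update = {k: v for k, v in payload.items() if v is not None}
    if not update:
        raise ValueError("No fields to update. Please provide at least one field.")
    if "languages" in update and len(update["languages"]) > max_languages:
        raise ValueError("Maximum 3 languages allowed")
    return update
```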
221
  # ============================================================
222
  # LOGOUT ENDPOINT
223
  # ============================================================
app/routes/media_upload.py CHANGED
@@ -1,282 +1,33 @@
1
  # ============================================================
2
  # app/routes/media_upload.py
3
- # Media Upload & Property Analysis Routes
4
- # Handles image validation, video upload, and field extraction
 
 
5
  # ============================================================
6
 
7
- import io
8
  import logging
9
- from typing import List, Optional
10
- from fastapi import APIRouter, UploadFile, File, Form, Depends, HTTPException, status
11
- from fastapi.responses import JSONResponse
12
- import cloudinary
13
- import cloudinary.uploader
14
  from app.config import settings
15
- # PAUSED: Vision service temporarily disabled
16
- # from app.ai.services.vision_service import VisionService
17
- from app.guards.jwt_guard import get_current_user
18
- from app.core.llm_router import LLMRouter, TaskComplexity
19
 
20
  logger = logging.getLogger(__name__)
21
 
22
  router = APIRouter(prefix="/listings", tags=["media"])
23
 
24
- # PAUSED: Vision Service temporarily disabled
25
- # vision_service = VisionService()
26
-
27
- # Initialize LLM Router for generating personalized messages
28
- llm_router = LLMRouter()
29
-
30
- # Configure Cloudinary
31
- if settings.CLOUDINARY_CLOUD_NAME:
32
- cloudinary.config(
33
- cloud_name=settings.CLOUDINARY_CLOUD_NAME,
34
- api_key=settings.CLOUDINARY_API_KEY,
35
- api_secret=settings.CLOUDINARY_API_SECRET,
36
- secure=True
37
- )
38
-
39
-
40
- # ============================================================
41
- # File Validation & Limits
42
- # ============================================================
43
-
44
- ALLOWED_IMAGE_TYPES = {"image/jpeg", "image/png", "image/webp"}
45
- ALLOWED_VIDEO_TYPES = {"video/mp4", "video/quicktime", "video/x-msvideo"}
46
- MAX_IMAGE_SIZE = 10 * 1024 * 1024 # 10MB
47
- MAX_VIDEO_SIZE = 100 * 1024 * 1024 # 100MB
48
- MAX_IMAGES_PER_UPLOAD = 10
49
- MAX_VIDEO_DURATION = 300 # 5 minutes
50
-
51
-
52
- # ============================================================
53
- # Helper Functions
54
- # ============================================================
55
-
56
- async def validate_image_file(file: UploadFile) -> bytes:
57
- """Validate and read image file"""
58
- if file.content_type not in ALLOWED_IMAGE_TYPES:
59
- raise HTTPException(
60
- status_code=status.HTTP_400_BAD_REQUEST,
61
- detail=f"Invalid image type. Allowed: {', '.join(ALLOWED_IMAGE_TYPES)}"
62
- )
63
-
64
- contents = await file.read()
65
- if len(contents) > MAX_IMAGE_SIZE:
66
- raise HTTPException(
67
- status_code=status.HTTP_413_REQUEST_ENTITY_TOO_LARGE,
68
- detail=f"Image size exceeds {MAX_IMAGE_SIZE / 1024 / 1024}MB limit"
69
- )
70
-
71
- return contents
72
-
73
-
74
- async def validate_video_file(file: UploadFile) -> bytes:
75
- """Validate and read video file"""
76
- if file.content_type not in ALLOWED_VIDEO_TYPES:
77
- raise HTTPException(
78
- status_code=status.HTTP_400_BAD_REQUEST,
79
- detail=f"Invalid video type. Allowed: {', '.join(ALLOWED_VIDEO_TYPES)}"
80
- )
81
-
82
- contents = await file.read()
83
- if len(contents) > MAX_VIDEO_SIZE:
84
- raise HTTPException(
85
- status_code=status.HTTP_413_REQUEST_ENTITY_TOO_LARGE,
86
- detail=f"Video size exceeds {MAX_VIDEO_SIZE / 1024 / 1024}MB limit"
87
- )
88
-
89
- return contents
90
-
91
-
92
- def generate_intelligent_filename(
93
- original_filename: str,
94
- location: Optional[str] = None,
95
- title: Optional[str] = None,
96
- index: int = 0
97
- ) -> str:
98
- """
99
- Generate intelligent filename for uploaded image
100
-
101
- Pattern: {location}_{title}_{date}_{index}.jpg
102
- Example: Lagos_Modern_Apartment_2025_01_31_1.jpg
103
-
104
- The Cloudflare worker will handle duplicates by appending numbers
105
- """
106
- from datetime import datetime
107
-
108
- # Get original extension
109
- _, ext = original_filename.rsplit('.', 1) if '.' in original_filename else (original_filename, 'jpg')
110
- ext = ext.lower()
111
- if ext not in ['jpg', 'jpeg', 'png', 'webp']:
112
- ext = 'jpg'
113
-
114
- # Build filename components
115
- parts = []
116
-
117
- # Add location if available
118
- if location:
119
- clean_location = location.replace(' ', '_').replace(',', '').lower()[:20]
120
- parts.append(clean_location)
121
-
122
- # Add title if available (first 20 chars)
123
- if title:
124
- clean_title = title.replace(' ', '_').replace(',', '').lower()[:20]
125
- parts.append(clean_title)
126
-
127
- # Add timestamp
128
- timestamp = datetime.utcnow().strftime("%Y_%m_%d_%H%M%S")
129
- parts.append(timestamp)
130
-
131
- # Add index if multiple images
132
- if index > 0:
133
- parts.append(str(index))
134
-
135
- filename = "_".join(parts)
136
- return f"{filename}.{ext}"
137
-
138
-
139
- async def upload_to_cloudflare(file_bytes: bytes, filename: str, meaningful_name: str = None) -> str:
140
- """
141
- Upload image/video to Cloudflare R2
142
-
143
- Args:
144
- file_bytes: File bytes
145
- filename: Original filename
146
- meaningful_name: AI-generated meaningful filename (optional)
147
-
148
- Returns:
149
- Public URL of uploaded file
150
- """
151
- import boto3
152
- from botocore.config import Config
153
- import os
154
- from datetime import datetime
155
-
156
- try:
157
- # Use meaningful name if provided, otherwise original filename
158
- final_filename = meaningful_name or filename
159
-
160
- # Initialize R2 client
161
- r2_client = boto3.client(
162
- 's3',
163
- endpoint_url=settings.CF_R2_ENDPOINT,
164
- aws_access_key_id=settings.CF_R2_ACCESS_KEY_ID,
165
- aws_secret_access_key=settings.CF_R2_SECRET_ACCESS_KEY,
166
- config=Config(
167
- signature_version='s3v4',
168
- s3={'addressing_style': 'path'}
169
- ),
170
- region_name='auto'
171
- )
172
-
173
- # Determine content type based on file extension
174
- ext = os.path.splitext(final_filename)[1].lower()
175
- content_type_map = {
176
- '.jpg': 'image/jpeg',
177
- '.jpeg': 'image/jpeg',
178
- '.png': 'image/png',
179
- '.webp': 'image/webp',
180
- '.mp4': 'video/mp4',
181
- '.mov': 'video/quicktime',
182
- '.avi': 'video/x-msvideo'
183
- }
184
- content_type = content_type_map.get(ext, 'application/octet-stream')
185
-
186
- # Create folder structure: media/YYYY/MM/filename
187
- now = datetime.utcnow()
188
- folder_path = f"media/{now.year}/{now.month:02d}"
189
- object_key = f"{folder_path}/{final_filename}"
190
-
191
- # Upload to R2 (use lojiz-audio bucket for now, or create lojiz-media bucket)
192
- bucket_name = settings.CF_R2_BUCKET_NAME
193
-
194
- r2_client.put_object(
195
- Bucket=bucket_name,
196
- Key=object_key,
197
- Body=file_bytes,
198
- ContentType=content_type,
199
- CacheControl='public, max-age=31536000', # Cache for 1 year
200
- )
201
-
202
- # Construct public URL
203
- public_url = f"{settings.CF_R2_PUBLIC_URL}/{object_key}"
204
-
205
- logger.info(f"βœ… Uploaded to Cloudflare R2: {public_url}")
206
- return public_url
207
-
208
- except Exception as e:
209
- logger.error(f"❌ Error uploading to Cloudflare R2: {str(e)}")
210
- raise HTTPException(
211
- status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
212
- detail=f"Failed to upload to cloud storage: {str(e)}"
213
- )
214
-
215
-
216
- async def upload_to_cloudinary(file_bytes: bytes, filename: str, resource_type: str = "video") -> str:
217
- """Upload video to Cloudinary"""
218
- try:
219
- file_obj = io.BytesIO(file_bytes)
220
-
221
- result = cloudinary.uploader.upload(
222
- file_obj,
223
- resource_type=resource_type,
224
- folder="lojiz/property-videos",
225
- public_id=filename.split(".")[0],
226
- overwrite=True,
227
- quality="auto",
228
- fetch_format="auto"
229
- )
230
-
231
- return result.get("secure_url", "")
232
-
233
- except Exception as e:
234
- logger.error(f"Error uploading to Cloudinary: {str(e)}")
235
- raise HTTPException(
236
- status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
237
- detail="Failed to upload video to Cloudinary"
238
- )
239
-
240
 
241
  # ============================================================
242
  # API Endpoints
243
  # ============================================================
244
 
245
- @router.post("/analyze-images")
246
- async def analyze_property_images(
247
- images: List[UploadFile] = File(...),
248
- listing_method: str = "image",
249
- location: Optional[str] = None,
250
- user_input: Optional[str] = Form(None),
251
- session_id: Optional[str] = Form(None),
252
- current_user = Depends(get_current_user)
253
- ):
254
- """
255
- 🚧 VISION FEATURE PAUSED 🚧
256
-
257
- This route is temporarily disabled while we set up a reliable vision provider.
258
- For now, use /upload-images for direct image upload without AI analysis.
259
- """
260
- raise HTTPException(
261
- status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
262
- detail={
263
- "message": "Vision analysis temporarily unavailable",
264
- "suggestion": "Use /listings/upload-images for direct upload without AI analysis",
265
- "feature_status": "paused"
266
- }
267
- )
268
-
269
-
270
-
271
-
272
  @router.get("/upload-config")
273
  async def get_upload_configuration():
274
  """
275
  Get image upload configuration for frontend.
276
-
277
  Frontend should upload images DIRECTLY to the Cloudflare Worker URL.
278
  No backend processing involved.
279
-
280
  Returns:
281
  {
282
  "worker_url": "https://image-upload-worker.destinyebuka7.workers.dev",
@@ -290,55 +41,60 @@ async def get_upload_configuration():
290
  "max_file_size_mb": 5,
291
  "allowed_types": ["image/jpeg", "image/png", "image/webp"],
292
  "instructions": {
293
- "step_1": "Create FormData with: file, user_id, session_id, message (optional)",
294
- "step_2": "POST to worker_url",
295
- "step_3": "Worker validates image, uploads to Cloudflare, returns URL",
296
- "step_4": "Send URL back to AIDA in chat message"
 
 
 
 
 
 
 
 
297
  }
298
  }
299
 
300
 
301
  @router.post("/get-image-name")
302
-
303
-
304
  async def get_image_name_for_upload(
305
  user_id: str = Form(...),
306
  session_id: str = Form(...)
307
  ):
308
  """
309
  Generate intelligent filename for Cloudflare Worker uploads.
310
-
311
- Called by Cloudflare Worker before uploading image.
312
  Returns a descriptive filename based on current listing context.
313
-
314
  Args:
315
  user_id: User ID
316
  session_id: Current session ID
317
-
318
  Returns:
319
  {"name": "lagos_modern_apartment_2bed"}
320
  """
321
  try:
322
  from app.ai.services.conversation_service import ConversationService
323
- from datetime import datetime
324
-
325
  # Get current conversation state to extract listing context
326
  conv_service = ConversationService()
327
  state = await conv_service.get_or_create_conversation(user_id, session_id)
328
-
329
  # Extract relevant fields for filename
330
  location = state.provided_fields.get("location", "")
331
  title = state.provided_fields.get("title", "")
332
  bedrooms = state.provided_fields.get("bedrooms")
333
  listing_type = state.provided_fields.get("listing_type", "property")
334
-
335
  # Build intelligent filename
336
  parts = []
337
-
338
  if location:
339
  clean_location = location.replace(' ', '_').replace(',', '').lower()[:15]
340
  parts.append(clean_location)
341
-
342
  if title:
343
  # Extract first 2-3 meaningful words from title
344
  title_words = [w for w in title.lower().split() if len(w) > 3][:2]
@@ -346,23 +102,23 @@ async def get_image_name_for_upload(
346
  parts.extend(title_words)
347
  elif bedrooms:
348
  parts.append(f"{bedrooms}bed")
349
-
350
  if listing_type and listing_type != "property":
351
  parts.append(listing_type[:4])
352
-
353
  # If we have no context, use generic name
354
  if not parts:
355
  parts = ["property", datetime.now().strftime("%Y%m%d")]
356
-
357
  filename = "_".join(parts)
358
-
359
- logger.info(f"Generated image name: {filename}", user_id=user_id, session_id=session_id)
360
-
361
  return {
362
  "name": filename,
363
  "success": True
364
  }
365
-
366
  except Exception as e:
367
  logger.error(f"Failed to generate image name: {str(e)}")
368
  # Fallback to timestamp-based name
@@ -370,53 +126,3 @@ async def get_image_name_for_upload(
370
  "name": f"property_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
371
  "success": True
372
  }
373
-
374
-
375
- # ============================================================
376
- # DEPRECATED ENDPOINTS (Vision Feature Paused)
377
- # ============================================================
378
-
379
- @router.post("/analyze-images")
380
- async def analyze_property_images_deprecated(current_user = Depends(get_current_user)):
381
- """
382
- 🚧 VISION FEATURE PAUSED 🚧
383
-
384
- Image analysis is now handled by Cloudflare Worker with AI vision.
385
- Frontend should upload directly to the worker endpoint.
386
- """
387
- raise HTTPException(
388
- status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
389
- detail={
390
- "message": "Vision analysis moved to Cloudflare Worker",
391
- "suggestion": "Upload images to Cloudflare Worker endpoint for direct processing",
392
- "feature_status": "deprecated"
393
- }
394
- )
395
-
396
-
397
- @router.post("/analyze-video")
398
- async def analyze_property_video_deprecated(current_user = Depends(get_current_user)):
399
- """
400
- 🚧 VISION FEATURE PAUSED 🚧
401
- """
402
- raise HTTPException(
403
- status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
404
- detail={
405
- "message": "Video analysis temporarily unavailable",
406
- "feature_status": "paused"
407
- }
408
- )
409
-
410
-
411
- @router.post("/validate-media")
412
- async def validate_media_deprecated(current_user = Depends(get_current_user)):
413
- """
414
- 🚧 VISION FEATURE PAUSED 🚧
415
- """
416
- raise HTTPException(
417
- status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
418
- detail={
419
- "message": "Media validation moved to Cloudflare Worker",
420
- "feature_status": "deprecated"
421
- }
422
- )
 
1
  # ============================================================
2
  # app/routes/media_upload.py
3
+ # Media Upload Configuration Routes
4
+ # ============================================================
5
+ # NOTE: Image/video uploads are handled DIRECTLY by Cloudflare Worker
6
+ # This file only provides configuration endpoints for the frontend
7
  # ============================================================
8
 
 
9
  import logging
10
+ from datetime import datetime
11
+ from fastapi import APIRouter, Form, HTTPException, status
 
 
 
12
  from app.config import settings
 
 
 
 
13
 
14
  logger = logging.getLogger(__name__)
15
 
16
  router = APIRouter(prefix="/listings", tags=["media"])
17
 
18
 
19
  # ============================================================
20
  # API Endpoints
21
  # ============================================================
22
 
 
23
  @router.get("/upload-config")
24
  async def get_upload_configuration():
25
  """
26
  Get image upload configuration for frontend.
27
+
28
  Frontend should upload images DIRECTLY to the Cloudflare Worker URL.
29
  No backend processing involved.
30
+
31
  Returns:
32
  {
33
  "worker_url": "https://image-upload-worker.destinyebuka7.workers.dev",
 
41
  "max_file_size_mb": 5,
42
  "allowed_types": ["image/jpeg", "image/png", "image/webp"],
43
  "instructions": {
44
+ "profile_upload": {
45
+ "step_1": "Create FormData with: file, type='profile', user_name, user_id",
46
+ "step_2": "POST to worker_url",
47
+ "step_3": "Worker uploads to Cloudflare, returns URL",
48
+ "step_4": "Include URL in PATCH /auth/profile payload"
49
+ },
50
+ "property_upload": {
51
+ "step_1": "Create FormData with: file, type='property', user_id, session_id",
52
+ "step_2": "POST to worker_url",
53
+ "step_3": "Worker uploads to Cloudflare, returns URL",
54
+ "step_4": "Send URL to AIDA in chat message"
55
+ }
56
  }
57
  }
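The two instruction flows above can be sketched as the form fields a frontend would POST to `worker_url`. Field names come straight from the instructions payload; the helper name is illustrative, and the actual file bytes would ride in the multipart `file` field:

```python
# Build the form-data fields for a worker upload, per the upload-config
# instructions: profile uploads carry user_name, property uploads carry
# the chat session_id.
def build_upload_form(upload_type: str, user_id: str, *,
                      user_name: str = "", session_id: str = "") -> dict:
    form = {"type": upload_type, "user_id": user_id}
    if upload_type == "profile":
        form["user_name"] = user_name
    else:
        form["session_id"] = session_id
    return form

profile_form = build_upload_form("profile", "u123", user_name="John Doe")
property_form = build_upload_form("property", "u123", session_id="sess-9")
# e.g. requests.post(worker_url, data=property_form, files={"file": fh})
```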
58
 
59
 
60
  @router.post("/get-image-name")
 
 
61
  async def get_image_name_for_upload(
62
  user_id: str = Form(...),
63
  session_id: str = Form(...)
64
  ):
65
  """
66
  Generate intelligent filename for Cloudflare Worker uploads.
67
+
68
+ Called by Cloudflare Worker before uploading property images.
69
  Returns a descriptive filename based on current listing context.
70
+
71
  Args:
72
  user_id: User ID
73
  session_id: Current session ID
74
+
75
  Returns:
76
  {"name": "lagos_modern_apartment_2bed"}
77
  """
78
  try:
79
  from app.ai.services.conversation_service import ConversationService
80
+
 
81
  # Get current conversation state to extract listing context
82
  conv_service = ConversationService()
83
  state = await conv_service.get_or_create_conversation(user_id, session_id)
84
+
85
  # Extract relevant fields for filename
86
  location = state.provided_fields.get("location", "")
87
  title = state.provided_fields.get("title", "")
88
  bedrooms = state.provided_fields.get("bedrooms")
89
  listing_type = state.provided_fields.get("listing_type", "property")
90
+
91
  # Build intelligent filename
92
  parts = []
93
+
94
  if location:
95
  clean_location = location.replace(' ', '_').replace(',', '').lower()[:15]
96
  parts.append(clean_location)
97
+
98
  if title:
99
  # Extract first 2-3 meaningful words from title
100
  title_words = [w for w in title.lower().split() if len(w) > 3][:2]
 
102
  parts.extend(title_words)
103
  elif bedrooms:
104
  parts.append(f"{bedrooms}bed")
105
+
106
  if listing_type and listing_type != "property":
107
  parts.append(listing_type[:4])
108
+
109
  # If we have no context, use generic name
110
  if not parts:
111
  parts = ["property", datetime.now().strftime("%Y%m%d")]
112
+
113
  filename = "_".join(parts)
114
+
115
+ logger.info(f"Generated image name: {filename} for user: {user_id}")
116
+
117
  return {
118
  "name": filename,
119
  "success": True
120
  }
121
+
122
  except Exception as e:
123
  logger.error(f"Failed to generate image name: {str(e)}")
124
  # Fallback to timestamp-based name
 
126
  "name": f"property_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
127
  "success": True
128
  }
 
app/schemas/user.py CHANGED
@@ -100,6 +100,69 @@ class UserUpdateDto(BaseModel):
100
  languages: Optional[list[str]] = None
101
 
102
 
 
103
  # ============================================================
104
  # Generic Response DTOs
105
  # ============================================================
 
100
  languages: Optional[list[str]] = None
101
 
102
 
103
+ class ProfileUpdateRequest(BaseModel):
104
+ """
105
+ Update user profile request body.
106
+ All fields are optional - only include fields you want to update.
107
+ """
108
+ firstName: Optional[str] = Field(
109
+ None,
110
+ min_length=1,
111
+ max_length=50,
112
+ description="User's first name",
113
+ examples=["John"]
114
+ )
115
+ lastName: Optional[str] = Field(
116
+ None,
117
+ min_length=1,
118
+ max_length=50,
119
+ description="User's last name",
120
+ examples=["Doe"]
121
+ )
122
+ bio: Optional[str] = Field(
123
+ None,
124
+ max_length=150,
125
+ description="Short bio about the user (max 150 characters)",
126
+ examples=["Real estate enthusiast looking for the perfect apartment in Cotonou."]
127
+ )
128
+ location: Optional[str] = Field(
129
+ None,
130
+ max_length=100,
131
+ description="User's location in 'City, Country' format",
132
+ examples=["Cotonou, Benin"]
133
+ )
134
+ languages: Optional[list[str]] = Field(
135
+ None,
136
+ max_length=3,
137
+ description="Languages spoken by the user (max 3)",
138
+ examples=[["English", "French", "Portuguese"]]
139
+ )
140
+ profilePicture: Optional[str] = Field(
141
+ None,
142
+ description="URL to the user's profile picture",
143
+ examples=["https://example.com/images/profile.jpg"]
144
+ )
145
+
146
+ class Config:
147
+ json_schema_extra = {
148
+ "example": {
149
+ "firstName": "John",
150
+ "lastName": "Doe",
151
+ "bio": "Real estate enthusiast",
152
+ "location": "Cotonou, Benin",
153
+ "languages": ["English", "French"],
154
+ "profilePicture": "https://example.com/profile.jpg"
155
+ }
156
+ }
157
+
158
+
159
+ class ProfileUpdateResponse(BaseModel):
160
+ """Response after updating user profile"""
161
+ success: bool = Field(default=True, description="Whether the update was successful")
162
+ message: str = Field(..., description="Response message")
163
+ data: UserProfileWithReviewsDto = Field(..., description="Updated user profile")
164
+
165
+
166
  # ============================================================
167
  # Generic Response DTOs
168
  # ============================================================
cloudflare-worker/image-upload-worker.js CHANGED
@@ -1,9 +1,11 @@
1
- // src/index.js - Updated Image Upload Worker with AI Vision Validation
2
- // Features:
3
- // 1. AI Vision validation (is this a property photo?)
4
- // 2. Get image name from AIDA
5
- // 3. Handle add/replace operations
6
- // 4. Duplicate name numbering
 
 
7
 
8
  export default {
9
  async fetch(request, env) {
@@ -52,9 +54,11 @@ export default {
52
  // Parse form data
53
  const formData = await request.formData();
54
  const imageFile = formData.get("file");
55
- const userMessage = formData.get("message") || "";
56
  const userId = formData.get("user_id") || "";
 
57
  const sessionId = formData.get("session_id") || "";
 
58
  const operation = formData.get("operation") || "add"; // "add" or "replace"
59
  const replaceIndex = formData.get("replace_index"); // For replace operations
60
  const existingImageId = formData.get("existing_image_id"); // ID of image to replace
@@ -63,19 +67,26 @@ export default {
63
  return jsonResponse({ success: false, error: "no_image", message: "No image file provided" }, 400);
64
  }
65
 
66
- // Convert image to bytes for AI validation
67
  const imageBytes = await imageFile.arrayBuffer();
68
- const imageArray = [...new Uint8Array(imageBytes)];
69
 
70
  // ============================================================
71
- // STEP 1: AI Vision Validation
 
 
 
 
 
72
  // ============================================================
 
 
 
73
  let isPropertyImage = false;
74
  let validationReason = "";
75
 
76
  try {
77
  const aiResult = await env.AI.run('@cf/llava-hf/llava-1.5-7b-hf', {
78
- image: imageArray,
79
  prompt: "Is this image showing a real estate property such as a house, apartment, room, building, or property exterior/interior? Answer with ONLY 'YES' or 'NO' followed by a brief reason.",
80
  max_tokens: 50
81
  });
@@ -85,14 +96,13 @@ export default {
85
  validationReason = response;
86
 
87
  } catch (aiError) {
88
- // If AI fails, allow the image through (fail-open for better UX)
89
  console.error("AI validation error:", aiError);
90
  isPropertyImage = true;
91
  validationReason = "AI validation skipped due to error";
92
  }
93
 
94
- // If not a property image, return error (AIDA will handle friendly message)
95
- if (!isPropertyImage) {
96
  return jsonResponse({
97
  success: false,
98
  error: "not_property_image",
@@ -102,13 +112,24 @@ export default {
102
  session_id: sessionId
103
  }, 400);
104
  }
 
 
105
 
106
  // ============================================================
107
- // STEP 2: Get Image Name from AIDA (if new image)
108
  // ============================================================
109
  let imageName = "";
110
 
111
- if (operation === "add") {
 
 
 
 
 
 
 
 
 
112
  try {
113
  const nameResponse = await fetch(`${AIDA_BASE_URL}/ai/get-image-name`, {
114
  method: "POST",
@@ -132,7 +153,7 @@ export default {
132
  }
133
 
134
  // ============================================================
135
- // STEP 3: Handle Replace Operation (delete old image)
136
  // ============================================================
137
  if (operation === "replace" && existingImageId) {
138
  try {
@@ -167,15 +188,15 @@ export default {
167
  }
168
 
169
  // ============================================================
170
- // STEP 4: Upload to Cloudflare Images
171
  // ============================================================
172
-
173
  // Clean the image name for use as filename
174
  const cleanName = imageName
175
  .toLowerCase()
176
- .replace(/[^a-z0-9]+/g, '-')
177
  .replace(/^-|-$/g, '')
178
- || `property-${Date.now()}`;
179
 
180
  // Create new FormData for Cloudflare upload
181
  const uploadFormData = new FormData();
@@ -194,12 +215,13 @@ export default {
194
  const imageId = uploadResponseBody.result.id;
195
  const imageUrl = `https://imagedelivery.net/${ACCOUNT_HASH}/${imageId}/public`;
196
 
197
- // Return success with all context for AIDA
198
  return jsonResponse({
199
  success: true,
200
  id: imageId,
201
  url: imageUrl,
202
  filename: cleanName,
 
203
  message: userMessage,
204
  operation: operation,
205
  replace_index: replaceIndex,
 
1
+ // src/index.js - Image Upload Worker
2
+ // ============================================================
3
+ // SUPPORTS:
4
+ // 1. Profile pictures (type=profile) - named as {user_id}/profile.jpg
5
+ // 2. Property images (type=property) - for listing photos
6
+ // ============================================================
7
+ // NOTE: AI Vision validation is PAUSED - not currently in use
8
+ // ============================================================
9
 
10
  export default {
11
  async fetch(request, env) {
 
54
  // Parse form data
55
  const formData = await request.formData();
56
  const imageFile = formData.get("file");
57
+ const uploadType = formData.get("type") || "property"; // "profile" or "property"
58
  const userId = formData.get("user_id") || "";
59
+ const userName = formData.get("user_name") || ""; // For profile naming
60
  const sessionId = formData.get("session_id") || "";
61
+ const userMessage = formData.get("message") || "";
62
  const operation = formData.get("operation") || "add"; // "add" or "replace"
63
  const replaceIndex = formData.get("replace_index"); // For replace operations
64
  const existingImageId = formData.get("existing_image_id"); // ID of image to replace
 
67
  return jsonResponse({ success: false, error: "no_image", message: "No image file provided" }, 400);
68
  }
69
 
70
+ // Convert image to bytes
71
  const imageBytes = await imageFile.arrayBuffer();
 
72
 
73
  // ============================================================
74
+ // AI VISION VALIDATION - PAUSED
75
+ // ============================================================
76
+ // NOTE: AI Vision validation is currently disabled/paused.
77
+ // The HuggingFace vision API is not being used.
78
+ // All images are allowed through without property validation.
79
+ // To re-enable, uncomment the validation block below.
80
  // ============================================================
81
+
82
+ /*
83
+ // PAUSED: AI Vision Validation Block
84
  let isPropertyImage = false;
85
  let validationReason = "";
86
 
87
  try {
88
  const aiResult = await env.AI.run('@cf/llava-hf/llava-1.5-7b-hf', {
89
+ image: [...new Uint8Array(imageBytes)],
90
  prompt: "Is this image showing a real estate property such as a house, apartment, room, building, or property exterior/interior? Answer with ONLY 'YES' or 'NO' followed by a brief reason.",
91
  max_tokens: 50
92
  });
 
96
  validationReason = response;
97
 
98
  } catch (aiError) {
 
99
  console.error("AI validation error:", aiError);
100
  isPropertyImage = true;
101
  validationReason = "AI validation skipped due to error";
102
  }
103
 
104
+ // If not a property image and type is property, return error
105
+ if (!isPropertyImage && uploadType === "property") {
106
  return jsonResponse({
107
  success: false,
108
  error: "not_property_image",
 
112
  session_id: sessionId
113
  }, 400);
114
  }
115
+ */
116
+ // END PAUSED BLOCK
117
 
118
  // ============================================================
119
+ // DETERMINE IMAGE NAME
120
  // ============================================================
121
  let imageName = "";
122
 
123
+ if (uploadType === "profile") {
124
+ // Profile pictures: {user_id}/profile or {user_name}/profile
125
+ const identifier = userName || userId || `user_${Date.now()}`;
126
+ const cleanIdentifier = identifier
127
+ .toLowerCase()
128
+ .replace(/[^a-z0-9]+/g, '_')
129
+ .replace(/^_|_$/g, '');
130
+ imageName = `${cleanIdentifier}_profile`;
131
+ } else if (operation === "add") {
132
+ // Property images: get name from AIDA or use timestamp
133
  try {
134
  const nameResponse = await fetch(`${AIDA_BASE_URL}/ai/get-image-name`, {
135
  method: "POST",
 
153
  }
154
 
155
  // ============================================================
156
+ // HANDLE REPLACE OPERATION (delete old image)
157
  // ============================================================
158
  if (operation === "replace" && existingImageId) {
159
  try {
 
188
  }
189
 
190
  // ============================================================
191
+ // UPLOAD TO CLOUDFLARE IMAGES
192
  // ============================================================
193
+
194
  // Clean the image name for use as filename
195
  const cleanName = imageName
196
  .toLowerCase()
197
+ .replace(/[^a-z0-9_]+/g, '-')
198
  .replace(/^-|-$/g, '')
199
+ || `image-${Date.now()}`;
200
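The JS cleaning rule above has a direct Python equivalent, useful for testing the naming convention server-side (a sketch; `strip('-')` trims all edge dashes where the worker's regex trims one per side):

```python
# Python equivalent of the worker's filename cleaning: lowercase, collapse
# any run outside [a-z0-9_] to a single '-', trim edge dashes, and fall
# back to a timestamped name when nothing survives.
import re
import time

def clean_image_name(name: str) -> str:
    cleaned = re.sub(r'[^a-z0-9_]+', '-', name.lower()).strip('-')
    return cleaned or f"image-{int(time.time() * 1000)}"
```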
 
201
  // Create new FormData for Cloudflare upload
202
  const uploadFormData = new FormData();
 
215
  const imageId = uploadResponseBody.result.id;
216
  const imageUrl = `https://imagedelivery.net/${ACCOUNT_HASH}/${imageId}/public`;
217
 
218
+ // Return success with all context
219
  return jsonResponse({
220
  success: true,
221
  id: imageId,
222
  url: imageUrl,
223
  filename: cleanName,
224
+ type: uploadType,
225
  message: userMessage,
226
  operation: operation,
227
  replace_index: replaceIndex,
docs/CLARA_RLM_INTEGRATION_PLAN.md ADDED
@@ -0,0 +1,537 @@
1
+ # CLaRa + RLM Integration Plan for AIDA
2
+
3
+ **Date**: 2026-02-09
4
+ **Author**: AI Architecture Analysis
5
+ **Status**: Proposal
6
+
7
+ ---
8
+
9
+ ## Executive Summary
10
+
11
+ This document outlines how **Apple's CLaRa** (Continuous Latent Reasoning) and **MIT's RLM** (Recursive Language Models) can enhance AIDA's current RAG architecture for real estate search.
12
+
13
+ **TL;DR**:
14
+ - **CLaRa**: Compress 4096-dim vectors to 256-dim β†’ 16x faster search, 90% storage savings
15
+ - **RLM**: Enable complex multi-hop reasoning for queries like "3-bed near good schools in safe neighborhood under 500k"
16
+ - **Combined Impact**: 10x performance boost + deeper contextual understanding
17
+
18
+ ---
19
+
20
+ ## Part 1: Current RAG Implementation Analysis
21
+
22
+ ### Architecture Overview
23
+
24
+ ```
+ AIDA Current RAG Architecture
+
+ User Query → Intent Classifier → Search Extractor
+       ↓
+ Strategy Selector (LLM decides):
+   • MONGO_ONLY (pure filters)
+   • QDRANT_ONLY (semantic search)
+   • MONGO_THEN_QDRANT (filter → semantic)
+   • QDRANT_THEN_MONGO (semantic → filter)
+       ↓
+ Embedding Service:
+   • Model: qwen/qwen3-embedding-8b (via OpenRouter)
+   • Dimension: 4096
+   • Format: "{title}. {beds}-bed in {location}. {description}"
+       ↓
+ Qdrant Vector DB:
+   • Collection: "listings"
+   • ~1000s of listings × 4096 floats/listing = ~16MB+ vectors
+   • Payload: full listing metadata (~50KB per listing)
+       ↓
+ Search Results → Enrich with owner data → Brain LLM → Response
+ ```
51
+
52
+ ### Key Files Involved
53
+
54
+ | File | Purpose | RAG Role |
55
+ |------|---------|----------|
56
+ | `search_service.py` | Main search orchestration | Hybrid search execution |
57
+ | `vector_service.py` | Qdrant indexing | Real-time vector upserts |
58
+ | `search_strategy_selector.py` | LLM-based strategy picker | Intelligent routing |
59
+ | `search_extractor.py` | Extract params from query | Query understanding |
60
+ | `brain.py` | Agent reasoning engine | Response generation |
61
+ | `redis_context_memory.py` | Conversation memory | Context retention |
62
+
63
+ ### Current Performance Metrics (Estimated)
64
+
65
+ | Metric | Current Value | Bottleneck |
66
+ |--------|--------------|------------|
67
+ | **Vector Size** | 4096 floats Γ— 4 bytes = 16KB/listing | Storage & bandwidth |
68
+ | **Search Latency** | ~200-500ms (embedding + search + enrichment) | Multiple network calls |
69
+ | **Memory Usage** | 16KB vectors + 50KB payload = 66KB/listing | Qdrant payload size |
70
+ | **Semantic Depth** | Single-hop (direct semantic match) | No multi-hop reasoning |
71
+
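The storage figures in the table above follow directly from float32 vector sizes; a quick back-of-envelope check (plain arithmetic, no AIDA code involved):

```python
BYTES_PER_FLOAT32 = 4

def vector_kb(dim: int) -> float:
    """Size of one float32 embedding vector in KB."""
    return dim * BYTES_PER_FLOAT32 / 1024

print(vector_kb(4096))  # 16.0 KB/listing with the current 4096-dim embeddings
print(vector_kb(256))   # 1.0 KB/listing at 16x CLaRa compression
```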
72
+ ---
73
+
74
+ ## Part 2: CLaRa Integration Strategy
75
+
76
+ ### What is CLaRa?
77
+
78
+ **CLaRa** = Continuous Latent Reasoning for Compression-Native RAG
79
+
80
+ **Key Innovation**: Instead of storing raw text chunks or large embeddings, CLaRa compresses documents into **continuous memory tokens** that preserve semantic reasoning while being 16x-128x smaller.
81
+
82
+ ### How CLaRa Would Transform AIDA
83
+
84
+ #### Current Flow:
85
+ ```python
86
+ # app/ai/services/search_service.py (CURRENT)
87
+
88
+ async def embed_query(text: str) -> List[float]:
89
+ # Returns 4096-dim vector
90
+ response = await client.post(
91
+ "https://openrouter.ai/api/v1/embeddings",
92
+ json={"model": "qwen/qwen3-embedding-8b", "input": text}
93
+ )
94
+ return response.json()["data"][0]["embedding"] # 4096 floats
95
+
96
+ async def hybrid_search(query_text: str, search_params: Dict):
97
+ vector = await embed_query(query_text) # 4096-dim
98
+ results = await qdrant_client.query_points(
99
+ collection_name="listings",
100
+ query=vector, # Search with 4096-dim
101
+ query_filter=build_filters(search_params),
102
+ limit=10
103
+ )
104
+ # PROBLEM: Separate retrieval & generation
105
+ # Brain LLM has to re-process retrieved listings
106
+ ```
107
+
108
+ #### With CLaRa:
109
+ ```python
110
+ # app/ai/services/clara_search_service.py (NEW)
111
+
112
+ from transformers import AutoModel, AutoTokenizer
113
+ import torch
114
+
115
+ # Load CLaRa model
116
+ clara_model = AutoModel.from_pretrained("apple/CLaRa-7B-Instruct")
117
+ clara_tokenizer = AutoTokenizer.from_pretrained("apple/CLaRa-7B-Instruct")
118
+
119
+ async def compress_listing_to_memory_tokens(listing: Dict) -> torch.Tensor:
120
+ """
121
+ Compress listing into continuous memory tokens (16x-128x smaller)
122
+
123
+ BEFORE: 4096-dim embedding + full payload
124
+ AFTER: 256-dim (16x) or 32-dim (128x) continuous token
125
+ """
126
+ # Build semantic text
127
+ text = f"{listing['title']}. {listing['bedrooms']}-bed in {listing['location']}. {listing['description']}"
128
+
129
+ # CLaRa compression (QA-guided semantic compression)
130
+ inputs = clara_tokenizer(text, return_tensors="pt")
131
+ with torch.no_grad():
132
+ compressed_token = clara_model.compress(
133
+ inputs,
134
+ compression_ratio=16 # or 128 for max compression
135
+ )
136
+
137
+ # Returns: 256-dim continuous memory token
138
+ # Preserves: "key reasoning signals" (location, price, features)
139
+ # Discards: Filler words, redundant descriptions
140
+ return compressed_token
141
+
142
+ async def clara_unified_search(query: str, search_params: Dict):
143
+ """
144
+ Unified retrieval + generation in CLaRa's shared latent space
145
+
146
+ BENEFIT: No need to re-encode for generation - already in shared space
147
+ """
148
+ # 1. Compress query
149
+ query_inputs = clara_tokenizer(query, return_tensors="pt")
150
+ query_token = clara_model.compress(query_inputs)
151
+
152
+ # 2. Retrieve in latent space (16x-128x faster than 4096-dim search)
153
+ # CLaRa's query encoder and generator share the same space
154
+ results = await qdrant_client.query_points(
155
+ collection_name="listings_clara_compressed",
156
+ query=query_token.tolist(), # 256-dim (16x smaller)
157
+ limit=10
158
+ )
159
+
160
+ # 3. Generate response DIRECTLY from compressed tokens
161
+ # No re-encoding needed - already in shared latent space
162
+ response = clara_model.generate_from_compressed(
163
+ query_token=query_token,
164
+ retrieved_tokens=[r.vector for r in results],
165
+ max_length=200
166
+ )
167
+
168
+ return {
169
+ "results": results,
170
+ "natural_response": response,
171
+ "compression_used": "16x"
172
+ }
173
+ ```
174
+
175
+ ### CLaRa Benefits for AIDA
176
+
177
+ | Benefit | Impact | Measurement |
178
+ |---------|--------|-------------|
179
+ | **Storage Savings** | 4096 → 256 dims = 16x smaller | 1000 listings: 16MB → 1MB |
180
+ | **Search Speed** | Smaller vectors = faster cosine similarity | 200ms → 50ms (4x faster) |
181
+ | **Unified Processing** | Retrieval + generation in same space | No re-encoding overhead |
182
+ | **Semantic Preservation** | QA-guided compression keeps reasoning signals | Same search quality, less data |
183
+ | **Memory Efficiency** | Less Redis cache pressure | Can cache 16x more listings |
184
+
185
+ ### Migration Path to CLaRa
186
+
187
+ #### Phase 1: Parallel Deployment (Low Risk)
188
+ ```python
189
+ # app/ai/services/hybrid_search_router.py (NEW)
190
+
191
+ async def search_with_fallback(query: str, params: Dict):
192
+ """
193
+ Run CLaRa + Traditional RAG in parallel, compare results
194
+ """
195
+ clara_results, traditional_results = await asyncio.gather(
196
+ clara_unified_search(query, params),
197
+ hybrid_search(query, params) # Current implementation
198
+ )
199
+
200
+ # Log comparison metrics
201
+ logger.info("CLaRa vs Traditional",
202
+ clara_latency=clara_results['latency'],
203
+ trad_latency=traditional_results['latency'],
204
+ clara_count=len(clara_results['results']),
205
+ trad_count=len(traditional_results['results']))
206
+
207
+ # Use CLaRa if available, fallback to traditional
208
+ return clara_results if clara_results['success'] else traditional_results
209
+ ```
210
+
211
+ #### Phase 2: Gradual Indexing
212
+ ```python
213
+ # Migration script: sync_to_clara_compressed.py
214
+
215
+ async def migrate_to_clara():
216
+ """
217
+ Compress existing listings into CLaRa memory tokens
218
+ """
219
+ db = await get_db()
220
+ cursor = db.listings.find({"status": "active"})
221
+
222
+ async for listing in cursor:
223
+ # Compress to memory tokens
224
+ compressed_token = await compress_listing_to_memory_tokens(listing)
225
+
226
+ # Upsert to new collection
227
+ await qdrant_client.upsert(
228
+ collection_name="listings_clara_compressed",
229
+ points=[PointStruct(
230
+ id=str(listing["_id"]),
231
+ vector=compressed_token.tolist(), # 256-dim
232
+ payload={
233
+ "mongo_id": str(listing["_id"]),
234
+ "title": listing["title"],
235
+ "location": listing["location"],
236
+ "price": listing["price"],
237
+ # Minimal payload - most semantic info is in compressed token
238
+ }
239
+ )]
240
+ )
241
+ ```
242
+
243
+ #### Phase 3: Cutover
244
+ - Monitor CLaRa performance for 1 week
245
+ - If latency < 100ms and quality ≥ traditional RAG → full cutover
246
+ - Deprecate old `qwen/qwen3-embedding-8b` embeddings
247
+
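The cutover criteria above can be expressed as a tiny gate check (thresholds taken from the checklist; the function name is illustrative, not part of AIDA's codebase):

```python
def ready_for_cutover(p95_latency_ms: float, quality_vs_baseline: float) -> bool:
    """Phase 3 gate: latency under 100 ms AND search quality at least
    on par with the traditional RAG baseline (ratio >= 1.0)."""
    return p95_latency_ms < 100 and quality_vs_baseline >= 1.0

print(ready_for_cutover(85.0, 1.02))   # True: safe to cut over
print(ready_for_cutover(120.0, 1.10))  # False: keep monitoring
```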
248
+ ---
249
+
250
+ ## Part 3: RLM Integration Strategy
251
+
252
+ ### What is RLM?
253
+
254
+ **RLM** = Recursive Language Models (from MIT CSAIL)
255
+
256
+ **Key Innovation**: Instead of processing entire context at once, RLM **recursively explores** text by:
257
+ 1. Decomposing queries into sub-tasks
258
+ 2. Calling itself on snippets
259
+ 3. Building up understanding through recursive reasoning
260
+
261
+ ### Where RLM Excels Over Current RAG
262
+
263
+ | Query Type | Current RAG Limitation | RLM Solution |
264
+ |------------|----------------------|--------------|
265
+ | **Multi-hop**: "3-bed near good schools AND safe neighborhood" | Single semantic search can't connect "schools" → "safety" | Recursively explore: Find schools → Check neighborhoods → Cross-reference safety data |
266
+ | **Aggregation**: "Show me average prices in Cotonou vs Calavi" | No aggregation logic in vector search | Recursive aggregation: Search Cotonou → Calculate avg → Search Calavi → Compare |
267
+ | **Complex filters**: "Under 500k OR (2-bed AND has pool)" | Boolean logic not native to vector similarity | Recursive decomposition: (Filter 1) ∪ (Filter 2 ∩ Filter 3) |
268
+
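The boolean decomposition in the last row reduces to plain set algebra over listing-ID result sets; a minimal sketch with toy data (not AIDA's actual retrieval API):

```python
# Toy listing-ID sets standing in for three sub-query results:
under_500k = {"L1", "L2", "L3"}
two_bed    = {"L2", "L3", "L4"}
has_pool   = {"L3", "L4", "L5"}

# "Under 500k OR (2-bed AND has pool)" -> Filter1 ∪ (Filter2 ∩ Filter3)
matches = under_500k | (two_bed & has_pool)
print(sorted(matches))  # ['L1', 'L2', 'L3', 'L4']
```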
269
+ ### RLM Architecture for AIDA
270
+
271
+ ```python
272
+ # app/ai/services/rlm_search_service.py (NEW)
273
+
274
+ class RecursiveSearchAgent:
275
+ """
276
+ RLM-based search agent for complex multi-hop queries
277
+
278
+ Example Query: "3-bed apartments near international schools in
279
+ safe neighborhoods in Cotonou under 500k XOF"
280
+
281
+ Recursive Breakdown:
282
+ 1. Find international schools in Cotonou
283
+ 2. For each school β†’ Find safe neighborhoods within 2km
284
+ 3. For each neighborhood β†’ Find 3-bed apartments under 500k
285
+ 4. Aggregate results β†’ Return top matches
286
+ """
287
+
288
+ def __init__(self, brain_llm, search_service):
289
+ self.brain = brain_llm
290
+ self.search = search_service
291
+ self.max_depth = 3 # Prevent infinite recursion
292
+
293
+ async def recursive_search(
294
+ self,
295
+ query: str,
296
+ depth: int = 0,
297
+ context: Dict = None
298
+ ) -> List[Dict]:
299
+ """
300
+ Recursively decompose and execute complex queries
301
+ """
302
+ if depth > self.max_depth:
303
+ logger.warning("Max recursion depth reached")
304
+ return []
305
+
306
+ # Step 1: Decompose query using Brain LLM
307
+ decomposition = await self.brain.decompose_query(query, context)
308
+
309
+ if decomposition["is_atomic"]:
310
+ # Base case: Execute simple search
311
+ return await self.search.hybrid_search(query, decomposition["params"])
312
+
313
+ # Recursive case: Break into sub-queries
314
+ sub_results = []
315
+ for sub_query in decomposition["sub_queries"]:
316
+ sub_result = await self.recursive_search(
317
+ sub_query["query"],
318
+ depth=depth + 1,
319
+ context={**(context or {}), **sub_query["context"]}
320
+ )
321
+ sub_results.append(sub_result)
322
+
323
+ # Step 2: Aggregate sub-results using LLM reasoning
324
+ aggregated = await self.brain.aggregate_results(
325
+ query=query,
326
+ sub_results=sub_results,
327
+ strategy=decomposition["aggregation_strategy"] # "union", "intersection", "rank"
328
+ )
329
+
330
+ return aggregated
331
+
332
+ # Example Usage:
333
+ rlm_agent = RecursiveSearchAgent(brain_llm, search_service)
334
+
335
+ results = await rlm_agent.recursive_search(
336
+ "Find 3-bed apartments near international schools in safe neighborhoods in Cotonou under 500k"
337
+ )
338
+
339
+ # RLM Flow:
340
+ # 1. Decompose: "Find international schools in Cotonou"
341
+ # → Calls itself: search("international schools Cotonou")
342
+ # 2. For each school location:
343
+ # → Calls itself: search("safe neighborhoods within 2km of {school.lat, school.lon}")
344
+ # 3. For each neighborhood:
345
+ # → Calls itself: search("3-bed apartments under 500k in {neighborhood}")
346
+ # 4. Aggregate all results → Rank by proximity to schools + safety score
347
+ ```
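The proximity steps above ("within 2km") need a distance function; a standard Haversine helper is one way to implement it (a sketch under the assumption that listing and POI coordinates are available; not AIDA's confirmed `_calculate_distance`):

```python
import math

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    r = 6371.0  # mean Earth radius, km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Roughly 10 km across Cotonou (coordinates reused from the test suite in this commit):
print(round(haversine_km(6.3654, 2.4183, 6.4300, 2.3500), 1))
```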
348
+
349
+ ### RLM Benefits for AIDA
350
+
351
+ | Benefit | Impact |
352
+ |---------|--------|
353
+ | **Complex Queries** | Handle multi-hop reasoning (schools β†’ safety β†’ apartments) |
354
+ | **Boolean Logic** | Native support for AND/OR/NOT conditions |
355
+ | **Aggregation** | Calculate averages, comparisons across locations |
356
+ | **Context Preservation** | Each recursive call maintains full reasoning chain |
357
+ | **Explainability** | Can show reasoning tree to users ("I found 3 schools, then...") |
358
+
359
+ ### Integration with CLaRa
360
+
361
+ **Best of Both Worlds**: CLaRa for fast retrieval, RLM for deep reasoning
362
+
363
+ ```python
364
+ async def clara_rlm_hybrid_search(query: str, params: Dict = None):
365
+ """
366
+ Use CLaRa for speed, RLM for depth
367
+
368
+ Flow:
369
+ 1. Quick check: Is this a simple query? → Use CLaRa only (fast path)
370
+ 2. Complex query? → Use RLM to decompose → CLaRa for each sub-query (deep path)
371
+ """
372
+ complexity = await analyze_query_complexity(query)
373
+
374
+ if complexity == "simple":
375
+ # Fast path: CLaRa unified search
376
+ return await clara_unified_search(query, params)
377
+
378
+ else:
379
+ # Deep path: RLM decomposes β†’ CLaRa executes each step
380
+ rlm_agent = RecursiveSearchAgent(
381
+ brain_llm=brain_llm,
382
+ search_service=clara_unified_search # Use CLaRa as base search engine
383
+ )
384
+ return await rlm_agent.recursive_search(query)
385
+ ```
386
+
387
+ ---
388
+
389
+ ## Part 4: Implementation Roadmap
390
+
391
+ ### Timeline: 12 Weeks
392
+
393
+ #### **Week 1-2: Research & Setup**
394
+ - [ ] Test CLaRa-7B-Instruct locally with sample listings
395
+ - [ ] Benchmark compression ratio (16x vs 128x) vs search quality
396
+ - [ ] Measure latency: CLaRa vs current qwen3-embedding-8b
397
+ - [ ] Set up RLM proof-of-concept with MIT framework
398
+
399
+ #### **Week 3-4: CLaRa Pilot**
400
+ - [ ] Create `listings_clara_compressed` Qdrant collection
401
+ - [ ] Implement `compress_listing_to_memory_tokens()` function
402
+ - [ ] Migrate 100 test listings to CLaRa compressed format
403
+ - [ ] A/B test: CLaRa vs traditional RAG on 100 real queries
404
+ - [ ] Measure: latency, storage, search quality (user feedback)
405
+
406
+ #### **Week 5-6: RLM Prototype**
407
+ - [ ] Implement `RecursiveSearchAgent` class
408
+ - [ ] Build query decomposition logic with Brain LLM
409
+ - [ ] Test on complex queries: "3-bed near schools in safe areas under 500k"
410
+ - [ ] Validate: Does RLM find better results than single-hop RAG?
411
+
412
+ #### **Week 7-8: Integration**
413
+ - [ ] Build `clara_rlm_hybrid_search()` router
414
+ - [ ] Simple queries → CLaRa (fast path)
415
+ - [ ] Complex queries → RLM + CLaRa (deep path)
416
+ - [ ] Add query complexity classifier
417
+
418
+ #### **Week 9-10: Production Prep**
419
+ - [ ] Migrate all active listings to CLaRa compressed format
420
+ - [ ] Set up monitoring: Latency, storage, cache hit rates
421
+ - [ ] Implement fallback to traditional RAG (safety net)
422
+ - [ ] Load testing: 1000 concurrent searches
423
+
424
+ #### **Week 11-12: Deployment & Optimization**
425
+ - [ ] Deploy CLaRa to production (gradual rollout: 10% → 50% → 100%)
426
+ - [ ] Monitor performance vs baseline
427
+ - [ ] Fine-tune compression ratio based on real-world data
428
+ - [ ] Optimize RLM recursion depth and caching
429
+
430
+ ---
431
+
432
+ ## Part 5: Expected Impact
433
+
434
+ ### Performance Gains
435
+
436
+ | Metric | Current | With CLaRa | With CLaRa + RLM |
437
+ |--------|---------|-----------|-----------------|
438
+ | **Search Latency** | 200-500ms | 50-150ms (3-4x faster) | 100-300ms (complex queries) |
439
+ | **Storage (1000 listings)** | 16MB vectors | 1MB (16x smaller) | 1MB + reasoning cache |
440
+ | **Complex Query Support** | ❌ Single-hop only | ✅ Fast retrieval | ✅✅ Multi-hop reasoning |
441
+ | **Memory Efficiency** | 66KB/listing | 5KB/listing (13x better) | 5KB + context cache |
442
+
443
+ ### Cost Savings
444
+
445
+ ```
446
+ Qdrant Cloud Costs (Estimated):
447
+ - Current: 16MB vectors + 50MB payloads = $XX/month
448
+ - With CLaRa: 1MB vectors + 10MB payloads = $YY/month (80% savings)
449
+
450
+ OpenRouter Embedding API:
451
+ - Current: 1000 queries/day Γ— $0.0001/query = $3/month
452
+ - With CLaRa: Reduced by 50% (fewer re-embeddings) = $1.50/month
453
+ ```
454
+
455
+ ### User Experience
456
+
457
+ | Before | After |
458
+ |--------|-------|
459
+ | "Find 3-bed in Cotonou" → 10 results (generic) | "Find 3-bed in Cotonou" → 10 results (same speed, less cost) |
460
+ | "Find apartment near school" → Mixed results (no school proximity logic) | "Find apartment near school" → RLM finds schools → ranks by proximity |
461
+ | Complex queries fail or return irrelevant results | Multi-hop reasoning delivers accurate results |
462
+
463
+ ---
464
+
465
+ ## Part 6: Risk Analysis & Mitigation
466
+
467
+ ### Risks
468
+
469
+ | Risk | Impact | Mitigation |
470
+ |------|--------|------------|
471
+ | **CLaRa Model Size** | 7B parameters = high memory | Use quantized version (4-bit) or cloud API |
472
+ | **Compression Loss** | Over-compression loses semantic detail | Test 16x vs 128x, pick optimal ratio |
473
+ | **RLM Recursion Depth** | Infinite loops or slow queries | Max depth limit = 3, timeout after 5s |
474
+ | **Integration Complexity** | Breaking existing search flow | Parallel deployment, gradual rollout |
475
+ | **Vendor Lock-in** | Relying on Apple CLaRa | Keep traditional RAG as fallback |
476
+
477
+ ### Mitigation Strategy
478
+
479
+ 1. **Parallel Deployment**: Run CLaRa + Traditional RAG side-by-side for 2 weeks
480
+ 2. **Gradual Rollout**: Start with 10% traffic → Monitor → Scale to 100%
481
+ 3. **Fallback Mechanism**: If CLaRa fails → Auto-fallback to qwen3-embedding-8b
482
+ 4. **A/B Testing**: Measure user satisfaction (click-through rate, booking conversions)
483
+
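The fallback mechanism, combined with the 5 s timeout from the risk table, can be sketched with `asyncio.wait_for`; the function and stub names here are illustrative placeholders, not AIDA's actual API:

```python
import asyncio

async def search_with_safety_net(query, primary, fallback, timeout_s=5.0):
    # Give the experimental CLaRa path a hard time budget; on timeout
    # or any error, auto-fallback to the traditional RAG path.
    try:
        return await asyncio.wait_for(primary(query), timeout=timeout_s)
    except Exception:
        return await fallback(query)

# Demo with stub search callables:
async def clara_stub(q):
    return {"path": "clara", "query": q}

async def hanging_stub(q):
    await asyncio.sleep(0.2)  # simulates a stalled CLaRa call
    return {"path": "clara", "query": q}

async def traditional_stub(q):
    return {"path": "traditional", "query": q}

print(asyncio.run(search_with_safety_net("3-bed Cotonou", clara_stub, traditional_stub)))
print(asyncio.run(search_with_safety_net("3-bed Cotonou", hanging_stub, traditional_stub, timeout_s=0.05)))
```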
484
+ ---
485
+
486
+ ## Part 7: Next Steps
487
+
488
+ ### Immediate Actions (This Week)
489
+
490
+ 1. **Research**:
491
+ - [ ] Clone CLaRa repo: `git clone https://github.com/apple/ml-clara`
492
+ - [ ] Review Hugging Face model card: https://huggingface.co/apple/CLaRa-7B-Instruct
493
+ - [ ] Read MIT RLM paper: https://arxiv.org/abs/[RLM-paper-id]
494
+
495
+ 2. **Prototype**:
496
+ - [ ] Create `docs/clara_prototype.py` (compression test)
497
+ - [ ] Test with 10 sample listings
498
+ - [ ] Measure: original size vs compressed size vs search quality
499
+
500
+ 3. **Planning**:
501
+ - [ ] Schedule team meeting to review this plan
502
+ - [ ] Estimate GPU/CPU requirements for CLaRa inference
503
+ - [ ] Check budget for cloud inference (AWS SageMaker, Modal, etc.)
504
+
505
+ ### Questions to Answer
506
+
507
+ 1. **Hosting**: Run CLaRa locally (GPU required) or use cloud API?
508
+ 2. **Compression Ratio**: 16x or 128x? (Trade-off: speed vs quality)
509
+ 3. **RLM Priority**: Do we need multi-hop reasoning now, or focus on CLaRa first?
510
+ 4. **User Impact**: Will users notice the difference? (Faster search? Better results?)
511
+
512
+ ---
513
+
514
+ ## Conclusion
515
+
516
+ **CLaRa** and **RLM** represent the next evolution of RAG architecture:
517
+
518
+ - **CLaRa** → **16x smaller vectors, faster search, ~90% storage savings, unified retrieval + generation**
519
+ - **RLM** → **Multi-hop reasoning for complex queries traditional RAG can't handle**
520
+
521
+ Your AIDA backend is already well-architected with:
522
+ - ✅ Hybrid search strategies
523
+ - ✅ Intelligent routing
524
+ - ✅ Real-time vector sync
525
+ - ✅ Conversation memory
526
+
527
+ Adding CLaRa + RLM would **supercharge** this foundation, making AIDA:
528
+ 1. **Faster** (3-4x search speed)
529
+ 2. **Cheaper** (80% storage savings)
530
+ 3. **Smarter** (multi-hop reasoning)
531
+ 4. **More scalable** (handle 10x more listings without performance degradation)
532
+
533
+ **Recommended First Step**: Start with **CLaRa pilot** (Week 1-4) to prove compression works, then add **RLM** for complex queries.
534
+
535
+ ---
536
+
537
+ **Contact**: For questions or to discuss implementation details, ping the team.
test_rlm.py ADDED
@@ -0,0 +1,481 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ RLM (Recursive Language Model) Test Suite for AIDA
4
+
5
+ Tests:
6
+ 1. Query Analyzer - Detect complex query types
7
+ 2. RLM Search Service - Execute recursive searches
8
+ 3. Integration - End-to-end flow
9
+
10
+ Run with:
11
+ python test_rlm.py
12
+ python test_rlm.py --live # Run with actual LLM calls
13
+
14
+ Author: AIDA Team
15
+ Date: 2026-02-09
16
+ """
17
+
18
+ import asyncio
19
+ import sys
20
+ import json
21
+ from typing import List, Dict
22
+
23
+ # Add project root to path
24
+ sys.path.insert(0, ".")
25
+
26
+
27
+ # =============================================================================
28
+ # Color output for terminal
29
+ # =============================================================================
30
+
31
+ class Colors:
32
+ HEADER = '\033[95m'
33
+ BLUE = '\033[94m'
34
+ CYAN = '\033[96m'
35
+ GREEN = '\033[92m'
36
+ WARNING = '\033[93m'
37
+ FAIL = '\033[91m'
38
+ ENDC = '\033[0m'
39
+ BOLD = '\033[1m'
40
+
41
+
42
+ def print_header(text: str):
43
+ print(f"\n{Colors.HEADER}{Colors.BOLD}{'='*60}{Colors.ENDC}")
44
+ print(f"{Colors.HEADER}{Colors.BOLD}{text}{Colors.ENDC}")
45
+ print(f"{Colors.HEADER}{Colors.BOLD}{'='*60}{Colors.ENDC}\n")
46
+
47
+
48
+ def print_success(text: str):
49
+ print(f"{Colors.GREEN}✅ {text}{Colors.ENDC}")
50
+
51
+
52
+ def print_fail(text: str):
53
+ print(f"{Colors.FAIL}❌ {text}{Colors.ENDC}")
54
+
55
+
56
+ def print_info(text: str):
57
+ print(f"{Colors.CYAN}ℹ️ {text}{Colors.ENDC}")
58
+
59
+
60
+ def print_warning(text: str):
61
+ print(f"{Colors.WARNING}⚠️ {text}{Colors.ENDC}")
62
+
63
+
64
+ # =============================================================================
65
+ # Test 1: Query Analyzer
66
+ # =============================================================================
67
+
68
+ def test_query_analyzer():
69
+ """Test the RLM Query Analyzer"""
70
+ print_header("Test 1: RLM Query Analyzer")
71
+
72
+ from app.ai.services.rlm_query_analyzer import (
73
+ analyze_query_complexity,
74
+ QueryComplexity
75
+ )
76
+
77
+ test_cases = [
78
+ # Multi-hop queries
79
+ ("3-bed apartment near international schools in Cotonou", QueryComplexity.MULTI_HOP),
80
+ ("House close to the beach in Calavi", QueryComplexity.MULTI_HOP),
81
+ ("Apartment within 2km of the airport", QueryComplexity.MULTI_HOP),
82
+ ("Find something near the university", QueryComplexity.MULTI_HOP),
83
+
84
+ # Boolean OR queries
85
+ ("Under 500k XOF or has a pool", QueryComplexity.BOOLEAN_OR),
86
+ ("2-bedroom or 3-bedroom in Cotonou", QueryComplexity.BOOLEAN_OR),
87
+ ("Either furnished or with parking", QueryComplexity.BOOLEAN_OR),
88
+
89
+ # Comparative queries
90
+ ("Compare prices in Cotonou vs Calavi", QueryComplexity.COMPARATIVE),
91
+ ("Which is cheaper: 2-bed in Cotonou or 3-bed in Calavi?", QueryComplexity.COMPARATIVE),
92
+ ("Difference between rent in Porto-Novo and Cotonou", QueryComplexity.COMPARATIVE),
93
+
94
+ # Aggregation queries
95
+ ("What is the average price in Cotonou?", QueryComplexity.AGGREGATION),
96
+ ("How many 3-bed apartments are available?", QueryComplexity.AGGREGATION),
97
+ ("Total listings in Calavi", QueryComplexity.AGGREGATION),
98
+
99
+ # Multi-factor queries
100
+ ("Best family apartment near schools and parks in safe area", QueryComplexity.MULTI_FACTOR),
101
+ ("Top luxury modern apartments with good security", QueryComplexity.MULTI_FACTOR),
102
+ ("Ideal quiet peaceful home for family", QueryComplexity.MULTI_FACTOR),
103
+
104
+ # Simple queries (should NOT trigger RLM)
105
+ ("3-bed apartment in Cotonou", QueryComplexity.SIMPLE),
106
+ ("Houses under 500k", QueryComplexity.SIMPLE),
107
+ ("Furnished apartment for rent", QueryComplexity.SIMPLE),
108
+ ]
109
+
110
+ passed = 0
111
+ failed = 0
112
+
113
+ for query, expected_complexity in test_cases:
114
+ analysis = analyze_query_complexity(query)
115
+
116
+ if analysis.complexity == expected_complexity:
117
+ passed += 1
118
+ print_success(f"'{query[:40]}...' → {analysis.complexity.value}")
119
+ else:
120
+ failed += 1
121
+ print_fail(f"'{query[:40]}...'")
122
+ print(f" Expected: {expected_complexity.value}")
123
+ print(f" Got: {analysis.complexity.value}")
124
+ print(f" Reasoning: {analysis.reasoning}")
125
+
126
+ print(f"\n{Colors.BOLD}Results: {passed}/{len(test_cases)} passed{Colors.ENDC}")
127
+ return failed == 0
128
+
129
+
130
+ # =============================================================================
131
+ # Test 2: Strategy Selector Integration
132
+ # =============================================================================
133
+
134
+ async def test_strategy_selector():
135
+ """Test that strategy selector correctly routes to RLM"""
136
+ print_header("Test 2: Strategy Selector RLM Routing")
137
+
138
+ from app.ai.services.search_strategy_selector import (
139
+ select_search_strategy,
140
+ SearchStrategy
141
+ )
142
+
143
+ test_cases = [
144
+ # RLM strategies
145
+ {
146
+ "query": "3-bed near schools in Cotonou",
147
+ "params": {"location": "Cotonou", "bedrooms": 3},
148
+ "expected_rlm": True,
149
+ "expected_strategy": SearchStrategy.RLM_MULTI_HOP
150
+ },
151
+ {
152
+ "query": "Under 500k or has pool",
153
+ "params": {"max_price": 500000},
154
+ "expected_rlm": True,
155
+ "expected_strategy": SearchStrategy.RLM_BOOLEAN_OR
156
+ },
157
+ {
158
+ "query": "Compare Cotonou vs Calavi",
159
+ "params": {},
160
+ "expected_rlm": True,
161
+ "expected_strategy": SearchStrategy.RLM_COMPARATIVE
162
+ },
163
+
164
+ # Traditional strategies (should NOT use RLM)
165
+ {
166
+ "query": "3-bed apartment in Cotonou under 500k",
167
+ "params": {"location": "Cotonou", "bedrooms": 3, "max_price": 500000},
168
+ "expected_rlm": False,
169
+ "expected_strategy": SearchStrategy.MONGO_ONLY
170
+ },
171
+ ]
172
+
173
+ passed = 0
174
+ failed = 0
175
+
176
+ for case in test_cases:
177
+ result = await select_search_strategy(case["query"], case["params"])
178
+
179
+ rlm_match = result.get("use_rlm", False) == case["expected_rlm"]
180
+ strategy_match = result["strategy"] == case["expected_strategy"]
181
+
182
+ if rlm_match and strategy_match:
183
+ passed += 1
184
+ print_success(f"'{case['query'][:40]}...'")
185
+ print(f" Strategy: {result['strategy'].value}")
186
+ print(f" RLM: {result.get('use_rlm', False)}")
187
+ else:
188
+ failed += 1
189
+ print_fail(f"'{case['query'][:40]}...'")
190
+ print(f" Expected: {case['expected_strategy'].value}, RLM={case['expected_rlm']}")
191
+ print(f" Got: {result['strategy'].value}, RLM={result.get('use_rlm', False)}")
192
+
193
+ print(f"\n{Colors.BOLD}Results: {passed}/{len(test_cases)} passed{Colors.ENDC}")
194
+ return failed == 0
195
+
196
+
197
+ # =============================================================================
198
+ # Test 3: RLM Search Service (LIVE)
199
+ # =============================================================================
200
+
201
+ async def test_rlm_search_live():
202
+ """Test the RLM Search Service with actual LLM calls"""
203
+ print_header("Test 3: RLM Search Service (LIVE)")
204
+
205
+ print_warning("This test makes actual API calls to DeepSeek LLM")
206
+ print_info("Ensure DEEPSEEK_API_KEY is set in your environment\n")
207
+
208
+ from app.ai.services.rlm_search_service import rlm_search
209
+
210
+ test_queries = [
211
+ {
212
+ "query": "3-bed apartment near schools in Cotonou",
213
+ "description": "Multi-hop proximity search"
214
+ },
215
+ {
216
+ "query": "Under 300k or has pool",
217
+ "description": "Boolean OR query"
218
+ },
219
+ {
220
+ "query": "Compare average prices in Cotonou vs Calavi",
221
+ "description": "Comparative analysis"
222
+ },
223
+ {
224
+ "query": "Best family apartment near schools and parks",
225
+ "description": "Multi-factor ranking"
226
+ },
227
+ ]
228
+
229
+ for i, test in enumerate(test_queries, 1):
230
+ print(f"\n{Colors.CYAN}Test {i}: {test['description']}{Colors.ENDC}")
231
+ print(f"Query: \"{test['query']}\"")
232
+
233
+ try:
234
+ result = await rlm_search(test["query"])
235
+
236
+ print_success(f"Strategy used: {result.get('strategy_used', 'Unknown')}")
237
+ print(f" Results: {len(result.get('results', []))} listings")
238
+ print(f" LLM calls: {result.get('call_count', 'N/A')}")
239
+
240
+ if result.get("reasoning_steps"):
241
+ print(f" Reasoning steps:")
242
+ for step in result["reasoning_steps"][:3]:
243
+ print(f" - {step.get('step', 'unknown')}: {json.dumps(step, default=str)[:80]}...")
244
+
245
+ if result.get("message"):
246
+ print(f" Message: {result['message'][:100]}...")
247
+
248
+ if result.get("comparison_data"):
249
+ print(f" Comparison data available: Yes")
250
+
251
+ except Exception as e:
252
+ print_fail(f"Error: {str(e)}")
253
+
254
+ return True
255
+
256
+
257
+ # =============================================================================
258
+ # Test 4: Query Pattern Detection
259
+ # =============================================================================
260
+
261
+ def test_pattern_detection():
262
+ """Test specific pattern detection in queries"""
263
+ print_header("Test 4: Pattern Detection")
264
+
265
+ from app.ai.services.rlm_query_analyzer import analyze_query_complexity
266
+
267
+ # Test POI detection
268
+ poi_queries = [
269
+ ("apartment near the school", "school"),
270
+ ("house close to beach", "beach"),
271
+ ("near the university campus", "university"),
272
+ ("walking distance from hospital", "hospital"),
273
+ ("close to the market", "market"),
274
+ ("near the airport", "airport"),
275
+ ]
276
+
277
+ print(f"{Colors.BOLD}POI (Point of Interest) Detection:{Colors.ENDC}")
278
+ for query, expected_poi in poi_queries:
279
+ analysis = analyze_query_complexity(query)
280
+ poi_found = any(expected_poi in p.lower() for p in analysis.detected_patterns)
281
+ if poi_found:
282
+ print_success(f"'{query}' → Detected '{expected_poi}'")
283
+ else:
284
+ print_fail(f"'{query}' → Expected '{expected_poi}', got {analysis.detected_patterns}")
285
+
286
+ # Test French queries
287
+ print(f"\n{Colors.BOLD}French Query Detection:{Colors.ENDC}")
288
+ french_queries = [
289
+ ("appartement près de l'école", True), # Near school
290
+ ("maison proche de la plage", True), # Close to beach
291
+ ("comparer les prix", True), # Compare prices
292
+ ("appartement 3 chambres Γ  Cotonou", False), # Simple query
293
+ ]
294
+
295
+ for query, expected_rlm in french_queries:
296
+ analysis = analyze_query_complexity(query)
297
+ if analysis.use_rlm == expected_rlm:
298
+ print_success(f"'{query}' β†’ RLM={analysis.use_rlm}")
299
+ else:
300
+ print_fail(f"'{query}' β†’ Expected RLM={expected_rlm}, got {analysis.use_rlm}")
301
+
302
+ return True
303
+
304
+
305
+ # =============================================================================
306
+ # Test 5: Distance Calculation
307
+ # =============================================================================
308
+
309
+ def test_distance_calculation():
310
+ """Test the Haversine distance calculation"""
311
+ print_header("Test 5: Distance Calculation (Haversine)")
312
+
313
+ from app.ai.services.rlm_search_service import RLMSearchAgent
314
+
315
+ agent = RLMSearchAgent()
316
+
317
+ # Known distances (approximate)
318
+ test_cases = [
319
+ # (lat1, lon1, lat2, lon2, expected_km, tolerance_km)
320
+ (6.3654, 2.4183, 6.3700, 2.4200, 0.5, 0.3), # Nearby in Cotonou
321
+ (6.3654, 2.4183, 6.4300, 2.3500, 10, 2), # Cross-city
322
+ (6.3654, 2.4183, 6.5000, 2.0000, 50, 10), # Longer distance
323
+ ]
324
+
325
+ passed = 0
326
+ for lat1, lon1, lat2, lon2, expected, tolerance in test_cases:
327
+ distance = agent._calculate_distance(lat1, lon1, lat2, lon2)
328
+ within_tolerance = abs(distance - expected) <= tolerance
329
+
330
+ if within_tolerance:
331
+ passed += 1
332
+ print_success(f"({lat1}, {lon1}) β†’ ({lat2}, {lon2}): {distance:.2f} km (expected ~{expected} km)")
333
+ else:
334
+ print_fail(f"({lat1}, {lon1}) β†’ ({lat2}, {lon2}): {distance:.2f} km (expected ~{expected} km)")
335
+
336
+ print(f"\n{Colors.BOLD}Results: {passed}/{len(test_cases)} passed{Colors.ENDC}")
337
+ return passed == len(test_cases)
338
+
339
+
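For reference, `RLMSearchAgent._calculate_distance` is exercised above but its body is not part of this diff. A standard Haversine implementation consistent with the expected distances in the test cases would look roughly like the sketch below (the free-function name is illustrative, not the agent's actual method):

```python
import math

def calculate_distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two lat/lon points (Haversine)."""
    R = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))

# Spot-check against the first test case above (~0.5 km within Cotonou)
print(f"{calculate_distance(6.3654, 2.4183, 6.3700, 2.4200):.2f} km")
```

With the 6371 km mean-radius constant, all three Cotonou test cases above land inside their stated tolerances.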
+ # =============================================================================
+ # Test 6: OpenStreetMap POI Service
+ # =============================================================================
+
+ async def test_osm_poi_service():
+ """Test the OpenStreetMap POI service integration"""
+ print_header("Test 6: OpenStreetMap POI Service")
+
+ print_info("This test makes real API calls to OpenStreetMap (FREE)")
+ print_info("Testing: Nominatim geocoding + Overpass POI search\n")
+
+ from app.ai.services.osm_poi_service import (
+ geocode_location,
+ find_pois,
+ find_pois_overpass,
+ calculate_distance_km
+ )
+
+ # Test 1: Geocoding
+ print(f"{Colors.BOLD}1. Geocoding Test:{Colors.ENDC}")
+ coords = await geocode_location("Cotonou, Benin")
+ if coords:
+ print_success(f"Geocoded 'Cotonou, Benin' → ({coords[0]:.4f}, {coords[1]:.4f})")
+ else:
+ print_fail("Failed to geocode 'Cotonou, Benin'")
+
+ # Test 2: Find Schools
+ print(f"\n{Colors.BOLD}2. Find Schools in Cotonou:{Colors.ENDC}")
+ schools = await find_pois("school", "Cotonou, Benin", radius_km=3, limit=5)
+ print(f" Found {len(schools)} schools:")
+ for school in schools[:3]:
+ print(f" - {school['name']} ({school['lat']:.4f}, {school['lon']:.4f})")
+
+ # Test 3: Find Hospitals
+ print(f"\n{Colors.BOLD}3. Find Hospitals in Cotonou:{Colors.ENDC}")
+ hospitals = await find_pois("hospital", "Cotonou, Benin", radius_km=5, limit=5)
+ print(f" Found {len(hospitals)} hospitals:")
+ for hospital in hospitals[:3]:
+ print(f" - {hospital['name']} ({hospital['lat']:.4f}, {hospital['lon']:.4f})")
+
+ # Test 4: French POI type
+ print(f"\n{Colors.BOLD}4. French POI Type 'plage' (beach):{Colors.ENDC}")
+ beaches = await find_pois("plage", "Cotonou, Benin", radius_km=10, limit=5)
+ print(f" Found {len(beaches)} beaches")
+
+ # Test 5: Distance calculation
+ print(f"\n{Colors.BOLD}5. Distance Calculation:{Colors.ENDC}")
+ if coords and schools:
+ dist = calculate_distance_km(
+ coords[0], coords[1],
+ schools[0]["lat"], schools[0]["lon"]
+ )
+ print_success(f"Distance from Cotonou center to {schools[0]['name']}: {dist:.2f} km")
+
+ # Test 6: Integration with RLM
+ print(f"\n{Colors.BOLD}6. RLM Integration Test:{Colors.ENDC}")
+ from app.ai.services.rlm_search_service import RLMSearchAgent
+ agent = RLMSearchAgent()
+
+ pois = await agent._find_poi_locations("school", "Cotonou, Benin")
+ if pois:
+ print_success(f"RLM agent found {len(pois)} schools via OSM")
+ print(f" First result: {pois[0].get('name', 'Unknown')}")
+ else:
+ print_warning("RLM agent found no schools (may be network issue)")
+
+ print(f"\n{Colors.BOLD}OSM Integration Complete!{Colors.ENDC}")
+ return True
+
+
+ # =============================================================================
+ # Main
+ # =============================================================================
+
+ async def main():
+ """Run all tests"""
+ print(f"\n{Colors.BOLD}{Colors.HEADER}")
+ print("╔═══════════════════════════════════════════════════════════╗")
+ print("║ RLM (Recursive Language Model) Test Suite for AIDA ║")
+ print("╚═══════════════════════════════════════════════════════════╝")
+ print(f"{Colors.ENDC}\n")
+
+ live_mode = "--live" in sys.argv
+
+ all_passed = True
+
+ # Test 1: Query Analyzer (no LLM calls)
+ if not test_query_analyzer():
+ all_passed = False
+
+ # Test 2: Strategy Selector
+ if not await test_strategy_selector():
+ all_passed = False
+
+ # Test 4: Pattern Detection
+ if not test_pattern_detection():
+ all_passed = False
+
+ # Test 5: Distance Calculation
+ if not test_distance_calculation():
+ all_passed = False
+
+ # Test 6: OpenStreetMap POI Service (network-dependent, not gated)
+ await test_osm_poi_service()
+
+ # Live RLM Search (only with --live flag)
+ if live_mode:
+ print_warning("\nRunning LIVE tests with actual LLM calls...")
+ await test_rlm_search_live()
+ else:
+ print_info("\nSkipping live LLM tests. Run with --live flag to include them.")
+ print_info("Example: python test_rlm.py --live")
+
+ # Summary
+ print_header("Test Summary")
+ if all_passed:
+ print_success("All offline tests passed!")
+ print_info("RLM is ready to use in AIDA.")
+ else:
+ print_fail("Some tests failed. Check the output above.")
+
+ # Usage examples
+ print(f"\n{Colors.BOLD}Usage Examples:{Colors.ENDC}")
+ print("""
+ # In your code:
+ from app.ai.services.rlm_search_service import rlm_search
+
+ # Multi-hop search (near POI)
+ results = await rlm_search("3-bed near schools in Cotonou")
+
+ # Boolean OR
+ results = await rlm_search("under 500k or has pool")
+
+ # Comparative
+ results = await rlm_search("compare Cotonou vs Calavi")
+
+ # The brain.py automatically uses RLM when appropriate!
+ """)
+
+
+ if __name__ == "__main__":
+ asyncio.run(main())
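The pattern-detection tests above pin down the analyzer's observable contract: `analyze_query_complexity` returns an object exposing `use_rlm` and `detected_patterns`. A minimal keyword-based stand-in that satisfies those assertions might look like the sketch below (illustrative only; the real `rlm_query_analyzer` module is not shown in this diff):

```python
from dataclasses import dataclass, field

# English and French cues exercised by the tests above
POI_KEYWORDS = ["school", "beach", "university", "hospital", "market", "airport",
                "école", "plage"]
PROXIMITY_CUES = ["near", "close to", "walking distance", "près de", "proche de"]
COMPARISON_CUES = ["compare", "comparer", " vs "]

@dataclass
class QueryAnalysis:
    use_rlm: bool
    detected_patterns: list = field(default_factory=list)

def analyze_query_complexity(query: str) -> QueryAnalysis:
    """Flag queries that need multi-hop (RLM) handling: POI proximity or comparison."""
    q = query.lower()
    patterns = [kw for kw in POI_KEYWORDS if kw in q]
    needs_rlm = bool(patterns) and any(cue in q for cue in PROXIMITY_CUES)
    needs_rlm = needs_rlm or any(cue in q for cue in COMPARISON_CUES)
    return QueryAnalysis(use_rlm=needs_rlm, detected_patterns=patterns)

analysis = analyze_query_complexity("apartment near the school")
print(analysis.use_rlm, analysis.detected_patterns)  # → True ['school']
```

A simple query with no proximity or comparison cue ("appartement 3 chambres à Cotonou") yields `use_rlm=False`, matching the last French test case.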