Commit · bc0cd92
1 Parent(s): ee3d3d4
fyp
- IMPLEMENTATION_SUMMARY.md +0 -319
- VISION_FEATURE_INTEGRATION_GUIDE.md +0 -794
- app/__pycache__/config.cpython-313.pyc +0 -0
- app/ai/agent/__pycache__/graph.cpython-313.pyc +0 -0
- app/ai/agent/brain.py +158 -55
- app/ai/agent/graph.py +6 -2
- app/ai/agent/nodes/__pycache__/listing_publish.cpython-313.pyc +0 -0
- app/ai/agent/nodes/listing_collect.py +13 -74
- app/ai/agent/nodes/listing_publish.py +10 -0
- app/ai/lightning/__init__.py +38 -0
- app/ai/lightning/rewards.py +249 -0
- app/ai/lightning/tracer.py +326 -0
- app/ai/services/__init__.py +52 -0
- app/ai/services/osm_poi_service.py +499 -0
- app/ai/services/rlm_query_analyzer.py +287 -0
- app/ai/services/rlm_search_service.py +1202 -0
- app/ai/services/search_strategy_selector.py +85 -17
- app/ai/services/vision_service.py +0 -697
- app/config.py +14 -6
- app/routes/auth.py +56 -0
- app/routes/media_upload.py +36 -330
- app/schemas/user.py +63 -0
- cloudflare-worker/image-upload-worker.js +44 -22
- docs/CLARA_RLM_INTEGRATION_PLAN.md +537 -0
- test_rlm.py +481 -0
IMPLEMENTATION_SUMMARY.md
DELETED
|
@@ -1,319 +0,0 @@
|
|
| 1 |
-
# 🎯 Vision AI Listing Feature - Implementation Summary
|
| 2 |
-
|
| 3 |
-
## What Was Built
|
| 4 |
-
|
| 5 |
-
A **smart AI-powered property listing feature** that intelligently handles THREE different listing methods and produces a unified result.
|
| 6 |
-
|
| 7 |
-
---
|
| 8 |
-
|
| 9 |
-
## Key Features Implemented
|
| 10 |
-
|
| 11 |
-
### 1. ✅ Smart Listing Method Detection
|
| 12 |
-
|
| 13 |
-
The system knows HOW the user is listing and behaves accordingly:
|
| 14 |
-
|
| 15 |
-
**TEXT Method** (User provides details via chat)
|
| 16 |
-
- User says: "3-bed, 2-bath in Lagos, 500k/month, has WiFi, AC"
|
| 17 |
-
- Uploads photos for VALIDATION (not re-extraction)
|
| 18 |
-
- Backend: Validates images are property-related, uploads to Cloudflare
|
| 19 |
-
- Result: Text data + validated photos
|
| 20 |
-
|
| 21 |
-
**IMAGE Method** (User uploads photos only)
|
| 22 |
-
- User just uploads photos (no text details)
|
| 23 |
-
- Backend: EXTRACTS all details from images (bedrooms, bathrooms, amenities)
|
| 24 |
-
- Generates: SHORT title (max 2 sentences) + full description
|
| 25 |
-
- Result: Complete listing data extracted from photos
|
| 26 |
-
|
| 27 |
-
**VIDEO Method** (User uploads video + photos)
|
| 28 |
-
- User uploads video walkthrough
|
| 29 |
-
- Backend: Uploads to Cloudinary, suggests adding photos
|
| 30 |
-
- User uploads photos for analysis
|
| 31 |
-
- Backend: Extracts details from photos (same as IMAGE method)
|
| 32 |
-
- Result: Full data from photos + video URL
|
| 33 |
-
|
| 34 |
-
---
|
| 35 |
-
|
| 36 |
-
### 2. ✅ Intelligent Title & Description Generation
|
| 37 |
-
|
| 38 |
-
**Title Requirements:**
|
| 39 |
-
- ✅ SHORT - Maximum 2 sentences
|
| 40 |
-
- ✅ Examples: "Modern 3-bed apartment. Great location!"
|
| 41 |
-
- ❌ NOT: Long descriptions with many details
|
| 42 |
-
|
| 43 |
-
**Description:**
|
| 44 |
-
- Full 2-3 sentence description of property
|
| 45 |
-
- Professional tone
|
| 46 |
-
- Highlights key features
|
| 47 |
-
|
| 48 |
-
**Both generated by Vision AI** for image/video methods
|
| 49 |
-
|
| 50 |
-
---
|
| 51 |
-
|
| 52 |
-
### 3. ✅ Smart File Naming Strategy
|
| 53 |
-
|
| 54 |
-
**Pattern:** `{location}_{title}_{timestamp}_{index}.jpg`
|
| 55 |
-
|
| 56 |
-
**Example filenames:**
|
| 57 |
-
- `Lagos_Modern_Apartment_2025_01_31_0.jpg`
|
| 58 |
-
- `Victoria_Island_3_Bed_Luxury_2025_01_31_1.jpg`
|
| 59 |
-
- `Cotonou_Cozy_Studio_2025_01_31_0.jpg`
|
| 60 |
-
|
| 61 |
-
**Benefits:**
|
| 62 |
-
- Easy to identify property in storage
|
| 63 |
-
- Shows when listed (timestamp)
|
| 64 |
-
- Automatically indexed for multiple photos
|
| 65 |
-
- Cloudflare worker detects duplicates and appends numbers
|
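The naming pattern above can be sketched as follows. This is a minimal illustration, not the deleted implementation: the helper names `slugify_part` and `build_image_filename` are hypothetical, and the `YYYY_MM_DD` date layout is inferred from the example filenames.

```python
import re
from datetime import date


def slugify_part(text: str) -> str:
    """Join the alphanumeric words of `text` with underscores."""
    return "_".join(re.findall(r"[A-Za-z0-9]+", text))


def build_image_filename(location: str, title: str, listed_on: date, index: int) -> str:
    """Build `{location}_{title}_{timestamp}_{index}.jpg` (hypothetical helper)."""
    stamp = listed_on.strftime("%Y_%m_%d")  # date layout inferred from the examples
    return f"{slugify_part(location)}_{slugify_part(title)}_{stamp}_{index}.jpg"


print(build_image_filename("Lagos", "Modern Apartment", date(2025, 1, 31), 0))
# -> Lagos_Modern_Apartment_2025_01_31_0.jpg
```

Duplicate handling is not shown here, since the summary attributes it to the Cloudflare worker rather than the backend.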
| 66 |
-
|
| 67 |
-
---
|
| 68 |
-
|
| 69 |
-
### 4. ✅ Unified Response Format
|
| 70 |
-
|
| 71 |
-
**All three methods return the SAME structure:**
|
| 72 |
-
|
| 73 |
-
```json
|
| 74 |
-
{
|
| 75 |
-
"success": true,
|
| 76 |
-
"listing_method": "text|image|video",
|
| 77 |
-
"extracted_fields": {
|
| 78 |
-
"bedrooms": 3,
|
| 79 |
-
"bathrooms": 2,
|
| 80 |
-
"amenities": ["WiFi", "Parking", "AC"],
|
| 81 |
-
"description": "Beautiful apartment...",
|
| 82 |
-
"title": "Modern 3-Bed Apartment. Great location!"
|
| 83 |
-
},
|
| 84 |
-
"confidence": {
|
| 85 |
-
"bedrooms": 0.95,
|
| 86 |
-
"bathrooms": 0.88,
|
| 87 |
-
"amenities": 0.72,
|
| 88 |
-
"title": 0.85
|
| 89 |
-
},
|
| 90 |
-
"image_urls": ["url1", "url2"],
|
| 91 |
-
"video_url": "https://cloudinary..." // Only if video method
|
| 92 |
-
}
|
| 93 |
-
```
|
| 94 |
-
|
| 95 |
-
**Frontend shows same UI** regardless of how user listed → Same draft card, same editing experience
|
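As a rough sketch of the unified structure, the helper below assembles the same dictionary for all three methods and attaches `video_url` only for the video method. The function name `build_listing_response` and the `VALID_METHODS` constant are assumptions made for illustration; only the field names come from the JSON above.

```python
from typing import Any, Dict, List, Optional

VALID_METHODS = {"text", "image", "video"}


def build_listing_response(
    listing_method: str,
    extracted_fields: Dict[str, Any],
    confidence: Dict[str, float],
    image_urls: List[str],
    video_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble the shared response shape (hypothetical helper)."""
    if listing_method not in VALID_METHODS:
        raise ValueError(f"unknown listing_method: {listing_method!r}")
    response: Dict[str, Any] = {
        "success": True,
        "listing_method": listing_method,
        "extracted_fields": extracted_fields,
        "confidence": confidence,
        "image_urls": image_urls,
    }
    # video_url is only present for the video method
    if listing_method == "video" and video_url is not None:
        response["video_url"] = video_url
    return response
```

Keeping one builder for every method is what lets the frontend render a single draft card regardless of how the listing started.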
| 96 |
-
|
| 97 |
-
---
|
| 98 |
-
|
| 99 |
-
### 5. ✅ Property Validation BEFORE Upload
|
| 100 |
-
|
| 101 |
-
**Critical feature for space saving:**
|
| 102 |
-
|
| 103 |
-
```
|
| 104 |
-
Image Upload Flow:
|
| 105 |
-
1. Receive image from frontend
|
| 106 |
-
2. Check: "Is this a property image?"
|
| 107 |
-
3. If NO → Reject with message, no upload
|
| 108 |
-
4. If YES → Upload to Cloudflare with smart filename
|
| 109 |
-
```
|
| 110 |
-
|
| 111 |
-
This prevents non-property images from consuming Cloudflare storage!
|
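The validate-before-upload flow above can be sketched like this. It assumes the validator returns a `(bool, float, str)` tuple, read here as (is property, confidence, reason); that reading, the `validate_then_upload` name, and the injected `validate` and `upload` callables are all illustrative assumptions. The `0.6` default mirrors `PROPERTY_IMAGE_MIN_CONFIDENCE`.

```python
from typing import Callable, Tuple

# Assumed meaning of the validator result: (is_property, confidence, reason)
ValidationResult = Tuple[bool, float, str]


def validate_then_upload(
    image_bytes: bytes,
    filename: str,
    validate: Callable[[bytes], ValidationResult],
    upload: Callable[[bytes, str], str],
    min_confidence: float = 0.6,
) -> dict:
    """Upload only when the image passes validation; otherwise reject it."""
    is_property, confidence, reason = validate(image_bytes)
    if not is_property or confidence < min_confidence:
        # Step 3: reject with a message, nothing is uploaded
        return {"success": False, "error": reason, "url": None}
    # Step 4: upload with the smart filename
    return {"success": True, "error": None, "url": upload(image_bytes, filename)}
```

Passing the validator and uploader in as callables keeps the sketch independent of any particular vision model or storage client.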
| 112 |
-
|
| 113 |
-
---
|
| 114 |
-
|
| 115 |
-
### 6. ✅ Vision Service Enhancements
|
| 116 |
-
|
| 117 |
-
**New capabilities in `vision_service.py`:**
|
| 118 |
-
|
| 119 |
-
- `extract_property_fields()` - Now generates title + description
|
| 120 |
-
- `_generate_title()` - Creates SHORT titles (max 2 sentences)
|
| 121 |
-
- `_extract_room_count()` - Counts bedrooms/bathrooms
|
| 122 |
-
- `_detect_amenities()` - Finds amenities in images
|
| 123 |
-
- `_generate_description()` - Creates full descriptions
|
| 124 |
-
- `merge_multiple_image_results()` - Combines results from multiple images
|
| 125 |
-
- Confidence scoring for each field
|
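The summary does not say how `merge_multiple_image_results()` combines per-image results, so the following is only one plausible strategy, written under stated assumptions: each per-image result is a dict with `extracted_fields` and `confidence` keys, scalar fields keep the value with the highest confidence, and amenities are unioned in first-seen order. The function name `merge_image_results` is hypothetical.

```python
from typing import Any, Dict, List


def merge_image_results(results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge per-image results (assumed strategy, not the original code)."""
    fields: Dict[str, Any] = {}
    scores: Dict[str, float] = {}
    amenities: List[str] = []
    for result in results:
        extracted = result.get("extracted_fields", {})
        confidence = result.get("confidence", {})
        for name, value in extracted.items():
            score = confidence.get(name, 0.0)
            if name == "amenities":
                # Union amenities, preserving first-seen order
                for item in value or []:
                    if item not in amenities:
                        amenities.append(item)
                scores[name] = max(scores.get(name, 0.0), score)
            elif name not in fields or score > scores.get(name, 0.0):
                # Keep the most confident value for scalar fields
                fields[name] = value
                scores[name] = score
    fields["amenities"] = amenities
    return {"extracted_fields": fields, "confidence": scores}
```

Other strategies (for example, summing distinct bedrooms seen across photos) would be equally defensible; the choice depends on how the vision prompts are phrased.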
| 126 |
-
|
| 127 |
-
---
|
| 128 |
-
|
| 129 |
-
### 7. ✅ Enhanced Media Upload Routes
|
| 130 |
-
|
| 131 |
-
**Updated endpoints:**
|
| 132 |
-
|
| 133 |
-
`POST /listings/analyze-images`
|
| 134 |
-
- Accepts `listing_method` parameter ("text", "image", "video")
|
| 135 |
-
- Accepts optional `location` parameter for context
|
| 136 |
-
- Returns: Complete extracted fields + image URLs + confidence scores
|
| 137 |
-
- Generates intelligent filenames during upload
|
| 138 |
-
|
| 139 |
-
`POST /listings/analyze-video`
|
| 140 |
-
- Uploads video to Cloudinary with smart naming
|
| 141 |
-
- Returns: Video URL + suggestions to upload photos
|
| 142 |
-
- Recommends photos for better accuracy
|
| 143 |
-
|
| 144 |
-
---
|
| 145 |
-
|
| 146 |
-
## Files Modified/Created
|
| 147 |
-
|
| 148 |
-
### Created Files:
|
| 149 |
-
1. **`app/ai/services/vision_service.py`** - Vision AI analysis service
|
| 150 |
-
2. **`app/routes/media_upload.py`** - Image/video upload endpoints
|
| 151 |
-
3. **`VISION_FEATURE_INTEGRATION_GUIDE.md`** - Complete integration guide
|
| 152 |
-
4. **`IMPLEMENTATION_SUMMARY.md`** - This file
|
| 153 |
-
|
| 154 |
-
### Modified Files:
|
| 155 |
-
1. **`app/config.py`** - Added Cloudinary + Vision settings
|
| 156 |
-
2. **`requirements.txt`** - Added cloudinary + ffmpeg-python
|
| 157 |
-
3. **`app/ai/agent/nodes/listing_collect.py`** - Added `initialize_from_vision_analysis()` function
|
| 158 |
-
4. **`main.py`** - Registered media_upload routes
|
| 159 |
-
|
| 160 |
-
---
|
| 161 |
-
|
| 162 |
-
## Configuration Required
|
| 163 |
-
|
| 164 |
-
Add to `.env`:
|
| 165 |
-
|
| 166 |
-
```bash
|
| 167 |
-
# Cloudinary (Video Storage)
|
| 168 |
-
CLOUDINARY_CLOUD_NAME=your_cloud_name
|
| 169 |
-
CLOUDINARY_API_KEY=your_api_key
|
| 170 |
-
CLOUDINARY_API_SECRET=your_api_secret
|
| 171 |
-
|
| 172 |
-
# Hugging Face Vision Model
|
| 173 |
-
HF_TOKEN=your_hf_token
|
| 174 |
-
HF_VISION_MODEL=vikhyatk/moondream2
|
| 175 |
-
HF_VISION_API_ENABLED=true
|
| 176 |
-
PROPERTY_IMAGE_MIN_CONFIDENCE=0.6
|
| 177 |
-
```
|
| 178 |
-
|
| 179 |
-
---
|
| 180 |
-
|
| 181 |
-
## Frontend Changes Required
|
| 182 |
-
|
| 183 |
-
### Update Image Upload Flow
|
| 184 |
-
|
| 185 |
-
**OLD (Direct to Cloudflare):**
|
| 186 |
-
```javascript
|
| 187 |
-
// Upload directly to Cloudflare
|
| 188 |
-
const url = await uploadToCloudflare(image)
|
| 189 |
-
```
|
| 190 |
-
|
| 191 |
-
**NEW (Via Backend with Validation):**
|
| 192 |
-
```javascript
|
| 193 |
-
// Method 1: Text listing (chat + photos)
|
| 194 |
-
const result = await fetch('/listings/analyze-images', {
|
| 195 |
-
method: 'POST',
|
| 196 |
-
body: formData,
|
| 197 |
-
headers: { 'listing_method': 'text', 'location': chatLocation }
|
| 198 |
-
})
|
| 199 |
-
|
| 200 |
-
// Method 2: Image listing (photos only)
|
| 201 |
-
const result = await fetch('/listings/analyze-images', {
|
| 202 |
-
method: 'POST',
|
| 203 |
-
body: formData,
|
| 204 |
-
headers: { 'listing_method': 'image' }
|
| 205 |
-
})
|
| 206 |
-
|
| 207 |
-
// Method 3: Video listing
|
| 208 |
-
const result = await fetch('/listings/analyze-video', {
|
| 209 |
-
method: 'POST',
|
| 210 |
-
body: formData,
|
| 211 |
-
headers: { 'listing_method': 'video' }
|
| 212 |
-
})
|
| 213 |
-
```
|
| 214 |
-
|
| 215 |
-
---
|
| 216 |
-
|
| 217 |
-
## User Experience Flow
|
| 218 |
-
|
| 219 |
-
### For Image Listing Method:
|
| 220 |
-
|
| 221 |
-
```
|
| 222 |
-
User clicks "List with Photos" → Uploads 2-3 images
|
| 223 |
-
↓
|
| 224 |
-
Backend validates images are property-related
|
| 225 |
-
↓
|
| 226 |
-
AI extracts:
|
| 227 |
-
- Bedrooms: 3 (confidence: 95%)
|
| 228 |
-
- Bathrooms: 2 (confidence: 88%)
|
| 229 |
-
- Amenities: WiFi, AC, Parking, Pool
|
| 230 |
-
- Title: "Modern 3-Bed Apartment. Great location!" (SHORT)
|
| 231 |
-
- Description: "Beautiful 3-bed with modern furnishings..."
|
| 232 |
-
↓
|
| 233 |
-
Shows Draft UI with:
|
| 234 |
-
- Photos with smart names (Lagos_Modern_Apartment_2025_01_31_0.jpg)
|
| 235 |
-
- Extracted fields
|
| 236 |
-
- Confidence indicators
|
| 237 |
-
↓
|
| 238 |
-
User is asked: "What's the location, address, and price?"
|
| 239 |
-
↓
|
| 240 |
-
User provides: "Lagos, Victoria Island, 500,000 per month"
|
| 241 |
-
↓
|
| 242 |
-
AI infers listing_type: "rent" (from price context)
|
| 243 |
-
↓
|
| 244 |
-
User edits via text:
|
| 245 |
-
- "Change amenities to WiFi, gym, and pool"
|
| 246 |
-
- "Update title to something catchier"
|
| 247 |
-
↓
|
| 248 |
-
User publishes: "Publish this listing"
|
| 249 |
-
↓
|
| 250 |
-
Listing created with all auto-detected + user-provided data
|
| 251 |
-
```
|
| 252 |
-
|
| 253 |
-
---
|
| 254 |
-
|
| 255 |
-
## Key Differences from Previous Design
|
| 256 |
-
|
| 257 |
-
| Aspect | Before | Now |
|
| 258 |
-
|--------|--------|-----|
|
| 259 |
-
| **File naming** | Random/original names | Smart names (location_title_date) |
|
| 260 |
-
| **Title generation** | Not generated for images | AI generates SHORT titles (max 2 sentences) |
|
| 261 |
-
| **Listing methods** | Only text-based | Three methods: text, image, video |
|
| 262 |
-
| **Method detection** | N/A | AI knows how user is listing |
|
| 263 |
-
| **Video storage** | N/A | Cloudinary for videos |
|
| 264 |
-
| **Upload strategy** | Direct to Cloudflare | Backend validates first (saves space) |
|
| 265 |
-
| **Confidence scores** | Not implemented | Per-field confidence for each extraction |
|
| 266 |
-
|
| 267 |
-
---
|
| 268 |
-
|
| 269 |
-
## Performance Notes
|
| 270 |
-
|
| 271 |
-
**Vision API Response Times:**
|
| 272 |
-
- Image validation: 2-3 seconds (first image), +1s per additional
|
| 273 |
-
- Field extraction: 2-4 seconds per image
|
| 274 |
-
- Title generation: 1-2 seconds per image
|
| 275 |
-
- Video upload: 5-10 seconds (depends on file size)
|
| 276 |
-
|
| 277 |
-
**Cost Optimization:**
|
| 278 |
-
- Only valid property images uploaded (rejects non-property images early)
|
| 279 |
-
- Smaller file sizes with smart naming
|
| 280 |
-
- Cloudflare worker deduplicates files
|
| 281 |
-
- Hugging Face Inference API used (cheaper than self-hosted)
|
| 282 |
-
|
| 283 |
-
---
|
| 284 |
-
|
| 285 |
-
## Testing Checklist
|
| 286 |
-
|
| 287 |
-
- [ ] Test TEXT method: Chat + upload images
|
| 288 |
-
- [ ] Test IMAGE method: Upload images only
|
| 289 |
-
- [ ] Test VIDEO method: Upload video + photos
|
| 290 |
-
- [ ] Verify short titles generated (max 2 sentences)
|
| 291 |
-
- [ ] Verify descriptions generated (full, not short)
|
| 292 |
-
- [ ] Verify file naming is intelligent (location_title_date)
|
| 293 |
-
- [ ] Verify property validation rejects non-property images
|
| 294 |
-
- [ ] Verify confidence scores are returned
|
| 295 |
-
- [ ] Verify all three methods produce same draft UI
|
| 296 |
-
- [ ] Test editing via natural language commands
|
| 297 |
-
- [ ] Test publishing with all three methods
|
| 298 |
-
|
| 299 |
-
---
|
| 300 |
-
|
| 301 |
-
## Next Steps
|
| 302 |
-
|
| 303 |
-
1. **Frontend Integration** - Update image/video upload flows
|
| 304 |
-
2. **Test All Three Methods** - Verify each method works end-to-end
|
| 305 |
-
3. **Monitor Accuracy** - Track field extraction accuracy metrics
|
| 306 |
-
4. **Optimize Prompts** - Fine-tune Vision AI prompts based on real data
|
| 307 |
-
5. **User Feedback** - Gather feedback on titles/descriptions
|
| 308 |
-
6. **Enhance Features** - Add OCR for address extraction, price suggestions, etc.
|
| 309 |
-
|
| 310 |
-
---
|
| 311 |
-
|
| 312 |
-
## Support
|
| 313 |
-
|
| 314 |
-
See `VISION_FEATURE_INTEGRATION_GUIDE.md` for:
|
| 315 |
-
- Detailed API documentation
|
| 316 |
-
- Complete example code
|
| 317 |
-
- Error handling
|
| 318 |
-
- Troubleshooting
|
| 319 |
-
- Future enhancements
|
|
VISION_FEATURE_INTEGRATION_GUIDE.md
DELETED
|
@@ -1,794 +0,0 @@
|
|
| 1 |
-
# 🤖 AI-Powered Property Listing with Image/Video Analysis
|
| 2 |
-
## Integration Guide
|
| 3 |
-
|
| 4 |
-
---
|
| 5 |
-
|
| 6 |
-
## Overview
|
| 7 |
-
|
| 8 |
-
This document explains how to integrate the new **Vision AI feature** that allows users to list properties by uploading images or videos. The AI automatically detects property details (bedrooms, bathrooms, amenities) and fills listing fields.
|
| 9 |
-
|
| 10 |
-
---
|
| 11 |
-
|
| 12 |
-
## Architecture
|
| 13 |
-
|
| 14 |
-
### Flow Diagram
|
| 15 |
-
|
| 16 |
-
```
|
| 17 |
-
USER UPLOADS IMAGES/VIDEO
|
| 18 |
-
↓
|
| 19 |
-
[BACKEND IMAGE VALIDATION]
|
| 20 |
-
- Check if image is property-related (BEFORE upload)
|
| 21 |
-
- Reject non-property images (saves Cloudflare space)
|
| 22 |
-
↓
|
| 23 |
-
[VISION AI ANALYSIS] (Hugging Face Inference API)
|
| 24 |
-
- Extract bedrooms, bathrooms
|
| 25 |
-
- Detect amenities
|
| 26 |
-
- Generate description
|
| 27 |
-
- Return confidence scores
|
| 28 |
-
↓
|
| 29 |
-
[UPLOAD TO CLOUD STORAGE]
|
| 30 |
-
- Images → Cloudflare (only if validated)
|
| 31 |
-
- Videos → Cloudinary
|
| 32 |
-
↓
|
| 33 |
-
[INITIALIZE LISTING]
|
| 34 |
-
- Pre-fill extracted fields
|
| 35 |
-
- Ask user for uncertain/missing fields (price, location, address)
|
| 36 |
-
↓
|
| 37 |
-
[DRAFT UI]
|
| 38 |
-
- Show preview card like text-based flow
|
| 39 |
-
↓
|
| 40 |
-
[USER REVIEWS & EDITS]
|
| 41 |
-
- Edit via natural language commands
|
| 42 |
-
↓
|
| 43 |
-
[PUBLISH]
|
| 44 |
-
- Same as text-based flow
|
| 45 |
-
```
|
| 46 |
-
|
| 47 |
-
---
|
| 48 |
-
|
| 49 |
-
## New Files Created
|
| 50 |
-
|
| 51 |
-
### 1. **Vision Service** - `app/ai/services/vision_service.py`
|
| 52 |
-
|
| 53 |
-
**Purpose**: Analyzes images/videos using Hugging Face Inference API
|
| 54 |
-
|
| 55 |
-
**Key Classes**:
|
| 56 |
-
```python
|
| 57 |
-
class VisionService:
|
| 58 |
-
    def validate_property_image(self, image_bytes) -> tuple[bool, float, str]: ...
|
| 59 |
-
    def extract_property_fields(self, image_bytes) -> dict: ...
|
| 60 |
-
    def merge_multiple_image_results(self, results_list) -> dict: ...
|
| 61 |
-
```
|
| 62 |
-
|
| 63 |
-
**Functions**:
|
| 64 |
-
- `validate_property_image()` - Check if image is property-related (BEFORE upload)
|
| 65 |
-
- `extract_property_fields()` - Extract bedrooms, bathrooms, amenities, description
|
| 66 |
-
- `_extract_room_count()` - Count rooms
|
| 67 |
-
- `_detect_amenities()` - Find amenities
|
| 68 |
-
- `_generate_description()` - Create property description
|
| 69 |
-
- `merge_multiple_image_results()` - Combine results from multiple images
|
| 70 |
-
|
| 71 |
-
---
|
| 72 |
-
|
| 73 |
-
### 2. **Media Upload Routes** - `app/routes/media_upload.py`
|
| 74 |
-
|
| 75 |
-
**Purpose**: Handle image/video uploads with validation
|
| 76 |
-
|
| 77 |
-
**Endpoints**:
|
| 78 |
-
|
| 79 |
-
#### `POST /listings/analyze-images`
|
| 80 |
-
```
|
| 81 |
-
Request:
|
| 82 |
-
- files: List of image files (max 10, max 10MB each)
|
| 83 |
-
- listing_method: "text" | "image" | "video" (how user is listing)
|
| 84 |
-
- location: Optional string (context from text method)
|
| 85 |
-
|
| 86 |
-
Process:
|
| 87 |
-
1. Validate image format (JPEG, PNG, WebP)
|
| 88 |
-
2. Validate image is property-related (BEFORE upload)
|
| 89 |
-
3. Extract property fields
|
| 90 |
-
4. Upload to Cloudflare (only if validated)
|
| 91 |
-
5. Return extracted fields + image URLs
|
| 92 |
-
|
| 93 |
-
Response:
|
| 94 |
-
{
|
| 95 |
-
"success": true,
|
| 96 |
-
"images_processed": 2,
|
| 97 |
-
"images_validated": ["image1.jpg", "image2.jpg"],
|
| 98 |
-
"image_urls": [
|
| 99 |
-
"https://cloudflare.../image1.jpg",
|
| 100 |
-
"https://cloudflare.../image2.jpg"
|
| 101 |
-
],
|
| 102 |
-
"extracted_fields": {
|
| 103 |
-
"bedrooms": 3,
|
| 104 |
-
"bathrooms": 2,
|
| 105 |
-
"amenities": ["WiFi", "Parking", "AC"],
|
| 106 |
-
"description": "Spacious modern apartment..."
|
| 107 |
-
},
|
| 108 |
-
"confidence": {
|
| 109 |
-
"bedrooms": 0.95,
|
| 110 |
-
"bathrooms": 0.88,
|
| 111 |
-
"amenities": 0.72,
|
| 112 |
-
"description": 0.91
|
| 113 |
-
},
|
| 114 |
-
"validation_errors": [],
|
| 115 |
-
"suggestions": ["Verify bedroom count", "..."]
|
| 116 |
-
}
|
| 117 |
-
```
|
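The request limits listed above (at most 10 files, 10MB each, JPEG/PNG/WebP only) could be checked before any vision call. The sketch below is an assumption-laden illustration: `precheck_images` is a hypothetical name, files are represented as `(filename, content_type, size_in_bytes)` tuples, 10MB is read as 10 × 1024 × 1024 bytes, and the error shape `{"image", "error"}` follows the fields the frontend examples read from `validation_errors`.

```python
from typing import Dict, List, Tuple

MAX_FILES = 10
MAX_BYTES = 10 * 1024 * 1024  # 10MB read as MiB (assumption)
ALLOWED_TYPES = {"image/jpeg", "image/png", "image/webp"}


def precheck_images(files: List[Tuple[str, str, int]]) -> List[Dict[str, str]]:
    """Return a list of validation errors; an empty list means all checks passed."""
    errors: List[Dict[str, str]] = []
    if len(files) > MAX_FILES:
        # "*" marks an error that applies to the whole request
        errors.append({"image": "*", "error": f"Too many files: {len(files)} (max {MAX_FILES})"})
    for name, content_type, size in files:
        if content_type not in ALLOWED_TYPES:
            errors.append({"image": name, "error": f"Unsupported format: {content_type}"})
        elif size > MAX_BYTES:
            errors.append({"image": name, "error": "File exceeds 10MB"})
    return errors
```

Running these cheap checks first means the slower property validation only sees files that could be accepted at all.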
| 118 |
-
|
| 119 |
-
#### `POST /listings/analyze-video`
|
| 120 |
-
```
|
| 121 |
-
Request:
|
| 122 |
-
- video: Single video file (max 100MB)
|
| 123 |
-
|
| 124 |
-
Response:
|
| 125 |
-
{
|
| 126 |
-
"success": true,
|
| 127 |
-
"video_url": "https://res.cloudinary.com/.../video.mp4",
|
| 128 |
-
"message": "Video uploaded. Photos recommended for better accuracy.",
|
| 129 |
-
"extracted_fields": {...},
|
| 130 |
-
"suggestions": ["Upload property photos for better detection"]
|
| 131 |
-
}
|
| 132 |
-
```
|
| 133 |
-
|
| 134 |
-
#### `POST /listings/validate-media`
|
| 135 |
-
```
|
| 136 |
-
Quick validation without uploading
|
| 137 |
-
Returns: Validation results for each file
|
| 138 |
-
```
|
| 139 |
-
|
| 140 |
-
---
|
| 141 |
-
|
| 142 |
-
### 3. **Listing Collection Integration** - `app/ai/agent/nodes/listing_collect.py`
|
| 143 |
-
|
| 144 |
-
**New Function**: `initialize_from_vision_analysis(state, vision_data)`
|
| 145 |
-
|
| 146 |
-
**Purpose**: Pre-populate listing state with AI-detected fields
|
| 147 |
-
|
| 148 |
-
**Usage**:
|
| 149 |
-
```python
|
| 150 |
-
# After user uploads images and AI analyzes them
|
| 151 |
-
state = await initialize_from_vision_analysis(state, vision_data)
|
| 152 |
-
# State now has bedrooms, bathrooms, amenities, images, description pre-filled
|
| 153 |
-
```
|
| 154 |
-
|
| 155 |
-
---
|
| 156 |
-
|
| 157 |
-
## Configuration
|
| 158 |
-
|
| 159 |
-
### Add to `.env`
|
| 160 |
-
|
| 161 |
-
```bash
|
| 162 |
-
# Cloudinary (Video Storage)
|
| 163 |
-
CLOUDINARY_CLOUD_NAME=your_cloud_name
|
| 164 |
-
CLOUDINARY_API_KEY=your_api_key
|
| 165 |
-
CLOUDINARY_API_SECRET=your_api_secret
|
| 166 |
-
|
| 167 |
-
# Hugging Face Vision Model
|
| 168 |
-
HF_TOKEN=your_hf_token
|
| 169 |
-
HF_VISION_MODEL=vikhyatk/moondream2
|
| 170 |
-
HF_VISION_API_ENABLED=true
|
| 171 |
-
PROPERTY_IMAGE_MIN_CONFIDENCE=0.6
|
| 172 |
-
```
|
| 173 |
-
|
| 174 |
-
### Update `app/config.py` ✅ (Already Done)
|
| 175 |
-
|
| 176 |
-
Added:
|
| 177 |
-
- `CLOUDINARY_CLOUD_NAME`
|
| 178 |
-
- `CLOUDINARY_API_KEY`
|
| 179 |
-
- `CLOUDINARY_API_SECRET`
|
| 180 |
-
- `HF_VISION_MODEL`
|
| 181 |
-
- `HF_VISION_API_ENABLED`
|
| 182 |
-
- `PROPERTY_IMAGE_MIN_CONFIDENCE`
|
| 183 |
-
|
| 184 |
-
### Update `requirements.txt` ✅ (Already Done)
|
| 185 |
-
|
| 186 |
-
Added:
|
| 187 |
-
- `cloudinary>=1.40.0`
|
| 188 |
-
- `ffmpeg-python>=0.2.1`
|
| 189 |
-
|
| 190 |
-
---
|
| 191 |
-
|
| 192 |
-
## Frontend Integration
|
| 193 |
-
|
| 194 |
-
### Frontend Responsibilities
|
| 195 |
-
|
| 196 |
-
**IMPORTANT**: Images must now be uploaded to the **backend** (not directly to Cloudflare)
|
| 197 |
-
|
| 198 |
-
#### 1. **Image Upload Flow**
|
| 199 |
-
|
| 200 |
-
```typescript
|
| 201 |
-
// OLD (Direct to Cloudflare) - DEPRECATED
|
| 202 |
-
POST to Cloudflare directly
|
| 203 |
-
|
| 204 |
-
// NEW (Via Backend with Validation) - REQUIRED
|
| 205 |
-
POST /listings/analyze-images
|
| 206 |
-
Headers: Authorization: Bearer {token}
|
| 207 |
-
Body: FormData with files
|
| 208 |
-
Response: Extracted fields + image URLs
|
| 209 |
-
```
|
| 210 |
-
|
| 211 |
-
#### 2. **Example Frontend Code**
|
| 212 |
-
|
| 213 |
-
**For TEXT method** (user provided details via chat):
|
| 214 |
-
```typescript
|
| 215 |
-
async function uploadImagesForTextListing(files: File[], location: string) {
|
| 216 |
-
const formData = new FormData()
|
| 217 |
-
files.forEach(file => formData.append('images', file))
|
| 218 |
-
formData.append('listing_method', 'text')
|
| 219 |
-
formData.append('location', location) // Context from text conversation
|
| 220 |
-
|
| 221 |
-
const response = await fetch('/listings/analyze-images', {
|
| 222 |
-
method: 'POST',
|
| 223 |
-
headers: { 'Authorization': `Bearer ${token}` },
|
| 224 |
-
body: formData
|
| 225 |
-
})
|
| 226 |
-
|
| 227 |
-
const result = await response.json()
|
| 228 |
-
|
| 229 |
-
if (!result.success) {
|
| 230 |
-
result.validation_errors.forEach(err => {
|
| 231 |
-
alert(`${err.image}: ${err.error}`)
|
| 232 |
-
})
|
| 233 |
-
return
|
| 234 |
-
}
|
| 235 |
-
|
| 236 |
-
// Images validated with text-provided data
|
| 237 |
-
showListingDraft({
|
| 238 |
-
// Use data from CHAT (text-provided), images as validation
|
| 239 |
-
bedrooms: result.extracted_fields.bedrooms,
|
| 240 |
-
bathrooms: result.extracted_fields.bathrooms,
|
| 241 |
-
images: result.image_urls,
|
| 242 |
-
})
|
| 243 |
-
}
|
| 244 |
-
```
|
| 245 |
-
|
| 246 |
-
**For IMAGE method** (user uploading photos only):
|
| 247 |
-
```typescript
|
| 248 |
-
async function uploadImagesForPhotListing(files: File[]) {
|
| 249 |
-
const formData = new FormData()
|
| 250 |
-
files.forEach(file => formData.append('images', file))
|
| 251 |
-
formData.append('listing_method', 'image')
|
| 252 |
-
// No location - we'll extract everything from images
|
| 253 |
-
|
| 254 |
-
const response = await fetch('/listings/analyze-images', {
|
| 255 |
-
method: 'POST',
|
| 256 |
-
headers: { 'Authorization': `Bearer ${token}` },
|
| 257 |
-
body: formData
|
| 258 |
-
})
|
| 259 |
-
|
| 260 |
-
const result = await response.json()
|
| 261 |
-
|
| 262 |
-
if (!result.success) {
|
| 263 |
-
result.validation_errors.forEach(err => {
|
| 264 |
-
alert(`${err.image}: ${err.error}`)
|
| 265 |
-
})
|
| 266 |
-
return
|
| 267 |
-
}
|
| 268 |
-
|
| 269 |
-
// Show extracted fields (AI analyzed images)
|
| 270 |
-
showListingDraft({
|
| 271 |
-
title: result.extracted_fields.title, // AI-generated SHORT title
|
| 272 |
-
description: result.extracted_fields.description, // AI-generated description
|
| 273 |
-
bedrooms: result.extracted_fields.bedrooms,
|
| 274 |
-
bathrooms: result.extracted_fields.bathrooms,
|
| 275 |
-
amenities: result.extracted_fields.amenities,
|
| 276 |
-
images: result.image_urls,
|
| 277 |
-
confidence: result.confidence
|
| 278 |
-
})
|
| 279 |
-
}
|
| 280 |
-
```
|
| 281 |
-
|
| 282 |
-
**For VIDEO method**:
|
| 283 |
-
```typescript
|
| 284 |
-
async function uploadVideoForListing(videoFile: File, location?: string) {
|
| 285 |
-
const formData = new FormData()
|
| 286 |
-
formData.append('video', videoFile)
|
| 287 |
-
if (location) formData.append('location', location)
|
| 288 |
-
|
| 289 |
-
const response = await fetch('/listings/analyze-video', {
|
| 290 |
-
method: 'POST',
|
| 291 |
-
headers: { 'Authorization': `Bearer ${token}` },
|
| 292 |
-
body: formData
|
| 293 |
-
})
|
| 294 |
-
|
| 295 |
-
const result = await response.json()
|
| 296 |
-
|
| 297 |
-
// Suggest uploading photos
|
| 298 |
-
alert(result.message)
|
| 299 |
-
// Then call uploadImagesForPhotListing with photos
|
| 300 |
-
}
|
| 301 |
-
```
|
| 302 |
-
|
| 303 |
-
#### 3. **Video Upload Flow**
|
| 304 |
-
|
| 305 |
-
```typescript
|
| 306 |
-
POST /listings/analyze-video
|
| 307 |
-
Headers: Authorization: Bearer {token}
|
| 308 |
-
Body: FormData with video file
|
| 309 |
-
Response: Video URL + suggestions
|
| 310 |
-
```
|
| 311 |
-
|
| 312 |
-
---
|
| 313 |
-
|
| 314 |
-
## Three Listing Methods (Smart Differentiation)
|
| 315 |
-
|
| 316 |
-
The system intelligently handles THREE different listing creation methods:
|
| 317 |
-
|
| 318 |
-
### 1️⃣ Text-Based Listing (Existing - User provides details via text)
|
| 319 |
-
|
| 320 |
-
```
|
| 321 |
-
User says: "I have a 3-bed, 2-bath in Lagos for 500k per month.
|
| 322 |
-
It has WiFi, AC, and parking."
|
| 323 |
-
|
| 324 |
-
FLOW:
|
| 325 |
-
1. AI extracts fields from text (bedrooms, bathrooms, price, etc.)
|
| 326 |
-
2. User uploads photos to validate
|
| 327 |
-
3. Backend:
|
| 328 |
-
- Validates images are property-related
|
| 329 |
-
- Just checks they match (no re-extraction needed)
|
| 330 |
-
- Uploads to Cloudflare with smart naming
|
| 331 |
-
4. Shows draft UI with text-provided data + validated photos
|
| 332 |
-
5. User edits via text: "change price to 450k"
|
| 333 |
-
6. AI infers listing_type from price: "rent"
|
| 334 |
-
7. User publishes: "publish this listing"
|
| 335 |
-
|
| 336 |
-
METHOD CONTEXT: listing_method="text"
|
| 337 |
-
```
|
| 338 |
-
|
| 339 |
-
### 2️⃣ Image-Based Listing (NEW - User uploads photos only)
|
| 340 |
-
|
| 341 |
-
```
|
| 342 |
-
User clicks "List with Photos"
|
| 343 |
-
|
| 344 |
-
FLOW:
|
| 345 |
-
1. User uploads 1-5 photos (no text details provided)
|
| 346 |
-
2. Backend:
|
| 347 |
-
- Validates images are property-related
|
| 348 |
-
- EXTRACTS ALL DETAILS: bedrooms, bathrooms, amenities
|
| 349 |
-
- GENERATES TITLE (short, max 2 sentences)
|
| 350 |
-
- GENERATES DESCRIPTION (full description)
|
| 351 |
-
- Creates intelligent filenames (location_title_date.jpg)
|
| 352 |
-
- Uploads to Cloudflare
|
| 353 |
-
3. Shows draft UI with AI-extracted fields
|
| 354 |
-
4. User is prompted: "What's the location, address, and price?"
|
| 355 |
-
5. User provides: "Lagos, Victoria Island, 500,000 per month"
|
| 356 |
-
6. AIDA Auto-infers:
|
| 357 |
-
- Currency from location: Lagos → NGN (via CurrencyManager API)
|
| 358 |
-
- Listing_type from price_type: "per month" → "rent" ✅
|
| 359 |
-
7. User can edit via text: "add gym to amenities", "change title"
|
| 360 |
-
8. User publishes: "publish this listing"
|
| 361 |
-
|
| 362 |
-
METHOD CONTEXT: listing_method="image"
|
| 363 |
-
AI EXTRACTS: bedrooms, bathrooms, amenities, description, title
|
| 364 |
-
AUTO-INFERRED: currency (from location), listing_type (from price_type)
|
| 365 |
-
```
|
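The listing-type inference in step 6 above can be sketched as a simple phrase lookup. Only the "per month" to "rent" mapping comes from the guide; the other phrases in `RENT_PHRASES` and the function name `infer_listing_type` are assumptions for illustration. Currency inference is left out because it depends on the CurrencyManager API, whose interface is not shown here.

```python
from typing import Optional

# Only "per month" -> "rent" is stated in the guide; the rest are assumed.
RENT_PHRASES = ("per month", "per year", "monthly", "yearly")


def infer_listing_type(price_text: str) -> Optional[str]:
    """Return "rent" when the price text names a recurring period, else None."""
    normalized = " ".join(price_text.lower().split())
    if any(phrase in normalized for phrase in RENT_PHRASES):
        return "rent"
    return None  # not enough context; the caller should ask the user


print(infer_listing_type("Lagos, Victoria Island, 500,000 per month"))
# -> rent
```

Returning `None` instead of guessing keeps the agent free to ask a follow-up question when the price carries no period.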
| 366 |
-
|
| 367 |
-
### 3️⃣ Video-Based Listing (NEW - User uploads video, optionally photos)
|
| 368 |
-
|
| 369 |
-
```
|
| 370 |
-
User clicks "List with Video"
|
| 371 |
-
|
| 372 |
-
FLOW:
|
| 373 |
-
1. User uploads video (walkthrough)
|
| 374 |
-
2. Backend:
|
| 375 |
-
- Uploads to Cloudinary
|
| 376 |
-
- Creates intelligent filename
|
| 377 |
-
3. System suggests: "Video uploaded! Upload 2-3 photos for better detection."
|
| 378 |
-
4. User uploads photos
|
| 379 |
-
5. Backend:
|
| 380 |
-
- Validates images are property-related
|
| 381 |
-
- EXTRACTS ALL DETAILS from photos
|
| 382 |
-
- GENERATES TITLE and DESCRIPTION
|
| 383 |
-
6. Shows draft UI with extracted fields + video URL
|
| 384 |
-
7. Same flow as image-based from step 5 onwards:
|
| 385 |
-
- User prompted for: location, address, price (with price_type)
|
| 386 |
-
- AIDA auto-infers: currency (from location), listing_type (from price_type)
|
| 387 |
-
|
| 388 |
-
METHOD CONTEXT: listing_method="video"
|
| 389 |
-
AI EXTRACTS: From photos (not video)
|
| 390 |
-
AUTO-INFERRED: currency (from location), listing_type (from price_type)
|
| 391 |
-
VIDEO STORAGE: Cloudinary
|
| 392 |
-
PHOTO STORAGE: Cloudflare
|
| 393 |
-
```
|
| 394 |
-
|
| 395 |
-
### Unified Draft UI Result
|
| 396 |
-
|
| 397 |
-
**All three methods produce the SAME final result:**
|
| 398 |
-
|
| 399 |
-
```json
|
| 400 |
-
{
|
| 401 |
-
"success": true,
|
| 402 |
-
"listing_method": "text|image|video",
|
| 403 |
-
"extracted_fields": {
|
| 404 |
-
"bedrooms": 3,
|
| 405 |
-
"bathrooms": 2,
|
| 406 |
-
"amenities": ["WiFi", "Parking", "AC"],
|
| 407 |
-
"description": "Beautiful apartment with modern amenities.",
|
| 408 |
-
"title": "3-Bed Modern Apartment. Great location!"
|
| 409 |
-
},
|
| 410 |
-
"confidence": { ... },
|
| 411 |
-
"image_urls": [ ... ],
|
| 412 |
-
"video_url": "..." // Only if video method
|
| 413 |
-
}
|
| 414 |
-
```
|
| 415 |
-
|
| 416 |
-
The **frontend shows the same UI** regardless of listing method - user sees:
|
| 417 |
-
- Property images
|
| 418 |
-
- Extracted details
|
| 419 |
-
- Ability to edit via text commands
|
| 420 |
-
- Publish button
|
| 421 |
-
|
| 422 |
-
---
|
| 423 |
-
|
| 424 |
-
## Data Flow Example
|
| 425 |
-
|
| 426 |
-
### Request
|
| 427 |
-
|
| 428 |
-
```bash
|
| 429 |
-
curl -X POST http://localhost:8000/listings/analyze-images \
|
| 430 |
-
-H "Authorization: Bearer {token}" \
|
| 431 |
-
-F "images=@bedroom.jpg" \
|
| 432 |
-
-F "images=@kitchen.jpg" \
|
| 433 |
-
-F "images=@bathroom.jpg"
|
| 434 |
-
```
|
| 435 |
-
|
| 436 |
-
### Response
|
| 437 |
-
|
| 438 |
-
```json
|
| 439 |
-
{
|
| 440 |
-
"success": true,
|
| 441 |
-
"images_processed": 3,
|
| 442 |
-
"images_validated": ["bedroom.jpg", "kitchen.jpg", "bathroom.jpg"],
|
| 443 |
-
"image_urls": [
|
| 444 |
-
"https://imagedelivery.net/lojiz/bedroom_hash/public",
|
| 445 |
-
"https://imagedelivery.net/lojiz/kitchen_hash/public",
|
| 446 |
-
"https://imagedelivery.net/lojiz/bathroom_hash/public"
|
| 447 |
-
],
|
| 448 |
-
"extracted_fields": {
|
| 449 |
-
"bedrooms": 3,
|
| 450 |
-
"bathrooms": 2,
|
| 451 |
-
"amenities": ["WiFi Router", "AC Unit", "Furniture", "Balcony"],
|
| 452 |
-
"description": "Beautiful 3-bedroom, 2-bathroom modern apartment with contemporary furnishings and excellent amenities."
|
| 453 |
-
},
|
| 454 |
-
"confidence": {
|
| 455 |
-
"bedrooms": 0.95,
|
| 456 |
-
"bathrooms": 0.88,
|
| 457 |
-
"amenities": 0.72,
|
| 458 |
-
"description": 0.91
|
| 459 |
-
},
|
| 460 |
-
"validation_errors": [],
|
| 461 |
-
"suggestions": [
|
| 462 |
-
"Verify bedroom and bathroom counts are accurate",
|
| 463 |
-
"You'll need to provide location, address, and price information"
|
| 464 |
-
]
|
| 465 |
-
}
|
| 466 |
-
```
|
| 467 |
-
|
| 468 |
-
---
|
| 469 |
-
|
| 470 |
-
## API Endpoints Summary
|
| 471 |
-
|
| 472 |
-
| Endpoint | Method | Purpose | Auth |
|
| 473 |
-
|----------|--------|---------|------|
|
| 474 |
-
| `/listings/analyze-images` | POST | Upload & analyze images | Required |
|
| 475 |
-
| `/listings/analyze-video` | POST | Upload & analyze video | Required |
|
| 476 |
-
| `/listings/validate-media` | POST | Quick file validation | Required |
|
| 477 |
-
|
| 478 |
-
---
|
| 479 |
-
|
| 480 |
-
## Important Notes
|
| 481 |
-
|
| 482 |
-
### Image Validation
|
| 483 |
-
|
| 484 |
-
- **Property validation happens BEFORE upload** - Non-property images are rejected, saving Cloudflare storage
|
| 485 |
-
- **Confidence threshold**: Default 0.6 (60%) - Can be adjusted via `PROPERTY_IMAGE_MIN_CONFIDENCE`
|
| 486 |
-
- **High-confidence fields** (>0.7): Auto-filled in listing form
|
| 487 |
-
- **Medium-confidence fields** (0.5-0.7): Shown as suggestions; user confirms
|
| 488 |
-
- **Low-confidence fields** (<0.5): User must provide manually
|
| 489 |
-
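The three confidence bands above can be sketched as a small triage step. This is a minimal illustration, not the service's actual code: the function name `triage_fields`, the bucket names, and the sample scores are hypothetical, and a score of exactly 0.7 is treated as medium because the guide describes high confidence as strictly greater than 0.7.

```python
from typing import Dict, List


def triage_fields(confidence: Dict[str, float]) -> Dict[str, List[str]]:
    """Bucket extracted fields by the confidence bands described in the guide."""
    buckets: Dict[str, List[str]] = {"auto_fill": [], "suggest": [], "manual": []}
    for field, score in confidence.items():
        if score > 0.7:
            buckets["auto_fill"].append(field)  # high: pre-fill the listing form
        elif score >= 0.5:
            buckets["suggest"].append(field)    # medium: show as a suggestion to confirm
        else:
            buckets["manual"].append(field)     # low: user must provide the value
    return buckets


# Hypothetical scores, chosen only to exercise all three bands.
sample = {"bedrooms": 0.95, "amenities": 0.62, "description": 0.30}
print(triage_fields(sample))
```

Keeping the thresholds in one function would make it easier to tune them alongside `PROPERTY_IMAGE_MIN_CONFIDENCE` if the bands ever change.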
|
| 490 |
-
### Video Processing
|
| 491 |
-
|
| 492 |
-
- Videos uploaded to **Cloudinary** (not Cloudflare)
|
| 493 |
-
- Frame extraction available for future frame-by-frame analysis
|
| 494 |
-
- Users encouraged to upload photos alongside video for better accuracy
|
| 495 |
-
|
| 496 |
-
### Listing Type Inference
|
| 497 |
-
|
| 498 |
-
After user provides **price**, system infers listing_type:
|
| 499 |
-
|
| 500 |
-
```
|
| 501 |
-
Price Input → Listing Type
|
| 502 |
-
- High monthly (e.g., 500,000/month) → "rent"
|
| 503 |
-
- Low nightly (e.g., 5,000/night) → "short-stay"
|
| 504 |
-
- Very high one-time (e.g., 50,000,000) → "sale"
|
| 505 |
-
- "Looking for roommate" context → "roommate"
|
| 506 |
-
```
|
| 507 |
-
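One way to read the mapping above is as a lookup on `price_type` with a roommate override taken from the chat context. The sketch below assumes that reading; the `price_type` labels (`monthly`, `nightly`, `one-time`), the function name `infer_listing_type`, and the keyword check are hypothetical, since the guide does not list the actual values or logic.

```python
from typing import Optional

# Hypothetical price_type labels; the guide does not enumerate the real ones.
PRICE_TYPE_TO_LISTING_TYPE = {
    "monthly": "rent",
    "nightly": "short-stay",
    "one-time": "sale",
}


def infer_listing_type(price_type: str, user_text: str = "") -> Optional[str]:
    """Infer listing_type from price_type, with a roommate override from context."""
    if "roommate" in user_text.lower():
        return "roommate"  # conversational context wins over the price type
    # Unknown price types return None so the caller can ask the user instead.
    return PRICE_TYPE_TO_LISTING_TYPE.get(price_type.strip().lower())


print(infer_listing_type("monthly"))
print(infer_listing_type("monthly", "Looking for roommate"))
```

Returning `None` for an unrecognised price type, rather than guessing, leaves room for the agent to ask a follow-up question.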
|
| 508 |
-
---
|
| 509 |
-
|
| 510 |
-
## Testing
|
| 511 |
-
|
| 512 |
-
### Test 1: TEXT Method (User provided text details + uploading images)
|
| 513 |
-
|
| 514 |
-
```bash
|
| 515 |
-
# User already provided details via chat
|
| 516 |
-
# Now uploading images to validate
|
| 517 |
-
|
| 518 |
-
curl -X POST http://localhost:8000/listings/analyze-images \
|
| 519 |
-
-H "Authorization: Bearer {token}" \
|
| 520 |
-
-F "images=@bedroom.jpg" \
|
| 521 |
-
-F "images=@kitchen.jpg" \
|
| 522 |
-
-F "listing_method=text" \
|
| 523 |
-
-F "location=Lagos"
|
| 524 |
-
|
| 525 |
-
# Response:
|
| 526 |
-
# - Images validated as property-related ✓
|
| 527 |
-
# - Details preserved from text conversation
|
| 528 |
-
# - Returns same format with extracted fields + image URLs
|
| 529 |
-
```
|
| 530 |
-
|
| 531 |
-
### Test 2: IMAGE Method (User uploading photos only)
|
| 532 |
-
|
| 533 |
-
```bash
|
| 534 |
-
# User has no text details - AI extracts everything
|
| 535 |
-
|
| 536 |
-
curl -X POST http://localhost:8000/listings/analyze-images \
|
| 537 |
-
-H "Authorization: Bearer {token}" \
|
| 538 |
-
-F "images=@bedroom.jpg" \
|
| 539 |
-
-F "images=@kitchen.jpg" \
|
| 540 |
-
-F "images=@bathroom.jpg" \
|
| 541 |
-
-F "listing_method=image"
|
| 542 |
-
|
| 543 |
-
# Response:
|
| 544 |
-
# - bedrooms: 3 (extracted from images)
|
| 545 |
-
# - bathrooms: 2 (extracted from images)
|
| 546 |
-
# - title: "Modern 3-Bed Apartment. Great Location!" (AI-generated, SHORT)
|
| 547 |
-
# - description: "Beautiful apartment with..." (AI-generated, full)
|
| 548 |
-
# - amenities: ["WiFi", "AC", "Parking"] (extracted)
|
| 549 |
-
# - confidence: { bedrooms: 0.95, bathrooms: 0.88, ... }
|
| 550 |
-
```
|
| 551 |
-
|
| 552 |
-
### Test 3: VIDEO Method (User uploading video + photos)
|
| 553 |
-
|
| 554 |
-
```bash
|
| 555 |
-
# Step 1: Upload video
|
| 556 |
-
curl -X POST http://localhost:8000/listings/analyze-video \
|
| 557 |
-
-H "Authorization: Bearer {token}" \
|
| 558 |
-
-F "video=@walkthrough.mp4" \
|
| 559 |
-
-F "location=Lagos"
|
| 560 |
-
|
| 561 |
-
# Response: video_url, suggestions to upload photos
|
| 562 |
-
|
| 563 |
-
# Step 2: Upload photos for analysis
|
| 564 |
-
curl -X POST http://localhost:8000/listings/analyze-images \
|
| 565 |
-
-H "Authorization: Bearer {token}" \
|
| 566 |
-
-F "images=@photo1.jpg" \
|
| 567 |
-
-F "images=@photo2.jpg" \
|
| 568 |
-
-F "listing_method=video" \
|
| 569 |
-
-F "location=Lagos"
|
| 570 |
-
|
| 571 |
-
# Response: Same as IMAGE method + video_url in final listing
|
| 572 |
-
```
|
| 573 |
-
|
| 574 |
-
### Test 4: File Naming
|
| 575 |
-
|
| 576 |
-
```bash
|
| 577 |
-
# Upload images with location context
|
| 578 |
-
curl -X POST http://localhost:8000/listings/analyze-images \
|
| 579 |
-
-F "images=@IMG_1234.jpg" \
|
| 580 |
-
-F "images=@IMG_5678.jpg" \
|
| 581 |
-
-F "listing_method=image" \
|
| 582 |
-
-F "location=Lagos"
|
| 583 |
-
|
| 584 |
-
# Backend generates:
|
| 585 |
-
# - Lagos_Modern_Apartment_2025_01_31_0.jpg
|
| 586 |
-
# - Lagos_Modern_Apartment_2025_01_31_1.jpg
|
| 587 |
-
# (AI extracts title from image and uses it in filename)
|
| 588 |
-
|
| 589 |
-
# Cloudflare stores with these intelligent names
|
| 590 |
-
# If duplicate: Lagos_Modern_Apartment_2025_01_31_0_1.jpg (worker appends _1)
|
| 591 |
-
```
|
| 592 |
-
|
| 593 |
-
### Test 5: Short Title Validation
|
| 594 |
-
|
| 595 |
-
```bash
|
| 596 |
-
# Verify title is SHORT (max 2 sentences)
|
| 597 |
-
|
| 598 |
-
Response:
|
| 599 |
-
{
|
| 600 |
-
"extracted_fields": {
|
| 601 |
-
"title": "Modern 3-Bed Apartment. Great location!", ← SHORT
|
| 602 |
-
"description": "Beautiful 3-bedroom, 2-bathroom modern apartment..." ← FULL
|
| 603 |
-
}
|
| 604 |
-
}
|
| 605 |
-
|
| 606 |
-
# NOT acceptable:
|
| 607 |
-
{
|
| 608 |
-
"title": "This is a beautiful 3-bedroom, 2-bathroom modern apartment..." ← TOO LONG
|
| 609 |
-
}
|
| 610 |
-
```
|
| 611 |
-
|
| 612 |
-
---
|
| 613 |
-
|
| 614 |
-
## Error Handling
|
| 615 |
-
|
| 616 |
-
### Common Errors
|
| 617 |
-
|
| 618 |
-
| Error | Cause | Solution |
|
| 619 |
-
|-------|-------|----------|
|
| 620 |
-
| `Not a property photo` | Image rejected by vision AI | Upload actual property photos |
|
| 621 |
-
| `Image size exceeds 10MB` | File too large | Compress image or use smaller file |
|
| 622 |
-
| `Invalid image type` | Wrong file format | Use JPEG, PNG, or WebP |
|
| 623 |
-
| `Cloudinary upload failed` | Credentials not set | Check `.env` variables |
|
| 624 |
-
| `HF API timeout` | Vision model slow | Retry or use Cloudinary-hosted fallback |
|
| 625 |
-
|
| 626 |
-
---
|
| 627 |
-
|
| 628 |
-
## Smart File Naming & Storage
|
| 629 |
-
|
| 630 |
-
### Intelligent Filename Generation
|
| 631 |
-
|
| 632 |
-
**Backend generates meaningful filenames instead of using random names:**
|
| 633 |
-
|
| 634 |
-
```
|
| 635 |
-
Pattern: {location}_{title}_{timestamp}_{index}.jpg
|
| 636 |
-
|
| 637 |
-
Examples:
|
| 638 |
-
- Lagos_Modern_Apartment_2025_01_31_1.jpg
|
| 639 |
-
- Victoria_Island_3_Bed_Luxury_2025_01_31_0.jpg
|
| 640 |
-
- Cotonou_Cozy_Studio_2025_01_31_0.jpg
|
| 641 |
-
```
|
| 642 |
-
|
| 643 |
-
**Algorithm:**
|
| 644 |
-
1. Extract location (if available)
|
| 645 |
-
2. Extract title (first 20 chars, AI-generated if image/video method)
|
| 646 |
-
3. Add timestamp (YYYY_MM_DD_HHMMSS)
|
| 647 |
-
4. Add index for multiple images (0, 1, 2...)
|
| 648 |
-
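The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions: `build_filename` and `_slug` are hypothetical names, the sanitising rule (collapse non-alphanumeric runs into underscores) is a guess at how spaces become underscores, and the timestamp is date-only to match the example filenames rather than the `YYYY_MM_DD_HHMMSS` form named in step 3.

```python
import re
from datetime import date
from typing import Optional


def _slug(text: str) -> str:
    """Collapse anything that is not a letter or digit into single underscores."""
    return re.sub(r"[^A-Za-z0-9]+", "_", text).strip("_")


def build_filename(location: Optional[str], title: str, index: int,
                   when: date, ext: str = "jpg") -> str:
    """Build {location}_{title}_{timestamp}_{index}.{ext}, skipping empty parts."""
    parts = [
        _slug(location or ""),      # step 1: location, if available
        _slug(title[:20]),          # step 2: first 20 characters of the title
        when.strftime("%Y_%m_%d"),  # step 3: date-only timestamp (assumption)
        str(index),                 # step 4: index for multiple images
    ]
    return "_".join(p for p in parts if p) + f".{ext}"


print(build_filename("Lagos", "Modern Apartment", 0, date(2025, 1, 31)))
# → Lagos_Modern_Apartment_2025_01_31_0.jpg
```

Passing the date in explicitly, instead of reading the clock inside the function, keeps the result reproducible in tests.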
|
| 649 |
-
**Benefits:**
|
| 650 |
-
- Easy to identify property in storage
|
| 651 |
-
- Date shows when listed
|
| 652 |
-
- Cloudflare worker can detect duplicates
|
| 653 |
-
- Organized file structure
|
| 654 |
-
|
| 655 |
-
### Cloudflare Worker Deduplication
|
| 656 |
-
|
| 657 |
-
When image reaches Cloudflare:
|
| 658 |
-
```
|
| 659 |
-
1. Check if filename exists
|
| 660 |
-
2. If NEW → Store as-is
|
| 661 |
-
3. If DUPLICATE → Append counter
|
| 662 |
-
- first duplicate: {name}_1.jpg
|
| 663 |
-
- second: {name}_2.jpg
|
| 664 |
-
```
|
| 665 |
-
|
| 666 |
-
**Example:**
|
| 667 |
-
```
|
| 668 |
-
Scenario: Same user uploads "Lagos_Apartment.jpg" twice
|
| 669 |
-
1st upload → Lagos_Apartment.jpg
|
| 670 |
-
2nd upload → Lagos_Apartment_1.jpg (worker auto-appended)
|
| 671 |
-
```
|
| 672 |
-
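The counter-appending rule can be sketched in a few lines. The actual worker is JavaScript (`cloudflare-worker/image-upload-worker.js`), so the Python below only illustrates the logic; `dedupe_name` and the in-memory `existing` set are hypothetical stand-ins for whatever lookup the worker performs against storage.

```python
from typing import Set


def dedupe_name(filename: str, existing: Set[str]) -> str:
    """Return filename unchanged if unused, else append _1, _2, ... before the extension."""
    if filename not in existing:
        return filename
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension: treat the whole name as the stem
        stem, ext = filename, ""
    counter = 1
    while True:
        candidate = f"{stem}_{counter}" + (f".{ext}" if dot else "")
        if candidate not in existing:
            return candidate
        counter += 1


stored: Set[str] = set()
for _ in range(3):
    name = dedupe_name("Lagos_Apartment.jpg", stored)
    stored.add(name)
    print(name)
```

Uploading the same name three times yields the original name, then the `_1` and `_2` variants, which matches the scenario described above.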
|
| 673 |
-
---
|
| 674 |
-
|
| 675 |
-
## Title & Description Generation
|
| 676 |
-
|
| 677 |
-
### Title Requirements
|
| 678 |
-
|
| 679 |
-
**MUST BE SHORT:**
|
| 680 |
-
- ✅ "Modern 3-bed apartment. Great location!"
|
| 681 |
-
- ✅ "Spacious family home with garden."
|
| 682 |
-
- ❌ "This is a beautiful 3-bedroom, 2-bathroom modern apartment with contemporary furnishings, located in a prime area of the city with excellent amenities and facilities"
|
| 683 |
-
|
| 684 |
-
**Maximum:** 2 sentences (not full descriptions)
|
| 685 |
-
|
| 686 |
-
**Generated by Vision AI for image/video methods:**
|
| 687 |
-
```
|
| 688 |
-
Example prompts:
|
| 689 |
-
"Generate a SHORT, catchy real estate listing title for this property (3bed, 2bath) in Lagos.
|
| 690 |
-
Maximum 2 sentences. Must be concise and appealing.
|
| 691 |
-
Example: 'Modern 2-bed apartment with balcony. Great location!'"
|
| 692 |
-
```
|
| 693 |
-
|
| 694 |
-
### Description Generation
|
| 695 |
-
|
| 696 |
-
**Full property description (2-3 sentences):**
|
| 697 |
-
- Generated from images/video
|
| 698 |
-
- Professional tone
|
| 699 |
-
- Highlights key features
|
| 700 |
-
- Stored in `extracted_fields.description`
|
| 701 |
-
|
| 702 |
-
**Example:**
|
| 703 |
-
```
|
| 704 |
-
"Beautiful 3-bedroom, 2-bathroom modern apartment featuring contemporary
|
| 705 |
-
furnishings, air conditioning, WiFi, and private balcony overlooking the
|
| 706 |
-
city. Located in a secure, gated community with excellent amenities."
|
| 707 |
-
```
|
| 708 |
-
|
| 709 |
-
---
|
| 710 |
-
|
| 711 |
-
## Performance Optimization
|
| 712 |
-
|
| 713 |
-
### Recommended for Production
|
| 714 |
-
|
| 715 |
-
1. **Implement caching**: Cache similar property images to reduce API calls
|
| 716 |
-
2. **Batch processing**: Process multiple images in parallel
|
| 717 |
-
3. **Frame extraction**: For videos, extract key frames instead of all frames
|
| 718 |
-
4. **Model optimization**: Consider smaller model variant for faster inference
|
| 719 |
-
5. **Async processing**: Long-running tasks (video analysis) should be async jobs
|
| 720 |
-
|
| 721 |
-
### Estimated Response Times
|
| 722 |
-
|
| 723 |
-
- Image validation: **2-3 seconds** (first image), **+1s per additional**
|
| 724 |
-
- Video upload: **5-10 seconds** depending on file size
|
| 725 |
-
- Vision analysis: **2-4 seconds** per image
|
| 726 |
-
|
| 727 |
-
---
|
| 728 |
-
|
| 729 |
-
## Success Metrics
|
| 730 |
-
|
| 731 |
-
Track these to measure feature adoption:
|
| 732 |
-
|
| 733 |
-
1. **Adoption Rate**: % of new listings created via image/video upload
|
| 734 |
-
2. **Time Saved**: Avg creation time (image-based vs text-based)
|
| 735 |
-
3. **Accuracy**: % of auto-detected fields accepted by users
|
| 736 |
-
4. **Field Coverage**: Which fields have highest accuracy
|
| 737 |
-
5. **Error Rate**: % of images rejected as non-property
|
| 738 |
-
|
| 739 |
-
---
|
| 740 |
-
|
| 741 |
-
## Future Enhancements
|
| 742 |
-
|
| 743 |
-
1. **Multi-frame video analysis**: Extract key frames from video, analyze each
|
| 744 |
-
2. **OCR for signs**: Extract property addresses from signs visible in photos
|
| 745 |
-
3. **Furniture detection**: Count furniture items, estimate age
|
| 746 |
-
4. **Damage detection**: Identify needed repairs
|
| 747 |
-
5. **Neighborhood analysis**: Analyze background (street view, buildings)
|
| 748 |
-
6. **Price estimation**: AI suggests price based on similar listings
|
| 749 |
-
7. **Virtual tour generation**: Automatically create walkthrough from photos
|
| 750 |
-
|
| 751 |
-
---
|
| 752 |
-
|
| 753 |
-
## Support & Troubleshooting
|
| 754 |
-
|
| 755 |
-
### Check Vision Service Status
|
| 756 |
-
|
| 757 |
-
```bash
|
| 758 |
-
curl http://localhost:8000/health
|
| 759 |
-
# Returns: vision_service: "healthy" | "unavailable"
|
| 760 |
-
```
|
| 761 |
-
|
| 762 |
-
### View Logs
|
| 763 |
-
|
| 764 |
-
```bash
|
| 765 |
-
# Backend logs for vision analysis
|
| 766 |
-
grep "Vision Service" logs/app.log
|
| 767 |
-
grep "Hugging Face API" logs/app.log
|
| 768 |
-
```
|
| 769 |
-
|
| 770 |
-
### Reset Vision Service Cache
|
| 771 |
-
|
| 772 |
-
```bash
|
| 773 |
-
# Clear vision service cache (if implemented)
|
| 774 |
-
curl -X DELETE http://localhost:8000/admin/cache/vision
|
| 775 |
-
```
|
| 776 |
-
|
| 777 |
-
---
|
| 778 |
-
|
| 779 |
-
## Summary
|
| 780 |
-
|
| 781 |
-
✅ **Phase 1 Complete:**
|
| 782 |
-
- Vision service created (Hugging Face integration)
|
| 783 |
-
- Media upload endpoints ready
|
| 784 |
-
- Property validation implemented
|
| 785 |
-
- Listing collection integration done
|
| 786 |
-
- Image/video storage configured
|
| 787 |
-
|
| 788 |
-
**Next Steps:**
|
| 789 |
-
1. Update frontend to use `/listings/analyze-images` endpoint
|
| 790 |
-
2. Update frontend to use `/listings/analyze-video` endpoint
|
| 791 |
-
3. Add vision results to chat UI
|
| 792 |
-
4. Test end-to-end flow
|
| 793 |
-
5. Monitor accuracy metrics
|
| 794 |
-
6. Optimize based on user feedback
|
app/__pycache__/config.cpython-313.pyc
CHANGED
|
Binary files a/app/__pycache__/config.cpython-313.pyc and b/app/__pycache__/config.cpython-313.pyc differ
|
|
|
app/ai/agent/__pycache__/graph.cpython-313.pyc
CHANGED
|
Binary files a/app/ai/agent/__pycache__/graph.cpython-313.pyc and b/app/ai/agent/__pycache__/graph.cpython-313.pyc differ
|
|
|
app/ai/agent/brain.py
CHANGED
|
@@ -15,6 +15,8 @@ from langchain_core.messages import SystemMessage, HumanMessage
|
|
| 15 |
from app.ai.agent.state import AgentState, FlowState
|
| 16 |
from app.ai.agent.schema import get_schema_for_llm, get_draft_summary, get_missing_fields
|
| 17 |
from app.config import settings
|
|
|
|
|
|
|
| 18 |
|
| 19 |
logger = get_logger(__name__)
|
| 20 |
|
|
@@ -851,21 +853,25 @@ async def execute_tool(tool_name: str, params: Dict[str, Any], state: AgentState
|
|
| 851 |
else:
|
| 852 |
# No draft_ui yet - AIDA will ask for images
|
| 853 |
state.temp_data["action"] = "respond"
|
| 854 |
-
|
| 855 |
return True, f"Updated: {list(fields.keys())}", state.provided_fields
|
| 856 |
|
| 857 |
elif tool_name == "search_properties":
|
| 858 |
# Import and call search service
|
| 859 |
from app.ai.services.search_extractor import extract_search_params
|
| 860 |
from app.ai.services.search_service import search_listings_hybrid, search_mongodb
|
| 861 |
-
|
|
|
|
| 862 |
# SMART UI: Clear old my_listings when doing new search
|
| 863 |
state.my_listings = []
|
| 864 |
state.temp_data.pop("my_listings", None)
|
| 865 |
-
|
| 866 |
# Step 1: Extract params from the full user message (LLM is smart)
|
| 867 |
search_params = await extract_search_params(state.last_user_message)
|
| 868 |
-
|
| 869 |
# Step 2: Merge with Brain-extracted params (these have priority if present)
|
| 870 |
if params.get("location"):
|
| 871 |
search_params["location"] = params["location"]
|
|
@@ -875,81 +881,142 @@ async def execute_tool(tool_name: str, params: Dict[str, Any], state: AgentState
|
|
| 875 |
search_params["max_price"] = params["max_price"]
|
| 876 |
if params.get("beds"):
|
| 877 |
search_params["bedrooms"] = params["beds"]
|
| 878 |
-
|
| 879 |
is_suggestion = False
|
| 880 |
-
|
| 881 |
-
|
| 882 |
-
|
| 883 |
-
|
| 884 |
-
#
|
| 885 |
-
|
| 886 |
-
|
| 887 |
-
|
| 888 |
-
|
| 889 |
-
|
| 890 |
-
|
| 891 |
-
|
|
|
|
| 892 |
)
|
| 893 |
-
|
| 894 |
-
|
| 895 |
-
|
| 896 |
-
|
| 897 |
-
|
| 898 |
-
|
| 899 |
-
|
| 900 |
-
|
| 901 |
-
|
| 902 |
-
|
| 903 |
-
|
| 904 |
-
|
| 905 |
-
|
| 906 |
-
|
| 907 |
-
|
| 908 |
-
|
| 909 |
-
|
| 910 |
else:
|
| 911 |
-
|
| 912 |
-
|
| 913 |
-
|
| 914 |
-
logger.info(f"No relevant suggestions found for {requested_location}, will prompt for notification")
|
| 915 |
-
else:
|
| 916 |
-
# No location filter specified, use all suggestions
|
| 917 |
-
results = suggestion_results
|
| 918 |
-
is_suggestion = True
|
| 919 |
-
|
| 920 |
# Step 5: Enrich results with owner/review data (same as listings API)
|
| 921 |
if results:
|
| 922 |
from app.database import get_db
|
| 923 |
from app.services.listing_service import enrich_listings_batch
|
| 924 |
-
|
| 925 |
db = await get_db()
|
| 926 |
-
# Convert to dicts if needed and stringify _id
|
| 927 |
formatted_results = []
|
| 928 |
for doc in results:
|
| 929 |
if "_id" in doc and not isinstance(doc["_id"], str):
|
| 930 |
doc["_id"] = str(doc["_id"])
|
| 931 |
formatted_results.append(doc)
|
| 932 |
-
|
| 933 |
results = await enrich_listings_batch(formatted_results, db)
|
| 934 |
logger.info(f"Enriched {len(results)} search results with owner/review data")
|
| 935 |
-
|
| 936 |
# Step 6: Store results and flags
|
| 937 |
state.search_results = results
|
| 938 |
state.temp_data["search_results"] = results
|
| 939 |
state.temp_data["action"] = "search_results"
|
| 940 |
state.temp_data["is_suggestion"] = is_suggestion
|
| 941 |
-
state.temp_data["search_params"] = search_params
|
| 942 |
-
|
|
|
|
| 943 |
# Always save last search params for "Notify me" feature
|
| 944 |
state.temp_data["last_search_params"] = search_params
|
| 945 |
state.temp_data["last_search_query"] = state.last_user_message
|
| 946 |
-
|
| 947 |
# If no results found, flag to propose alert
|
| 948 |
if len(results) == 0:
|
| 949 |
state.temp_data["propose_alert"] = True
|
| 950 |
state.temp_data["response_text"] = f"I couldn't find any properties matching your search right now. Would you like me to notify you when something becomes available? π"
|
| 951 |
-
|
| 952 |
-
|
| 953 |
|
| 954 |
elif tool_name == "get_my_listings":
|
| 955 |
# Get user's listings
|
|
@@ -1147,6 +1214,9 @@ async def execute_tool(tool_name: str, params: Dict[str, Any], state: AgentState
|
|
| 1147 |
state.temp_data["response_text"] = f"Got it! π I'll keep watching for properties in {location} and notify you the moment something becomes available!"
|
| 1148 |
state.temp_data["action"] = "alert_created"
|
| 1149 |
|
| 1150 |
return True, f"Alert created: {alert.id} (found {len(current_results)} current matches)", {
|
| 1151 |
"alert_id": str(alert.id),
|
| 1152 |
"current_match_count": len(current_results)
|
|
@@ -1297,6 +1367,8 @@ async def execute_tool(tool_name: str, params: Dict[str, Any], state: AgentState
|
|
| 1297 |
|
| 1298 |
except Exception as e:
|
| 1299 |
logger.error("Tool execution error", tool=tool_name, exc_info=e)
|
|
|
|
|
|
|
| 1300 |
return False, str(e), None
|
| 1301 |
|
| 1302 |
|
|
@@ -1305,11 +1377,31 @@ async def agent_think(state: AgentState) -> AgentState:
|
|
| 1305 |
Main agent thinking loop.
|
| 1306 |
LLM reasons → decides tool → executes → generates response.
|
| 1307 |
"""
|
| 1308 |
-
|
| 1309 |
logger.info("Agent thinking started", user_id=state.user_id)
|
| 1310 |
-
|
| 1311 |
# Step 1: Brain decides what to do
|
| 1312 |
decision = await brain_decide(state)
|
| 1313 |
|
| 1314 |
# Store thinking for debugging
|
| 1315 |
state.temp_data["brain_thinking"] = decision.thinking
|
|
@@ -1359,8 +1451,19 @@ async def agent_think(state: AgentState) -> AgentState:
|
|
| 1359 |
else:
|
| 1360 |
state.temp_data["action"] = "respond" # Just text, no data cards
|
| 1361 |
|
| 1362 |
logger.info("Agent thinking complete", action=decision.tool, show_data=decision.show_data)
|
| 1363 |
-
|
| 1364 |
return state
|
| 1365 |
|
| 1366 |
|
|
|
|
| 15 |
from app.ai.agent.state import AgentState, FlowState
|
| 16 |
from app.ai.agent.schema import get_schema_for_llm, get_draft_summary, get_missing_fields
|
| 17 |
from app.config import settings
|
| 18 |
+
from app.ai.lightning.rewards import log_field_reward, log_reward, log_negative_reward, REWARD_SEARCH_COMPLETED, REWARD_ALERT_CREATED
|
| 19 |
+
from app.ai.lightning.tracer import log_trajectory_step
|
| 20 |
|
| 21 |
logger = get_logger(__name__)
|
| 22 |
|
|
|
|
| 853 |
else:
|
| 854 |
# No draft_ui yet - AIDA will ask for images
|
| 855 |
state.temp_data["action"] = "respond"
|
| 856 |
+
|
| 857 |
+
# Log reward for field extraction (Agent Lightning)
|
| 858 |
+
await log_field_reward(state.session_id, list(fields.keys()))
|
| 859 |
+
|
| 860 |
return True, f"Updated: {list(fields.keys())}", state.provided_fields
|
| 861 |
|
| 862 |
elif tool_name == "search_properties":
|
| 863 |
# Import and call search service
|
| 864 |
from app.ai.services.search_extractor import extract_search_params
|
| 865 |
from app.ai.services.search_service import search_listings_hybrid, search_mongodb
|
| 866 |
+
from app.ai.services.search_strategy_selector import select_search_strategy, SearchStrategy
|
| 867 |
+
|
| 868 |
# SMART UI: Clear old my_listings when doing new search
|
| 869 |
state.my_listings = []
|
| 870 |
state.temp_data.pop("my_listings", None)
|
| 871 |
+
|
| 872 |
# Step 1: Extract params from the full user message (LLM is smart)
|
| 873 |
search_params = await extract_search_params(state.last_user_message)
|
| 874 |
+
|
| 875 |
# Step 2: Merge with Brain-extracted params (these have priority if present)
|
| 876 |
if params.get("location"):
|
| 877 |
search_params["location"] = params["location"]
|
|
|
|
| 881 |
search_params["max_price"] = params["max_price"]
|
| 882 |
if params.get("beds"):
|
| 883 |
search_params["bedrooms"] = params["beds"]
|
| 884 |
+
|
| 885 |
is_suggestion = False
|
| 886 |
+
rlm_used = False
|
| 887 |
+
|
| 888 |
+
# ================================================================
|
| 889 |
+
# Step 2.5: CHECK IF RLM SHOULD BE USED (NEW!)
|
| 890 |
+
# ================================================================
|
| 891 |
+
strategy_result = await select_search_strategy(state.last_user_message, search_params)
|
| 892 |
+
|
| 893 |
+
if strategy_result.get("use_rlm"):
|
| 894 |
+
# Use RLM for complex queries
|
| 895 |
+
logger.info(
|
| 896 |
+
"🧠 RLM activated for search",
|
| 897 |
+
strategy=strategy_result["strategy"].value,
|
| 898 |
+
reasoning=strategy_result["reasoning"][:50]
|
| 899 |
)
|
| 900 |
+
|
| 901 |
+
try:
|
| 902 |
+
from app.ai.services.rlm_search_service import rlm_search
|
| 903 |
+
|
| 904 |
+
rlm_result = await rlm_search(
|
| 905 |
+
query=state.last_user_message,
|
| 906 |
+
context={
|
| 907 |
+
"user_location": state.user_location,
|
| 908 |
+
"search_params": search_params
|
| 909 |
+
}
|
| 910 |
+
)
|
| 911 |
+
|
| 912 |
+
results = rlm_result.get("results", [])
|
| 913 |
+
rlm_used = True
|
| 914 |
+
|
| 915 |
+
# Store RLM metadata
|
| 916 |
+
state.temp_data["rlm_strategy"] = rlm_result.get("strategy_used")
|
| 917 |
+
state.temp_data["rlm_reasoning_steps"] = rlm_result.get("reasoning_steps")
|
| 918 |
+
state.temp_data["rlm_call_count"] = rlm_result.get("call_count")
|
| 919 |
+
|
| 920 |
+
# Use RLM-generated message if available
|
| 921 |
+
if rlm_result.get("message"):
|
| 922 |
+
state.temp_data["response_text"] = rlm_result["message"]
|
| 923 |
+
|
| 924 |
+
# Store comparison data if available
|
| 925 |
+
if rlm_result.get("comparison_data"):
|
| 926 |
+
state.temp_data["comparison_data"] = rlm_result["comparison_data"]
|
| 927 |
+
|
| 928 |
+
# Store aggregation result if available
|
| 929 |
+
if rlm_result.get("aggregation_result"):
|
| 930 |
+
state.temp_data["aggregation_result"] = rlm_result["aggregation_result"]
|
| 931 |
+
|
| 932 |
+
logger.info(
|
| 933 |
+
f"🧠 RLM search complete",
|
| 934 |
+
result_count=len(results),
|
| 935 |
+
strategy=rlm_result.get("strategy_used"),
|
| 936 |
+
calls=rlm_result.get("call_count")
|
| 937 |
+
)
|
| 938 |
+
|
| 939 |
+
except Exception as rlm_error:
|
| 940 |
+
logger.error(f"RLM search failed, falling back to standard: {rlm_error}")
|
| 941 |
+
rlm_used = False
|
| 942 |
+
results = []
|
| 943 |
+
|
| 944 |
+
# ================================================================
|
| 945 |
+
# Standard search path (if RLM not used or failed)
|
| 946 |
+
# ================================================================
|
| 947 |
+
if not rlm_used:
|
| 948 |
+
# Step 3: STRICT SEARCH FIRST (only use what user specified)
|
| 949 |
+
results = await search_mongodb(search_params, limit=10)
|
| 950 |
+
|
| 951 |
+
# Step 4: If 0 results, try RELAXED SUGGESTION search
|
| 952 |
+
if not results:
|
| 953 |
+
logger.info("Strict search yielded 0 results, trying suggestion search...")
|
| 954 |
+
suggestion_results, currency = await search_listings_hybrid(
|
| 955 |
+
user_query=state.last_user_message,
|
| 956 |
+
search_params=search_params,
|
| 957 |
+
limit=10,
|
| 958 |
+
mode="relaxed"
|
| 959 |
+
)
|
| 960 |
+
|
| 961 |
+
# Step 4.5: Filter suggestions - only keep results from the requested location
|
| 962 |
+
requested_location = (search_params.get("location") or "").lower()
|
| 963 |
+
|
| 964 |
+
if requested_location and suggestion_results:
|
| 965 |
+
relevant_suggestions = []
|
| 966 |
+
for listing in suggestion_results:
|
| 967 |
+
listing_location = (listing.get("location") or "").lower()
|
| 968 |
+
if requested_location in listing_location or listing_location in requested_location:
|
| 969 |
+
relevant_suggestions.append(listing)
|
| 970 |
+
|
| 971 |
+
if relevant_suggestions:
|
| 972 |
+
results = relevant_suggestions
|
| 973 |
+
is_suggestion = True
|
| 974 |
+
logger.info(f"Found {len(relevant_suggestions)} relevant suggestions for {requested_location}")
|
| 975 |
+
else:
|
| 976 |
+
results = []
|
| 977 |
+
is_suggestion = False
|
| 978 |
+
logger.info(f"No relevant suggestions found for {requested_location}")
|
| 979 |
else:
|
| 980 |
+
results = suggestion_results
|
| 981 |
+
is_suggestion = True
|
| 982 |
+
| 983 |
# Step 5: Enrich results with owner/review data (same as listings API)
|
| 984 |
if results:
|
| 985 |
from app.database import get_db
|
| 986 |
from app.services.listing_service import enrich_listings_batch
|
| 987 |
+
|
| 988 |
db = await get_db()
|
|
|
|
| 989 |
formatted_results = []
|
| 990 |
for doc in results:
|
| 991 |
if "_id" in doc and not isinstance(doc["_id"], str):
|
| 992 |
doc["_id"] = str(doc["_id"])
|
| 993 |
formatted_results.append(doc)
|
| 994 |
+
|
| 995 |
results = await enrich_listings_batch(formatted_results, db)
|
| 996 |
logger.info(f"Enriched {len(results)} search results with owner/review data")
|
| 997 |
+
|
| 998 |
# Step 6: Store results and flags
|
| 999 |
state.search_results = results
|
| 1000 |
state.temp_data["search_results"] = results
|
| 1001 |
state.temp_data["action"] = "search_results"
|
| 1002 |
state.temp_data["is_suggestion"] = is_suggestion
|
| 1003 |
+
state.temp_data["search_params"] = search_params
|
| 1004 |
+
state.temp_data["search_strategy"] = strategy_result["strategy"].value if hasattr(strategy_result["strategy"], "value") else str(strategy_result["strategy"])
|
| 1005 |
+
|
| 1006 |
# Always save last search params for "Notify me" feature
|
| 1007 |
state.temp_data["last_search_params"] = search_params
|
| 1008 |
state.temp_data["last_search_query"] = state.last_user_message
|
| 1009 |
+
|
| 1010 |
# If no results found, flag to propose alert
|
| 1011 |
if len(results) == 0:
|
| 1012 |
state.temp_data["propose_alert"] = True
|
| 1013 |
state.temp_data["response_text"] = "I couldn't find any properties matching your search right now. Would you like me to notify you when something becomes available?"
|
| 1014 |
+
|
| 1015 |
+
# Log reward for search completion (Agent Lightning)
|
| 1016 |
+
if len(results) > 0:
|
| 1017 |
+
await log_reward(state.session_id, REWARD_SEARCH_COMPLETED, "search_completed", {"result_count": len(results)})
|
| 1018 |
+
|
| 1019 |
+
return True, f"Found {len(results)} properties" + (" (via RLM)" if rlm_used else ""), results
|
| 1020 |
|
| 1021 |
elif tool_name == "get_my_listings":
|
| 1022 |
# Get user's listings
|
|
|
|
| 1214 |
state.temp_data["response_text"] = f"Got it! I'll keep watching for properties in {location} and notify you the moment something becomes available!"
|
| 1215 |
state.temp_data["action"] = "alert_created"
|
| 1216 |
|
| 1217 |
+
# Log reward for alert creation (Agent Lightning)
|
| 1218 |
+
await log_reward(state.session_id, REWARD_ALERT_CREATED, "alert_created", {"alert_id": str(alert.id)})
|
| 1219 |
+
|
| 1220 |
return True, f"Alert created: {alert.id} (found {len(current_results)} current matches)", {
|
| 1221 |
"alert_id": str(alert.id),
|
| 1222 |
"current_match_count": len(current_results)
|
|
|
|
| 1367 |
|
| 1368 |
except Exception as e:
|
| 1369 |
logger.error("Tool execution error", tool=tool_name, exc_info=e)
|
| 1370 |
+
# Log negative reward for tool execution error (Agent Lightning)
|
| 1371 |
+
await log_negative_reward(state.session_id, "error", f"Tool {tool_name} failed: {str(e)}")
|
| 1372 |
return False, str(e), None
|
| 1373 |
|
| 1374 |
|
|
|
|
| 1377 |
Main agent thinking loop.
|
| 1378 |
LLM reasons → decides tool → executes → generates response.
|
| 1379 |
"""
|
| 1380 |
+
|
| 1381 |
logger.info("Agent thinking started", user_id=state.user_id)
|
| 1382 |
+
|
| 1383 |
+
# Log user input trajectory (Agent Lightning)
|
| 1384 |
+
await log_trajectory_step(
|
| 1385 |
+
state.session_id,
|
| 1386 |
+
"user_input",
|
| 1387 |
+
{"message": state.last_user_message[:500] if state.last_user_message else ""},
|
| 1388 |
+
state.user_id
|
| 1389 |
+
)
|
| 1390 |
+
|
| 1391 |
# Step 1: Brain decides what to do
|
| 1392 |
decision = await brain_decide(state)
|
| 1393 |
+
|
| 1394 |
+
# Log brain decision trajectory (Agent Lightning)
|
| 1395 |
+
await log_trajectory_step(
|
| 1396 |
+
state.session_id,
|
| 1397 |
+
"brain_decision",
|
| 1398 |
+
{
|
| 1399 |
+
"thinking": decision.thinking[:200] if decision.thinking else "",
|
| 1400 |
+
"tool": decision.tool,
|
| 1401 |
+
"is_final": decision.is_final
|
| 1402 |
+
},
|
| 1403 |
+
state.user_id
|
| 1404 |
+
)
|
| 1405 |
|
| 1406 |
# Store thinking for debugging
|
| 1407 |
state.temp_data["brain_thinking"] = decision.thinking
|
|
|
|
| 1451 |
else:
|
| 1452 |
state.temp_data["action"] = "respond" # Just text, no data cards
|
| 1453 |
|
| 1454 |
+
# Log response trajectory (Agent Lightning)
|
| 1455 |
+
await log_trajectory_step(
|
| 1456 |
+
state.session_id,
|
| 1457 |
+
"response",
|
| 1458 |
+
{
|
| 1459 |
+
"response": state.temp_data.get("response_text", "")[:500],
|
| 1460 |
+
"action": state.temp_data.get("action", "respond")
|
| 1461 |
+
},
|
| 1462 |
+
state.user_id
|
| 1463 |
+
)
|
| 1464 |
+
|
| 1465 |
logger.info("Agent thinking complete", action=decision.tool, show_data=decision.show_data)
|
| 1466 |
+
|
| 1467 |
return state
|
| 1468 |
|
| 1469 |
|
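Reviewer sketch (not part of the commit): the Step 4.5 location filter in the brain.py hunk above can be read as a small pure function. The name `filter_suggestions_by_location` and the sample listings are assumptions for illustration; the matching rule (case-insensitive substring match in either direction) is taken from the hunk. The sketch also skips listings whose location is empty, because in Python an empty string is a substring of every string and would otherwise always match.

```python
from typing import Any, Dict, List


def filter_suggestions_by_location(
    suggestions: List[Dict[str, Any]],
    requested_location: str,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: keep suggestions that overlap the requested location."""
    requested = (requested_location or "").lower()
    if not requested:
        # No location was requested, so every suggestion is kept.
        return list(suggestions)

    kept = []
    for listing in suggestions:
        location = (listing.get("location") or "").lower()
        if not location:
            # "" is a substring of every string; skip instead of matching.
            continue
        if requested in location or location in requested:
            kept.append(listing)
    return kept


# Illustrative data only; field names follow the hunk ("location", "_id").
sample = [
    {"_id": "a", "location": "Lekki, Lagos"},
    {"_id": "b", "location": "Abuja"},
    {"_id": "c", "location": None},
]
matches = filter_suggestions_by_location(sample, "Lagos")
print([m["_id"] for m in matches])
```

Keeping the filter free of logging and state mutation makes it straightforward to unit test separately from the async search path.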
app/ai/agent/graph.py
CHANGED
|
@@ -15,6 +15,7 @@ from langgraph.checkpoint.memory import MemorySaver
|
|
| 15 |
from structlog import get_logger
|
| 16 |
|
| 17 |
from app.ai.agent.state import AgentState, FlowState
|
|
|
|
| 18 |
from app.ai.agent.nodes.authenticate import authenticate
|
| 19 |
from app.ai.agent.brain import agent_think
|
| 20 |
from app.ai.agent.nodes.validate_output import validate_output_node
|
|
@@ -117,9 +118,12 @@ def build_aida_graph():
|
|
| 117 |
|
| 118 |
checkpointer = MemorySaver()
|
| 119 |
compiled_graph = graph.compile(checkpointer=checkpointer)
|
| 120 |
-
|
|
| 121 |
logger.info("✅ LangGraph V2 compiled (Brain-Based)")
|
| 122 |
-
|
| 123 |
return compiled_graph
|
| 124 |
|
| 125 |
|
|
|
|
| 15 |
from structlog import get_logger
|
| 16 |
|
| 17 |
from app.ai.agent.state import AgentState, FlowState
|
| 18 |
+
from app.ai.lightning.tracer import wrap_graph_if_enabled
|
| 19 |
from app.ai.agent.nodes.authenticate import authenticate
|
| 20 |
from app.ai.agent.brain import agent_think
|
| 21 |
from app.ai.agent.nodes.validate_output import validate_output_node
|
|
|
|
| 118 |
|
| 119 |
checkpointer = MemorySaver()
|
| 120 |
compiled_graph = graph.compile(checkpointer=checkpointer)
|
| 121 |
+
|
| 122 |
+
# Wrap with Agent Lightning tracer (if enabled)
|
| 123 |
+
compiled_graph = wrap_graph_if_enabled(compiled_graph)
|
| 124 |
+
|
| 125 |
logger.info("✅ LangGraph V2 compiled (Brain-Based)")
|
| 126 |
+
|
| 127 |
return compiled_graph
|
| 128 |
|
| 129 |
|
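Reviewer sketch (not part of the commit): the body of `wrap_graph_if_enabled` is not visible in this hunk, so the following is only one way such a passthrough wrapper could look. `TracedGraph`, `FakeGraph`, `wrap_if_enabled`, the `enabled` flag, the `graph_start`/`graph_end`/`graph_error` step names and the in-memory `steps` list are all hypothetical; the real tracer reads `LIGHTNING_ENABLED` from settings and writes to Redis.

```python
import asyncio
from typing import Any, Dict, List


class TracedGraph:
    """Hypothetical passthrough: records each call, delegates everything else."""

    def __init__(self, graph: Any, steps: List[Dict[str, Any]]) -> None:
        self._graph = graph
        self._steps = steps

    async def ainvoke(self, state: Any, *args: Any, **kwargs: Any) -> Any:
        self._steps.append({"step_type": "graph_start"})
        try:
            result = await self._graph.ainvoke(state, *args, **kwargs)
        except Exception as exc:
            # Record the failure, then re-raise so behavior is unchanged.
            self._steps.append({"step_type": "graph_error", "error": str(exc)})
            raise
        self._steps.append({"step_type": "graph_end"})
        return result

    def __getattr__(self, name: str) -> Any:
        # Called only for attributes not found on the wrapper itself.
        return getattr(self._graph, name)


def wrap_if_enabled(graph: Any, enabled: bool, steps: List[Dict[str, Any]]) -> Any:
    """Return the graph untouched when tracing is off (no extra indirection)."""
    return TracedGraph(graph, steps) if enabled else graph


class FakeGraph:
    """Stand-in for a compiled graph; only ainvoke is modelled."""

    name = "fake"

    async def ainvoke(self, state: Dict[str, Any]) -> Dict[str, Any]:
        return {**state, "done": True}


steps: List[Dict[str, Any]] = []
base = FakeGraph()
traced = wrap_if_enabled(base, enabled=True, steps=steps)
result = asyncio.run(traced.ainvoke({"msg": "hi"}))
untouched = wrap_if_enabled(base, enabled=False, steps=steps)
```

Returning the original object when tracing is disabled is one way to meet the "zero overhead when disabled" goal stated in the tracer docstring.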
app/ai/agent/nodes/__pycache__/listing_publish.cpython-313.pyc
CHANGED
|
Binary files a/app/ai/agent/nodes/__pycache__/listing_publish.cpython-313.pyc and b/app/ai/agent/nodes/__pycache__/listing_publish.cpython-313.pyc differ
|
|
|
app/ai/agent/nodes/listing_collect.py
CHANGED
|
@@ -30,80 +30,19 @@ llm = ChatOpenAI(
|
|
| 30 |
temperature=0.7,
|
| 31 |
)
|
| 32 |
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
|
| 41 |
-
|
| 42 |
-
|
| 43 |
-
|
| 44 |
-
|
| 45 |
-
|
| 46 |
-
|
| 47 |
-
Args:
|
| 48 |
-
state: Current agent state
|
| 49 |
-
vision_data: Dict with extracted fields from vision service
|
| 50 |
-
|
| 51 |
-
Returns:
|
| 52 |
-
Updated state ready for collection
|
| 53 |
-
"""
|
| 54 |
-
try:
|
| 55 |
-
# Extract vision analysis results
|
| 56 |
-
extracted_fields = vision_data.get("extracted_fields", {})
|
| 57 |
-
confidence = vision_data.get("confidence", {})
|
| 58 |
-
image_urls = vision_data.get("image_urls", [])
|
| 59 |
-
|
| 60 |
-
logger.info("Initializing listing from vision analysis",
|
| 61 |
-
bedrooms=extracted_fields.get("bedrooms"),
|
| 62 |
-
bathrooms=extracted_fields.get("bathrooms"),
|
| 63 |
-
amenities_count=len(extracted_fields.get("amenities", [])))
|
| 64 |
-
|
| 65 |
-
# Pre-fill detected fields with high confidence (>0.7)
|
| 66 |
-
high_confidence_threshold = 0.7
|
| 67 |
-
|
| 68 |
-
# Always add images (they were validated)
|
| 69 |
-
if image_urls:
|
| 70 |
-
state.update_listing_progress("images", image_urls)
|
| 71 |
-
logger.info(f"✅ Added {len(image_urls)} validated images")
|
| 72 |
-
|
| 73 |
-
# Bedrooms (high confidence)
|
| 74 |
-
if extracted_fields.get("bedrooms") is not None and confidence.get("bedrooms", 0) > high_confidence_threshold:
|
| 75 |
-
state.update_listing_progress("bedrooms", extracted_fields["bedrooms"])
|
| 76 |
-
logger.info(f"✅ Auto-filled bedrooms: {extracted_fields['bedrooms']}")
|
| 77 |
-
|
| 78 |
-
# Bathrooms (high confidence)
|
| 79 |
-
if extracted_fields.get("bathrooms") is not None and confidence.get("bathrooms", 0) > high_confidence_threshold:
|
| 80 |
-
state.update_listing_progress("bathrooms", extracted_fields["bathrooms"])
|
| 81 |
-
logger.info(f"✅ Auto-filled bathrooms: {extracted_fields['bathrooms']}")
|
| 82 |
-
|
| 83 |
-
# Amenities (even medium confidence is good for amenities)
|
| 84 |
-
if extracted_fields.get("amenities") and confidence.get("amenities", 0) > 0.5:
|
| 85 |
-
state.update_listing_progress("amenities", extracted_fields["amenities"])
|
| 86 |
-
logger.info(f"✅ Auto-filled amenities: {extracted_fields['amenities']}")
|
| 87 |
-
|
| 88 |
-
# Description (if high confidence)
|
| 89 |
-
if extracted_fields.get("description") and confidence.get("description", 0) > high_confidence_threshold:
|
| 90 |
-
state.update_listing_progress("description", extracted_fields["description"])
|
| 91 |
-
logger.info(f"✅ Auto-filled description")
|
| 92 |
-
|
| 93 |
-
# Store vision confidence scores in temp_data for reference
|
| 94 |
-
state.temp_data["vision_confidence"] = confidence
|
| 95 |
-
state.temp_data["from_vision_analysis"] = True
|
| 96 |
-
|
| 97 |
-
# Set user message to indicate vision analysis was done
|
| 98 |
-
state.last_user_message = "[Vision analysis completed - awaiting user confirmation]"
|
| 99 |
-
|
| 100 |
-
logger.info("✅ Vision analysis initialization complete")
|
| 101 |
-
return state
|
| 102 |
-
|
| 103 |
-
except Exception as e:
|
| 104 |
-
logger.error("Error initializing from vision analysis", exc_info=e)
|
| 105 |
-
state.set_error(f"Error initializing from vision: {str(e)}", should_retry=True)
|
| 106 |
-
return state
|
| 107 |
|
| 108 |
|
| 109 |
async def generate_contextual_question(state: AgentState, next_field: str = None) -> str:
|
|
|
|
| 30 |
temperature=0.7,
|
| 31 |
)
|
| 32 |
|
| 33 |
+
# ============================================================
|
| 34 |
+
# VISION ANALYSIS - DISABLED
|
| 35 |
+
# ============================================================
|
| 36 |
+
# NOTE: Vision analysis is NOT in use. Image uploads are handled
|
| 37 |
+
# directly by Cloudflare Worker (frontend upload).
|
| 38 |
+
# This function is kept for future reference only.
|
| 39 |
+
# ============================================================
|
| 40 |
+
# async def initialize_from_vision_analysis(
|
| 41 |
+
# state: AgentState,
|
| 42 |
+
# vision_data: Dict
|
| 43 |
+
# ) -> AgentState:
|
| 44 |
+
# """Initialize listing from AI vision analysis (images/video) - DISABLED"""
|
| 45 |
+
# pass
|
|
| 46 |
|
| 47 |
|
| 48 |
async def generate_contextual_question(state: AgentState, next_field: str = None) -> str:
|
app/ai/agent/nodes/listing_publish.py
CHANGED
|
@@ -11,6 +11,7 @@ from app.ai.agent.state import AgentState, FlowState
|
|
| 11 |
from app.ai.agent.schemas import ListingDraft
|
| 12 |
from app.database import get_db
|
| 13 |
from app.ai.services.vector_service import upsert_listing_to_vector_db
|
|
|
|
| 14 |
|
| 15 |
logger = get_logger(__name__)
|
| 16 |
|
|
@@ -282,6 +283,15 @@ async def listing_publish_handler(state: AgentState) -> AgentState:
|
|
| 282 |
logger.info("Proactive alerts processed for new listing", listing_id=listing_id)
|
| 283 |
except Exception as notify_err:
|
| 284 |
logger.warning("Proactive notification check failed", error=str(notify_err))
|
|
| 285 |
|
| 286 |
except Exception as e:
|
| 287 |
logger.error("MongoDB save failed", exc_info=e)
|
|
|
|
| 11 |
from app.ai.agent.schemas import ListingDraft
|
| 12 |
from app.database import get_db
|
| 13 |
from app.ai.services.vector_service import upsert_listing_to_vector_db
|
| 14 |
+
from app.ai.lightning.rewards import log_reward
|
| 15 |
|
| 16 |
logger = get_logger(__name__)
|
| 17 |
|
|
|
|
| 283 |
logger.info("Proactive alerts processed for new listing", listing_id=listing_id)
|
| 284 |
except Exception as notify_err:
|
| 285 |
logger.warning("Proactive notification check failed", error=str(notify_err))
|
| 286 |
+
|
| 287 |
+
# ✅ STEP 3.3: Log reward for successful publish (Agent Lightning)
|
| 288 |
+
is_update = bool(state.temp_data.get("editing_listing_id"))
|
| 289 |
+
await log_reward(
|
| 290 |
+
state.session_id,
|
| 291 |
+
1.0, # Primary success signal
|
| 292 |
+
"listing_published",
|
| 293 |
+
{"listing_id": listing_id, "is_update": is_update}
|
| 294 |
+
)
|
| 295 |
|
| 296 |
except Exception as e:
|
| 297 |
logger.error("MongoDB save failed", exc_info=e)
|
app/ai/lightning/__init__.py
ADDED
|
@@ -0,0 +1,38 @@
|
|
| 1 |
+
# app/ai/lightning/__init__.py
|
| 2 |
+
"""
|
| 3 |
+
Agent Lightning - RL Trajectory Capture for AIDA
|
| 4 |
+
|
| 5 |
+
This module implements Agent Lightning-inspired reinforcement learning
|
| 6 |
+
trajectory capture for training AIDA to improve listing completion rates.
|
| 7 |
+
|
| 8 |
+
Components:
|
| 9 |
+
- tracer.py: Captures state transitions, tool calls, and outcomes
|
| 10 |
+
- rewards.py: Logs reward signals at key events
|
| 11 |
+
- app/config.py: Lightning settings (LIGHTNING_ENABLED, LIGHTNING_TRAJECTORY_TTL_DAYS)
|
| 12 |
+
|
| 13 |
+
Usage:
|
| 14 |
+
from app.ai.lightning import log_reward, log_trajectory_step
|
| 15 |
+
|
| 16 |
+
# Log a reward signal
|
| 17 |
+
await log_reward(session_id, 1.0, "listing_published", {"listing_id": "..."})
|
| 18 |
+
|
| 19 |
+
# Trajectories are captured automatically when LIGHTNING_ENABLED=true
|
| 20 |
+
"""
|
| 21 |
+
|
| 22 |
+
from app.ai.lightning.rewards import log_reward, log_field_reward, log_negative_reward
|
| 23 |
+
from app.ai.lightning.tracer import (
|
| 24 |
+
wrap_graph_if_enabled,
|
| 25 |
+
log_trajectory_step,
|
| 26 |
+
get_session_trajectory,
|
| 27 |
+
export_trajectories_for_training
|
| 28 |
+
)
|
| 29 |
+
|
| 30 |
+
__all__ = [
|
| 31 |
+
"log_reward",
|
| 32 |
+
"log_field_reward",
|
| 33 |
+
"log_negative_reward",
|
| 34 |
+
"wrap_graph_if_enabled",
|
| 35 |
+
"log_trajectory_step",
|
| 36 |
+
"get_session_trajectory",
|
| 37 |
+
"export_trajectories_for_training"
|
| 38 |
+
]
|
app/ai/lightning/rewards.py
ADDED
|
@@ -0,0 +1,249 @@
|
|
| 1 |
+
# app/ai/lightning/rewards.py
|
| 2 |
+
"""
|
| 3 |
+
Agent Lightning Reward Signals
|
| 4 |
+
|
| 5 |
+
Logs reward signals at key events for RL training.
|
| 6 |
+
Rewards are associated with session trajectories.
|
| 7 |
+
|
| 8 |
+
Reward Definitions:
|
| 9 |
+
- listing_published: +1.0 (primary success signal)
|
| 10 |
+
- field_extracted: +0.1 per field (incremental progress)
|
| 11 |
+
- search_completed: +0.3 (search returned at least one result)
|
| 12 |
+
- alert_created: +0.2 (user engaged with notifications)
|
| 13 |
+
- conversation_error: -0.5 (negative signal for failures)
|
| 14 |
+
- conversation_abandoned: -0.3 (user left mid-flow)
|
| 15 |
+
"""
|
| 16 |
+
|
| 17 |
+
import json
|
| 18 |
+
from datetime import datetime
|
| 19 |
+
from typing import Any, Dict, List, Optional
|
| 20 |
+
from structlog import get_logger
|
| 21 |
+
|
| 22 |
+
logger = get_logger(__name__)
|
| 23 |
+
|
| 24 |
+
# Reward value constants
|
| 25 |
+
REWARD_LISTING_PUBLISHED = 1.0
|
| 26 |
+
REWARD_FIELD_EXTRACTED = 0.1
|
| 27 |
+
REWARD_SEARCH_COMPLETED = 0.3
|
| 28 |
+
REWARD_ALERT_CREATED = 0.2
|
| 29 |
+
REWARD_CONVERSATION_ERROR = -0.5
|
| 30 |
+
REWARD_CONVERSATION_ABANDONED = -0.3
|
| 31 |
+
|
| 32 |
+
# Redis connection (lazy initialization)
|
| 33 |
+
_redis_client = None
|
| 34 |
+
|
| 35 |
+
|
| 36 |
+
async def _get_redis():
|
| 37 |
+
"""Get or create Redis client"""
|
| 38 |
+
global _redis_client
|
| 39 |
+
if _redis_client is None:
|
| 40 |
+
try:
|
| 41 |
+
from app.ai.memory.redis_memory import get_redis_client
|
| 42 |
+
_redis_client = await get_redis_client()
|
| 43 |
+
except Exception as e:
|
| 44 |
+
logger.warning("Lightning Rewards: Redis connection failed", error=str(e))
|
| 45 |
+
return None
|
| 46 |
+
return _redis_client
|
| 47 |
+
|
| 48 |
+
|
| 49 |
+
def _is_lightning_enabled() -> bool:
|
| 50 |
+
"""Check if Lightning is enabled"""
|
| 51 |
+
try:
|
| 52 |
+
from app.config import settings
|
| 53 |
+
return getattr(settings, 'LIGHTNING_ENABLED', False)
|
| 54 |
+
except Exception:
|
| 55 |
+
return False
|
| 56 |
+
|
| 57 |
+
|
| 58 |
+
def _get_reward_ttl() -> int:
|
| 59 |
+
"""Get TTL for rewards in seconds (same as trajectories)"""
|
| 60 |
+
try:
|
| 61 |
+
from app.config import settings
|
| 62 |
+
days = getattr(settings, 'LIGHTNING_TRAJECTORY_TTL_DAYS', 30)
|
| 63 |
+
return days * 24 * 60 * 60
|
| 64 |
+
except Exception:
|
| 65 |
+
return 30 * 24 * 60 * 60
|
| 66 |
+
|
| 67 |
+
|
| 68 |
+
async def log_reward(
|
| 69 |
+
session_id: str,
|
| 70 |
+
reward: float,
|
| 71 |
+
event_type: str,
|
| 72 |
+
metadata: Optional[Dict[str, Any]] = None
|
| 73 |
+
) -> bool:
|
| 74 |
+
"""
|
| 75 |
+
Log a reward signal for a session.
|
| 76 |
+
|
| 77 |
+
Args:
|
| 78 |
+
session_id: Session identifier
|
| 79 |
+
reward: Reward value (positive or negative)
|
| 80 |
+
event_type: Type of event that triggered reward
|
| 81 |
+
metadata: Optional additional context
|
| 82 |
+
|
| 83 |
+
Returns:
|
| 84 |
+
True if logged successfully
|
| 85 |
+
"""
|
| 86 |
+
if not _is_lightning_enabled():
|
| 87 |
+
return False
|
| 88 |
+
|
| 89 |
+
try:
|
| 90 |
+
redis = await _get_redis()
|
| 91 |
+
if not redis:
|
| 92 |
+
return False
|
| 93 |
+
|
| 94 |
+
reward_entry = {
|
| 95 |
+
"timestamp": datetime.utcnow().isoformat(),
|
| 96 |
+
"reward": reward,
|
| 97 |
+
"event_type": event_type,
|
| 98 |
+
"metadata": metadata or {}
|
| 99 |
+
}
|
| 100 |
+
|
| 101 |
+
key = f"lightning:rewards:{session_id}"
|
| 102 |
+
await redis.rpush(key, json.dumps(reward_entry))
|
| 103 |
+
await redis.expire(key, _get_reward_ttl())
|
| 104 |
+
|
| 105 |
+
# Also increment global counters for monitoring
|
| 106 |
+
counter_key = f"lightning:stats:{event_type}"
|
| 107 |
+
await redis.incr(counter_key)
|
| 108 |
+
|
| 109 |
+
logger.info("Lightning: Reward logged",
|
| 110 |
+
session_id=session_id[:8],
|
| 111 |
+
reward=reward,
|
| 112 |
+
event_type=event_type)
|
| 113 |
+
return True
|
| 114 |
+
|
| 115 |
+
except Exception as e:
|
| 116 |
+
logger.warning("Lightning: Failed to log reward", error=str(e))
|
| 117 |
+
return False
|
| 118 |
+
|
| 119 |
+
|
| 120 |
+
async def log_field_reward(
|
| 121 |
+
session_id: str,
|
| 122 |
+
fields: List[str]
|
| 123 |
+
) -> bool:
|
| 124 |
+
"""
|
| 125 |
+
Log reward for successfully extracted listing fields.
|
| 126 |
+
|
| 127 |
+
Args:
|
| 128 |
+
session_id: Session identifier
|
| 129 |
+
fields: List of field names that were extracted
|
| 130 |
+
|
| 131 |
+
Returns:
|
| 132 |
+
True if logged successfully
|
| 133 |
+
"""
|
| 134 |
+
if not fields:
|
| 135 |
+
return False
|
| 136 |
+
|
| 137 |
+
# Calculate reward: 0.1 per field
|
| 138 |
+
reward = REWARD_FIELD_EXTRACTED * len(fields)
|
| 139 |
+
|
| 140 |
+
return await log_reward(
|
| 141 |
+
session_id,
|
| 142 |
+
reward,
|
| 143 |
+
"field_extracted",
|
| 144 |
+
{"fields": fields, "field_count": len(fields)}
|
| 145 |
+
)
|
| 146 |
+
|
| 147 |
+
|
| 148 |
+
async def log_negative_reward(
|
| 149 |
+
session_id: str,
|
| 150 |
+
event_type: str,
|
| 151 |
+
reason: str
|
| 152 |
+
) -> bool:
|
| 153 |
+
"""
|
| 154 |
+
Log a negative reward for errors or abandonment.
|
| 155 |
+
|
| 156 |
+
Args:
|
| 157 |
+
session_id: Session identifier
|
| 158 |
+
event_type: "error" or "abandoned" (any other value logs a generic -0.1)
|
| 159 |
+
reason: Description of what went wrong
|
| 160 |
+
|
| 161 |
+
Returns:
|
| 162 |
+
True if logged successfully
|
| 163 |
+
"""
|
| 164 |
+
if event_type == "error":
|
| 165 |
+
reward = REWARD_CONVERSATION_ERROR
|
| 166 |
+
elif event_type == "abandoned":
|
| 167 |
+
reward = REWARD_CONVERSATION_ABANDONED
|
| 168 |
+
else:
|
| 169 |
+
reward = -0.1 # Generic negative
|
| 170 |
+
|
| 171 |
+
return await log_reward(
|
| 172 |
+
session_id,
|
| 173 |
+
reward,
|
| 174 |
+
f"conversation_{event_type}",
|
| 175 |
+
{"reason": reason}
|
| 176 |
+
)
|
| 177 |
+
|
| 178 |
+
|
| 179 |
+
async def get_session_rewards(session_id: str) -> List[Dict[str, Any]]:
|
| 180 |
+
"""
|
| 181 |
+
Get all rewards for a session.
|
| 182 |
+
|
| 183 |
+
Args:
|
| 184 |
+
session_id: Session identifier
|
| 185 |
+
|
| 186 |
+
Returns:
|
| 187 |
+
List of reward entries
|
| 188 |
+
"""
|
| 189 |
+
if not _is_lightning_enabled():
|
| 190 |
+
return []
|
| 191 |
+
|
| 192 |
+
try:
|
| 193 |
+
redis = await _get_redis()
|
| 194 |
+
if not redis:
|
| 195 |
+
return []
|
| 196 |
+
|
| 197 |
+
key = f"lightning:rewards:{session_id}"
|
| 198 |
+
raw_rewards = await redis.lrange(key, 0, -1)
|
| 199 |
+
|
| 200 |
+
return [json.loads(r) for r in raw_rewards]
|
| 201 |
+
|
| 202 |
+
except Exception as e:
|
| 203 |
+
logger.warning("Lightning: Failed to get rewards", error=str(e))
|
| 204 |
+
return []
|
| 205 |
+
|
| 206 |
+
|
| 207 |
+
async def get_total_session_reward(session_id: str) -> float:
|
| 208 |
+
"""
|
| 209 |
+
Calculate total reward for a session.
|
| 210 |
+
|
| 211 |
+
Args:
|
| 212 |
+
session_id: Session identifier
|
| 213 |
+
|
| 214 |
+
Returns:
|
| 215 |
+
Sum of all rewards
|
| 216 |
+
"""
|
| 217 |
+
rewards = await get_session_rewards(session_id)
|
| 218 |
+
return sum(r.get("reward", 0) for r in rewards)
|
| 219 |
+
|
| 220 |
+
|
| 221 |
+
async def get_lightning_stats() -> Dict[str, int]:
|
| 222 |
+
"""
|
| 223 |
+
Get global Lightning statistics.
|
| 224 |
+
|
| 225 |
+
Returns:
|
| 226 |
+
Dict of event_type -> count
|
| 227 |
+
"""
|
| 228 |
+
if not _is_lightning_enabled():
|
| 229 |
+
return {}
|
| 230 |
+
|
| 231 |
+
try:
|
| 232 |
+
redis = await _get_redis()
|
| 233 |
+
if not redis:
|
| 234 |
+
return {}
|
| 235 |
+
|
| 236 |
+
# Get all stats keys
|
| 237 |
+
stats_keys = await redis.keys("lightning:stats:*")
|
| 238 |
+
|
| 239 |
+
stats = {}
|
| 240 |
+
for key in stats_keys:
|
| 241 |
+
event_type = key.decode().split(":")[-1]
|
| 242 |
+
count = await redis.get(key)
|
| 243 |
+
stats[event_type] = int(count) if count else 0
|
| 244 |
+
|
| 245 |
+
return stats
|
| 246 |
+
|
| 247 |
+
except Exception as e:
|
| 248 |
+
logger.warning("Lightning: Failed to get stats", error=str(e))
|
| 249 |
+
return {}
|
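Reviewer sketch (not part of the commit): the reward bookkeeping in rewards.py boils down to appending JSON records to a per-session list and summing their `reward` fields. The sketch below reproduces that with an in-memory dict instead of Redis. `InMemoryRewardLog` is a hypothetical name, and it uses `datetime.now(timezone.utc)` as a timezone-aware alternative to `datetime.utcnow()`, which is deprecated since Python 3.12.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional

REWARD_FIELD_EXTRACTED = 0.1  # same value as the constant in rewards.py


class InMemoryRewardLog:
    """Hypothetical stand-in for the Redis lists used by rewards.py."""

    def __init__(self) -> None:
        self._lists: Dict[str, List[str]] = defaultdict(list)

    def log_reward(
        self,
        session_id: str,
        reward: float,
        event_type: str,
        metadata: Optional[Dict[str, Any]] = None,
    ) -> None:
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "reward": reward,
            "event_type": event_type,
            "metadata": metadata or {},
        }
        # Same key layout as the diff: one list per session.
        self._lists[f"lightning:rewards:{session_id}"].append(json.dumps(entry))

    def total(self, session_id: str) -> float:
        raw = self._lists.get(f"lightning:rewards:{session_id}", [])
        return sum(json.loads(r).get("reward", 0) for r in raw)


log = InMemoryRewardLog()
fields = ["bedrooms", "bathrooms", "price"]
log.log_reward("s1", REWARD_FIELD_EXTRACTED * len(fields), "field_extracted",
               {"fields": fields, "field_count": len(fields)})
log.log_reward("s1", 1.0, "listing_published")
log.log_reward("s1", -0.5, "conversation_error", {"reason": "timeout"})
total = log.total("s1")
print(round(total, 2))
```

Because rewards are floats, totals are best compared with a tolerance rather than exact equality.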
app/ai/lightning/tracer.py
ADDED
|
@@ -0,0 +1,326 @@
|
|
| 1 |
+
# app/ai/lightning/tracer.py
|
| 2 |
+
"""
|
| 3 |
+
Agent Lightning Trajectory Tracer
|
| 4 |
+
|
| 5 |
+
Captures state transitions and tool calls for RL training.
|
| 6 |
+
Uses Redis for trajectory storage with automatic TTL cleanup.
|
| 7 |
+
|
| 8 |
+
Design Principles:
|
| 9 |
+
1. Zero overhead when disabled (LIGHTNING_ENABLED=false)
|
| 10 |
+
2. Non-blocking async operations
|
| 11 |
+
3. Graceful degradation on errors
|
| 12 |
+
4. Compatible with existing LangGraph architecture
|
| 13 |
+
"""
|
| 14 |
+
|
| 15 |
+
import json
|
| 16 |
+
import asyncio
|
| 17 |
+
from datetime import datetime
|
| 18 |
+
from typing import Any, Dict, List, Optional, Callable
|
| 19 |
+
from functools import wraps
|
| 20 |
+
from structlog import get_logger
|
| 21 |
+
|
| 22 |
+
logger = get_logger(__name__)
|
| 23 |
+
|
| 24 |
+
# Redis connection (lazy initialization)
|
| 25 |
+
_redis_client = None
|
| 26 |
+
|
| 27 |
+
|
| 28 |
+
async def _get_redis():
|
| 29 |
+
"""Get or create Redis client for lightning storage"""
|
| 30 |
+
global _redis_client
|
| 31 |
+
if _redis_client is None:
|
| 32 |
+
try:
|
| 33 |
+
from app.ai.memory.redis_memory import get_redis_client
|
| 34 |
+
_redis_client = await get_redis_client()
|
| 35 |
+
except Exception as e:
|
| 36 |
+
logger.warning("Lightning: Redis connection failed, trajectories will not be stored", error=str(e))
|
| 37 |
+
return None
|
| 38 |
+
return _redis_client
|
| 39 |
+
|
| 40 |
+
|
| 41 |
+
def _is_lightning_enabled() -> bool:
|
| 42 |
+
"""Check if Lightning is enabled via config"""
|
| 43 |
+
try:
|
| 44 |
+
from app.config import settings
|
| 45 |
+
return getattr(settings, 'LIGHTNING_ENABLED', False)
|
| 46 |
+
except Exception:
|
| 47 |
+
return False
|
| 48 |
+
|
| 49 |
+
|
| 50 |
+
def _get_trajectory_ttl() -> int:
|
| 51 |
+
"""Get TTL for trajectories in seconds"""
|
| 52 |
+
try:
|
| 53 |
+
from app.config import settings
|
| 54 |
+
days = getattr(settings, 'LIGHTNING_TRAJECTORY_TTL_DAYS', 30)
|
| 55 |
+
return days * 24 * 60 * 60 # Convert to seconds
|
| 56 |
+
except Exception:
|
| 57 |
+
return 30 * 24 * 60 * 60 # Default 30 days
|
| 58 |
+
|
| 59 |
+
|
| 60 |
+
async def log_trajectory_step(
|
| 61 |
+
session_id: str,
|
| 62 |
+
step_type: str,
|
| 63 |
+
data: Dict[str, Any],
|
| 64 |
+
user_id: Optional[str] = None
|
| 65 |
+
) -> bool:
|
| 66 |
+
"""
|
| 67 |
+
Log a single trajectory step to Redis.
|
| 68 |
+
|
| 69 |
+
Args:
|
| 70 |
+
session_id: Unique session identifier
|
| 71 |
+
step_type: Type of step (user_input, brain_decision, tool_call, tool_result, response)
|
| 72 |
+
data: Step data (varies by type)
|
| 73 |
+
user_id: Optional user ID for filtering
|
| 74 |
+
|
| 75 |
+
Returns:
|
| 76 |
+
True if logged successfully, False otherwise
|
| 77 |
+
"""
|
| 78 |
+
if not _is_lightning_enabled():
|
| 79 |
+
return False
|
| 80 |
+
|
| 81 |
+
try:
|
| 82 |
+
redis = await _get_redis()
|
| 83 |
+
if not redis:
|
| 84 |
+
return False
|
| 85 |
+
|
| 86 |
+
step = {
|
| 87 |
+
"timestamp": datetime.utcnow().isoformat(),
|
| 88 |
+
"step_type": step_type,
|
| 89 |
+
"data": data,
|
| 90 |
+
"user_id": user_id
|
| 91 |
+
}
|
| 92 |
+
|
| 93 |
+
# Store as list under session key
|
| 94 |
+
key = f"lightning:trajectory:{session_id}"
|
| 95 |
+
await redis.rpush(key, json.dumps(step))
|
| 96 |
+
await redis.expire(key, _get_trajectory_ttl())
|
| 97 |
+
|
| 98 |
+
logger.debug("Lightning: Trajectory step logged",
|
| 99 |
+
session_id=session_id[:8],
|
| 100 |
+
step_type=step_type)
|
| 101 |
+
return True
|
| 102 |
+
|
| 103 |
+
except Exception as e:
|
| 104 |
+
logger.warning("Lightning: Failed to log trajectory step", error=str(e))
|
| 105 |
+
return False
|
| 106 |
+
|
| 107 |
+
|
| 108 |
+
async def get_session_trajectory(session_id: str) -> List[Dict[str, Any]]:
|
| 109 |
+
"""
|
| 110 |
+
Retrieve full trajectory for a session.
|
| 111 |
+
|
| 112 |
+
Args:
|
| 113 |
+
session_id: Session identifier
|
| 114 |
+
|
| 115 |
+
Returns:
|
| 116 |
+
List of trajectory steps
|
| 117 |
+
"""
|
| 118 |
+
if not _is_lightning_enabled():
|
| 119 |
+
return []
|
| 120 |
+
|
| 121 |
+
try:
|
| 122 |
+
redis = await _get_redis()
|
| 123 |
+
if not redis:
|
| 124 |
+
return []
|
| 125 |
+
|
| 126 |
+
key = f"lightning:trajectory:{session_id}"
|
| 127 |
+
raw_steps = await redis.lrange(key, 0, -1)
|
| 128 |
+
|
| 129 |
+
return [json.loads(step) for step in raw_steps]
|
| 130 |
+
|
| 131 |
+
except Exception as e:
|
| 132 |
+
logger.warning("Lightning: Failed to get trajectory", error=str(e))
|
| 133 |
+
return []
|
| 134 |
+
|
| 135 |
+
|
| 136 |
+
async def export_trajectories_for_training(
|
| 137 |
+
min_steps: int = 3,
|
| 138 |
+
max_trajectories: int = 1000,
|
| 139 |
+
only_completed: bool = True
|
| 140 |
+
) -> List[Dict[str, Any]]:
|
| 141 |
+
"""
|
| 142 |
+
Export trajectories for RL training.
|
| 143 |
+
|
| 144 |
+
Args:
|
| 145 |
+
min_steps: Minimum steps required per trajectory
|
| 146 |
+
max_trajectories: Maximum number to export
|
| 147 |
+
only_completed: Only include trajectories with rewards
|
| 148 |
+
|
| 149 |
+
Returns:
|
| 150 |
+
List of trajectories with their rewards
|
| 151 |
+
"""
|
| 152 |
+
if not _is_lightning_enabled():
|
| 153 |
+
logger.warning("Lightning: Cannot export - Lightning not enabled")
|
| 154 |
+
return []
|
| 155 |
+
|
| 156 |
+
try:
|
| 157 |
+
redis = await _get_redis()
|
| 158 |
+
if not redis:
|
| 159 |
+
return []
|
| 160 |
+
|
| 161 |
+
# Get all trajectory keys
|
| 162 |
+
trajectory_keys = await redis.keys("lightning:trajectory:*")
|
| 163 |
+
|
| 164 |
+
trajectories = []
|
| 165 |
+
for key in trajectory_keys[:max_trajectories * 2]: # Get extra to filter
|
| 166 |
+
session_id = (key.decode() if isinstance(key, bytes) else key).split(":")[-1]
|
| 167 |
+
|
| 168 |
+
# Get trajectory
|
| 169 |
+
raw_steps = await redis.lrange(key, 0, -1)
|
| 170 |
+
steps = [json.loads(step) for step in raw_steps]
|
| 171 |
+
|
| 172 |
+
if len(steps) < min_steps:
|
| 173 |
+
continue
|
| 174 |
+
|
| 175 |
+
# Get rewards for this session
|
| 176 |
+
reward_key = f"lightning:rewards:{session_id}"
|
| 177 |
+
raw_rewards = await redis.lrange(reward_key, 0, -1)
|
| 178 |
+
rewards = [json.loads(r) for r in raw_rewards]
|
| 179 |
+
|
| 180 |
+
if only_completed and not rewards:
|
| 181 |
+
continue
|
| 182 |
+
|
| 183 |
+
# Calculate total reward
|
| 184 |
+
total_reward = sum(r.get("reward", 0) for r in rewards)
|
| 185 |
+
|
| 186 |
+
trajectories.append({
|
| 187 |
+
"session_id": session_id,
|
| 188 |
+
"steps": steps,
|
| 189 |
+
"rewards": rewards,
|
| 190 |
+
"total_reward": total_reward,
|
| 191 |
+
"step_count": len(steps)
|
| 192 |
+
})
|
| 193 |
+
|
| 194 |
+
if len(trajectories) >= max_trajectories:
|
| 195 |
+
break
|
| 196 |
+
|
| 197 |
+
logger.info("Lightning: Exported trajectories for training", count=len(trajectories))
|
| 198 |
+
return trajectories
|
| 199 |
+
|
| 200 |
+
except Exception as e:
|
| 201 |
+
logger.error("Lightning: Failed to export trajectories", error=str(e))
|
| 202 |
+
return []
|
| 203 |
+
|
| 204 |
+
|
| 205 |
+
def wrap_graph_if_enabled(compiled_graph):
|
| 206 |
+
"""
|
| 207 |
+
Wrap a compiled LangGraph with trajectory logging.
|
| 208 |
+
|
| 209 |
+
This is a passthrough wrapper that logs trajectory steps
|
| 210 |
+
without modifying the graph's behavior.
|
| 211 |
+
|
| 212 |
+
Args:
|
| 213 |
+
compiled_graph: The compiled LangGraph
|
| 214 |
+
|
| 215 |
+
Returns:
|
| 216 |
+
Wrapped graph (or original if Lightning disabled)
|
| 217 |
+
"""
|
| 218 |
+
if not _is_lightning_enabled():
|
| 219 |
+
logger.info("Lightning: Disabled - returning unwrapped graph")
|
| 220 |
+
return compiled_graph
|
| 221 |
+
|
| 222 |
+
logger.info("Lightning: Wrapping graph with trajectory capture")
|
| 223 |
+
|
| 224 |
+
# For now, return the original graph
|
| 225 |
+
# The trajectory logging is done at the brain.py level
|
| 226 |
+
# This wrapper is a hook for future enhancements
|
| 227 |
+
return compiled_graph
|
| 228 |
+
|
| 229 |
+
|
| 230 |
+
class TrajectoryContext:
|
| 231 |
+
"""
|
| 232 |
+
Context manager for tracking a complete conversation trajectory.
|
| 233 |
+
|
| 234 |
+
Usage:
|
| 235 |
+
async with TrajectoryContext(session_id, user_id) as ctx:
|
| 236 |
+
await ctx.log_user_input(message)
|
| 237 |
+
await ctx.log_brain_decision(thinking, tool, params)
|
| 238 |
+
await ctx.log_tool_call(tool, success, message)
|
| 239 |
+
await ctx.log_response(response)
|
| 240 |
+
"""
|
| 241 |
+
|
| 242 |
+
def __init__(self, session_id: str, user_id: Optional[str] = None):
|
| 243 |
+
self.session_id = session_id
|
| 244 |
+
self.user_id = user_id
|
| 245 |
+
self.start_time = None
|
| 246 |
+
self.enabled = _is_lightning_enabled()
|
| 247 |
+
|
| 248 |
+
async def __aenter__(self):
|
| 249 |
+
self.start_time = datetime.utcnow()
|
| 250 |
+
if self.enabled:
|
| 251 |
+
await log_trajectory_step(
|
| 252 |
+
self.session_id,
|
| 253 |
+
"session_start",
|
| 254 |
+
{"timestamp": self.start_time.isoformat()},
|
| 255 |
+
self.user_id
|
| 256 |
+
)
|
| 257 |
+
return self
|
| 258 |
+
|
| 259 |
+
async def __aexit__(self, exc_type, exc_val, exc_tb):
|
| 260 |
+
if self.enabled:
|
| 261 |
+
end_time = datetime.utcnow()
|
| 262 |
+
duration = (end_time - self.start_time).total_seconds()
|
| 263 |
+
await log_trajectory_step(
|
| 264 |
+
self.session_id,
|
| 265 |
+
"session_end",
|
| 266 |
+
{
|
| 267 |
+
"duration_seconds": duration,
|
| 268 |
+
"error": str(exc_val) if exc_val else None
|
| 269 |
+
},
|
| 270 |
+
self.user_id
|
| 271 |
+
)
|
| 272 |
+
return False # Don't suppress exceptions
|
| 273 |
+
|
| 274 |
+
async def log_user_input(self, message: str, is_voice: bool = False):
|
| 275 |
+
"""Log user input step"""
|
| 276 |
+
if self.enabled:
|
| 277 |
+
await log_trajectory_step(
|
| 278 |
+
self.session_id,
|
| 279 |
+
"user_input",
|
| 280 |
+
{
|
| 281 |
+
"message": message[:500], # Truncate long messages
|
| 282 |
+
"is_voice": is_voice
|
| 283 |
+
},
|
| 284 |
+
self.user_id
|
| 285 |
+
)
|
| 286 |
+
|
| 287 |
+
async def log_brain_decision(self, thinking: str, tool: Optional[str], params: Dict):
|
| 288 |
+
"""Log brain decision step"""
|
| 289 |
+
if self.enabled:
|
| 290 |
+
await log_trajectory_step(
|
| 291 |
+
self.session_id,
|
| 292 |
+
"brain_decision",
|
| 293 |
+
{
|
| 294 |
+
"thinking": thinking[:200], # Truncate
|
| 295 |
+
"tool": tool,
|
| 296 |
+
"params": {k: str(v)[:100] for k, v in params.items()} if params else {}
|
| 297 |
+
},
|
| 298 |
+
self.user_id
|
| 299 |
+
)
|
| 300 |
+
|
| 301 |
+
async def log_tool_call(self, tool: str, success: bool, message: str):
|
| 302 |
+
"""Log tool execution step"""
|
| 303 |
+
if self.enabled:
|
| 304 |
+
await log_trajectory_step(
|
| 305 |
+
self.session_id,
|
| 306 |
+
"tool_result",
|
| 307 |
+
{
|
| 308 |
+
"tool": tool,
|
| 309 |
+
"success": success,
|
| 310 |
+
"message": message[:200]
|
| 311 |
+
},
|
| 312 |
+
self.user_id
|
| 313 |
+
)
|
| 314 |
+
|
| 315 |
+
async def log_response(self, response: str, action: Optional[str] = None):
|
| 316 |
+
"""Log AI response step"""
|
| 317 |
+
if self.enabled:
|
| 318 |
+
await log_trajectory_step(
|
| 319 |
+
self.session_id,
|
| 320 |
+
"response",
|
| 321 |
+
{
|
| 322 |
+
"response": response[:500],
|
| 323 |
+
"action": action
|
| 324 |
+
},
|
| 325 |
+
self.user_id
|
| 326 |
+
)
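The async context-manager pattern used by `TrajectoryContext` can be tried out without Redis. The following is a minimal sketch, not the project's implementation: `InMemoryTrajectory` and `run_turn` are hypothetical names, and steps are appended to a plain list instead of being written through `log_trajectory_step`.

```python
import asyncio
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


class InMemoryTrajectory:
    """Hypothetical stand-in for TrajectoryContext: records steps in a list, not Redis."""

    def __init__(self, session_id: str, user_id: Optional[str] = None):
        self.session_id = session_id
        self.user_id = user_id
        self.steps: List[Dict[str, Any]] = []
        self.start_time: Optional[datetime] = None

    async def _log(self, step_type: str, data: Dict[str, Any]) -> None:
        # Assumption: one dict per step, in call order.
        self.steps.append({"step_type": step_type, "data": data, "user_id": self.user_id})

    async def __aenter__(self) -> "InMemoryTrajectory":
        self.start_time = datetime.now(timezone.utc)
        await self._log("session_start", {"timestamp": self.start_time.isoformat()})
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb) -> bool:
        duration = (datetime.now(timezone.utc) - self.start_time).total_seconds()
        await self._log(
            "session_end",
            {"duration_seconds": duration, "error": str(exc_val) if exc_val else None},
        )
        return False  # Do not suppress exceptions raised inside the block

    async def log_user_input(self, message: str, is_voice: bool = False) -> None:
        await self._log("user_input", {"message": message[:500], "is_voice": is_voice})

    async def log_response(self, response: str, action: Optional[str] = None) -> None:
        await self._log("response", {"response": response[:500], "action": action})


async def run_turn() -> List[str]:
    async with InMemoryTrajectory("session-1234", user_id="u1") as ctx:
        await ctx.log_user_input("3-bed flat near schools in Cotonou")
        await ctx.log_response("Here are three listings.", action="search")
    return [step["step_type"] for step in ctx.steps]


if __name__ == "__main__":
    print(asyncio.run(run_turn()))
```

Because every logging method is a coroutine, each call has to be awaited; a call without `await` only creates a coroutine object and records nothing.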
|
app/ai/services/__init__.py
CHANGED
|
@@ -0,0 +1,52 @@
|
| 1 |
+
# app/ai/services/__init__.py
|
| 2 |
+
"""
|
| 3 |
+
AI Services for AIDA
|
| 4 |
+
|
| 5 |
+
Includes:
|
| 6 |
+
- Search services (hybrid, MongoDB, Qdrant)
|
| 7 |
+
- RLM (Recursive Language Model) for complex queries
|
| 8 |
+
- Strategy selection
|
| 9 |
+
- Intent classification
|
| 10 |
+
- OpenStreetMap POI service for proximity searches
|
| 11 |
+
"""
|
| 12 |
+
|
| 13 |
+
from app.ai.services.rlm_query_analyzer import (
|
| 14 |
+
QueryComplexity,
|
| 15 |
+
QueryAnalysis,
|
| 16 |
+
analyze_query_complexity,
|
| 17 |
+
should_use_rlm
|
| 18 |
+
)
|
| 19 |
+
|
| 20 |
+
from app.ai.services.rlm_search_service import (
|
| 21 |
+
RLMSearchAgent,
|
| 22 |
+
get_rlm_agent,
|
| 23 |
+
rlm_search
|
| 24 |
+
)
|
| 25 |
+
|
| 26 |
+
from app.ai.services.osm_poi_service import (
|
| 27 |
+
find_pois,
|
| 28 |
+
find_pois_overpass,
|
| 29 |
+
geocode_location,
|
| 30 |
+
find_multiple_poi_types,
|
| 31 |
+
calculate_distance_km
|
| 32 |
+
)
|
| 33 |
+
|
| 34 |
+
__all__ = [
|
| 35 |
+
# RLM Query Analyzer
|
| 36 |
+
"QueryComplexity",
|
| 37 |
+
"QueryAnalysis",
|
| 38 |
+
"analyze_query_complexity",
|
| 39 |
+
"should_use_rlm",
|
| 40 |
+
|
| 41 |
+
# RLM Search Service
|
| 42 |
+
"RLMSearchAgent",
|
| 43 |
+
"get_rlm_agent",
|
| 44 |
+
"rlm_search",
|
| 45 |
+
|
| 46 |
+
# OpenStreetMap POI Service
|
| 47 |
+
"find_pois",
|
| 48 |
+
"find_pois_overpass",
|
| 49 |
+
"geocode_location",
|
| 50 |
+
"find_multiple_poi_types",
|
| 51 |
+
"calculate_distance_km",
|
| 52 |
+
]
|
app/ai/services/osm_poi_service.py
ADDED
|
@@ -0,0 +1,499 @@
|
| 1 |
+
# app/ai/services/osm_poi_service.py
|
| 2 |
+
"""
|
| 3 |
+
OpenStreetMap POI (Point of Interest) Service for AIDA RLM.
|
| 4 |
+
|
| 5 |
+
Uses FREE OpenStreetMap APIs:
|
| 6 |
+
- Nominatim: Geocoding (location name → coordinates)
|
| 7 |
+
- Overpass: POI search (find schools, hospitals, parks near a location)
|
| 8 |
+
|
| 9 |
+
No API key required! Just respect rate limits (1 request/second for Nominatim).
|
| 10 |
+
|
| 11 |
+
Supports:
|
| 12 |
+
- Schools, universities, colleges
|
| 13 |
+
- Hospitals, clinics, pharmacies
|
| 14 |
+
- Parks, gardens, beaches
|
| 15 |
+
- Markets, supermarkets, malls
|
| 16 |
+
- Airports, bus stations
|
| 17 |
+
- Mosques, churches
|
| 18 |
+
- And more...
|
| 19 |
+
"""
|
| 20 |
+
|
| 21 |
+
import asyncio
|
| 22 |
+
import httpx
|
| 23 |
+
from typing import List, Dict, Optional, Tuple
|
| 24 |
+
from structlog import get_logger
|
| 25 |
+
|
| 26 |
+
logger = get_logger(__name__)
|
| 27 |
+
|
| 28 |
+
# Rate limiting: Nominatim requires max 1 request/second
|
| 29 |
+
_last_nominatim_request = 0
|
| 30 |
+
|
| 31 |
+
|
| 32 |
+
# =============================================================================
|
| 33 |
+
# OSM Tag Mappings
|
| 34 |
+
# =============================================================================
|
| 35 |
+
|
| 36 |
+
OSM_POI_TAGS = {
|
| 37 |
+
# Education
|
| 38 |
+
"school": "amenity=school",
|
| 39 |
+
"schools": "amenity=school",
|
| 40 |
+
"primary school": "amenity=school",
|
| 41 |
+
"secondary school": "amenity=school",
|
| 42 |
+
"high school": "amenity=school",
|
| 43 |
+
"university": "amenity=university",
|
| 44 |
+
"college": "amenity=college",
|
| 45 |
+
"kindergarten": "amenity=kindergarten",
|
| 46 |
+
|
| 47 |
+
# Healthcare
|
| 48 |
+
"hospital": "amenity=hospital",
|
| 49 |
+
"clinic": "amenity=clinic",
|
| 50 |
+
"pharmacy": "amenity=pharmacy",
|
| 51 |
+
"doctor": "amenity=doctors",
|
| 52 |
+
|
| 53 |
+
# Recreation & Nature
|
| 54 |
+
"beach": "natural=beach",
|
| 55 |
+
"park": "leisure=park",
|
| 56 |
+
"garden": "leisure=garden",
|
| 57 |
+
"playground": "leisure=playground",
|
| 58 |
+
"sports": "leisure=sports_centre",
|
| 59 |
+
"gym": "leisure=fitness_centre",
|
| 60 |
+
"swimming pool": "leisure=swimming_pool",
|
| 61 |
+
"stadium": "leisure=stadium",
|
| 62 |
+
|
| 63 |
+
# Shopping
|
| 64 |
+
"market": "amenity=marketplace",
|
| 65 |
+
"supermarket": "shop=supermarket",
|
| 66 |
+
"mall": "shop=mall",
|
| 67 |
+
"shopping center": "shop=mall",
|
| 68 |
+
"shop": "shop=supermarket",
|
| 69 |
+
|
| 70 |
+
# Transport
|
| 71 |
+
"airport": "aeroway=aerodrome",
|
| 72 |
+
"bus station": "amenity=bus_station",
|
| 73 |
+
"bus stop": "highway=bus_stop",
|
| 74 |
+
"train station": "railway=station",
|
| 75 |
+
"port": "amenity=ferry_terminal",
|
| 76 |
+
|
| 77 |
+
# Religious
|
| 78 |
+
"mosque": "amenity=place_of_worship][religion=muslim",
|
| 79 |
+
"church": "amenity=place_of_worship][religion=christian",
|
| 80 |
+
"cathedral": "building=cathedral",
|
| 81 |
+
|
| 82 |
+
# Food & Drink
|
| 83 |
+
"restaurant": "amenity=restaurant",
|
| 84 |
+
"cafe": "amenity=cafe",
|
| 85 |
+
"bar": "amenity=bar",
|
| 86 |
+
|
| 87 |
+
# Business & Services
|
| 88 |
+
"bank": "amenity=bank",
|
| 89 |
+
"atm": "amenity=atm",
|
| 90 |
+
"police": "amenity=police",
|
| 91 |
+
"post office": "amenity=post_office",
|
| 92 |
+
"embassy": "amenity=embassy",
|
| 93 |
+
|
| 94 |
+
# Landmarks
|
| 95 |
+
"downtown": "place=city_centre",
|
| 96 |
+
"city center": "place=city_centre",
|
| 97 |
+
"city centre": "place=city_centre",
|
| 98 |
+
}
|
| 99 |
+
|
| 100 |
+
# French translations
|
| 101 |
+
OSM_POI_TAGS_FR = {
|
| 102 |
+
"école": "amenity=school",
|
| 103 |
+
"ecole": "amenity=school",
|
| 104 |
+
"lycée": "amenity=school",
|
| 105 |
+
"lycee": "amenity=school",
|
| 106 |
+
"collège": "amenity=school",
|
| 107 |
+
"college": "amenity=college",
|
| 108 |
+
"université": "amenity=university",
|
| 109 |
+
"universite": "amenity=university",
|
| 110 |
+
"hôpital": "amenity=hospital",
|
| 111 |
+
"hopital": "amenity=hospital",
|
| 112 |
+
"clinique": "amenity=clinic",
|
| 113 |
+
"pharmacie": "amenity=pharmacy",
|
| 114 |
+
"plage": "natural=beach",
|
| 115 |
+
"parc": "leisure=park",
|
| 116 |
+
"jardin": "leisure=garden",
|
| 117 |
+
"marché": "amenity=marketplace",
|
| 118 |
+
"marche": "amenity=marketplace",
|
| 119 |
+
"supermarché": "shop=supermarket",
|
| 120 |
+
"aéroport": "aeroway=aerodrome",
|
| 121 |
+
"aeroport": "aeroway=aerodrome",
|
| 122 |
+
"gare": "railway=station",
|
| 123 |
+
"mosquée": "amenity=place_of_worship][religion=muslim",
|
| 124 |
+
"mosquee": "amenity=place_of_worship][religion=muslim",
|
| 125 |
+
"église": "amenity=place_of_worship][religion=christian",
|
| 126 |
+
"eglise": "amenity=place_of_worship][religion=christian",
|
| 127 |
+
"centre-ville": "place=city_centre",
|
| 128 |
+
}
|
| 129 |
+
|
| 130 |
+
# Merge all tags
|
| 131 |
+
ALL_POI_TAGS = {**OSM_POI_TAGS, **OSM_POI_TAGS_FR}
|
| 132 |
+
|
| 133 |
+
|
| 134 |
+
# =============================================================================
|
| 135 |
+
# Nominatim Geocoding
|
| 136 |
+
# =============================================================================
|
| 137 |
+
|
| 138 |
+
async def geocode_location(location: str) -> Optional[Tuple[float, float]]:
|
| 139 |
+
"""
|
| 140 |
+
Convert location name to coordinates using Nominatim.
|
| 141 |
+
|
| 142 |
+
Args:
|
| 143 |
+
location: Location name (e.g., "Cotonou, Benin")
|
| 144 |
+
|
| 145 |
+
Returns:
|
| 146 |
+
Tuple of (latitude, longitude) or None if not found
|
| 147 |
+
"""
|
| 148 |
+
global _last_nominatim_request
|
| 149 |
+
|
| 150 |
+
# Rate limiting: wait if needed
|
| 151 |
+
import time
|
| 152 |
+
now = time.time()
|
| 153 |
+
if now - _last_nominatim_request < 1:
|
| 154 |
+
await asyncio.sleep(1 - (now - _last_nominatim_request))
|
| 155 |
+
_last_nominatim_request = time.time()
|
| 156 |
+
|
| 157 |
+
try:
|
| 158 |
+
async with httpx.AsyncClient(timeout=15) as client:
|
| 159 |
+
response = await client.get(
|
| 160 |
+
"https://nominatim.openstreetmap.org/search",
|
| 161 |
+
params={
|
| 162 |
+
"q": location,
|
| 163 |
+
"format": "json",
|
| 164 |
+
"limit": 1,
|
| 165 |
+
"addressdetails": 1
|
| 166 |
+
},
|
| 167 |
+
headers={
|
| 168 |
+
"User-Agent": "AIDA-RealEstate/1.0 (contact@lojiz.com)"
|
| 169 |
+
}
|
| 170 |
+
)
|
| 171 |
+
|
| 172 |
+
if response.status_code != 200:
|
| 173 |
+
logger.error(f"Nominatim error: {response.status_code}")
|
| 174 |
+
return None
|
| 175 |
+
|
| 176 |
+
data = response.json()
|
| 177 |
+
|
| 178 |
+
if not data:
|
| 179 |
+
logger.warning(f"Location not found: {location}")
|
| 180 |
+
return None
|
| 181 |
+
|
| 182 |
+
lat = float(data[0]["lat"])
|
| 183 |
+
lon = float(data[0]["lon"])
|
| 184 |
+
|
| 185 |
+
logger.info(
|
| 186 |
+
"Geocoded location",
|
| 187 |
+
location=location,
|
| 188 |
+
lat=lat,
|
| 189 |
+
lon=lon
|
| 190 |
+
)
|
| 191 |
+
|
| 192 |
+
return (lat, lon)
|
| 193 |
+
|
| 194 |
+
except Exception as e:
|
| 195 |
+
logger.error(f"Geocoding failed: {e}")
|
| 196 |
+
return None
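The module-level timestamp above enforces the one-request-per-second limit only when calls arrive one at a time. Concurrent callers can read the same timestamp and proceed together. One way to serialize them is sketched below under stated assumptions: `MinIntervalLimiter` and `demo` are hypothetical names that do not appear in this commit, and the 0.05 s interval in the demo is chosen only to keep it quick.

```python
import asyncio
import time
from typing import List, Optional


class MinIntervalLimiter:
    """Hypothetical helper: lets callers through at most once per `min_interval` seconds."""

    def __init__(self, min_interval: float = 1.0):
        self.min_interval = min_interval
        self._lock = asyncio.Lock()
        self._last: Optional[float] = None  # monotonic time of the last granted call

    async def wait(self) -> None:
        # The lock makes concurrent callers queue up instead of racing on _last.
        async with self._lock:
            if self._last is not None:
                delay = self.min_interval - (time.monotonic() - self._last)
                if delay > 0:
                    await asyncio.sleep(delay)
            self._last = time.monotonic()


async def demo(calls: int = 3, interval: float = 0.05) -> List[float]:
    limiter = MinIntervalLimiter(interval)  # created inside the running loop
    stamps: List[float] = []

    async def one_call() -> None:
        await limiter.wait()
        stamps.append(time.monotonic())

    await asyncio.gather(*[one_call() for _ in range(calls)])
    return stamps


if __name__ == "__main__":
    times = asyncio.run(demo())
    print([round(t - times[0], 3) for t in times])
```

A limiter like this would be awaited immediately before the HTTP request; `time.monotonic()` is used so that wall-clock adjustments cannot shorten the gap.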
|
| 197 |
+
|
| 198 |
+
|
| 199 |
+
# =============================================================================
|
| 200 |
+
# Overpass POI Search
|
| 201 |
+
# =============================================================================
|
| 202 |
+
|
| 203 |
+
async def find_pois_overpass(
|
| 204 |
+
poi_type: str,
|
| 205 |
+
center_lat: float,
|
| 206 |
+
center_lon: float,
|
| 207 |
+
radius_km: float = 5
|
| 208 |
+
) -> List[Dict]:
|
| 209 |
+
"""
|
| 210 |
+
Find POIs near a location using Overpass API.
|
| 211 |
+
|
| 212 |
+
Args:
|
| 213 |
+
poi_type: Type of POI (school, hospital, beach, etc.)
|
| 214 |
+
center_lat: Center latitude
|
| 215 |
+
center_lon: Center longitude
|
| 216 |
+
radius_km: Search radius in kilometers
|
| 217 |
+
|
| 218 |
+
Returns:
|
| 219 |
+
List of POI dicts with name, lat, lon, type
|
| 220 |
+
"""
|
| 221 |
+
# Get OSM tag for this POI type
|
| 222 |
+
poi_lower = poi_type.lower().strip()
|
| 223 |
+
osm_tag = ALL_POI_TAGS.get(poi_lower)
|
| 224 |
+
|
| 225 |
+
if not osm_tag:
|
| 226 |
+
# Try partial matching
|
| 227 |
+
for key, tag in ALL_POI_TAGS.items():
|
| 228 |
+
if poi_lower in key or key in poi_lower:
|
| 229 |
+
osm_tag = tag
|
| 230 |
+
break
|
| 231 |
+
|
| 232 |
+
if not osm_tag:
|
| 233 |
+
# Default to amenity search
|
| 234 |
+
osm_tag = f"amenity={poi_lower}"
|
| 235 |
+
logger.warning(f"Unknown POI type '{poi_type}', using default: {osm_tag}")
|
| 236 |
+
|
| 237 |
+
# Build Overpass QL query
|
| 238 |
+
radius_meters = radius_km * 1000
|
| 239 |
+
|
| 240 |
+
query = f"""
|
| 241 |
+
[out:json][timeout:25];
|
| 242 |
+
(
|
| 243 |
+
node[{osm_tag}](around:{radius_meters},{center_lat},{center_lon});
|
| 244 |
+
way[{osm_tag}](around:{radius_meters},{center_lat},{center_lon});
|
| 245 |
+
relation[{osm_tag}](around:{radius_meters},{center_lat},{center_lon});
|
| 246 |
+
);
|
| 247 |
+
out center tags;
|
| 248 |
+
"""
|
| 249 |
+
|
| 250 |
+
try:
|
| 251 |
+
async with httpx.AsyncClient(timeout=30) as client:
|
| 252 |
+
response = await client.post(
|
| 253 |
+
"https://overpass-api.de/api/interpreter",
|
| 254 |
+
data={"data": query},
|
| 255 |
+
headers={
|
| 256 |
+
"User-Agent": "AIDA-RealEstate/1.0"
|
| 257 |
+
}
|
| 258 |
+
)
|
| 259 |
+
|
| 260 |
+
if response.status_code != 200:
|
| 261 |
+
logger.error(f"Overpass error: {response.status_code}")
|
| 262 |
+
return []
|
| 263 |
+
|
| 264 |
+
data = response.json()
|
| 265 |
+
|
| 266 |
+
except Exception as e:
|
| 267 |
+
logger.error(f"Overpass query failed: {e}")
|
| 268 |
+
return []
|
| 269 |
+
|
| 270 |
+
# Parse results
|
| 271 |
+
pois = []
|
| 272 |
+
for element in data.get("elements", []):
|
| 273 |
+
# Get coordinates
|
| 274 |
+
if element["type"] == "node":
|
| 275 |
+
lat = element.get("lat")
|
| 276 |
+
lon = element.get("lon")
|
| 277 |
+
else:
|
| 278 |
+
# For ways/relations, use center
|
| 279 |
+
center = element.get("center", {})
|
| 280 |
+
lat = center.get("lat")
|
| 281 |
+
lon = center.get("lon")
|
| 282 |
+
|
| 283 |
+
if lat is None or lon is None:
|
| 284 |
+
continue
|
| 285 |
+
|
| 286 |
+
tags = element.get("tags", {})
|
| 287 |
+
|
| 288 |
+
# Build POI entry
|
| 289 |
+
poi = {
|
| 290 |
+
"name": tags.get("name", f"{poi_type.title()} (unnamed)"),
|
| 291 |
+
"lat": lat,
|
| 292 |
+
"lon": lon,
|
| 293 |
+
"type": poi_type,
|
| 294 |
+
"osm_id": element.get("id"),
|
| 295 |
+
"osm_type": element.get("type"),
|
| 296 |
+
}
|
| 297 |
+
|
| 298 |
+
# Add extra info if available
|
| 299 |
+
if tags.get("addr:street"):
|
| 300 |
+
poi["address"] = f"{tags.get('addr:housenumber', '')} {tags['addr:street']}".strip()
|
| 301 |
+
if tags.get("website"):
|
| 302 |
+
poi["website"] = tags["website"]
|
| 303 |
+
if tags.get("phone"):
|
| 304 |
+
poi["phone"] = tags["phone"]
|
| 305 |
+
|
| 306 |
+
pois.append(poi)
|
| 307 |
+
|
| 308 |
+
logger.info(
|
| 309 |
+
"Found POIs",
|
| 310 |
+
poi_type=poi_type,
|
| 311 |
+
count=len(pois),
|
| 312 |
+
radius_km=radius_km
|
| 313 |
+
)
|
| 314 |
+
|
| 315 |
+
return pois
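To see how a tag string from the mapping ends up in the Overpass QL text, the query construction can be isolated into a pure function. This is a sketch rather than the code in this commit: `build_overpass_query` is a hypothetical helper, and converting the radius to an integer number of metres is an assumption made here for readable output.

```python
def build_overpass_query(osm_tag: str, lat: float, lon: float, radius_km: float = 5) -> str:
    """Hypothetical helper: build an Overpass QL query for one tag filter around a point."""
    radius_m = int(radius_km * 1000)  # assumption: whole metres are precise enough
    around = f"(around:{radius_m},{lat},{lon})"
    # One statement per OSM element kind, all sharing the same filter and radius.
    statements = [f"  {kind}[{osm_tag}]{around};" for kind in ("node", "way", "relation")]
    return "[out:json][timeout:25];\n(\n" + "\n".join(statements) + "\n);\nout center tags;"


if __name__ == "__main__":
    print(build_overpass_query("amenity=school", 6.37, 2.42, radius_km=3))
    # A compound value closes the first filter and opens a second one.
    print(build_overpass_query("amenity=place_of_worship][religion=muslim", 6.37, 2.42))
```

Since the template wraps the tag in a single pair of brackets, a compound entry has to supply the inner `][` itself and must not contain stray quote characters, otherwise the resulting filter is not valid Overpass QL.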
|
| 316 |
+
|
| 317 |
+
|
| 318 |
+
# =============================================================================
|
| 319 |
+
# Main Function: Find POIs by Location Name
|
| 320 |
+
# =============================================================================
|
| 321 |
+
|
| 322 |
+
async def find_pois(
|
| 323 |
+
poi_type: str,
|
| 324 |
+
location: str,
|
| 325 |
+
radius_km: float = 5,
|
| 326 |
+
limit: int = 10
|
| 327 |
+
) -> List[Dict]:
|
| 328 |
+
"""
|
| 329 |
+
Find POIs near a location (main entry point).
|
| 330 |
+
|
| 331 |
+
Args:
|
| 332 |
+
poi_type: Type of POI (school, hospital, beach, park, etc.)
|
| 333 |
+
location: Location name (e.g., "Cotonou", "Calavi, Benin")
|
| 334 |
+
radius_km: Search radius in kilometers (default 5km)
|
| 335 |
+
limit: Maximum number of results (default 10)
|
| 336 |
+
|
| 337 |
+
Returns:
|
| 338 |
+
List of POI dicts:
|
| 339 |
+
[
|
| 340 |
+
{
|
| 341 |
+
"name": "Collège Père Aupiais",
|
| 342 |
+
"lat": 6.3654,
|
| 343 |
+
"lon": 2.4183,
|
| 344 |
+
"type": "school",
|
| 345 |
+
"osm_id": 12345678,
|
| 346 |
+
"address": "Rue de l'École"
|
| 347 |
+
},
|
| 348 |
+
...
|
| 349 |
+
]
|
| 350 |
+
|
| 351 |
+
Example:
|
| 352 |
+
pois = await find_pois("school", "Cotonou, Benin", radius_km=3)
|
| 353 |
+
"""
|
| 354 |
+
# Step 1: Geocode the location
|
| 355 |
+
coords = await geocode_location(location)
|
| 356 |
+
|
| 357 |
+
if not coords:
|
| 358 |
+
logger.warning(f"Could not geocode location: {location}")
|
| 359 |
+
return []
|
| 360 |
+
|
| 361 |
+
center_lat, center_lon = coords
|
| 362 |
+
|
| 363 |
+
# Step 2: Find POIs near those coordinates
|
| 364 |
+
pois = await find_pois_overpass(
|
| 365 |
+
poi_type=poi_type,
|
| 366 |
+
center_lat=center_lat,
|
| 367 |
+
center_lon=center_lon,
|
| 368 |
+
radius_km=radius_km
|
| 369 |
+
)
|
| 370 |
+
|
| 371 |
+
# Limit results
|
| 372 |
+
return pois[:limit]
|
| 373 |
+
|
| 374 |
+
|
| 375 |
+
# =============================================================================
|
| 376 |
+
# Batch POI Search
|
| 377 |
+
# =============================================================================
|
| 378 |
+
|
| 379 |
+
async def find_multiple_poi_types(
|
| 380 |
+
poi_types: List[str],
|
| 381 |
+
location: str,
|
| 382 |
+
radius_km: float = 5
|
| 383 |
+
) -> Dict[str, List[Dict]]:
|
| 384 |
+
"""
|
| 385 |
+
Find multiple types of POIs at once.
|
| 386 |
+
|
| 387 |
+
Args:
|
| 388 |
+
poi_types: List of POI types (e.g., ["school", "hospital", "park"])
|
| 389 |
+
location: Location name
|
| 390 |
+
|
| 391 |
+
Returns:
|
| 392 |
+
Dict mapping POI type to list of POIs:
|
| 393 |
+
{
|
| 394 |
+
"school": [...],
|
| 395 |
+
"hospital": [...],
|
| 396 |
+
"park": [...]
|
| 397 |
+
}
|
| 398 |
+
"""
|
| 399 |
+
# Geocode once
|
| 400 |
+
coords = await geocode_location(location)
|
| 401 |
+
|
| 402 |
+
if not coords:
|
| 403 |
+
return {poi_type: [] for poi_type in poi_types}
|
| 404 |
+
|
| 405 |
+
center_lat, center_lon = coords
|
| 406 |
+
|
| 407 |
+
# Search each POI type in parallel
|
| 408 |
+
async def search_poi(poi_type: str):
|
| 409 |
+
return poi_type, await find_pois_overpass(
|
| 410 |
+
poi_type, center_lat, center_lon, radius_km
|
| 411 |
+
)
|
| 412 |
+
|
| 413 |
+
results = await asyncio.gather(*[search_poi(pt) for pt in poi_types])
|
| 414 |
+
|
| 415 |
+
return {poi_type: pois for poi_type, pois in results}
|
| 416 |
+
|
| 417 |
+
|
| 418 |
+
# =============================================================================
|
| 419 |
+
# Utility: Calculate Distance
|
| 420 |
+
# =============================================================================
|
| 421 |
+
|
| 422 |
+
def calculate_distance_km(
|
| 423 |
+
lat1: float,
|
| 424 |
+
lon1: float,
|
| 425 |
+
lat2: float,
|
| 426 |
+
lon2: float
|
| 427 |
+
) -> float:
|
| 428 |
+
"""
|
| 429 |
+
Calculate distance between two points using Haversine formula.
|
| 430 |
+
|
| 431 |
+
Returns distance in kilometers.
|
| 432 |
+
"""
|
| 433 |
+
import math
|
| 434 |
+
|
| 435 |
+
R = 6371 # Earth's radius in km
|
| 436 |
+
|
| 437 |
+
lat1_rad = math.radians(lat1)
|
| 438 |
+
lat2_rad = math.radians(lat2)
|
| 439 |
+
delta_lat = math.radians(lat2 - lat1)
|
| 440 |
+
delta_lon = math.radians(lon2 - lon1)
|
| 441 |
+
|
| 442 |
+
a = (math.sin(delta_lat / 2) ** 2 +
|
| 443 |
+
math.cos(lat1_rad) * math.cos(lat2_rad) * math.sin(delta_lon / 2) ** 2)
|
| 444 |
+
c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
|
| 445 |
+
|
| 446 |
+
return R * c
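As a quick sanity check of the Haversine formula, one degree of latitude along a meridian should come out at roughly 111.19 km for an Earth radius of 6371 km (6371 × π / 180). The sketch below re-implements the same formula in a standalone function; `haversine_km` is a name chosen for illustration and is not part of this module.

```python
import math


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float,
                 radius_km: float = 6371.0) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * radius_km * math.atan2(math.sqrt(a), math.sqrt(1 - a))


if __name__ == "__main__":
    # One degree of latitude along the prime meridian.
    print(round(haversine_km(0.0, 0.0, 1.0, 0.0), 2))
```

The result is symmetric in its two points and is exactly zero when both points coincide, which makes it convenient for sorting POIs by distance from a listing.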
|
| 447 |
+
|
| 448 |
+
|
| 449 |
+
# =============================================================================
|
| 450 |
+
# Test Function
|
| 451 |
+
# =============================================================================
|
| 452 |
+
|
| 453 |
+
async def test_osm_service():
|
| 454 |
+
"""Test the OSM POI service."""
|
| 455 |
+
print("\n" + "=" * 60)
|
| 456 |
+
print("Testing OpenStreetMap POI Service")
|
| 457 |
+
print("=" * 60 + "\n")
|
| 458 |
+
|
| 459 |
+
# Test 1: Geocoding
|
| 460 |
+
print("Test 1: Geocoding 'Cotonou, Benin'")
|
| 461 |
+
coords = await geocode_location("Cotonou, Benin")
|
| 462 |
+
if coords:
|
| 463 |
+
print(f" ✅ Found: {coords}")
|
| 464 |
+
else:
|
| 465 |
+
print(" ❌ Failed")
|
| 466 |
+
|
| 467 |
+
# Test 2: Find schools
|
| 468 |
+
print("\nTest 2: Find schools in Cotonou")
|
| 469 |
+
schools = await find_pois("school", "Cotonou, Benin", radius_km=3)
|
| 470 |
+
print(f" Found {len(schools)} schools:")
|
| 471 |
+
for school in schools[:5]:
|
| 472 |
+
print(f" - {school['name']} ({school['lat']:.4f}, {school['lon']:.4f})")
|
| 473 |
+
|
| 474 |
+
# Test 3: Find hospitals
|
| 475 |
+
print("\nTest 3: Find hospitals in Cotonou")
|
| 476 |
+
hospitals = await find_pois("hospital", "Cotonou, Benin", radius_km=5)
|
| 477 |
+
print(f" Found {len(hospitals)} hospitals:")
|
| 478 |
+
for hospital in hospitals[:3]:
|
| 479 |
+
print(f" - {hospital['name']} ({hospital['lat']:.4f}, {hospital['lon']:.4f})")
|
| 480 |
+
|
| 481 |
+
# Test 4: Find markets
|
| 482 |
+
print("\nTest 4: Find markets in Cotonou")
|
| 483 |
+
markets = await find_pois("market", "Cotonou, Benin", radius_km=3)
|
| 484 |
+
print(f" Found {len(markets)} markets:")
|
| 485 |
+
for market in markets[:3]:
|
| 486 |
+
print(f" - {market['name']} ({market['lat']:.4f}, {market['lon']:.4f})")
|
| 487 |
+
|
| 488 |
+
# Test 5: French POI type
|
| 489 |
+
print("\nTest 5: Find 'école' (French) in Cotonou")
|
| 490 |
+
ecoles = await find_pois("école", "Cotonou, Benin", radius_km=3)
|
| 491 |
+
print(f" Found {len(ecoles)} écoles")
|
| 492 |
+
|
| 493 |
+
print("\n" + "=" * 60)
|
| 494 |
+
print("OSM Service Tests Complete!")
|
| 495 |
+
print("=" * 60 + "\n")
|
| 496 |
+
|
| 497 |
+
|
| 498 |
+
if __name__ == "__main__":
|
| 499 |
+
asyncio.run(test_osm_service())
|
app/ai/services/rlm_query_analyzer.py
ADDED
|
@@ -0,0 +1,287 @@
|
| 1 |
+
# app/ai/services/rlm_query_analyzer.py
|
| 2 |
+
"""
|
| 3 |
+
RLM Query Analyzer - Detects complex queries that need recursive reasoning.
|
| 4 |
+
|
| 5 |
+
Identifies:
|
| 6 |
+
- Multi-hop queries: "near schools", "close to beach"
|
| 7 |
+
- Boolean OR queries: "under 500k OR has pool"
|
| 8 |
+
- Comparative queries: "compare Cotonou vs Calavi"
|
| 9 |
+
- Aggregation queries: "average price", "how many"
|
| 10 |
+
- Multi-factor queries: "best family apartment near schools and parks"
|
| 11 |
+
"""
|
| 12 |
+
|
| 13 |
+
import re
|
| 14 |
+
from typing import Dict, List, Literal, Optional
|
| 15 |
+
from enum import Enum
|
| 16 |
+
from structlog import get_logger
|
| 17 |
+
from pydantic import BaseModel
|
| 18 |
+
|
| 19 |
+
logger = get_logger(__name__)
|
| 20 |
+
|
| 21 |
+
|
| 22 |
+
class QueryComplexity(str, Enum):
|
| 23 |
+
"""Types of complex queries that RLM can handle"""
|
| 24 |
+
SIMPLE = "simple" # Standard single-hop search
|
| 25 |
+
MULTI_HOP = "multi_hop" # "near X", "close to Y"
|
| 26 |
+
BOOLEAN_OR = "boolean_or" # "A OR B"
|
| 27 |
+
COMPARATIVE = "comparative" # "compare A vs B"
|
| 28 |
+
AGGREGATION = "aggregation" # "average", "total", "count"
|
| 29 |
+
MULTI_FACTOR = "multi_factor" # Multiple ranking criteria
|
| 30 |
+
|
| 31 |
+
|
| 32 |
+
class QueryAnalysis(BaseModel):
|
| 33 |
+
"""Result of query analysis"""
|
| 34 |
+
complexity: QueryComplexity
|
| 35 |
+
confidence: float # 0.0 to 1.0
|
| 36 |
+
reasoning: str
|
| 37 |
+
detected_patterns: List[str]
|
| 38 |
+
sub_query_hints: List[str] # Hints for decomposition
|
| 39 |
+
use_rlm: bool
|
| 40 |
+
|
| 41 |
+
|
| 42 |
+
# Pattern definitions for each complexity type
|
| 43 |
+
MULTI_HOP_PATTERNS = [
|
| 44 |
+
r"\bnear\b",
|
| 45 |
+
r"\bclose to\b",
|
| 46 |
+
r"\bnearby\b",
|
| 47 |
+
r"\bwalking distance\b",
|
| 48 |
+
r"\bwithin \d+ ?(?:km|m|meters|miles|minutes)\b",
|
| 49 |
+
r"\baround\b",
|
| 50 |
+
r"\bproximity\b",
|
| 51 |
+
r"\bnext to\b",
|
| 52 |
+
r"\bbeside\b",
|
| 53 |
+
r"\bopposite\b",
|
| 54 |
+
r"\bacross from\b",
|
| 55 |
+
# French equivalents
|
| 56 |
+
r"\bprès de\b",
|
| 57 |
+
r"\bà côté de\b",
|
| 58 |
+
r"\bproche de\b",
|
| 59 |
+
r"\baux alentours\b",
|
| 60 |
+
]
|
| 61 |
+
|
| 62 |
+
BOOLEAN_OR_PATTERNS = [
|
| 63 |
+
r"\bor\b",
|
| 64 |
+
r"\beither\b",
|
| 65 |
+
r"\balternatively\b",
|
| 66 |
+
r"\botherwise\b",
|
| 67 |
+
# French
|
| 68 |
+
r"\bou\b",
|
| 69 |
+
r"\bsoit\b",
|
| 70 |
+
]
|
| 71 |
+
|
| 72 |
+
COMPARATIVE_PATTERNS = [
|
| 73 |
+
r"\bcompare\b",
|
| 74 |
+
r"\bvs\.?\b",
|
| 75 |
+
r"\bversus\b",
|
| 76 |
+
r"\bdifference between\b",
|
| 77 |
+
r"\bcheaper\b",
|
| 78 |
+
r"\bmore expensive\b",
|
| 79 |
+
r"\bbetter\b",
|
| 80 |
+
r"\bwhich is\b",
|
| 81 |
+
# French
|
| 82 |
+
r"\bcomparer\b",
|
| 83 |
+
r"\bentre\b",
|
| 84 |
+
r"\bmoins cher\b",
|
| 85 |
+
r"\bplus cher\b",
|
| 86 |
+
]
|
| 87 |
+
|
| 88 |
+
AGGREGATION_PATTERNS = [
|
| 89 |
+
r"\baverage\b",
|
| 90 |
+
r"\bmean\b",
|
| 91 |
+
r"\btotal\b",
|
| 92 |
+
r"\bcount\b",
|
| 93 |
+
r"\bhow many\b",
|
| 94 |
+
r"\bsum\b",
|
| 95 |
+
r"\bstatistics\b",
|
| 96 |
+
r"\brange\b",
|
| 97 |
+
r"\bmin(?:imum)?\b",
|
| 98 |
+
r"\bmax(?:imum)?\b",
|
| 99 |
+
# French
|
| 100 |
+
r"\bmoyenne\b",
|
| 101 |
+
r"\bcombien\b",
|
| 102 |
+
# "total" is spelled the same in French and is already matched by the English pattern above
|
| 103 |
+
]
|
| 104 |
+
|
| 105 |
+
MULTI_FACTOR_PATTERNS = [
|
| 106 |
+
r"\bbest\b",
|
| 107 |
+
r"\btop\b",
|
| 108 |
+
r"\bideal\b",
|
| 109 |
+
r"\bperfect\b",
|
| 110 |
+
r"\brecommend\b",
|
| 111 |
+
r"\bsuitable\b",
|
| 112 |
+
r"\bfamily.?friendly\b",
|
| 113 |
+
r"\bsafe\b",
|
| 114 |
+
r"\bquiet\b",
|
| 115 |
+
r"\bpeaceful\b",
|
| 116 |
+
# Combined criteria indicators
|
| 117 |
+
r"\band\b.*\band\b", # Multiple ANDs suggest multi-factor
|
| 118 |
+
# French
|
| 119 |
+
r"\bmeilleur\b",
|
| 120 |
+
r"\bidéal\b",
|
| 121 |
+
r"\brecommandé\b",
|
| 122 |
+
r"\bfamilial\b",
|
| 123 |
+
r"\bsécurisé\b",
|
| 124 |
+
]
|
| 125 |
+
|
| 126 |
+
# Points of Interest that trigger multi-hop search
|
| 127 |
+
POI_KEYWORDS = [
|
| 128 |
+
# Education
|
| 129 |
+
"school", "university", "college", "campus", "école", "université",
|
| 130 |
+
# Health
|
| 131 |
+
"hospital", "clinic", "pharmacy", "hôpital", "clinique",
|
| 132 |
+
# Recreation
|
| 133 |
+
"beach", "park", "garden", "gym", "plage", "parc", "jardin",
|
| 134 |
+
# Shopping
|
| 135 |
+
"mall", "market", "supermarket", "marché", "supermarché",
|
| 136 |
+
# Transport
|
| 137 |
+
"airport", "station", "bus stop", "aéroport", "gare",
|
| 138 |
+
# Business
|
| 139 |
+
"downtown", "city center", "business district", "centre-ville",
|
| 140 |
+
# Landmarks
|
| 141 |
+
"mosque", "church", "cathedral", "mosquée", "église",
|
| 142 |
+
]
|
| 143 |
+
|
| 144 |
+
|
| 145 |
+
def analyze_query_complexity(query: str) -> QueryAnalysis:
|
| 146 |
+
"""
|
| 147 |
+
Analyze a search query to determine if it needs RLM processing.
|
| 148 |
+
|
| 149 |
+
Args:
|
| 150 |
+
query: User's search query
|
| 151 |
+
|
| 152 |
+
Returns:
|
| 153 |
+
QueryAnalysis with complexity type and recommendations
|
| 154 |
+
"""
|
| 155 |
+
query_lower = query.lower()
|
| 156 |
+
detected_patterns = []
|
| 157 |
+
sub_query_hints = []
|
| 158 |
+
scores = {
|
| 159 |
+
QueryComplexity.MULTI_HOP: 0.0,
|
| 160 |
+
QueryComplexity.BOOLEAN_OR: 0.0,
|
| 161 |
+
QueryComplexity.COMPARATIVE: 0.0,
|
| 162 |
+
QueryComplexity.AGGREGATION: 0.0,
|
| 163 |
+
QueryComplexity.MULTI_FACTOR: 0.0,
|
| 164 |
+
}
|
| 165 |
+
|
| 166 |
+
# Check for multi-hop patterns
|
| 167 |
+
for pattern in MULTI_HOP_PATTERNS:
|
| 168 |
+
if re.search(pattern, query_lower, re.IGNORECASE):
|
| 169 |
+
scores[QueryComplexity.MULTI_HOP] += 0.4
|
| 170 |
+
detected_patterns.append(f"proximity: {pattern}")
|
| 171 |
+
|
| 172 |
+
# Check for POI keywords (boost multi-hop if found with proximity)
|
| 173 |
+
poi_found = []
|
| 174 |
+
for poi in POI_KEYWORDS:
|
| 175 |
+
if poi.lower() in query_lower:
|
| 176 |
+
poi_found.append(poi)
|
| 177 |
+
if scores[QueryComplexity.MULTI_HOP] > 0:
|
| 178 |
+
scores[QueryComplexity.MULTI_HOP] += 0.3
|
| 179 |
+
sub_query_hints.append(f"Find {poi} locations first")
|
| 180 |
+
|
| 181 |
+
if poi_found:
|
| 182 |
+
detected_patterns.append(f"POI: {', '.join(poi_found)}")
|
| 183 |
+
|
| 184 |
+
# Check for boolean OR patterns
|
| 185 |
+
for pattern in BOOLEAN_OR_PATTERNS:
|
| 186 |
+
if re.search(pattern, query_lower, re.IGNORECASE):
|
| 187 |
+
scores[QueryComplexity.BOOLEAN_OR] += 0.5
|
| 188 |
+
detected_patterns.append(f"boolean: {pattern}")
|
| 189 |
+
|
| 190 |
+
# Try to extract OR branches
|
| 191 |
+
parts = re.split(r'\bor\b|\bou\b', query_lower, flags=re.IGNORECASE)
|
| 192 |
+
if len(parts) > 1:
|
| 193 |
+
for i, part in enumerate(parts):
|
| 194 |
+
sub_query_hints.append(f"Branch {i+1}: {part.strip()}")
|
| 195 |
+
|
| 196 |
+
# Check for comparative patterns
|
| 197 |
+
for pattern in COMPARATIVE_PATTERNS:
|
| 198 |
+
if re.search(pattern, query_lower, re.IGNORECASE):
|
| 199 |
+
scores[QueryComplexity.COMPARATIVE] += 0.5
|
| 200 |
+
detected_patterns.append(f"comparative: {pattern}")
|
| 201 |
+
|
| 202 |
+
# Try to extract comparison subjects
|
| 203 |
+
vs_match = re.search(r'(\w+)\s+(?:vs\.?|versus|or)\s+(\w+)', query_lower)
|
| 204 |
+
if vs_match:
|
| 205 |
+
sub_query_hints.append(f"Compare: {vs_match.group(1)} vs {vs_match.group(2)}")
|
| 206 |
+
|
| 207 |
+
# Check for aggregation patterns
|
| 208 |
+
for pattern in AGGREGATION_PATTERNS:
|
| 209 |
+
if re.search(pattern, query_lower, re.IGNORECASE):
|
| 210 |
+
scores[QueryComplexity.AGGREGATION] += 0.5
|
| 211 |
+
detected_patterns.append(f"aggregation: {pattern}")
|
| 212 |
+
sub_query_hints.append("Fetch all matching listings, then aggregate")
|
| 213 |
+
|
| 214 |
+
# Check for multi-factor patterns
|
| 215 |
+
multi_factor_count = 0
|
| 216 |
+
for pattern in MULTI_FACTOR_PATTERNS:
|
| 217 |
+
if re.search(pattern, query_lower, re.IGNORECASE):
|
| 218 |
+
multi_factor_count += 1
|
| 219 |
+
detected_patterns.append(f"multi-factor: {pattern}")
|
| 220 |
+
|
| 221 |
+
# If 2+ factors detected, it's multi-factor
|
| 222 |
+
if multi_factor_count >= 2:
|
| 223 |
+
scores[QueryComplexity.MULTI_FACTOR] += 0.3 * multi_factor_count
|
| 224 |
+
sub_query_hints.append("Evaluate each factor separately, then combine scores")
|
| 225 |
+
|
| 226 |
+
# Determine dominant complexity type
|
| 227 |
+
max_score = max(scores.values())
|
| 228 |
+
|
| 229 |
+
if max_score < 0.3:
|
| 230 |
+
# Simple query - no RLM needed
|
| 231 |
+
return QueryAnalysis(
|
| 232 |
+
complexity=QueryComplexity.SIMPLE,
|
| 233 |
+
confidence=1.0 - max_score,
|
| 234 |
+
reasoning="No complex patterns detected, standard search sufficient",
|
| 235 |
+
detected_patterns=detected_patterns,
|
| 236 |
+
sub_query_hints=[],
|
| 237 |
+
use_rlm=False
|
| 238 |
+
)
|
| 239 |
+
|
| 240 |
+
# Find the complexity type with highest score
|
| 241 |
+
dominant_type = max(scores, key=scores.get)
|
| 242 |
+
confidence = min(scores[dominant_type], 1.0)
|
| 243 |
+
|
| 244 |
+
# Build reasoning
|
| 245 |
+
reasoning_map = {
|
| 246 |
+
QueryComplexity.MULTI_HOP: f"Query requires finding POI locations first, then searching nearby. POIs: {poi_found}",
|
| 247 |
+
QueryComplexity.BOOLEAN_OR: "Query has OR logic requiring separate searches and union",
|
| 248 |
+
QueryComplexity.COMPARATIVE: "Query requires searching multiple locations and comparing results",
|
| 249 |
+
QueryComplexity.AGGREGATION: "Query requires aggregating data across listings",
|
| 250 |
+
QueryComplexity.MULTI_FACTOR: f"Query has {multi_factor_count} ranking factors requiring weighted scoring",
|
| 251 |
+
}
|
| 252 |
+
|
| 253 |
+
logger.info(
|
| 254 |
+
"Query analyzed",
|
| 255 |
+
complexity=dominant_type.value,
|
| 256 |
+
confidence=confidence,
|
| 257 |
+
patterns=len(detected_patterns),
|
| 258 |
+
use_rlm=True
|
| 259 |
+
)
|
| 260 |
+
|
| 261 |
+
return QueryAnalysis(
|
| 262 |
+
complexity=dominant_type,
|
| 263 |
+
confidence=confidence,
|
| 264 |
+
reasoning=reasoning_map.get(dominant_type, "Complex query detected"),
|
| 265 |
+
detected_patterns=detected_patterns,
|
| 266 |
+
sub_query_hints=sub_query_hints,
|
| 267 |
+
use_rlm=True
|
| 268 |
+
)
|
| 269 |
+
|
| 270 |
+
|
| 271 |
+
async def should_use_rlm(query: str) -> bool:
|
| 272 |
+
"""
|
| 273 |
+
Quick check if query should use RLM.
|
| 274 |
+
|
| 275 |
+
Returns True if query is complex enough for RLM.
|
| 276 |
+
"""
|
| 277 |
+
analysis = analyze_query_complexity(query)
|
| 278 |
+
return analysis.use_rlm
|
| 279 |
+
|
| 280 |
+
|
| 281 |
+
# Export for use in other modules
|
| 282 |
+
__all__ = [
|
| 283 |
+
"QueryComplexity",
|
| 284 |
+
"QueryAnalysis",
|
| 285 |
+
"analyze_query_complexity",
|
| 286 |
+
"should_use_rlm"
|
| 287 |
+
]
|
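The scoring approach in `analyze_query_complexity` can be illustrated with a much smaller, standard-library-only sketch. It is not the module itself: the pattern table is trimmed, and `PATTERNS`, `WEIGHTS`, `Route` and `route_query` are hypothetical names introduced only for this illustration. The weights and the 0.3 threshold are copied from the analyzer above.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical, trimmed-down pattern table (the real module lists many more).
PATTERNS: Dict[str, List[str]] = {
    "multi_hop": [r"\bnear\b", r"\bclose to\b", r"\bprès de\b"],
    "boolean_or": [r"\bor\b", r"\bou\b"],
    "aggregation": [r"\baverage\b", r"\bhow many\b", r"\bmoyenne\b"],
}
# Score added per matching pattern, mirroring the increments used by the analyzer.
WEIGHTS: Dict[str, float] = {"multi_hop": 0.4, "boolean_or": 0.5, "aggregation": 0.5}
THRESHOLD = 0.3  # below this, the query is treated as a simple search


@dataclass
class Route:
    complexity: str
    confidence: float
    use_rlm: bool
    matched: List[str] = field(default_factory=list)


def route_query(query: str) -> Route:
    """Add a fixed weight per regex hit, then pick the highest-scoring type."""
    text = query.lower()
    scores = {name: 0.0 for name in PATTERNS}
    matched: List[str] = []
    for name, patterns in PATTERNS.items():
        for pattern in patterns:
            if re.search(pattern, text):
                scores[name] += WEIGHTS[name]
                matched.append(f"{name}: {pattern}")
    best = max(scores, key=scores.get)
    if scores[best] < THRESHOLD:
        return Route("simple", 1.0 - scores[best], False, matched)
    # Confidence is the raw score capped at 1.0, as in the analyzer.
    return Route(best, min(scores[best], 1.0), True, matched)


if __name__ == "__main__":
    for q in ("2-bed apartment in Cotonou", "apartment near the beach"):
        print(q, "->", route_query(q))
```

Because every hit adds a fixed weight, a query that matches patterns from several types is routed to whichever type accumulates the most. When scores tie, `max` returns the first key in dictionary order.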
app/ai/services/rlm_search_service.py
ADDED
|
@@ -0,0 +1,1202 @@
|
| 1 |
+
# app/ai/services/rlm_search_service.py
|
| 2 |
+
"""
|
| 3 |
+
RLM (Recursive Language Model) Search Service for AIDA.
|
| 4 |
+
|
| 5 |
+
Implements multi-hop reasoning for complex search queries using
|
| 6 |
+
recursive decomposition and aggregation.
|
| 7 |
+
|
| 8 |
+
Key Features:
|
| 9 |
+
- Multi-hop proximity search ("near schools", "close to beach")
|
| 10 |
+
- Boolean OR query handling ("under 500k OR has pool")
|
| 11 |
+
- Comparative analysis ("compare Cotonou vs Calavi")
|
| 12 |
+
- Aggregation queries ("average price in Cotonou")
|
| 13 |
+
- Multi-factor ranking ("best family apartment near schools and parks")
|
| 14 |
+
|
| 15 |
+
Uses existing DeepSeek LLM (brain_llm) - no additional infrastructure needed.
|
| 16 |
+
"""
|
| 17 |
+
|
| 18 |
+
import json
|
| 19 |
+
import asyncio
|
| 20 |
+
from typing import Dict, List, Any, Optional, Tuple
|
| 21 |
+
from structlog import get_logger
|
| 22 |
+
from langchain_openai import ChatOpenAI
|
| 23 |
+
from langchain_core.messages import SystemMessage, HumanMessage
|
| 24 |
+
|
| 25 |
+
from app.config import settings
|
| 26 |
+
from app.ai.services.rlm_query_analyzer import (
|
| 27 |
+
QueryComplexity,
|
| 28 |
+
QueryAnalysis,
|
| 29 |
+
analyze_query_complexity
|
| 30 |
+
)
|
| 31 |
+
|
| 32 |
+
logger = get_logger(__name__)
|
| 33 |
+
|
| 34 |
+
|
| 35 |
+
# Use existing DeepSeek LLM configuration
|
| 36 |
+
rlm_llm = ChatOpenAI(
|
| 37 |
+
api_key=settings.DEEPSEEK_API_KEY,
|
| 38 |
+
base_url=settings.DEEPSEEK_BASE_URL,
|
| 39 |
+
model="deepseek-chat",
|
| 40 |
+
temperature=0.3, # Lower temp for more deterministic decomposition
|
| 41 |
+
)
|
| 42 |
+
|
| 43 |
+
|
| 44 |
+
# =============================================================================
|
| 45 |
+
# RLM CORE: Recursive Search Agent
|
| 46 |
+
# =============================================================================
|
| 47 |
+
|
| 48 |
+
class RLMSearchAgent:
|
| 49 |
+
"""
|
| 50 |
+
Recursive Language Model Search Agent.
|
| 51 |
+
|
| 52 |
+
Decomposes complex queries into sub-queries, executes them recursively,
|
| 53 |
+
and aggregates results using LLM reasoning.
|
| 54 |
+
|
| 55 |
+
Example:
|
| 56 |
+
Query: "3-bed apartment near international schools in Cotonou under 500k"
|
| 57 |
+
|
| 58 |
+
RLM Flow:
|
| 59 |
+
1. Decompose: ["Find schools in Cotonou", "Find 3-bed under 500k near schools"]
|
| 60 |
+
2. Execute: Search schools → Get coordinates → Search apartments nearby
|
| 61 |
+
3. Aggregate: Rank by proximity to schools
|
| 62 |
+
"""
|
| 63 |
+
|
| 64 |
+
def __init__(self):
|
| 65 |
+
self.llm = rlm_llm
|
| 66 |
+
self.max_depth = 3
|
| 67 |
+
self.call_count = 0
|
| 68 |
+
self.search_cache = {} # Cache sub-query results
|
| 69 |
+
|
| 70 |
+
async def search(
|
| 71 |
+
self,
|
| 72 |
+
query: str,
|
| 73 |
+
context: Optional[Dict] = None,
|
| 74 |
+
analysis: Optional[QueryAnalysis] = None
|
| 75 |
+
) -> Dict[str, Any]:
|
| 76 |
+
"""
|
| 77 |
+
Main entry point for RLM search.
|
| 78 |
+
|
| 79 |
+
Args:
|
| 80 |
+
query: User's search query
|
| 81 |
+
context: Optional context (user location, previous results, etc.)
|
| 82 |
+
analysis: Optional pre-computed query analysis
|
| 83 |
+
|
| 84 |
+
Returns:
|
| 85 |
+
Dict with:
|
| 86 |
+
- results: List of matching listings
|
| 87 |
+
- strategy_used: RLM strategy name
|
| 88 |
+
- reasoning_steps: List of reasoning steps taken
|
| 89 |
+
- call_count: Number of LLM calls made
|
| 90 |
+
"""
|
| 91 |
+
self.call_count = 0
|
| 92 |
+
|
| 93 |
+
# Analyze query if not provided
|
| 94 |
+
if analysis is None:
|
| 95 |
+
analysis = analyze_query_complexity(query)
|
| 96 |
+
|
| 97 |
+
logger.info(
|
| 98 |
+
"RLM search started",
|
| 99 |
+
query=query[:50],
|
| 100 |
+
complexity=analysis.complexity.value,
|
| 101 |
+
confidence=analysis.confidence
|
| 102 |
+
)
|
| 103 |
+
|
| 104 |
+
# Route to appropriate handler based on complexity
|
| 105 |
+
handler_map = {
|
| 106 |
+
QueryComplexity.MULTI_HOP: self._handle_multi_hop,
|
| 107 |
+
QueryComplexity.BOOLEAN_OR: self._handle_boolean_or,
|
| 108 |
+
QueryComplexity.COMPARATIVE: self._handle_comparative,
|
| 109 |
+
QueryComplexity.AGGREGATION: self._handle_aggregation,
|
| 110 |
+
QueryComplexity.MULTI_FACTOR: self._handle_multi_factor,
|
| 111 |
+
QueryComplexity.SIMPLE: self._handle_simple,
|
| 112 |
+
}
|
| 113 |
+
|
| 114 |
+
handler = handler_map.get(analysis.complexity, self._handle_simple)
|
| 115 |
+
|
| 116 |
+
try:
|
| 117 |
+
results = await handler(query, context or {}, analysis)
|
| 118 |
+
|
| 119 |
+
logger.info(
|
| 120 |
+
"RLM search complete",
|
| 121 |
+
query=query[:50],
|
| 122 |
+
result_count=len(results.get("results", [])),
|
| 123 |
+
call_count=self.call_count
|
| 124 |
+
)
|
| 125 |
+
|
| 126 |
+
return {
|
| 127 |
+
**results,
|
| 128 |
+
"strategy_used": f"RLM_{analysis.complexity.value.upper()}",
|
| 129 |
+
"call_count": self.call_count,
|
| 130 |
+
"analysis": analysis.model_dump()
|
| 131 |
+
}
|
| 132 |
+
|
| 133 |
+
except Exception as e:
|
| 134 |
+
logger.error("RLM search failed", error=str(e), query=query[:50])
|
| 135 |
+
# Fallback to simple search
|
| 136 |
+
return await self._handle_simple(query, context or {}, analysis)
|
| 137 |
+
|
| 138 |
+
# =========================================================================
|
| 139 |
+
# Handler: Multi-hop Queries ("near X", "close to Y")
|
| 140 |
+
# =========================================================================
|
| 141 |
+
|
| 142 |
+
async def _handle_multi_hop(
|
| 143 |
+
self,
|
| 144 |
+
query: str,
|
| 145 |
+
context: Dict,
|
| 146 |
+
analysis: QueryAnalysis
|
| 147 |
+
) -> Dict[str, Any]:
|
| 148 |
+
"""
|
| 149 |
+
Handle multi-hop proximity queries.
|
| 150 |
+
|
| 151 |
+
Example: "3-bed apartment near international schools in Cotonou"
|
| 152 |
+
|
| 153 |
+
Steps:
|
| 154 |
+
1. Extract POI type (schools) and location (Cotonou)
|
| 155 |
+
2. Find POI coordinates (schools in Cotonou)
|
| 156 |
+
3. Search listings near POI coordinates
|
| 157 |
+
4. Rank by proximity
|
| 158 |
+
"""
|
| 159 |
+
reasoning_steps = []
|
| 160 |
+
|
| 161 |
+
# Step 1: Decompose query to extract POI and criteria
|
| 162 |
+
decomposition_prompt = f"""
|
| 163 |
+
Analyze this real estate search query and extract the proximity components:
|
| 164 |
+
|
| 165 |
+
Query: "{query}"
|
| 166 |
+
|
| 167 |
+
Extract:
|
| 168 |
+
1. POI (Point of Interest) type: What the user wants to be near (school, beach, park, etc.)
|
| 169 |
+
2. Location: The city/area being searched
|
| 170 |
+
3. Listing criteria: bedrooms, price, amenities, etc.
|
| 171 |
+
|
| 172 |
+
Return JSON:
|
| 173 |
+
{{
|
| 174 |
+
"poi_type": "school" or "beach" or "park" or "hospital" or "market" or "airport" or null,
|
| 175 |
+
"poi_name": "specific name if mentioned" or null,
|
| 176 |
+
"location": "city or area name",
|
| 177 |
+
"listing_criteria": {{
|
| 178 |
+
"bedrooms": number or null,
|
| 179 |
+
"max_price": number or null,
|
| 180 |
+
"min_price": number or null,
|
| 181 |
+
"amenities": ["list"] or [],
|
| 182 |
+
"listing_type": "rent" or "sale" or null
|
| 183 |
+
}},
|
| 184 |
+
"proximity_km": number (radius in km; use 2 if the query gives no distance)
|
| 185 |
+
}}
|
| 186 |
+
"""
|
| 187 |
+
self.call_count += 1
|
| 188 |
+
decomp_response = await self.llm.ainvoke([
|
| 189 |
+
HumanMessage(content=decomposition_prompt)
|
| 190 |
+
])
|
| 191 |
+
|
| 192 |
+
try:
|
| 193 |
+
decomposition = self._extract_json(decomp_response.content)
|
| 194 |
+
except Exception:
|
| 195 |
+
logger.error("Failed to parse decomposition, falling back to simple search")
|
| 196 |
+
return await self._handle_simple(query, context, analysis)
|
| 197 |
+
|
| 198 |
+
reasoning_steps.append({
|
| 199 |
+
"step": "decomposition",
|
| 200 |
+
"result": decomposition
|
| 201 |
+
})
|
| 202 |
+
|
| 203 |
+
poi_type = decomposition.get("poi_type")
|
| 204 |
+
location = decomposition.get("location")
|
| 205 |
+
criteria = decomposition.get("listing_criteria", {})
|
| 206 |
+
proximity_km = decomposition.get("proximity_km", 2)
|
| 207 |
+
|
| 208 |
+
# Step 2: Find POI coordinates
|
| 209 |
+
poi_locations = []
|
| 210 |
+
if poi_type and location:
|
| 211 |
+
poi_locations = await self._find_poi_locations(
|
| 212 |
+
poi_type,
|
| 213 |
+
location,
|
| 214 |
+
decomposition.get("poi_name")
|
| 215 |
+
)
|
| 216 |
+
reasoning_steps.append({
|
| 217 |
+
"step": "find_poi",
|
| 218 |
+
"poi_type": poi_type,
|
| 219 |
+
"location": location,
|
| 220 |
+
"found": len(poi_locations)
|
| 221 |
+
})
|
| 222 |
+
|
| 223 |
+
# Step 3: Search listings near POI locations
|
| 224 |
+
if poi_locations:
|
| 225 |
+
# Search near each POI and aggregate
|
| 226 |
+
all_listings = []
|
| 227 |
+
for poi in poi_locations[:3]: # Limit to top 3 POIs
|
| 228 |
+
nearby_listings = await self._search_near_coordinates(
|
| 229 |
+
lat=poi["lat"],
|
| 230 |
+
lon=poi["lon"],
|
| 231 |
+
radius_km=proximity_km,
|
| 232 |
+
criteria=criteria,
|
| 233 |
+
location=location
|
| 234 |
+
)
|
| 235 |
+
# Add distance info to each listing
|
| 236 |
+
for listing in nearby_listings:
|
| 237 |
+
listing["_poi_name"] = poi.get("name", poi_type)
|
| 238 |
+
listing["_distance_km"] = self._calculate_distance(
|
| 239 |
+
poi["lat"], poi["lon"],
|
| 240 |
+
listing.get("latitude"), listing.get("longitude")
|
| 241 |
+
)
|
| 242 |
+
all_listings.extend(nearby_listings)
|
| 243 |
+
|
| 244 |
+
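# Deduplicate by listing ID, visiting the nearest entries first so each listing keeps its shortest POI distance
|
| 245 |
+
seen_ids = set()
|
| 246 |
+
unique_listings = []
|
| 247 |
+
for listing in sorted(all_listings, key=lambda x: x["_distance_km"] if x.get("_distance_km") is not None else float("inf")):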
|
| 248 |
+
lid = str(listing.get("_id") or listing.get("mongo_id"))
|
| 249 |
+
if lid not in seen_ids:
|
| 250 |
+
seen_ids.add(lid)
|
| 251 |
+
unique_listings.append(listing)
|
| 252 |
+
|
| 253 |
+
# Sort by distance
|
| 254 |
+
unique_listings.sort(key=lambda x: x["_distance_km"] if x.get("_distance_km") is not None else float("inf"))  # listings without a computed distance sort last
|
| 255 |
+
|
| 256 |
+
reasoning_steps.append({
|
| 257 |
+
"step": "proximity_search",
|
| 258 |
+
"poi_count": len(poi_locations),
|
| 259 |
+
"listings_found": len(unique_listings)
|
| 260 |
+
})
|
| 261 |
+
|
| 262 |
+
return {
|
| 263 |
+
"results": unique_listings[:10],
|
| 264 |
+
"reasoning_steps": reasoning_steps,
|
| 265 |
+
"message": f"Found {len(unique_listings)} listings near {poi_type} locations in {location}"
|
| 266 |
+
}
|
| 267 |
+
|
| 268 |
+
else:
|
| 269 |
+
# No POI found, fall back to semantic search with location
|
| 270 |
+
logger.warning("No POI locations found, using semantic search")
|
| 271 |
+
return await self._semantic_search_with_criteria(query, location, criteria)
|
| 272 |
+
|
| 273 |
+
# =========================================================================
|
| 274 |
+
# Handler: Boolean OR Queries
|
| 275 |
+
# =========================================================================
|
| 276 |
+
|
| 277 |
+
async def _handle_boolean_or(
|
| 278 |
+
self,
|
| 279 |
+
query: str,
|
| 280 |
+
context: Dict,
|
| 281 |
+
analysis: QueryAnalysis
|
| 282 |
+
) -> Dict[str, Any]:
|
| 283 |
+
"""
|
| 284 |
+
Handle queries with OR logic.
|
| 285 |
+
|
| 286 |
+
Example: "Under 500k XOF OR (2-bedroom AND has pool)"
|
| 287 |
+
|
| 288 |
+
Steps:
|
| 289 |
+
1. Parse OR branches
|
| 290 |
+
2. Execute each branch in parallel
|
| 291 |
+
3. Union results
|
| 292 |
+
"""
|
| 293 |
+
reasoning_steps = []
|
| 294 |
+
|
| 295 |
+
# Step 1: Parse OR branches
|
| 296 |
+
parse_prompt = f"""
|
| 297 |
+
Parse this real estate query into separate OR branches:
|
| 298 |
+
|
| 299 |
+
Query: "{query}"
|
| 300 |
+
|
| 301 |
+
Return JSON:
|
| 302 |
+
{{
|
| 303 |
+
"branches": [
|
| 304 |
+
{{
|
| 305 |
+
"description": "human-readable description",
|
| 306 |
+
"criteria": {{
|
| 307 |
+
"location": "city" or null,
|
| 308 |
+
"max_price": number or null,
|
| 309 |
+
"min_price": number or null,
|
| 310 |
+
"bedrooms": number or null,
|
| 311 |
+
"amenities": ["list"] or [],
|
| 312 |
+
"listing_type": "rent" or "sale" or null
|
| 313 |
+
}}
|
| 314 |
+
}}
|
| 315 |
+
],
|
| 316 |
+
"shared_criteria": {{
|
| 317 |
+
// Criteria that apply to ALL branches (e.g., location)
|
| 318 |
+
"location": "city" or null
|
| 319 |
+
}}
|
| 320 |
+
}}
|
| 321 |
+
|
| 322 |
+
Example for "Under 500k OR (2-bed AND pool) in Cotonou":
|
| 323 |
+
{{
|
| 324 |
+
"branches": [
|
| 325 |
+
{{"description": "Under 500k", "criteria": {{"max_price": 500000}}}},
|
| 326 |
+
{{"description": "2-bed with pool", "criteria": {{"bedrooms": 2, "amenities": ["pool"]}}}}
|
| 327 |
+
],
|
| 328 |
+
"shared_criteria": {{"location": "Cotonou"}}
|
| 329 |
+
}}
|
| 330 |
+
"""
|
| 331 |
+
self.call_count += 1
|
| 332 |
+
parse_response = await self.llm.ainvoke([
|
| 333 |
+
HumanMessage(content=parse_prompt)
|
| 334 |
+
])
|
| 335 |
+
|
| 336 |
+
try:
|
| 337 |
+
parsed = self._extract_json(parse_response.content)
|
| 338 |
+
except Exception:
|
| 339 |
+
logger.error("Failed to parse OR branches")
|
| 340 |
+
return await self._handle_simple(query, context, analysis)
|
| 341 |
+
|
| 342 |
+
branches = parsed.get("branches", [])
|
| 343 |
+
shared = parsed.get("shared_criteria", {})
|
| 344 |
+
|
| 345 |
+
reasoning_steps.append({
|
| 346 |
+
"step": "parse_or_branches",
|
| 347 |
+
"branch_count": len(branches),
|
| 348 |
+
"shared_criteria": shared
|
| 349 |
+
})
|
| 350 |
+
|
| 351 |
+
# Step 2: Execute each branch in parallel
|
| 352 |
+
async def execute_branch(branch: Dict) -> List[Dict]:
|
| 353 |
+
criteria = {**shared, **branch.get("criteria", {})}
|
| 354 |
+
return await self._execute_criteria_search(criteria)
|
| 355 |
+
|
| 356 |
+
branch_results = await asyncio.gather(
|
| 357 |
+
*[execute_branch(b) for b in branches]
|
| 358 |
+
)
|
| 359 |
+
|
| 360 |
+
# Step 3: Union results (deduplicate)
|
| 361 |
+
seen_ids = set()
|
| 362 |
+
union_results = []
|
| 363 |
+
for i, results in enumerate(branch_results):
|
| 364 |
+
reasoning_steps.append({
|
| 365 |
+
"step": f"branch_{i+1}",
|
| 366 |
+
"description": branches[i].get("description"),
|
| 367 |
+
"results_count": len(results)
|
| 368 |
+
})
|
| 369 |
+
for listing in results:
|
| 370 |
+
lid = str(listing.get("_id") or listing.get("mongo_id"))
|
| 371 |
+
if lid not in seen_ids:
|
| 372 |
+
seen_ids.add(lid)
|
| 373 |
+
listing["_matched_branch"] = branches[i].get("description")
|
| 374 |
+
union_results.append(listing)
|
| 375 |
+
|
| 376 |
+
reasoning_steps.append({
|
| 377 |
+
"step": "union",
|
| 378 |
+
"total_unique": len(union_results)
|
| 379 |
+
})
|
| 380 |
+
|
| 381 |
+
return {
|
| 382 |
+
"results": union_results[:10],
|
| 383 |
+
"reasoning_steps": reasoning_steps,
|
| 384 |
+
"message": f"Found {len(union_results)} listings matching any of {len(branches)} criteria"
|
| 385 |
+
}
|
| 386 |
+
|
| 387 |
+
# =========================================================================
|
| 388 |
+
# Handler: Comparative Queries
|
| 389 |
+
# =========================================================================
|
| 390 |
+
|
| 391 |
+
async def _handle_comparative(
|
| 392 |
+
self,
|
| 393 |
+
query: str,
|
| 394 |
+
context: Dict,
|
| 395 |
+
analysis: QueryAnalysis
|
| 396 |
+
) -> Dict[str, Any]:
|
| 397 |
+
"""
|
| 398 |
+
Handle comparative queries.
|
| 399 |
+
|
| 400 |
+
Example: "Compare average prices in Cotonou vs Calavi"
|
| 401 |
+
|
| 402 |
+
Steps:
|
| 403 |
+
1. Extract comparison subjects and metrics
|
| 404 |
+
2. Search each subject
|
| 405 |
+
3. Calculate and compare metrics
|
| 406 |
+
"""
|
| 407 |
+
reasoning_steps = []
|
| 408 |
+
|
| 409 |
+
# Step 1: Parse comparison
|
| 410 |
+
compare_prompt = f"""
|
| 411 |
+
Parse this comparative real estate query:
|
| 412 |
+
|
| 413 |
+
Query: "{query}"
|
| 414 |
+
|
| 415 |
+
Return JSON:
|
| 416 |
+
{{
|
| 417 |
+
"subjects": [
|
| 418 |
+
{{"name": "Cotonou", "type": "location"}},
|
| 419 |
+
{{"name": "Calavi", "type": "location"}}
|
| 420 |
+
],
|
| 421 |
+
"metric": "average_price" or "count" or "price_range",
|
| 422 |
+
"listing_criteria": {{
|
| 423 |
+
"bedrooms": number or null,
|
| 424 |
+
"listing_type": "rent" or "sale" or null
|
| 425 |
+
}}
|
| 426 |
+
}}
|
| 427 |
+
"""
|
| 428 |
+
self.call_count += 1
|
| 429 |
+
compare_response = await self.llm.ainvoke([
|
| 430 |
+
HumanMessage(content=compare_prompt)
|
| 431 |
+
])
|
| 432 |
+
|
| 433 |
+
try:
|
| 434 |
+
comparison = self._extract_json(compare_response.content)
|
| 435 |
+
except Exception:
|
| 436 |
+
return await self._handle_simple(query, context, analysis)
|
| 437 |
+
|
| 438 |
+
subjects = comparison.get("subjects", [])
|
| 439 |
+
metric = comparison.get("metric", "average_price")
|
| 440 |
+
criteria = comparison.get("listing_criteria", {})
|
| 441 |
+
|
| 442 |
+
reasoning_steps.append({
|
| 443 |
+
"step": "parse_comparison",
|
| 444 |
+
"subjects": [s["name"] for s in subjects],
|
| 445 |
+
"metric": metric
|
| 446 |
+
})
|
| 447 |
+
|
| 448 |
+
# Step 2: Search each subject
|
| 449 |
+
subject_results = []
|
| 450 |
+
for subject in subjects:
|
| 451 |
+
search_criteria = {**criteria, "location": subject["name"]}
|
| 452 |
+
listings = await self._execute_criteria_search(search_criteria, limit=50)
|
| 453 |
+
subject_results.append({
|
| 454 |
+
"name": subject["name"],
|
| 455 |
+
"listings": listings,
|
| 456 |
+
"count": len(listings)
|
| 457 |
+
})
|
| 458 |
+
|
| 459 |
+
# Step 3: Calculate metrics
|
| 460 |
+
for result in subject_results:
|
| 461 |
+
listings = result["listings"]
|
| 462 |
+
if listings:
|
| 463 |
+
prices = [l.get("price", 0) for l in listings if l.get("price")]
|
| 464 |
+
result["avg_price"] = sum(prices) / len(prices) if prices else 0
|
| 465 |
+
result["min_price"] = min(prices) if prices else 0
|
| 466 |
+
result["max_price"] = max(prices) if prices else 0
|
| 467 |
+
else:
|
| 468 |
+
result["avg_price"] = 0
|
| 469 |
+
result["min_price"] = 0
|
| 470 |
+
result["max_price"] = 0
|
| 471 |
+
|
| 472 |
+
reasoning_steps.append({
|
| 473 |
+
"step": f"metrics_{result['name']}",
|
| 474 |
+
"count": result["count"],
|
| 475 |
+
"avg_price": result["avg_price"]
|
| 476 |
+
})
|
| 477 |
+
|
| 478 |
+
# Step 4: Generate comparison summary
|
| 479 |
+
summary = await self._generate_comparison_summary(subject_results, metric)
|
| 480 |
+
|
| 481 |
+
# Return top listings from each subject
|
| 482 |
+
combined_results = []
|
| 483 |
+
for result in subject_results:
|
| 484 |
+
for listing in result["listings"][:5]:
|
| 485 |
+
listing["_comparison_group"] = result["name"]
|
| 486 |
+
combined_results.append(listing)
|
| 487 |
+
|
| 488 |
+
return {
|
| 489 |
+
"results": combined_results[:10],
|
| 490 |
+
"reasoning_steps": reasoning_steps,
|
| 491 |
+
"comparison_data": subject_results,
|
| 492 |
+
"message": summary
|
| 493 |
+
}
|
| 494 |
+
|
| 495 |
+
# =========================================================================
|
| 496 |
+
# Handler: Aggregation Queries
|
| 497 |
+
# =========================================================================
|
| 498 |
+
|
| 499 |
+
async def _handle_aggregation(
|
| 500 |
+
self,
|
| 501 |
+
query: str,
|
| 502 |
+
context: Dict,
|
| 503 |
+
analysis: QueryAnalysis
|
| 504 |
+
) -> Dict[str, Any]:
|
| 505 |
+
"""
|
| 506 |
+
Handle aggregation queries (average, count, sum, min, max).
|
| 507 |
+
"""
|
| 508 |
+
reasoning_steps = []
|
| 509 |
+
|
| 510 |
+
# Parse aggregation request
|
| 511 |
+
agg_prompt = f"""
|
| 512 |
+
Parse this aggregation query:
|
| 513 |
+
|
| 514 |
+
Query: "{query}"
|
| 515 |
+
|
| 516 |
+
Return JSON:
|
| 517 |
+
{{
|
| 518 |
+
"aggregation_type": "average" or "count" or "sum" or "min" or "max",
|
| 519 |
+
"field": "price" or "bedrooms",
|
| 520 |
+
"filters": {{
|
| 521 |
+
"location": "city" or null,
|
| 522 |
+
"listing_type": "rent" or "sale" or null
|
| 523 |
+
}}
|
| 524 |
+
}}
|
| 525 |
+
"""
|
| 526 |
+
self.call_count += 1
|
| 527 |
+
agg_response = await self.llm.ainvoke([
|
| 528 |
+
HumanMessage(content=agg_prompt)
|
| 529 |
+
])
|
| 530 |
+
|
| 531 |
+
try:
|
| 532 |
+
aggregation = self._extract_json(agg_response.content)
|
| 533 |
+
except Exception:
|
| 534 |
+
return await self._handle_simple(query, context, analysis)
|
| 535 |
+
|
| 536 |
+
agg_type = aggregation.get("aggregation_type", "count")
|
| 537 |
+
field = aggregation.get("field", "price")
|
| 538 |
+
filters = aggregation.get("filters", {})
|
| 539 |
+
|
| 540 |
+
# Fetch listings
|
| 541 |
+
listings = await self._execute_criteria_search(filters, limit=100)
|
| 542 |
+
|
| 543 |
+
# Calculate aggregation
|
| 544 |
+
values = [l.get(field, 0) for l in listings if l.get(field) is not None]
|
| 545 |
+
|
| 546 |
+
result = 0
|
| 547 |
+
if agg_type == "count":
|
| 548 |
+
result = len(listings)
|
| 549 |
+
elif agg_type == "average" and values:
|
| 550 |
+
result = sum(values) / len(values)
|
| 551 |
+
elif agg_type == "sum":
|
| 552 |
+
result = sum(values)
|
| 553 |
+
elif agg_type == "min" and values:
|
| 554 |
+
result = min(values)
|
| 555 |
+
elif agg_type == "max" and values:
|
| 556 |
+
result = max(values)
|
| 557 |
+
|
| 558 |
+
reasoning_steps.append({
|
| 559 |
+
"step": "aggregation",
|
| 560 |
+
"type": agg_type,
|
| 561 |
+
"field": field,
|
| 562 |
+
"sample_size": len(listings),
|
| 563 |
+
"result": result
|
| 564 |
+
})
|
| 565 |
+
|
| 566 |
+
location = filters.get("location") or "all areas"
|
| 567 |
+
message = f"The {agg_type} {field} in {location} is {result:,.0f}"
|
| 568 |
+
|
| 569 |
+
return {
|
| 570 |
+
"results": listings[:10],
|
| 571 |
+
"reasoning_steps": reasoning_steps,
|
| 572 |
+
"aggregation_result": {
|
| 573 |
+
"type": agg_type,
|
| 574 |
+
"field": field,
|
| 575 |
+
"value": result,
|
| 576 |
+
"sample_size": len(listings)
|
| 577 |
+
},
|
| 578 |
+
"message": message
|
| 579 |
+
}
|
| 580 |
+
|
| 581 |
+
# =========================================================================
|
| 582 |
+
# Handler: Multi-Factor Queries
|
| 583 |
+
# =========================================================================
|
| 584 |
+
|
| 585 |
+
async def _handle_multi_factor(
|
| 586 |
+
self,
|
| 587 |
+
query: str,
|
| 588 |
+
context: Dict,
|
| 589 |
+
analysis: QueryAnalysis
|
| 590 |
+
) -> Dict[str, Any]:
|
| 591 |
+
"""
|
| 592 |
+
Handle multi-factor ranking queries.
|
| 593 |
+
|
| 594 |
+
Example: "Best family apartment near schools and parks, safe area"
|
| 595 |
+
|
| 596 |
+
Steps:
|
| 597 |
+
1. Extract ranking factors
|
| 598 |
+
2. Score each factor
|
| 599 |
+
3. Combine scores with weights
|
| 600 |
+
"""
|
| 601 |
+
reasoning_steps = []
|
| 602 |
+
|
| 603 |
+
# Parse factors
|
| 604 |
+
factor_prompt = f"""
|
| 605 |
+
Extract ranking factors from this query:
|
| 606 |
+
|
| 607 |
+
Query: "{query}"
|
| 608 |
+
|
| 609 |
+
Return JSON:
|
| 610 |
+
{{
|
| 611 |
+
"location": "city" or null,
|
| 612 |
+
"base_criteria": {{
|
| 613 |
+
"bedrooms": number or null,
|
| 614 |
+
"max_price": number or null
|
| 615 |
+
}},
|
| 616 |
+
"ranking_factors": [
|
| 617 |
+
{{"factor": "school_proximity", "weight": 0.3}},
|
| 618 |
+
{{"factor": "park_proximity", "weight": 0.2}},
|
| 619 |
+
{{"factor": "safety", "weight": 0.3}},
|
| 620 |
+
{{"factor": "family_friendly", "weight": 0.2}}
|
| 621 |
+
]
|
| 622 |
+
}}
|
| 623 |
+
|
| 624 |
+
Available factors:
|
| 625 |
+
- school_proximity: Near schools
|
| 626 |
+
- park_proximity: Near parks
|
| 627 |
+
- beach_proximity: Near beach
|
| 628 |
+
- safety: Safe neighborhood
|
| 629 |
+
- family_friendly: Family-friendly amenities
|
| 630 |
+
- luxury: Luxury amenities
|
| 631 |
+
- modern: Modern/renovated
|
| 632 |
+
- quiet: Quiet/peaceful area
|
| 633 |
+
"""
|
| 634 |
+
self.call_count += 1
|
| 635 |
+
factor_response = await self.llm.ainvoke([
|
| 636 |
+
HumanMessage(content=factor_prompt)
|
| 637 |
+
])
|
| 638 |
+
|
| 639 |
+
try:
|
| 640 |
+
factors = self._extract_json(factor_response.content)
|
| 641 |
+
except Exception:
|
| 642 |
+
return await self._handle_simple(query, context, analysis)
|
| 643 |
+
|
| 644 |
+
location = factors.get("location")
|
| 645 |
+
base_criteria = factors.get("base_criteria", {})
|
| 646 |
+
ranking_factors = factors.get("ranking_factors", [])
|
| 647 |
+
|
| 648 |
+
reasoning_steps.append({
|
| 649 |
+
"step": "extract_factors",
|
| 650 |
+
"location": location,
|
| 651 |
+
"factor_count": len(ranking_factors)
|
| 652 |
+
})
|
| 653 |
+
|
| 654 |
+
# Get base listings
|
| 655 |
+
search_criteria = {**base_criteria}
|
| 656 |
+
if location:
|
| 657 |
+
search_criteria["location"] = location
|
| 658 |
+
|
| 659 |
+
listings = await self._execute_criteria_search(search_criteria, limit=30)
|
| 660 |
+
|
| 661 |
+
if not listings:
|
| 662 |
+
return {
|
| 663 |
+
"results": [],
|
| 664 |
+
"reasoning_steps": reasoning_steps,
|
| 665 |
+
"message": f"No listings found in {location or 'the requested area'}"
|
| 666 |
+
}
|
| 667 |
+
|
| 668 |
+
# Score each listing on each factor
|
| 669 |
+
for listing in listings:
|
| 670 |
+
total_score = 0
|
| 671 |
+
factor_scores = {}
|
| 672 |
+
|
| 673 |
+
for factor_info in ranking_factors:
|
| 674 |
+
factor = factor_info["factor"]
|
| 675 |
+
weight = factor_info.get("weight", 0.25)
|
| 676 |
+
|
| 677 |
+
score = await self._score_factor(listing, factor, location)
|
| 678 |
+
factor_scores[factor] = score
|
| 679 |
+
total_score += score * weight
|
| 680 |
+
|
| 681 |
+
listing["_factor_scores"] = factor_scores
|
| 682 |
+
listing["_total_score"] = total_score
|
| 683 |
+
|
| 684 |
+
# Sort by total score
|
| 685 |
+
listings.sort(key=lambda x: x.get("_total_score", 0), reverse=True)
|
| 686 |
+
|
| 687 |
+
reasoning_steps.append({
|
| 688 |
+
"step": "scoring",
|
| 689 |
+
"listings_scored": len(listings),
|
| 690 |
+
"top_score": listings[0].get("_total_score") if listings else 0
|
| 691 |
+
})
|
| 692 |
+
|
| 693 |
+
return {
|
| 694 |
+
"results": listings[:10],
|
| 695 |
+
"reasoning_steps": reasoning_steps,
|
| 696 |
+
"message": f"Found {len(listings)} listings ranked by {len(ranking_factors)} factors"
|
| 697 |
+
}
|
| 698 |
+
|
| 699 |
+
# =========================================================================
|
| 700 |
+
# Handler: Simple Queries (Fallback)
|
| 701 |
+
# =========================================================================
|
| 702 |
+
|
| 703 |
+
async def _handle_simple(
|
| 704 |
+
self,
|
| 705 |
+
query: str,
|
| 706 |
+
context: Dict,
|
| 707 |
+
analysis: QueryAnalysis
|
| 708 |
+
) -> Dict[str, Any]:
|
| 709 |
+
"""
|
| 710 |
+
Fallback handler for simple queries - uses existing hybrid search.
|
| 711 |
+
"""
|
| 712 |
+
from app.ai.services.search_service import hybrid_search
|
| 713 |
+
from app.ai.services.search_extractor import extract_search_params
|
| 714 |
+
|
| 715 |
+
params = await extract_search_params(query)
|
| 716 |
+
results = await hybrid_search(
|
| 717 |
+
query_text=query,
|
| 718 |
+
search_params=params,
|
| 719 |
+
limit=10
|
| 720 |
+
)
|
| 721 |
+
|
| 722 |
+
return {
|
| 723 |
+
"results": results,
|
| 724 |
+
"reasoning_steps": [{"step": "simple_search", "params": params}],
|
| 725 |
+
"message": f"Found {len(results)} listings"
|
| 726 |
+
}
|
| 727 |
+
|
| 728 |
+
# =========================================================================
|
| 729 |
+
# Helper Methods
|
| 730 |
+
# =========================================================================
|
| 731 |
+
|
| 732 |
+
async def _find_poi_locations(
|
| 733 |
+
self,
|
| 734 |
+
poi_type: str,
|
| 735 |
+
location: str,
|
| 736 |
+
specific_name: Optional[str] = None
|
| 737 |
+
) -> List[Dict]:
|
| 738 |
+
"""
|
| 739 |
+
Find POI (Point of Interest) locations using OpenStreetMap.
|
| 740 |
+
|
| 741 |
+
Uses FREE OpenStreetMap APIs:
|
| 742 |
+
- Nominatim: Geocoding (location name → coordinates)
|
| 743 |
+
- Overpass: POI search (find schools, hospitals, parks near a location)
|
| 744 |
+
|
| 745 |
+
Args:
|
| 746 |
+
poi_type: Type of POI (school, hospital, beach, park, etc.)
|
| 747 |
+
location: City/area name (e.g., "Cotonou, Benin")
|
| 748 |
+
specific_name: Optional specific POI name to search for
|
| 749 |
+
|
| 750 |
+
Returns:
|
| 751 |
+
List of POI dicts with name, lat, lon, type
|
| 752 |
+
"""
|
| 753 |
+
try:
|
| 754 |
+
from app.ai.services.osm_poi_service import find_pois
|
| 755 |
+
|
| 756 |
+
# Use OSM to get real POI locations
|
| 757 |
+
search_type = specific_name if specific_name else poi_type
|
| 758 |
+
|
| 759 |
+
logger.info(
|
| 760 |
+
"Finding POIs via OpenStreetMap",
|
| 761 |
+
poi_type=search_type,
|
| 762 |
+
location=location
|
| 763 |
+
)
|
| 764 |
+
|
| 765 |
+
pois = await find_pois(
|
| 766 |
+
poi_type=search_type,
|
| 767 |
+
location=location,
|
| 768 |
+
radius_km=5, # Search within 5km of location center
|
| 769 |
+
limit=5 # Get top 5 POIs
|
| 770 |
+
)
|
| 771 |
+
|
| 772 |
+
if pois:
|
| 773 |
+
logger.info(
|
| 774 |
+
"OSM POIs found",
|
| 775 |
+
count=len(pois),
|
| 776 |
+
poi_type=poi_type,
|
| 777 |
+
location=location
|
| 778 |
+
)
|
| 779 |
+
return pois
|
| 780 |
+
|
| 781 |
+
logger.warning(
|
| 782 |
+
"No OSM POIs found, trying a broader search",
|
| 783 |
+
poi_type=poi_type,
|
| 784 |
+
location=location
|
| 785 |
+
)
|
| 786 |
+
|
| 787 |
+
# Try with just the POI type if specific name returned nothing
|
| 788 |
+
if specific_name:
|
| 789 |
+
pois = await find_pois(
|
| 790 |
+
poi_type=poi_type,
|
| 791 |
+
location=location,
|
| 792 |
+
radius_km=10, # Expand radius
|
| 793 |
+
limit=5
|
| 794 |
+
)
|
| 795 |
+
if pois:
|
| 796 |
+
return pois
|
| 797 |
+
|
| 798 |
+
except ImportError:
|
| 799 |
+
logger.error("OSM POI service not available")
|
| 800 |
+
except Exception as e:
|
| 801 |
+
logger.error("OSM POI search failed", error=str(e))
|
| 802 |
+
|
| 803 |
+
# Fallback: Use LLM to estimate coordinates (less accurate)
|
| 804 |
+
logger.warning("Falling back to LLM POI estimation")
|
| 805 |
+
return await self._fallback_llm_poi_locations(poi_type, location, specific_name)
|
| 806 |
+
|
| 807 |
+
async def _fallback_llm_poi_locations(
|
| 808 |
+
self,
|
| 809 |
+
poi_type: str,
|
| 810 |
+
location: str,
|
| 811 |
+
specific_name: Optional[str] = None
|
| 812 |
+
) -> List[Dict]:
|
| 813 |
+
"""
|
| 814 |
+
Fallback: Use LLM to estimate POI coordinates when OSM fails.
|
| 815 |
+
|
| 816 |
+
Note: This is less accurate than OSM data and should only be used as fallback.
|
| 817 |
+
"""
|
| 818 |
+
poi_prompt = f"""
|
| 819 |
+
You are a geolocation assistant. Provide approximate coordinates for {poi_type}s in {location}.
|
| 820 |
+
|
| 821 |
+
{f"Specifically looking for: {specific_name}" if specific_name else ""}
|
| 822 |
+
|
| 823 |
+
Return JSON array of up to 3 POIs:
|
| 824 |
+
[
|
| 825 |
+
{{"name": "POI name", "lat": 6.3654, "lon": 2.4183, "type": "{poi_type}"}}
|
| 826 |
+
]
|
| 827 |
+
|
| 828 |
+
Use realistic coordinates for {location}. If you don't know exact coordinates,
|
| 829 |
+
provide approximate city center coordinates for {location}.
|
| 830 |
+
"""
|
| 831 |
+
self.call_count += 1
|
| 832 |
+
poi_response = await self.llm.ainvoke([
|
| 833 |
+
HumanMessage(content=poi_prompt)
|
| 834 |
+
])
|
| 835 |
+
|
| 836 |
+
try:
|
| 837 |
+
pois = self._extract_json(poi_response.content)
|
| 838 |
+
return pois if isinstance(pois, list) else []
|
| 839 |
+
except Exception:
|
| 840 |
+
logger.error("Failed to get POI locations from LLM fallback")
|
| 841 |
+
return []
|
| 842 |
+
|
| 843 |
+
async def _search_near_coordinates(
|
| 844 |
+
self,
|
| 845 |
+
lat: float,
|
| 846 |
+
lon: float,
|
| 847 |
+
radius_km: float,
|
| 848 |
+
criteria: Dict,
|
| 849 |
+
location: str
|
| 850 |
+
) -> List[Dict]:
|
| 851 |
+
"""
|
| 852 |
+
Search listings near specific coordinates.
|
| 853 |
+
|
| 854 |
+
Uses MongoDB geospatial query if listings have lat/lon,
|
| 855 |
+
otherwise falls back to location-based search.
|
| 856 |
+
"""
|
| 857 |
+
from app.database import get_db
|
| 858 |
+
|
| 859 |
+
try:
|
| 860 |
+
db = await get_db()
|
| 861 |
+
|
| 862 |
+
# Build geo query
|
| 863 |
+
# Note: This requires a 2dsphere index on listings collection
|
| 864 |
+
# db.listings.create_index([("location_geo", "2dsphere")])
|
| 865 |
+
|
| 866 |
+
geo_query = {
|
| 867 |
+
"status": "active"
|
| 868 |
+
}
|
| 869 |
+
|
| 870 |
+
# Add criteria filters
|
| 871 |
+
if criteria.get("bedrooms"):
|
| 872 |
+
geo_query["bedrooms"] = {"$gte": criteria["bedrooms"]}
|
| 873 |
+
if criteria.get("max_price"):
|
| 874 |
+
geo_query["price"] = {"$lte": criteria["max_price"]}
|
| 875 |
+
if criteria.get("min_price"):
|
| 876 |
+
if "price" in geo_query:
|
| 877 |
+
geo_query["price"]["$gte"] = criteria["min_price"]
|
| 878 |
+
else:
|
| 879 |
+
geo_query["price"] = {"$gte": criteria["min_price"]}
|
| 880 |
+
if criteria.get("listing_type"):
|
| 881 |
+
geo_query["listing_type"] = {"$regex": criteria["listing_type"], "$options": "i"}
|
| 882 |
+
|
| 883 |
+
# Try geospatial query first
|
| 884 |
+
if lat is not None and lon is not None:
|
| 885 |
+
# Convert km to meters for MongoDB
|
| 886 |
+
radius_meters = radius_km * 1000
|
| 887 |
+
|
| 888 |
+
geo_query["$or"] = [
|
| 889 |
+
# Check if listing has coordinates
|
| 890 |
+
{
|
| 891 |
+
"latitude": {"$exists": True, "$ne": None},
|
| 892 |
+
"longitude": {"$exists": True, "$ne": None}
|
| 893 |
+
}
|
| 894 |
+
]
|
| 895 |
+
|
| 896 |
+
# Fetch listings and filter by distance in Python
|
| 897 |
+
# (More flexible than requiring 2dsphere index)
|
| 898 |
+
cursor = db.listings.find(geo_query).limit(50)
|
| 899 |
+
listings = await cursor.to_list(length=50)
|
| 900 |
+
|
| 901 |
+
# Filter by distance
|
| 902 |
+
nearby = []
|
| 903 |
+
for listing in listings:
|
| 904 |
+
if listing.get("latitude") and listing.get("longitude"):
|
| 905 |
+
dist = self._calculate_distance(
|
| 906 |
+
lat, lon,
|
| 907 |
+
listing["latitude"], listing["longitude"]
|
| 908 |
+
)
|
| 909 |
+
if dist <= radius_km:
|
| 910 |
+
listing["_id"] = str(listing["_id"])
|
| 911 |
+
nearby.append(listing)
|
| 912 |
+
# Also include listings in the same location (fallback)
|
| 913 |
+
elif location and location.lower() in str(listing.get("location", "")).lower():
|
| 914 |
+
listing["_id"] = str(listing["_id"])
|
| 915 |
+
nearby.append(listing)
|
| 916 |
+
|
| 917 |
+
return nearby
|
| 918 |
+
|
| 919 |
+
else:
|
| 920 |
+
# No coordinates, search by location name
|
| 921 |
+
if location:
|
| 922 |
+
geo_query["location"] = {"$regex": location, "$options": "i"}
|
| 923 |
+
|
| 924 |
+
cursor = db.listings.find(geo_query).limit(20)
|
| 925 |
+
listings = await cursor.to_list(length=20)
|
| 926 |
+
|
| 927 |
+
for listing in listings:
|
| 928 |
+
listing["_id"] = str(listing["_id"])
|
| 929 |
+
|
| 930 |
+
return listings
|
| 931 |
+
|
| 932 |
+
except Exception as e:
|
| 933 |
+
logger.error("Geo search failed", error=str(e))
|
| 934 |
+
return []
|
| 935 |
+
|
| 936 |
+
async def _execute_criteria_search(
|
| 937 |
+
self,
|
| 938 |
+
criteria: Dict,
|
| 939 |
+
limit: int = 20
|
| 940 |
+
) -> List[Dict]:
|
| 941 |
+
"""
|
| 942 |
+
Execute a search with given criteria using existing search infrastructure.
|
| 943 |
+
"""
|
| 944 |
+
from app.ai.services.search_service import search_mongodb
|
| 945 |
+
|
| 946 |
+
results = await search_mongodb(criteria, limit=limit)
|
| 947 |
+
return results
|
| 948 |
+
|
| 949 |
+
async def _semantic_search_with_criteria(
|
| 950 |
+
self,
|
| 951 |
+
query: str,
|
| 952 |
+
location: str,
|
| 953 |
+
criteria: Dict
|
| 954 |
+
) -> Dict[str, Any]:
|
| 955 |
+
"""
|
| 956 |
+
Semantic search with additional criteria.
|
| 957 |
+
"""
|
| 958 |
+
from app.ai.services.search_service import hybrid_search
|
| 959 |
+
|
| 960 |
+
search_params = {**criteria}
|
| 961 |
+
if location:
|
| 962 |
+
search_params["location"] = location
|
| 963 |
+
|
| 964 |
+
results = await hybrid_search(
|
| 965 |
+
query_text=query,
|
| 966 |
+
search_params=search_params,
|
| 967 |
+
limit=10
|
| 968 |
+
)
|
| 969 |
+
|
| 970 |
+
return {
|
| 971 |
+
"results": results,
|
| 972 |
+
"reasoning_steps": [{"step": "semantic_fallback"}],
|
| 973 |
+
"message": f"Found {len(results)} listings in {location}"
|
| 974 |
+
}
|
| 975 |
+
|
| 976 |
+
async def _generate_comparison_summary(
|
| 977 |
+
self,
|
| 978 |
+
subject_results: List[Dict],
|
| 979 |
+
metric: str
|
| 980 |
+
) -> str:
|
| 981 |
+
"""
|
| 982 |
+
Generate a natural language comparison summary.
|
| 983 |
+
"""
|
| 984 |
+
if len(subject_results) < 2:
|
| 985 |
+
return "Not enough data for comparison"
|
| 986 |
+
|
| 987 |
+
s1, s2 = subject_results[0], subject_results[1]
|
| 988 |
+
|
| 989 |
+
if metric == "average_price":
|
| 990 |
+
diff = abs(s1["avg_price"] - s2["avg_price"])
|
| 991 |
+
cheaper = s1["name"] if s1["avg_price"] < s2["avg_price"] else s2["name"]
|
| 992 |
+
pct = (diff / max(s1["avg_price"], s2["avg_price"]) * 100) if max(s1["avg_price"], s2["avg_price"]) > 0 else 0
|
| 993 |
+
|
| 994 |
+
return (
|
| 995 |
+
f"Average prices: {s1['name']}: {s1['avg_price']:,.0f} XOF | "
|
| 996 |
+
f"{s2['name']}: {s2['avg_price']:,.0f} XOF. "
|
| 997 |
+
f"{cheaper} is {pct:.0f}% cheaper."
|
| 998 |
+
)
|
| 999 |
+
else:
|
| 1000 |
+
return f"Comparison: {s1['name']} ({s1['count']} listings) vs {s2['name']} ({s2['count']} listings)"
|
| 1001 |
+
|
| 1002 |
+
async def _score_factor(
|
| 1003 |
+
self,
|
| 1004 |
+
listing: Dict,
|
| 1005 |
+
factor: str,
|
| 1006 |
+
location: str
|
| 1007 |
+
) -> float:
|
| 1008 |
+
"""
|
| 1009 |
+
Score a listing on a specific factor (0-1).
|
| 1010 |
+
|
| 1011 |
+
Uses:
|
| 1012 |
+
- OpenStreetMap for proximity calculations (school_proximity, park_proximity, etc.)
|
| 1013 |
+
- Text analysis for non-proximity factors (safety, luxury, modern, etc.)
|
| 1014 |
+
"""
|
| 1015 |
+
# Proximity factors - use OSM for actual distance calculation
|
| 1016 |
+
proximity_factors = {
|
| 1017 |
+
"school_proximity": "school",
|
| 1018 |
+
"park_proximity": "park",
|
| 1019 |
+
"beach_proximity": "beach",
|
| 1020 |
+
"hospital_proximity": "hospital",
|
| 1021 |
+
"market_proximity": "market",
|
| 1022 |
+
}
|
| 1023 |
+
|
| 1024 |
+
# Check if this is a proximity factor and listing has coordinates
|
| 1025 |
+
if factor in proximity_factors and listing.get("latitude") and listing.get("longitude"):
|
| 1026 |
+
poi_type = proximity_factors[factor]
|
| 1027 |
+
return await self._score_proximity_factor(
|
| 1028 |
+
listing=listing,
|
| 1029 |
+
poi_type=poi_type,
|
| 1030 |
+
location=location
|
| 1031 |
+
)
|
| 1032 |
+
|
| 1033 |
+
# Non-proximity factors - use text analysis
|
| 1034 |
+
score = 0.5 # Default neutral score
|
| 1035 |
+
|
| 1036 |
+
title = str(listing.get("title", "")).lower()
|
| 1037 |
+
description = str(listing.get("description", "")).lower()
|
| 1038 |
+
amenities = [str(a).lower() for a in (listing.get("amenities") or [])]
|
| 1039 |
+
text = f"{title} {description} {' '.join(amenities)}"
|
| 1040 |
+
|
| 1041 |
+
factor_keywords = {
|
| 1042 |
+
"school_proximity": ["school", "école", "university", "campus", "education"],
|
| 1043 |
+
"park_proximity": ["park", "garden", "parc", "jardin", "green"],
|
| 1044 |
+
"beach_proximity": ["beach", "plage", "ocean", "sea", "waterfront"],
|
| 1045 |
+
"safety": ["safe", "secure", "security", "sécurité", "gated", "guard"],
|
| 1046 |
+
"family_friendly": ["family", "children", "kids", "playground", "familial"],
|
| 1047 |
+
"luxury": ["luxury", "luxe", "premium", "high-end", "prestige", "elegant"],
|
| 1048 |
+
"modern": ["modern", "new", "renovated", "contemporary", "neuf"],
|
| 1049 |
+
"quiet": ["quiet", "peaceful", "calm", "tranquil", "calme"],
|
| 1050 |
+
}
|
| 1051 |
+
|
| 1052 |
+
keywords = factor_keywords.get(factor, [])
|
| 1053 |
+
matches = sum(1 for kw in keywords if kw in text)
|
| 1054 |
+
|
| 1055 |
+
if matches > 0:
|
| 1056 |
+
score = min(0.5 + (matches * 0.2), 1.0)
|
| 1057 |
+
|
| 1058 |
+
return score
|
| 1059 |
+
|
| 1060 |
+
async def _score_proximity_factor(
|
| 1061 |
+
self,
|
| 1062 |
+
listing: Dict,
|
| 1063 |
+
poi_type: str,
|
| 1064 |
+
location: str
|
| 1065 |
+
) -> float:
|
| 1066 |
+
"""
|
| 1067 |
+
Score a listing based on actual proximity to POIs using OpenStreetMap.
|
| 1068 |
+
|
| 1069 |
+
Scoring:
|
| 1070 |
+
- < 0.5 km: 1.0 (excellent)
|
| 1071 |
+
- 0.5 - 1 km: 0.9 (very good)
|
| 1072 |
+
- 1 - 2 km: 0.75 (good)
|
| 1073 |
+
- 2 - 3 km: 0.5 (average)
|
| 1074 |
+
- 3 - 5 km: 0.3 (below average)
|
| 1075 |
+
- > 5 km: 0.1 (poor)
|
| 1076 |
+
"""
|
| 1077 |
+
try:
|
| 1078 |
+
from app.ai.services.osm_poi_service import find_pois_overpass
|
| 1079 |
+
|
| 1080 |
+
listing_lat = listing.get("latitude")
|
| 1081 |
+
listing_lon = listing.get("longitude")
|
| 1082 |
+
|
| 1083 |
+
if listing_lat is None or listing_lon is None:
|
| 1084 |
+
return 0.5 # No coordinates, return neutral score
|
| 1085 |
+
|
| 1086 |
+
# Find nearby POIs
|
| 1087 |
+
pois = await find_pois_overpass(
|
| 1088 |
+
poi_type=poi_type,
|
| 1089 |
+
center_lat=listing_lat,
|
| 1090 |
+
center_lon=listing_lon,
|
| 1091 |
+
radius_km=5
|
| 1092 |
+
)
|
| 1093 |
+
|
| 1094 |
+
if not pois:
|
| 1095 |
+
return 0.3 # No POIs found nearby
|
| 1096 |
+
|
| 1097 |
+
# Find closest POI
|
| 1098 |
+
min_distance = float('inf')
|
| 1099 |
+
for poi in pois:
|
| 1100 |
+
dist = self._calculate_distance(
|
| 1101 |
+
listing_lat, listing_lon,
|
| 1102 |
+
poi.get("lat"), poi.get("lon")
|
| 1103 |
+
)
|
| 1104 |
+
min_distance = min(min_distance, dist)
|
| 1105 |
+
|
| 1106 |
+
# Score based on distance
|
| 1107 |
+
if min_distance < 0.5:
|
| 1108 |
+
return 1.0
|
| 1109 |
+
elif min_distance < 1:
|
| 1110 |
+
return 0.9
|
| 1111 |
+
elif min_distance < 2:
|
| 1112 |
+
return 0.75
|
| 1113 |
+
elif min_distance < 3:
|
| 1114 |
+
return 0.5
|
| 1115 |
+
elif min_distance < 5:
|
| 1116 |
+
return 0.3
|
| 1117 |
+
else:
|
| 1118 |
+
return 0.1
|
| 1119 |
+
|
| 1120 |
+
except Exception as e:
|
| 1121 |
+
logger.error("Proximity scoring failed", error=str(e))
|
| 1122 |
+
return 0.5 # Return neutral on error
|
| 1123 |
+
|
| 1124 |
+
def _calculate_distance(
|
| 1125 |
+
self,
|
| 1126 |
+
lat1: float,
|
| 1127 |
+
lon1: float,
|
| 1128 |
+
lat2: Optional[float],
|
| 1129 |
+
lon2: Optional[float]
|
| 1130 |
+
) -> float:
|
| 1131 |
+
"""
|
| 1132 |
+
Calculate distance between two points using Haversine formula.
|
| 1133 |
+
Returns distance in kilometers.
|
| 1134 |
+
"""
|
| 1135 |
+
import math
|
| 1136 |
+
|
| 1137 |
+
if lat2 is None or lon2 is None:
|
| 1138 |
+
return 999.0 # Return large distance for missing coordinates
|
| 1139 |
+
|
| 1140 |
+
R = 6371 # Earth's radius in km
|
| 1141 |
+
|
| 1142 |
+
lat1_rad = math.radians(lat1)
|
| 1143 |
+
lat2_rad = math.radians(lat2)
|
| 1144 |
+
delta_lat = math.radians(lat2 - lat1)
|
| 1145 |
+
delta_lon = math.radians(lon2 - lon1)
|
| 1146 |
+
|
| 1147 |
+
a = (math.sin(delta_lat/2)**2 +
|
| 1148 |
+
math.cos(lat1_rad) * math.cos(lat2_rad) * math.sin(delta_lon/2)**2)
|
| 1149 |
+
c = 2 * math.atan2(math.sqrt(a), math.sqrt(1-a))
|
| 1150 |
+
|
| 1151 |
+
return R * c
|
| 1152 |
+
|
| 1153 |
+
def _extract_json(self, text: str) -> Any:
|
| 1154 |
+
"""
|
| 1155 |
+
Extract JSON from LLM response text.
|
| 1156 |
+
"""
|
| 1157 |
+
import re
|
| 1158 |
+
|
| 1159 |
+
# Try to find JSON in the response
|
| 1160 |
+
json_match = re.search(r'[\[{][\s\S]*[\]}]', text)
|
| 1161 |
+
if json_match:
|
| 1162 |
+
return json.loads(json_match.group())
|
| 1163 |
+
raise ValueError("No JSON found in response")
|
| 1164 |
+
|
| 1165 |
+
|
| 1166 |
+
# =============================================================================
|
| 1167 |
+
# Singleton Instance
|
| 1168 |
+
# =============================================================================
|
| 1169 |
+
|
| 1170 |
+
_rlm_agent: Optional[RLMSearchAgent] = None
|
| 1171 |
+
|
| 1172 |
+
|
| 1173 |
+
def get_rlm_agent() -> RLMSearchAgent:
|
| 1174 |
+
"""Get or create the singleton RLM agent."""
|
| 1175 |
+
global _rlm_agent
|
| 1176 |
+
if _rlm_agent is None:
|
| 1177 |
+
_rlm_agent = RLMSearchAgent()
|
| 1178 |
+
return _rlm_agent
|
| 1179 |
+
|
| 1180 |
+
|
| 1181 |
+
# =============================================================================
|
| 1182 |
+
# Convenience Function
|
| 1183 |
+
# =============================================================================
|
| 1184 |
+
|
| 1185 |
+
async def rlm_search(query: str, context: Optional[Dict] = None) -> Dict[str, Any]:
|
| 1186 |
+
"""
|
| 1187 |
+
Convenience function for RLM search.
|
| 1188 |
+
|
| 1189 |
+
Usage:
|
| 1190 |
+
from app.ai.services.rlm_search_service import rlm_search
|
| 1191 |
+
|
| 1192 |
+
results = await rlm_search("3-bed near schools in Cotonou")
|
| 1193 |
+
"""
|
| 1194 |
+
agent = get_rlm_agent()
|
| 1195 |
+
return await agent.search(query, context)
|
| 1196 |
+
|
| 1197 |
+
|
| 1198 |
+
__all__ = [
|
| 1199 |
+
"RLMSearchAgent",
|
| 1200 |
+
"get_rlm_agent",
|
| 1201 |
+
"rlm_search"
|
| 1202 |
+
]
|
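The distance and proximity-scoring helpers added in `rlm_search_service.py` can be tried in isolation. The following is a minimal standalone sketch, not the committed code: the names `haversine_km`, `proximity_score`, and `EARTH_RADIUS_KM` are hypothetical, while the radius constant, the `999.0` sentinel for missing coordinates, and the distance bands are copied from `_calculate_distance` and the `_score_proximity_factor` docstring in the diff above.

```python
import math
from typing import Optional

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, same constant as the diff


def haversine_km(
    lat1: float,
    lon1: float,
    lat2: Optional[float],
    lon2: Optional[float],
) -> float:
    """Great-circle distance in kilometers between two lat/lon points."""
    if lat2 is None or lon2 is None:
        # Mirrors the diff: missing coordinates count as "very far away".
        return 999.0

    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)

    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.atan2(math.sqrt(a), math.sqrt(1 - a))


def proximity_score(distance_km: float) -> float:
    """Map a distance to a 0-1 score using the bands from the docstring."""
    # (upper bound in km, score); bounds are exclusive, as in the diff.
    bands = [(0.5, 1.0), (1.0, 0.9), (2.0, 0.75), (3.0, 0.5), (5.0, 0.3)]
    for upper_km, score in bands:
        if distance_km < upper_km:
            return score
    return 0.1  # 5 km or more


if __name__ == "__main__":
    # One degree of longitude at the equator is roughly 111 km.
    d = haversine_km(0.0, 0.0, 0.0, 1.0)
    print(f"{d:.1f} km -> score {proximity_score(d)}")
```

Keeping the bands in a small table rather than an `if`/`elif` chain makes the thresholds easy to tune without touching the control flow; whether that is preferable in the real service is a judgment call.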
app/ai/services/search_strategy_selector.py
CHANGED
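The rule-based shortcut that `select_search_strategy` applies before deferring to the LLM (visible in the hunks of this file) can be sketched roughly as below. This is an illustration under assumptions: `Strategy`, `pick_strategy_by_rules`, and the shortened `SEMANTIC_KEYWORDS` tuple are hypothetical names, and the `has_bedrooms` check is assumed because its definition falls outside the visible hunks. Returning `None` stands in for "use the LLM for complex cases".

```python
from enum import Enum
from typing import Dict, Optional


class Strategy(str, Enum):
    MONGO_ONLY = "MONGO_ONLY"
    QDRANT_ONLY = "QDRANT_ONLY"
    MONGO_THEN_QDRANT = "MONGO_THEN_QDRANT"


# Shortened, illustrative subset of the keyword list in the diff.
SEMANTIC_KEYWORDS = ("close to", "near", "walking distance", "beach", "school")


def pick_strategy_by_rules(query: str, params: Dict) -> Optional[Strategy]:
    """Return a strategy for clear-cut queries, or None to defer to the LLM."""
    has_location = bool(params.get("location"))
    has_price = bool(params.get("min_price") or params.get("max_price"))
    has_bedrooms = bool(params.get("bedrooms"))  # assumed; not in the visible hunk
    has_bathrooms = bool(params.get("bathrooms"))
    has_listing_type = bool(params.get("listing_type"))
    has_amenities = bool(params.get("amenities"))

    structured_count = sum(
        [has_location, has_price, has_bedrooms, has_bathrooms, has_listing_type]
    )
    query_lower = query.lower()
    has_semantic = any(keyword in query_lower for keyword in SEMANTIC_KEYWORDS)

    # Several structured filters and nothing descriptive: plain MongoDB query.
    if structured_count >= 2 and not has_semantic and not has_amenities:
        return Strategy.MONGO_ONLY
    # Purely descriptive query: vector search only.
    if structured_count == 0 and (has_semantic or has_amenities):
        return Strategy.QDRANT_ONLY
    # Location plus descriptive features: filter first, then semantic search.
    if has_location and has_semantic:
        return Strategy.MONGO_THEN_QDRANT
    return None  # ambiguous case, left to the LLM in the real service


if __name__ == "__main__":
    print(pick_strategy_by_rules("cozy place near the beach", {}))
```

Because the keyword test is a plain substring match, a word such as "nearby" also counts as a semantic hint; the diff's own list behaves the same way.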
|
@@ -7,6 +7,13 @@ Strategies:
|
|
| 7 |
- QDRANT_ONLY: Pure semantic search (vague/descriptive queries)
|
| 8 |
- MONGO_THEN_QDRANT: Filter by location/price in MongoDB, then semantic search within results
|
| 9 |
- QDRANT_THEN_MONGO: Semantic search first, then apply MongoDB filters
|
| 10 |
"""
|
| 11 |
|
| 12 |
import logging
|
|
@@ -22,11 +29,19 @@ logger = logging.getLogger(__name__)
|
|
| 22 |
|
| 23 |
class SearchStrategy(str, Enum):
|
| 24 |
"""Available search strategies"""
|
| 25 |
MONGO_ONLY = "MONGO_ONLY"
|
| 26 |
QDRANT_ONLY = "QDRANT_ONLY"
|
| 27 |
MONGO_THEN_QDRANT = "MONGO_THEN_QDRANT"
|
| 28 |
QDRANT_THEN_MONGO = "QDRANT_THEN_MONGO"
|
| 29 |
|
| 30 |
|
| 31 |
# LLM for strategy selection
|
| 32 |
llm = ChatOpenAI(
|
|
@@ -81,19 +96,67 @@ Return ONLY valid JSON:
|
|
| 81 |
async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
|
| 82 |
"""
|
| 83 |
Select optimal search strategy based on query and extracted parameters.
|
| 84 |
-
|
| 85 |
Args:
|
| 86 |
user_query: Original user query
|
| 87 |
search_params: Extracted search parameters
|
| 88 |
-
|
| 89 |
Returns:
|
| 90 |
Dict with:
|
| 91 |
- strategy: SearchStrategy enum value
|
| 92 |
- reasoning: str
|
| 93 |
- has_semantic_features: bool
|
| 94 |
- has_structured_filters: bool
|
| 95 |
"""
|
| 96 |
-
|
| 97 |
# Quick heuristics for obvious cases
|
| 98 |
has_location = bool(search_params.get("location"))
|
| 99 |
has_price = bool(search_params.get("min_price") or search_params.get("max_price"))
|
|
@@ -101,9 +164,9 @@ async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
|
|
| 101 |
has_bathrooms = bool(search_params.get("bathrooms"))
|
| 102 |
has_listing_type = bool(search_params.get("listing_type"))
|
| 103 |
has_amenities = bool(search_params.get("amenities") and len(search_params.get("amenities", [])) > 0)
|
| 104 |
-
|
| 105 |
structured_count = sum([has_location, has_price, has_bedrooms, has_bathrooms, has_listing_type])
|
| 106 |
-
|
| 107 |
# Detect semantic keywords in query
|
| 108 |
semantic_keywords = [
|
| 109 |
"close to", "near", "nearby", "walking distance",
|
|
@@ -117,10 +180,9 @@ async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
|
|
| 117 |
"good vibes", "nice area", "good neighborhood",
|
| 118 |
"beach", "school", "market", "downtown", "city center",
|
| 119 |
]
|
| 120 |
-
|
| 121 |
-
query_lower = user_query.lower()
|
| 122 |
has_semantic = any(keyword in query_lower for keyword in semantic_keywords)
|
| 123 |
-
|
| 124 |
# Simple rule-based decision for clear cases
|
| 125 |
if structured_count >= 2 and not has_semantic and not has_amenities:
|
| 126 |
# Pure structured query
|
|
@@ -128,25 +190,28 @@ async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
|
|
| 128 |
"strategy": SearchStrategy.MONGO_ONLY,
|
| 129 |
"reasoning": "Query has multiple structured filters and no semantic features",
|
| 130 |
"has_semantic_features": False,
|
| 131 |
-
"has_structured_filters": True
|
|
|
|
| 132 |
}
|
| 133 |
-
|
| 134 |
if structured_count == 0 and (has_semantic or has_amenities):
|
| 135 |
# Pure semantic query
|
| 136 |
return {
|
| 137 |
"strategy": SearchStrategy.QDRANT_ONLY,
|
| 138 |
"reasoning": "Query is purely semantic/descriptive with no structured filters",
|
| 139 |
"has_semantic_features": True,
|
| 140 |
-
"has_structured_filters": False
|
|
|
|
| 141 |
}
|
| 142 |
-
|
| 143 |
if has_location and has_semantic:
|
| 144 |
# Location + semantic features
|
| 145 |
return {
|
| 146 |
"strategy": SearchStrategy.MONGO_THEN_QDRANT,
|
| 147 |
"reasoning": "Query has location filter and semantic features - filter by location first, then semantic search",
|
| 148 |
"has_semantic_features": True,
|
| 149 |
-
"has_structured_filters": True
|
|
|
|
| 150 |
}
|
| 151 |
|
| 152 |
# Use LLM for complex cases
|
|
@@ -171,13 +236,15 @@ async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
|
|
| 171 |
"strategy": SearchStrategy.MONGO_ONLY,
|
| 172 |
"reasoning": "Strategy selection failed, using MongoDB filters",
|
| 173 |
"has_semantic_features": False,
|
| 174 |
-
"has_structured_filters": True
|
|
|
|
| 175 |
}
|
| 176 |
-
|
| 177 |
result = validation.data
|
|
|
|
| 178 |
logger.info(f"Strategy selected: {result.get('strategy')} - {result.get('reasoning')}")
|
| 179 |
return result
|
| 180 |
-
|
| 181 |
except Exception as e:
|
| 182 |
logger.error(f"Strategy selection error: {e}")
|
| 183 |
# Default to MONGO_ONLY on error
|
|
@@ -185,5 +252,6 @@ async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
|
|
| 185 |
"strategy": SearchStrategy.MONGO_ONLY,
|
| 186 |
"reasoning": "Strategy selection error, defaulting to MongoDB",
|
| 187 |
"has_semantic_features": False,
|
| 188 |
-
"has_structured_filters": True
|
|
|
|
| 189 |
}
|
|
|
|
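The rule-based part of `select_search_strategy` shown in the hunks above can be sketched on its own. This is a minimal illustration, not the file's actual code: the keyword list is abbreviated because the diff elides part of it, the `bedrooms` lookup is assumed (its line is not shown in the hunks), and `pick_rule_based` is a hypothetical helper name that returns `None` where the original would hand the query to the LLM.

```python
from typing import Any, Dict, Optional

# Abbreviated: only the keywords visible in the diff hunks are listed here.
SEMANTIC_KEYWORDS = [
    "close to", "near", "nearby", "walking distance",
    "good vibes", "nice area", "good neighborhood",
    "beach", "school", "market", "downtown", "city center",
]

# "bedrooms" is an assumption; the corresponding source line is elided in the diff.
STRUCTURED_KEYS = ("location", "bedrooms", "bathrooms", "listing_type")


def pick_rule_based(user_query: str, search_params: Dict[str, Any]) -> Optional[str]:
    """Return a strategy name for clear-cut queries, or None to defer to the LLM."""
    has_location = bool(search_params.get("location"))
    has_price = bool(search_params.get("min_price") or search_params.get("max_price"))
    has_amenities = bool(search_params.get("amenities"))

    # Count how many structured filters were extracted from the query.
    structured_count = int(has_price) + sum(
        bool(search_params.get(key)) for key in STRUCTURED_KEYS
    )

    # Plain substring matching, as in the original: "market" also matches "supermarket".
    query_lower = user_query.lower()
    has_semantic = any(keyword in query_lower for keyword in SEMANTIC_KEYWORDS)

    if structured_count >= 2 and not has_semantic and not has_amenities:
        return "MONGO_ONLY"          # pure structured query
    if structured_count == 0 and (has_semantic or has_amenities):
        return "QDRANT_ONLY"         # pure semantic query
    if has_location and has_semantic:
        return "MONGO_THEN_QDRANT"   # filter by location, then semantic search
    return None                      # ambiguous: the original asks the LLM here
```

The three rules are checked in order, so a query with a location and a semantic phrase only reaches the third rule after failing the first two.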
| 7 |
- QDRANT_ONLY: Pure semantic search (vague/descriptive queries)
|
| 8 |
- MONGO_THEN_QDRANT: Filter by location/price in MongoDB, then semantic search within results
|
| 9 |
- QDRANT_THEN_MONGO: Semantic search first, then apply MongoDB filters
|
| 10 |
+
|
| 11 |
+
RLM Strategies (Recursive Language Model):
|
| 12 |
+
- RLM_MULTI_HOP: "near schools", "close to beach" - requires finding POI first
|
| 13 |
+
- RLM_BOOLEAN_OR: "under 500k OR has pool" - complex OR logic
|
| 14 |
+
- RLM_COMPARATIVE: "compare Cotonou vs Calavi" - multi-location comparison
|
| 15 |
+
- RLM_AGGREGATION: "average price", "how many" - data aggregation
|
| 16 |
+
- RLM_MULTI_FACTOR: "best family apartment" - multi-criteria ranking
|
| 17 |
"""
|
| 18 |
|
| 19 |
import logging
|
|
|
|
| 29 |
|
| 30 |
class SearchStrategy(str, Enum):
|
| 31 |
"""Available search strategies"""
|
| 32 |
+
# Traditional strategies
|
| 33 |
MONGO_ONLY = "MONGO_ONLY"
|
| 34 |
QDRANT_ONLY = "QDRANT_ONLY"
|
| 35 |
MONGO_THEN_QDRANT = "MONGO_THEN_QDRANT"
|
| 36 |
QDRANT_THEN_MONGO = "QDRANT_THEN_MONGO"
|
| 37 |
|
| 38 |
+
# RLM (Recursive Language Model) strategies
|
| 39 |
+
RLM_MULTI_HOP = "RLM_MULTI_HOP" # "near X", "close to Y"
|
| 40 |
+
RLM_BOOLEAN_OR = "RLM_BOOLEAN_OR" # "X OR Y"
|
| 41 |
+
RLM_COMPARATIVE = "RLM_COMPARATIVE" # "compare A vs B"
|
| 42 |
+
RLM_AGGREGATION = "RLM_AGGREGATION" # "average", "count"
|
| 43 |
+
RLM_MULTI_FACTOR = "RLM_MULTI_FACTOR" # multi-criteria ranking
|
| 44 |
+
|
| 45 |
|
| 46 |
# LLM for strategy selection
|
| 47 |
llm = ChatOpenAI(
|
|
|
|
| 96 |
async def select_search_strategy(user_query: str, search_params: Dict) -> Dict:
|
| 97 |
"""
|
| 98 |
Select optimal search strategy based on query and extracted parameters.
|
| 99 |
+
|
| 100 |
+
PRIORITY ORDER:
|
| 101 |
+
1. Check for RLM-appropriate queries (complex multi-hop, OR, comparative)
|
| 102 |
+
2. Fall back to traditional strategies for simple queries
|
| 103 |
+
|
| 104 |
Args:
|
| 105 |
user_query: Original user query
|
| 106 |
search_params: Extracted search parameters
|
| 107 |
+
|
| 108 |
Returns:
|
| 109 |
Dict with:
|
| 110 |
- strategy: SearchStrategy enum value
|
| 111 |
- reasoning: str
|
| 112 |
- has_semantic_features: bool
|
| 113 |
- has_structured_filters: bool
|
| 114 |
+
- use_rlm: bool (NEW)
|
| 115 |
"""
|
| 116 |
+
query_lower = user_query.lower()
|
| 117 |
+
|
| 118 |
+
# =========================================================================
|
| 119 |
+
# STEP 1: Check for RLM-appropriate queries FIRST
|
| 120 |
+
# =========================================================================
|
| 121 |
+
try:
|
| 122 |
+
from app.ai.services.rlm_query_analyzer import analyze_query_complexity, QueryComplexity
|
| 123 |
+
|
| 124 |
+
rlm_analysis = analyze_query_complexity(user_query)
|
| 125 |
+
|
| 126 |
+
if rlm_analysis.use_rlm:
|
| 127 |
+
# Map QueryComplexity to SearchStrategy
|
| 128 |
+
rlm_strategy_map = {
|
| 129 |
+
QueryComplexity.MULTI_HOP: SearchStrategy.RLM_MULTI_HOP,
|
| 130 |
+
QueryComplexity.BOOLEAN_OR: SearchStrategy.RLM_BOOLEAN_OR,
|
| 131 |
+
QueryComplexity.COMPARATIVE: SearchStrategy.RLM_COMPARATIVE,
|
| 132 |
+
QueryComplexity.AGGREGATION: SearchStrategy.RLM_AGGREGATION,
|
| 133 |
+
QueryComplexity.MULTI_FACTOR: SearchStrategy.RLM_MULTI_FACTOR,
|
| 134 |
+
}
|
| 135 |
+
|
| 136 |
+
strategy = rlm_strategy_map.get(rlm_analysis.complexity)
|
| 137 |
+
if strategy:
|
| 138 |
+
logger.info(
|
| 139 |
+
f"RLM strategy selected: {strategy.value}",
|
| 140 |
+
query=user_query[:50],
|
| 141 |
+
confidence=rlm_analysis.confidence
|
| 142 |
+
)
|
| 143 |
+
return {
|
| 144 |
+
"strategy": strategy,
|
| 145 |
+
"reasoning": rlm_analysis.reasoning,
|
| 146 |
+
"has_semantic_features": True,
|
| 147 |
+
"has_structured_filters": True,
|
| 148 |
+
"use_rlm": True,
|
| 149 |
+
"rlm_analysis": rlm_analysis.model_dump()
|
| 150 |
+
}
|
| 151 |
+
except ImportError:
|
| 152 |
+
logger.warning("RLM module not available, using traditional strategies")
|
| 153 |
+
except Exception as e:
|
| 154 |
+
logger.error(f"RLM analysis failed: {e}, falling back to traditional")
|
| 155 |
+
|
| 156 |
+
# =========================================================================
|
| 157 |
+
# STEP 2: Traditional strategy selection (for simple queries)
|
| 158 |
+
# =========================================================================
|
| 159 |
+
|
| 160 |
# Quick heuristics for obvious cases
|
| 161 |
has_location = bool(search_params.get("location"))
|
| 162 |
has_price = bool(search_params.get("min_price") or search_params.get("max_price"))
|
|
|
|
| 164 |
has_bathrooms = bool(search_params.get("bathrooms"))
|
| 165 |
has_listing_type = bool(search_params.get("listing_type"))
|
| 166 |
has_amenities = bool(search_params.get("amenities") and len(search_params.get("amenities", [])) > 0)
|
| 167 |
+
|
| 168 |
structured_count = sum([has_location, has_price, has_bedrooms, has_bathrooms, has_listing_type])
|
| 169 |
+
|
| 170 |
# Detect semantic keywords in query
|
| 171 |
semantic_keywords = [
|
| 172 |
"close to", "near", "nearby", "walking distance",
|
|
|
|
| 180 |
"good vibes", "nice area", "good neighborhood",
|
| 181 |
"beach", "school", "market", "downtown", "city center",
|
| 182 |
]
|
| 183 |
+
|
|
|
|
| 184 |
has_semantic = any(keyword in query_lower for keyword in semantic_keywords)
|
| 185 |
+
|
| 186 |
# Simple rule-based decision for clear cases
|
| 187 |
if structured_count >= 2 and not has_semantic and not has_amenities:
|
| 188 |
# Pure structured query
|
|
|
|
| 190 |
"strategy": SearchStrategy.MONGO_ONLY,
|
| 191 |
"reasoning": "Query has multiple structured filters and no semantic features",
|
| 192 |
"has_semantic_features": False,
|
| 193 |
+
"has_structured_filters": True,
|
| 194 |
+
"use_rlm": False
|
| 195 |
}
|
| 196 |
+
|
| 197 |
if structured_count == 0 and (has_semantic or has_amenities):
|
| 198 |
# Pure semantic query
|
| 199 |
return {
|
| 200 |
"strategy": SearchStrategy.QDRANT_ONLY,
|
| 201 |
"reasoning": "Query is purely semantic/descriptive with no structured filters",
|
| 202 |
"has_semantic_features": True,
|
| 203 |
+
"has_structured_filters": False,
|
| 204 |
+
"use_rlm": False
|
| 205 |
}
|
| 206 |
+
|
| 207 |
if has_location and has_semantic:
|
| 208 |
# Location + semantic features
|
| 209 |
return {
|
| 210 |
"strategy": SearchStrategy.MONGO_THEN_QDRANT,
|
| 211 |
"reasoning": "Query has location filter and semantic features - filter by location first, then semantic search",
|
| 212 |
"has_semantic_features": True,
|
| 213 |
+
"has_structured_filters": True,
|
| 214 |
+
"use_rlm": False
|
| 215 |
}
|
| 216 |
|
| 217 |
# Use LLM for complex cases
|
|
|
|
| 236 |
"strategy": SearchStrategy.MONGO_ONLY,
|
| 237 |
"reasoning": "Strategy selection failed, using MongoDB filters",
|
| 238 |
"has_semantic_features": False,
|
| 239 |
+
"has_structured_filters": True,
|
| 240 |
+
"use_rlm": False
|
| 241 |
}
|
| 242 |
+
|
| 243 |
result = validation.data
|
| 244 |
+
result["use_rlm"] = False # LLM-selected strategies are not RLM
|
| 245 |
logger.info(f"Strategy selected: {result.get('strategy')} - {result.get('reasoning')}")
|
| 246 |
return result
|
| 247 |
+
|
| 248 |
except Exception as e:
|
| 249 |
logger.error(f"Strategy selection error: {e}")
|
| 250 |
# Default to MONGO_ONLY on error
|
|
|
|
| 252 |
"strategy": SearchStrategy.MONGO_ONLY,
|
| 253 |
"reasoning": "Strategy selection error, defaulting to MongoDB",
|
| 254 |
"has_semantic_features": False,
|
| 255 |
+
"has_structured_filters": True,
|
| 256 |
+
"use_rlm": False
|
| 257 |
}
|
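The priority order introduced by this change (try the RLM analyzer first, fall back to a traditional strategy on a miss or on any failure) can be sketched in isolation. The `rlm_query_analyzer` module is not shown in this part of the diff, so `toy_analyzer`, the `Analysis` dataclass, and the `SIMPLE` member below are hypothetical stand-ins, and only two of the five RLM strategies are included.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict


class QueryComplexity(str, Enum):
    SIMPLE = "SIMPLE"            # hypothetical member for non-RLM queries
    MULTI_HOP = "MULTI_HOP"
    BOOLEAN_OR = "BOOLEAN_OR"


class SearchStrategy(str, Enum):
    MONGO_ONLY = "MONGO_ONLY"
    RLM_MULTI_HOP = "RLM_MULTI_HOP"
    RLM_BOOLEAN_OR = "RLM_BOOLEAN_OR"


@dataclass
class Analysis:
    use_rlm: bool
    complexity: QueryComplexity
    reasoning: str


RLM_STRATEGY_MAP = {
    QueryComplexity.MULTI_HOP: SearchStrategy.RLM_MULTI_HOP,
    QueryComplexity.BOOLEAN_OR: SearchStrategy.RLM_BOOLEAN_OR,
}


def toy_analyzer(query: str) -> Analysis:
    """Stand-in for analyze_query_complexity; the real rules are not in the diff."""
    q = query.lower()
    if " or " in q:
        return Analysis(True, QueryComplexity.BOOLEAN_OR, "OR between conditions")
    if "near " in q or "close to" in q:
        return Analysis(True, QueryComplexity.MULTI_HOP, "needs a POI lookup first")
    return Analysis(False, QueryComplexity.SIMPLE, "simple query")


def select(query: str, analyzer: Callable[[str], Analysis] = toy_analyzer) -> Dict[str, Any]:
    # Step 1: RLM check first; any analyzer failure falls through to step 2.
    try:
        analysis = analyzer(query)
        if analysis.use_rlm:
            strategy = RLM_STRATEGY_MAP.get(analysis.complexity)
            if strategy:
                return {"strategy": strategy, "reasoning": analysis.reasoning, "use_rlm": True}
    except Exception:
        pass
    # Step 2: traditional selection, collapsed here to its default.
    return {"strategy": SearchStrategy.MONGO_ONLY, "reasoning": "traditional fallback", "use_rlm": False}
```

One consequence of this ordering: if the real analyzer flags "near ..." queries as multi-hop, as the module docstring suggests, those queries are claimed by `RLM_MULTI_HOP` before the `MONGO_THEN_QDRANT` rule is ever evaluated.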
app/ai/services/vision_service.py
DELETED
|
@@ -1,697 +0,0 @@
|
|
| 1 |
-
# ============================================================
|
| 2 |
-
# app/ai/services/vision_service.py
|
| 3 |
-
# Vision AI Service for Property Image Analysis
|
| 4 |
-
# Uses Hugging Face Inference API (Moondream2 model)
|
| 5 |
-
# ============================================================
|
| 6 |
-
|
| 7 |
-
import io
|
| 8 |
-
import os
|
| 9 |
-
import base64
|
| 10 |
-
import logging
|
| 11 |
-
from typing import Dict, List, Optional, Tuple
|
| 12 |
-
from PIL import Image
|
| 13 |
-
import requests
|
| 14 |
-
import cv2
|
| 15 |
-
import numpy as np
|
| 16 |
-
import tempfile
|
| 17 |
-
from app.config import settings
|
| 18 |
-
|
| 19 |
-
logger = logging.getLogger(__name__)
|
| 20 |
-
|
| 21 |
-
|
| 22 |
-
class VisionService:
|
| 23 |
-
"""Service for analyzing property images using HuggingFace Inference API (BLIP - FREE)"""
|
| 24 |
-
|
| 25 |
-
def __init__(self):
|
| 26 |
-
# BLIP image captioning works with HuggingFace FREE Inference API
|
| 27 |
-
# No special providers needed - uses standard inference endpoint
|
| 28 |
-
self.hf_token = settings.HF_TOKEN or settings.HUGGINGFACE_API_KEY
|
| 29 |
-
self.model_id = settings.HF_VISION_MODEL # Salesforce/blip-image-captioning-large
|
| 30 |
-
# Standard HuggingFace Inference API endpoint (works with BLIP!)
|
| 31 |
-
self.api_url = f"https://api-inference.huggingface.co/models/{self.model_id}"
|
| 32 |
-
self.headers = {
|
| 33 |
-
"Authorization": f"Bearer {self.hf_token}",
|
| 34 |
-
"Content-Type": "application/json"
|
| 35 |
-
}
|
| 36 |
-
self.property_confidence_threshold = settings.PROPERTY_IMAGE_MIN_CONFIDENCE
|
| 37 |
-
logger.info(f"🔧 Vision Service initialized with HF Inference: {self.model_id}")
|
| 38 |
-
|
| 39 |
-
# ============================================================
|
| 40 |
-
# Core Image Validation & Analysis
|
| 41 |
-
# ============================================================
|
| 42 |
-
|
| 43 |
-
def validate_property_image(self, image_bytes: bytes) -> Tuple[bool, float, str]:
|
| 44 |
-
"""
|
| 45 |
-
Validate if image is property-related before uploading
|
| 46 |
-
|
| 47 |
-
Args:
|
| 48 |
-
image_bytes: Raw image bytes
|
| 49 |
-
|
| 50 |
-
Returns:
|
| 51 |
-
Tuple of (is_valid, confidence, message)
|
| 52 |
-
"""
|
| 53 |
-
try:
|
| 54 |
-
# Check if image is readable
|
| 55 |
-
image = Image.open(io.BytesIO(image_bytes))
|
| 56 |
-
image_rgb = image.convert("RGB")
|
| 57 |
-
|
| 58 |
-
# Query vision model to check if it's a property
|
| 59 |
-
payload = {
|
| 60 |
-
"inputs": image_rgb,
|
| 61 |
-
"question": (
|
| 62 |
-
"Is this image a photo of a real property (house, apartment, room, "
|
| 63 |
-
"office, land, or commercial building)? Answer only yes or no."
|
| 64 |
-
),
|
| 65 |
-
}
|
| 66 |
-
|
| 67 |
-
response = self._query_hf_api(payload)
|
| 68 |
-
|
| 69 |
-
if not response:
|
| 70 |
-
return False, 0.0, "Failed to process image"
|
| 71 |
-
|
| 72 |
-
answer = response.strip().lower()
|
| 73 |
-
is_property = "yes" in answer or "this is a property" in answer.lower()
|
| 74 |
-
|
| 75 |
-
# Assign confidence based on response clarity
|
| 76 |
-
confidence = 0.95 if is_property else 0.5
|
| 77 |
-
|
| 78 |
-
if is_property:
|
| 79 |
-
return (
|
| 80 |
-
True,
|
| 81 |
-
confidence,
|
| 82 |
-
"Property image validated successfully"
|
| 83 |
-
)
|
| 84 |
-
else:
|
| 85 |
-
return (
|
| 86 |
-
False,
|
| 87 |
-
confidence,
|
| 88 |
-
"This doesn't look like a property photo. Please upload images of "
|
| 89 |
-
"actual properties (houses, apartments, rooms, offices, or land)."
|
| 90 |
-
)
|
| 91 |
-
|
| 92 |
-
except Exception as e:
|
| 93 |
-
logger.error(f"Error validating property image: {str(e)}")
|
| 94 |
-
return False, 0.0, f"Error processing image: {str(e)}"
|
| 95 |
-
|
| 96 |
-
# ============================================================
|
| 97 |
-
# Property Field Extraction
|
| 98 |
-
# ============================================================
|
| 99 |
-
|
| 100 |
-
def extract_property_fields(
|
| 101 |
-
self,
|
| 102 |
-
image_bytes: bytes,
|
| 103 |
-
location: Optional[str] = None,
|
| 104 |
-
fast_validate: bool = False
|
| 105 |
-
) -> Dict:
|
| 106 |
-
"""
|
| 107 |
-
Extract property listing fields from image
|
| 108 |
-
|
| 109 |
-
Args:
|
| 110 |
-
image_bytes: Raw image bytes
|
| 111 |
-
location: Optional location context (helps with accuracy)
|
| 112 |
-
fast_validate: If True, only generate title (skip detailed extraction)
|
| 113 |
-
Use when image is complementary to text listing
|
| 114 |
-
|
| 115 |
-
Returns:
|
| 116 |
-
Dict with extracted fields and confidence scores
|
| 117 |
-
"""
|
| 118 |
-
try:
|
| 119 |
-
image = Image.open(io.BytesIO(image_bytes))
|
| 120 |
-
image_rgb = image.convert("RGB")
|
| 121 |
-
|
| 122 |
-
extracted = {
|
| 123 |
-
"bedrooms": None,
|
| 124 |
-
"bathrooms": None,
|
| 125 |
-
"amenities": [],
|
| 126 |
-
"description": "",
|
| 127 |
-
"title": "",
|
| 128 |
-
"confidence": {}
|
| 129 |
-
}
|
| 130 |
-
|
| 131 |
-
# ============================================================
|
| 132 |
-
# FAST VALIDATE MODE: Only generate title, skip extraction
|
| 133 |
-
# Used when user has already provided details via text
|
| 134 |
-
# ============================================================
|
| 135 |
-
|
| 136 |
-
if fast_validate:
|
| 137 |
-
logger.info("Fast validation mode: Generating title only")
|
| 138 |
-
title_data = self._generate_title(
|
| 139 |
-
image_rgb,
|
| 140 |
-
bedrooms=None,
|
| 141 |
-
bathrooms=None,
|
| 142 |
-
location=location
|
| 143 |
-
)
|
| 144 |
-
extracted["title"] = title_data.get("title", "Property Image")
|
| 145 |
-
extracted["confidence"]["title"] = title_data.get("confidence", 0.8)
|
| 146 |
-
extracted["fast_validated"] = True
|
| 147 |
-
return extracted
|
| 148 |
-
|
| 149 |
-
# ============================================================
|
| 150 |
-
# FULL EXTRACTION MODE: Extract all details for new listing
|
| 151 |
-
# ============================================================
|
| 152 |
-
|
| 153 |
-
# Query 1: Count rooms (bedrooms + bathrooms)
|
| 154 |
-
rooms_data = self._extract_room_count(image_rgb)
|
| 155 |
-
extracted["bedrooms"] = rooms_data.get("bedrooms")
|
| 156 |
-
extracted["bathrooms"] = rooms_data.get("bathrooms")
|
| 157 |
-
extracted["confidence"].update({
|
| 158 |
-
"bedrooms": rooms_data.get("bedroom_confidence", 0.0),
|
| 159 |
-
"bathrooms": rooms_data.get("bathroom_confidence", 0.0)
|
| 160 |
-
})
|
| 161 |
-
|
| 162 |
-
# Query 2: Detect amenities
|
| 163 |
-
amenities_data = self._detect_amenities(image_rgb)
|
| 164 |
-
extracted["amenities"] = amenities_data.get("amenities", [])
|
| 165 |
-
extracted["confidence"]["amenities"] = amenities_data.get("confidence", 0.0)
|
| 166 |
-
|
| 167 |
-
# Query 3: Generate description
|
| 168 |
-
description_data = self._generate_description(image_rgb)
|
| 169 |
-
extracted["description"] = description_data.get("description", "")
|
| 170 |
-
extracted["confidence"]["description"] = description_data.get("confidence", 0.0)
|
| 171 |
-
|
| 172 |
-
# Query 4: Generate SHORT title (max 2 sentences)
|
| 173 |
-
title_data = self._generate_title(
|
| 174 |
-
image_rgb,
|
| 175 |
-
bedrooms=extracted.get("bedrooms"),
|
| 176 |
-
bathrooms=extracted.get("bathrooms"),
|
| 177 |
-
location=location
|
| 178 |
-
)
|
| 179 |
-
extracted["title"] = title_data.get("title", "")
|
| 180 |
-
extracted["confidence"]["title"] = title_data.get("confidence", 0.0)
|
| 181 |
-
|
| 182 |
-
return extracted
|
| 183 |
-
|
| 184 |
-
except Exception as e:
|
| 185 |
-
logger.error(f"Error extracting property fields: {str(e)}")
|
| 186 |
-
return {
|
| 187 |
-
"bedrooms": None,
|
| 188 |
-
"bathrooms": None,
|
| 189 |
-
"amenities": [],
|
| 190 |
-
"description": "",
|
| 191 |
-
"title": "",
|
| 192 |
-
"confidence": {},
|
| 193 |
-
"error": str(e)
|
| 194 |
-
}
|
| 195 |
-
|
| 196 |
-
# ============================================================
|
| 197 |
-
# Specific Field Extraction Methods
|
| 198 |
-
# ============================================================
|
| 199 |
-
|
| 200 |
-
def _extract_room_count(self, image: Image.Image) -> Dict:
|
| 201 |
-
"""
|
| 202 |
-
Extract bedroom and bathroom count (matches listing schema)
|
| 203 |
-
|
| 204 |
-
Returns bedrooms and bathrooms as integers (not property_type)
|
| 205 |
-
"""
|
| 206 |
-
try:
|
| 207 |
-
payload = {
|
| 208 |
-
"inputs": image,
|
| 209 |
-
"question": (
|
| 210 |
-
"Count the number of bedrooms and bathrooms you can see in this property photo. "
|
| 211 |
-
"Only count what you can clearly identify. "
|
| 212 |
-
"Format: bedrooms: [number], bathrooms: [number]"
|
| 213 |
-
),
|
| 214 |
-
}
|
| 215 |
-
|
| 216 |
-
response = self._query_hf_api(payload)
|
| 217 |
-
|
| 218 |
-
bedrooms = None
|
| 219 |
-
bathrooms = None
|
| 220 |
-
bedroom_conf = 0.0
|
| 221 |
-
bathroom_conf = 0.0
|
| 222 |
-
|
| 223 |
-
if response:
|
| 224 |
-
response_lower = response.lower()
|
| 225 |
-
|
| 226 |
-
# Extract bedrooms
|
| 227 |
-
if "bedrooms:" in response_lower or "bedroom:" in response_lower:
|
| 228 |
-
try:
|
| 229 |
-
# Handle both "bedrooms:" and "bedroom:"
|
| 230 |
-
if "bedrooms:" in response_lower:
|
| 231 |
-
bed_str = response_lower.split("bedrooms:")[1].split(",")[0].strip()
|
| 232 |
-
else:
|
| 233 |
-
bed_str = response_lower.split("bedroom:")[1].split(",")[0].strip()
|
| 234 |
-
|
| 235 |
-
# Extract first number found
|
| 236 |
-
numbers = ''.join(filter(str.isdigit, bed_str))
|
| 237 |
-
if numbers:
|
| 238 |
-
bedrooms = int(numbers)
|
| 239 |
-
bedroom_conf = 0.80 # Good confidence if extracted
|
| 240 |
-
except Exception as e:
|
| 241 |
-
logger.debug(f"Failed to parse bedrooms: {e}")
|
| 242 |
-
bedroom_conf = 0.2
|
| 243 |
-
|
| 244 |
-
# Extract bathrooms
|
| 245 |
-
if "bathrooms:" in response_lower or "bathroom:" in response_lower:
|
| 246 |
-
try:
|
| 247 |
-
if "bathrooms:" in response_lower:
|
| 248 |
-
bath_str = response_lower.split("bathrooms:")[1].strip()
|
| 249 |
-
else:
|
| 250 |
-
bath_str = response_lower.split("bathroom:")[1].strip()
|
| 251 |
-
|
| 252 |
-
numbers = ''.join(filter(str.isdigit, bath_str))
|
| 253 |
-
if numbers:
|
| 254 |
-
bathrooms = int(numbers)
|
| 255 |
-
bathroom_conf = 0.80
|
| 256 |
-
except Exception as e:
|
| 257 |
-
logger.debug(f"Failed to parse bathrooms: {e}")
|
| 258 |
-
bathroom_conf = 0.2
|
| 259 |
-
|
| 260 |
-
return {
|
| 261 |
-
"bedrooms": bedrooms,
|
| 262 |
-
"bathrooms": bathrooms,
|
| 263 |
-
"bedroom_confidence": bedroom_conf,
|
| 264 |
-
"bathroom_confidence": bathroom_conf
|
| 265 |
-
}
|
| 266 |
-
|
| 267 |
-
except Exception as e:
|
| 268 |
-
logger.error(f"Error extracting room count: {str(e)}")
|
| 269 |
-
return {
|
| 270 |
-
"bedrooms": None,
|
| 271 |
-
"bathrooms": None,
|
| 272 |
-
"bedroom_confidence": 0.0,
|
| 273 |
-
"bathroom_confidence": 0.0
|
| 274 |
-
}
|
| 275 |
-
|
| 276 |
-
def _detect_amenities(self, image: Image.Image) -> Dict:
|
| 277 |
-
"""
|
| 278 |
-
Detect amenities visible in property (matches listing schema)
|
| 279 |
-
|
| 280 |
-
Amenities is a simple list of strings in the listing model.
|
| 281 |
-
Common amenities: balcony, pool, parking, garden, gym, wifi, AC, security
|
| 282 |
-
"""
|
| 283 |
-
try:
|
| 284 |
-
payload = {
|
| 285 |
-
"inputs": image,
|
| 286 |
-
"question": (
|
| 287 |
-
"What amenities can you see in this property? "
|
| 288 |
-
"List only what is clearly visible. Examples: "
|
| 289 |
-
"balcony, pool, parking, garden, gym, wifi router, AC unit, security gate, "
|
| 290 |
-
"furnished, modern kitchen, etc. "
|
| 291 |
-
"If nothing special, say 'none'."
|
| 292 |
-
),
|
| 293 |
-
}
|
| 294 |
-
|
| 295 |
-
response = self._query_hf_api(payload)
|
| 296 |
-
amenities = []
|
| 297 |
-
confidence = 0.5
|
| 298 |
-
|
| 299 |
-
if response and response.lower().strip() not in ["none", "none."]:
|
| 300 |
-
# Split by common separators (comma, and, newline)
|
| 301 |
-
import re
|
| 302 |
-
# Replace "and" with comma for easier splitting
|
| 303 |
-
cleaned = response.replace(" and ", ", ")
|
| 304 |
-
# Split by comma or newline
|
| 305 |
-
parts = re.split(r'[,\n]', cleaned)
|
| 306 |
-
|
| 307 |
-
# Clean and filter amenities
|
| 308 |
-
for amenity in parts:
|
| 309 |
-
amenity = amenity.strip().lower()
|
| 310 |
-
# Remove numbers, bullets, dashes at start
|
| 311 |
-
amenity = re.sub(r'^[\d\-•\*\.\)]+\s*', '', amenity)
|
| 312 |
-
# Skip empty or too short
|
| 313 |
-
if amenity and len(amenity) > 2:
|
| 314 |
-
amenities.append(amenity)
|
| 315 |
-
|
| 316 |
-
# Remove duplicates while preserving order
|
| 317 |
-
amenities = list(dict.fromkeys(amenities))
|
| 318 |
-
confidence = 0.70 if amenities else 0.3
|
| 319 |
-
|
| 320 |
-
return {
|
| 321 |
-
"amenities": amenities,
|
| 322 |
-
"confidence": confidence
|
| 323 |
-
}
|
| 324 |
-
|
| 325 |
-
except Exception as e:
|
| 326 |
-
logger.error(f"Error detecting amenities: {str(e)}")
|
| 327 |
-
return {"amenities": [], "confidence": 0.0}
|
| 328 |
-
|
| 329 |
-
def _generate_description(self, image: Image.Image) -> Dict:
|
| 330 |
-
"""
|
| 331 |
-
Generate brief property description (matches listing schema)
|
| 332 |
-
|
| 333 |
-
Note: The listing flow will later use LLM to generate full title/description
|
| 334 |
-
based on all provided fields. This is just initial extraction from image.
|
| 335 |
-
"""
|
| 336 |
-
try:
|
| 337 |
-
payload = {
|
| 338 |
-
"inputs": image,
|
| 339 |
-
"question": (
|
| 340 |
-
"Describe what you see in this property photo in 1-2 sentences. "
|
| 341 |
-
"Focus on visible features: room type, condition, style, notable features. "
|
| 342 |
-
"Be factual and concise."
|
| 343 |
-
),
|
| 344 |
-
}
|
| 345 |
-
|
| 346 |
-
response = self._query_hf_api(payload)
|
| 347 |
-
|
| 348 |
-
# Limit length
|
| 349 |
-
if response and len(response) > 150:
|
| 350 |
-
response = response[:147] + "..."
|
| 351 |
-
|
| 352 |
-
return {
|
| 353 |
-
"description": response or "",
|
| 354 |
-
"confidence": 0.70 if response else 0.0
|
| 355 |
-
}
|
| 356 |
-
|
| 357 |
-
except Exception as e:
|
| 358 |
-
logger.error(f"Error generating description: {str(e)}")
|
| 359 |
-
return {"description": "", "confidence": 0.0}
|
| 360 |
-
|
| 361 |
-
def _generate_title(self, image: Image.Image, bedrooms: int = None, bathrooms: int = None, location: str = None) -> Dict:
|
| 362 |
-
"""
|
| 363 |
-
Generate simple property title (matches listing schema)
|
| 364 |
-
|
| 365 |
-
Note: The listing flow will later generate SEO-optimized title using LLM
|
| 366 |
-
based on all fields. This is just placeholder from image.
|
| 367 |
-
"""
|
| 368 |
-
try:
|
| 369 |
-
# Build basic context
|
| 370 |
-
room_info = ""
|
| 371 |
-
if bedrooms is not None:
|
| 372 |
-
room_info = f"{bedrooms}-bedroom property"
|
| 373 |
-
elif bathrooms is not None:
|
| 374 |
-
room_info = "Property"
|
| 375 |
-
else:
|
| 376 |
-
room_info = "Property listing"
|
| 377 |
-
|
| 378 |
-
payload = {
|
| 379 |
-
"inputs": image,
|
| 380 |
-
"question": (
|
| 381 |
-
f"Generate a short title for this property photo. "
|
| 382 |
-
f"It appears to be a {room_info}. "
|
| 383 |
-
"Keep it under 50 characters. Example: 'Modern apartment with balcony'"
|
| 384 |
-
),
|
| 385 |
-
}
|
| 386 |
-
|
| 387 |
-
response = self._query_hf_api(payload)
|
| 388 |
-
|
| 389 |
-
# Ensure it's short
|
| 390 |
-
if response and len(response) > 80:
|
| 391 |
-
response = response[:77] + "..."
|
| 392 |
-
|
| 393 |
-
# Fallback if no response
|
| 394 |
-
if not response:
|
| 395 |
-
if bedrooms:
|
| 396 |
-
response = f"{bedrooms}-Bedroom Property"
|
| 397 |
-
else:
|
| 398 |
-
response = "Property Listing"
|
| 399 |
-
|
| 400 |
-
return {
|
| 401 |
-
"title": response or "Property Listing",
|
| 402 |
-
"confidence": 0.60 if response else 0.3
|
| 403 |
-
}
|
| 404 |
-
|
| 405 |
-
except Exception as e:
|
| 406 |
-
logger.error(f"Error generating title: {str(e)}")
|
| 407 |
-
return {"title": "Property Listing", "confidence": 0.3}
|
| 408 |
-
|
| 409 |
-
# ============================================================
|
| 410 |
-
# Hugging Face API Communication
|
| 411 |
-
# ============================================================
|
| 412 |
-
|
| 413 |
-
def _query_hf_api(self, payload: Dict) -> Optional[str]:
|
| 414 |
-
"""
|
| 415 |
-
Query HuggingFace Inference API for image captioning (BLIP model).
|
| 416 |
-
Works with FREE HuggingFace Inference API - no special providers needed!
|
| 417 |
-
|
| 418 |
-
Args:
|
| 419 |
-
payload: Dict with "inputs" (PIL Image) and "question" (str - optional prompt)
|
| 420 |
-
|
| 421 |
-
Returns:
|
| 422 |
-
Response text (caption) or None
|
| 423 |
-
"""
|
| 424 |
-
try:
|
| 425 |
-
if not self.hf_token:
|
| 426 |
-
logger.error("HF_TOKEN not set!")
|
| 427 |
-
return None
|
| 428 |
-
|
| 429 |
-
# BLIP accepts raw image bytes directly
|
| 430 |
-
if isinstance(payload.get("inputs"), Image.Image):
|
| 431 |
-
# Convert PIL Image to bytes
|
| 432 |
-
image_buffer = io.BytesIO()
|
| 433 |
-
payload["inputs"].save(image_buffer, format="JPEG")
|
| 434 |
-
image_bytes = image_buffer.getvalue()
|
| 435 |
-
|
| 436 |
-
# Optional: Add question/prompt for conditional captioning
|
| 437 |
-
# BLIP supports this via inputs parameter
|
| 438 |
-
question = payload.get("question", "")
|
| 439 |
-
|
| 440 |
-
# For BLIP, we send image bytes directly
|
| 441 |
-
# The question can be sent as a query parameter or ignored
|
| 442 |
-
response = requests.post(
|
| 443 |
-
self.api_url,
|
| 444 |
-
headers={"Authorization": f"Bearer {self.hf_token}"},
|
| 445 |
-
data=image_bytes,
|
| 446 |
-
timeout=60
|
| 447 |
-
)
|
| 448 |
-
else:
|
| 449 |
-
# Text-only query not supported for image captioning
|
| 450 |
-
logger.warning("BLIP requires an image input")
|
| 451 |
-
return None
|
| 452 |
-
|
| 453 |
-
if response.status_code == 200:
|
| 454 |
-
result = response.json()
|
| 455 |
-
|
| 456 |
-
# BLIP response format: [{"generated_text": "..."}]
|
| 457 |
-
if isinstance(result, list) and len(result) > 0:
|
| 458 |
-
return result[0].get("generated_text", "")
|
| 459 |
-
elif isinstance(result, dict):
|
| 460 |
-
return result.get("generated_text", "") or result.get("caption", "")
|
| 461 |
-
elif isinstance(result, str):
|
| 462 |
-
return result
|
| 463 |
-
else:
|
| 464 |
-
logger.warning(f"Unexpected BLIP response format: {result}")
|
| 465 |
-
return str(result)
|
| 466 |
-
|
| 467 |
-
elif response.status_code == 503:
|
| 468 |
-
# Model loading - wait and retry
|
| 469 |
-
logger.warning("Model loading (503). Retrying in 15s...")
|
| 470 |
-
import time
|
| 471 |
-
time.sleep(15)
|
| 472 |
-
response = requests.post(
|
| 473 |
-
self.api_url,
|
| 474 |
-
headers={"Authorization": f"Bearer {self.hf_token}"},
|
| 475 |
-
data=image_bytes,
|
| 476 |
-
timeout=60
|
| 477 |
-
)
|
| 478 |
-
if response.status_code == 200:
|
| 479 |
-
result = response.json()
|
| 480 |
-
if isinstance(result, list) and len(result) > 0:
|
| 481 |
-
return result[0].get("generated_text", "")
|
| 482 |
-
return str(result)
|
| 483 |
-
else:
|
| 484 |
-
logger.error(f"HF API error after retry: {response.status_code}")
|
| 485 |
-
return None
|
| 486 |
-
else:
|
| 487 |
-
logger.error(f"HF API error: {response.status_code} - {response.text[:200]}")
|
| 488 |
-
return None
|
| 489 |
-
|
| 490 |
-
except Exception as e:
|
| 491 |
-
logger.error(f"Error querying HF API: {str(e)}")
|
| 492 |
-
return None
|
| 493 |
-
|
| 494 |
-
# ============================================================
|
| 495 |
-
# Video Frame Extraction
|
| 496 |
-
# ============================================================
|
| 497 |
-
|
| 498 |
-
def extract_frames_from_video(self, video_bytes: bytes, max_frames: int = 8) -> List[Image.Image]:
|
| 499 |
-
"""
|
| 500 |
-
Extract key frames from video for analysis
|
| 501 |
-
|
| 502 |
-
Args:
|
| 503 |
-
video_bytes: Raw video file bytes
|
| 504 |
-
max_frames: Maximum number of frames to extract (default 8)
|
| 505 |
-
|
| 506 |
-
Returns:
|
| 507 |
-
List of PIL Images extracted from video
|
| 508 |
-
"""
|
| 509 |
-
frames = []
|
| 510 |
-
temp_video_path = None
|
| 511 |
-
|
| 512 |
-
try:
|
| 513 |
-
# Save video bytes to temp file (OpenCV needs a file path)
|
| 514 |
-
with tempfile.NamedTemporaryFile(delete=False, suffix='.mp4') as temp_video:
|
| 515 |
-
temp_video.write(video_bytes)
|
| 516 |
-
temp_video_path = temp_video.name
|
| 517 |
-
|
| 518 |
-
# Open video with OpenCV
|
| 519 |
-
cap = cv2.VideoCapture(temp_video_path)
|
| 520 |
-
|
| 521 |
-
if not cap.isOpened():
|
| 522 |
-
logger.error("Failed to open video file")
|
| 523 |
-
return frames
|
| 524 |
-
|
| 525 |
-
# Get video properties
|
| 526 |
-
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
|
| 527 |
-
fps = cap.get(cv2.CAP_PROP_FPS)
|
| 528 |
-
duration = total_frames / fps if fps > 0 else 0
|
| 529 |
-
|
| 530 |
-
logger.info(f"Video: {total_frames} frames, {fps:.2f} FPS, {duration:.2f}s duration")
|
| 531 |
-
|
| 532 |
-
# Calculate frame interval to extract max_frames evenly distributed
|
| 533 |
-
if total_frames <= max_frames:
|
| 534 |
-
# Extract all frames if video has fewer frames than max_frames
|
| 535 |
-
frame_indices = list(range(total_frames))
|
| 536 |
-
else:
|
| 537 |
-
# Extract frames at regular intervals
|
| 538 |
-
interval = total_frames // max_frames
|
| 539 |
-
frame_indices = [i * interval for i in range(max_frames)]
|
| 540 |
-
|
| 541 |
-
# Extract frames
|
| 542 |
-
for frame_idx in frame_indices:
|
| 543 |
-
cap.set(cv2.CAP_PROP_POS_FRAMES, frame_idx)
|
| 544 |
-
ret, frame = cap.read()
|
| 545 |
-
|
| 546 |
-
if ret:
|
| 547 |
-
# Convert BGR (OpenCV) to RGB (PIL)
|
| 548 |
-
frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
|
| 549 |
-
|
| 550 |
-
# Convert numpy array to PIL Image
|
| 551 |
-
pil_image = Image.fromarray(frame_rgb)
|
| 552 |
-
|
| 553 |
-
frames.append(pil_image)
|
| 554 |
-
logger.info(f"Extracted frame {len(frames)}/{max_frames} at index {frame_idx}")
|
| 555 |
-
|
| 556 |
-
cap.release()
|
| 557 |
-
logger.info(f"✅ Successfully extracted {len(frames)} frames from video")
|
| 558 |
-
|
| 559 |
-
except Exception as e:
|
| 560 |
-
logger.error(f"Error extracting video frames: {str(e)}")
|
| 561 |
-
|
| 562 |
-
finally:
|
| 563 |
-
# Cleanup temp file
|
| 564 |
-
if temp_video_path:
|
| 565 |
-
try:
|
| 566 |
-
import os
|
| 567 |
-
os.unlink(temp_video_path)
|
| 568 |
-
except:
|
| 569 |
-
pass
|
| 570 |
-
|
| 571 |
-
return frames
|
| 572 |
-
|
| 573 |
-
def analyze_video(self, video_bytes: bytes, location: str = None, max_frames: int = 8) -> Dict:
|
| 574 |
-
"""
|
| 575 |
-
Analyze property video by extracting frames and analyzing them
|
| 576 |
-
|
| 577 |
-
Args:
|
| 578 |
-
video_bytes: Raw video file bytes
|
| 579 |
-
location: Optional location for context
|
| 580 |
-
max_frames: Maximum frames to extract (default 8)
|
| 581 |
-
|
| 582 |
-
Returns:
|
| 583 |
-
Dict with extracted property fields and confidence scores
|
| 584 |
-
"""
|
| 585 |
-
try:
|
| 586 |
-
# Step 1: Extract frames from video
|
| 587 |
-
logger.info(f"🎬 Extracting up to {max_frames} frames from video...")
|
| 588 |
-
frames = self.extract_frames_from_video(video_bytes, max_frames=max_frames)
|
| 589 |
-
|
| 590 |
-
if not frames:
|
| 591 |
-
logger.error("No frames extracted from video")
|
| 592 |
-
return {
|
| 593 |
-
"bedrooms": None,
|
| 594 |
-
"bathrooms": None,
|
| 595 |
-
"amenities": [],
|
| 596 |
-
"description": "Unable to analyze video - no frames extracted",
|
| 597 |
-
"title": "Property Video",
|
| 598 |
-
"confidence": {},
|
| 599 |
-
"error": "Failed to extract frames from video"
|
| 600 |
-
}
|
| 601 |
-
|
| 602 |
-
logger.info(f"✅ Extracted {len(frames)} frames, analyzing each frame...")
|
| 603 |
-
|
| 604 |
-
# Step 2: Analyze each frame as an image
|
| 605 |
-
frame_results = []
|
| 606 |
-
for idx, frame in enumerate(frames):
|
| 607 |
-
logger.info(f"Analyzing frame {idx + 1}/{len(frames)}...")
|
| 608 |
-
|
| 609 |
-
# Convert PIL Image to bytes for analysis
|
| 610 |
-
frame_bytes = io.BytesIO()
|
| 611 |
-
frame.save(frame_bytes, format='JPEG')
|
| 612 |
-
frame_bytes.seek(0)
|
| 613 |
-
|
| 614 |
-
# Analyze this frame
|
| 615 |
-
frame_data = self.extract_property_fields(frame_bytes.getvalue(), location=location)
|
| 616 |
-
frame_results.append(frame_data)
|
| 617 |
-
|
| 618 |
-
# Step 3: Merge results from all frames
|
| 619 |
-
logger.info(f"Merging results from {len(frame_results)} analyzed frames...")
|
| 620 |
-
consolidated = self.merge_multiple_image_results(frame_results)
|
| 621 |
-
|
| 622 |
-
logger.info(f"✅ Video analysis complete: {consolidated.get('bedrooms')} beds, {consolidated.get('bathrooms')} baths, {len(consolidated.get('amenities', []))} amenities")
|
| 623 |
-
|
| 624 |
-
return consolidated
|
| 625 |
-
|
| 626 |
-
except Exception as e:
|
| 627 |
-
logger.error(f"Error analyzing video: {str(e)}")
|
| 628 |
-
return {
|
| 629 |
-
"bedrooms": None,
|
| 630 |
-
"bathrooms": None,
|
| 631 |
-
"amenities": [],
|
| 632 |
-
"description": "",
|
| 633 |
-
"title": "",
|
| 634 |
-
"confidence": {},
|
| 635 |
-
"error": str(e)
|
| 636 |
-
}
|
| 637 |
-
|
| 638 |
-
# ============================================================
|
| 639 |
-
# Utility Methods
|
| 640 |
-
# ============================================================
|
| 641 |
-
|
| 642 |
-
def merge_multiple_image_results(self, results_list: List[Dict]) -> Dict:
|
| 643 |
-
"""
|
| 644 |
-
Merge results from multiple images into single listing data
|
| 645 |
-
|
| 646 |
-
Args:
|
| 647 |
-
results_list: List of extracted field dicts from different images
|
| 648 |
-
|
| 649 |
-
Returns:
|
| 650 |
-
Consolidated dict with most likely values
|
| 651 |
-
"""
|
| 652 |
-
if not results_list:
|
| 653 |
-
return {}
|
| 654 |
-
|
| 655 |
-
consolidated = {
|
| 656 |
-
"bedrooms": None,
|
| 657 |
-
"bathrooms": None,
|
| 658 |
-
"amenities": [],
|
| 659 |
-
"description": "",
|
| 660 |
-
"confidence": {}
|
| 661 |
-
}
|
| 662 |
-
|
| 663 |
-
# Bedrooms: take highest count mentioned
|
| 664 |
-
bedrooms_list = [r.get("bedrooms") for r in results_list if r.get("bedrooms")]
|
| 665 |
-
if bedrooms_list:
|
| 666 |
-
consolidated["bedrooms"] = max(bedrooms_list)
|
| 667 |
-
consolidated["confidence"]["bedrooms"] = sum(
|
| 668 |
-
[r.get("confidence", {}).get("bedrooms", 0)
|
| 669 |
-
for r in results_list]
|
| 670 |
-
) / len(results_list)
|
| 671 |
-
|
| 672 |
-
# Bathrooms: take highest count mentioned
|
| 673 |
-
bathrooms_list = [r.get("bathrooms") for r in results_list if r.get("bathrooms")]
|
| 674 |
-
if bathrooms_list:
|
| 675 |
-
consolidated["bathrooms"] = max(bathrooms_list)
|
| 676 |
-
consolidated["confidence"]["bathrooms"] = sum(
|
| 677 |
-
[r.get("confidence", {}).get("bathrooms", 0)
|
| 678 |
-
for r in results_list]
|
| 679 |
-
) / len(results_list)
|
| 680 |
-
|
| 681 |
-
# Amenities: deduplicate and combine
|
| 682 |
-
all_amenities = set()
|
| 683 |
-
for result in results_list:
|
| 684 |
-
all_amenities.update(result.get("amenities", []))
|
| 685 |
-
consolidated["amenities"] = list(all_amenities)
|
| 686 |
-
consolidated["confidence"]["amenities"] = sum(
|
| 687 |
-
[r.get("confidence", {}).get("amenities", 0)
|
| 688 |
-
for r in results_list]
|
| 689 |
-
) / len(results_list)
|
| 690 |
-
|
| 691 |
-
# Description: use longest one
|
| 692 |
-
descriptions = [r.get("description", "") for r in results_list if r.get("description")]
|
| 693 |
-
if descriptions:
|
| 694 |
-
consolidated["description"] = max(descriptions, key=len)
|
| 695 |
-
consolidated["confidence"]["description"] = 0.8
|
| 696 |
-
|
| 697 |
-
return consolidated
|
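Reviewer note: the merge rule in the removed `merge_multiple_image_results` can be sketched on its own, which may help if the feature is ever restored. The function name `merge_results` is hypothetical, and sorting the amenities is a deviation added here for deterministic output (the removed code returned `list(set(...))`). As in the removed code, confidences are averaged over all results, including those that lack the field, and a count of `0` is treated as missing.

```python
from typing import Dict, List


def merge_results(results: List[Dict]) -> Dict:
    """Hypothetical standalone sketch of the removed merge logic."""
    if not results:
        return {}

    merged: Dict = {
        "bedrooms": None,
        "bathrooms": None,
        "amenities": [],
        "description": "",
        "confidence": {},
    }

    def mean_confidence(field: str) -> float:
        # Averaged over every result, even those without the field.
        total = sum(r.get("confidence", {}).get(field, 0) for r in results)
        return total / len(results)

    # Room counts: keep the highest value any result reported.
    for field in ("bedrooms", "bathrooms"):
        counts = [r[field] for r in results if r.get(field)]
        if counts:
            merged[field] = max(counts)
            merged["confidence"][field] = mean_confidence(field)

    # Amenities: union across results (sorted here only for stable output).
    amenities = set()
    for r in results:
        amenities.update(r.get("amenities", []))
    merged["amenities"] = sorted(amenities)
    merged["confidence"]["amenities"] = mean_confidence("amenities")

    # Description: keep the longest one, with the fixed 0.8 confidence.
    descriptions = [r["description"] for r in results if r.get("description")]
    if descriptions:
        merged["description"] = max(descriptions, key=len)
        merged["confidence"]["description"] = 0.8

    return merged
```

Taking the maximum room count assumes that no single frame overstates the count; one bad frame would propagate into the merged listing.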
app/config.py
CHANGED
|
@@ -114,13 +114,15 @@ class Settings(BaseSettings):
|
|
| 114 |
HF_WHISPER_MODEL: str = os.getenv("HF_WHISPER_MODEL", "openai/whisper-large-v3")
|
| 115 |
|
| 116 |
# ------------------------------------------------------------------
|
| 117 |
-
# Vision AI (Property Analysis)
|
| 118 |
# ------------------------------------------------------------------
|
| 119 |
-
#
|
| 120 |
-
#
|
| 121 |
-
|
| 122 |
-
|
| 123 |
-
|
| 124 |
|
| 125 |
# ------------------------------------------------------------------
|
| 126 |
# LLM / Tooling keys
|
|
@@ -141,6 +143,12 @@ class Settings(BaseSettings):
|
|
| 141 |
LANGCHAIN_TRACING_V2: bool = os.getenv("LANGCHAIN_TRACING_V2", "false").lower() == "true"
|
| 142 |
LANGCHAIN_API_KEY: str = os.getenv("LANGCHAIN_API_KEY", "")
|
| 143 |
LANGCHAIN_PROJECT: str = os.getenv("LANGCHAIN_PROJECT", "aida_agent")
|
| 144 |
|
| 145 |
# ============ REDIS (SESSION & MEMORY) ============
|
| 146 |
REDIS_URL: str = os.getenv("REDIS_URL", "redis://localhost:6379")
|
|
|
|
| 114 |
HF_WHISPER_MODEL: str = os.getenv("HF_WHISPER_MODEL", "openai/whisper-large-v3")
|
| 115 |
|
| 116 |
# ------------------------------------------------------------------
|
| 117 |
+
# Vision AI (Property Analysis) - DISABLED
|
| 118 |
# ------------------------------------------------------------------
|
| 119 |
+
# NOTE: Vision analysis is NOT in use. Image uploads are handled
|
| 120 |
+
# directly by Cloudflare Worker (frontend upload).
|
| 121 |
+
# These settings are kept for future reference only.
|
| 122 |
+
# ------------------------------------------------------------------
|
| 123 |
+
# HF_VISION_MODEL: str = os.getenv("HF_VISION_MODEL", "Salesforce/blip-image-captioning-large")
|
| 124 |
+
# HF_VISION_API_ENABLED: bool = os.getenv("HF_VISION_API_ENABLED", "true").lower() == "true"
|
| 125 |
+
# PROPERTY_IMAGE_MIN_CONFIDENCE: float = float(os.getenv("PROPERTY_IMAGE_MIN_CONFIDENCE", "0.6"))
|
| 126 |
|
| 127 |
# ------------------------------------------------------------------
|
| 128 |
# LLM / Tooling keys
|
|
|
|
| 143 |
LANGCHAIN_TRACING_V2: bool = os.getenv("LANGCHAIN_TRACING_V2", "false").lower() == "true"
|
| 144 |
LANGCHAIN_API_KEY: str = os.getenv("LANGCHAIN_API_KEY", "")
|
| 145 |
LANGCHAIN_PROJECT: str = os.getenv("LANGCHAIN_PROJECT", "aida_agent")
|
| 146 |
+
|
| 147 |
+
# ============ AGENT LIGHTNING (RL TRAINING) ============
|
| 148 |
+
# Enable trajectory capture for reinforcement learning
|
| 149 |
+
# Set LIGHTNING_ENABLED=true in .env to start collecting training data
|
| 150 |
+
LIGHTNING_ENABLED: bool = os.getenv("LIGHTNING_ENABLED", "false").lower() == "true"
|
| 151 |
+
LIGHTNING_TRAJECTORY_TTL_DAYS: int = int(os.getenv("LIGHTNING_TRAJECTORY_TTL_DAYS", "30"))
|
| 152 |
|
| 153 |
# ============ REDIS (SESSION & MEMORY) ============
|
| 154 |
REDIS_URL: str = os.getenv("REDIS_URL", "redis://localhost:6379")
|
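Reviewer note: the new `LIGHTNING_*` settings reuse the file's existing environment-parsing pattern. A minimal sketch of that pattern follows, using hypothetical helper names `env_flag` and `env_int` that do not appear in the diff. It shows that only the string `true` (in any letter case) enables a flag; values such as `1` or `yes` leave it disabled.

```python
import os


def env_flag(name: str, default: str = "false") -> bool:
    """Hypothetical helper: True only when the variable equals 'true' (case-insensitive)."""
    return os.getenv(name, default).lower() == "true"


def env_int(name: str, default: str) -> int:
    """Hypothetical helper: parse an integer setting, raising ValueError on bad input."""
    return int(os.getenv(name, default))


if __name__ == "__main__":
    os.environ["LIGHTNING_ENABLED"] = "True"
    print(env_flag("LIGHTNING_ENABLED"))                    # enabled
    os.environ["LIGHTNING_ENABLED"] = "1"
    print(env_flag("LIGHTNING_ENABLED"))                    # not enabled
    print(env_int("LIGHTNING_TRAJECTORY_TTL_DAYS", "30"))
```

Because `int(...)` runs when the settings class is defined, a non-numeric `LIGHTNING_TRAJECTORY_TTL_DAYS` would fail at import time rather than at first use.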
app/routes/auth.py
CHANGED
|
@@ -11,6 +11,7 @@ from app.schemas.auth import (
|
|
| 11 |
ResetPasswordDto,
|
| 12 |
ResendOtpDto,
|
| 13 |
)
|
|
|
|
| 14 |
from app.services.auth_service import auth_service
|
| 15 |
from app.services.user_service import user_service
|
| 16 |
from app.services.otp_service import otp_service
|
|
@@ -162,6 +163,61 @@ async def get_current_user_profile(current_user: dict = Depends(get_current_user
|
|
| 162 |
logger.info(f"Get current user profile: {current_user.get('user_id')}")
|
| 163 |
return await user_service.get_current_user_profile(current_user.get("user_id"))
|
| 164 |
|
| 165 |
# ============================================================
|
| 166 |
# LOGOUT ENDPOINT
|
| 167 |
# ============================================================
|
|
|
|
| 11 |
ResetPasswordDto,
|
| 12 |
ResendOtpDto,
|
| 13 |
)
|
| 14 |
+
from app.schemas.user import ProfileUpdateRequest, ProfileUpdateResponse
|
| 15 |
from app.services.auth_service import auth_service
|
| 16 |
from app.services.user_service import user_service
|
| 17 |
from app.services.otp_service import otp_service
|
|
|
|
| 163 |
logger.info(f"Get current user profile: {current_user.get('user_id')}")
|
| 164 |
return await user_service.get_current_user_profile(current_user.get("user_id"))
|
| 165 |
|
| 166 |
+
|
| 167 |
+
@router.patch("/profile", status_code=status.HTTP_200_OK, response_model=ProfileUpdateResponse)
|
| 168 |
+
async def update_current_user_profile(
|
| 169 |
+
profile_data: ProfileUpdateRequest,
|
| 170 |
+
current_user: dict = Depends(get_current_user)
|
| 171 |
+
):
|
| 172 |
+
"""
|
| 173 |
+
Update Current User Profile
|
| 174 |
+
|
| 175 |
+
Update the logged-in user's profile information.
|
| 176 |
+
All fields are optional - only include fields you want to update.
|
| 177 |
+
|
| 178 |
+
**Allowed fields:**
|
| 179 |
+
- `firstName`: User's first name (1-50 chars)
|
| 180 |
+
- `lastName`: User's last name (1-50 chars)
|
| 181 |
+
- `bio`: Short bio (max 150 chars)
|
| 182 |
+
- `location`: Location in "City, Country" format (max 100 chars)
|
| 183 |
+
- `languages`: Array of languages spoken (max 3)
|
| 184 |
+
- `profilePicture`: URL to profile picture
|
| 185 |
+
|
| 186 |
+
**Requires:** Bearer token in Authorization header
|
| 187 |
+
|
| 188 |
+
**Example request body:**
|
| 189 |
+
```json
|
| 190 |
+
{
|
| 191 |
+
"firstName": "John",
|
| 192 |
+
"lastName": "Doe",
|
| 193 |
+
"bio": "Real estate enthusiast",
|
| 194 |
+
"location": "Cotonou, Benin",
|
| 195 |
+
"languages": ["English", "French"]
|
| 196 |
+
}
|
| 197 |
+
```
|
| 198 |
+
"""
|
| 199 |
+
user_id = current_user.get("user_id")
|
| 200 |
+
logger.info(f"Update user profile: {user_id}")
|
| 201 |
+
|
| 202 |
+
# Convert to dict and remove None values
|
| 203 |
+
update_data = {k: v for k, v in profile_data.model_dump().items() if v is not None}
|
| 204 |
+
|
| 205 |
+
if not update_data:
|
| 206 |
+
raise HTTPException(
|
| 207 |
+
status_code=status.HTTP_400_BAD_REQUEST,
|
| 208 |
+
detail="No fields to update. Please provide at least one field."
|
| 209 |
+
)
|
| 210 |
+
|
| 211 |
+
# Validate languages count
|
| 212 |
+
if "languages" in update_data and len(update_data["languages"]) > 3:
|
| 213 |
+
raise HTTPException(
|
| 214 |
+
status_code=status.HTTP_400_BAD_REQUEST,
|
| 215 |
+
detail="Maximum 3 languages allowed"
|
| 216 |
+
)
|
| 217 |
+
|
| 218 |
+
return await user_service.update_user_profile(user_id, update_data)
|
| 219 |
+
|
| 220 |
+
|
| 221 |
# ============================================================
|
| 222 |
# LOGOUT ENDPOINT
|
| 223 |
# ============================================================
|
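Reviewer note: the payload handling in the new `PATCH /profile` handler can be sketched without FastAPI. The function `build_update_data` and the use of `ValueError` in place of `HTTPException` are assumptions made for this illustration; the filtering and the three-language cap mirror the diff.

```python
from typing import Any, Dict

MAX_LANGUAGES = 3  # mirrors the limit enforced in the handler


def build_update_data(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical sketch: drop None values, then apply the handler's two checks."""
    update = {key: value for key, value in payload.items() if value is not None}

    if not update:
        raise ValueError("No fields to update. Please provide at least one field.")

    if "languages" in update and len(update["languages"]) > MAX_LANGUAGES:
        raise ValueError("Maximum 3 languages allowed")

    return update


if __name__ == "__main__":
    print(build_update_data({"firstName": "John", "bio": None}))
```

One consequence of dropping `None` values is that a client cannot clear a field by sending `null`; empty strings and empty lists still pass through because they are not `None`.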
app/routes/media_upload.py
CHANGED
|
@@ -1,282 +1,33 @@
|
|
| 1 |
# ============================================================
|
| 2 |
# app/routes/media_upload.py
|
| 3 |
-
# Media Upload
|
| 4 |
-
#
|
|
|
|
|
|
|
| 5 |
# ============================================================
|
| 6 |
|
| 7 |
-
import io
|
| 8 |
import logging
|
| 9 |
-
from
|
| 10 |
-
from fastapi import APIRouter,
|
| 11 |
-
from fastapi.responses import JSONResponse
|
| 12 |
-
import cloudinary
|
| 13 |
-
import cloudinary.uploader
|
| 14 |
from app.config import settings
|
| 15 |
-
# PAUSED: Vision service temporarily disabled
|
| 16 |
-
# from app.ai.services.vision_service import VisionService
|
| 17 |
-
from app.guards.jwt_guard import get_current_user
|
| 18 |
-
from app.core.llm_router import LLMRouter, TaskComplexity
|
| 19 |
|
| 20 |
logger = logging.getLogger(__name__)
|
| 21 |
|
| 22 |
router = APIRouter(prefix="/listings", tags=["media"])
|
| 23 |
|
| 24 |
-
# PAUSED: Vision Service temporarily disabled
|
| 25 |
-
# vision_service = VisionService()
|
| 26 |
-
|
| 27 |
-
# Initialize LLM Router for generating personalized messages
|
| 28 |
-
llm_router = LLMRouter()
|
| 29 |
-
|
| 30 |
-
# Configure Cloudinary
|
| 31 |
-
if settings.CLOUDINARY_CLOUD_NAME:
|
| 32 |
-
cloudinary.config(
|
| 33 |
-
cloud_name=settings.CLOUDINARY_CLOUD_NAME,
|
| 34 |
-
api_key=settings.CLOUDINARY_API_KEY,
|
| 35 |
-
api_secret=settings.CLOUDINARY_API_SECRET,
|
| 36 |
-
secure=True
|
| 37 |
-
)
|
| 38 |
-
|
| 39 |
-
|
| 40 |
-
# ============================================================
|
| 41 |
-
# File Validation & Limits
|
| 42 |
-
# ============================================================
|
| 43 |
-
|
| 44 |
-
ALLOWED_IMAGE_TYPES = {"image/jpeg", "image/png", "image/webp"}
|
| 45 |
-
ALLOWED_VIDEO_TYPES = {"video/mp4", "video/quicktime", "video/x-msvideo"}
|
| 46 |
-
MAX_IMAGE_SIZE = 10 * 1024 * 1024 # 10MB
|
| 47 |
-
MAX_VIDEO_SIZE = 100 * 1024 * 1024 # 100MB
|
| 48 |
-
MAX_IMAGES_PER_UPLOAD = 10
|
| 49 |
-
MAX_VIDEO_DURATION = 300 # 5 minutes
|
| 50 |
-
|
| 51 |
-
|
| 52 |
-
# ============================================================
|
| 53 |
-
# Helper Functions
|
| 54 |
-
# ============================================================
|
| 55 |
-
|
| 56 |
-
async def validate_image_file(file: UploadFile) -> bytes:
|
| 57 |
-
"""Validate and read image file"""
|
| 58 |
-
if file.content_type not in ALLOWED_IMAGE_TYPES:
|
| 59 |
-
raise HTTPException(
|
| 60 |
-
status_code=status.HTTP_400_BAD_REQUEST,
|
| 61 |
-
detail=f"Invalid image type. Allowed: {', '.join(ALLOWED_IMAGE_TYPES)}"
|
| 62 |
-
)
|
| 63 |
-
|
| 64 |
-
contents = await file.read()
|
| 65 |
-
if len(contents) > MAX_IMAGE_SIZE:
|
| 66 |
-
raise HTTPException(
|
| 67 |
-
status_code=status.HTTP_413_REQUEST_ENTITY_TOO_LARGE,
|
| 68 |
-
detail=f"Image size exceeds {MAX_IMAGE_SIZE / 1024 / 1024}MB limit"
|
| 69 |
-
)
|
| 70 |
-
|
| 71 |
-
return contents
|
| 72 |
-
|
| 73 |
-
|
| 74 |
-
async def validate_video_file(file: UploadFile) -> bytes:
|
| 75 |
-
"""Validate and read video file"""
|
| 76 |
-
if file.content_type not in ALLOWED_VIDEO_TYPES:
|
| 77 |
-
raise HTTPException(
|
| 78 |
-
status_code=status.HTTP_400_BAD_REQUEST,
|
| 79 |
-
detail=f"Invalid video type. Allowed: {', '.join(ALLOWED_VIDEO_TYPES)}"
|
| 80 |
-
)
|
| 81 |
-
|
| 82 |
-
contents = await file.read()
|
| 83 |
-
if len(contents) > MAX_VIDEO_SIZE:
|
| 84 |
-
raise HTTPException(
|
| 85 |
-
status_code=status.HTTP_413_REQUEST_ENTITY_TOO_LARGE,
|
| 86 |
-
detail=f"Video size exceeds {MAX_VIDEO_SIZE / 1024 / 1024}MB limit"
|
| 87 |
-
)
|
| 88 |
-
|
| 89 |
-
return contents
|
| 90 |
-
|
| 91 |
-
|
| 92 |
-
def generate_intelligent_filename(
|
| 93 |
-
original_filename: str,
|
| 94 |
-
location: Optional[str] = None,
|
| 95 |
-
title: Optional[str] = None,
|
| 96 |
-
index: int = 0
|
| 97 |
-
) -> str:
|
| 98 |
-
"""
|
| 99 |
-
Generate intelligent filename for uploaded image
|
| 100 |
-
|
| 101 |
-
Pattern: {location}_{title}_{date}_{index}.jpg
|
| 102 |
-
Example: Lagos_Modern_Apartment_2025_01_31_1.jpg
|
| 103 |
-
|
| 104 |
-
The Cloudflare worker will handle duplicates by appending numbers
|
| 105 |
-
"""
|
| 106 |
-
from datetime import datetime
|
| 107 |
-
|
| 108 |
-
# Get original extension
|
| 109 |
-
_, ext = original_filename.rsplit('.', 1) if '.' in original_filename else (original_filename, 'jpg')
|
| 110 |
-
ext = ext.lower()
|
| 111 |
-
if ext not in ['jpg', 'jpeg', 'png', 'webp']:
|
| 112 |
-
ext = 'jpg'
|
| 113 |
-
|
| 114 |
-
# Build filename components
|
| 115 |
-
parts = []
|
| 116 |
-
|
| 117 |
-
# Add location if available
|
| 118 |
-
if location:
|
| 119 |
-
clean_location = location.replace(' ', '_').replace(',', '').lower()[:20]
|
| 120 |
-
parts.append(clean_location)
|
| 121 |
-
|
| 122 |
-
# Add title if available (first 20 chars)
|
| 123 |
-
if title:
|
| 124 |
-
clean_title = title.replace(' ', '_').replace(',', '').lower()[:20]
|
| 125 |
-
parts.append(clean_title)
|
| 126 |
-
|
| 127 |
-
# Add timestamp
|
| 128 |
-
timestamp = datetime.utcnow().strftime("%Y_%m_%d_%H%M%S")
|
| 129 |
-
parts.append(timestamp)
|
| 130 |
-
|
| 131 |
-
# Add index if multiple images
|
| 132 |
-
if index > 0:
|
| 133 |
-
parts.append(str(index))
|
| 134 |
-
|
| 135 |
-
filename = "_".join(parts)
|
| 136 |
-
return f"{filename}.{ext}"
|
| 137 |
-
|
| 138 |
-
|
| 139 |
-
async def upload_to_cloudflare(file_bytes: bytes, filename: str, meaningful_name: str = None) -> str:
|
| 140 |
-
"""
|
| 141 |
-
Upload image/video to Cloudflare R2
|
| 142 |
-
|
| 143 |
-
Args:
|
| 144 |
-
file_bytes: File bytes
|
| 145 |
-
filename: Original filename
|
| 146 |
-
meaningful_name: AI-generated meaningful filename (optional)
|
| 147 |
-
|
| 148 |
-
Returns:
|
| 149 |
-
Public URL of uploaded file
|
| 150 |
-
"""
|
| 151 |
-
import boto3
|
| 152 |
-
from botocore.config import Config
|
| 153 |
-
import os
|
| 154 |
-
from datetime import datetime
|
| 155 |
-
|
| 156 |
-
try:
|
| 157 |
-
# Use meaningful name if provided, otherwise original filename
|
| 158 |
-
final_filename = meaningful_name or filename
|
| 159 |
-
|
| 160 |
-
# Initialize R2 client
|
| 161 |
-
r2_client = boto3.client(
|
| 162 |
-
's3',
|
| 163 |
-
endpoint_url=settings.CF_R2_ENDPOINT,
|
| 164 |
-
aws_access_key_id=settings.CF_R2_ACCESS_KEY_ID,
|
| 165 |
-
aws_secret_access_key=settings.CF_R2_SECRET_ACCESS_KEY,
|
| 166 |
-
config=Config(
|
| 167 |
-
signature_version='s3v4',
|
| 168 |
-
s3={'addressing_style': 'path'}
|
| 169 |
-
),
|
| 170 |
-
region_name='auto'
|
| 171 |
-
)
|
| 172 |
-
|
| 173 |
-
# Determine content type based on file extension
|
| 174 |
-
ext = os.path.splitext(final_filename)[1].lower()
|
| 175 |
-
content_type_map = {
|
| 176 |
-
'.jpg': 'image/jpeg',
|
| 177 |
-
'.jpeg': 'image/jpeg',
|
| 178 |
-
'.png': 'image/png',
|
| 179 |
-
'.webp': 'image/webp',
|
| 180 |
-
'.mp4': 'video/mp4',
|
| 181 |
-
'.mov': 'video/quicktime',
|
| 182 |
-
'.avi': 'video/x-msvideo'
|
| 183 |
-
}
|
| 184 |
-
content_type = content_type_map.get(ext, 'application/octet-stream')
|
| 185 |
-
|
| 186 |
-
# Create folder structure: media/YYYY/MM/filename
|
| 187 |
-
now = datetime.utcnow()
|
| 188 |
-
folder_path = f"media/{now.year}/{now.month:02d}"
|
| 189 |
-
object_key = f"{folder_path}/{final_filename}"
|
| 190 |
-
|
| 191 |
-
# Upload to R2 (use lojiz-audio bucket for now, or create lojiz-media bucket)
|
| 192 |
-
bucket_name = settings.CF_R2_BUCKET_NAME
|
| 193 |
-
|
| 194 |
-
r2_client.put_object(
|
| 195 |
-
Bucket=bucket_name,
|
| 196 |
-
Key=object_key,
|
| 197 |
-
Body=file_bytes,
|
| 198 |
-
ContentType=content_type,
|
| 199 |
-
CacheControl='public, max-age=31536000', # Cache for 1 year
|
| 200 |
-
)
|
| 201 |
-
|
| 202 |
-
# Construct public URL
|
| 203 |
-
public_url = f"{settings.CF_R2_PUBLIC_URL}/{object_key}"
|
| 204 |
-
|
| 205 |
-
logger.info(f"✅ Uploaded to Cloudflare R2: {public_url}")
|
| 206 |
-
return public_url
|
| 207 |
-
|
| 208 |
-
except Exception as e:
|
| 209 |
-
logger.error(f"❌ Error uploading to Cloudflare R2: {str(e)}")
|
| 210 |
-
raise HTTPException(
|
| 211 |
-
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
| 212 |
-
detail=f"Failed to upload to cloud storage: {str(e)}"
|
| 213 |
-
)
|
| 214 |
-
|
| 215 |
-
|
| 216 |
-
async def upload_to_cloudinary(file_bytes: bytes, filename: str, resource_type: str = "video") -> str:
|
| 217 |
-
"""Upload video to Cloudinary"""
|
| 218 |
-
try:
|
| 219 |
-
file_obj = io.BytesIO(file_bytes)
|
| 220 |
-
|
| 221 |
-
result = cloudinary.uploader.upload(
|
| 222 |
-
file_obj,
|
| 223 |
-
resource_type=resource_type,
|
| 224 |
-
folder="lojiz/property-videos",
|
| 225 |
-
public_id=filename.split(".")[0],
|
| 226 |
-
overwrite=True,
|
| 227 |
-
quality="auto",
|
| 228 |
-
fetch_format="auto"
|
| 229 |
-
)
|
| 230 |
-
|
| 231 |
-
return result.get("secure_url", "")
|
| 232 |
-
|
| 233 |
-
except Exception as e:
|
| 234 |
-
logger.error(f"Error uploading to Cloudinary: {str(e)}")
|
| 235 |
-
raise HTTPException(
|
| 236 |
-
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
|
| 237 |
-
detail="Failed to upload video to Cloudinary"
|
| 238 |
-
)
|
| 239 |
-
|
| 240 |
|
| 241 |
# ============================================================
|
| 242 |
# API Endpoints
|
| 243 |
# ============================================================
|
| 244 |
|
| 245 |
-
@router.post("/analyze-images")
|
| 246 |
-
async def analyze_property_images(
|
| 247 |
-
images: List[UploadFile] = File(...),
|
| 248 |
-
listing_method: str = "image",
|
| 249 |
-
location: Optional[str] = None,
|
| 250 |
-
user_input: Optional[str] = Form(None),
|
| 251 |
-
session_id: Optional[str] = Form(None),
|
| 252 |
-
current_user = Depends(get_current_user)
|
| 253 |
-
):
|
| 254 |
-
"""
|
| 255 |
-
🚧 VISION FEATURE PAUSED 🚧
|
| 256 |
-
|
| 257 |
-
This route is temporarily disabled while we set up a reliable vision provider.
|
| 258 |
-
For now, use /upload-images for direct image upload without AI analysis.
|
| 259 |
-
"""
|
| 260 |
-
raise HTTPException(
|
| 261 |
-
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
|
| 262 |
-
detail={
|
| 263 |
-
"message": "Vision analysis temporarily unavailable",
|
| 264 |
-
"suggestion": "Use /listings/upload-images for direct upload without AI analysis",
|
| 265 |
-
"feature_status": "paused"
|
| 266 |
-
}
|
| 267 |
-
)
|
| 268 |
-
|
| 269 |
-
|
| 270 |
-
|
| 271 |
-
|
| 272 |
@router.get("/upload-config")
|
| 273 |
async def get_upload_configuration():
|
| 274 |
"""
|
| 275 |
Get image upload configuration for frontend.
|
| 276 |
-
|
| 277 |
Frontend should upload images DIRECTLY to the Cloudflare Worker URL.
|
| 278 |
No backend processing involved.
|
| 279 |
-
|
| 280 |
Returns:
|
| 281 |
{
|
| 282 |
"worker_url": "https://image-upload-worker.destinyebuka7.workers.dev",
|
|
@@ -290,55 +41,60 @@ async def get_upload_configuration():
|
|
| 290 |
"max_file_size_mb": 5,
|
| 291 |
"allowed_types": ["image/jpeg", "image/png", "image/webp"],
|
| 292 |
"instructions": {
|
| 293 |
-
"
|
| 294 |
-
|
| 295 |
-
|
| 296 |
-
|
| 297 |
}
|
| 298 |
}
|
| 299 |
|
| 300 |
|
| 301 |
@router.post("/get-image-name")
|
| 302 |
-
|
| 303 |
-
|
| 304 |
async def get_image_name_for_upload(
|
| 305 |
user_id: str = Form(...),
|
| 306 |
session_id: str = Form(...)
|
| 307 |
):
|
| 308 |
"""
|
| 309 |
Generate intelligent filename for Cloudflare Worker uploads.
|
| 310 |
-
|
| 311 |
-
Called by Cloudflare Worker before uploading
|
| 312 |
Returns a descriptive filename based on current listing context.
|
| 313 |
-
|
| 314 |
Args:
|
| 315 |
user_id: User ID
|
| 316 |
session_id: Current session ID
|
| 317 |
-
|
| 318 |
Returns:
|
| 319 |
{"name": "lagos_modern_apartment_2bed"}
|
| 320 |
"""
|
| 321 |
try:
|
| 322 |
from app.ai.services.conversation_service import ConversationService
|
| 323 |
-
|
| 324 |
-
|
| 325 |
# Get current conversation state to extract listing context
|
| 326 |
conv_service = ConversationService()
|
| 327 |
state = await conv_service.get_or_create_conversation(user_id, session_id)
|
| 328 |
-
|
| 329 |
# Extract relevant fields for filename
|
| 330 |
location = state.provided_fields.get("location", "")
|
| 331 |
title = state.provided_fields.get("title", "")
|
| 332 |
bedrooms = state.provided_fields.get("bedrooms")
|
| 333 |
listing_type = state.provided_fields.get("listing_type", "property")
|
| 334 |
-
|
| 335 |
# Build intelligent filename
|
| 336 |
parts = []
|
| 337 |
-
|
| 338 |
if location:
|
| 339 |
clean_location = location.replace(' ', '_').replace(',', '').lower()[:15]
|
| 340 |
parts.append(clean_location)
|
| 341 |
-
|
| 342 |
if title:
|
| 343 |
# Extract first 2-3 meaningful words from title
|
| 344 |
title_words = [w for w in title.lower().split() if len(w) > 3][:2]
|
|
@@ -346,23 +102,23 @@ async def get_image_name_for_upload(
|
|
| 346 |
parts.extend(title_words)
|
| 347 |
elif bedrooms:
|
| 348 |
parts.append(f"{bedrooms}bed")
|
| 349 |
-
|
| 350 |
if listing_type and listing_type != "property":
|
| 351 |
parts.append(listing_type[:4])
|
| 352 |
-
|
| 353 |
# If we have no context, use generic name
|
| 354 |
if not parts:
|
| 355 |
parts = ["property", datetime.now().strftime("%Y%m%d")]
|
| 356 |
-
|
| 357 |
filename = "_".join(parts)
|
| 358 |
-
|
| 359 |
-
logger.info(f"Generated image name: {filename}
|
| 360 |
-
|
| 361 |
return {
|
| 362 |
"name": filename,
|
| 363 |
"success": True
|
| 364 |
}
|
| 365 |
-
|
| 366 |
except Exception as e:
|
| 367 |
logger.error(f"Failed to generate image name: {str(e)}")
|
| 368 |
# Fallback to timestamp-based name
|
|
@@ -370,53 +126,3 @@ async def get_image_name_for_upload(
|
|
| 370 |
"name": f"property_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
|
| 371 |
"success": True
|
| 372 |
}
|
| 373 |
-
|
| 374 |
-
|
| 375 |
-
# ============================================================
|
| 376 |
-
# DEPRECATED ENDPOINTS (Vision Feature Paused)
|
| 377 |
-
# ============================================================
|
| 378 |
-
|
| 379 |
-
@router.post("/analyze-images")
|
| 380 |
-
async def analyze_property_images_deprecated(current_user = Depends(get_current_user)):
|
| 381 |
-
"""
|
| 382 |
-
🚧 VISION FEATURE PAUSED 🚧
|
| 383 |
-
|
| 384 |
-
Image analysis is now handled by Cloudflare Worker with AI vision.
|
| 385 |
-
Frontend should upload directly to the worker endpoint.
|
| 386 |
-
"""
|
| 387 |
-
raise HTTPException(
|
| 388 |
-
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
|
| 389 |
-
detail={
|
| 390 |
-
"message": "Vision analysis moved to Cloudflare Worker",
|
| 391 |
-
"suggestion": "Upload images to Cloudflare Worker endpoint for direct processing",
|
| 392 |
-
"feature_status": "deprecated"
|
| 393 |
-
}
|
| 394 |
-
)
|
| 395 |
-
|
| 396 |
-
|
| 397 |
-
@router.post("/analyze-video")
|
| 398 |
-
async def analyze_property_video_deprecated(current_user = Depends(get_current_user)):
|
| 399 |
-
"""
|
| 400 |
-
🚧 VISION FEATURE PAUSED 🚧
|
| 401 |
-
"""
|
| 402 |
-
raise HTTPException(
|
| 403 |
-
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
|
| 404 |
-
detail={
|
| 405 |
-
"message": "Video analysis temporarily unavailable",
|
| 406 |
-
"feature_status": "paused"
|
| 407 |
-
}
|
| 408 |
-
)
|
| 409 |
-
|
| 410 |
-
|
| 411 |
-
@router.post("/validate-media")
|
| 412 |
-
async def validate_media_deprecated(current_user = Depends(get_current_user)):
|
| 413 |
-
"""
|
| 414 |
-
🚧 VISION FEATURE PAUSED 🚧
|
| 415 |
-
"""
|
| 416 |
-
raise HTTPException(
|
| 417 |
-
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
|
| 418 |
-
detail={
|
| 419 |
-
"message": "Media validation moved to Cloudflare Worker",
|
| 420 |
-
"feature_status": "deprecated"
|
| 421 |
-
}
|
| 422 |
-
)
|
|
|
|
| 1 |
# ============================================================
|
| 2 |
# app/routes/media_upload.py
|
| 3 |
+
# Media Upload Configuration Routes
|
| 4 |
+
# ============================================================
|
| 5 |
+
# NOTE: Image/video uploads are handled DIRECTLY by Cloudflare Worker
|
| 6 |
+
# This file only provides configuration endpoints for the frontend
|
| 7 |
# ============================================================
|
| 8 |
|
|
|
|
| 9 |
import logging
|
| 10 |
+
from datetime import datetime
|
| 11 |
+
from fastapi import APIRouter, Form, HTTPException, status
|
| 12 |
from app.config import settings
|
| 13 |
|
| 14 |
logger = logging.getLogger(__name__)
|
| 15 |
|
| 16 |
router = APIRouter(prefix="/listings", tags=["media"])
|
| 17 |
|
| 18 |
|
| 19 |
# ============================================================
|
| 20 |
# API Endpoints
|
| 21 |
# ============================================================
|
| 22 |
|
| 23 |
@router.get("/upload-config")
|
| 24 |
async def get_upload_configuration():
|
| 25 |
"""
|
| 26 |
Get image upload configuration for frontend.
|
| 27 |
+
|
| 28 |
Frontend should upload images DIRECTLY to the Cloudflare Worker URL.
|
| 29 |
No backend processing involved.
|
| 30 |
+
|
| 31 |
Returns:
|
| 32 |
{
|
| 33 |
"worker_url": "https://image-upload-worker.destinyebuka7.workers.dev",
|
|
|
|
| 41 |
"max_file_size_mb": 5,
|
| 42 |
"allowed_types": ["image/jpeg", "image/png", "image/webp"],
|
| 43 |
"instructions": {
|
| 44 |
+
"profile_upload": {
|
| 45 |
+
"step_1": "Create FormData with: file, type='profile', user_name, user_id",
|
| 46 |
+
"step_2": "POST to worker_url",
|
| 47 |
+
"step_3": "Worker uploads to Cloudflare, returns URL",
|
| 48 |
+
"step_4": "Include URL in PATCH /auth/profile payload"
|
| 49 |
+
},
|
| 50 |
+
"property_upload": {
|
| 51 |
+
"step_1": "Create FormData with: file, type='property', user_id, session_id",
|
| 52 |
+
"step_2": "POST to worker_url",
|
| 53 |
+
"step_3": "Worker uploads to Cloudflare, returns URL",
|
| 54 |
+
"step_4": "Send URL to AIDA in chat message"
|
| 55 |
+
}
|
| 56 |
}
|
| 57 |
}
|
| 58 |
|
| 59 |
|
| 60 |
@router.post("/get-image-name")
|
|
|
|
|
|
|
| 61 |
async def get_image_name_for_upload(
|
| 62 |
user_id: str = Form(...),
|
| 63 |
session_id: str = Form(...)
|
| 64 |
):
|
| 65 |
"""
|
| 66 |
Generate intelligent filename for Cloudflare Worker uploads.
|
| 67 |
+
|
| 68 |
+
Called by Cloudflare Worker before uploading property images.
|
| 69 |
Returns a descriptive filename based on current listing context.
|
| 70 |
+
|
| 71 |
Args:
|
| 72 |
user_id: User ID
|
| 73 |
session_id: Current session ID
|
| 74 |
+
|
| 75 |
Returns:
|
| 76 |
{"name": "lagos_modern_apartment_2bed"}
|
| 77 |
"""
|
| 78 |
try:
|
| 79 |
from app.ai.services.conversation_service import ConversationService
|
| 80 |
+
|
|
|
|
| 81 |
# Get current conversation state to extract listing context
|
| 82 |
conv_service = ConversationService()
|
| 83 |
state = await conv_service.get_or_create_conversation(user_id, session_id)
|
| 84 |
+
|
| 85 |
# Extract relevant fields for filename
|
| 86 |
location = state.provided_fields.get("location", "")
|
| 87 |
title = state.provided_fields.get("title", "")
|
| 88 |
bedrooms = state.provided_fields.get("bedrooms")
|
| 89 |
listing_type = state.provided_fields.get("listing_type", "property")
|
| 90 |
+
|
| 91 |
# Build intelligent filename
|
| 92 |
parts = []
|
| 93 |
+
|
| 94 |
if location:
|
| 95 |
clean_location = location.replace(' ', '_').replace(',', '').lower()[:15]
|
| 96 |
parts.append(clean_location)
|
| 97 |
+
|
| 98 |
if title:
|
| 99 |
# Extract first 2-3 meaningful words from title
|
| 100 |
title_words = [w for w in title.lower().split() if len(w) > 3][:2]
|
|
|
|
| 102 |
parts.extend(title_words)
|
| 103 |
elif bedrooms:
|
| 104 |
parts.append(f"{bedrooms}bed")
|
| 105 |
+
|
| 106 |
if listing_type and listing_type != "property":
|
| 107 |
parts.append(listing_type[:4])
|
| 108 |
+
|
| 109 |
# If we have no context, use generic name
|
| 110 |
if not parts:
|
| 111 |
parts = ["property", datetime.now().strftime("%Y%m%d")]
|
| 112 |
+
|
| 113 |
filename = "_".join(parts)
|
| 114 |
+
|
| 115 |
+
logger.info(f"Generated image name: {filename} for user: {user_id}")
|
| 116 |
+
|
| 117 |
return {
|
| 118 |
"name": filename,
|
| 119 |
"success": True
|
| 120 |
}
|
| 121 |
+
|
| 122 |
except Exception as e:
|
| 123 |
logger.error(f"Failed to generate image name: {str(e)}")
|
| 124 |
# Fallback to timestamp-based name
|
|
|
|
| 126 |
"name": f"property_{datetime.now().strftime('%Y%m%d_%H%M%S')}",
|
| 127 |
"success": True
|
| 128 |
}
|
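The filename logic in `get_image_name_for_upload` can be isolated as a pure function, which makes it easier to test without a `ConversationService`. The following is a minimal sketch, not the code in this commit: `build_image_name` is a hypothetical helper, and because line 101 of the route is not shown in the diff, the sketch assumes the `elif bedrooms` branch runs only when no usable title words exist.

```python
from datetime import datetime
from typing import Any, Mapping, Optional


def build_image_name(fields: Mapping[str, Any], now: Optional[datetime] = None) -> str:
    """Build a descriptive image filename from listing fields (hypothetical helper)."""
    now = now or datetime.now()
    parts = []

    location = fields.get("location") or ""
    if location:
        # Same cleanup as the route: spaces to underscores, drop commas, cap at 15 chars.
        parts.append(location.replace(" ", "_").replace(",", "").lower()[:15])

    title = fields.get("title") or ""
    bedrooms = fields.get("bedrooms")
    # Keep the first two title words longer than three characters.
    title_words = [w for w in title.lower().split() if len(w) > 3][:2]
    if title_words:
        parts.extend(title_words)
    elif bedrooms:
        # Assumption: bedrooms are only used when the title contributes nothing.
        parts.append(f"{bedrooms}bed")

    listing_type = fields.get("listing_type") or "property"
    if listing_type != "property":
        parts.append(listing_type[:4])

    # No context at all: fall back to a generic dated name.
    if not parts:
        parts = ["property", now.strftime("%Y%m%d")]

    return "_".join(parts)
```

Under that assumption, a listing with both a title and a bedroom count yields a name without the `2bed` suffix, so the docstring example `lagos_modern_apartment_2bed` would only appear if the hidden line structures the branches differently.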
app/schemas/user.py
CHANGED
|
@@ -100,6 +100,69 @@ class UserUpdateDto(BaseModel):
|
|
| 100 |
languages: Optional[list[str]] = None
|
| 101 |
|
| 102 |
|
| 103 |
# ============================================================
|
| 104 |
# Generic Response DTOs
|
| 105 |
# ============================================================
|
|
|
|
| 100 |
languages: Optional[list[str]] = None
|
| 101 |
|
| 102 |
|
| 103 |
+
class ProfileUpdateRequest(BaseModel):
|
| 104 |
+
"""
|
| 105 |
+
Update user profile request body.
|
| 106 |
+
All fields are optional - only include fields you want to update.
|
| 107 |
+
"""
|
| 108 |
+
firstName: Optional[str] = Field(
|
| 109 |
+
None,
|
| 110 |
+
min_length=1,
|
| 111 |
+
max_length=50,
|
| 112 |
+
description="User's first name",
|
| 113 |
+
examples=["John"]
|
| 114 |
+
)
|
| 115 |
+
lastName: Optional[str] = Field(
|
| 116 |
+
None,
|
| 117 |
+
min_length=1,
|
| 118 |
+
max_length=50,
|
| 119 |
+
description="User's last name",
|
| 120 |
+
examples=["Doe"]
|
| 121 |
+
)
|
| 122 |
+
bio: Optional[str] = Field(
|
| 123 |
+
None,
|
| 124 |
+
max_length=150,
|
| 125 |
+
description="Short bio about the user (max 150 characters)",
|
| 126 |
+
examples=["Real estate enthusiast looking for the perfect apartment in Cotonou."]
|
| 127 |
+
)
|
| 128 |
+
location: Optional[str] = Field(
|
| 129 |
+
None,
|
| 130 |
+
max_length=100,
|
| 131 |
+
description="User's location in 'City, Country' format",
|
| 132 |
+
examples=["Cotonou, Benin"]
|
| 133 |
+
)
|
| 134 |
+
languages: Optional[list[str]] = Field(
|
| 135 |
+
None,
|
| 136 |
+
max_length=3,
|
| 137 |
+
description="Languages spoken by the user (max 3)",
|
| 138 |
+
examples=[["English", "French", "Portuguese"]]
|
| 139 |
+
)
|
| 140 |
+
profilePicture: Optional[str] = Field(
|
| 141 |
+
None,
|
| 142 |
+
description="URL to the user's profile picture",
|
| 143 |
+
examples=["https://example.com/images/profile.jpg"]
|
| 144 |
+
)
|
| 145 |
+
|
| 146 |
+
class Config:
|
| 147 |
+
json_schema_extra = {
|
| 148 |
+
"example": {
|
| 149 |
+
"firstName": "John",
|
| 150 |
+
"lastName": "Doe",
|
| 151 |
+
"bio": "Real estate enthusiast",
|
| 152 |
+
"location": "Cotonou, Benin",
|
| 153 |
+
"languages": ["English", "French"],
|
| 154 |
+
"profilePicture": "https://example.com/profile.jpg"
|
| 155 |
+
}
|
| 156 |
+
}
|
| 157 |
+
|
| 158 |
+
|
| 159 |
+
class ProfileUpdateResponse(BaseModel):
|
| 160 |
+
"""Response after updating user profile"""
|
| 161 |
+
success: bool = Field(default=True, description="Whether the update was successful")
|
| 162 |
+
message: str = Field(..., description="Response message")
|
| 163 |
+
data: UserProfileWithReviewsDto = Field(..., description="Updated user profile")
|
| 164 |
+
|
| 165 |
+
|
| 166 |
# ============================================================
|
| 167 |
# Generic Response DTOs
|
| 168 |
# ============================================================
|
cloudflare-worker/image-upload-worker.js
CHANGED
|
@@ -1,9 +1,11 @@
|
|
| 1 |
-
// src/index.js -
|
| 2 |
-
//
|
| 3 |
-
//
|
| 4 |
-
//
|
| 5 |
-
//
|
| 6 |
-
//
|
| 7 |
|
| 8 |
export default {
|
| 9 |
async fetch(request, env) {
|
|
@@ -52,9 +54,11 @@ export default {
|
|
| 52 |
// Parse form data
|
| 53 |
const formData = await request.formData();
|
| 54 |
const imageFile = formData.get("file");
|
| 55 |
-
const
|
| 56 |
const userId = formData.get("user_id") || "";
|
|
|
|
| 57 |
const sessionId = formData.get("session_id") || "";
|
|
|
|
| 58 |
const operation = formData.get("operation") || "add"; // "add" or "replace"
|
| 59 |
const replaceIndex = formData.get("replace_index"); // For replace operations
|
| 60 |
const existingImageId = formData.get("existing_image_id"); // ID of image to replace
|
|
@@ -63,19 +67,26 @@ export default {
|
|
| 63 |
return jsonResponse({ success: false, error: "no_image", message: "No image file provided" }, 400);
|
| 64 |
}
|
| 65 |
|
| 66 |
-
// Convert image to bytes
|
| 67 |
const imageBytes = await imageFile.arrayBuffer();
|
| 68 |
-
const imageArray = [...new Uint8Array(imageBytes)];
|
| 69 |
|
| 70 |
// ============================================================
|
| 71 |
-
//
|
| 72 |
// ============================================================
|
| 73 |
let isPropertyImage = false;
|
| 74 |
let validationReason = "";
|
| 75 |
|
| 76 |
try {
|
| 77 |
const aiResult = await env.AI.run('@cf/llava-hf/llava-1.5-7b-hf', {
|
| 78 |
-
image:
|
| 79 |
prompt: "Is this image showing a real estate property such as a house, apartment, room, building, or property exterior/interior? Answer with ONLY 'YES' or 'NO' followed by a brief reason.",
|
| 80 |
max_tokens: 50
|
| 81 |
});
|
|
@@ -85,14 +96,13 @@ export default {
|
|
| 85 |
validationReason = response;
|
| 86 |
|
| 87 |
} catch (aiError) {
|
| 88 |
-
// If AI fails, allow the image through (fail-open for better UX)
|
| 89 |
console.error("AI validation error:", aiError);
|
| 90 |
isPropertyImage = true;
|
| 91 |
validationReason = "AI validation skipped due to error";
|
| 92 |
}
|
| 93 |
|
| 94 |
-
// If not a property image
|
| 95 |
-
if (!isPropertyImage) {
|
| 96 |
return jsonResponse({
|
| 97 |
success: false,
|
| 98 |
error: "not_property_image",
|
|
@@ -102,13 +112,24 @@ export default {
|
|
| 102 |
session_id: sessionId
|
| 103 |
}, 400);
|
| 104 |
}
|
|
|
|
|
|
|
| 105 |
|
| 106 |
// ============================================================
|
| 107 |
-
//
|
| 108 |
// ============================================================
|
| 109 |
let imageName = "";
|
| 110 |
|
| 111 |
-
if (
|
| 112 |
try {
|
| 113 |
const nameResponse = await fetch(`${AIDA_BASE_URL}/ai/get-image-name`, {
|
| 114 |
method: "POST",
|
|
@@ -132,7 +153,7 @@ export default {
|
|
| 132 |
}
|
| 133 |
|
| 134 |
// ============================================================
|
| 135 |
-
//
|
| 136 |
// ============================================================
|
| 137 |
if (operation === "replace" && existingImageId) {
|
| 138 |
try {
|
|
@@ -167,15 +188,15 @@ export default {
|
|
| 167 |
}
|
| 168 |
|
| 169 |
// ============================================================
|
| 170 |
-
//
|
| 171 |
// ============================================================
|
| 172 |
-
|
| 173 |
// Clean the image name for use as filename
|
| 174 |
const cleanName = imageName
|
| 175 |
.toLowerCase()
|
| 176 |
-
.replace(/[^a-z0-
|
| 177 |
.replace(/^-|-$/g, '')
|
| 178 |
-
|| `
|
| 179 |
|
| 180 |
// Create new FormData for Cloudflare upload
|
| 181 |
const uploadFormData = new FormData();
|
|
@@ -194,12 +215,13 @@ export default {
|
|
| 194 |
const imageId = uploadResponseBody.result.id;
|
| 195 |
const imageUrl = `https://imagedelivery.net/${ACCOUNT_HASH}/${imageId}/public`;
|
| 196 |
|
| 197 |
-
// Return success with all context
|
| 198 |
return jsonResponse({
|
| 199 |
success: true,
|
| 200 |
id: imageId,
|
| 201 |
url: imageUrl,
|
| 202 |
filename: cleanName,
|
|
|
|
| 203 |
message: userMessage,
|
| 204 |
operation: operation,
|
| 205 |
replace_index: replaceIndex,
|
|
|
|
| 1 |
+
// src/index.js - Image Upload Worker
|
| 2 |
+
// ============================================================
|
| 3 |
+
// SUPPORTS:
|
| 4 |
+
// 1. Profile pictures (type=profile) - named as {user_id}/profile.jpg
|
| 5 |
+
// 2. Property images (type=property) - for listing photos
|
| 6 |
+
// ============================================================
|
| 7 |
+
// NOTE: AI Vision validation is PAUSED - not currently in use
|
| 8 |
+
// ============================================================
|
| 9 |
|
| 10 |
export default {
|
| 11 |
async fetch(request, env) {
|
|
|
|
| 54 |
// Parse form data
|
| 55 |
const formData = await request.formData();
|
| 56 |
const imageFile = formData.get("file");
|
| 57 |
+
const uploadType = formData.get("type") || "property"; // "profile" or "property"
|
| 58 |
const userId = formData.get("user_id") || "";
|
| 59 |
+
const userName = formData.get("user_name") || ""; // For profile naming
|
| 60 |
const sessionId = formData.get("session_id") || "";
|
| 61 |
+
const userMessage = formData.get("message") || "";
|
| 62 |
const operation = formData.get("operation") || "add"; // "add" or "replace"
|
| 63 |
const replaceIndex = formData.get("replace_index"); // For replace operations
|
| 64 |
const existingImageId = formData.get("existing_image_id"); // ID of image to replace
|
|
|
|
| 67 |
return jsonResponse({ success: false, error: "no_image", message: "No image file provided" }, 400);
|
| 68 |
}
|
| 69 |
|
| 70 |
+
// Convert image to bytes
|
| 71 |
const imageBytes = await imageFile.arrayBuffer();
|
|
|
|
| 72 |
|
| 73 |
// ============================================================
|
| 74 |
+
// AI VISION VALIDATION - PAUSED
|
| 75 |
+
// ============================================================
|
| 76 |
+
// NOTE: AI Vision validation is currently disabled/paused.
|
| 77 |
+
// The HuggingFace vision API is not being used.
|
| 78 |
+
// All images are allowed through without property validation.
|
| 79 |
+
// To re-enable, uncomment the validation block below.
|
| 80 |
// ============================================================
|
| 81 |
+
|
| 82 |
+
/*
|
| 83 |
+
// PAUSED: AI Vision Validation Block
|
| 84 |
let isPropertyImage = false;
|
| 85 |
let validationReason = "";
|
| 86 |
|
| 87 |
try {
|
| 88 |
const aiResult = await env.AI.run('@cf/llava-hf/llava-1.5-7b-hf', {
|
| 89 |
+
image: [...new Uint8Array(imageBytes)],
|
| 90 |
prompt: "Is this image showing a real estate property such as a house, apartment, room, building, or property exterior/interior? Answer with ONLY 'YES' or 'NO' followed by a brief reason.",
|
| 91 |
max_tokens: 50
|
| 92 |
});
|
|
|
|
| 96 |
validationReason = response;
|
| 97 |
|
| 98 |
} catch (aiError) {
|
|
|
|
| 99 |
console.error("AI validation error:", aiError);
|
| 100 |
isPropertyImage = true;
|
| 101 |
validationReason = "AI validation skipped due to error";
|
| 102 |
}
|
| 103 |
|
| 104 |
+
// If not a property image and type is property, return error
|
| 105 |
+
if (!isPropertyImage && uploadType === "property") {
|
| 106 |
return jsonResponse({
|
| 107 |
success: false,
|
| 108 |
error: "not_property_image",
|
|
|
|
| 112 |
session_id: sessionId
|
| 113 |
}, 400);
|
| 114 |
}
|
| 115 |
+
*/
|
| 116 |
+
// END PAUSED BLOCK
|
| 117 |
|
| 118 |
// ============================================================
|
| 119 |
+
// DETERMINE IMAGE NAME
|
| 120 |
// ============================================================
|
| 121 |
let imageName = "";
|
| 122 |
|
| 123 |
+
if (uploadType === "profile") {
|
| 124 |
+
// Profile pictures: {user_id}/profile or {user_name}/profile
|
| 125 |
+
const identifier = userName || userId || `user_${Date.now()}`;
|
| 126 |
+
const cleanIdentifier = identifier
|
| 127 |
+
.toLowerCase()
|
| 128 |
+
.replace(/[^a-z0-9]+/g, '_')
|
| 129 |
+
.replace(/^_|_$/g, '');
|
| 130 |
+
imageName = `${cleanIdentifier}_profile`;
|
| 131 |
+
} else if (operation === "add") {
|
| 132 |
+
// Property images: get name from AIDA or use timestamp
|
| 133 |
try {
|
| 134 |
const nameResponse = await fetch(`${AIDA_BASE_URL}/ai/get-image-name`, {
|
| 135 |
method: "POST",
|
|
|
|
| 153 |
}
|
| 154 |
|
| 155 |
// ============================================================
|
| 156 |
+
// HANDLE REPLACE OPERATION (delete old image)
|
| 157 |
// ============================================================
|
| 158 |
if (operation === "replace" && existingImageId) {
|
| 159 |
try {
|
|
|
|
| 188 |
}
|
| 189 |
|
| 190 |
// ============================================================
|
| 191 |
+
// UPLOAD TO CLOUDFLARE IMAGES
|
| 192 |
// ============================================================
|
| 193 |
+
|
| 194 |
// Clean the image name for use as filename
|
| 195 |
const cleanName = imageName
|
| 196 |
.toLowerCase()
|
| 197 |
+
.replace(/[^a-z0-9_]+/g, '-')
|
| 198 |
.replace(/^-|-$/g, '')
|
| 199 |
+
|| `image-${Date.now()}`;
|
| 200 |
|
| 201 |
// Create new FormData for Cloudflare upload
|
| 202 |
const uploadFormData = new FormData();
|
|
|
|
| 215 |
const imageId = uploadResponseBody.result.id;
|
| 216 |
const imageUrl = `https://imagedelivery.net/${ACCOUNT_HASH}/${imageId}/public`;
|
| 217 |
|
| 218 |
+
// Return success with all context
|
| 219 |
return jsonResponse({
|
| 220 |
success: true,
|
| 221 |
id: imageId,
|
| 222 |
url: imageUrl,
|
| 223 |
filename: cleanName,
|
| 224 |
+
type: uploadType,
|
| 225 |
message: userMessage,
|
| 226 |
operation: operation,
|
| 227 |
replace_index: replaceIndex,
|
docs/CLARA_RLM_INTEGRATION_PLAN.md
ADDED
|
@@ -0,0 +1,537 @@
|
|
| 1 |
+
# CLaRa + RLM Integration Plan for AIDA
|
| 2 |
+
|
| 3 |
+
**Date**: 2026-02-09
|
| 4 |
+
**Author**: AI Architecture Analysis
|
| 5 |
+
**Status**: Proposal
|
| 6 |
+
|
| 7 |
+
---
|
| 8 |
+
|
| 9 |
+
## Executive Summary
|
| 10 |
+
|
| 11 |
+
This document outlines how **Apple's CLaRa** (Continuous Latent Reasoning) and **MIT's RLM** (Recursive Language Models) can enhance AIDA's current RAG architecture for real estate search.
|
| 12 |
+
|
| 13 |
+
**TL;DR**:
|
| 14 |
+
- **CLaRa**: Compress 4096-dim vectors to 256-dim → 16x faster search, 90% storage savings
|
| 15 |
+
- **RLM**: Enable complex multi-hop reasoning for queries like "3-bed near good schools in safe neighborhood under 500k"
|
| 16 |
+
- **Combined Impact**: 10x performance boost + deeper contextual understanding
|
| 17 |
+
|
| 18 |
+
---
|
| 19 |
+
|
| 20 |
+
## Part 1: Current RAG Implementation Analysis
|
| 21 |
+
|
| 22 |
+
### Architecture Overview
|
| 23 |
+
|
| 24 |
+
```
|
| 25 |
+
┌─────────────────────────────────────────────────────────────────┐
|
| 26 |
+
│                  AIDA Current RAG Architecture                  │
|
| 27 |
+
├─────────────────────────────────────────────────────────────────┤
|
| 28 |
+
│                                                                 │
|
| 29 |
+
│  User Query → Intent Classifier → Search Extractor              │
|
| 30 |
+
│                          ↓                                      │
|
| 31 |
+
│  Strategy Selector (LLM decides):                               │
|
| 32 |
+
│    • MONGO_ONLY (pure filters)                                  │
|
| 33 |
+
│    • QDRANT_ONLY (semantic search)                              │
|
| 34 |
+
│    • MONGO_THEN_QDRANT (filter → semantic)                      │
|
| 35 |
+
│    • QDRANT_THEN_MONGO (semantic → filter)                      │
|
| 36 |
+
│                          ↓                                      │
|
| 37 |
+
│  Embedding Service:                                             │
|
| 38 |
+
│    • Model: qwen/qwen3-embedding-8b (via OpenRouter)            │
|
| 39 |
+
│    • Dimension: 4096                                            │
|
| 40 |
+
│    • Format: "{title}. {beds}-bed in {location}. {description}" │
|
| 41 |
+
│                          ↓                                      │
|
| 42 |
+
│  Qdrant Vector DB:                                              │
|
| 43 |
+
│    • Collection: "listings"                                     │
|
| 44 |
+
│    • ~1000s of listings × 4096 floats/listing = ~16MB+ vectors  │
|
| 45 |
+
│    • Payload: full listing metadata (~50KB per listing)         │
|
| 46 |
+
│                          ↓                                      │
|
| 47 |
+
│  Search Results → Enrich with owner data → Brain LLM → Response │
|
| 48 |
+
│                                                                 │
|
| 49 |
+
└─────────────────────────────────────────────────────────────────┘
|
| 50 |
+
```
|
| 51 |
+
|
| 52 |
+
### Key Files Involved
|
| 53 |
+
|
| 54 |
+
| File | Purpose | RAG Role |
|
| 55 |
+
|------|---------|----------|
|
| 56 |
+
| `search_service.py` | Main search orchestration | Hybrid search execution |
|
| 57 |
+
| `vector_service.py` | Qdrant indexing | Real-time vector upserts |
|
| 58 |
+
| `search_strategy_selector.py` | LLM-based strategy picker | Intelligent routing |
|
| 59 |
+
| `search_extractor.py` | Extract params from query | Query understanding |
|
| 60 |
+
| `brain.py` | Agent reasoning engine | Response generation |
|
| 61 |
+
| `redis_context_memory.py` | Conversation memory | Context retention |
|
| 62 |
+
|
| 63 |
+
### Current Performance Metrics (Estimated)
|
| 64 |
+
|
| 65 |
+
| Metric | Current Value | Bottleneck |
|
| 66 |
+
|--------|--------------|------------|
|
| 67 |
+
| **Vector Size** | 4096 floats × 4 bytes = 16KB/listing | Storage & bandwidth |
|
| 68 |
+
| **Search Latency** | ~200-500ms (embedding + search + enrichment) | Multiple network calls |
|
| 69 |
+
| **Memory Usage** | 16KB vectors + 50KB payload = 66KB/listing | Qdrant payload size |
|
| 70 |
+
| **Semantic Depth** | Single-hop (direct semantic match) | No multi-hop reasoning |
|
| 71 |
+
|
| 72 |
+
---
|
| 73 |
+
|
| 74 |
+
## Part 2: CLaRa Integration Strategy
|
| 75 |
+
|
| 76 |
+
### What is CLaRa?
|
| 77 |
+
|
| 78 |
+
**CLaRa** = Continuous Latent Reasoning for Compression-Native RAG
|
| 79 |
+
|
| 80 |
+
**Key Innovation**: Instead of storing raw text chunks or large embeddings, CLaRa compresses documents into **continuous memory tokens** that preserve semantic reasoning while being 16x-128x smaller.
|
| 81 |
+
|
| 82 |
+
### How CLaRa Would Transform AIDA
|
| 83 |
+
|
| 84 |
+
#### Current Flow:
|
| 85 |
+
```python
|
| 86 |
+
# app/ai/services/search_service.py (CURRENT)
|
| 87 |
+
|
| 88 |
+
async def embed_query(text: str) -> List[float]:
|
| 89 |
+
# Returns 4096-dim vector
|
| 90 |
+
response = await client.post(
|
| 91 |
+
"https://openrouter.ai/api/v1/embeddings",
|
| 92 |
+
json={"model": "qwen/qwen3-embedding-8b", "input": text}
|
| 93 |
+
)
|
| 94 |
+
return response["data"][0]["embedding"] # 4096 floats
|
| 95 |
+
|
| 96 |
+
async def hybrid_search(query_text: str, search_params: Dict):
|
| 97 |
+
vector = await embed_query(query_text) # 4096-dim
|
| 98 |
+
results = await qdrant_client.query_points(
|
| 99 |
+
collection_name="listings",
|
| 100 |
+
query=vector, # Search with 4096-dim
|
| 101 |
+
query_filter=build_filters(search_params),
|
| 102 |
+
limit=10
|
| 103 |
+
)
|
| 104 |
+
# PROBLEM: Separate retrieval & generation
|
| 105 |
+
# Brain LLM has to re-process retrieved listings
|
| 106 |
+
```
|
| 107 |
+
|
| 108 |
+
#### With CLaRa:
|
| 109 |
+
```python
|
| 110 |
+
# app/ai/services/clara_search_service.py (NEW)
|
| 111 |
+
|
| 112 |
+
from transformers import AutoModel, AutoTokenizer
|
| 113 |
+
import torch
|
| 114 |
+
|
| 115 |
+
# Load CLaRa model
|
| 116 |
+
clara_model = AutoModel.from_pretrained("apple/CLaRa-7B-Instruct")
|
| 117 |
+
clara_tokenizer = AutoTokenizer.from_pretrained("apple/CLaRa-7B-Instruct")
|
| 118 |
+
|
| 119 |
+
async def compress_listing_to_memory_tokens(listing: Dict) -> torch.Tensor:
|
| 120 |
+
"""
|
| 121 |
+
Compress listing into continuous memory tokens (16x-128x smaller)
|
| 122 |
+
|
| 123 |
+
BEFORE: 4096-dim embedding + full payload
|
| 124 |
+
AFTER: 256-dim (16x) or 32-dim (128x) continuous token
|
| 125 |
+
"""
|
| 126 |
+
# Build semantic text
|
| 127 |
+
text = f"{listing['title']}. {listing['bedrooms']}-bed in {listing['location']}. {listing['description']}"
|
| 128 |
+
|
| 129 |
+
# CLaRa compression (QA-guided semantic compression)
|
| 130 |
+
inputs = clara_tokenizer(text, return_tensors="pt")
|
| 131 |
+
with torch.no_grad():
|
| 132 |
+
compressed_token = clara_model.compress(
|
| 133 |
+
inputs,
|
| 134 |
+
compression_ratio=16 # or 128 for max compression
|
| 135 |
+
)
|
| 136 |
+
|
| 137 |
+
# Returns: 256-dim continuous memory token
|
| 138 |
+
# Preserves: "key reasoning signals" (location, price, features)
|
| 139 |
+
# Discards: Filler words, redundant descriptions
|
| 140 |
+
return compressed_token
|
| 141 |
+
|
| 142 |
+
async def clara_unified_search(query: str, search_params: Dict):
|
| 143 |
+
"""
|
| 144 |
+
Unified retrieval + generation in CLaRa's shared latent space
|
| 145 |
+
|
| 146 |
+
BENEFIT: No need to re-encode for generation - already in shared space
|
| 147 |
+
"""
|
| 148 |
+
# 1. Compress query
|
| 149 |
+
query_inputs = clara_tokenizer(query, return_tensors="pt")
|
| 150 |
+
query_token = clara_model.compress(query_inputs)
|
| 151 |
+
|
| 152 |
+
# 2. Retrieve in latent space (16x-128x faster than 4096-dim search)
|
| 153 |
+
# CLaRa's query encoder and generator share the same space
|
| 154 |
+
results = await qdrant_client.query_points(
|
| 155 |
+
collection_name="listings_clara_compressed",
|
| 156 |
+
query=query_token.tolist(), # 256-dim (16x smaller)
|
| 157 |
+
limit=10
|
| 158 |
+
)
|
| 159 |
+
|
| 160 |
+
# 3. Generate response DIRECTLY from compressed tokens
|
| 161 |
+
# No re-encoding needed - already in shared latent space
|
| 162 |
+
response = clara_model.generate_from_compressed(
|
| 163 |
+
query_token=query_token,
|
| 164 |
+
retrieved_tokens=[r.vector for r in results],
|
| 165 |
+
max_length=200
|
| 166 |
+
)
|
| 167 |
+
|
| 168 |
+
return {
|
| 169 |
+
"results": results,
|
| 170 |
+
"natural_response": response,
|
| 171 |
+
"compression_used": "16x"
|
| 172 |
+
}
|
| 173 |
+
```
|
| 174 |
+
|
| 175 |
+
### CLaRa Benefits for AIDA
|
| 176 |
+
|
| 177 |
+
| Benefit | Impact | Measurement |
|
| 178 |
+
|---------|--------|-------------|
|
| 179 |
+
| **Storage Savings** | 4096 → 256 dims = 16x smaller | 1000 listings: 16MB → 1MB |
|
| 180 |
+
| **Search Speed** | Smaller vectors = faster cosine similarity | 200ms → 50ms (4x faster) |
|
| 181 |
+
| **Unified Processing** | Retrieval + generation in same space | No re-encoding overhead |
|
| 182 |
+
| **Semantic Preservation** | QA-guided compression keeps reasoning signals | Same search quality, less data |
|
| 183 |
+
| **Memory Efficiency** | Less Redis cache pressure | Can cache 16x more listings |
|
| 184 |
+
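The storage figures in the table follow from simple arithmetic. A minimal sketch, assuming float32 vectors (4 bytes per value) and ignoring index and payload overhead; the helper name `vector_storage_bytes` is hypothetical and not part of the codebase:

```python
def vector_storage_bytes(num_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Raw size of dense vectors, ignoring index and payload overhead."""
    return num_vectors * dims * bytes_per_value


# 1000 listings at the current 4096 dims vs. assumed 256-dim compressed tokens
full = vector_storage_bytes(1000, 4096)
compressed = vector_storage_bytes(1000, 256)
ratio = full / compressed

print(f"{full / 1e6:.1f} MB -> {compressed / 1e6:.1f} MB ({ratio:.0f}x smaller)")
```

The latency figures in the table cannot be derived this way and would still need to be measured.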
|
| 185 |
+
### Migration Path to CLaRa
|
| 186 |
+
|
| 187 |
+
#### Phase 1: Parallel Deployment (Low Risk)
|
| 188 |
+
```python
|
| 189 |
+
# app/ai/services/hybrid_search_router.py (NEW)
|
| 190 |
+
|
| 191 |
+
async def search_with_fallback(query: str, params: Dict):
|
| 192 |
+
"""
|
| 193 |
+
Run CLaRa + Traditional RAG in parallel, compare results
|
| 194 |
+
"""
|
| 195 |
+
clara_results, traditional_results = await asyncio.gather(
|
| 196 |
+
clara_unified_search(query, params),
|
| 197 |
+
hybrid_search(query, params) # Current implementation
|
| 198 |
+
)
|
| 199 |
+
|
| 200 |
+
# Log comparison metrics
|
| 201 |
+
logger.info("CLaRa vs Traditional",
|
| 202 |
+
clara_latency=clara_results.get('latency'),  # .get(): clara_unified_search does not return this key yet
|
| 203 |
+
trad_latency=traditional_results.get('latency'),
|
| 204 |
+
clara_count=len(clara_results['results']),
|
| 205 |
+
trad_count=len(traditional_results['results']))
|
| 206 |
+
|
| 207 |
+
# Use CLaRa if available, fallback to traditional
|
| 208 |
+
return clara_results if clara_results.get('success') else traditional_results
|
| 209 |
+
```
|
| 210 |
+
|
| 211 |
+
#### Phase 2: Gradual Indexing
|
| 212 |
+
```python
|
| 213 |
+
# Migration script: sync_to_clara_compressed.py
|
| 214 |
+
|
| 215 |
+
async def migrate_to_clara():
|
| 216 |
+
"""
|
| 217 |
+
Compress existing listings into CLaRa memory tokens
|
| 218 |
+
"""
|
| 219 |
+
db = await get_db()
|
| 220 |
+
cursor = db.listings.find({"status": "active"})
|
| 221 |
+
|
| 222 |
+
async for listing in cursor:
|
| 223 |
+
# Compress to memory tokens
|
| 224 |
+
compressed_token = await compress_listing_to_memory_tokens(listing)
|
| 225 |
+
|
| 226 |
+
# Upsert to new collection
|
| 227 |
+
await qdrant_client.upsert(
|
| 228 |
+
collection_name="listings_clara_compressed",
|
| 229 |
+
points=[PointStruct(
|
| 230 |
+
id=str(listing["_id"]),
|
| 231 |
+
vector=compressed_token.tolist(), # 256-dim
|
| 232 |
+
payload={
|
| 233 |
+
"mongo_id": str(listing["_id"]),
|
| 234 |
+
"title": listing["title"],
|
| 235 |
+
"location": listing["location"],
|
| 236 |
+
"price": listing["price"],
|
| 237 |
+
# Minimal payload - most semantic info is in compressed token
|
| 238 |
+
}
|
| 239 |
+
)]
|
| 240 |
+
)
|
| 241 |
+
```
|
| 242 |
+
|
| 243 |
+
#### Phase 3: Cutover
|
| 244 |
+
- Monitor CLaRa performance for 1 week
|
| 245 |
+
- If latency < 100ms and quality ≥ traditional RAG → full cutover
|
| 246 |
+
- Deprecate old `qwen/qwen3-embedding-8b` embeddings
|
| 247 |
+
|
| 248 |
+
---
|
| 249 |
+
|
| 250 |
+
## Part 3: RLM Integration Strategy
|
| 251 |
+
|
| 252 |
+
### What is RLM?
|
| 253 |
+
|
| 254 |
+
**RLM** = Recursive Language Models (from MIT CSAIL)
|
| 255 |
+
|
| 256 |
+
**Key Innovation**: Instead of processing entire context at once, RLM **recursively explores** text by:
|
| 257 |
+
1. Decomposing queries into sub-tasks
|
| 258 |
+
2. Calling itself on snippets
|
| 259 |
+
3. Building up understanding through recursive reasoning
|
| 260 |
+
|
| 261 |
+
### Where RLM Excels Over Current RAG
|
| 262 |
+
|
| 263 |
+
| Query Type | Current RAG Limitation | RLM Solution |
|
| 264 |
+
|------------|----------------------|--------------|
|
| 265 |
+
| **Multi-hop**: "3-bed near good schools AND safe neighborhood" | Single semantic search can't connect "schools" → "safety" | Recursively explore: Find schools → Check neighborhoods → Cross-reference safety data |
|
| 266 |
+
| **Aggregation**: "Show me average prices in Cotonou vs Calavi" | No aggregation logic in vector search | Recursive aggregation: Search Cotonou → Calculate avg → Search Calavi → Compare |
|
| 267 |
+
| **Complex filters**: "Under 500k OR (2-bed AND has pool)" | Boolean logic not native to vector similarity | Recursive decomposition: (Filter 1) ∪ (Filter 2 ∩ Filter 3) |
|
| 268 |
+
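The "Complex filters" row can be made concrete with plain set operations. A minimal sketch, assuming each atomic filter returns a set of listing ids; the sample data, field names, and the `ids_where` helper are hypothetical:

```python
from typing import Callable, Dict, List, Set

# Hypothetical sample data; field names are illustrative only.
LISTINGS: List[Dict] = [
    {"id": "a", "price": 450_000, "bedrooms": 3, "has_pool": False},
    {"id": "b", "price": 700_000, "bedrooms": 2, "has_pool": True},
    {"id": "c", "price": 650_000, "bedrooms": 2, "has_pool": False},
    {"id": "d", "price": 800_000, "bedrooms": 4, "has_pool": True},
]


def ids_where(predicate: Callable[[Dict], bool]) -> Set[str]:
    """Run one atomic filter and return the ids of matching listings."""
    return {listing["id"] for listing in LISTINGS if predicate(listing)}


under_500k = ids_where(lambda l: l["price"] < 500_000)
two_bed = ids_where(lambda l: l["bedrooms"] == 2)
has_pool = ids_where(lambda l: l["has_pool"])

# "Under 500k OR (2-bed AND has pool)" as union / intersection of sub-results
matches = under_500k | (two_bed & has_pool)
print(sorted(matches))
```

In the planned design each atomic filter would be a separate sub-query rather than an in-memory scan; only the combination step is shown here.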
|
| 269 |
+
### RLM Architecture for AIDA
|
| 270 |
+
|
| 271 |
+
```python
|
| 272 |
+
# app/ai/services/rlm_search_service.py (NEW)
|
| 273 |
+
|
| 274 |
+
class RecursiveSearchAgent:
|
| 275 |
+
"""
|
| 276 |
+
RLM-based search agent for complex multi-hop queries
|
| 277 |
+
|
| 278 |
+
Example Query: "3-bed apartments near international schools in
|
| 279 |
+
safe neighborhoods in Cotonou under 500k XOF"
|
| 280 |
+
|
| 281 |
+
Recursive Breakdown:
|
| 282 |
+
1. Find international schools in Cotonou
|
| 283 |
+
2. For each school → Find safe neighborhoods within 2km
|
| 284 |
+
3. For each neighborhood → Find 3-bed apartments under 500k
|
| 285 |
+
4. Aggregate results → Return top matches
|
| 286 |
+
"""
|
| 287 |
+
|
| 288 |
+
def __init__(self, brain_llm, search_service):
|
| 289 |
+
self.brain = brain_llm
|
| 290 |
+
self.search = search_service
|
| 291 |
+
self.max_depth = 3 # Prevent infinite recursion
|
| 292 |
+
|
| 293 |
+
async def recursive_search(
|
| 294 |
+
self,
|
| 295 |
+
query: str,
|
| 296 |
+
depth: int = 0,
|
| 297 |
+
context: Dict = None
|
| 298 |
+
) -> List[Dict]:
|
| 299 |
+
"""
|
| 300 |
+
Recursively decompose and execute complex queries
|
| 301 |
+
"""
|
| 302 |
+
if depth > self.max_depth:
|
| 303 |
+
logger.warning("Max recursion depth reached")
|
| 304 |
+
return []
|
| 305 |
+
|
| 306 |
+
# Step 1: Decompose query using Brain LLM
|
| 307 |
+
decomposition = await self.brain.decompose_query(query, context)
|
| 308 |
+
|
| 309 |
+
if decomposition["is_atomic"]:
|
| 310 |
+
# Base case: Execute simple search
|
| 311 |
+
return await self.search.hybrid_search(query, decomposition["params"])
|
| 312 |
+
|
| 313 |
+
# Recursive case: Break into sub-queries
|
| 314 |
+
sub_results = []
|
| 315 |
+
for sub_query in decomposition["sub_queries"]:
|
| 316 |
+
sub_result = await self.recursive_search(
|
| 317 |
+
sub_query["query"],
|
| 318 |
+
depth=depth + 1,
|
| 319 |
+
context={**(context or {}), **sub_query.get("context", {})}  # context defaults to None, so guard before unpacking
|
| 320 |
+
)
|
| 321 |
+
sub_results.append(sub_result)
|
| 322 |
+
|
| 323 |
+
# Step 2: Aggregate sub-results using LLM reasoning
|
| 324 |
+
aggregated = await self.brain.aggregate_results(
|
| 325 |
+
query=query,
|
| 326 |
+
sub_results=sub_results,
|
| 327 |
+
strategy=decomposition["aggregation_strategy"] # "union", "intersection", "rank"
|
| 328 |
+
)
|
| 329 |
+
|
| 330 |
+
return aggregated
|
| 331 |
+
|
| 332 |
+
# Example Usage:
|
| 333 |
+
rlm_agent = RecursiveSearchAgent(brain_llm, search_service)
|
| 334 |
+
|
| 335 |
+
results = await rlm_agent.recursive_search(
|
| 336 |
+
"Find 3-bed apartments near international schools in safe neighborhoods in Cotonou under 500k"
|
| 337 |
+
)
|
| 338 |
+
|
| 339 |
+
# RLM Flow:
|
| 340 |
+
# 1. Decompose: "Find international schools in Cotonou"
|
| 341 |
+
# → Calls itself: search("international schools Cotonou")
|
| 342 |
+
# 2. For each school location:
|
| 343 |
+
# → Calls itself: search("safe neighborhoods within 2km of {school.lat, school.lon}")
|
| 344 |
+
# 3. For each neighborhood:
|
| 345 |
+
# → Calls itself: search("3-bed apartments under 500k in {neighborhood}")
|
| 346 |
+
# 4. Aggregate all results → Rank by proximity to schools + safety score
|
| 347 |
+
```
|
| 348 |
+
|
| 349 |
+
### RLM Benefits for AIDA
|
| 350 |
+
|
| 351 |
+
| Benefit | Impact |
|
| 352 |
+
|---------|--------|
|
| 353 |
+
| **Complex Queries** | Handle multi-hop reasoning (schools → safety → apartments) |
|
| 354 |
+
| **Boolean Logic** | Native support for AND/OR/NOT conditions |
|
| 355 |
+
| **Aggregation** | Calculate averages, comparisons across locations |
|
| 356 |
+
| **Context Preservation** | Each recursive call maintains full reasoning chain |
|
| 357 |
+
| **Explainability** | Can show reasoning tree to users ("I found 3 schools, then...") |
|
| 358 |
+
|
| 359 |
+
### Integration with CLaRa
|
| 360 |
+
|
| 361 |
+
**Best of Both Worlds**: CLaRa for fast retrieval, RLM for deep reasoning
|
| 362 |
+
|
| 363 |
+
```python
|
| 364 |
+
async def clara_rlm_hybrid_search(query: str, params: Dict = None):
|
| 365 |
+
"""
|
| 366 |
+
Use CLaRa for speed, RLM for depth
|
| 367 |
+
|
| 368 |
+
Flow:
|
| 369 |
+
1. Quick check: Is this a simple query? → Use CLaRa only (fast path)
|
| 370 |
+
2. Complex query? → Use RLM to decompose → CLaRa for each sub-query (deep path)
|
| 371 |
+
"""
|
| 372 |
+
complexity = await analyze_query_complexity(query)
|
| 373 |
+
|
| 374 |
+
if complexity == "simple":
|
| 375 |
+
# Fast path: CLaRa unified search
|
| 376 |
+
return await clara_unified_search(query, params)
|
| 377 |
+
|
| 378 |
+
else:
|
| 379 |
+
# Deep path: RLM decomposes → CLaRa executes each step
|
| 380 |
+
rlm_agent = RecursiveSearchAgent(
|
| 381 |
+
brain_llm=brain_llm,
|
| 382 |
+
search_service=clara_unified_search  # Use CLaRa as base search engine; needs an adapter exposing hybrid_search(), which RecursiveSearchAgent calls
|
| 383 |
+
)
|
| 384 |
+
return await rlm_agent.recursive_search(query)
|
| 385 |
+
```
|
| 386 |
+
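The fast-path/deep-path decision above depends on a complexity check that the plan does not define. A minimal keyword-based sketch, under the assumption that a handful of trigger phrases is enough for a first pass; `classify_query_complexity` and `COMPLEX_PATTERNS` are hypothetical names, and a production classifier would also need French equivalents:

```python
import re

# Hypothetical trigger patterns, grouped by the query types discussed above.
COMPLEX_PATTERNS = [
    r"\bnear\b", r"\bclose to\b", r"\bwithin \d+\s*km\b",  # proximity (multi-hop)
    r"\bor\b", r"\beither\b",                              # boolean OR
    r"\bcompare\b", r"\bvs\b", r"\bcheaper\b",             # comparative
    r"\baverage\b", r"\bhow many\b", r"\btotal\b",         # aggregation
]


def classify_query_complexity(query: str) -> str:
    """Return "complex" if any trigger pattern matches, else "simple"."""
    text = query.lower()
    if any(re.search(pattern, text) for pattern in COMPLEX_PATTERNS):
        return "complex"
    return "simple"


print(classify_query_complexity("3-bed apartment in Cotonou"))
print(classify_query_complexity("3-bed near schools in Cotonou"))
```

Word boundaries (`\b`) keep "or" from matching inside words such as "for". A rule list like this is cheap and predictable, but it will misroute phrasings it has never seen, so routing decisions are worth logging during the pilot.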
|
| 387 |
+
---
|
| 388 |
+
|
| 389 |
+
## Part 4: Implementation Roadmap
|
| 390 |
+
|
| 391 |
+
### Timeline: 12 Weeks
|
| 392 |
+
|
| 393 |
+
#### **Week 1-2: Research & Setup**
|
| 394 |
+
- [ ] Test CLaRa-7B-Instruct locally with sample listings
|
| 395 |
+
- [ ] Benchmark compression ratio (16x vs 128x) vs search quality
|
| 396 |
+
- [ ] Measure latency: CLaRa vs current qwen3-embedding-8b
|
| 397 |
+
- [ ] Set up RLM proof-of-concept with MIT framework
|
| 398 |
+
|
| 399 |
+
#### **Week 3-4: CLaRa Pilot**
|
| 400 |
+
- [ ] Create `listings_clara_compressed` Qdrant collection
|
| 401 |
+
- [ ] Implement `compress_listing_to_memory_tokens()` function
|
| 402 |
+
- [ ] Migrate 100 test listings to CLaRa compressed format
|
| 403 |
+
- [ ] A/B test: CLaRa vs traditional RAG on 100 real queries
|
| 404 |
+
- [ ] Measure: latency, storage, search quality (user feedback)
|
| 405 |
+
|
| 406 |
+
#### **Week 5-6: RLM Prototype**
|
| 407 |
+
- [ ] Implement `RecursiveSearchAgent` class
|
| 408 |
+
- [ ] Build query decomposition logic with Brain LLM
|
| 409 |
+
- [ ] Test on complex queries: "3-bed near schools in safe areas under 500k"
|
| 410 |
+
- [ ] Validate: Does RLM find better results than single-hop RAG?
|
| 411 |
+
|
| 412 |
+
#### **Week 7-8: Integration**
|
| 413 |
+
- [ ] Build `clara_rlm_hybrid_search()` router
|
| 414 |
+
- [ ] Simple queries → CLaRa (fast path)
|
| 415 |
+
- [ ] Complex queries → RLM + CLaRa (deep path)
|
| 416 |
+
- [ ] Add query complexity classifier
|
| 417 |
+
|
| 418 |
+
#### **Week 9-10: Production Prep**
|
| 419 |
+
- [ ] Migrate all active listings to CLaRa compressed format
|
| 420 |
+
- [ ] Set up monitoring: Latency, storage, cache hit rates
|
| 421 |
+
- [ ] Implement fallback to traditional RAG (safety net)
|
| 422 |
+
- [ ] Load testing: 1000 concurrent searches
|
| 423 |
+
|
| 424 |
+
#### **Week 11-12: Deployment & Optimization**
|
| 425 |
+
- [ ] Deploy CLaRa to production (gradual rollout: 10% → 50% → 100%)
|
| 426 |
+
- [ ] Monitor performance vs baseline
|
| 427 |
+
- [ ] Fine-tune compression ratio based on real-world data
|
| 428 |
+
- [ ] Optimize RLM recursion depth and caching
|
| 429 |
+
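One way the 10% → 50% → 100% rollout could be implemented is deterministic bucketing. A minimal sketch, assuming a stable string user id is available at request time; `in_rollout` is a hypothetical helper, not an existing function in the codebase:

```python
import hashlib


def in_rollout(user_id: str, percent: int) -> bool:
    """Place a user in one of 100 stable buckets; serve CLaRa to the first `percent`."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent


# The same user always lands in the same bucket, so raising the percentage
# only ever adds users to the CLaRa path and never removes them.
for stage in (10, 50, 100):
    enabled = sum(in_rollout(f"user-{i}", stage) for i in range(1000))
    print(f"{stage}% stage: {enabled} of 1000 sample users on CLaRa")
```

Hashing rather than random sampling keeps each user's experience consistent across requests, which also makes A/B comparisons cleaner.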
|
| 430 |
+
---
|
| 431 |
+
|
| 432 |
+
## Part 5: Expected Impact
|
| 433 |
+
|
| 434 |
+
### Performance Gains
|
| 435 |
+
|
| 436 |
+
| Metric | Current | With CLaRa | With CLaRa + RLM |
|
| 437 |
+
|--------|---------|-----------|-----------------|
|
| 438 |
+
| **Search Latency** | 200-500ms | 50-150ms (3-4x faster) | 100-300ms (complex queries) |
|
| 439 |
+
| **Storage (1000 listings)** | 16MB vectors | 1MB (16x smaller) | 1MB + reasoning cache |
|
| 440 |
+
| **Complex Query Support** | ❌ Single-hop only | ✅ Fast retrieval | ✅✅ Multi-hop reasoning |
|
| 441 |
+
| **Memory Efficiency** | 66KB/listing | 5KB/listing (13x better) | 5KB + context cache |
|
| 442 |
+
|
| 443 |
+
### Cost Savings
|
| 444 |
+
|
| 445 |
+
```
|
| 446 |
+
Qdrant Cloud Costs (Estimated):
|
| 447 |
+
- Current: 16MB vectors + 50MB payloads = $XX/month
|
| 448 |
+
- With CLaRa: 1MB vectors + 10MB payloads = $YY/month (80% savings)
|
| 449 |
+
|
| 450 |
+
OpenRouter Embedding API:
|
| 451 |
+
- Current: 1000 queries/day × $0.0001/query = $3/month
|
| 452 |
+
- With CLaRa: Reduced by 50% (fewer re-embeddings) = $1.50/month
|
| 453 |
+
```
|
| 454 |
+
|
| 455 |
+
### User Experience
|
| 456 |
+
|
| 457 |
+
| Before | After |
|
| 458 |
+
|--------|-------|
|
| 459 |
+
| "Find 3-bed in Cotonou" → 10 results (generic) | "Find 3-bed in Cotonou" → 10 results (same speed, less cost) |
|
| 460 |
+
| "Find apartment near school" → Mixed results (no school proximity logic) | "Find apartment near school" → RLM finds schools → ranks by proximity |
|
| 461 |
+
| Complex queries fail or return irrelevant results | Multi-hop reasoning delivers accurate results |
|
| 462 |
+
|
| 463 |
+
---
|
| 464 |
+
|
| 465 |
+
## Part 6: Risk Analysis & Mitigation
|
| 466 |
+
|
| 467 |
+
### Risks
|
| 468 |
+
|
| 469 |
+
| Risk | Impact | Mitigation |
|
| 470 |
+
|------|--------|------------|
|
| 471 |
+
| **CLaRa Model Size** | 7B parameters = high memory | Use quantized version (4-bit) or cloud API |
|
| 472 |
+
| **Compression Loss** | Over-compression loses semantic detail | Test 16x vs 128x, pick optimal ratio |
|
| 473 |
+
| **RLM Recursion Depth** | Infinite loops or slow queries | Max depth limit = 3, timeout after 5s |
|
| 474 |
+
| **Integration Complexity** | Breaking existing search flow | Parallel deployment, gradual rollout |
|
| 475 |
+
| **Vendor Lock-in** | Relying on Apple CLaRa | Keep traditional RAG as fallback |
|
| 476 |
+
|
| 477 |
+
### Mitigation Strategy
|
| 478 |
+
|
| 479 |
+
1. **Parallel Deployment**: Run CLaRa + Traditional RAG side-by-side for 2 weeks
|
| 480 |
+
2. **Gradual Rollout**: Start with 10% traffic → Monitor → Scale to 100%
|
| 481 |
+
3. **Fallback Mechanism**: If CLaRa fails → Auto-fallback to qwen3-embedding-8b
|
| 482 |
+
4. **A/B Testing**: Measure user satisfaction (click-through rate, booking conversions)
|
| 483 |
+
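The 5-second timeout and automatic fallback listed above can be combined in one wrapper. A minimal sketch using `asyncio.wait_for`; `search_with_timeout` and the two stand-in search coroutines are hypothetical and only illustrate the control flow:

```python
import asyncio
from typing import Awaitable, Callable, List


async def search_with_timeout(
    primary: Callable[[str], Awaitable[List[dict]]],
    fallback: Callable[[str], Awaitable[List[dict]]],
    query: str,
    timeout_s: float = 5.0,
) -> List[dict]:
    """Try the primary search; on timeout or error, use the fallback."""
    try:
        return await asyncio.wait_for(primary(query), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Primary exceeded its budget; wait_for has already cancelled it.
        return await fallback(query)
    except Exception:
        # Any other primary failure also degrades to the fallback path.
        return await fallback(query)


async def slow_primary(query: str) -> List[dict]:
    await asyncio.sleep(1)  # stand-in for a slow CLaRa/RLM call
    return [{"source": "clara"}]


async def fast_fallback(query: str) -> List[dict]:
    return [{"source": "traditional"}]  # stand-in for the existing hybrid search


result = asyncio.run(
    search_with_timeout(slow_primary, fast_fallback, "3-bed in Cotonou", timeout_s=0.05)
)
print(result)
```

Unlike running both searches under `asyncio.gather`, this only pays for the fallback when the primary path fails or overruns its budget.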
|
| 484 |
+
---
|
| 485 |
+
|
| 486 |
+
## Part 7: Next Steps
|
| 487 |
+
|
| 488 |
+
### Immediate Actions (This Week)
|
| 489 |
+
|
| 490 |
+
1. **Research**:
|
| 491 |
+
- [ ] Clone CLaRa repo: `git clone https://github.com/apple/ml-clara`
|
| 492 |
+
- [ ] Review Hugging Face model card: https://huggingface.co/apple/CLaRa-7B-Instruct
|
| 493 |
+
- [ ] Read MIT RLM paper: https://arxiv.org/abs/[RLM-paper-id]
|
| 494 |
+
|
| 495 |
+
2. **Prototype**:
|
| 496 |
+
- [ ] Create `docs/clara_prototype.py` (compression test)
|
| 497 |
+
- [ ] Test with 10 sample listings
|
| 498 |
+
- [ ] Measure: original size vs compressed size vs search quality
|
| 499 |
+
|
| 500 |
+
3. **Planning**:
|
| 501 |
+
- [ ] Schedule team meeting to review this plan
|
| 502 |
+
- [ ] Estimate GPU/CPU requirements for CLaRa inference
|
| 503 |
+
- [ ] Check budget for cloud inference (AWS SageMaker, Modal, etc.)
|
| 504 |
+
|
| 505 |
+
### Questions to Answer
|
| 506 |
+
|
| 507 |
+
1. **Hosting**: Run CLaRa locally (GPU required) or use cloud API?
|
| 508 |
+
2. **Compression Ratio**: 16x or 128x? (Trade-off: speed vs quality)
|
| 509 |
+
3. **RLM Priority**: Do we need multi-hop reasoning now, or focus on CLaRa first?
|
| 510 |
+
4. **User Impact**: Will users notice the difference? (Faster search? Better results?)
|
| 511 |
+
|
| 512 |
+
---
|
| 513 |
+
|
| 514 |
+
## Conclusion
|
| 515 |
+
|
| 516 |
+
**CLaRa** and **RLM** represent the next evolution of RAG architecture:
|
| 517 |
+
|
| 518 |
+
- **CLaRa** → **16x smaller vectors, a projected 3-4x faster search, unified retrieval + generation**
|
| 519 |
+
- **RLM** → **Multi-hop reasoning for complex queries traditional RAG can't handle**
|
| 520 |
+
|
| 521 |
+
Your AIDA backend is already well-architected with:
|
| 522 |
+
- ✅ Hybrid search strategies
|
| 523 |
+
- ✅ Intelligent routing
|
| 524 |
+
- ✅ Real-time vector sync
|
| 525 |
+
- ✅ Conversation memory
|
| 526 |
+
|
| 527 |
+
Adding CLaRa + RLM would **supercharge** this foundation, making AIDA:
|
| 528 |
+
1. **Faster** (3-4x search speed)
|
| 529 |
+
2. **Cheaper** (80% storage savings)
|
| 530 |
+
3. **Smarter** (multi-hop reasoning)
|
| 531 |
+
4. **More scalable** (handle 10x more listings without performance degradation)
|
| 532 |
+
|
| 533 |
+
**Recommended First Step**: Start with **CLaRa pilot** (Week 1-4) to prove compression works, then add **RLM** for complex queries.
|
| 534 |
+
|
| 535 |
+
---
|
| 536 |
+
|
| 537 |
+
**Contact**: For questions or to discuss implementation details, ping the team.
|
test_rlm.py
ADDED
|
@@ -0,0 +1,481 @@
|
| 1 |
+
#!/usr/bin/env python3
|
| 2 |
+
"""
|
| 3 |
+
RLM (Recursive Language Model) Test Suite for AIDA
|
| 4 |
+
|
| 5 |
+
Tests:
|
| 6 |
+
1. Query Analyzer - Detect complex query types
|
| 7 |
+
2. RLM Search Service - Execute recursive searches
|
| 8 |
+
3. Integration - End-to-end flow
|
| 9 |
+
|
| 10 |
+
Run with:
|
| 11 |
+
python test_rlm.py
|
| 12 |
+
python test_rlm.py --live # Run with actual LLM calls
|
| 13 |
+
|
| 14 |
+
Author: AIDA Team
|
| 15 |
+
Date: 2026-02-09
|
| 16 |
+
"""
|
| 17 |
+
|
| 18 |
+
import asyncio
|
| 19 |
+
import sys
|
| 20 |
+
import json
|
| 21 |
+
from typing import List, Dict
|
| 22 |
+
|
| 23 |
+
# Add project root to path
|
| 24 |
+
sys.path.insert(0, ".")
|
| 25 |
+
|
| 26 |
+
|
| 27 |
+
# =============================================================================
|
| 28 |
+
# Color output for terminal
|
| 29 |
+
# =============================================================================
|
| 30 |
+
|
| 31 |
+
class Colors:
|
| 32 |
+
HEADER = '\033[95m'
|
| 33 |
+
BLUE = '\033[94m'
|
| 34 |
+
CYAN = '\033[96m'
|
| 35 |
+
GREEN = '\033[92m'
|
| 36 |
+
WARNING = '\033[93m'
|
| 37 |
+
FAIL = '\033[91m'
|
| 38 |
+
ENDC = '\033[0m'
|
| 39 |
+
BOLD = '\033[1m'
|
| 40 |
+
|
| 41 |
+
|
| 42 |
+
def print_header(text: str):
|
| 43 |
+
print(f"\n{Colors.HEADER}{Colors.BOLD}{'='*60}{Colors.ENDC}")
|
| 44 |
+
print(f"{Colors.HEADER}{Colors.BOLD}{text}{Colors.ENDC}")
|
| 45 |
+
print(f"{Colors.HEADER}{Colors.BOLD}{'='*60}{Colors.ENDC}\n")
|
| 46 |
+
|
| 47 |
+
|
| 48 |
+
def print_success(text: str):
|
| 49 |
+
print(f"{Colors.GREEN}✅ {text}{Colors.ENDC}")
|
| 50 |
+
|
| 51 |
+
|
| 52 |
+
def print_fail(text: str):
|
| 53 |
+
print(f"{Colors.FAIL}❌ {text}{Colors.ENDC}")
|
| 54 |
+
|
| 55 |
+
|
| 56 |
+
def print_info(text: str):
|
| 57 |
+
print(f"{Colors.CYAN}ℹ️ {text}{Colors.ENDC}")
|
| 58 |
+
|
| 59 |
+
|
| 60 |
+
def print_warning(text: str):
|
| 61 |
+
print(f"{Colors.WARNING}⚠️ {text}{Colors.ENDC}")
|
| 62 |
+
|
| 63 |
+
|
| 64 |
+
# =============================================================================
|
| 65 |
+
# Test 1: Query Analyzer
|
| 66 |
+
# =============================================================================
|
| 67 |
+
|
| 68 |
+
def test_query_analyzer():
|
| 69 |
+
"""Test the RLM Query Analyzer"""
|
| 70 |
+
print_header("Test 1: RLM Query Analyzer")
|
| 71 |
+
|
| 72 |
+
from app.ai.services.rlm_query_analyzer import (
|
| 73 |
+
analyze_query_complexity,
|
| 74 |
+
QueryComplexity
|
| 75 |
+
)
|
| 76 |
+
|
| 77 |
+
test_cases = [
|
| 78 |
+
# Multi-hop queries
|
| 79 |
+
("3-bed apartment near international schools in Cotonou", QueryComplexity.MULTI_HOP),
|
| 80 |
+
("House close to the beach in Calavi", QueryComplexity.MULTI_HOP),
|
| 81 |
+
("Apartment within 2km of the airport", QueryComplexity.MULTI_HOP),
|
| 82 |
+
("Find something near the university", QueryComplexity.MULTI_HOP),
|
| 83 |
+
|
| 84 |
+
# Boolean OR queries
|
| 85 |
+
("Under 500k XOF or has a pool", QueryComplexity.BOOLEAN_OR),
|
| 86 |
+
("2-bedroom or 3-bedroom in Cotonou", QueryComplexity.BOOLEAN_OR),
|
| 87 |
+
("Either furnished or with parking", QueryComplexity.BOOLEAN_OR),
|
| 88 |
+
|
| 89 |
+
# Comparative queries
|
| 90 |
+
("Compare prices in Cotonou vs Calavi", QueryComplexity.COMPARATIVE),
|
| 91 |
+
("Which is cheaper: 2-bed in Cotonou or 3-bed in Calavi?", QueryComplexity.COMPARATIVE),
|
| 92 |
+
("Difference between rent in Porto-Novo and Cotonou", QueryComplexity.COMPARATIVE),
|
| 93 |
+
|
| 94 |
+
# Aggregation queries
|
| 95 |
+
("What is the average price in Cotonou?", QueryComplexity.AGGREGATION),
|
| 96 |
+
("How many 3-bed apartments are available?", QueryComplexity.AGGREGATION),
|
| 97 |
+
("Total listings in Calavi", QueryComplexity.AGGREGATION),
|
| 98 |
+
|
| 99 |
+
# Multi-factor queries
|
| 100 |
+
("Best family apartment near schools and parks in safe area", QueryComplexity.MULTI_FACTOR),
|
| 101 |
+
("Top luxury modern apartments with good security", QueryComplexity.MULTI_FACTOR),
|
| 102 |
+
("Ideal quiet peaceful home for family", QueryComplexity.MULTI_FACTOR),
|
| 103 |
+
|
| 104 |
+
# Simple queries (should NOT trigger RLM)
|
| 105 |
+
("3-bed apartment in Cotonou", QueryComplexity.SIMPLE),
|
| 106 |
+
("Houses under 500k", QueryComplexity.SIMPLE),
|
| 107 |
+
("Furnished apartment for rent", QueryComplexity.SIMPLE),
|
| 108 |
+
]
|
| 109 |
+
|
| 110 |
+
passed = 0
|
| 111 |
+
failed = 0
|
| 112 |
+
|
| 113 |
+
for query, expected_complexity in test_cases:
|
| 114 |
+
analysis = analyze_query_complexity(query)
|
| 115 |
+
|
| 116 |
+
if analysis.complexity == expected_complexity:
|
| 117 |
+
passed += 1
|
| 118 |
+
print_success(f"'{query[:40]}...' → {analysis.complexity.value}")
|
| 119 |
+
else:
|
| 120 |
+
failed += 1
|
| 121 |
+
print_fail(f"'{query[:40]}...'")
|
| 122 |
+
print(f" Expected: {expected_complexity.value}")
|
| 123 |
+
print(f" Got: {analysis.complexity.value}")
|
| 124 |
+
print(f" Reasoning: {analysis.reasoning}")
|
| 125 |
+
|
| 126 |
+
print(f"\n{Colors.BOLD}Results: {passed}/{len(test_cases)} passed{Colors.ENDC}")
|
| 127 |
+
return failed == 0
|
| 128 |
+
|
| 129 |
+
|
| 130 |
+
# =============================================================================
|
| 131 |
+
# Test 2: Strategy Selector Integration
|
| 132 |
+
# =============================================================================
|
| 133 |
+
|
| 134 |
+
async def test_strategy_selector():
|
| 135 |
+
"""Test that strategy selector correctly routes to RLM"""
|
| 136 |
+
print_header("Test 2: Strategy Selector RLM Routing")
|
| 137 |
+
|
| 138 |
+
from app.ai.services.search_strategy_selector import (
|
| 139 |
+
select_search_strategy,
|
| 140 |
+
SearchStrategy
|
| 141 |
+
)
|
| 142 |
+
|
| 143 |
+
test_cases = [
|
| 144 |
+
# RLM strategies
|
| 145 |
+
{
|
| 146 |
+
"query": "3-bed near schools in Cotonou",
|
| 147 |
+
"params": {"location": "Cotonou", "bedrooms": 3},
|
| 148 |
+
"expected_rlm": True,
|
| 149 |
+
"expected_strategy": SearchStrategy.RLM_MULTI_HOP
|
| 150 |
+
},
|
| 151 |
+
{
|
| 152 |
+
"query": "Under 500k or has pool",
|
| 153 |
+
"params": {"max_price": 500000},
|
| 154 |
+
"expected_rlm": True,
|
| 155 |
+
"expected_strategy": SearchStrategy.RLM_BOOLEAN_OR
|
| 156 |
+
},
|
| 157 |
+
{
|
| 158 |
+
"query": "Compare Cotonou vs Calavi",
|
| 159 |
+
"params": {},
|
| 160 |
+
"expected_rlm": True,
|
| 161 |
+
"expected_strategy": SearchStrategy.RLM_COMPARATIVE
|
| 162 |
+
},
|
| 163 |
+
|
| 164 |
+
# Traditional strategies (should NOT use RLM)
|
| 165 |
+
{
|
| 166 |
+
"query": "3-bed apartment in Cotonou under 500k",
|
| 167 |
+
"params": {"location": "Cotonou", "bedrooms": 3, "max_price": 500000},
|
| 168 |
+
"expected_rlm": False,
|
| 169 |
+
"expected_strategy": SearchStrategy.MONGO_ONLY
|
| 170 |
+
},
|
| 171 |
+
]
|
| 172 |
+
|
| 173 |
+
passed = 0
|
| 174 |
+
failed = 0
|
| 175 |
+
|
| 176 |
+
for case in test_cases:
|
| 177 |
+
result = await select_search_strategy(case["query"], case["params"])
|
| 178 |
+
|
| 179 |
+
rlm_match = result.get("use_rlm", False) == case["expected_rlm"]
|
| 180 |
+
strategy_match = result["strategy"] == case["expected_strategy"]
|
| 181 |
+
|
| 182 |
+
if rlm_match and strategy_match:
|
| 183 |
+
passed += 1
|
| 184 |
+
print_success(f"'{case['query'][:40]}...'")
|
| 185 |
+
print(f" Strategy: {result['strategy'].value}")
|
| 186 |
+
print(f" RLM: {result.get('use_rlm', False)}")
|
| 187 |
+
else:
|
| 188 |
+
failed += 1
|
| 189 |
+
print_fail(f"'{case['query'][:40]}...'")
|
| 190 |
+
print(f" Expected: {case['expected_strategy'].value}, RLM={case['expected_rlm']}")
|
| 191 |
+
print(f" Got: {result['strategy'].value}, RLM={result.get('use_rlm', False)}")
|
| 192 |
+
|
| 193 |
+
print(f"\n{Colors.BOLD}Results: {passed}/{len(test_cases)} passed{Colors.ENDC}")
|
| 194 |
+
return failed == 0
|
| 195 |
+
|
| 196 |
+
|
| 197 |
+
# =============================================================================
|
| 198 |
+
# Test 3: RLM Search Service (LIVE)
|
| 199 |
+
# =============================================================================
|
| 200 |
+
|
| 201 |
+
async def test_rlm_search_live():
|
| 202 |
+
"""Test the RLM Search Service with actual LLM calls"""
|
| 203 |
+
print_header("Test 3: RLM Search Service (LIVE)")
|
| 204 |
+
|
| 205 |
+
print_warning("This test makes actual API calls to DeepSeek LLM")
|
| 206 |
+
print_info("Ensure DEEPSEEK_API_KEY is set in your environment\n")
|
| 207 |
+
|
| 208 |
+
from app.ai.services.rlm_search_service import rlm_search
|
| 209 |
+
|
| 210 |
+
test_queries = [
|
| 211 |
+
{
|
| 212 |
+
"query": "3-bed apartment near schools in Cotonou",
|
| 213 |
+
"description": "Multi-hop proximity search"
|
| 214 |
+
},
|
| 215 |
+
{
|
| 216 |
+
"query": "Under 300k or has pool",
|
| 217 |
+
"description": "Boolean OR query"
|
| 218 |
+
},
|
| 219 |
+
{
|
| 220 |
+
"query": "Compare average prices in Cotonou vs Calavi",
|
| 221 |
+
"description": "Comparative analysis"
|
| 222 |
+
},
|
| 223 |
+
{
|
| 224 |
+
"query": "Best family apartment near schools and parks",
|
| 225 |
+
"description": "Multi-factor ranking"
|
| 226 |
+
},
|
| 227 |
+
]
|
| 228 |
+
|
| 229 |
+
for i, test in enumerate(test_queries, 1):
|
| 230 |
+
print(f"\n{Colors.CYAN}Test {i}: {test['description']}{Colors.ENDC}")
|
| 231 |
+
print(f"Query: \"{test['query']}\"")
|
| 232 |
+
|
| 233 |
+
try:
|
| 234 |
+
result = await rlm_search(test["query"])
|
| 235 |
+
|
| 236 |
+
print_success(f"Strategy used: {result.get('strategy_used', 'Unknown')}")
|
| 237 |
+
print(f" Results: {len(result.get('results', []))} listings")
|
| 238 |
+
print(f" LLM calls: {result.get('call_count', 'N/A')}")
|
| 239 |
+
|
| 240 |
+
if result.get("reasoning_steps"):
|
| 241 |
+
print(f" Reasoning steps:")
|
| 242 |
+
for step in result["reasoning_steps"][:3]:
|
| 243 |
+
print(f" - {step.get('step', 'unknown')}: {json.dumps(step, default=str)[:80]}...")
|
| 244 |
+
|
| 245 |
+
if result.get("message"):
|
| 246 |
+
print(f" Message: {result['message'][:100]}...")
|
| 247 |
+
|
| 248 |
+
if result.get("comparison_data"):
|
| 249 |
+
print(f" Comparison data available: Yes")
|
| 250 |
+
|
| 251 |
+
except Exception as e:
|
| 252 |
+
print_fail(f"Error: {str(e)}")
|
| 253 |
+
|
| 254 |
+
return True
|
| 255 |
+
|
| 256 |
+
|
| 257 |
+
# =============================================================================
|
| 258 |
+
# Test 4: Query Pattern Detection
|
| 259 |
+
# =============================================================================
|
| 260 |
+
|
| 261 |
+
def test_pattern_detection():
|
| 262 |
+
"""Test specific pattern detection in queries"""
|
| 263 |
+
print_header("Test 4: Pattern Detection")
|
| 264 |
+
|
| 265 |
+
from app.ai.services.rlm_query_analyzer import analyze_query_complexity
|
| 266 |
+
|
| 267 |
+
# Test POI detection
|
| 268 |
+
poi_queries = [
|
| 269 |
+
("apartment near the school", "school"),
|
| 270 |
+
("house close to beach", "beach"),
|
| 271 |
+
("near the university campus", "university"),
|
| 272 |
+
("walking distance from hospital", "hospital"),
|
| 273 |
+
("close to the market", "market"),
|
| 274 |
+
("near the airport", "airport"),
|
| 275 |
+
]
|
| 276 |
+
|
| 277 |
+
print(f"{Colors.BOLD}POI (Point of Interest) Detection:{Colors.ENDC}")
|
| 278 |
+
for query, expected_poi in poi_queries:
|
| 279 |
+
analysis = analyze_query_complexity(query)
|
| 280 |
+
poi_found = any(expected_poi in p.lower() for p in analysis.detected_patterns)
|
| 281 |
+
if poi_found:
|
| 282 |
+
print_success(f"'{query}' → Detected '{expected_poi}'")
|
| 283 |
+
else:
|
| 284 |
+
print_fail(f"'{query}' → Expected '{expected_poi}', got {analysis.detected_patterns}")
|
| 285 |
+
|
| 286 |
+
# Test French queries
|
| 287 |
+
print(f"\n{Colors.BOLD}French Query Detection:{Colors.ENDC}")
|
| 288 |
+
french_queries = [
|
| 289 |
+
("appartement près de l'école", True), # Near school
|
| 290 |
+
("maison proche de la plage", True), # Close to beach
|
| 291 |
+
("comparer les prix", True), # Compare prices
|
| 292 |
+
("appartement 3 chambres à Cotonou", False), # Simple query
|
| 293 |
+
]
|
| 294 |
+
|
| 295 |
+
for query, expected_rlm in french_queries:
|
| 296 |
+
analysis = analyze_query_complexity(query)
|
| 297 |
+
if analysis.use_rlm == expected_rlm:
|
| 298 |
+
print_success(f"'{query}' → RLM={analysis.use_rlm}")
|
| 299 |
+
else:
|
| 300 |
+
print_fail(f"'{query}' → Expected RLM={expected_rlm}, got {analysis.use_rlm}")
|
| 301 |
+
|
| 302 |
+
return True
+
+
+# =============================================================================
+# Test 5: Distance Calculation
+# =============================================================================
+
+def test_distance_calculation():
+    """Test the Haversine distance calculation"""
+    print_header("Test 5: Distance Calculation (Haversine)")
+
+    from app.ai.services.rlm_search_service import RLMSearchAgent
+
+    agent = RLMSearchAgent()
+
+    # Known distances (approximate)
+    test_cases = [
+        # (lat1, lon1, lat2, lon2, expected_km, tolerance_km)
+        (6.3654, 2.4183, 6.3700, 2.4200, 0.5, 0.3),  # Nearby in Cotonou
+        (6.3654, 2.4183, 6.4300, 2.3500, 10, 2),  # Cross-city
+        (6.3654, 2.4183, 6.5000, 2.0000, 50, 10),  # Longer distance
+    ]
+
+    passed = 0
+    for lat1, lon1, lat2, lon2, expected, tolerance in test_cases:
+        distance = agent._calculate_distance(lat1, lon1, lat2, lon2)
+        within_tolerance = abs(distance - expected) <= tolerance
+
+        if within_tolerance:
+            passed += 1
+            print_success(f"({lat1}, {lon1}) → ({lat2}, {lon2}): {distance:.2f} km (expected ~{expected} km)")
+        else:
+            print_fail(f"({lat1}, {lon1}) → ({lat2}, {lon2}): {distance:.2f} km (expected ~{expected} km)")
+
+    print(f"\n{Colors.BOLD}Results: {passed}/{len(test_cases)} passed{Colors.ENDC}")
+    return passed == len(test_cases)
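For reference, the Haversine formula that this test exercises can be sketched in a few lines. This is a standalone illustration, not the body of `RLMSearchAgent._calculate_distance`, which is not shown in this part of the diff. The function name `haversine_km` and the mean Earth radius of 6371 km are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine term: sin^2(dphi/2) + cos(phi1) * cos(phi2) * sin^2(dlam/2)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

With this sketch, the three coordinate pairs in `test_cases` all fall inside their stated tolerances, which is why fairly loose tolerances such as ±2 km and ±10 km are enough for a sanity check.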
+
+
+# =============================================================================
+# Test 6: OpenStreetMap POI Service
+# =============================================================================
+
+async def test_osm_poi_service():
+    """Test the OpenStreetMap POI service integration"""
+    print_header("Test 6: OpenStreetMap POI Service")
+
+    print_info("This test makes real API calls to OpenStreetMap (FREE)")
+    print_info("Testing: Nominatim geocoding + Overpass POI search\n")
+
+    from app.ai.services.osm_poi_service import (
+        geocode_location,
+        find_pois,
+        find_pois_overpass,
+        calculate_distance_km
+    )
+
+    # Test 1: Geocoding
+    print(f"{Colors.BOLD}1. Geocoding Test:{Colors.ENDC}")
+    coords = await geocode_location("Cotonou, Benin")
+    if coords:
+        print_success(f"Geocoded 'Cotonou, Benin' → ({coords[0]:.4f}, {coords[1]:.4f})")
+    else:
+        print_fail("Failed to geocode 'Cotonou, Benin'")
+
+    # Test 2: Find Schools
+    print(f"\n{Colors.BOLD}2. Find Schools in Cotonou:{Colors.ENDC}")
+    schools = await find_pois("school", "Cotonou, Benin", radius_km=3, limit=5)
+    print(f" Found {len(schools)} schools:")
+    for school in schools[:3]:
+        print(f" - {school['name']} ({school['lat']:.4f}, {school['lon']:.4f})")
+
+    # Test 3: Find Hospitals
+    print(f"\n{Colors.BOLD}3. Find Hospitals in Cotonou:{Colors.ENDC}")
+    hospitals = await find_pois("hospital", "Cotonou, Benin", radius_km=5, limit=5)
+    print(f" Found {len(hospitals)} hospitals:")
+    for hospital in hospitals[:3]:
+        print(f" - {hospital['name']} ({hospital['lat']:.4f}, {hospital['lon']:.4f})")
+
+    # Test 4: French POI type
+    print(f"\n{Colors.BOLD}4. French POI Type 'plage' (beach):{Colors.ENDC}")
+    beaches = await find_pois("plage", "Cotonou, Benin", radius_km=10, limit=5)
+    print(f" Found {len(beaches)} beaches")
+
+    # Test 5: Distance calculation
+    print(f"\n{Colors.BOLD}5. Distance Calculation:{Colors.ENDC}")
+    if coords and schools:
+        dist = calculate_distance_km(
+            coords[0], coords[1],
+            schools[0]["lat"], schools[0]["lon"]
+        )
+        print_success(f"Distance from Cotonou center to {schools[0]['name']}: {dist:.2f} km")
+
+    # Test 6: Integration with RLM
+    print(f"\n{Colors.BOLD}6. RLM Integration Test:{Colors.ENDC}")
+    from app.ai.services.rlm_search_service import RLMSearchAgent
+    agent = RLMSearchAgent()
+
+    pois = await agent._find_poi_locations("school", "Cotonou, Benin")
+    if pois:
+        print_success(f"RLM agent found {len(pois)} schools via OSM")
+        print(f" First result: {pois[0].get('name', 'Unknown')}")
+    else:
+        print_warning("RLM agent found no schools (may be network issue)")
+
+    print(f"\n{Colors.BOLD}OSM Integration Complete!{Colors.ENDC}")
+    return True
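The test above relies on an Overpass POI search around a geocoded point. As a hedged illustration of what such a search could send, the sketch below builds an Overpass QL query string using the `around` filter. `POI_TAGS` and `build_overpass_query` are hypothetical names for this example; the query actually produced by `osm_poi_service.py` is not visible in this part of the diff and may differ.

```python
# Hypothetical mapping from POI names (English and French) to OSM tags.
POI_TAGS = {
    "school": ("amenity", "school"),
    "école": ("amenity", "school"),
    "hospital": ("amenity", "hospital"),
    "hôpital": ("amenity", "hospital"),
    "beach": ("natural", "beach"),
    "plage": ("natural", "beach"),
}


def build_overpass_query(poi_type: str, lat: float, lon: float,
                         radius_km: float = 3.0, limit: int = 5) -> str:
    """Build an Overpass QL query for nodes and ways of one POI type near a point."""
    key, value = POI_TAGS[poi_type.lower()]
    radius_m = int(radius_km * 1000)  # the around filter takes metres
    around = f"(around:{radius_m},{lat},{lon})"
    return (
        "[out:json][timeout:25];"
        f'(node["{key}"="{value}"]{around};'
        f'way["{key}"="{value}"]{around};);'
        f"out center {limit};"
    )
```

Keeping the query builder as a pure function means it can be checked offline, while the network call itself stays in a test like the one above.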
+
+
+# =============================================================================
+# Main
+# =============================================================================
+
+async def main():
+    """Run all tests"""
+    print(f"\n{Colors.BOLD}{Colors.HEADER}")
+    print("╔════════════════════════════════════════════════════════════╗")
+    print("║     RLM (Recursive Language Model) Test Suite for AIDA     ║")
+    print("╚════════════════════════════════════════════════════════════╝")
+    print(f"{Colors.ENDC}\n")
+
+    live_mode = "--live" in sys.argv
+
+    all_passed = True
+
+    # Test 1: Query Analyzer (no LLM calls)
+    if not test_query_analyzer():
+        all_passed = False
+
+    # Test 2: Strategy Selector
+    if not await test_strategy_selector():
+        all_passed = False
+
+    # Test 3: Pattern Detection
+    if not test_pattern_detection():
+        all_passed = False
+
+    # Test 4: Distance Calculation
+    if not test_distance_calculation():
+        all_passed = False
+
+    # Test 5: OpenStreetMap POI Service
+    await test_osm_poi_service()
+
+    # Test 6: Live RLM Search (only if --live flag)
+    if live_mode:
+        print_warning("\nRunning LIVE tests with actual LLM calls...")
+        await test_rlm_search_live()
+    else:
+        print_info("\nSkipping live LLM tests. Run with --live flag to include them.")
+        print_info("Example: python test_rlm.py --live")
+
+    # Summary
+    print_header("Test Summary")
+    if all_passed:
+        print_success("All offline tests passed!")
+        print_info("RLM is ready to use in AIDA.")
+    else:
+        print_fail("Some tests failed. Check the output above.")
+
+    # Usage examples
+    print(f"\n{Colors.BOLD}Usage Examples:{Colors.ENDC}")
+    print("""
+    # In your code:
+    from app.ai.services.rlm_search_service import rlm_search
+
+    # Multi-hop search (near POI)
+    results = await rlm_search("3-bed near schools in Cotonou")
+
+    # Boolean OR
+    results = await rlm_search("under 500k or has pool")
+
+    # Comparative
+    results = await rlm_search("compare Cotonou vs Calavi")
+
+    # The brain.py automatically uses RLM when appropriate!
+    """)
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
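`main()` mixes synchronous and asynchronous test functions and tracks the outcome by hand in `all_passed`. One alternative, sketched here under the assumption that every test returns a truthy value on success, is a small runner that awaits coroutines when needed and derives a process exit code from the results. `run_tests` and `exit_code` are hypothetical helpers, not part of the committed file.

```python
import asyncio
import inspect


async def run_tests(tests):
    """Run sync and async test callables in order; map each name to pass/fail."""
    results = {}
    for test in tests:
        outcome = test()
        # Async tests return an awaitable that still has to be awaited.
        if inspect.isawaitable(outcome):
            outcome = await outcome
        results[test.__name__] = bool(outcome)
    return results


def exit_code(results):
    """Return 0 if every test passed, 1 otherwise (suitable for sys.exit)."""
    return 0 if all(results.values()) else 1
```

A call such as `sys.exit(exit_code(asyncio.run(run_tests([...]))))` would let CI detect failures. The committed `main()` only prints a summary and does not set an exit status.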
|