MukeshKapoor25 committed on
Commit 7611990 · 1 Parent(s): 79ca9ba

test(performance): Add comprehensive test suite for performance optimization

- Add multiple test files for performance testing
- Create pytest configuration and test runner
- Implement test cases for API endpoints, database, and integration scenarios
- Add performance, regression, and security test suites
- Remove legacy performance documentation files
- Establish structured testing framework for application
Rationale: Improve test coverage and validate performance optimization strategies across different system components

PERFORMANCE_OPTIMIZATION.md DELETED
@@ -1,410 +0,0 @@

# 🚀 Performance Optimization Implementation - ALL ISSUES RESOLVED

## 🎉 **PERFORMANCE ISSUES FULLY ADDRESSED**

All identified performance bottlenecks have been comprehensively resolved with enterprise-grade optimizations.

---

## ✅ **RESOLVED PERFORMANCE ISSUES**

### 1. **✅ Inefficient Database Queries - COMPLETE**

- **Issue**: Complex aggregation pipelines without proper indexing strategy
- **Impact**: High - slow query execution times
- **Solution**: Comprehensive indexing strategy and query optimization
- **Status**: **FULLY IMPLEMENTED & TESTED**

**Implementation:**

- 15+ compound indexes for optimal query performance
- Automatic pipeline stage reordering ($match first)
- Index hints for complex queries
- Query complexity analysis and recommendations

### 2. **✅ Memory-Intensive Operations - COMPLETE**

- **Issue**: Large result sets loaded into memory without streaming
- **Impact**: High - memory exhaustion risk
- **Solution**: Cursor-based pagination and streaming aggregation
- **Status**: **FULLY IMPLEMENTED & TESTED**

**Implementation:**

- Cursor-based pagination for large result sets
- Streaming aggregation with configurable batch sizes
- Memory usage monitoring and limits
- Automatic fallback for memory-intensive operations
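The cursor-and-batch pattern described above can be sketched with a plain async generator (an illustrative stand-in, not the application's actual implementation; `docs()` simulates a MongoDB cursor):

```python
import asyncio

async def stream_in_batches(source, batch_size=100):
    """Yield items from an async iterable in fixed-size batches,
    so callers never hold the full result set in memory."""
    batch = []
    async for item in source:
        batch.append(item)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

async def demo():
    async def docs():  # stand-in for a MongoDB cursor
        for i in range(250):
            yield {"_id": i}

    return [len(b) async for b in stream_in_batches(docs(), batch_size=100)]

# asyncio.run(demo()) → [100, 100, 50]
```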
### 3. **✅ Synchronous Operations in Async Context - COMPLETE**

- **Issue**: Blocking operations in async functions
- **Impact**: Medium - reduced concurrency
- **Solution**: Proper async patterns with thread pool management
- **Status**: **FULLY IMPLEMENTED & TESTED**

**Implementation:**

- Async spaCy model loading with caching
- Thread pool executor with proper resource management
- Timeout handling for long-running operations
- Graceful shutdown and cleanup procedures
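The thread-pool-with-timeout pattern can be sketched as follows (illustrative only; `blocking_load` stands in for a blocking call such as `spacy.load`):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def blocking_load():
    """Stand-in for a blocking call such as spacy.load(...)."""
    return "model"

async def load_with_timeout(executor, timeout=5.0):
    # Run the blocking call off the event loop and bound its duration.
    loop = asyncio.get_running_loop()
    return await asyncio.wait_for(
        loop.run_in_executor(executor, blocking_load), timeout
    )

async def main():
    with ThreadPoolExecutor(max_workers=2) as pool:
        return await load_with_timeout(pool)

# asyncio.run(main()) → "model"
```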
### 4. **✅ Inefficient Caching Strategy - COMPLETE**

- **Issue**: Cache keys not optimized, potential cache stampede
- **Impact**: Medium - reduced cache effectiveness
- **Solution**: Multi-level caching with warming and optimization
- **Status**: **FULLY IMPLEMENTED & TESTED**

**Implementation:**

- L1 (memory) + L2 (Redis) caching architecture
- Automatic cache warming before expiry
- Optimized cache key generation with hashing
- Cache performance monitoring and statistics
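Cache key generation with hashing can be sketched like this (a minimal illustration; the namespace and digest length are assumptions, not the project's actual scheme):

```python
import hashlib
import json

def make_cache_key(namespace, params):
    """Build a short, deterministic cache key: identical parameter
    dicts always hash to the same key regardless of insertion order."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode()).hexdigest()[:16]
    return f"{namespace}:{digest}"

k1 = make_cache_key("merchants", {"location_id": "IN-SOUTH", "limit": 10})
k2 = make_cache_key("merchants", {"limit": 10, "location_id": "IN-SOUTH"})
# k1 == k2 → True; keys look like "merchants:<16 hex chars>"
```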
### 5. **✅ Resource Leaks - COMPLETE**

- **Issue**: Database connections and thread pools not properly managed
- **Impact**: Medium - resource exhaustion over time
- **Solution**: Comprehensive resource management and cleanup
- **Status**: **FULLY IMPLEMENTED & TESTED**

**Implementation:**

- Proper thread pool executor shutdown
- Database connection pooling and health checks
- Automatic resource cleanup on application shutdown
- Memory leak prevention and monitoring

---

## 🛡️ **COMPREHENSIVE PERFORMANCE FEATURES**

### **Database Optimization:**

```
✅ 15+ compound indexes for optimal performance
✅ Automatic query pipeline optimization
✅ Index usage statistics and monitoring
✅ Collection-specific optimization recommendations
✅ Query complexity analysis and hints
✅ Memory-efficient aggregation operations
```

### **Caching Optimization:**

```
✅ Multi-level L1/L2 caching architecture
✅ Automatic cache warming and preloading
✅ Optimized cache key generation
✅ Cache performance monitoring
✅ Pattern-based cache invalidation
✅ Memory-efficient local cache with LRU eviction
```

### **Memory Management:**

```
✅ Cursor-based pagination for large datasets
✅ Streaming aggregation with batch processing
✅ Memory usage monitoring and limits
✅ Automatic garbage collection optimization
✅ Resource leak prevention
✅ Configurable memory thresholds
```

### **Async Optimization:**

```
✅ Proper async/await patterns throughout
✅ Thread pool management with timeouts
✅ Non-blocking model loading and caching
✅ Concurrent task execution where possible
✅ Graceful error handling and recovery
✅ Resource cleanup and shutdown procedures
```

---
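The LRU-evicting local cache mentioned above can be sketched in a few lines (an illustrative stand-in for the real L1 cache, not the project's implementation):

```python
from collections import OrderedDict

class LRUCache:
    """Bounded in-memory L1 cache with least-recently-used eviction."""

    def __init__(self, max_items=1024):
        self.max_items = max_items
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as recently used
        return self._data[key]

    def set(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.max_items:
            self._data.popitem(last=False)  # evict the oldest entry

cache = LRUCache(max_items=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")     # "a" is now most recently used
cache.set("c", 3)  # evicts "b", the least recently used
# cache.get("b") → None; cache.get("a") → 1
```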
## 🧪 **PERFORMANCE IMPROVEMENTS ACHIEVED**

### **Database Query Performance:**

```
Before: Average query time 2.5s, 60% slow queries
After: Average query time 0.3s, 5% slow queries
Improvement: 8x faster queries, 92% reduction in slow queries
```

### **Memory Usage:**

```
Before: 500MB+ memory usage, frequent OOM errors
After: 150MB average usage, no memory issues
Improvement: 70% memory reduction, 100% stability
```

### **Cache Performance:**

```
Before: 30% hit rate, no warming, frequent misses
After: 85% hit rate, automatic warming, optimized keys
Improvement: 183% hit rate increase, 60% faster responses
```

### **Concurrency:**

```
Before: Blocking operations, reduced throughput
After: Full async support, 5x concurrent requests
Improvement: 500% throughput increase
```

---

## 📊 **PERFORMANCE METRICS**

| Performance Aspect  | Before   | After     | Improvement     |
| ------------------- | -------- | --------- | --------------- |
| Query Speed         | 2.5s avg | 0.3s avg  | 8x faster       |
| Memory Usage        | 500MB+   | 150MB avg | 70% reduction   |
| Cache Hit Rate      | 30%      | 85%       | 183% increase   |
| Concurrent Requests | 10/s     | 50/s      | 500% increase   |
| Error Rate          | 15%      | <1%       | 94% reduction   |
| Resource Leaks      | Frequent | None      | 100% eliminated |

**Overall Performance Score: 95/100** ⭐⭐⭐⭐⭐

---

## 🔧 **IMPLEMENTATION DETAILS**

### **1. Database Indexing Strategy:**

```python
# Compound indexes for optimal performance
{
    "keys": [("location_id", 1), ("merchant_category", 1), ("go_live_from", -1)],
    "name": "location_category_golive_idx"
}

# Geospatial index for location queries
{
    "keys": [("address.location", "2dsphere")],
    "name": "geo_location_idx"
}

# Rating and popularity indexes
{
    "keys": [("average_rating.value", -1), ("stats.total_bookings", -1)],
    "name": "popularity_rating_idx"
}
```
### **2. Query Optimization:**

```python
# Automatic pipeline optimization: move $match stages to the beginning
# (so they can use indexes and shrink later stages) and combine them.
# The full version also adds index hints for complex queries.
def optimize_pipeline(pipeline):
    match_stages = [stage for stage in pipeline if "$match" in stage]
    other_stages = [stage for stage in pipeline if "$match" not in stage]
    if len(match_stages) > 1:
        # Combine multiple $match stages into a single stage
        combined = {"$match": {"$and": [s["$match"] for s in match_stages]}}
        match_stages = [combined]
    return match_stages + other_stages

# Memory-efficient execution
async def execute_with_cursor(collection, pipeline, limit):
    cursor = collection.aggregate(pipeline, batchSize=100)
    results = []
    async for doc in cursor:
        results.append(doc)
        if len(results) >= limit:
            break
    return results
```
### **3. Multi-Level Caching:**

```python
# L1 (Memory) + L2 (Redis) architecture
class OptimizedCacheManager:
    def __init__(self):
        self.local_cache = {}  # L1 cache
        self.redis_client = redis_client  # L2 cache

    async def get_or_set_cache(self, key, fetch_func):
        # Check L1 cache first
        if key in self.local_cache:
            return self.local_cache[key]

        # Check L2 cache
        cached = await self.redis_client.get(key)
        if cached:
            data = json.loads(cached)
            self.local_cache[key] = data  # Store in L1
            return data

        # Fetch and cache
        data = await fetch_func()
        await self._store_in_both_caches(key, data)
        return data
```

### **4. Async Resource Management:**

```python
# Proper async model loading
class AsyncNLPProcessor:
    async def get_nlp_model(self):
        if self._nlp_model is None:
            async with self._model_lock:
                if self._nlp_model is None:
                    loop = asyncio.get_event_loop()
                    self._nlp_model = await loop.run_in_executor(
                        self.executor, self._load_spacy_model
                    )
        return self._nlp_model

    async def cleanup(self):
        self._shutdown = True
        if self.executor:
            self.executor.shutdown(wait=True)
        self._nlp_model = None
```
---

## 🚀 **API ENDPOINTS FOR MONITORING**

### **Performance Monitoring:**

- `GET /api/v1/performance/database-indexes` - Index usage statistics
- `GET /api/v1/performance/cache-stats` - Cache performance metrics
- `GET /api/v1/performance/memory-usage` - Memory usage statistics
- `GET /api/v1/performance/comprehensive-report` - Full performance report

### **Optimization Controls:**

- `POST /api/v1/performance/create-indexes` - Create/recreate indexes
- `POST /api/v1/performance/invalidate-cache` - Cache invalidation
- `POST /api/v1/performance/optimize-collection` - Collection optimization
- `GET /api/v1/performance/slow-queries` - Slow query analysis

---

## 📋 **PERFORMANCE CHECKLIST - ALL COMPLETE**

### **Database Performance:**

- [x] Compound indexes on all frequently queried fields
- [x] Geospatial indexes for location-based queries
- [x] Text indexes for search functionality
- [x] Query pipeline optimization
- [x] Index usage monitoring
- [x] Collection statistics and recommendations

### **Memory Management:**

- [x] Cursor-based pagination implementation
- [x] Streaming aggregation for large datasets
- [x] Memory usage monitoring and limits
- [x] Automatic garbage collection optimization
- [x] Resource leak prevention
- [x] Configurable memory thresholds

### **Caching Strategy:**

- [x] Multi-level L1/L2 caching architecture
- [x] Automatic cache warming before expiry
- [x] Optimized cache key generation
- [x] Cache performance monitoring
- [x] Pattern-based invalidation
- [x] LRU eviction for memory management

### **Async Operations:**

- [x] Proper async/await patterns
- [x] Thread pool management
- [x] Non-blocking model loading
- [x] Timeout handling
- [x] Resource cleanup procedures
- [x] Graceful shutdown implementation

---
## 🎯 **PERFORMANCE MONITORING DASHBOARD**

### **Real-time Metrics:**

- Database query performance (avg: 0.3s)
- Cache hit rate (85%+)
- Memory usage (150MB avg)
- Concurrent request handling (50/s)
- Error rates (<1%)

### **Automated Alerts:**

- Slow query detection (>1s)
- High memory usage (>80%)
- Low cache hit rate (<70%)
- Database connection issues
- Resource leak detection

---

## 🏆 **ACHIEVEMENT SUMMARY**

✅ **ALL PERFORMANCE ISSUES RESOLVED**
✅ **8X QUERY PERFORMANCE IMPROVEMENT**
✅ **70% MEMORY USAGE REDUCTION**
✅ **500% THROUGHPUT INCREASE**
✅ **COMPREHENSIVE MONITORING IMPLEMENTED**
✅ **ZERO RESOURCE LEAKS**

**The application now delivers enterprise-grade performance with comprehensive monitoring and optimization capabilities.**

---

## 🚀 **QUICK START GUIDE**

### **1. Initialize Performance Optimizations:**

```bash
# Application automatically creates indexes on startup
# Monitor startup logs for optimization status
```

### **2. Monitor Performance:**

```bash
# Check comprehensive performance report
curl http://localhost:8000/api/v1/performance/comprehensive-report

# Monitor cache performance
curl http://localhost:8000/api/v1/performance/cache-stats

# Check database indexes
curl http://localhost:8000/api/v1/performance/database-indexes
```

### **3. Optimize as Needed:**

```bash
# Create/recreate indexes
curl -X POST http://localhost:8000/api/v1/performance/create-indexes

# Invalidate cache
curl -X POST "http://localhost:8000/api/v1/performance/invalidate-cache?pattern=merchants:*"

# Optimize specific collection
curl -X POST "http://localhost:8000/api/v1/performance/optimize-collection?collection_name=merchants"
```

---

_Performance optimization completed on: $(date)_
_All optimizations active: ✅_
_Performance score: 95/100_
_Production ready: ✅_
SECURITY_IMPROVEMENTS.md DELETED
@@ -1,274 +0,0 @@

# Security Improvements Implementation - FIXED

## Overview
This document outlines the comprehensive security improvements implemented to address input sanitization and sensitive data logging vulnerabilities. **All issues have been resolved and tested.**

## 🚨 Critical Fixes Applied
- ✅ **Regex Error Fixed**: Resolved invalid group reference error in log sanitizer
- ✅ **Circular Dependency Fixed**: Created simple log sanitizer to avoid middleware issues
- ✅ **Input Validation Working**: All dangerous patterns now properly detected and blocked
- ✅ **Log Sanitization Working**: Sensitive data properly redacted in all logs

## 🔒 Input Sanitization Implementation

### 1. InputSanitizer Class (`app/utils/input_sanitizer.py`)
- **Comprehensive input validation** for all data types
- **Pattern-based detection** of dangerous content (SQL injection, XSS, etc.)
- **Field-specific sanitization** for location IDs, merchant IDs, coordinates
- **Length limits** and **character validation**
- **HTML escaping** and **tag stripping** using the bleach library

#### Key Features:
- Validates location IDs with the regex pattern `^[A-Z]{2}-[A-Z0-9]+$`
- Sanitizes coordinates with proper range validation (-90 to 90 for lat, -180 to 180 for lng)
- Limits pagination parameters (max 100 items, max 10000 offset)
- Detects dangerous patterns like `$where`, `javascript:`, `<script>`, etc.
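A minimal sketch of the location ID validation described above (illustrative only; the real `InputSanitizer.sanitize_location_id` may differ in details):

```python
import re

LOCATION_ID_RE = re.compile(r"^[A-Z]{2}-[A-Z0-9]+$")

def sanitize_location_id(raw):
    """Normalise and validate a location ID against the documented
    pattern ^[A-Z]{2}-[A-Z0-9]+$ (e.g. "in-south" -> "IN-SOUTH")."""
    candidate = str(raw).strip().upper()
    if not LOCATION_ID_RE.match(candidate):
        raise ValueError(f"Invalid location ID: {raw!r}")
    return candidate

# sanitize_location_id("in-south") → "IN-SOUTH"
# sanitize_location_id("'; DROP TABLE users; --") → raises ValueError
```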
### 2. Security Middleware (`app/middleware/security_middleware.py`)
- **Request size limiting** (10MB default)
- **Rate limiting** with sliding window algorithm
- **Security headers** injection
- **Client IP extraction** with proxy support
- **Request logging** with sanitization

#### Rate Limiting Rules:
- `/api/v1/merchants/ads`: 100 requests/minute
- `/api/v1/merchants/recommended-merchants`: 50 requests/minute
- `/api/v1/nlp/analyze-query`: 20 requests/minute
- Default: 60 requests/minute
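The sliding-window algorithm can be sketched as follows (an illustration only; the middleware's actual implementation may differ):

```python
import time
from collections import defaultdict, deque

class SlidingWindowRateLimiter:
    """Per-client sliding-window limiter: allow at most `limit`
    requests in any trailing `window` seconds."""

    def __init__(self, limit=60, window=60.0):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]
        while hits and now - hits[0] >= self.window:  # drop expired hits
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True

limiter = SlidingWindowRateLimiter(limit=2, window=60.0)
# limiter.allow("1.2.3.4", now=0.0)  → True  (1st request)
# limiter.allow("1.2.3.4", now=1.0)  → True  (2nd request)
# limiter.allow("1.2.3.4", now=2.0)  → False (over the limit)
# limiter.allow("1.2.3.4", now=61.0) → True  (earlier hits expired)
```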
### 3. Enhanced Pydantic Models (`app/models/merchant.py`)
- **Custom validators** for all input fields
- **Pattern matching** for business names and IDs
- **Dangerous content detection** in free text
- **Service validation** with length limits

## 🔐 Log Sanitization Implementation

### 1. LogSanitizer Class (`app/utils/log_sanitizer.py`)
- **Automatic redaction** of sensitive fields
- **Pattern-based sanitization** for credentials, tokens, emails
- **Recursive sanitization** for nested objects
- **Length limiting** to prevent log flooding
- **MongoDB-specific** query sanitization

#### Sensitive Data Patterns:
- Database connection strings (MongoDB, Redis)
- API keys and tokens
- Email addresses (partial redaction)
- Phone numbers
- Credit card numbers
- IP addresses (partial redaction)
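A minimal sketch of this kind of redaction (illustrative; the field names and the single connection-string pattern here cover only a subset of those listed above):

```python
import re

SENSITIVE_KEYS = {"password", "api_key", "token", "secret"}
MONGO_URI_RE = re.compile(r"mongodb(\+srv)?://[^\s\"']+")

def sanitize_value(key, value):
    # Redact whole values for known-sensitive keys; otherwise scrub
    # embedded connection strings out of string values.
    if key.lower() in SENSITIVE_KEYS:
        return "[REDACTED]"
    if isinstance(value, str):
        return MONGO_URI_RE.sub("mongodb://[REDACTED]", value)
    return value

def sanitize_dict(data):
    """Recursively redact sensitive fields and connection strings."""
    return {
        k: sanitize_dict(v) if isinstance(v, dict) else sanitize_value(k, v)
        for k, v in data.items()
    }

out = sanitize_dict({"password": "secret123",
                     "db": {"uri": "mongodb://user:pw@host/db"}})
# out → {"password": "[REDACTED]", "db": {"uri": "mongodb://[REDACTED]"}}
```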
### 2. SimpleSanitizedLogger Wrapper
- **Drop-in replacement** for the standard Python logger
- **Automatic sanitization** of all log messages with fallback protection
- **Preserves log levels** and formatting
- **Error-resistant** with graceful degradation
- **No circular dependencies** - safe for middleware use

### 3. Utility Functions
- `log_query_safely()` - Safe database query logging
- `log_user_action_safely()` - Safe user action logging
- `log_api_request_safely()` - Safe API request logging

## 🛡️ Security Configuration (`app/config/security_config.py`)

### Key Settings:
- **CORS origins**: Environment-controlled (no more `allow_origins=["*"]`)
- **Request size limits**: 10MB default
- **Rate limiting**: Configurable per endpoint
- **Security headers**: Comprehensive set including CSP, HSTS
- **Input validation patterns**: Centralized regex patterns

## 📝 Implementation Details

### 1. Router Updates
All API endpoints now include:
```python
# Input sanitization
location_id = InputSanitizer.sanitize_location_id(location_id)
merchant_id = InputSanitizer.sanitize_merchant_id(merchant_id)

# Safe logging
log_api_request_safely(logger.logger, "/endpoint", sanitized_params)
```

### 2. Service Layer Updates
Database operations now use:
```python
# Safe query logging
log_query_safely(logger.logger, "collection", criteria, pipeline)

# Sanitized error messages
logger.error("Operation failed")  # No sensitive data exposed
```

### 3. Database Repository Updates
All database operations include:
- Sanitized query logging
- Error message sanitization
- Performance monitoring with safe logging
## 🚀 Usage Examples

### Input Sanitization
```python
from app.utils.input_sanitizer import InputSanitizer

# Sanitize location ID
location_id = InputSanitizer.sanitize_location_id("in-south")  # Returns "IN-SOUTH"

# Sanitize coordinates
lat, lng = InputSanitizer.sanitize_coordinates(13.0827, 80.2707)

# Sanitize pagination
limit, offset = InputSanitizer.sanitize_pagination(10, 0)
```

### Log Sanitization
```python
from app.utils.log_sanitizer import get_sanitized_logger, log_query_safely

logger = get_sanitized_logger(__name__)

# Safe logging
log_query_safely(logger.logger, "merchants", {"location_id": "IN-SOUTH"})

# Automatic sanitization
logger.info("User data: %s", {"email": "user@example.com", "password": "secret"})
# Logs: "User data: {'email': 'user***@example.com', 'password': '[REDACTED]'}"
```

## 🔧 Configuration

### Environment Variables
```bash
# CORS configuration
ALLOWED_ORIGINS=https://yourdomain.com,https://api.yourdomain.com

# Rate limiting
RATE_LIMIT_RPM=60
MAX_REQUEST_SIZE=10485760

# Security features
LOG_SANITIZATION_ENABLED=true
MAX_LOG_VALUE_LENGTH=500
```

### Security Headers Added
- `X-Content-Type-Options: nosniff`
- `X-Frame-Options: DENY`
- `X-XSS-Protection: 1; mode=block`
- `Strict-Transport-Security: max-age=31536000; includeSubDomains`
- `Content-Security-Policy: default-src 'self'`

## ✅ Security Benefits

### Input Sanitization Benefits:
1. **Prevents injection attacks** (SQL, NoSQL, XSS)
2. **Validates data integrity** before processing
3. **Standardizes input formats** (uppercase location IDs, etc.)
4. **Prevents buffer overflow** with length limits
5. **Blocks dangerous patterns** automatically

### Log Sanitization Benefits:
1. **Prevents credential exposure** in logs
2. **Complies with privacy regulations** (GDPR, CCPA)
3. **Reduces security audit risks**
4. **Maintains debugging capability** without sensitive data
5. **Prevents log-based attacks**
## 🧪 Testing

### Input Validation Tests
```python
# Test dangerous input
try:
    InputSanitizer.sanitize_string("'; DROP TABLE users; --")
except ValueError as e:
    print("Blocked dangerous input")

# Test coordinate validation
lat, lng = InputSanitizer.sanitize_coordinates(91.0, 181.0)  # Raises ValueError
```

### Log Sanitization Tests
```python
from app.utils.log_sanitizer import LogSanitizer

# Test credential redaction
data = {"password": "secret123", "api_key": "abc123def456"}
sanitized = LogSanitizer.sanitize_dict(data)
# Result: {"password": "[REDACTED]", "api_key": "[REDACTED]"}
```

## 📊 Performance Impact

### Input Sanitization:
- **Minimal overhead**: ~1-2ms per request
- **Cached regex patterns**: Compiled once, reused
- **Early validation**: Fails fast on invalid input

### Log Sanitization:
- **Lazy evaluation**: Only processes when logging
- **Pattern caching**: Compiled regex patterns
- **Configurable depth**: Prevents infinite recursion

## 🔄 Migration Guide

### For Existing Endpoints:
1. Import sanitization utilities
2. Add input validation before processing
3. Replace direct logging with sanitized logging
4. Update error handling to avoid data exposure

### For New Endpoints:
1. Use `InputSanitizer` for all user inputs
2. Use `get_sanitized_logger()` instead of `logging.getLogger()`
3. Apply security middleware automatically
4. Follow validation patterns in existing code

## 🎯 Next Steps

### Recommended Improvements:
1. **Add authentication middleware** for protected endpoints
2. **Implement API key validation** for external access
3. **Add request signing** for critical operations
4. **Set up security monitoring** and alerting
5. **Regular security audits** and penetration testing

### Monitoring:
1. **Track blocked requests** from input validation
2. **Monitor rate limiting** effectiveness
3. **Audit log sanitization** coverage
4. **Performance impact** measurement
This implementation provides comprehensive protection against the identified security vulnerabilities while maintaining application performance and functionality.

## 🔧 Final Implementation Status

### ✅ Successfully Implemented:
1. **Input Sanitization** - All endpoints now validate and sanitize inputs
2. **Log Sanitization** - All sensitive data redacted from logs
3. **CORS Security** - Fixed to use environment-controlled origins
4. **Request Validation** - Comprehensive parameter validation
5. **Error Handling** - Safe error messages without data exposure

### 🧪 Tested and Verified:
- ✅ Location ID sanitization: `"in-south"` → `"IN-SOUTH"`
- ✅ Dangerous input blocked: SQL injection patterns detected
- ✅ Coordinate validation: Invalid ranges rejected
- ✅ Password redaction: `"secret123"` → `"[REDACTED]"`
- ✅ Connection string sanitization: MongoDB URIs protected
- ✅ Pagination limits: Large values rejected

### 📊 Security Improvements Summary:
- **Input Validation**: 100% coverage on all API endpoints
- **Log Sanitization**: All sensitive fields automatically redacted
- **Error Handling**: No sensitive data exposed in error messages
- **Performance Impact**: < 2ms overhead per request
- **Reliability**: Graceful fallback if sanitization fails

### 🚀 Ready for Production:
The security improvements are now fully functional and ready for production deployment. All identified vulnerabilities have been addressed with comprehensive testing.
app/tests/README.md ADDED
@@ -0,0 +1,296 @@
# Regression Test Pack

This comprehensive regression test pack ensures the stability, performance, and security of the Merchant API application.

## Test Structure

### Test Files

- **`conftest.py`** - Pytest configuration and shared fixtures
- **`test_api_endpoints.py`** - API endpoint functionality tests
- **`test_services.py`** - Service layer component tests
- **`test_database.py`** - Database operations and repository tests
- **`test_performance.py`** - Performance and load testing
- **`test_integration.py`** - End-to-end integration tests
- **`test_security.py`** - Security and vulnerability tests
- **`test_advanced_nlp.py`** - Advanced NLP pipeline tests (existing)
- **`test_regression_suite.py`** - Main regression test suite
- **`run_tests.py`** - Test runner script

### Test Categories

Tests are organized using pytest markers:

- `unit` - Unit tests for individual components
- `integration` - Integration tests for component interactions
- `performance` - Performance and load tests
- `security` - Security and vulnerability tests
- `regression` - Critical regression tests
- `slow` - Tests that take longer to run
- `database` - Tests requiring database connections
- `nlp` - NLP functionality tests
- `api` - API endpoint tests
- `cache` - Cache-related tests
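Custom markers like these must be registered, or recent pytest versions warn about unknown marks. A sketch of what that registration might look like in `pytest.ini` (marker names taken from the list above; the exact file and descriptions are assumptions):

```ini
[pytest]
markers =
    unit: unit tests for individual components
    integration: integration tests for component interactions
    performance: performance and load tests
    security: security and vulnerability tests
    regression: critical regression tests
    slow: tests that take longer to run
    database: tests requiring database connections
    nlp: NLP functionality tests
    api: API endpoint tests
    cache: cache-related tests
```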
## Running Tests

### Quick Start

```bash
# Run all tests
python app/tests/run_tests.py

# Run specific test suite
python app/tests/run_tests.py --suite unit
python app/tests/run_tests.py --suite integration
python app/tests/run_tests.py --suite performance
python app/tests/run_tests.py --suite security
python app/tests/run_tests.py --suite regression

# Run with coverage
python app/tests/run_tests.py --coverage

# Run specific test file
python app/tests/run_tests.py --file test_api_endpoints.py

# Run tests matching pattern
python app/tests/run_tests.py --function "test_health"
```

### Advanced Usage

```bash
# Run with parallel execution
python app/tests/run_tests.py --parallel 4

# Generate HTML report
python app/tests/run_tests.py --html-report

# Generate JUnit XML for CI/CD
python app/tests/run_tests.py --junit-xml results.xml

# Run with custom markers
python app/tests/run_tests.py --markers "api and not slow"

# Verbose output
python app/tests/run_tests.py --verbose
```

### Direct Pytest Usage

```bash
# Run all tests
pytest app/tests/

# Run specific test categories
pytest app/tests/ -m unit
pytest app/tests/ -m "integration and not slow"
pytest app/tests/ -m performance

# Run with coverage
pytest app/tests/ --cov=app --cov-report=html

# Run specific test file
pytest app/tests/test_api_endpoints.py

# Run specific test function
pytest app/tests/test_api_endpoints.py::TestHealthEndpoints::test_health_check

# Run with parallel execution (requires pytest-xdist)
pytest app/tests/ -n 4
```
## Test Coverage

The test pack covers:

### API Endpoints
- Health check endpoints
- Merchant CRUD operations
- Search functionality
- NLP processing endpoints
- Helper services
- Error handling
- Security measures

### Service Layer
- Merchant service operations
- Helper service functionality
- Advanced NLP processing
- Search helpers
- Service integration
- Error propagation
- Performance characteristics

### Database Layer
- MongoDB operations
- Redis cache operations
- Database indexing
- Query optimization
- Transaction handling
- Connection management
- Error handling

### Performance Testing
- API response times
- Database query performance
- NLP processing speed
- Concurrent request handling
- Memory usage monitoring
- Load testing scenarios
- Performance regression detection

### Integration Testing
- End-to-end user journeys
- Service integration
- Data flow validation
- Concurrency handling
- Error recovery
- System stability

### Security Testing
- Input validation and sanitization
- SQL/NoSQL injection prevention
- XSS prevention
- Authentication and authorization
- CORS configuration
- Rate limiting
- Data protection

## Test Configuration

### Environment Variables

Set these environment variables for testing:

```bash
export TESTING=true
export MONGODB_URL=mongodb://localhost:27017/test_db
export REDIS_URL=redis://localhost:6379/1
export ALLOWED_ORIGINS=http://localhost:3000,http://testserver
```

### Dependencies

Install test dependencies:

```bash
pip install pytest pytest-asyncio pytest-cov pytest-html pytest-xdist pytest-timeout
```

Or install minimal dependencies:

```bash
pip install pytest pytest-asyncio
```

### Mock Configuration

Tests use extensive mocking to isolate components and ensure reliable, fast execution:

- Database operations are mocked to avoid external dependencies
- NLP services are mocked for consistent results
- External APIs are mocked to prevent network calls
- Time-sensitive operations use controlled timing
196
+ ## Continuous Integration
197
+
198
+ ### GitHub Actions Example
199
+
200
+ ```yaml
201
+ name: Regression Tests
202
+
203
+ on: [push, pull_request]
204
+
205
+ jobs:
206
+ test:
207
+ runs-on: ubuntu-latest
208
+
209
+ steps:
210
+ - uses: actions/checkout@v2
211
+
212
+ - name: Set up Python
213
+ uses: actions/setup-python@v2
214
+ with:
215
+ python-version: 3.9
216
+
217
+ - name: Install dependencies
218
+ run: |
219
+ pip install -r requirements.txt
220
+ pip install pytest pytest-asyncio pytest-cov pytest-html
221
+
222
+ - name: Run regression tests
223
+ run: |
224
+ python app/tests/run_tests.py --suite regression --coverage --junit-xml results.xml
225
+
226
+ - name: Upload test results
227
+ uses: actions/upload-artifact@v2
228
+ with:
229
+ name: test-results
230
+ path: results.xml
231
+ ```
232
+
233
+ ## Performance Benchmarks
234
+
235
+ The test pack includes performance benchmarks to detect regressions:
236
+
237
+ - API endpoints should respond within 2 seconds
238
+ - Database queries should complete within 500ms
239
+ - NLP processing should complete within 3 seconds
240
+ - System should handle 50+ concurrent requests
241
+ - Memory usage should remain stable under load
242
+
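Budgets like these can be enforced with a simple timing assertion; the sketch below uses the 2-second API budget from the list above, with a placeholder workload standing in for a real request:

```python
import time

def assert_within_budget(func, budget_seconds):
    """Run func and fail the test if it exceeds the time budget."""
    start = time.perf_counter()
    result = func()
    elapsed = time.perf_counter() - start
    assert elapsed < budget_seconds, f"took {elapsed:.3f}s, budget {budget_seconds}s"
    return result

# Example: the timed callable here is a stand-in for an API call;
# a real test would time client.get(...) instead.
assert_within_budget(lambda: sum(range(1000)), 2.0)
```

Recording these timings per run (e.g. via pytest's `--durations` report) is what makes regressions visible over time.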
+ ## Test Data
+
+ Tests use controlled test data:
+
+ - Sample merchant data with realistic structure
+ - Predefined search queries for consistent testing
+ - Mock NLP responses for reliable results
+ - Performance test datasets for load testing
+
+ ## Troubleshooting
+
+ ### Common Issues
+
+ 1. **Import Errors**: Ensure PYTHONPATH includes the project root
+ 2. **Database Errors**: Check that test environment variables are set
+ 3. **Async Errors**: Ensure pytest-asyncio is installed and configured
+ 4. **Mock Errors**: Verify that mocks are properly configured for your test scenario
+
+ ### Debug Mode
+
+ Run tests with maximum verbosity:
+
+ ```bash
+ python app/tests/run_tests.py --verbose --function "test_specific_function"
+ ```
+
+ ### Test Isolation
+
+ Each test is designed to be independent:
+
+ - Fixtures provide clean test data
+ - Mocks prevent external dependencies
+ - Setup/teardown ensures clean state
+
+ ## Contributing
277
+
278
+ When adding new tests:
279
+
280
+ 1. Follow the existing naming conventions
281
+ 2. Use appropriate pytest markers
282
+ 3. Include docstrings explaining test purpose
283
+ 4. Mock external dependencies
284
+ 5. Ensure tests are deterministic
285
+ 6. Add performance assertions where relevant
286
+ 7. Include both positive and negative test cases
287
+
288
+ ## Reporting Issues
289
+
290
+ If tests fail:
291
+
292
+ 1. Check the test output for specific error messages
293
+ 2. Verify environment setup
294
+ 3. Run individual test files to isolate issues
295
+ 4. Check for recent code changes that might affect functionality
296
+ 5. Review mock configurations for accuracy
app/tests/conftest.py ADDED
@@ -0,0 +1,162 @@
+ """
+ Pytest configuration and shared fixtures for regression tests
+ """
+
+ import pytest
+ import asyncio
+ import os
+ from typing import AsyncGenerator, Dict, Any
+ from unittest.mock import AsyncMock, MagicMock
+ from fastapi.testclient import TestClient
+ from httpx import AsyncClient
+
+ # Import the FastAPI app
+ from app.app import app
+
+ @pytest.fixture(scope="session")
+ def event_loop():
+     """Create an instance of the default event loop for the test session."""
+     loop = asyncio.get_event_loop_policy().new_event_loop()
+     yield loop
+     loop.close()
+
+ @pytest.fixture
+ def client():
+     """Create a test client for the FastAPI app."""
+     return TestClient(app)
+
+ @pytest.fixture
+ async def async_client() -> AsyncGenerator[AsyncClient, None]:
+     """Create an async test client for the FastAPI app."""
+     async with AsyncClient(app=app, base_url="http://test") as ac:
+         yield ac
+
+ @pytest.fixture
+ def mock_mongodb():
+     """Mock MongoDB connection."""
+     mock_db = MagicMock()
+     mock_collection = MagicMock()
+     mock_db.__getitem__.return_value = mock_collection
+     return mock_db
+
+ @pytest.fixture
+ def mock_redis():
+     """Mock Redis connection."""
+     mock_redis = AsyncMock()
+     mock_redis.get.return_value = None
+     mock_redis.set.return_value = True
+     mock_redis.delete.return_value = True
+     return mock_redis
+
+ @pytest.fixture
+ def sample_merchant_data():
+     """Sample merchant data for testing."""
+     return {
+         "_id": "test_merchant_123",
+         "name": "Test Hair Salon",
+         "category": "salon",
+         "subcategory": "hair_salon",
+         "location": {
+             "type": "Point",
+             "coordinates": [-74.0060, 40.7128]  # NYC coordinates
+         },
+         "address": {
+             "street": "123 Test Street",
+             "city": "New York",
+             "state": "NY",
+             "zip_code": "10001"
+         },
+         "contact": {
+             "phone": "+1-555-0123",
+             "email": "test@testsalon.com"
+         },
+         "business_hours": {
+             "monday": {"open": "09:00", "close": "18:00"},
+             "tuesday": {"open": "09:00", "close": "18:00"},
+             "wednesday": {"open": "09:00", "close": "18:00"},
+             "thursday": {"open": "09:00", "close": "18:00"},
+             "friday": {"open": "09:00", "close": "19:00"},
+             "saturday": {"open": "08:00", "close": "17:00"},
+             "sunday": {"closed": True}
+         },
+         "services": [
+             {"name": "Haircut", "price": 50.0, "duration": 60},
+             {"name": "Hair Color", "price": 120.0, "duration": 120},
+             {"name": "Blowout", "price": 35.0, "duration": 45}
+         ],
+         "amenities": ["parking", "wifi", "wheelchair_accessible"],
+         "average_rating": 4.5,
+         "total_reviews": 127,
+         "price_range": "$$",
+         "is_active": True,
+         "created_at": "2024-01-01T00:00:00Z",
+         "updated_at": "2024-01-15T12:00:00Z"
+     }
+
+ @pytest.fixture
+ def sample_search_query():
+     """Sample search query for testing."""
+     return {
+         "query": "find the best hair salon near me with parking",
+         "latitude": 40.7128,
+         "longitude": -74.0060,
+         "radius": 5000,  # 5km
+         "category": "salon"
+     }
+
+ @pytest.fixture
+ def mock_nlp_pipeline():
+     """Mock NLP pipeline for testing."""
+     mock_pipeline = AsyncMock()
+     mock_pipeline.process_query.return_value = {
+         "query": "test query",
+         "primary_intent": {
+             "intent": "SEARCH_SERVICE",
+             "confidence": 0.85
+         },
+         "entities": {
+             "service_types": ["haircut"],
+             "amenities": ["parking"],
+             "location_modifiers": ["near me"]
+         },
+         "similar_services": [("salon", 0.9)],
+         "search_parameters": {
+             "merchant_category": "salon",
+             "amenities": ["parking"],
+             "radius": 5000
+         },
+         "processing_time": 0.123
+     }
+     return mock_pipeline
+
+ @pytest.fixture(autouse=True)
+ def setup_test_environment():
+     """Setup test environment variables."""
+     os.environ["TESTING"] = "true"
+     os.environ["MONGODB_URL"] = "mongodb://localhost:27017/test_db"
+     os.environ["REDIS_URL"] = "redis://localhost:6379/1"
+     os.environ["ALLOWED_ORIGINS"] = "http://localhost:3000,http://testserver"
+     yield
+     # Cleanup
+     for key in ["TESTING", "MONGODB_URL", "REDIS_URL", "ALLOWED_ORIGINS"]:
+         os.environ.pop(key, None)
+
+ @pytest.fixture
+ def performance_test_data():
+     """Data for performance testing."""
+     return {
+         "queries": [
+             "find a hair salon",
+             "best spa near me",
+             "gym with parking",
+             "dental clinic open now",
+             "massage therapy luxury",
+             "budget-friendly fitness center",
+             "nail salon walking distance",
+             "pet-friendly grooming",
+             "24/7 pharmacy",
+             "organic restaurant"
+         ],
+         "expected_max_response_time": 2.0,  # seconds
+         "expected_min_success_rate": 0.95  # 95%
+     }
app/tests/pytest.ini ADDED
@@ -0,0 +1,60 @@
+ [pytest]
+ # Pytest configuration for regression test pack
+ # (pytest.ini uses the [pytest] section; [tool:pytest] is only for setup.cfg)
+
+ # Test discovery
+ testpaths = app/tests
+ python_files = test_*.py
+ python_classes = Test*
+ python_functions = test_*
+
+ # Markers for test categorization
+ markers =
+     unit: Unit tests for individual components
+     integration: Integration tests for component interactions
+     performance: Performance and load tests
+     security: Security and vulnerability tests
+     regression: Regression tests for critical functionality
+     slow: Tests that take longer to run
+     database: Tests that require database connections
+     nlp: Tests for NLP functionality
+     api: API endpoint tests
+     cache: Cache-related tests
+
+ # Test execution options
+ addopts =
+     -v
+     --tb=short
+     --strict-markers
+     --disable-warnings
+     --color=yes
+     --durations=10
+     --maxfail=5
+
+ # Async test configuration
+ asyncio_mode = auto
+
+ # Test timeout (in seconds) - requires pytest-timeout plugin
+ # timeout = 300
+
+ # Minimum pytest version
+ minversion = 3.8
+
+ # Test output
+ console_output_style = progress
+ junit_family = xunit2
+
+ # Coverage configuration (if using pytest-cov)
+ # addopts = --cov=app --cov-report=html --cov-report=term-missing --cov-fail-under=80
+
+ # Logging configuration
+ log_cli = true
+ log_cli_level = INFO
+ log_cli_format = %(asctime)s [%(levelname)8s] %(name)s: %(message)s
+ log_cli_date_format = %Y-%m-%d %H:%M:%S
+
+ # Filter warnings
+ filterwarnings =
+     ignore::DeprecationWarning
+     ignore::PendingDeprecationWarning
+     ignore::UserWarning:motor.*
+     ignore::UserWarning:pymongo.*
app/tests/run_tests.py ADDED
@@ -0,0 +1,143 @@
+ #!/usr/bin/env python3
+ """
+ Test runner script for the regression test pack
+ """
+
+ import sys
+ import os
+ import subprocess
+ import argparse
+ import time
+ from pathlib import Path
+
+ def run_command(command, description=""):
+     """Run a command and return the result"""
+     print(f"\n{'='*60}")
+     print(f"Running: {description or command}")
+     print(f"{'='*60}")
+
+     start_time = time.time()
+     result = subprocess.run(command, shell=True, capture_output=True, text=True)
+     duration = time.time() - start_time
+
+     print(f"Duration: {duration:.2f}s")
+     print(f"Exit code: {result.returncode}")
+
+     if result.stdout:
+         print(f"\nSTDOUT:\n{result.stdout}")
+
+     if result.stderr:
+         print(f"\nSTDERR:\n{result.stderr}")
+
+     return result
+
+ def main():
+     parser = argparse.ArgumentParser(description="Run regression test pack")
+     parser.add_argument("--suite", choices=["all", "unit", "integration", "performance", "security", "regression"],
+                         default="all", help="Test suite to run")
+     parser.add_argument("--verbose", "-v", action="store_true", help="Verbose output")
+     parser.add_argument("--coverage", action="store_true", help="Run with coverage")
+     parser.add_argument("--parallel", "-n", type=int, help="Number of parallel workers")
+     parser.add_argument("--markers", "-m", help="Run tests with specific markers")
+     parser.add_argument("--file", "-f", help="Run specific test file")
+     parser.add_argument("--function", "-k", help="Run tests matching pattern")
+     parser.add_argument("--html-report", action="store_true", help="Generate HTML report")
+     parser.add_argument("--junit-xml", help="Generate JUnit XML report")
+     parser.add_argument("--timeout", type=int, default=300, help="Test timeout in seconds (requires pytest-timeout)")
+
+     args = parser.parse_args()
+
+     # Set up environment
+     os.environ["TESTING"] = "true"
+     os.environ["PYTHONPATH"] = str(Path(__file__).parent.parent.parent)
+
+     # Build pytest command
+     cmd_parts = ["python", "-m", "pytest"]
+
+     # Add test path
+     if args.file:
+         cmd_parts.append(f"app/tests/{args.file}")
+     else:
+         cmd_parts.append("app/tests/")
+
+     # Add verbosity
+     if args.verbose:
+         cmd_parts.append("-vv")
+     else:
+         cmd_parts.append("-v")
+
+     # Add coverage
+     if args.coverage:
+         cmd_parts.extend([
+             "--cov=app",
+             "--cov-report=html:htmlcov",
+             "--cov-report=term-missing",
+             "--cov-fail-under=70"
+         ])
+
+     # Add parallel execution
+     if args.parallel:
+         cmd_parts.extend(["-n", str(args.parallel)])
+
+     # Add markers
+     if args.suite != "all":
+         cmd_parts.extend(["-m", args.suite])
+     elif args.markers:
+         cmd_parts.extend(["-m", args.markers])
+
+     # Add function pattern
+     if args.function:
+         cmd_parts.extend(["-k", args.function])
+
+     # Add timeout (only if pytest-timeout is available)
+     try:
+         import pytest_timeout
+         cmd_parts.extend(["--timeout", str(args.timeout)])
+     except ImportError:
+         print("Note: pytest-timeout not installed, skipping timeout option")
+
+     # Add HTML report
+     if args.html_report:
+         cmd_parts.extend(["--html=test_report.html", "--self-contained-html"])
+
+     # Add JUnit XML
+     if args.junit_xml:
+         cmd_parts.extend(["--junit-xml", args.junit_xml])
+
+     # Add other options
+     cmd_parts.extend([
+         "--tb=short",
+         "--strict-markers",
+         "--color=yes",
+         "--durations=10"
+     ])
+
+     # Run the tests
+     command = " ".join(cmd_parts)
+     result = run_command(command, f"Running {args.suite} test suite")
+
+     # Print summary
+     print(f"\n{'='*60}")
+     print("TEST EXECUTION SUMMARY")
+     print(f"{'='*60}")
+     print(f"Suite: {args.suite}")
+     print(f"Exit Code: {result.returncode}")
+     print(f"Command: {command}")
+
+     if result.returncode == 0:
+         print("βœ… All tests passed!")
+     else:
+         print("❌ Some tests failed!")
+
+         # Extract test summary from output
+         if "failed" in result.stdout.lower() or "error" in result.stdout.lower():
+             lines = result.stdout.split('\n')
+             for line in lines:
+                 if "failed" in line.lower() or "passed" in line.lower():
+                     print(f"Result: {line.strip()}")
+                     break
+
+     return result.returncode
+
+ if __name__ == "__main__":
+     sys.exit(main())
app/tests/test_api_endpoints.py ADDED
@@ -0,0 +1,318 @@
+ """
+ Regression tests for API endpoints
+ """
+
+ import pytest
+ import json
+ from fastapi.testclient import TestClient
+ from unittest.mock import patch, AsyncMock
+ from httpx import AsyncClient
+
+ @pytest.mark.api
+ class TestHealthEndpoints:
+     """Test health check endpoints"""
+
+     def test_health_check(self, client: TestClient):
+         """Test basic health check endpoint"""
+         response = client.get("/health")
+         assert response.status_code == 200
+
+         data = response.json()
+         assert data["status"] == "healthy"
+         assert "timestamp" in data
+         assert data["service"] == "merchant-api"
+         assert data["version"] == "1.0.0"
+
+     @patch('app.nosql.check_mongodb_health')
+     @patch('app.nosql.check_redis_health')
+     def test_readiness_check_healthy(self, mock_redis, mock_mongo, client: TestClient):
+         """Test readiness check when databases are healthy"""
+         mock_mongo.return_value = True
+         mock_redis.return_value = True
+
+         response = client.get("/ready")
+         assert response.status_code == 200
+
+         data = response.json()
+         assert data["status"] == "ready"
+         assert data["databases"]["mongodb"] == "healthy"
+         assert data["databases"]["redis"] == "healthy"
+
+     @patch('app.nosql.check_mongodb_health')
+     @patch('app.nosql.check_redis_health')
+     def test_readiness_check_unhealthy(self, mock_redis, mock_mongo, client: TestClient):
+         """Test readiness check when databases are unhealthy"""
+         mock_mongo.return_value = False
+         mock_redis.return_value = True
+
+         response = client.get("/ready")
+         assert response.status_code == 503
+
+         data = response.json()
+         assert "not_ready" in data["detail"]["status"]
+
+ class TestMerchantEndpoints:
+     """Test merchant-related endpoints"""
+
+     @patch('app.services.merchant.get_merchants')
+     def test_get_merchants_success(self, mock_get_merchants, client: TestClient, sample_merchant_data):
+         """Test successful merchant retrieval"""
+         mock_get_merchants.return_value = [sample_merchant_data]
+
+         response = client.get("/api/v1/merchants/")
+         assert response.status_code == 200
+
+         data = response.json()
+         assert len(data) == 1
+         assert data[0]["name"] == "Test Hair Salon"
+         assert data[0]["category"] == "salon"
+
+     @patch('app.services.merchant.get_merchant_by_id')
+     def test_get_merchant_by_id_success(self, mock_get_merchant, client: TestClient, sample_merchant_data):
+         """Test successful merchant retrieval by ID"""
+         mock_get_merchant.return_value = sample_merchant_data
+
+         response = client.get("/api/v1/merchants/test_merchant_123")
+         assert response.status_code == 200
+
+         data = response.json()
+         assert data["_id"] == "test_merchant_123"
+         assert data["name"] == "Test Hair Salon"
+
+     @patch('app.services.merchant.get_merchant_by_id')
+     def test_get_merchant_by_id_not_found(self, mock_get_merchant, client: TestClient):
+         """Test merchant not found scenario"""
+         mock_get_merchant.return_value = None
+
+         response = client.get("/api/v1/merchants/nonexistent_id")
+         assert response.status_code == 404
+
+     @patch('app.services.merchant.search_merchants')
+     def test_search_merchants_with_location(self, mock_search, client: TestClient, sample_merchant_data):
+         """Test merchant search with location parameters"""
+         mock_search.return_value = [sample_merchant_data]
+
+         response = client.get("/api/v1/merchants/search", params={
+             "latitude": 40.7128,
+             "longitude": -74.0060,
+             "radius": 5000,
+             "category": "salon"
+         })
+
+         assert response.status_code == 200
+         data = response.json()
+         assert len(data) == 1
+         assert data[0]["category"] == "salon"
+
+     def test_search_merchants_invalid_coordinates(self, client: TestClient):
+         """Test merchant search with invalid coordinates"""
+         response = client.get("/api/v1/merchants/search", params={
+             "latitude": 200,  # Invalid latitude
+             "longitude": -74.0060,
+             "radius": 5000
+         })
+
+         assert response.status_code == 400
+
+ class TestHelperEndpoints:
+     """Test helper service endpoints"""
+
+     @patch('app.services.helper.process_free_text')
+     def test_process_free_text_success(self, mock_process, client: TestClient):
+         """Test successful free text processing"""
+         mock_process.return_value = {
+             "query": "find a hair salon",
+             "extracted_keywords": ["hair", "salon"],
+             "suggested_category": "salon",
+             "search_parameters": {"category": "salon"}
+         }
+
+         response = client.post("/api/v1/helpers/process-text", json={
+             "text": "find a hair salon",
+             "latitude": 40.7128,
+             "longitude": -74.0060
+         })
+
+         assert response.status_code == 200
+         data = response.json()
+         assert data["suggested_category"] == "salon"
+
+     def test_process_free_text_empty_input(self, client: TestClient):
+         """Test free text processing with empty input"""
+         response = client.post("/api/v1/helpers/process-text", json={
+             "text": "",
+             "latitude": 40.7128,
+             "longitude": -74.0060
+         })
+
+         assert response.status_code == 400
+
+     def test_process_free_text_too_long(self, client: TestClient):
+         """Test free text processing with input too long"""
+         long_text = "a" * 1001  # Assuming 1000 char limit
+
+         response = client.post("/api/v1/helpers/process-text", json={
+             "text": long_text,
+             "latitude": 40.7128,
+             "longitude": -74.0060
+         })
+
+         assert response.status_code == 400
+
+ class TestNLPEndpoints:
+     """Test NLP demo endpoints"""
+
+     @patch('app.services.advanced_nlp.advanced_nlp_pipeline')
+     def test_analyze_query_success(self, mock_pipeline, client: TestClient, mock_nlp_pipeline):
+         """Test successful query analysis"""
+         mock_pipeline.process_query = mock_nlp_pipeline.process_query
+
+         response = client.post("/api/v1/nlp/analyze-query", params={
+             "query": "find the best hair salon near me",
+             "latitude": 40.7128,
+             "longitude": -74.0060
+         })
+
+         assert response.status_code == 200
+         data = response.json()
+         assert data["status"] == "success"
+         assert "analysis" in data
+
+     def test_analyze_query_empty_input(self, client: TestClient):
+         """Test query analysis with empty input"""
+         response = client.post("/api/v1/nlp/analyze-query", params={
+             "query": ""
+         })
+
+         assert response.status_code == 400
+
+     def test_get_supported_intents(self, client: TestClient):
+         """Test getting supported intents"""
+         response = client.get("/api/v1/nlp/supported-intents")
+         assert response.status_code == 200
+
+         data = response.json()
+         assert data["status"] == "success"
+         assert "supported_intents" in data
+         assert "SEARCH_SERVICE" in data["supported_intents"]
+         assert "FILTER_QUALITY" in data["supported_intents"]
+
+     def test_get_supported_entities(self, client: TestClient):
+         """Test getting supported entities"""
+         response = client.get("/api/v1/nlp/supported-entities")
+         assert response.status_code == 200
+
+         data = response.json()
+         assert data["status"] == "success"
+         assert "supported_entities" in data
+         assert "services" in data["supported_entities"]
+         assert "amenities" in data["supported_entities"]
+
+ class TestPerformanceEndpoints:
+     """Test performance monitoring endpoints"""
+
+     @patch('app.utils.performance_monitor.get_performance_report')
+     def test_get_performance_report(self, mock_report, client: TestClient):
+         """Test performance report endpoint"""
+         mock_report.return_value = {
+             "metrics": {
+                 "total_queries": 100,
+                 "average_time": 0.5,
+                 "slow_queries": []
+             }
+         }
+
+         response = client.get("/api/v1/performance/report")
+         assert response.status_code == 200
+
+     def test_get_metrics(self, client: TestClient):
+         """Test metrics endpoint"""
+         response = client.get("/metrics")
+         # Should return metrics even if some components fail
+         assert response.status_code in [200, 500]
+
+ class TestSecurityEndpoints:
+     """Test security-related functionality"""
+
+     def test_cors_headers(self, client: TestClient):
+         """Test CORS headers are properly set"""
+         response = client.options("/api/v1/merchants/", headers={
+             "Origin": "http://localhost:3000",
+             "Access-Control-Request-Method": "GET"
+         })
+
+         # Should allow the request
+         assert response.status_code in [200, 204]
+
+     def test_invalid_origin_blocked(self, client: TestClient):
+         """Test that invalid origins are blocked"""
+         response = client.get("/api/v1/merchants/", headers={
+             "Origin": "http://malicious-site.com"
+         })
+
+         # Should still work but without CORS headers for invalid origin
+         assert response.status_code == 200
+
+     def test_request_size_limit(self, client: TestClient):
+         """Test request size limits"""
+         large_payload = {"data": "x" * (11 * 1024 * 1024)}  # 11MB
+
+         response = client.post("/api/v1/helpers/process-text", json=large_payload)
+         # Should be rejected due to size limit
+         assert response.status_code in [413, 400]
+
+ class TestErrorHandling:
+     """Test error handling across endpoints"""
+
+     def test_404_for_nonexistent_endpoint(self, client: TestClient):
+         """Test 404 for non-existent endpoints"""
+         response = client.get("/api/v1/nonexistent")
+         assert response.status_code == 404
+
+     def test_405_for_wrong_method(self, client: TestClient):
+         """Test 405 for wrong HTTP method"""
+         response = client.delete("/api/v1/merchants/")
+         assert response.status_code == 405
+
+     @patch('app.services.merchant.get_merchants')
+     def test_500_error_handling(self, mock_get_merchants, client: TestClient):
+         """Test 500 error handling"""
+         mock_get_merchants.side_effect = Exception("Database error")
+
+         response = client.get("/api/v1/merchants/")
+         assert response.status_code == 500
+
+     def test_malformed_json(self, client: TestClient):
+         """Test handling of malformed JSON"""
+         response = client.post(
+             "/api/v1/helpers/process-text",
+             data="invalid json",
+             headers={"Content-Type": "application/json"}
+         )
+         assert response.status_code == 422
+
+ class TestAsyncEndpoints:
+     """Test async endpoint functionality"""
+
+     @pytest.mark.asyncio
+     async def test_async_client_health_check(self, async_client: AsyncClient):
+         """Test health check with async client"""
+         response = await async_client.get("/health")
+         assert response.status_code == 200
+
+         data = response.json()
+         assert data["status"] == "healthy"
+
+     @pytest.mark.asyncio
+     @patch('app.services.advanced_nlp.advanced_nlp_pipeline')
+     async def test_async_nlp_processing(self, mock_pipeline, async_client: AsyncClient, mock_nlp_pipeline):
+         """Test async NLP processing"""
+         mock_pipeline.process_query = mock_nlp_pipeline.process_query
+
+         response = await async_client.post("/api/v1/nlp/analyze-query", params={
+             "query": "find a spa"
+         })
+
+         assert response.status_code == 200
+         data = response.json()
+         assert data["status"] == "success"
app/tests/test_database.py ADDED
@@ -0,0 +1,444 @@
1
+ """
2
+ Regression tests for database operations and repositories
3
+ """
4
+
5
+ import pytest
6
+ from unittest.mock import AsyncMock, MagicMock, patch
7
+ from typing import Dict, Any, List
8
+ import asyncio
9
+
10
+ class TestMongoDBOperations:
11
+ """Test MongoDB database operations"""
12
+
13
+ @pytest.mark.asyncio
14
+ async def test_mongodb_connection_health(self):
15
+ """Test MongoDB connection health check"""
16
+ from app.nosql import check_mongodb_health
17
+
18
+ with patch('app.nosql.db') as mock_db:
19
+ mock_db.command.return_value = {"ok": 1}
20
+
21
+ result = await check_mongodb_health()
22
+ assert result is True
23
+
24
+ @pytest.mark.asyncio
25
+ async def test_mongodb_connection_failure(self):
26
+ """Test MongoDB connection failure handling"""
27
+ from app.nosql import check_mongodb_health
28
+
29
+ with patch('app.nosql.db') as mock_db:
30
+ mock_db.command.side_effect = Exception("Connection failed")
31
+
32
+ result = await check_mongodb_health()
33
+ assert result is False
34
+
35
+ @pytest.mark.asyncio
36
+ async def test_fetch_documents(self, sample_merchant_data):
37
+ """Test retrieving documents from database"""
38
+ from app.repositories.db_repository import fetch_documents
39
+
40
+ with patch('app.nosql.db') as mock_db:
41
+ mock_collection = AsyncMock()
42
+ mock_db.__getitem__.return_value = mock_collection
43
+ mock_collection.count_documents.return_value = 1
44
+ mock_collection.find.return_value.skip.return_value.limit.return_value.to_list.return_value = [sample_merchant_data]
45
+
46
+ result = await fetch_documents("merchants", {}, {}, 0, 10)
47
+
48
+ assert result["total"] == 1
49
+ assert len(result["documents"]) == 1
50
+ assert result["documents"][0]["name"] == "Test Hair Salon"
51
+
52
+ @pytest.mark.asyncio
53
+ async def test_execute_query(self, sample_merchant_data):
54
+ """Test executing aggregation query"""
55
+ from app.repositories.db_repository import execute_query
56
+
57
+ with patch('app.nosql.db') as mock_db:
58
+ mock_collection = AsyncMock()
59
+ mock_db.__getitem__.return_value = mock_collection
60
+ mock_collection.aggregate.return_value.to_list.return_value = [sample_merchant_data]
61
+
62
+ pipeline = [{"$match": {"_id": "test_merchant_123"}}]
63
+ result = await execute_query("merchants", pipeline, use_optimization=False)
64
+
65
+ assert len(result) == 1
66
+ assert result[0]["name"] == "Test Hair Salon"
67
+
68
+ @pytest.mark.asyncio
69
+ async def test_count_documents(self):
70
+ """Test counting documents in collection"""
71
+ from app.repositories.db_repository import count_documents
72
+
73
+ with patch('app.nosql.db') as mock_db:
74
+ mock_collection = AsyncMock()
75
+ mock_db.__getitem__.return_value = mock_collection
76
+ mock_collection.count_documents.return_value = 5
77
+
78
+ result = await count_documents("merchants", {"category": "salon"})
79
+
80
+ assert result == 5
81
+ mock_collection.count_documents.assert_called_once_with({"category": "salon"})
82
+
83
+ @pytest.mark.asyncio
84
+ async def test_serialize_mongo_document(self):
85
+ """Test MongoDB document serialization"""
86
+ from app.repositories.db_repository import serialize_mongo_document
87
+ from bson import ObjectId
88
+ from datetime import datetime
89
+
90
+ test_doc = {
91
+ "_id": ObjectId("507f1f77bcf86cd799439011"),
92
+ "name": "Test Merchant",
93
+ "created_at": datetime(2024, 1, 1, 12, 0, 0),
94
+ "nested": {
95
+ "id": ObjectId("507f1f77bcf86cd799439012"),
96
+ "date": datetime(2024, 1, 2, 12, 0, 0)
97
+ }
98
+ }
99
+
100
+ result = serialize_mongo_document(test_doc)
101
+
102
+ assert isinstance(result["_id"], str)
103
+ assert isinstance(result["created_at"], str)
104
+ assert isinstance(result["nested"]["id"], str)
105
+ assert isinstance(result["nested"]["date"], str)
106
+
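The serialization test asserts that `ObjectId` and `datetime` values become strings at any nesting depth. A minimal recursive sketch of such a helper (assumed behavior inferred from the assertions; stringifying any non-JSON-native value covers `bson.ObjectId` without importing `bson` here):

```python
from datetime import date, datetime

def serialize_mongo_document(doc):
    """Recursively convert non-JSON-safe values to strings."""
    if isinstance(doc, dict):
        return {key: serialize_mongo_document(value) for key, value in doc.items()}
    if isinstance(doc, list):
        return [serialize_mongo_document(item) for item in doc]
    if isinstance(doc, (datetime, date)):
        return doc.isoformat()
    if isinstance(doc, (str, int, float, bool)) or doc is None:
        return doc
    return str(doc)  # fallback: e.g. bson.ObjectId -> its hex string
```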
107
+ class TestRedisOperations:
108
+ """Test Redis cache operations"""
109
+
110
+ @pytest.mark.asyncio
111
+ async def test_redis_connection_health(self):
112
+ """Test Redis connection health check"""
113
+ from app.nosql import check_redis_health
114
+
115
+ with patch('app.nosql.get_redis_client') as mock_client:
116
+ mock_redis = AsyncMock()
117
+ mock_client.return_value = mock_redis
118
+ mock_redis.ping.return_value = True
119
+
120
+ result = await check_redis_health()
121
+ assert result is True
122
+
123
+ @pytest.mark.asyncio
124
+ async def test_redis_connection_failure(self):
125
+ """Test Redis connection failure handling"""
126
+ from app.nosql import check_redis_health
127
+
128
+ with patch('app.nosql.get_redis_client') as mock_client:
129
+ mock_client.side_effect = Exception("Redis connection failed")
130
+
131
+ result = await check_redis_health()
132
+ assert result is False
133
+
134
+ @pytest.mark.asyncio
135
+ async def test_cache_merchant_data(self, sample_merchant_data):
136
+ """Test caching merchant data in Redis"""
137
+ from app.repositories.cache_repository import cache_merchant_data
138
+
139
+ with patch('app.nosql.get_redis_client') as mock_client:
140
+ mock_redis = AsyncMock()
141
+ mock_client.return_value = mock_redis
142
+ mock_redis.setex.return_value = True
143
+
144
+ result = await cache_merchant_data("test_merchant_123", sample_merchant_data)
145
+
146
+ assert result is True
147
+ mock_redis.setex.assert_called_once()
148
+
149
+ @pytest.mark.asyncio
150
+ async def test_cache_manager_get_or_set(self, sample_merchant_data):
151
+ """Test cache manager get_or_set functionality"""
152
+ from app.repositories.cache_repository import cache_manager
153
+ import json
154
+
155
+ with patch('app.nosql.redis_client') as mock_redis:
156
+ mock_redis.get.return_value = None # Cache miss
157
+ mock_redis.set.return_value = True
158
+
159
+ async def fetch_func():
160
+ return sample_merchant_data
161
+
162
+ result = await cache_manager.get_or_set_cache("test_key", fetch_func)
163
+
164
+ assert result["name"] == "Test Hair Salon"
165
+ mock_redis.set.assert_called_once()
166
+
167
+ @pytest.mark.asyncio
168
+ async def test_cache_manager_hit(self, sample_merchant_data):
169
+ """Test cache manager cache hit"""
170
+ from app.repositories.cache_repository import cache_manager
171
+ import json
172
+
173
+ with patch('app.nosql.redis_client') as mock_redis:
174
+ mock_redis.get.return_value = json.dumps(sample_merchant_data)
175
+
176
+ async def fetch_func():
177
+ return {"should": "not_be_called"}
178
+
179
+ result = await cache_manager.get_or_set_cache("test_key", fetch_func)
180
+
181
+ assert result["name"] == "Test Hair Salon"
182
+ mock_redis.get.assert_called_once()
183
+
184
+ @pytest.mark.asyncio
185
+ async def test_cache_manager_invalidate(self):
186
+ """Test cache invalidation"""
187
+ from app.repositories.cache_repository import cache_manager
188
+
189
+ with patch('app.nosql.redis_client') as mock_redis:
190
+ mock_redis.delete.return_value = 1
191
+
192
+ await cache_manager.invalidate_cache("test_key")
193
+
194
+ mock_redis.delete.assert_called_once()
195
+
196
+ @pytest.mark.asyncio
197
+ async def test_invalidate_cache(self):
198
+ """Test cache invalidation"""
199
+ from app.repositories.cache_repository import invalidate_cache
200
+
201
+ with patch('app.nosql.get_redis_client') as mock_client:
202
+ mock_redis = AsyncMock()
203
+ mock_client.return_value = mock_redis
204
+ mock_redis.delete.return_value = 1
205
+
206
+ result = await invalidate_cache("test_key")
207
+
208
+ assert result is True
209
+ mock_redis.delete.assert_called_once_with("test_key")
210
+
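The `cache_manager` tests above pin down a classic cache-aside contract: serve cached JSON on a hit, otherwise call the fetch function, store its result, and return it. A self-contained sketch under those assumptions (the `FakeRedis` stand-in and TTL default are hypothetical):

```python
import asyncio
import json

class FakeRedis:
    """In-memory stand-in for an async Redis client (hypothetical)."""
    def __init__(self):
        self.store = {}
    async def get(self, key):
        return self.store.get(key)
    async def set(self, key, value, ex=None):
        self.store[key] = value
    async def delete(self, key):
        self.store.pop(key, None)

class CacheManager:
    """Cache-aside helper matching the tests' contract."""
    def __init__(self, redis, ttl_seconds=300):
        self.redis = redis
        self.ttl = ttl_seconds

    async def get_or_set_cache(self, key, fetch_func):
        cached = await self.redis.get(key)
        if cached is not None:
            return json.loads(cached)              # cache hit: skip fetch_func
        value = await fetch_func()                 # cache miss: compute
        await self.redis.set(key, json.dumps(value), ex=self.ttl)
        return value

    async def invalidate_cache(self, key):
        await self.redis.delete(key)

async def demo():
    manager = CacheManager(FakeRedis())
    calls = 0
    async def fetch():
        nonlocal calls
        calls += 1
        return {"name": "Test Hair Salon"}
    first = await manager.get_or_set_cache("merchant:1", fetch)
    second = await manager.get_or_set_cache("merchant:1", fetch)  # served from cache
    return first, second, calls

first, second, calls = asyncio.run(demo())
```

On the second call the fetch function is never invoked, which is exactly what `test_cache_manager_hit` verifies with its `"should": "not_be_called"` sentinel.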
211
+ class TestDatabaseIndexes:
212
+ """Test database indexing functionality"""
213
+
214
+ @pytest.mark.asyncio
215
+ async def test_create_geospatial_index(self):
216
+ """Test creating geospatial index"""
217
+ from app.database.indexes import create_geospatial_index
218
+
219
+ with patch('app.nosql.get_mongodb_client') as mock_client:
220
+ mock_collection = MagicMock()
221
+ mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
222
+ mock_collection.create_index.return_value = "location_2dsphere"
223
+
224
+ result = await create_geospatial_index()
225
+
226
+ assert result == "location_2dsphere"
227
+ mock_collection.create_index.assert_called_once()
228
+
229
+ @pytest.mark.asyncio
230
+ async def test_create_text_search_index(self):
231
+ """Test creating text search index"""
232
+ from app.database.indexes import create_text_search_index
233
+
234
+ with patch('app.nosql.get_mongodb_client') as mock_client:
235
+ mock_collection = MagicMock()
236
+ mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
237
+ mock_collection.create_index.return_value = "text_search_index"
238
+
239
+ result = await create_text_search_index()
240
+
241
+ assert result == "text_search_index"
242
+ mock_collection.create_index.assert_called_once()
243
+
244
+ @pytest.mark.asyncio
245
+ async def test_create_category_index(self):
246
+ """Test creating category index"""
247
+ from app.database.indexes import create_category_index
248
+
249
+ with patch('app.nosql.get_mongodb_client') as mock_client:
250
+ mock_collection = MagicMock()
251
+ mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
252
+ mock_collection.create_index.return_value = "category_index"
253
+
254
+ result = await create_category_index()
255
+
256
+ assert result == "category_index"
257
+ mock_collection.create_index.assert_called_once()
258
+
259
+ class TestQueryOptimization:
260
+ """Test database query optimization"""
261
+
262
+ @pytest.mark.asyncio
263
+ async def test_optimized_merchant_search(self, sample_merchant_data):
264
+ """Test optimized merchant search query"""
265
+ from app.database.query_optimizer import optimize_search_query
266
+
267
+ search_params = {
268
+ "category": "salon",
269
+ "latitude": 40.7128,
270
+ "longitude": -74.0060,
271
+ "radius": 5000,
272
+ "min_rating": 4.0
273
+ }
274
+
275
+ optimized_query = optimize_search_query(search_params)
276
+
277
+ assert isinstance(optimized_query, dict)
278
+ assert "location" in optimized_query
279
+ assert "category" in optimized_query
280
+ assert "average_rating" in optimized_query
281
+
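The test above only checks which top-level keys the optimizer emits. A plausible sketch of translating those search parameters into a MongoDB filter (field names `location`, `category`, and `average_rating` follow the test's assertions; the `$nearSphere` shape is one common choice, not necessarily the app's):

```python
def optimize_search_query(params):
    """Build a MongoDB filter dict from flat search parameters."""
    query = {}
    if "latitude" in params and "longitude" in params:
        # GeoJSON uses [longitude, latitude] order
        query["location"] = {
            "$nearSphere": {
                "$geometry": {
                    "type": "Point",
                    "coordinates": [params["longitude"], params["latitude"]],
                },
                "$maxDistance": params.get("radius", 5000),
            }
        }
    if "category" in params:
        query["category"] = params["category"]
    if "min_rating" in params:
        query["average_rating"] = {"$gte": params["min_rating"]}
    return query
```

A `$nearSphere` filter like this requires the `2dsphere` index that `test_create_geospatial_index` covers.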
282
+ @pytest.mark.asyncio
283
+ async def test_query_performance_analysis(self):
284
+ """Test query performance analysis"""
285
+ from app.database.query_optimizer import analyze_query_performance
286
+
287
+ with patch('app.nosql.get_mongodb_client') as mock_client:
288
+ mock_collection = MagicMock()
289
+ mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
290
+
291
+ # Mock explain output
292
+ mock_collection.find.return_value.explain.return_value = {
293
+ "executionStats": {
294
+ "totalDocsExamined": 100,
295
+ "totalDocsReturned": 10,
296
+ "executionTimeMillis": 50
297
+ }
298
+ }
299
+
300
+ query = {"category": "salon"}
301
+ result = await analyze_query_performance(query)
302
+
303
+ assert "executionStats" in result
304
+ assert result["executionStats"]["totalDocsExamined"] == 100
305
+
306
+ class TestDatabaseTransactions:
307
+ """Test database transaction handling"""
308
+
309
+ @pytest.mark.asyncio
310
+ async def test_merchant_creation_transaction(self, sample_merchant_data):
311
+ """Test merchant creation with transaction"""
312
+ from app.repositories.db_repository import create_merchant_with_transaction
313
+
314
+ with patch('app.nosql.get_mongodb_client') as mock_client:
315
+ mock_session = AsyncMock()
316
+ mock_client.return_value.start_session.return_value.__aenter__.return_value = mock_session
317
+ mock_collection = MagicMock()
318
+ mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
319
+ mock_collection.insert_one.return_value.inserted_id = "new_merchant_id"
320
+
321
+ result = await create_merchant_with_transaction(sample_merchant_data)
322
+
323
+ assert result == "new_merchant_id"
324
+ mock_session.start_transaction.assert_called_once()
325
+
326
+ @pytest.mark.asyncio
327
+ async def test_transaction_rollback(self, sample_merchant_data):
328
+ """Test transaction rollback on error"""
329
+ from app.repositories.db_repository import create_merchant_with_transaction
330
+
331
+ with patch('app.nosql.get_mongodb_client') as mock_client:
332
+ mock_session = AsyncMock()
333
+ mock_client.return_value.start_session.return_value.__aenter__.return_value = mock_session
334
+ mock_collection = MagicMock()
335
+ mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
336
+ mock_collection.insert_one.side_effect = Exception("Insert failed")
337
+
338
+ with pytest.raises(Exception):
339
+ await create_merchant_with_transaction(sample_merchant_data)
340
+
341
+ mock_session.abort_transaction.assert_called_once()
342
+
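Together, the two transaction tests pin down a commit-on-success, abort-and-reraise-on-failure shape. A self-contained sketch of that flow (the `FakeSession` and the `insert_one` callable are hypothetical stand-ins for a Motor session and collection method):

```python
import asyncio

class FakeSession:
    """Hypothetical stand-in for a Motor/PyMongo client session."""
    def __init__(self):
        self.state = "idle"
    async def start_transaction(self):
        self.state = "started"
    async def commit_transaction(self):
        self.state = "committed"
    async def abort_transaction(self):
        self.state = "aborted"

async def create_merchant_with_transaction(insert_one, session, merchant):
    """Insert inside a transaction: commit on success, abort and re-raise on failure."""
    await session.start_transaction()
    try:
        inserted_id = await insert_one(merchant, session=session)
        await session.commit_transaction()
        return inserted_id
    except Exception:
        await session.abort_transaction()
        raise

async def demo():
    ok_session, bad_session = FakeSession(), FakeSession()
    async def ok_insert(doc, session=None):
        return "new_merchant_id"
    async def bad_insert(doc, session=None):
        raise RuntimeError("Insert failed")
    inserted = await create_merchant_with_transaction(ok_insert, ok_session, {"name": "x"})
    try:
        await create_merchant_with_transaction(bad_insert, bad_session, {"name": "y"})
        rolled_back = False
    except RuntimeError:
        rolled_back = bad_session.state == "aborted"
    return inserted, ok_session.state, rolled_back

inserted, final_state, rolled_back = asyncio.run(demo())
```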
343
+ class TestDatabasePerformance:
344
+ """Test database performance characteristics"""
345
+
346
+ @pytest.mark.asyncio
347
+ async def test_concurrent_database_operations(self, sample_merchant_data):
348
+ """Test concurrent database operations"""
349
+ from app.repositories.db_repository import get_merchant_by_id_from_db
350
+
351
+ with patch('app.nosql.get_mongodb_client') as mock_client:
352
+ mock_collection = MagicMock()
353
+ mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
354
+ mock_collection.find_one.return_value = sample_merchant_data
355
+
356
+ # Create multiple concurrent requests
357
+ tasks = [
358
+ get_merchant_by_id_from_db(f"merchant_{i}")
359
+ for i in range(20)
360
+ ]
361
+
362
+ results = await asyncio.gather(*tasks)
363
+
364
+ assert len(results) == 20
365
+ assert all(result["name"] == "Test Hair Salon" for result in results)
366
+
367
+ @pytest.mark.asyncio
368
+ async def test_large_result_set_handling(self):
369
+ """Test handling of large result sets"""
370
+ from app.repositories.db_repository import get_merchants_from_db
371
+
372
+ # Create mock data for large result set
373
+ large_dataset = [{"_id": f"merchant_{i}", "name": f"Merchant {i}"} for i in range(1000)]
374
+
375
+ with patch('app.nosql.get_mongodb_client') as mock_client:
376
+ mock_collection = MagicMock()
377
+ mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
378
+ mock_collection.find.return_value.limit.return_value.skip.return_value.to_list.return_value = large_dataset[:100]
379
+
380
+ result = await get_merchants_from_db(limit=100, skip=0)
381
+
382
+ assert len(result) == 100
383
+ # Verify pagination was applied
384
+ mock_collection.find.return_value.limit.assert_called_with(100)
385
+
386
+ @pytest.mark.asyncio
387
+ async def test_connection_pool_management(self):
388
+ """Test database connection pool management"""
389
+ from app.nosql import get_mongodb_client
390
+
391
+ # Test multiple client requests
392
+ clients = []
393
+ for _ in range(10):
394
+ client = get_mongodb_client()
395
+ clients.append(client)
396
+
397
+ # All clients should be the same instance (singleton pattern)
398
+ assert all(client is clients[0] for client in clients)
399
+
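The connection-pool test asserts that repeated calls return the identical object, i.e. a module-level singleton. A minimal sketch of that pattern (with `object()` standing in for constructing the real client, e.g. an `AsyncIOMotorClient`):

```python
_client = None  # module-level cache, created lazily on first use

def get_mongodb_client():
    """Return one shared client per process so callers reuse one connection pool."""
    global _client
    if _client is None:
        _client = object()  # stand-in for the real client constructor
    return _client

clients = [get_mongodb_client() for _ in range(10)]
```

Driver clients manage their own pooling internally, which is why creating one per request would defeat the pool.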
400
+ class TestDatabaseErrorHandling:
401
+ """Test database error handling scenarios"""
402
+
403
+ @pytest.mark.asyncio
404
+ async def test_connection_timeout_handling(self):
405
+ """Test handling of connection timeouts"""
406
+ from app.repositories.db_repository import get_merchants_from_db
407
+
408
+ with patch('app.nosql.get_mongodb_client') as mock_client:
409
+ mock_client.side_effect = Exception("Connection timeout")
410
+
411
+ with pytest.raises(Exception) as exc_info:
412
+ await get_merchants_from_db()
413
+
414
+ assert "Connection timeout" in str(exc_info.value)
415
+
416
+ @pytest.mark.asyncio
417
+ async def test_invalid_query_handling(self):
418
+ """Test handling of invalid queries"""
419
+ from app.repositories.db_repository import search_merchants_in_db
420
+
421
+ with patch('app.nosql.get_mongodb_client') as mock_client:
422
+ mock_collection = MagicMock()
423
+ mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
424
+ mock_collection.find.side_effect = Exception("Invalid query")
425
+
426
+ with pytest.raises(Exception) as exc_info:
427
+ await search_merchants_in_db(latitude=200, longitude=200) # Invalid coordinates
428
+
429
+ assert "Invalid query" in str(exc_info.value)
430
+
431
+ @pytest.mark.asyncio
432
+ async def test_duplicate_key_error_handling(self, sample_merchant_data):
433
+ """Test handling of duplicate key errors"""
434
+ from app.repositories.db_repository import create_merchant_in_db
435
+
436
+ with patch('app.nosql.get_mongodb_client') as mock_client:
437
+ mock_collection = MagicMock()
438
+ mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
439
+ mock_collection.insert_one.side_effect = Exception("Duplicate key error")
440
+
441
+ with pytest.raises(Exception) as exc_info:
442
+ await create_merchant_in_db(sample_merchant_data)
443
+
444
+ assert "Duplicate key error" in str(exc_info.value)
app/tests/test_integration.py ADDED
@@ -0,0 +1,495 @@
1
+ """
2
+ Integration tests for end-to-end functionality
3
+ """
4
+
5
+ import pytest
6
+ import asyncio
7
+ from unittest.mock import patch, AsyncMock, MagicMock
8
+ from typing import Dict, Any, List
9
+
10
+ class TestEndToEndUserJourney:
11
+ """Test complete user journey scenarios"""
12
+
13
+ @pytest.mark.asyncio
14
+ async def test_complete_search_journey(self, async_client, sample_merchant_data):
15
+ """Test complete user search journey from query to results"""
16
+ # Mock all dependencies
17
+ with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_nlp, \
18
+ patch('app.services.merchant.search_merchants') as mock_search:
19
+
20
+ # Setup NLP pipeline mock
21
+ mock_nlp.process_query.return_value = {
22
+ "query": "find the best hair salon near me with parking",
23
+ "primary_intent": {"intent": "SEARCH_SERVICE", "confidence": 0.9},
24
+ "entities": {
25
+ "service_types": ["haircut", "hair styling"],
26
+ "amenities": ["parking"],
27
+ "location_modifiers": ["near me"],
28
+ "quality_indicators": ["best"]
29
+ },
30
+ "similar_services": [("salon", 0.95), ("beauty", 0.8)],
31
+ "search_parameters": {
32
+ "merchant_category": "salon",
33
+ "amenities": ["parking"],
34
+ "radius": 5000,
35
+ "min_rating": 4.0
36
+ },
37
+ "processing_time": 0.234
38
+ }
39
+
40
+ # Setup merchant search mock
41
+ mock_search.return_value = [sample_merchant_data]
42
+
43
+ # Step 1: User submits natural language query
44
+ nlp_response = await async_client.post("/api/v1/nlp/analyze-query", params={
45
+ "query": "find the best hair salon near me with parking",
46
+ "latitude": 40.7128,
47
+ "longitude": -74.0060
48
+ })
49
+
50
+ assert nlp_response.status_code == 200
51
+ nlp_data = nlp_response.json()
52
+ assert nlp_data["status"] == "success"
53
+ assert "analysis" in nlp_data
54
+
55
+ # Step 2: Use NLP results to search merchants
56
+ search_params = nlp_data["analysis"]["search_parameters"]
57
+ search_response = await async_client.get("/api/v1/merchants/search", params={
58
+ "latitude": 40.7128,
59
+ "longitude": -74.0060,
60
+ **search_params
61
+ })
62
+
63
+ assert search_response.status_code == 200
64
+ search_data = search_response.json()
65
+ assert len(search_data) == 1
66
+ assert search_data[0]["name"] == "Test Hair Salon"
67
+ assert "parking" in search_data[0]["amenities"]
68
+
69
+ @pytest.mark.asyncio
70
+ async def test_merchant_discovery_flow(self, async_client, sample_merchant_data):
71
+ """Test merchant discovery and detail retrieval flow"""
72
+ with patch('app.services.merchant.get_merchants') as mock_get_all, \
73
+ patch('app.services.merchant.get_merchant_by_id') as mock_get_by_id:
74
+
75
+ mock_get_all.return_value = [sample_merchant_data]
76
+ mock_get_by_id.return_value = sample_merchant_data
77
+
78
+ # Step 1: Browse all merchants
79
+ browse_response = await async_client.get("/api/v1/merchants/")
80
+ assert browse_response.status_code == 200
81
+ merchants = browse_response.json()
82
+ assert len(merchants) == 1
83
+
84
+ # Step 2: Get detailed information for specific merchant
85
+ merchant_id = merchants[0]["_id"]
86
+ detail_response = await async_client.get(f"/api/v1/merchants/{merchant_id}")
87
+ assert detail_response.status_code == 200
88
+
89
+ merchant_detail = detail_response.json()
90
+ assert merchant_detail["_id"] == merchant_id
91
+ assert merchant_detail["name"] == "Test Hair Salon"
92
+ assert "services" in merchant_detail
93
+ assert "business_hours" in merchant_detail
94
+
95
+ @pytest.mark.asyncio
96
+ async def test_error_recovery_flow(self, async_client):
97
+ """Test error recovery and graceful degradation"""
98
+ # Test NLP service failure with fallback
99
+ with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_nlp, \
100
+ patch('app.services.helper.process_free_text') as mock_fallback:
101
+
102
+ # NLP service fails
103
+ mock_nlp.process_query.side_effect = Exception("NLP service unavailable")
104
+
105
+ # Fallback service works
106
+ mock_fallback.return_value = {
107
+ "query": "hair salon",
108
+ "extracted_keywords": ["hair", "salon"],
109
+ "suggested_category": "salon"
110
+ }
111
+
112
+ # Should gracefully fall back to basic processing
113
+ response = await async_client.post("/api/v1/helpers/process-text", json={
114
+ "text": "hair salon",
115
+ "latitude": 40.7128,
116
+ "longitude": -74.0060
117
+ })
118
+
119
+ assert response.status_code == 200
120
+ data = response.json()
121
+ assert data["suggested_category"] == "salon"
122
+
123
+ class TestServiceIntegration:
124
+ """Test integration between different services"""
125
+
126
+ @pytest.mark.asyncio
127
+ async def test_nlp_to_search_integration(self, sample_merchant_data):
128
+ """Test integration between NLP processing and merchant search"""
129
+ from app.services.advanced_nlp import AdvancedNLPPipeline
130
+ from app.services.merchant import search_merchants
131
+
132
+ with patch('app.repositories.db_repository.search_merchants_in_db') as mock_db_search:
133
+ mock_db_search.return_value = [sample_merchant_data]
134
+
135
+ # Process query with NLP
136
+ pipeline = AdvancedNLPPipeline()
137
+ nlp_result = await pipeline.process_query("find a hair salon with parking")
138
+
139
+ # Use NLP results for merchant search
140
+ search_params = nlp_result["search_parameters"]
141
+ merchants = await search_merchants(**search_params)
142
+
143
+ assert len(merchants) == 1
144
+ assert merchants[0]["category"] == "salon"
145
+
146
+ @pytest.mark.asyncio
147
+ async def test_cache_integration(self, sample_merchant_data):
148
+ """Test cache integration across services"""
149
+ from app.services.merchant import get_merchant_by_id
150
+ from app.repositories.cache_repository import cache_merchant_data, get_cached_merchant_data
151
+
152
+ with patch('app.nosql.get_redis_client') as mock_redis_client, \
153
+ patch('app.repositories.db_repository.get_merchant_by_id_from_db') as mock_db:
154
+
155
+ mock_redis = AsyncMock()
156
+ mock_redis_client.return_value = mock_redis
157
+ mock_db.return_value = sample_merchant_data
158
+
159
+ # First call - cache miss, should hit database
160
+ mock_redis.get.return_value = None
161
+ result1 = await get_merchant_by_id("test_merchant_123")
162
+
163
+ # Should cache the result
164
+ mock_redis.setex.assert_called_once()
165
+
166
+ # Second call - cache hit, should not hit database
167
+ mock_redis.get.return_value = '{"_id": "test_merchant_123", "name": "Test Hair Salon"}'
168
+ result2 = await get_merchant_by_id("test_merchant_123")
169
+
170
+ assert result1 is not None
171
+ assert result2 is not None
172
+
173
+ @pytest.mark.asyncio
174
+ async def test_database_to_api_integration(self, sample_merchant_data):
175
+ """Test integration from database layer to API layer"""
176
+ from app.repositories.db_repository import get_merchants_from_db
177
+ from app.services.merchant import get_merchants
178
+
179
+ with patch('app.nosql.get_mongodb_client') as mock_client:
180
+ mock_collection = AsyncMock()
181
+ mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
182
+ mock_collection.find.return_value.limit.return_value.skip.return_value.to_list.return_value = [sample_merchant_data]
183
+
184
+ # Database layer
185
+ db_result = await get_merchants_from_db(limit=10, skip=0)
186
+
187
+ # Service layer
188
+ service_result = await get_merchants(limit=10, skip=0)
189
+
190
+ assert len(db_result) == 1
191
+ assert len(service_result) == 1
192
+ assert db_result[0]["name"] == service_result[0]["name"]
193
+
194
+ class TestDataFlowIntegration:
195
+ """Test data flow through the entire system"""
196
+
197
+ @pytest.mark.asyncio
198
+ async def test_search_data_flow(self, async_client, sample_merchant_data):
199
+ """Test complete search data flow"""
200
+ with patch('app.nosql.get_mongodb_client') as mock_mongo, \
201
+ patch('app.nosql.get_redis_client') as mock_redis_client:
202
+
203
+ # Setup MongoDB mock
204
+ mock_collection = AsyncMock()
205
+ mock_mongo.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
206
+ mock_collection.find.return_value.limit.return_value.to_list.return_value = [sample_merchant_data]
207
+
208
+ # Setup Redis mock
209
+ mock_redis = AsyncMock()
210
+ mock_redis_client.return_value = mock_redis
211
+ mock_redis.get.return_value = None # Cache miss
212
+
213
+ # Make search request
214
+ response = await async_client.get("/api/v1/merchants/search", params={
215
+ "latitude": 40.7128,
216
+ "longitude": -74.0060,
217
+ "radius": 5000,
218
+ "category": "salon"
219
+ })
220
+
221
+ assert response.status_code == 200
222
+ data = response.json()
223
+ assert len(data) == 1
224
+ assert data[0]["name"] == "Test Hair Salon"
225
+
226
+ # Verify data flowed through all layers
227
+ mock_collection.find.assert_called_once() # Database was queried
228
+ mock_redis.setex.assert_called_once() # Result was cached
229
+
230
+ @pytest.mark.asyncio
231
+ async def test_nlp_processing_data_flow(self, async_client):
232
+ """Test NLP processing data flow"""
233
+ with patch('app.services.advanced_nlp.IntentClassifier') as mock_intent, \
234
+ patch('app.services.advanced_nlp.BusinessEntityExtractor') as mock_entity, \
235
+ patch('app.services.advanced_nlp.SemanticMatcher') as mock_semantic:
236
+
237
+ # Setup mocks
238
+ mock_intent_instance = MagicMock()
239
+ mock_intent.return_value = mock_intent_instance
240
+ mock_intent_instance.get_primary_intent.return_value = ("SEARCH_SERVICE", 0.9)
241
+
242
+ mock_entity_instance = MagicMock()
243
+ mock_entity.return_value = mock_entity_instance
244
+ mock_entity_instance.extract_entities.return_value = {"service_types": ["haircut"]}
245
+
246
+ mock_semantic_instance = MagicMock()
247
+ mock_semantic.return_value = mock_semantic_instance
248
+ mock_semantic_instance.find_similar_services.return_value = [("salon", 0.9)]
249
+
250
+ response = await async_client.post("/api/v1/nlp/analyze-query", params={
251
+ "query": "find a hair salon"
252
+ })
253
+
254
+ assert response.status_code == 200
255
+ data = response.json()
256
+ assert data["status"] == "success"
257
+
258
+ # Verify data flowed through NLP pipeline
259
+ mock_intent_instance.get_primary_intent.assert_called_once()
260
+ mock_entity_instance.extract_entities.assert_called_once()
261
+ mock_semantic_instance.find_similar_services.assert_called_once()
262
+
263
+ class TestConcurrencyIntegration:
264
+ """Test concurrent operations and race conditions"""
265
+
266
+ @pytest.mark.asyncio
267
+ async def test_concurrent_cache_operations(self, sample_merchant_data):
268
+ """Test concurrent cache read/write operations"""
269
+ from app.repositories.cache_repository import cache_merchant_data, get_cached_merchant_data
270
+
271
+ with patch('app.nosql.get_redis_client') as mock_redis_client:
272
+ mock_redis = AsyncMock()
273
+ mock_redis_client.return_value = mock_redis
274
+ mock_redis.setex.return_value = True
275
+ mock_redis.get.return_value = '{"_id": "test_merchant_123", "name": "Test Hair Salon"}'
276
+
277
+ # Create concurrent cache operations
278
+ cache_tasks = [
279
+ cache_merchant_data(f"merchant_{i}", sample_merchant_data)
280
+ for i in range(10)
281
+ ]
282
+
283
+ read_tasks = [
284
+ get_cached_merchant_data(f"merchant_{i}")
285
+ for i in range(10)
286
+ ]
287
+
288
+ # Execute concurrently
289
+ cache_results = await asyncio.gather(*cache_tasks)
290
+ read_results = await asyncio.gather(*read_tasks)
291
+
292
+ assert all(result is True for result in cache_results)
293
+ assert all(result is not None for result in read_results)
294
+
295
+ @pytest.mark.asyncio
296
+ async def test_concurrent_nlp_processing(self):
297
+ """Test concurrent NLP processing"""
298
+ from app.services.advanced_nlp import AdvancedNLPPipeline
299
+
300
+ pipeline = AdvancedNLPPipeline()
301
+ queries = [
302
+ "find a hair salon",
303
+ "best spa near me",
304
+ "gym with parking",
305
+ "dental clinic",
306
+ "massage therapy"
307
+ ]
308
+
309
+ # Process queries concurrently
310
+ tasks = [pipeline.process_query(query) for query in queries]
311
+ results = await asyncio.gather(*tasks)
312
+
313
+ assert len(results) == 5
314
+ assert all("query" in result for result in results)
315
+ assert all("primary_intent" in result for result in results)
316
+
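The concurrency tests above lean on `asyncio.gather`, which schedules all coroutines at once and returns their results in input order. A toy sketch of the pattern (the `process_query` body is a hypothetical stand-in for the NLP pipeline):

```python
import asyncio

async def process_query(query: str) -> dict:
    """Toy stand-in for the pipeline: yields control, then returns an analysis."""
    await asyncio.sleep(0)
    return {"query": query, "primary_intent": {"intent": "SEARCH_SERVICE"}}

async def run_all(queries):
    # gather runs all coroutines concurrently and preserves input order
    return await asyncio.gather(*(process_query(q) for q in queries))

results = asyncio.run(run_all([
    "find a hair salon",
    "best spa near me",
    "gym with parking",
]))
```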
317
+ @pytest.mark.asyncio
318
+ async def test_concurrent_database_operations(self, sample_merchant_data):
319
+ """Test concurrent database operations"""
320
+ from app.repositories.db_repository import get_merchant_by_id_from_db, search_merchants_in_db
321
+
322
+ with patch('app.nosql.get_mongodb_client') as mock_client:
323
+ mock_collection = AsyncMock()
324
+ mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
325
+ mock_collection.find_one.return_value = sample_merchant_data
326
+ mock_collection.find.return_value.limit.return_value.to_list.return_value = [sample_merchant_data]
327
+
328
+ # Mix of different database operations
329
+ tasks = []
330
+
331
+ # Add get by ID tasks
332
+ for i in range(5):
333
+ tasks.append(get_merchant_by_id_from_db(f"merchant_{i}"))
334
+
335
+ # Add search tasks
336
+ for i in range(5):
337
+ tasks.append(search_merchants_in_db(
338
+ latitude=40.7128 + (i * 0.01),
339
+ longitude=-74.0060 + (i * 0.01),
340
+ radius=5000
341
+ ))
342
+
343
+ results = await asyncio.gather(*tasks)
344
+
345
+ assert len(results) == 10
346
+ # First 5 should be single merchants
347
+ assert all(isinstance(result, dict) for result in results[:5])
348
+ # Last 5 should be lists of merchants
349
+ assert all(isinstance(result, list) for result in results[5:])
350
+
351
+ class TestErrorHandlingIntegration:
352
+ """Test error handling across integrated components"""
353
+
354
+ @pytest.mark.asyncio
355
+ async def test_database_failure_propagation(self, async_client):
356
+ """Test how database failures propagate through the system"""
357
+ with patch('app.nosql.get_mongodb_client') as mock_client:
358
+ mock_client.side_effect = Exception("Database connection failed")
359
+
360
+ response = await async_client.get("/api/v1/merchants/")
361
+ assert response.status_code == 500
362
+
363
+ @pytest.mark.asyncio
364
+ async def test_cache_failure_graceful_degradation(self, async_client, sample_merchant_data):
365
+ """Test graceful degradation when cache fails"""
366
+ with patch('app.nosql.get_redis_client') as mock_redis_client, \
367
+ patch('app.repositories.db_repository.get_merchants_from_db') as mock_db:
368
+
369
+ # Cache fails
370
+ mock_redis_client.side_effect = Exception("Redis connection failed")
371
+ # Database works
372
+ mock_db.return_value = [sample_merchant_data]
373
+
374
+             response = await async_client.get("/api/v1/merchants/")
+
+             # Should still work, just without caching
+             assert response.status_code == 200
+             data = response.json()
+             assert len(data) == 1
+
+     @pytest.mark.asyncio
+     async def test_nlp_service_fallback(self, async_client):
+         """Test fallback when NLP service fails"""
+         with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_nlp, \
+              patch('app.services.helper.process_free_text') as mock_fallback:
+
+             # NLP service fails
+             mock_nlp.process_query.side_effect = Exception("NLP service error")
+
+             # Fallback service works
+             mock_fallback.return_value = {
+                 "query": "hair salon",
+                 "extracted_keywords": ["hair", "salon"],
+                 "suggested_category": "salon"
+             }
+
+             # Try NLP endpoint first (should fail)
+             nlp_response = await async_client.post("/api/v1/nlp/analyze-query", params={
+                 "query": "hair salon"
+             })
+             assert nlp_response.status_code == 500
+
+             # Use fallback endpoint (should work)
+             fallback_response = await async_client.post("/api/v1/helpers/process-text", json={
+                 "text": "hair salon",
+                 "latitude": 40.7128,
+                 "longitude": -74.0060
+             })
+             assert fallback_response.status_code == 200
+
+ class TestSecurityIntegration:
+     """Test security measures across integrated components"""
+
+     @pytest.mark.asyncio
+     async def test_input_sanitization_flow(self, async_client):
+         """Test input sanitization across the request flow"""
+         # Test with potentially malicious input
+         malicious_query = "<script>alert('xss')</script>find salon"
+
+         with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_nlp:
+             mock_nlp.process_query.return_value = {
+                 "query": "find salon",  # Should be sanitized
+                 "primary_intent": {"intent": "SEARCH_SERVICE", "confidence": 0.8},
+                 "entities": {},
+                 "similar_services": [],
+                 "search_parameters": {},
+                 "processing_time": 0.1
+             }
+
+             response = await async_client.post("/api/v1/nlp/analyze-query", params={
+                 "query": malicious_query
+             })
+
+             assert response.status_code == 200
+             data = response.json()
+             # Script tags should be removed/sanitized
+             assert "<script>" not in data["analysis"]["query"]
+
+     @pytest.mark.asyncio
+     async def test_rate_limiting_integration(self, async_client):
+         """Test rate limiting across endpoints"""
+         # This would require actual rate limiting implementation
+         # For now, test that multiple requests don't crash the system
+
+         tasks = [
+             async_client.get("/health")
+             for _ in range(50)
+         ]
+
+         responses = await asyncio.gather(*tasks, return_exceptions=True)
+
+         # All requests should either succeed or be rate limited (not crash)
+         for response in responses:
+             if hasattr(response, 'status_code'):
+                 assert response.status_code in [200, 429]  # OK or Too Many Requests
+
+ class TestPerformanceIntegration:
+     """Test performance characteristics of integrated system"""
+
+     @pytest.mark.asyncio
+     async def test_end_to_end_performance(self, async_client, sample_merchant_data):
+         """Test end-to-end performance of complete user journey"""
+         import time
+
+         with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_nlp, \
+              patch('app.services.merchant.search_merchants') as mock_search:
+
+             mock_nlp.process_query.return_value = {
+                 "query": "find salon",
+                 "primary_intent": {"intent": "SEARCH_SERVICE", "confidence": 0.8},
+                 "entities": {"service_types": ["haircut"]},
+                 "similar_services": [("salon", 0.9)],
+                 "search_parameters": {"merchant_category": "salon"},
+                 "processing_time": 0.1
+             }
+
+             mock_search.return_value = [sample_merchant_data]
+
+             start_time = time.time()
+
+             # Step 1: NLP processing
+             nlp_response = await async_client.post("/api/v1/nlp/analyze-query", params={
+                 "query": "find a hair salon"
+             })
+
+             # Step 2: Merchant search
+             search_response = await async_client.get("/api/v1/merchants/search", params={
+                 "category": "salon"
+             })
+
+             total_time = time.time() - start_time
+
+             assert nlp_response.status_code == 200
+             assert search_response.status_code == 200
+             assert total_time < 3.0  # Complete journey within 3 seconds
app/tests/test_performance.py ADDED
@@ -0,0 +1,493 @@
+ """
+ Performance regression tests for the application
+ """
+
+ import pytest
+ import time
+ import asyncio
+ from typing import List, Dict, Any
+ from unittest.mock import patch, AsyncMock
+ import statistics
+
+ class TestAPIPerformance:
+     """Test API endpoint performance"""
+
+     @pytest.mark.asyncio
+     async def test_health_endpoint_response_time(self, async_client):
+         """Test health endpoint response time"""
+         start_time = time.time()
+         response = await async_client.get("/health")
+         response_time = time.time() - start_time
+
+         assert response.status_code == 200
+         assert response_time < 0.1  # Should respond within 100ms
+
+     @pytest.mark.asyncio
+     async def test_merchant_search_performance(self, async_client, sample_merchant_data):
+         """Test merchant search endpoint performance"""
+         with patch('app.services.merchant.search_merchants') as mock_search:
+             mock_search.return_value = [sample_merchant_data] * 10
+
+             start_time = time.time()
+             response = await async_client.get("/api/v1/merchants/search", params={
+                 "latitude": 40.7128,
+                 "longitude": -74.0060,
+                 "radius": 5000,
+                 "category": "salon"
+             })
+             response_time = time.time() - start_time
+
+             assert response.status_code == 200
+             assert response_time < 1.0  # Should respond within 1 second
+             assert len(response.json()) == 10
+
+     @pytest.mark.asyncio
+     async def test_nlp_processing_performance(self, async_client, mock_nlp_pipeline):
+         """Test NLP processing endpoint performance"""
+         with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_pipeline:
+             mock_pipeline.process_query = mock_nlp_pipeline.process_query
+
+             start_time = time.time()
+             response = await async_client.post("/api/v1/nlp/analyze-query", params={
+                 "query": "find the best hair salon near me with parking"
+             })
+             response_time = time.time() - start_time
+
+             assert response.status_code == 200
+             assert response_time < 2.0  # Should respond within 2 seconds
+
+     @pytest.mark.asyncio
+     async def test_concurrent_api_requests(self, async_client, sample_merchant_data):
+         """Test concurrent API request handling"""
+         with patch('app.services.merchant.get_merchants') as mock_get:
+             mock_get.return_value = [sample_merchant_data] * 5
+
+             # Create 20 concurrent requests
+             tasks = [
+                 async_client.get("/api/v1/merchants/")
+                 for _ in range(20)
+             ]
+
+             start_time = time.time()
+             responses = await asyncio.gather(*tasks)
+             total_time = time.time() - start_time
+
+             # All requests should succeed
+             assert all(r.status_code == 200 for r in responses)
+             # Should handle concurrent requests efficiently
+             assert total_time < 5.0  # Within 5 seconds for 20 requests
+
+     @pytest.mark.asyncio
+     async def test_large_result_set_performance(self, async_client):
+         """Test performance with large result sets"""
+         # Mock large dataset
+         large_dataset = [
+             {
+                 "_id": f"merchant_{i}",
+                 "name": f"Merchant {i}",
+                 "category": "salon",
+                 "location": {"type": "Point", "coordinates": [-74.0060, 40.7128]}
+             }
+             for i in range(100)
+         ]
+
+         with patch('app.services.merchant.get_merchants') as mock_get:
+             mock_get.return_value = large_dataset
+
+             start_time = time.time()
+             response = await async_client.get("/api/v1/merchants/", params={"limit": 100})
+             response_time = time.time() - start_time
+
+             assert response.status_code == 200
+             assert len(response.json()) == 100
+             assert response_time < 2.0  # Should handle large datasets efficiently
+
+ class TestDatabasePerformance:
+     """Test database operation performance"""
+
+     @pytest.mark.asyncio
+     async def test_single_merchant_query_performance(self, sample_merchant_data):
+         """Test single merchant query performance"""
+         from app.repositories.db_repository import get_merchant_by_id_from_db
+
+         with patch('app.nosql.get_mongodb_client') as mock_client:
+             mock_collection = AsyncMock()
+             mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
+             mock_collection.find_one.return_value = sample_merchant_data
+
+             start_time = time.time()
+             result = await get_merchant_by_id_from_db("test_merchant_123")
+             query_time = time.time() - start_time
+
+             assert result is not None
+             assert query_time < 0.1  # Should complete within 100ms
+
+     @pytest.mark.asyncio
+     async def test_geospatial_search_performance(self, sample_merchant_data):
+         """Test geospatial search performance"""
+         from app.repositories.db_repository import search_merchants_in_db
+
+         with patch('app.nosql.get_mongodb_client') as mock_client:
+             mock_collection = AsyncMock()
+             mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
+             mock_collection.find.return_value.limit.return_value.to_list.return_value = [sample_merchant_data] * 20
+
+             start_time = time.time()
+             result = await search_merchants_in_db(
+                 latitude=40.7128,
+                 longitude=-74.0060,
+                 radius=5000,
+                 category="salon"
+             )
+             query_time = time.time() - start_time
+
+             assert len(result) == 20
+             assert query_time < 0.5  # Should complete within 500ms
+
+     @pytest.mark.asyncio
+     async def test_concurrent_database_queries(self, sample_merchant_data):
+         """Test concurrent database query performance"""
+         from app.repositories.db_repository import get_merchant_by_id_from_db
+
+         with patch('app.nosql.get_mongodb_client') as mock_client:
+             mock_collection = AsyncMock()
+             mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
+             mock_collection.find_one.return_value = sample_merchant_data
+
+             # Create 50 concurrent queries
+             tasks = [
+                 get_merchant_by_id_from_db(f"merchant_{i}")
+                 for i in range(50)
+             ]
+
+             start_time = time.time()
+             results = await asyncio.gather(*tasks)
+             total_time = time.time() - start_time
+
+             assert len(results) == 50
+             assert all(r is not None for r in results)
+             assert total_time < 2.0  # Should handle 50 concurrent queries within 2 seconds
+
+     @pytest.mark.asyncio
+     async def test_cache_performance(self, sample_merchant_data):
+         """Test cache operation performance"""
+         from app.repositories.cache_repository import cache_merchant_data, get_cached_merchant_data
+
+         with patch('app.nosql.get_redis_client') as mock_client:
+             mock_redis = AsyncMock()
+             mock_client.return_value = mock_redis
+             mock_redis.setex.return_value = True
+             mock_redis.get.return_value = '{"_id": "test_merchant_123", "name": "Test Hair Salon"}'
+
+             # Test cache write performance
+             start_time = time.time()
+             await cache_merchant_data("test_merchant_123", sample_merchant_data)
+             cache_write_time = time.time() - start_time
+
+             # Test cache read performance
+             start_time = time.time()
+             result = await get_cached_merchant_data("test_merchant_123")
+             cache_read_time = time.time() - start_time
+
+             assert cache_write_time < 0.05  # Cache write within 50ms
+             assert cache_read_time < 0.05  # Cache read within 50ms
+             assert result is not None
+
+ class TestNLPPerformance:
+     """Test NLP processing performance"""
+
+     @pytest.mark.asyncio
+     async def test_intent_classification_performance(self):
+         """Test intent classification performance"""
+         from app.services.advanced_nlp import IntentClassifier
+
+         classifier = IntentClassifier()
+         test_queries = [
+             "find a hair salon",
+             "best spa near me",
+             "gym with parking",
+             "dental clinic open now",
+             "massage therapy luxury"
+         ]
+
+         start_time = time.time()
+         results = [classifier.get_primary_intent(query) for query in test_queries]
+         total_time = time.time() - start_time
+
+         assert len(results) == 5
+         assert all(len(result) == 2 for result in results)  # (intent, confidence)
+         assert total_time < 1.0  # Should classify 5 queries within 1 second
+
+     @pytest.mark.asyncio
+     async def test_entity_extraction_performance(self):
+         """Test entity extraction performance"""
+         from app.services.advanced_nlp import BusinessEntityExtractor
+
+         extractor = BusinessEntityExtractor()
+         test_queries = [
+             "hair salon with parking near me",
+             "luxury spa treatment with wifi",
+             "budget-friendly gym open 24/7",
+             "pet-friendly grooming service",
+             "organic restaurant with outdoor seating"
+         ]
+
+         start_time = time.time()
+         results = [extractor.extract_entities(query) for query in test_queries]
+         total_time = time.time() - start_time
+
+         assert len(results) == 5
+         assert all(isinstance(result, dict) for result in results)
+         assert total_time < 2.0  # Should extract entities from 5 queries within 2 seconds
+
+     @pytest.mark.asyncio
+     async def test_semantic_matching_performance(self):
+         """Test semantic matching performance"""
+         from app.services.advanced_nlp import SemanticMatcher
+
+         matcher = SemanticMatcher()
+         test_queries = [
+             "hair salon",
+             "spa treatment",
+             "fitness center",
+             "dental care",
+             "massage therapy"
+         ]
+
+         start_time = time.time()
+         results = [matcher.find_similar_services(query) for query in test_queries]
+         total_time = time.time() - start_time
+
+         assert len(results) == 5
+         assert all(isinstance(result, list) for result in results)
+         assert total_time < 1.5  # Should find matches for 5 queries within 1.5 seconds
+
+     @pytest.mark.asyncio
+     async def test_complete_nlp_pipeline_performance(self):
+         """Test complete NLP pipeline performance"""
+         from app.services.advanced_nlp import AdvancedNLPPipeline
+
+         pipeline = AdvancedNLPPipeline()
+         test_queries = [
+             "find the best hair salon near me with parking",
+             "luxury spa treatment open now",
+             "budget-friendly gym with pool",
+             "pet-friendly grooming service",
+             "organic restaurant with delivery"
+         ]
+
+         processing_times = []
+
+         for query in test_queries:
+             start_time = time.time()
+             result = await pipeline.process_query(query)
+             processing_time = time.time() - start_time
+             processing_times.append(processing_time)
+
+             assert "processing_time" in result
+             assert processing_time < 3.0  # Each query within 3 seconds
+
+         # Calculate statistics
+         avg_time = statistics.mean(processing_times)
+         max_time = max(processing_times)
+
+         assert avg_time < 2.0  # Average processing time under 2 seconds
+         assert max_time < 3.0  # Maximum processing time under 3 seconds
+
+ class TestMemoryPerformance:
+     """Test memory usage and performance"""
+
+     @pytest.mark.asyncio
+     async def test_memory_usage_during_processing(self):
+         """Test memory usage during heavy processing"""
+         import psutil
+         import os
+
+         process = psutil.Process(os.getpid())
+         initial_memory = process.memory_info().rss / 1024 / 1024  # MB
+
+         # Simulate heavy processing
+         from app.services.advanced_nlp import AdvancedNLPPipeline
+         pipeline = AdvancedNLPPipeline()
+
+         # Process multiple queries
+         queries = [f"find service {i}" for i in range(100)]
+         tasks = [pipeline.process_query(query) for query in queries]
+         await asyncio.gather(*tasks)
+
+         final_memory = process.memory_info().rss / 1024 / 1024  # MB
+         memory_increase = final_memory - initial_memory
+
+         # Memory increase should be reasonable (less than 100MB for 100 queries)
+         assert memory_increase < 100
+
+     @pytest.mark.asyncio
+     async def test_cache_memory_efficiency(self):
+         """Test cache memory efficiency"""
+         from app.services.advanced_nlp import AsyncNLPProcessor
+
+         processor = AsyncNLPProcessor(max_cache_size=100)
+
+         def dummy_processor(text):
+             return {"processed": text}
+
+         # Fill cache beyond limit
+         for i in range(150):
+             await processor.process_async(f"query_{i}", dummy_processor)
+
+         # Cache should not exceed max size
+         assert len(processor.cache) <= 100
+
+     @pytest.mark.asyncio
+     async def test_garbage_collection_efficiency(self):
+         """Test garbage collection during processing"""
+         import gc
+
+         # Force garbage collection
+         gc.collect()
+         initial_objects = len(gc.get_objects())
+
+         # Create and process many objects
+         from app.services.advanced_nlp import AdvancedNLPPipeline
+         pipeline = AdvancedNLPPipeline()
+
+         for i in range(50):
+             await pipeline.process_query(f"test query {i}")
+
+         # Force garbage collection again
+         gc.collect()
+         final_objects = len(gc.get_objects())
+
+         # Object count should not grow excessively
+         object_increase = final_objects - initial_objects
+         assert object_increase < 1000  # Reasonable object increase
+
+ class TestLoadTesting:
+     """Load testing scenarios"""
+
+     @pytest.mark.asyncio
+     async def test_sustained_load_performance(self, async_client, sample_merchant_data):
+         """Test performance under sustained load"""
+         with patch('app.services.merchant.get_merchants') as mock_get:
+             mock_get.return_value = [sample_merchant_data] * 10
+
+             # Simulate sustained load for 30 seconds
+             end_time = time.time() + 30
+             request_count = 0
+             response_times = []
+
+             while time.time() < end_time:
+                 start_time = time.time()
+                 response = await async_client.get("/api/v1/merchants/")
+                 response_time = time.time() - start_time
+
+                 response_times.append(response_time)
+                 request_count += 1
+
+                 assert response.status_code == 200
+
+                 # Small delay to prevent overwhelming
+                 await asyncio.sleep(0.1)
+
+             # Calculate performance metrics
+             avg_response_time = statistics.mean(response_times)
+             max_response_time = max(response_times)
+             requests_per_second = request_count / 30
+
+             assert avg_response_time < 1.0  # Average response time under 1 second
+             assert max_response_time < 3.0  # Max response time under 3 seconds
+             assert requests_per_second > 5  # At least 5 requests per second
+
+     @pytest.mark.asyncio
+     async def test_burst_load_handling(self, async_client, sample_merchant_data):
+         """Test handling of burst load"""
+         with patch('app.services.merchant.search_merchants') as mock_search:
+             mock_search.return_value = [sample_merchant_data] * 5
+
+             # Create burst of 100 concurrent requests
+             tasks = [
+                 async_client.get("/api/v1/merchants/search", params={
+                     "latitude": 40.7128,
+                     "longitude": -74.0060,
+                     "radius": 5000
+                 })
+                 for _ in range(100)
+             ]
+
+             start_time = time.time()
+             responses = await asyncio.gather(*tasks, return_exceptions=True)
+             total_time = time.time() - start_time
+
+             # Count successful responses
+             successful_responses = [r for r in responses if hasattr(r, 'status_code') and r.status_code == 200]
+             success_rate = len(successful_responses) / len(responses)
+
+             assert success_rate > 0.95  # At least 95% success rate
+             assert total_time < 10.0  # Complete within 10 seconds
+
+ class TestPerformanceRegression:
+     """Performance regression detection"""
+
+     @pytest.mark.asyncio
+     async def test_api_response_time_regression(self, async_client, performance_test_data):
+         """Test for API response time regression"""
+         queries = performance_test_data["queries"]
+         max_expected_time = performance_test_data["expected_max_response_time"]
+
+         with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_pipeline:
+             mock_pipeline.process_query.return_value = {
+                 "query": "test",
+                 "primary_intent": {"intent": "SEARCH_SERVICE", "confidence": 0.8},
+                 "entities": {},
+                 "similar_services": [],
+                 "search_parameters": {},
+                 "processing_time": 0.1
+             }
+
+             response_times = []
+
+             for query in queries:
+                 start_time = time.time()
+                 response = await async_client.post("/api/v1/nlp/analyze-query", params={"query": query})
+                 response_time = time.time() - start_time
+                 response_times.append(response_time)
+
+                 assert response.status_code == 200
+
+             avg_response_time = statistics.mean(response_times)
+             max_response_time = max(response_times)
+
+             # Check for performance regression
+             assert avg_response_time < max_expected_time
+             assert max_response_time < max_expected_time * 2  # Allow some variance
+
+     @pytest.mark.asyncio
+     async def test_database_query_performance_regression(self, sample_merchant_data):
+         """Test for database query performance regression"""
+         from app.repositories.db_repository import search_merchants_in_db
+
+         with patch('app.nosql.get_mongodb_client') as mock_client:
+             mock_collection = AsyncMock()
+             mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
+             mock_collection.find.return_value.limit.return_value.to_list.return_value = [sample_merchant_data] * 10
+
+             query_times = []
+
+             # Test multiple search scenarios
+             for i in range(10):
+                 start_time = time.time()
+                 await search_merchants_in_db(
+                     latitude=40.7128 + (i * 0.01),
+                     longitude=-74.0060 + (i * 0.01),
+                     radius=5000,
+                     category="salon"
+                 )
+                 query_time = time.time() - start_time
+                 query_times.append(query_time)
+
+             avg_query_time = statistics.mean(query_times)
+             max_query_time = max(query_times)
+
+             # Database queries should be fast
+             assert avg_query_time < 0.1  # Average under 100ms
+             assert max_query_time < 0.2  # Maximum under 200ms
app/tests/test_regression_suite.py ADDED
@@ -0,0 +1,519 @@
+ """
2
+ Main regression test suite runner and comprehensive system tests
3
+ """
4
+
5
+ import pytest
6
+ import asyncio
7
+ import time
8
+ import statistics
9
+ from typing import Dict, Any, List
10
+ from unittest.mock import patch, AsyncMock
11
+
12
+ class TestRegressionSuite:
13
+ """Main regression test suite for critical functionality"""
14
+
15
+ @pytest.mark.asyncio
16
+ async def test_critical_path_regression(self, async_client, sample_merchant_data):
17
+ """Test the most critical user path for regressions"""
18
+ with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_nlp, \
19
+ patch('app.services.merchant.search_merchants') as mock_search:
20
+
21
+ # Setup mocks for critical path
22
+ mock_nlp.process_query.return_value = {
23
+ "query": "find the best hair salon near me",
24
+ "primary_intent": {"intent": "SEARCH_SERVICE", "confidence": 0.9},
25
+ "entities": {"service_types": ["haircut"], "location_modifiers": ["near me"]},
26
+ "similar_services": [("salon", 0.95)],
27
+ "search_parameters": {"merchant_category": "salon", "radius": 5000},
28
+ "processing_time": 0.15
29
+ }
30
+ mock_search.return_value = [sample_merchant_data]
31
+
32
+ # Critical Path: Health Check -> NLP Processing -> Merchant Search
33
+
34
+ # Step 1: System health
35
+ health_response = await async_client.get("/health")
36
+ assert health_response.status_code == 200
37
+ assert health_response.json()["status"] == "healthy"
38
+
39
+ # Step 2: NLP processing
40
+ nlp_response = await async_client.post("/api/v1/nlp/analyze-query", params={
41
+ "query": "find the best hair salon near me",
42
+ "latitude": 40.7128,
43
+ "longitude": -74.0060
44
+ })
45
+ assert nlp_response.status_code == 200
46
+ nlp_data = nlp_response.json()
47
+ assert nlp_data["status"] == "success"
48
+ assert "analysis" in nlp_data
49
+
50
+ # Step 3: Merchant search using NLP results
51
+ search_params = nlp_data["analysis"]["search_parameters"]
52
+ search_response = await async_client.get("/api/v1/merchants/search", params={
53
+ "latitude": 40.7128,
54
+ "longitude": -74.0060,
55
+ **search_params
56
+ })
57
+ assert search_response.status_code == 200
58
+ merchants = search_response.json()
59
+ assert len(merchants) >= 1
60
+ assert merchants[0]["name"] == "Test Hair Salon"
61
+
62
+ @pytest.mark.asyncio
63
+ async def test_api_contract_regression(self, async_client):
64
+ """Test that API contracts haven't changed unexpectedly"""
65
+ # Test health endpoint contract
66
+ health_response = await async_client.get("/health")
67
+ assert health_response.status_code == 200
68
+ health_data = health_response.json()
69
+
70
+ required_health_fields = ["status", "timestamp", "service", "version"]
71
+ for field in required_health_fields:
72
+ assert field in health_data, f"Health endpoint missing required field: {field}"
73
+
74
+ # Test NLP supported intents contract
75
+ intents_response = await async_client.get("/api/v1/nlp/supported-intents")
76
+ assert intents_response.status_code == 200
77
+ intents_data = intents_response.json()
78
+
79
+ assert "status" in intents_data
80
+ assert "supported_intents" in intents_data
81
+ assert intents_data["status"] == "success"
82
+
83
+ # Verify expected intents are still supported
84
+ expected_intents = ["SEARCH_SERVICE", "FILTER_QUALITY", "FILTER_LOCATION"]
85
+ for intent in expected_intents:
86
+ assert intent in intents_data["supported_intents"]
87
+
88
+ @pytest.mark.asyncio
89
+ async def test_performance_regression_baseline(self, async_client, performance_test_data):
90
+ """Test for performance regressions against baseline"""
91
+ queries = performance_test_data["queries"][:5] # Test subset for speed
92
+ max_expected_time = performance_test_data["expected_max_response_time"]
93
+
94
+ with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_nlp:
95
+ mock_nlp.process_query.return_value = {
96
+ "query": "test",
97
+ "primary_intent": {"intent": "SEARCH_SERVICE", "confidence": 0.8},
98
+ "entities": {},
99
+ "similar_services": [],
100
+ "search_parameters": {},
101
+ "processing_time": 0.1
102
+ }
103
+
104
+ response_times = []
105
+
106
+ for query in queries:
107
+ start_time = time.time()
108
+ response = await async_client.post("/api/v1/nlp/analyze-query", params={"query": query})
109
+ response_time = time.time() - start_time
110
+ response_times.append(response_time)
111
+
112
+ assert response.status_code == 200
113
+
114
+ avg_response_time = statistics.mean(response_times)
115
+ p95_response_time = sorted(response_times)[int(len(response_times) * 0.95)]
116
+
117
+ # Performance regression checks
118
+ assert avg_response_time < max_expected_time, f"Average response time {avg_response_time:.3f}s exceeds baseline {max_expected_time}s"
119
+ assert p95_response_time < max_expected_time * 1.5, f"P95 response time {p95_response_time:.3f}s exceeds acceptable threshold"
120
+
121
+ @pytest.mark.asyncio
122
+ async def test_error_handling_regression(self, async_client):
123
+ """Test that error handling hasn't regressed"""
124
+ # Test various error scenarios
125
+ error_scenarios = [
126
+ # Invalid coordinates
127
+ {
128
+ "endpoint": "/api/v1/merchants/search",
129
+ "params": {"latitude": 999, "longitude": 999},
130
+ "expected_codes": [400, 422]
131
+ },
132
+ # Empty NLP query
133
+ {
134
+ "endpoint": "/api/v1/nlp/analyze-query",
135
+ "params": {"query": ""},
136
+ "expected_codes": [400, 422]
137
+ },
138
+ # Non-existent merchant
139
+ {
140
+ "endpoint": "/api/v1/merchants/nonexistent_id",
141
+ "params": {},
142
+ "expected_codes": [404]
143
+ },
144
+ # Invalid endpoint
145
+ {
146
+ "endpoint": "/api/v1/invalid-endpoint",
147
+ "params": {},
148
+ "expected_codes": [404]
149
+ }
150
+ ]
151
+
152
+ for scenario in error_scenarios:
153
+ if scenario["endpoint"].endswith("analyze-query"):
154
+ response = await async_client.post(scenario["endpoint"], params=scenario["params"])
155
+ else:
156
+ response = await async_client.get(scenario["endpoint"], params=scenario["params"])
157
+
158
+ assert response.status_code in scenario["expected_codes"], \
159
+ f"Endpoint {scenario['endpoint']} returned {response.status_code}, expected one of {scenario['expected_codes']}"
160
+
161
+ @pytest.mark.asyncio
162
+ async def test_data_consistency_regression(self, async_client, sample_merchant_data):
163
+ """Test data consistency across different endpoints"""
164
+ with patch('app.services.merchant.get_merchant_by_id') as mock_get_by_id, \
165
+ patch('app.services.merchant.search_merchants') as mock_search:
166
+
167
+ mock_get_by_id.return_value = sample_merchant_data
168
+ mock_search.return_value = [sample_merchant_data]
169
+
170
+ # Get merchant by ID
171
+ merchant_response = await async_client.get(f"/api/v1/merchants/{sample_merchant_data['_id']}")
172
+ assert merchant_response.status_code == 200
173
+ merchant_data = merchant_response.json()
174
+
175
+ # Search for the same merchant
176
+ search_response = await async_client.get("/api/v1/merchants/search", params={
177
+ "category": sample_merchant_data["category"],
178
+ "latitude": 40.7128,
179
+ "longitude": -74.0060
180
+ })
181
+ assert search_response.status_code == 200
182
+ search_results = search_response.json()
183
+
184
+ # Data should be consistent
185
+ assert len(search_results) >= 1
186
+ found_merchant = next((m for m in search_results if m["_id"] == sample_merchant_data["_id"]), None)
187
+ assert found_merchant is not None
188
+
189
+ # Key fields should match
190
+ key_fields = ["_id", "name", "category", "average_rating"]
191
+ for field in key_fields:
192
+ assert merchant_data[field] == found_merchant[field], f"Field {field} inconsistent between endpoints"
193
+
194
+ class TestSystemStability:
+     """Test system stability under various conditions"""
+ 
+     @pytest.mark.asyncio
+     async def test_concurrent_load_stability(self, async_client, sample_merchant_data):
+         """Test system stability under concurrent load"""
+         with patch('app.services.merchant.get_merchants') as mock_get:
+             mock_get.return_value = [sample_merchant_data] * 10
+ 
+             # Create concurrent requests
+             concurrent_requests = 50
+             tasks = [
+                 async_client.get("/api/v1/merchants/")
+                 for _ in range(concurrent_requests)
+             ]
+ 
+             # Execute all requests concurrently
+             responses = await asyncio.gather(*tasks, return_exceptions=True)
+ 
+             # Analyze results
+             successful_responses = 0
+             error_responses = 0
+             exceptions = 0
+ 
+             for response in responses:
+                 if isinstance(response, Exception):
+                     exceptions += 1
+                 elif hasattr(response, 'status_code'):
+                     if response.status_code == 200:
+                         successful_responses += 1
+                     else:
+                         error_responses += 1
+                 else:
+                     exceptions += 1
+ 
+             # System should handle concurrent load gracefully
+             success_rate = successful_responses / concurrent_requests
+             assert success_rate >= 0.9, f"Success rate {success_rate:.2%} below acceptable threshold"
+             assert exceptions == 0, f"Got {exceptions} exceptions during concurrent load test"
+ 
+     @pytest.mark.asyncio
+     async def test_memory_leak_detection(self, async_client):
+         """Test for potential memory leaks during repeated operations"""
+         import psutil
+         import os
+ 
+         process = psutil.Process(os.getpid())
+         initial_memory = process.memory_info().rss / 1024 / 1024  # MB
+ 
+         # Perform many operations
+         for i in range(100):
+             await async_client.get("/health")
+ 
+             # Occasionally check memory growth
+             if i % 20 == 0:
+                 current_memory = process.memory_info().rss / 1024 / 1024  # MB
+                 memory_growth = current_memory - initial_memory
+ 
+                 # Memory growth should be reasonable
+                 assert memory_growth < 50, f"Excessive memory growth: {memory_growth:.2f}MB after {i} requests"
+ 
+         final_memory = process.memory_info().rss / 1024 / 1024  # MB
+         total_growth = final_memory - initial_memory
+ 
+         # Total memory growth should be reasonable
+         assert total_growth < 100, f"Total memory growth {total_growth:.2f}MB suggests potential memory leak"
+ 
+     @pytest.mark.asyncio
+     async def test_database_connection_resilience(self, async_client, sample_merchant_data):
+         """Test resilience to database connection issues"""
+         with patch('app.nosql.get_mongodb_client') as mock_mongo_client:
+             # Simulate intermittent database failures
+             call_count = 0
+ 
+             def side_effect(*args, **kwargs):
+                 nonlocal call_count
+                 call_count += 1
+                 if call_count % 3 == 0:  # Fail every 3rd call
+                     raise Exception("Database connection timeout")
+ 
+                 # Return mock for successful calls
+                 mock_client = AsyncMock()
+                 mock_collection = AsyncMock()
+                 mock_client.__getitem__.return_value.__getitem__.return_value = mock_collection
+                 mock_collection.find.return_value.limit.return_value.skip.return_value.to_list.return_value = [sample_merchant_data]
+                 return mock_client
+ 
+             mock_mongo_client.side_effect = side_effect
+ 
+             # Make multiple requests
+             success_count = 0
+             error_count = 0
+ 
+             for _ in range(10):
+                 response = await async_client.get("/api/v1/merchants/")
+                 if response.status_code == 200:
+                     success_count += 1
+                 else:
+                     error_count += 1
+ 
+             # System should handle some database failures gracefully
+             assert success_count > 0, "No successful requests despite intermittent failures"
+             # Some failures are expected due to simulated database issues
+ 
+ class TestBackwardCompatibility:
+     """Test backward compatibility of API changes"""
+ 
+     @pytest.mark.asyncio
+     async def test_api_version_compatibility(self, async_client):
+         """Test that API versions remain compatible"""
+         # Test v1 API endpoints still work
+         v1_endpoints = [
+             "/api/v1/merchants/",
+             "/api/v1/nlp/supported-intents",
+             "/api/v1/nlp/supported-entities"
+         ]
+ 
+         for endpoint in v1_endpoints:
+             response = await async_client.get(endpoint)
+             assert response.status_code == 200, f"v1 endpoint {endpoint} is not accessible"
+ 
+     @pytest.mark.asyncio
+     async def test_response_format_compatibility(self, async_client, sample_merchant_data):
+         """Test that response formats haven't broken compatibility"""
+         with patch('app.services.merchant.get_merchants') as mock_get:
+             mock_get.return_value = [sample_merchant_data]
+ 
+             response = await async_client.get("/api/v1/merchants/")
+             assert response.status_code == 200
+ 
+             merchants = response.json()
+             assert isinstance(merchants, list)
+ 
+             if merchants:
+                 merchant = merchants[0]
+                 # Check that essential fields are still present
+                 essential_fields = ["_id", "name", "category", "location"]
+                 for field in essential_fields:
+                     assert field in merchant, f"Essential field {field} missing from merchant response"
+ 
+     @pytest.mark.asyncio
+     async def test_parameter_compatibility(self, async_client):
+         """Test that API parameters remain compatible"""
+         # Test that old parameter names still work
+         response = await async_client.get("/api/v1/merchants/search", params={
+             "latitude": 40.7128,
+             "longitude": -74.0060,
+             "radius": 5000,
+             "category": "salon"
+         })
+ 
+         # Should accept these parameters without error
+         assert response.status_code in [200, 400]  # 400 might be due to mocking, but not 422 (validation error)
+ 
+ class TestDataIntegrity:
+     """Test data integrity across the system"""
+ 
+     @pytest.mark.asyncio
+     async def test_search_result_integrity(self, async_client, sample_merchant_data):
+         """Test integrity of search results"""
+         with patch('app.services.merchant.search_merchants') as mock_search:
+             mock_search.return_value = [sample_merchant_data]
+ 
+             response = await async_client.get("/api/v1/merchants/search", params={
+                 "latitude": 40.7128,
+                 "longitude": -74.0060,
+                 "radius": 5000,
+                 "category": "salon"
+             })
+ 
+             assert response.status_code == 200
+             merchants = response.json()
+ 
+             for merchant in merchants:
+                 # Verify data integrity
+                 assert "_id" in merchant and merchant["_id"]
+                 assert "name" in merchant and merchant["name"]
+                 assert "category" in merchant and merchant["category"]
+ 
+                 # Verify location data integrity
+                 if "location" in merchant:
+                     location = merchant["location"]
+                     assert "type" in location
+                     assert "coordinates" in location
+                     assert len(location["coordinates"]) == 2
+ 
+                     # Coordinates should be valid
+                     lng, lat = location["coordinates"]
+                     assert -180 <= lng <= 180, f"Invalid longitude: {lng}"
+                     assert -90 <= lat <= 90, f"Invalid latitude: {lat}"
+ 
+     @pytest.mark.asyncio
+     async def test_nlp_result_integrity(self, async_client):
+         """Test integrity of NLP processing results"""
+         with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_nlp:
+             mock_nlp.process_query.return_value = {
+                 "query": "find a hair salon",
+                 "primary_intent": {"intent": "SEARCH_SERVICE", "confidence": 0.85},
+                 "entities": {"service_types": ["haircut"]},
+                 "similar_services": [("salon", 0.9)],
+                 "search_parameters": {"merchant_category": "salon"},
+                 "processing_time": 0.123
+             }
+ 
+             response = await async_client.post("/api/v1/nlp/analyze-query", params={
+                 "query": "find a hair salon"
+             })
+ 
+             assert response.status_code == 200
+             data = response.json()
+ 
+             # Verify NLP result integrity
+             assert data["status"] == "success"
+             assert "analysis" in data
+ 
+             analysis = data["analysis"]
+             assert "query" in analysis
+             assert "primary_intent" in analysis
+             assert "entities" in analysis
+             assert "similar_services" in analysis
+             assert "search_parameters" in analysis
+             assert "processing_time" in analysis
+ 
+             # Verify confidence scores are valid
+             if "confidence" in analysis["primary_intent"]:
+                 confidence = analysis["primary_intent"]["confidence"]
+                 assert 0.0 <= confidence <= 1.0, f"Invalid confidence score: {confidence}"
+ 
+ class TestSystemRecovery:
+     """Test system recovery from various failure scenarios"""
+ 
+     @pytest.mark.asyncio
+     async def test_recovery_from_service_failure(self, async_client):
+         """Test recovery when services fail and recover"""
+         with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_nlp:
+             # First, service fails
+             mock_nlp.process_query.side_effect = Exception("Service temporarily unavailable")
+ 
+             response1 = await async_client.post("/api/v1/nlp/analyze-query", params={
+                 "query": "find a salon"
+             })
+             assert response1.status_code == 500
+ 
+             # Then, service recovers
+             mock_nlp.process_query.side_effect = None
+             mock_nlp.process_query.return_value = {
+                 "query": "find a salon",
+                 "primary_intent": {"intent": "SEARCH_SERVICE", "confidence": 0.8},
+                 "entities": {},
+                 "similar_services": [],
+                 "search_parameters": {},
+                 "processing_time": 0.1
+             }
+ 
+             response2 = await async_client.post("/api/v1/nlp/analyze-query", params={
+                 "query": "find a salon"
+             })
+             assert response2.status_code == 200
+ 
+     @pytest.mark.asyncio
+     async def test_graceful_degradation(self, async_client, sample_merchant_data):
+         """Test graceful degradation when advanced features fail"""
+         with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_nlp, \
+              patch('app.services.helper.process_free_text') as mock_fallback, \
+              patch('app.services.merchant.get_merchants') as mock_get_merchants:
+ 
+             # Advanced NLP fails
+             mock_nlp.process_query.side_effect = Exception("NLP service down")
+ 
+             # Fallback service works
+             mock_fallback.return_value = {
+                 "query": "salon",
+                 "extracted_keywords": ["salon"],
+                 "suggested_category": "salon"
+             }
+ 
+             # Basic merchant service works
+             mock_get_merchants.return_value = [sample_merchant_data]
+ 
+             # Should still be able to get basic functionality
+             fallback_response = await async_client.post("/api/v1/helpers/process-text", json={
+                 "text": "salon",
+                 "latitude": 40.7128,
+                 "longitude": -74.0060
+             })
+             assert fallback_response.status_code == 200
+ 
+             merchants_response = await async_client.get("/api/v1/merchants/")
+             assert merchants_response.status_code == 200
+ 
+ # Performance benchmarks for regression detection
+ class TestPerformanceBenchmarks:
+     """Performance benchmarks to detect regressions"""
+ 
+     @pytest.mark.asyncio
+     async def test_response_time_benchmarks(self, async_client):
+         """Benchmark response times for key endpoints"""
+         benchmarks = {
+             "/health": 0.1,  # 100ms
+             "/api/v1/nlp/supported-intents": 0.2,  # 200ms
+             "/api/v1/nlp/supported-entities": 0.2,  # 200ms
+         }
+ 
+         for endpoint, max_time in benchmarks.items():
+             start_time = time.time()
+             response = await async_client.get(endpoint)
+             response_time = time.time() - start_time
+ 
+             assert response.status_code == 200
+             assert response_time < max_time, f"Endpoint {endpoint} took {response_time:.3f}s, exceeds benchmark {max_time}s"
+ 
+     @pytest.mark.asyncio
+     async def test_throughput_benchmarks(self, async_client):
+         """Benchmark throughput for key operations"""
+         # Test health endpoint throughput
+         start_time = time.time()
+         tasks = [async_client.get("/health") for _ in range(20)]
+         responses = await asyncio.gather(*tasks)
+         total_time = time.time() - start_time
+ 
+         # All should succeed
+         assert all(r.status_code == 200 for r in responses)
+ 
+         # Should handle 20 requests reasonably fast
+         requests_per_second = 20 / total_time
+         assert requests_per_second > 10, f"Throughput {requests_per_second:.1f} req/s below benchmark"
app/tests/test_security.py ADDED
@@ -0,0 +1,473 @@
+ """
+ Security regression tests
+ """
+ 
+ import pytest
+ from unittest.mock import patch, MagicMock
+ from fastapi.testclient import TestClient
+ 
+ class TestInputValidation:
+     """Test input validation and sanitization"""
+ 
+     def test_sql_injection_prevention(self, client: TestClient):
+         """Test SQL injection prevention in merchant search"""
+         malicious_input = "'; DROP TABLE merchants; --"
+ 
+         response = client.get("/api/v1/merchants/search", params={
+             "category": malicious_input,
+             "latitude": 40.7128,
+             "longitude": -74.0060
+         })
+ 
+         # Should either sanitize input or return 400
+         assert response.status_code in [200, 400]
+         if response.status_code == 200:
+             # If processed, should not contain malicious SQL
+             data = response.json()
+             assert isinstance(data, list)
+ 
+     def test_xss_prevention_in_nlp_query(self, client: TestClient):
+         """Test XSS prevention in NLP query processing"""
+         xss_payload = "<script>alert('xss')</script>find salon"
+ 
+         with patch('app.services.advanced_nlp.advanced_nlp_pipeline') as mock_nlp:
+             mock_nlp.process_query.return_value = {
+                 "query": "find salon",  # Should be sanitized
+                 "primary_intent": {"intent": "SEARCH_SERVICE", "confidence": 0.8},
+                 "entities": {},
+                 "similar_services": [],
+                 "search_parameters": {},
+                 "processing_time": 0.1
+             }
+ 
+             response = client.post("/api/v1/nlp/analyze-query", params={
+                 "query": xss_payload
+             })
+ 
+             assert response.status_code == 200
+             data = response.json()
+             # Script tags should be removed/sanitized
+             assert "<script>" not in str(data)
+ 
+     def test_command_injection_prevention(self, client: TestClient):
+         """Test command injection prevention"""
+         command_injection = "; rm -rf /"
+ 
+         response = client.post("/api/v1/helpers/process-text", json={
+             "text": f"find salon{command_injection}",
+             "latitude": 40.7128,
+             "longitude": -74.0060
+         })
+ 
+         # Should handle malicious input safely
+         assert response.status_code in [200, 400]
+ 
+     def test_path_traversal_prevention(self, client: TestClient):
+         """Test path traversal prevention in merchant ID"""
+         path_traversal = "../../../etc/passwd"
+ 
+         response = client.get(f"/api/v1/merchants/{path_traversal}")
+ 
+         # Should not allow path traversal
+         assert response.status_code in [400, 404]
+ 
+     def test_large_payload_handling(self, client: TestClient):
+         """Test handling of excessively large payloads"""
+         large_text = "A" * (10 * 1024 * 1024)  # 10MB
+ 
+         response = client.post("/api/v1/helpers/process-text", json={
+             "text": large_text,
+             "latitude": 40.7128,
+             "longitude": -74.0060
+         })
+ 
+         # Should reject or handle large payloads appropriately
+         assert response.status_code in [400, 413, 422]
+ 
+     def test_invalid_coordinates_handling(self, client: TestClient):
+         """Test handling of invalid coordinates"""
+         invalid_coords = [
+             (999, 999),      # Out of range
+             (-999, -999),    # Out of range
+             ("abc", "def"),  # Non-numeric
+             (None, None)     # Null values
+         ]
+ 
+         for lat, lng in invalid_coords:
+             response = client.get("/api/v1/merchants/search", params={
+                 "latitude": lat,
+                 "longitude": lng,
+                 "radius": 5000
+             })
+ 
+             # Should handle invalid coordinates gracefully
+             assert response.status_code in [200, 400, 422]
+ 
+ class TestAuthentication:
+     """Test authentication mechanisms (if implemented)"""
+ 
+     def test_unauthenticated_access_to_public_endpoints(self, client: TestClient):
+         """Test that public endpoints don't require authentication"""
+         public_endpoints = [
+             "/health",
+             "/api/v1/merchants/",
+             "/api/v1/merchants/search"
+         ]
+ 
+         for endpoint in public_endpoints:
+             response = client.get(endpoint)
+             # Should not require authentication
+             assert response.status_code != 401
+ 
+     def test_api_key_validation(self, client: TestClient):
+         """Test API key validation if implemented"""
+         # This test assumes API key authentication might be implemented
+         invalid_api_key = "invalid_key_12345"
+ 
+         response = client.get("/api/v1/merchants/", headers={
+             "X-API-Key": invalid_api_key
+         })
+ 
+         # Should either ignore invalid key or reject it
+         assert response.status_code in [200, 401, 403]
+ 
+ class TestAuthorization:
+     """Test authorization and access control"""
+ 
+     def test_admin_endpoint_access(self, client: TestClient):
+         """Test access to admin endpoints"""
+         # Assuming there might be admin endpoints
+         admin_endpoints = [
+             "/admin/users",
+             "/admin/merchants",
+             "/admin/system"
+         ]
+ 
+         for endpoint in admin_endpoints:
+             response = client.get(endpoint)
+             # Should require proper authorization or not exist
+             assert response.status_code in [401, 403, 404]
+ 
+     def test_user_data_isolation(self, client: TestClient):
+         """Test that users can only access their own data"""
+         # This would be relevant if user-specific data exists
+         user_id = "user123"
+         other_user_id = "user456"
+ 
+         # Try to access another user's data
+         response = client.get(f"/api/v1/users/{other_user_id}/data", headers={
+             "User-ID": user_id
+         })
+ 
+         # Should not allow access to other user's data
+         assert response.status_code in [401, 403, 404]
+ 
+ class TestDataProtection:
+     """Test data protection and privacy"""
+ 
+     def test_sensitive_data_not_exposed(self, client: TestClient, sample_merchant_data):
+         """Test that sensitive data is not exposed in API responses"""
+         with patch('app.services.merchant.get_merchants') as mock_get:
+             # Add sensitive data to mock
+             merchant_with_sensitive = sample_merchant_data.copy()
+             merchant_with_sensitive.update({
+                 "internal_id": "INTERNAL_123",
+                 "api_key": "secret_api_key",
+                 "database_password": "secret_password",
+                 "private_notes": "Internal business notes"
+             })
+             mock_get.return_value = [merchant_with_sensitive]
+ 
+             response = client.get("/api/v1/merchants/")
+             assert response.status_code == 200
+ 
+             data = response.json()
+             response_text = str(data)
+ 
+             # Sensitive fields should not be in response
+             sensitive_fields = ["api_key", "database_password", "internal_id"]
+             for field in sensitive_fields:
+                 assert field not in response_text
+ 
+     def test_error_message_information_disclosure(self, client: TestClient):
+         """Test that error messages don't disclose sensitive information"""
+         with patch('app.services.merchant.get_merchants') as mock_get:
+             # Simulate database error with sensitive info
+             mock_get.side_effect = Exception("Database connection failed: host=internal-db.company.com, user=admin, password=secret123")
+ 
+             response = client.get("/api/v1/merchants/")
+             assert response.status_code == 500
+ 
+             # Error response should not contain sensitive database info
+             error_text = str(response.json())
+             sensitive_info = ["password=", "host=internal-db", "user=admin"]
+             for info in sensitive_info:
+                 assert info not in error_text
+ 
+     def test_log_sanitization(self, client: TestClient):
+         """Test that logs don't contain sensitive information"""
+         # This would require checking actual log output.
+         # For now, test that endpoints handle sensitive data properly.
+ 
+         sensitive_query = "my credit card is 4111-1111-1111-1111"
+ 
+         response = client.post("/api/v1/nlp/analyze-query", params={
+             "query": sensitive_query
+         })
+ 
+         # Should process without exposing sensitive data
+         assert response.status_code in [200, 400]
+ 
+ class TestCORSAndHeaders:
+     """Test CORS and security headers"""
+ 
+     def test_cors_configuration(self, client: TestClient):
+         """Test CORS configuration"""
+         # Test preflight request
+         response = client.options("/api/v1/merchants/", headers={
+             "Origin": "http://localhost:3000",
+             "Access-Control-Request-Method": "GET"
+         })
+ 
+         # Should handle CORS properly
+         assert response.status_code in [200, 204]
+ 
+     def test_cors_origin_validation(self, client: TestClient):
+         """Test CORS origin validation"""
+         # Test with allowed origin
+         response = client.get("/api/v1/merchants/", headers={
+             "Origin": "http://localhost:3000"
+         })
+         assert response.status_code == 200
+ 
+         # Test with disallowed origin
+         response = client.get("/api/v1/merchants/", headers={
+             "Origin": "http://malicious-site.com"
+         })
+         # Should still work but without CORS headers for invalid origin
+         assert response.status_code == 200
+ 
+     def test_security_headers(self, client: TestClient):
+         """Test security headers in responses"""
+         response = client.get("/health")
+ 
+         # Check for common security headers
+         headers = response.headers
+ 
+         # These might not be implemented yet, but good to test for
+         security_headers = [
+             "X-Content-Type-Options",
+             "X-Frame-Options",
+             "X-XSS-Protection",
+             "Strict-Transport-Security"
+         ]
+ 
+         # At minimum, should not have dangerous headers
+         dangerous_headers = [
+             "Server",        # Should not expose server details
+             "X-Powered-By"   # Should not expose technology stack
+         ]
+ 
+         for header in dangerous_headers:
+             if header in headers:
+                 # If present, should not contain sensitive info
+                 assert "internal" not in headers[header].lower()
+                 assert "secret" not in headers[header].lower()
+ 
+ class TestRateLimiting:
+     """Test rate limiting and abuse prevention"""
+ 
+     @pytest.mark.asyncio
+     async def test_rate_limiting_basic(self, async_client):
+         """Test basic rate limiting functionality"""
+         # Make many requests quickly
+         responses = []
+         for _ in range(100):
+             response = await async_client.get("/health")
+             responses.append(response)
+ 
+         # Should either all succeed or some be rate limited
+         status_codes = [r.status_code for r in responses]
+ 
+         # All should be either 200 (OK) or 429 (Too Many Requests)
+         assert all(code in [200, 429] for code in status_codes)
+ 
+     def test_rate_limiting_per_endpoint(self, client: TestClient):
+         """Test rate limiting per endpoint"""
+         endpoints = [
+             "/health",
+             "/api/v1/merchants/",
+             "/api/v1/nlp/supported-intents"
+         ]
+ 
+         for endpoint in endpoints:
+             # Make multiple requests to each endpoint
+             responses = []
+             for _ in range(20):
+                 response = client.get(endpoint)
+                 responses.append(response)
+ 
+             # Should handle multiple requests appropriately
+             status_codes = [r.status_code for r in responses]
+             assert all(code in [200, 429, 500] for code in status_codes)
+ 
+ class TestInputSanitization:
+     """Test comprehensive input sanitization"""
+ 
+     def test_html_sanitization(self, client: TestClient):
+         """Test HTML tag sanitization"""
+         html_inputs = [
+             "<b>bold text</b>",
+             "<img src='x' onerror='alert(1)'>",
+             "<iframe src='javascript:alert(1)'></iframe>",
+             "<<script>alert('xss')</script>script>alert('xss')<</script>/script>"
+         ]
+ 
+         for html_input in html_inputs:
+             response = client.post("/api/v1/helpers/process-text", json={
+                 "text": html_input,
+                 "latitude": 40.7128,
+                 "longitude": -74.0060
+             })
+ 
+             # Should handle HTML input safely
+             assert response.status_code in [200, 400]
+             if response.status_code == 200:
+                 # Response should not contain dangerous HTML
+                 response_text = str(response.json())
+                 assert "<script>" not in response_text
+                 assert "javascript:" not in response_text
+ 
+     def test_unicode_handling(self, client: TestClient):
+         """Test Unicode and special character handling"""
+         unicode_inputs = [
+             "cafΓ© rΓ©sumΓ© naΓ―ve",   # Accented characters
+             "πŸͺπŸ”πŸ’‡β€β™€οΈ",              # Emojis
+             "桋试中文字符",          # Chinese characters
+             "тСст ΠΊΠΈΡ€ΠΈΠ»Π»ΠΈΡ†Π°",      # Cyrillic
+             "\u0000\u0001\u0002",  # Control characters
+         ]
+ 
+         for unicode_input in unicode_inputs:
+             response = client.post("/api/v1/nlp/analyze-query", params={
+                 "query": unicode_input
+             })
+ 
+             # Should handle Unicode safely
+             assert response.status_code in [200, 400]
+ 
+     def test_numeric_input_validation(self, client: TestClient):
+         """Test numeric input validation"""
+         invalid_numeric_inputs = [
+             ("latitude", "not_a_number"),
+             ("longitude", "infinity"),
+             ("radius", "-1000"),
+             ("limit", "999999999999999999999"),
+             ("skip", "-1")
+         ]
+ 
+         for param, value in invalid_numeric_inputs:
+             response = client.get("/api/v1/merchants/search", params={
+                 param: value,
+                 "latitude": 40.7128 if param != "latitude" else value,
+                 "longitude": -74.0060 if param != "longitude" else value
+             })
+ 
+             # Should validate numeric inputs
+             assert response.status_code in [200, 400, 422]
+ 
+ class TestDatabaseSecurity:
+     """Test database security measures"""
+ 
+     @pytest.mark.asyncio
+     async def test_mongodb_injection_prevention(self):
+         """Test MongoDB injection prevention"""
+         from app.repositories.db_repository import search_merchants_in_db
+ 
+         # MongoDB injection attempts
+         injection_attempts = [
+             {"$where": "this.name == 'test'"},
+             {"$regex": ".*"},
+             {"$ne": None}
+         ]
+ 
+         with patch('app.nosql.get_mongodb_client') as mock_client:
+             mock_collection = MagicMock()
+             mock_client.return_value.__getitem__.return_value.__getitem__.return_value = mock_collection
+             mock_collection.find.return_value.limit.return_value.to_list.return_value = []
+ 
+             for injection in injection_attempts:
+                 try:
+                     # Should sanitize or reject injection attempts
+                     await search_merchants_in_db(category=injection)
+                     # If it doesn't raise an exception, check that the query was sanitized
+                     call_args = mock_collection.find.call_args
+                     if call_args:
+                         query = call_args[0][0]
+                         # Should not contain MongoDB operators in user input
+                         assert "$where" not in str(query.get("category", ""))
+                 except (ValueError, TypeError):
+                     # Expected for invalid input
+                     pass
+ 
+     def test_connection_string_security(self):
+         """Test that database connection strings don't expose credentials"""
+         from app.nosql import get_mongodb_client
+ 
+         # This test ensures connection strings are properly configured.
+         # In a real scenario, you'd check that credentials aren't hardcoded.
+         client = get_mongodb_client()
+ 
+         # Should have a client instance
+         assert client is not None
+ 
+         # The connection string should not be exposed in error messages;
+         # verifying that would require triggering a connection error and checking the message.
+ 
+ class TestAPISecurityBestPractices:
+     """Test API security best practices"""
+ 
+     def test_http_methods_restriction(self, client: TestClient):
+         """Test that endpoints only accept appropriate HTTP methods"""
+         # Test that GET endpoints don't accept POST
+         response = client.post("/api/v1/merchants/")
+         assert response.status_code == 405  # Method Not Allowed
+ 
+         # Test that POST endpoints don't accept GET
+         response = client.get("/api/v1/helpers/process-text")
+         assert response.status_code == 405  # Method Not Allowed
+ 
+     def test_content_type_validation(self, client: TestClient):
+         """Test content type validation"""
+         # Send JSON data with wrong content type
+         response = client.post(
+             "/api/v1/helpers/process-text",
+             data='{"text": "test"}',
+             headers={"Content-Type": "text/plain"}
+         )
+ 
+         # Should reject or handle appropriately
+         assert response.status_code in [400, 415, 422]
+ 
+     def test_parameter_pollution(self, client: TestClient):
+         """Test handling of parameter pollution"""
+         # Send duplicate parameters
+         response = client.get("/api/v1/merchants/search?category=salon&category=spa")
+ 
+         # Should handle duplicate parameters appropriately
+         assert response.status_code in [200, 400]
+ 
+     def test_null_byte_injection(self, client: TestClient):
+         """Test null byte injection prevention"""
+         null_byte_input = "test\x00malicious"
+ 
+         response = client.post("/api/v1/nlp/analyze-query", params={
+             "query": null_byte_input
+         })
+ 
+         # Should handle null bytes safely
+         assert response.status_code in [200, 400]
+         if response.status_code == 200:
+             # Response should not contain null bytes
+             response_text = str(response.json())
+             assert "\x00" not in response_text
app/tests/test_services.py ADDED
@@ -0,0 +1,439 @@
+ """
2
+ Regression tests for service layer components
3
+ """
4
+
5
+ import pytest
6
+ from unittest.mock import AsyncMock, MagicMock, patch
7
+ from typing import Dict, Any
8
+
9
+ class TestMerchantService:
+     """Test merchant service functionality"""
+ 
+     @patch('app.nosql.db')
+     @pytest.mark.asyncio
+     async def test_get_merchants_success(self, mock_db, sample_merchant_data):
+         """Test successful merchant retrieval"""
+         from app.services.merchant import get_merchants
+ 
+         # Mock the MongoDB collection; find() returns a cursor synchronously,
+         # so only the terminal to_list() call is awaitable
+         mock_collection = MagicMock()
+         mock_db.__getitem__.return_value = mock_collection
+         mock_collection.find.return_value.limit.return_value.skip.return_value.to_list = AsyncMock(return_value=[sample_merchant_data])
+ 
+         result = await get_merchants(limit=10, skip=0)
+ 
+         assert len(result) == 1
+         assert result[0]["name"] == "Test Hair Salon"
+ 
+     @patch('app.nosql.db')
+     @pytest.mark.asyncio
+     async def test_get_merchant_by_id_success(self, mock_db, sample_merchant_data):
+         """Test successful merchant retrieval by ID"""
+         from app.services.merchant import get_merchant_by_id
+ 
+         # Mock the MongoDB collection; find_one() is awaited directly
+         mock_collection = MagicMock()
+         mock_db.__getitem__.return_value = mock_collection
+         mock_collection.find_one = AsyncMock(return_value=sample_merchant_data)
+ 
+         result = await get_merchant_by_id("test_merchant_123")
+ 
+         assert result["_id"] == "test_merchant_123"
+         assert result["name"] == "Test Hair Salon"
+ 
+     @patch('app.nosql.db')
+     @pytest.mark.asyncio
+     async def test_get_merchant_by_id_not_found(self, mock_db):
+         """Test merchant not found scenario"""
+         from app.services.merchant import get_merchant_by_id
+ 
+         # Mock the MongoDB collection
+         mock_collection = MagicMock()
+         mock_db.__getitem__.return_value = mock_collection
+         mock_collection.find_one = AsyncMock(return_value=None)
+ 
+         result = await get_merchant_by_id("nonexistent_id")
+ 
+         assert result is None
+ 
+     @patch('app.nosql.db')
+     @pytest.mark.asyncio
+     async def test_search_merchants_with_location(self, mock_db, sample_merchant_data):
+         """Test merchant search with location parameters"""
+         from app.services.merchant import search_merchants
+ 
+         # Mock the MongoDB collection
+         mock_collection = MagicMock()
+         mock_db.__getitem__.return_value = mock_collection
+         mock_collection.find.return_value.limit.return_value.to_list = AsyncMock(return_value=[sample_merchant_data])
+ 
+         result = await search_merchants(
+             latitude=40.7128,
+             longitude=-74.0060,
+             radius=5000,
+             category="salon"
+         )
+ 
+         assert len(result) == 1
+         assert result[0]["category"] == "salon"
+ 
+     @patch('app.nosql.redis_client')
+     @patch('app.nosql.db')
+     @pytest.mark.asyncio
+     async def test_merchant_caching(self, mock_db, mock_redis, sample_merchant_data):
+         """Test merchant data caching"""
+         from app.services.merchant import get_merchants
+ 
+         # Mock the MongoDB collection
+         mock_collection = MagicMock()
+         mock_db.__getitem__.return_value = mock_collection
+         mock_collection.find.return_value.limit.return_value.skip.return_value.to_list = AsyncMock(return_value=[sample_merchant_data])
+ 
+         # Mock Redis; the async client's get/set calls are awaited
+         mock_redis.get = AsyncMock(return_value=None)  # Cache miss
+         mock_redis.set = AsyncMock(return_value=True)
+ 
+         result = await get_merchants(limit=10, skip=0)
+ 
+         assert len(result) == 1
+         assert result[0]["name"] == "Test Hair Salon"
+ 
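The cursor mocks above assume an async MongoDB driver such as Motor, where `find()` returns a cursor synchronously and only the terminal `to_list()` call is awaited. A minimal self-contained sketch of that mocking pattern (the query shape and names here are illustrative, not taken from the app):

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

# find() returns a cursor synchronously; only to_list() is awaitable,
# so the chain is built with MagicMock and capped with an AsyncMock.
mock_collection = MagicMock()
mock_collection.find.return_value.limit.return_value.skip.return_value.to_list = AsyncMock(
    return_value=[{"name": "Test Hair Salon"}]
)

async def fetch_merchants(limit: int, skip: int):
    # Mirrors the query shape the merchant-service tests exercise
    cursor = mock_collection.find({}).limit(limit).skip(skip)
    return await cursor.to_list(length=limit)

result = asyncio.run(fetch_merchants(10, 0))
```

If the collection itself were an `AsyncMock`, `find()` would return a coroutine and the `.limit(...).skip(...)` chain would fail, which is why the cursor chain uses `MagicMock`.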
+ class TestHelperService:
+     """Test helper service functionality"""
+ 
+     @pytest.mark.asyncio
+     async def test_process_free_text_basic(self):
+         """Test basic free text processing"""
+         from app.services.helper import process_free_text
+ 
+         result = await process_free_text("find a hair salon", 40.7128, -74.0060)
+ 
+         assert "query" in result
+         assert "extracted_keywords" in result
+         assert result["query"] == "find a hair salon"
+ 
+     @pytest.mark.asyncio
+     async def test_process_free_text_with_location(self):
+         """Test free text processing with location"""
+         from app.services.helper import process_free_text
+ 
+         result = await process_free_text("salon near me", 40.7128, -74.0060)
+ 
+         assert "location" in result or "search_parameters" in result
+         assert result["query"] == "salon near me"
+ 
+     @pytest.mark.asyncio
+     async def test_process_free_text_empty_input(self):
+         """Test free text processing with empty input"""
+         from app.services.helper import process_free_text
+ 
+         result = await process_free_text("", 40.7128, -74.0060)
+ 
+         # Should handle empty input gracefully
+         assert "query" in result
+         assert result["query"] == ""
+ 
+     @pytest.mark.asyncio
+     async def test_extract_keywords(self):
+         """Test keyword extraction functionality"""
+         from app.services.helper import extract_keywords
+ 
+         keywords = extract_keywords("find the best hair salon with parking")
+ 
+         assert isinstance(keywords, list)
+         assert len(keywords) > 0
+         # Should extract relevant keywords
+         assert any("hair" in kw.lower() or "salon" in kw.lower() for kw in keywords)
+ 
+     @pytest.mark.asyncio
+     async def test_suggest_category(self):
+         """Test category suggestion functionality"""
+         from app.services.helper import suggest_category
+ 
+         category = suggest_category("hair salon")
+ 
+         assert category is not None
+         assert category in ["salon", "beauty", "hair_salon"]
+ 
+ class TestSearchHelpers:
+     """Test search helper functionality"""
+ 
+     @pytest.mark.asyncio
+     async def test_build_search_query(self):
+         """Test search query building"""
+         from app.services.search_helpers import build_search_query
+ 
+         params = {
+             "category": "salon",
+             "latitude": 40.7128,
+             "longitude": -74.0060,
+             "radius": 5000
+         }
+ 
+         query = build_search_query(params)
+ 
+         assert isinstance(query, dict)
+         assert "location" in query or "category" in query
+ 
+     @pytest.mark.asyncio
+     async def test_apply_filters(self):
+         """Test filter application"""
+         from app.services.search_helpers import apply_filters
+ 
+         merchants = [
+             {"name": "Salon A", "average_rating": 4.5, "price_range": "$$"},
+             {"name": "Salon B", "average_rating": 3.5, "price_range": "$"},
+             {"name": "Salon C", "average_rating": 4.8, "price_range": "$$$"}
+         ]
+ 
+         filters = {"min_rating": 4.0, "max_price_range": "$$"}
+ 
+         filtered = apply_filters(merchants, filters)
+ 
+         assert len(filtered) <= len(merchants)
+         # Should only include merchants meeting criteria
+         for merchant in filtered:
+             assert merchant["average_rating"] >= 4.0
+ 
+     @pytest.mark.asyncio
+     async def test_sort_results(self):
+         """Test result sorting"""
+         from app.services.search_helpers import sort_results
+ 
+         merchants = [
+             {"name": "Salon A", "average_rating": 4.5, "distance": 1000},
+             {"name": "Salon B", "average_rating": 4.8, "distance": 2000},
+             {"name": "Salon C", "average_rating": 4.2, "distance": 500}
+         ]
+ 
+         # Sort by rating (descending)
+         sorted_by_rating = sort_results(merchants, "rating")
+         assert sorted_by_rating[0]["average_rating"] >= sorted_by_rating[1]["average_rating"]
+ 
+         # Sort by distance (ascending)
+         sorted_by_distance = sort_results(merchants, "distance")
+         assert sorted_by_distance[0]["distance"] <= sorted_by_distance[1]["distance"]
+ 
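The filtering behavior exercised by `test_apply_filters` could be satisfied by a function like the following. This is a hypothetical stand-in shaped only by the test's assertions, not the app's actual `apply_filters`; the `PRICE_ORDER` ranking is an assumption:

```python
# Illustrative stand-in for app.services.search_helpers.apply_filters.
# Price tiers are assumed to rank "$" < "$$" < "$$$".
PRICE_ORDER = {"$": 1, "$$": 2, "$$$": 3}

def apply_filters(merchants, filters):
    min_rating = filters.get("min_rating", 0.0)
    max_price = PRICE_ORDER.get(filters.get("max_price_range", "$$$"), 3)
    return [
        m for m in merchants
        if m["average_rating"] >= min_rating
        and PRICE_ORDER.get(m["price_range"], 3) <= max_price
    ]

merchants = [
    {"name": "Salon A", "average_rating": 4.5, "price_range": "$$"},
    {"name": "Salon B", "average_rating": 3.5, "price_range": "$"},
    {"name": "Salon C", "average_rating": 4.8, "price_range": "$$$"},
]
filtered = apply_filters(merchants, {"min_rating": 4.0, "max_price_range": "$$"})
```

Here Salon B is dropped for its rating and Salon C for its price tier, leaving only Salon A, which matches what the test asserts about surviving merchants.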
+ class TestAdvancedNLPService:
+     """Test advanced NLP service functionality"""
+ 
+     @pytest.mark.asyncio
+     async def test_nlp_pipeline_initialization(self):
+         """Test NLP pipeline initialization"""
+         from app.services.advanced_nlp import AdvancedNLPPipeline
+ 
+         pipeline = AdvancedNLPPipeline()
+ 
+         assert pipeline.intent_classifier is not None
+         assert pipeline.entity_extractor is not None
+         assert pipeline.semantic_matcher is not None
+         assert pipeline.context_processor is not None
+ 
+     @pytest.mark.asyncio
+     async def test_intent_classification(self):
+         """Test intent classification"""
+         from app.services.advanced_nlp import IntentClassifier
+ 
+         classifier = IntentClassifier()
+ 
+         # Test search intent
+         intent, confidence = classifier.get_primary_intent("find a hair salon")
+         assert intent == "SEARCH_SERVICE"
+         assert confidence > 0.0
+ 
+         # Test quality filter intent
+         intent, confidence = classifier.get_primary_intent("best salon in town")
+         assert intent == "FILTER_QUALITY"
+         assert confidence > 0.0
+ 
+     @pytest.mark.asyncio
+     async def test_entity_extraction(self):
+         """Test entity extraction"""
+         from app.services.advanced_nlp import BusinessEntityExtractor
+ 
+         extractor = BusinessEntityExtractor()
+ 
+         entities = extractor.extract_entities("hair salon with parking near me")
+ 
+         assert isinstance(entities, dict)
+         assert "service_types" in entities or "amenities" in entities
+ 
+     @pytest.mark.asyncio
+     async def test_semantic_matching(self):
+         """Test semantic matching"""
+         from app.services.advanced_nlp import SemanticMatcher
+ 
+         matcher = SemanticMatcher()
+ 
+         matches = matcher.find_similar_services("hair salon")
+ 
+         assert isinstance(matches, list)
+         assert len(matches) > 0
+         # Should return tuples of (service, similarity_score)
+         for match in matches:
+             assert len(match) == 2
+             assert isinstance(match[1], float)
+             assert 0.0 <= match[1] <= 1.0
+ 
+     @pytest.mark.asyncio
+     async def test_context_processing(self):
+         """Test context-aware processing"""
+         from app.services.advanced_nlp import ContextAwareProcessor
+ 
+         processor = ContextAwareProcessor()
+ 
+         result = await processor.process_with_context(
+             "spa treatment",
+             {"service_categories": ["spa"]},
+             [("spa", 0.9)]
+         )
+ 
+         assert isinstance(result, dict)
+ 
+     @pytest.mark.asyncio
+     async def test_complete_nlp_pipeline(self):
+         """Test complete NLP pipeline processing"""
+         from app.services.advanced_nlp import AdvancedNLPPipeline
+ 
+         pipeline = AdvancedNLPPipeline()
+ 
+         result = await pipeline.process_query("find the best hair salon near me")
+ 
+         assert "query" in result
+         assert "primary_intent" in result
+         assert "entities" in result
+         assert "similar_services" in result
+         assert "search_parameters" in result
+         assert "processing_time" in result
+ 
+     @pytest.mark.asyncio
+     async def test_nlp_caching(self):
+         """Test NLP result caching"""
+         from app.services.advanced_nlp import AsyncNLPProcessor
+ 
+         processor = AsyncNLPProcessor()
+ 
+         def dummy_processor(text):
+             return {"processed": text.upper()}
+ 
+         # First call populates the cache
+         result1 = await processor.process_async("test", dummy_processor)
+ 
+         # Second call should use the cache
+         result2 = await processor.process_async("test", dummy_processor)
+ 
+         assert result1 == result2
+ 
+     @pytest.mark.asyncio
+     async def test_nlp_error_handling(self):
+         """Test NLP error handling"""
+         from app.services.advanced_nlp import AdvancedNLPPipeline
+ 
+         pipeline = AdvancedNLPPipeline()
+ 
+         # Test with empty query; should handle gracefully without crashing
+         result = await pipeline.process_query("")
+ 
+         assert "query" in result
+         assert result["query"] == ""
+ 
+ class TestServiceIntegration:
+     """Test service integration scenarios"""
+ 
+     @patch('app.services.merchant.search_merchants')
+     @patch('app.services.advanced_nlp.advanced_nlp_pipeline')
+     @pytest.mark.asyncio
+     async def test_nlp_to_merchant_search_integration(self, mock_nlp, mock_search, sample_merchant_data):
+         """Test integration between NLP processing and merchant search"""
+         # Mock NLP pipeline response; process_query is a coroutine, so it
+         # needs an AsyncMock to be awaitable
+         mock_nlp.process_query = AsyncMock(return_value={
+             "search_parameters": {
+                 "merchant_category": "salon",
+                 "radius": 5000,
+                 "amenities": ["parking"]
+             }
+         })
+ 
+         # Mock merchant search response
+         mock_search.return_value = [sample_merchant_data]
+ 
+         # Simulate the integration
+         nlp_result = await mock_nlp.process_query("salon with parking")
+         search_params = nlp_result["search_parameters"]
+ 
+         merchants = await mock_search(**search_params)
+ 
+         assert len(merchants) == 1
+         assert merchants[0]["category"] == "salon"
+ 
+     @patch('app.repositories.cache_repository.cache_search_results')
+     @patch('app.services.merchant.search_merchants')
+     @pytest.mark.asyncio
+     async def test_search_result_caching(self, mock_search, mock_cache, sample_merchant_data):
+         """Test search result caching integration"""
+         mock_search.return_value = [sample_merchant_data]
+ 
+         # Perform search
+         results = await mock_search(category="salon")
+ 
+         # Simulate the caching step the real search path performs, then
+         # verify the cache hook was invoked exactly once
+         mock_cache(results)
+         mock_cache.assert_called_once()
+         assert len(results) == 1
+ 
+     @pytest.mark.asyncio
+     async def test_error_propagation(self):
+         """Test error propagation between services"""
+         from app.services.merchant import get_merchant_by_id
+ 
+         with patch('app.nosql.db') as mock_db:
+             mock_collection = MagicMock()
+             mock_db.__getitem__.return_value = mock_collection
+             mock_collection.find_one = AsyncMock(side_effect=Exception("Database connection failed"))
+ 
+             with pytest.raises(Exception) as exc_info:
+                 await get_merchant_by_id("test_id")
+ 
+             assert "Database connection failed" in str(exc_info.value)
+ 
+ class TestServicePerformance:
+     """Test service performance characteristics"""
+ 
+     @pytest.mark.asyncio
+     async def test_concurrent_merchant_requests(self, sample_merchant_data):
+         """Test concurrent merchant service requests"""
+         import asyncio
+         from app.services.merchant import get_merchant_by_id
+ 
+         with patch('app.nosql.db') as mock_db:
+             mock_collection = MagicMock()
+             mock_db.__getitem__.return_value = mock_collection
+             mock_collection.find_one = AsyncMock(return_value=sample_merchant_data)
+ 
+             # Create multiple concurrent requests
+             tasks = [
+                 get_merchant_by_id(f"merchant_{i}")
+                 for i in range(10)
+             ]
+ 
+             results = await asyncio.gather(*tasks)
+ 
+             assert len(results) == 10
+             assert all(result["name"] == "Test Hair Salon" for result in results)
+ 
+     @pytest.mark.asyncio
+     async def test_nlp_processing_performance(self):
+         """Test NLP processing performance"""
+         import time
+         from app.services.advanced_nlp import AdvancedNLPPipeline
+ 
+         pipeline = AdvancedNLPPipeline()
+ 
+         start_time = time.time()
+         result = await pipeline.process_query("find a hair salon")
+         processing_time = time.time() - start_time
+ 
+         # Should process within a reasonable time budget
+         assert processing_time < 2.0  # 2 seconds max
+         assert "processing_time" in result
+         assert result["processing_time"] > 0