vkoottu committed (verified)
Commit 7a5665b · 1 Parent(s): 7be4a5c

Upload 7 files

Files changed (7)
  1. .gitignore +44 -0
  2. Dockerfile +12 -0
  3. README.md +229 -10
  4. config.py +99 -0
  5. handler.py +354 -0
  6. main.py +176 -0
  7. requirements.txt +15 -0
.gitignore ADDED
@@ -0,0 +1,44 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+
+ # Environment
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+ env.bak/
+ venv.bak/
+
+ # IDE
+ .vscode/
+ .idea/
+ *.swp
+ *.swo
+
+ # OS
+ .DS_Store
+ Thumbs.db
+
+ # Application specific
+ user_preferences.json
+ *.log
Dockerfile ADDED
@@ -0,0 +1,12 @@
+ FROM python:3.10-slim
+
+ WORKDIR /code
+
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY . .
+
+ EXPOSE 7860
+
+ CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "7860"]
README.md CHANGED
@@ -1,10 +1,229 @@
- ---
- title: FaceMatch Azure Dev
- emoji: 🐨
- colorFrom: red
- colorTo: green
- sdk: docker
- pinned: false
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # FaceMatch FastAPI
+
+ A face matching and recommendation system built with FastAPI, InsightFace, and Azure Blob Storage. The application provides personalized face recommendations based on user preferences and similarity matching.
+
+ ## Features
+
+ - **Face Detection & Embedding**: Uses InsightFace for robust face detection and embedding extraction
+ - **Similarity Matching**: Finds similar faces using cosine similarity on face embeddings
+ - **Personalized Recommendations**: Learns from user likes/dislikes to provide personalized matches
+ - **Gender Filtering**: Filter recommendations by gender (male, female, or all)
+ - **Azure Integration**: Stores images and embeddings in Azure Blob Storage
+ - **FastAPI**: Modern, fast web framework with automatic API documentation
+
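The similarity matching listed above reduces to cosine similarity between InsightFace's 512-dimensional embeddings (the code uses `scipy.spatial.distance.cosine`, which is `1 - similarity`). A toy 2-d sketch of the metric, with the helper name chosen here for illustration:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Parallel embeddings score 1.0; orthogonal embeddings score 0.0
assert abs(cosine_similarity([1.0, 0.0], [2.0, 0.0]) - 1.0) < 1e-9
assert abs(cosine_similarity([1.0, 0.0], [0.0, 1.0])) < 1e-9
```

Because the metric ignores vector magnitude, two faces match on the direction of their embeddings, not on their norm.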
+ ## API Endpoints
+
+ ### Core Endpoints
+
+ - `GET /` - Health check and welcome message
+ - `POST /api/init_user` - Initialize a new user session
+ - `GET /api/get_training_images` - Get training images for user preference learning
+ - `POST /api/record_preference` - Record user like/dislike preferences
+ - `POST /api/get_matches` - Get personalized matches based on user preferences
+ - `POST /api/get_recommendations` - Get recommendations based on query images
+ - `POST /api/extract_embeddings` - Extract embeddings from all images (admin)
+
+ ### API Documentation
+
+ Visit `/docs` for interactive Swagger UI documentation when running locally.
+
+ ## Local Setup
+
+ ### Prerequisites
+
+ - Python 3.8+
+ - Azure Blob Storage account
+ - Azure credentials
+
+ ### Installation
+
+ 1. **Clone the repository**
+    ```bash
+    git clone <your-repo-url>
+    cd Facematch_Dev
+    ```
+
+ 2. **Install dependencies**
+    ```bash
+    pip install -r requirements.txt
+    ```
+
+ 3. **Configure Azure credentials**
+
+    Set your Azure credentials as environment variables:
+    ```bash
+    export AZURE_STORAGE_CONNECTION_STRING="your_connection_string"
+    export AZURE_CONTAINER_NAME="your_container_name"
+    ```
+
+    Or create a `config.py` file with your credentials.
+
+ 4. **Run the application**
+    ```bash
+    python -m uvicorn main:app --reload --host 0.0.0.0 --port 8000
+    ```
+
+ 5. **Access the API**
+    - API: http://localhost:8000
+    - Documentation: http://localhost:8000/docs
+
+ ## Usage Examples
+
+ ### Get Recommendations
+
+ **Direct Format:**
+ ```bash
+ curl -X POST "http://localhost:8000/api/get_recommendations" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "query_images": [
+       "https://your-azure-url/image1.jpg",
+       "https://your-azure-url/image2.jpg"
+     ],
+     "gender": "female",
+     "top_n": 5
+   }'
+ ```
+
+ **Hugging Face Format:**
+ ```bash
+ curl -X POST "http://localhost:8000/api/get_recommendations" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "inputs": {
+       "query_images": [
+         "https://your-azure-url/image1.jpg",
+         "https://your-azure-url/image2.jpg"
+       ],
+       "gender": "female",
+       "top_n": 5
+     }
+   }'
+ ```
+
+ ### Initialize User Session
+ ```bash
+ curl -X POST "http://localhost:8000/api/init_user"
+ ```
+
+ ### Record Preferences
+ ```bash
+ curl -X POST "http://localhost:8000/api/record_preference" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "user_id": "your_user_id",
+     "image_url": "https://your-azure-url/image.jpg",
+     "preference": "like"
+   }'
+ ```
+
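The same calls can be made from Python. A minimal sketch mirroring the curl examples above (the helper name `build_recommendation_payload` is illustrative, not part of the codebase; sending the request requires a running server):

```python
import json

def build_recommendation_payload(query_images, gender="all", top_n=5, hf_format=False):
    """Build the JSON body for /api/get_recommendations in either accepted shape."""
    body = {"query_images": query_images, "gender": gender, "top_n": top_n}
    # The Hugging Face shape simply wraps the direct shape in {"inputs": ...}
    return {"inputs": body} if hf_format else body

payload = build_recommendation_payload(["https://example.com/img.jpg"], gender="female")
print(json.dumps(payload))

# To actually send it:
# import requests
# requests.post("http://localhost:8000/api/get_recommendations", json=payload)
```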
+ ## Hugging Face Spaces Deployment
+
+ ### 1. Create a Hugging Face Space
+
+ 1. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
+ 2. Click "Create new Space"
+ 3. Choose "Docker" as the SDK (this repository ships its own Dockerfile)
+ 4. Set visibility (public or private)
+ 5. Create the Space
+
+ ### 2. Configure Secrets
+
+ In your Hugging Face Space settings, add these secrets:
+
+ - `AZURE_STORAGE_CONNECTION_STRING`: Your Azure connection string
+ - `AZURE_CONTAINER_NAME`: Your Azure container name
+
+ ### 3. Upload Files
+
+ Upload these files to your Hugging Face Space:
+
+ - `main.py` - FastAPI application
+ - `handler.py` - Face matching logic
+ - `Dockerfile` - Container definition (required for the Docker SDK)
+ - `requirements.txt` - Dependencies
+ - `config.py` - Configuration (if using file-based config)
+
+ ### 4. Deploy
+
+ The Space builds the Docker image and deploys the FastAPI application automatically.
+
+ ### 5. Access Your API
+
+ Your API will be available at:
+ ```
+ https://your-username-your-space-name.hf.space
+ ```
+
+ ## Azure Setup
+
+ ### Required Azure Resources
+
+ 1. **Storage Account**: For storing images and embeddings
+ 2. **Blob Container**: Organized with folders:
+    - `ai-images/men/` - Training images for men
+    - `ai-images/women/` - Training images for women
+    - `profile-media/` - Images to search for matches
+
+ ### Configuration
+
+ The application expects these Azure settings:
+
+ ```python
+ # In config.py or environment variables
+ AZURE_STORAGE_CONNECTION_STRING = "your_connection_string"
+ AZURE_CONTAINER_NAME = "your_container_name"
+ ```
+
+ ## File Structure
+
+ ```
+ Facematch_Dev/
+ ├── main.py               # FastAPI application
+ ├── handler.py            # Face matching logic
+ ├── config.py             # Configuration
+ ├── Dockerfile            # Container definition
+ ├── requirements.txt      # Dependencies
+ ├── README.md             # This file
+ ├── templates/            # HTML templates (if needed)
+ └── user_preferences.json # User preferences storage
+ ```
+
+ ## Performance Notes
+
+ - **Local Development**: Runs on CPU; suitable for testing
+ - **Hugging Face Spaces**: Can run on GPU hardware (paid tiers); the default Space hardware is CPU
+ - **Embedding Extraction**: Run `/api/extract_embeddings` after uploading new images
+ - **Caching**: Embeddings are cached in Azure Blob Storage, so subsequent queries skip re-extraction
+
+ ## Troubleshooting
+
+ ### Common Issues
+
+ 1. **Face Detection Fails**: Some images may not contain detectable faces; these are recorded with a `no_face_detected` marker and skipped
+ 2. **Azure Connection**: Ensure credentials are correctly set in the environment
+ 3. **Memory Issues**: Large image collections may require more memory than the default Space hardware provides
+
+ ### Debug Mode
+
+ Enable debug logging by setting the environment variable read by `config.py`:
+ ```bash
+ export FLASK_DEBUG=true
+ ```
+
+ ## Contributing
+
+ 1. Fork the repository
+ 2. Create a feature branch
+ 3. Make your changes
+ 4. Test thoroughly
+ 5. Submit a pull request
+
+ ## License
+
+ [Add your license information here]
+
+ ## Support
+
+ For issues and questions:
+ - Create an issue on GitHub
+ - Check the API documentation at `/docs`
+ - Review the debug logs for detailed error information
config.py ADDED
@@ -0,0 +1,99 @@
+ import os
+ from typing import Dict, Any
+
+ class Config:
+     """Configuration class for the FaceMatch application"""
+
+     # Azure Storage Configuration.
+     # Credentials must come from the environment; never commit connection
+     # strings or account keys to source control.
+     AZURE_STORAGE_CONNECTION_STRING = os.getenv('AZURE_STORAGE_CONNECTION_STRING', '')
+     AZURE_STORAGE_ACCOUNT_NAME = os.getenv('AZURE_STORAGE_ACCOUNT_NAME', 'koottumedia')
+     AZURE_STORAGE_ACCOUNT_KEY = os.getenv('AZURE_STORAGE_ACCOUNT_KEY', '')
+     AZURE_CONTAINER_NAME = os.getenv('AZURE_CONTAINER_NAME', 'koottu-media')
+     AZURE_PREFIX = os.getenv('AZURE_PREFIX', 'koottu-media/profile-media/')
+     AZURE_EMBEDDINGS_FOLDER = os.getenv('AZURE_EMBEDDINGS_FOLDER', 'koottu-media/embeddings/')
+     AZURE_TRAINING_IMAGES_FOLDER = os.getenv('AZURE_TRAINING_IMAGES_FOLDER', 'koottu-media/ai-images/')
+
+     # Face Recognition Configuration
+     INSIGHTFACE_CTX_ID = int(os.getenv('INSIGHTFACE_CTX_ID', '0'))  # 0 for GPU, -1 for CPU
+     FACE_EMBEDDING_DIMENSION = 512
+     SIMILARITY_THRESHOLD = float(os.getenv('SIMILARITY_THRESHOLD', '0.5'))
+
+     # Application Configuration
+     FLASK_SECRET_KEY = os.getenv('FLASK_SECRET_KEY', 'your-secret-key-here')
+     FLASK_HOST = os.getenv('FLASK_HOST', '0.0.0.0')
+     FLASK_PORT = int(os.getenv('FLASK_PORT', '5000'))
+     FLASK_DEBUG = os.getenv('FLASK_DEBUG', 'True').lower() == 'true'
+
+     # User Preferences Configuration
+     USER_PREFERENCES_FILE = os.getenv('USER_PREFERENCES_FILE', 'user_preferences.json')
+     MAX_TRAINING_IMAGES = int(os.getenv('MAX_TRAINING_IMAGES', '10'))
+     DEFAULT_MATCH_COUNT = int(os.getenv('DEFAULT_MATCH_COUNT', '10'))
+     MAX_MATCH_COUNT = int(os.getenv('MAX_MATCH_COUNT', '50'))
+
+     # Embedding Database Configuration
+     EMBEDDING_UPDATE_DAYS = int(os.getenv('EMBEDDING_UPDATE_DAYS', '30'))
+     MIN_FACE_CONFIDENCE = float(os.getenv('MIN_FACE_CONFIDENCE', '0.5'))
+
+     # Performance Configuration
+     BATCH_SIZE = int(os.getenv('BATCH_SIZE', '10'))
+     CACHE_TTL = int(os.getenv('CACHE_TTL', '3600'))  # 1 hour
+
+     @classmethod
+     def get_azure_config(cls) -> Dict[str, Any]:
+         """Get the Azure Storage configuration dictionary"""
+         return {
+             'connection_string': cls.AZURE_STORAGE_CONNECTION_STRING,
+             'account_name': cls.AZURE_STORAGE_ACCOUNT_NAME,
+             'account_key': cls.AZURE_STORAGE_ACCOUNT_KEY,
+             'container_name': cls.AZURE_CONTAINER_NAME
+         }
+
+     @classmethod
+     def get_storage_config(cls) -> Dict[str, str]:
+         """Get the storage configuration dictionary"""
+         return {
+             'container_name': cls.AZURE_CONTAINER_NAME,
+             'prefix': cls.AZURE_PREFIX,
+             'embeddings_folder': cls.AZURE_EMBEDDINGS_FOLDER
+         }
+
+     @classmethod
+     def get_flask_config(cls) -> Dict[str, Any]:
+         """Get the server configuration dictionary"""
+         return {
+             'host': cls.FLASK_HOST,
+             'port': cls.FLASK_PORT,
+             'debug': cls.FLASK_DEBUG
+         }
+
+ class DevelopmentConfig(Config):
+     """Development configuration"""
+     FLASK_DEBUG = True
+     INSIGHTFACE_CTX_ID = -1  # Use CPU for development
+
+ class ProductionConfig(Config):
+     """Production configuration"""
+     FLASK_DEBUG = False
+     INSIGHTFACE_CTX_ID = 0  # Use GPU for production
+     FLASK_SECRET_KEY = os.getenv('FLASK_SECRET_KEY', 'change-this-in-production')
+
+ class TestingConfig(Config):
+     """Testing configuration"""
+     FLASK_DEBUG = True
+     INSIGHTFACE_CTX_ID = -1
+     AZURE_CONTAINER_NAME = 'test-facematch-images'
+     USER_PREFERENCES_FILE = 'test_user_preferences.json'
+
+ # Configuration mapping
+ config_map = {
+     'development': DevelopmentConfig,
+     'production': ProductionConfig,
+     'testing': TestingConfig
+ }
+
+ def get_config(config_name: str = None) -> type:
+     """Get the configuration class for the current environment"""
+     if config_name is None:
+         config_name = os.getenv('FLASK_ENV') or 'development'
+     return config_map.get(config_name, DevelopmentConfig)
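The `config_map` lookup above is a common environment-keyed selection pattern. A standalone sketch of the same logic (the classes are re-declared here in miniature for illustration):

```python
import os

class Config: debug = False
class DevelopmentConfig(Config): debug = True
class ProductionConfig(Config): debug = False

config_map = {"development": DevelopmentConfig, "production": ProductionConfig}

def get_config(name=None):
    """Fall back to FLASK_ENV, then to the development config for unknown names."""
    if name is None:
        name = os.getenv("FLASK_ENV") or "development"
    return config_map.get(name, DevelopmentConfig)

assert get_config("production") is ProductionConfig
assert get_config("no-such-env") is DevelopmentConfig  # safe default
```

Returning the class (not an instance) keeps the settings read-only class attributes, which is how `handler.py` consumes them.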
handler.py ADDED
@@ -0,0 +1,354 @@
+ import os
+ import json
+ import tempfile
+ import time
+ import base64
+ from io import BytesIO
+ from typing import List, Dict, Any
+ from datetime import datetime, timedelta
+
+ import numpy as np
+ import cv2  # OpenCV for image decoding and colour conversion
+ import requests
+ from PIL import Image
+ from scipy.spatial.distance import cosine
+ from insightface.app import FaceAnalysis
+ from azure.storage.blob import BlobServiceClient
+
+ from config import get_config
+
+ class EndpointHandler:
+     def __init__(self, model_dir=None):
+         # Load configuration first so the InsightFace context follows it
+         config = get_config()
+         azure_config = config.get_azure_config()
+         storage_config = config.get_storage_config()
+
+         self.app = FaceAnalysis()
+         self.app.prepare(ctx_id=config.INSIGHTFACE_CTX_ID)  # 0 for GPU, -1 for CPU
+
+         # Initialize the Azure Blob Storage client
+         if azure_config['connection_string']:
+             self.blob_service_client = BlobServiceClient.from_connection_string(
+                 azure_config['connection_string']
+             )
+         else:
+             # Fall back to account name and key when no connection string is set
+             account_url = f"https://{azure_config['account_name']}.blob.core.windows.net"
+             self.blob_service_client = BlobServiceClient(
+                 account_url=account_url,
+                 credential=azure_config['account_key']
+             )
+
+         self.container_name = storage_config['container_name']
+         self.prefix = storage_config['prefix']
+         self.embeddings_folder = storage_config['embeddings_folder']
+
+         # Container client used for listing and downloading blobs
+         self.container_client = self.blob_service_client.get_container_client(self.container_name)
+
+     def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
+         try:
+             if "inputs" in data:
+                 return self.process_hf_input(data)
+             return self.process_json_input(data)
+         except Exception as e:
+             # ValueError (bad payload) and unexpected errors both surface as JSON
+             return {"error": str(e)}
+
+     def process_hf_input(self, hf_data):
+         """Process Hugging Face format input: {"inputs": {...}}."""
+         if "inputs" in hf_data:
+             return self.process_json_input(hf_data["inputs"])
+         return {"error": "Invalid Hugging Face JSON structure."}
+
+     def process_json_input(self, json_data):
+         if "query_images" in json_data and "gender" in json_data:
+             query_images = json_data["query_images"]
+             gender = json_data["gender"]
+             top_n = json_data.get("top_n", 5)
+             similar_images = self.find_similar_images_aggregate(query_images, gender, top_n)
+             return {"similar_images": similar_images}
+         elif json_data.get("extract_embeddings"):
+             self.extract_and_save_embeddings()
+             return {"status": "Embeddings extraction completed."}
+         else:
+             raise ValueError("Invalid JSON structure.")
+
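The two accepted payload shapes in the dispatch above differ only by the `{"inputs": ...}` envelope. A minimal sketch of that normalization (the function name is illustrative, not from the codebase):

```python
def normalize_payload(data):
    """Unwrap the Hugging Face {"inputs": {...}} envelope if present."""
    if isinstance(data, dict) and "inputs" in data:
        return data["inputs"]
    return data

direct = {"query_images": ["a.jpg"], "gender": "female", "top_n": 5}
wrapped = {"inputs": direct}

# Both shapes resolve to the same inner request
assert normalize_payload(direct) == normalize_payload(wrapped) == direct
```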
+     def load_embeddings_from_azure(self):
+         """Load existing embeddings from Azure Blob Storage, or return an empty list."""
+         try:
+             # The embeddings file lives under profile-media/embeddings/
+             blob_name = 'profile-media/embeddings/embeddings_db.json'
+             blob_client = self.container_client.get_blob_client(blob_name)
+             # Download straight into memory; no temporary file is needed
+             blob_bytes = blob_client.download_blob().readall()
+             return json.loads(blob_bytes)
+         except Exception as e:
+             print(f'Embeddings file not found in Azure, initializing a new one: {e}')
+             return []
+
+     def extract_and_save_embeddings(self):
+         """Extract embeddings from images and save them to Azure Blob Storage."""
+         embeddings_db = self.load_embeddings_from_azure()
+         now = datetime.utcnow()
+         thirty_days_ago = now - timedelta(days=30)
+
+         # Process images from both the profile-media and ai-images folders
+         folders_to_process = [
+             'profile-media/',
+             'ai-images/men/',
+             'ai-images/women/'
+         ]
+
+         for folder_prefix in folders_to_process:
+             try:
+                 print(f"Processing folder: {folder_prefix}")
+                 # List all blobs in the container with the current prefix
+                 blob_list = self.container_client.list_blobs(name_starts_with=folder_prefix)
+
+                 for blob in blob_list:
+                     blob_name = blob.name
+                     if not blob_name.endswith(('.jpg', '.jpeg', '.png')):
+                         continue
+
+                     image_url = f'https://{self.blob_service_client.account_name}.blob.core.windows.net/{self.container_name}/{blob_name}'
+                     existing_entry = next((item for item in embeddings_db if item['image_url'] == image_url), None)
+
+                     # Skip blobs whose embedding (or no-face marker) is still fresh
+                     # and whose content has not changed in the last thirty days
+                     if existing_entry:
+                         embedding_timestamp = datetime.fromisoformat(existing_entry['timestamp'])
+                         if (existing_entry.get('no_face_detected') or embedding_timestamp > thirty_days_ago) and blob.last_modified.replace(tzinfo=None) <= thirty_days_ago:
+                             continue
+
+                     print(f"Processing image: {blob_name}")
+                     try:
+                         # load_image_from_blob downloads the blob into memory,
+                         # so no temporary file is needed here
+                         blob_client = self.container_client.get_blob_client(blob_name)
+                         img = self.load_image_from_blob(blob_client)
+                         if img is None:
+                             print(f"Failed to read image: {blob_name}")
+                             continue
+
+                         faces = self.app.get(img)
+                         if len(faces) == 0:
+                             print(f"No face detected in: {blob_name}")
+                             no_face_entry = {
+                                 'image_url': image_url,
+                                 'no_face_detected': True,
+                                 'timestamp': now.isoformat()
+                             }
+                             if existing_entry:
+                                 existing_entry.update(no_face_entry)
+                             else:
+                                 embeddings_db.append(no_face_entry)
+                             continue
+
+                         face = faces[0]
+                         new_entry = {
+                             'embedding': face.embedding.tolist(),
+                             'gender': 'male' if face.gender == 1 else 'female',
+                             'image_url': image_url,
+                             'timestamp': now.isoformat()
+                         }
+                         if existing_entry:
+                             existing_entry.update(new_entry)
+                         else:
+                             embeddings_db.append(new_entry)
+
+                         print(f"Successfully processed: {blob_name} (gender: {new_entry['gender']})")
+                     except Exception as e:
+                         print(f"Error processing image {blob_name}: {e}")
+                         continue
+             except Exception as e:
+                 print(f"Error processing folder {folder_prefix}: {e}")
+                 continue
+
+         print(f"Total embeddings in database: {len(embeddings_db)}")
+
+         # Save the updated embeddings database back to Azure
+         try:
+             temp_json_path = os.path.join(tempfile.gettempdir(), f'embeddings_db_{int(time.time())}.json')
+             with open(temp_json_path, 'w') as temp_json_file:
+                 json.dump(embeddings_db, temp_json_file)
+
+             # Upload to Azure Blob Storage under profile-media/embeddings/
+             blob_name = 'profile-media/embeddings/embeddings_db.json'
+             blob_client = self.container_client.get_blob_client(blob_name)
+             with open(temp_json_path, 'rb') as data:
+                 blob_client.upload_blob(data, overwrite=True)
+             print(f"Embeddings saved to Azure: {blob_name}")
+
+             try:
+                 os.unlink(temp_json_path)
+             except OSError:
+                 pass  # Ignore cleanup errors
+         except Exception as e:
+             print(f"Error saving embeddings: {e}")
+
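The skip condition in the method above decides whether a cached entry can be reused. A simplified sketch of that staleness check (the real code additionally honours the `no_face_detected` marker; the function name is illustrative):

```python
from datetime import datetime, timedelta

def needs_refresh(entry_timestamp: str, blob_modified: datetime, now: datetime) -> bool:
    """Reuse an entry only if it is less than 30 days old AND the blob is unchanged."""
    thirty_days_ago = now - timedelta(days=30)
    fresh = datetime.fromisoformat(entry_timestamp) > thirty_days_ago
    unchanged = blob_modified <= thirty_days_ago
    return not (fresh and unchanged)

now = datetime(2024, 6, 30)
assert needs_refresh('2024-01-01T00:00:00', datetime(2024, 6, 29), now)      # stale entry
assert not needs_refresh('2024-06-20T00:00:00', datetime(2024, 1, 1), now)   # fresh entry, old blob
```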
+     def find_similar_images_aggregate(self, query_images: List[str], gender: str, top_n: int = 5) -> List[str]:
+         print(f"Debug: Starting similarity search with {len(query_images)} query images")
+         print(f"Debug: Looking for gender: {gender}, top_n: {top_n}")
+
+         # Load the embeddings database once, not once per query image
+         embeddings_db = self.load_embeddings_from_azure()
+         print(f"Debug: Total embeddings in database: {len(embeddings_db)}")
+
+         # Only match against images in the profile-media folder
+         profile_media_db = [item for item in embeddings_db if 'image_url' in item and 'profile-media' in item['image_url']]
+         print(f"Debug: Profile-media embeddings: {len(profile_media_db)}")
+
+         filtered_db = [item for item in profile_media_db if item.get('gender') == gender]
+         print(f"Debug: Filtered by gender '{gender}': {len(filtered_db)}")
+
+         similarities = {}
+         for i, image_input in enumerate(query_images):
+             print(f"Debug: Processing query image {i + 1}/{len(query_images)}: {image_input}")
+             try:
+                 # Accept a URL, a base64 data URI, or a local file path
+                 if image_input.startswith('http'):
+                     img = self.load_image_from_url(image_input)
+                 elif image_input.startswith('data:image/'):
+                     img = self.load_image_from_base64(image_input)
+                 else:
+                     img = cv2.imread(image_input)
+
+                 if img is None:
+                     print(f"Failed to load image: {image_input}")
+                     continue
+
+                 faces = self.app.get(img)
+                 if len(faces) == 0:
+                     print(f"Debug: No faces detected in query image {i + 1}")
+                     continue
+
+                 query_embedding = faces[0].embedding
+                 print(f"Debug: Successfully extracted face embedding from query image {i + 1}")
+
+                 if len(filtered_db) == 0:
+                     print(f"Debug: No embeddings found for gender '{gender}' in profile-media folder")
+                     print(f"Debug: Available genders in profile-media: {list({item['gender'] for item in profile_media_db if 'gender' in item})}")
+                     continue
+
+                 for item in filtered_db:
+                     similarity = 1 - cosine(query_embedding, np.array(item['embedding']))
+                     similarities.setdefault(item['image_url'], []).append(similarity)
+             except Exception as e:
+                 print(f"Error processing image input: {e}")
+                 # Return an empty list instead of an error dict
+                 return []
+
+         # Aggregate per-candidate scores across all query images and rank
+         print(f"Debug: Total similarities found: {len(similarities)}")
+         aggregated_similarities = [(np.mean(scores), url) for url, scores in similarities.items()]
+         aggregated_similarities.sort(reverse=True, key=lambda x: x[0])
+         result = [url for _, url in aggregated_similarities[:top_n]]
+         print(f"Debug: Returning {len(result)} recommendations")
+         return result
+
+     def find_similar_images_by_embedding(self, query_embedding: np.ndarray, gender: str = 'all', top_n: int = 10, excluded_images: List[str] = None) -> List[str]:
+         """Find similar images based on a given embedding vector."""
+         try:
+             # Load the embeddings database from Azure
+             embeddings_db = self.load_embeddings_from_azure()
+
+             # Only match against images in the profile-media folder
+             profile_media_db = [item for item in embeddings_db if 'image_url' in item and 'profile-media' in item['image_url']]
+
+             # Filter by gender if specified
+             if gender != 'all':
+                 filtered_db = [item for item in profile_media_db if item.get('gender') == gender]
+             else:
+                 filtered_db = [item for item in profile_media_db if 'embedding' in item]
+
+             # Filter out excluded images (e.g. the user's dislikes)
+             if excluded_images is not None:
+                 filtered_db = [item for item in filtered_db if item['image_url'] not in excluded_images]
+
+             similarities = []
+             for item in filtered_db:
+                 if 'embedding' in item and not item.get('no_face_detected', False):
+                     similarity = 1 - cosine(query_embedding, np.array(item['embedding']))
+                     similarities.append((similarity, item['image_url']))
+
+             # Sort by similarity and return the top matches
+             similarities.sort(reverse=True, key=lambda x: x[0])
+             return [url for _, url in similarities[:top_n]]
+         except Exception as e:
+             print(f"Error in find_similar_images_by_embedding: {e}")
+             return []
+
+     def load_image_from_url(self, url):
+         try:
+             response = requests.get(url, timeout=30)
+             response.raise_for_status()
+             image = np.array(Image.open(BytesIO(response.content)).convert('RGB'))
+             return cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
+         except Exception as e:
+             print(f"Error loading image from URL {url}: {e}")
+             return None
+
+     def load_image_from_blob(self, blob_client):
+         try:
+             blob_bytes = blob_client.download_blob().readall()
+             image = np.array(Image.open(BytesIO(blob_bytes)).convert('RGB'))
+             return cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
+         except Exception as e:
+             print(f"Error loading image from blob: {e}")
+             return None
+
+     def load_image_from_base64(self, base64_string):
+         try:
+             header, encoded = base64_string.split(',', 1)
+             data = base64.b64decode(encoded)
+             np_arr = np.frombuffer(data, np.uint8)
+             # Returns a BGR image, as expected by OpenCV and InsightFace
+             return cv2.imdecode(np_arr, cv2.IMREAD_COLOR)
+         except Exception as e:
+             print(f"Error decoding base64 image: {e}")
+             return None
+
+
+ # Module-level instance, required by Hugging Face Inference Endpoints
+ handler = EndpointHandler()
main.py ADDED
@@ -0,0 +1,176 @@
+ from fastapi import FastAPI, Request, HTTPException, Body
+ from fastapi.middleware.cors import CORSMiddleware
+ from fastapi.responses import HTMLResponse, JSONResponse
+ from pydantic import BaseModel, Field
+ from typing import List, Optional, Union
+ import uuid
+ import json
+ import os
+ from datetime import datetime
+ import numpy as np
+
+ # Reuse the module-level handler so the InsightFace model is loaded only once
+ from handler import handler as face_handler
+
+ app = FastAPI()
+
+ # Enable CORS
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],
+     allow_credentials=True,
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+ # In-memory user sessions (stateless; reset on restart)
+ user_sessions = {}
+ USER_PREFERENCES_FILE = 'user_preferences.json'
+
+ # Pydantic model for recommendation requests
+ class RecommendationRequest(BaseModel):
+     query_images: List[str] = Field(..., description="List of Azure URLs for query images")
+     gender: Optional[str] = Field('all', description="Gender filter: 'male', 'female', or 'all'")
+     top_n: Optional[int] = Field(5, description="Number of recommendations to return")
+
+ # Pydantic model for the Hugging Face {"inputs": {...}} format
+ class HuggingFaceRequest(BaseModel):
+     inputs: RecommendationRequest
+
+ # Helper functions
+
+ def load_user_preferences():
+     if os.path.exists(USER_PREFERENCES_FILE):
+         with open(USER_PREFERENCES_FILE, 'r') as f:
+             return json.load(f)
+     return {}
+
+ def save_user_preferences(preferences):
+     with open(USER_PREFERENCES_FILE, 'w') as f:
+         json.dump(preferences, f, indent=2)
+
+ @app.get("/", response_class=HTMLResponse)
+ def index():
+     # Health check / welcome message
+     return "<h2>FaceMatch FastAPI is running!</h2>"
+
+ @app.post("/api/init_user")
+ def init_user():
+     user_id = str(uuid.uuid4())
+     user_sessions[user_id] = True
+     preferences = load_user_preferences()
+     if user_id not in preferences:
+         preferences[user_id] = {
+             'liked_images': [],
+             'disliked_images': [],
+             'preference_embedding': None,
+             'created_at': datetime.now().isoformat()
+         }
+     save_user_preferences(preferences)
+     return {"user_id": user_id, "status": "initialized"}
+
+ @app.get("/api/get_training_images")
+ def get_training_images():
+     try:
+         training_images = []
+         for gender_folder in ['men', 'women']:
+             gender_prefix = f'ai-images/{gender_folder}/'
+             blob_list = face_handler.container_client.list_blobs(name_starts_with=gender_prefix)
+             for blob in blob_list:
+                 if blob.name.endswith(('.jpg', '.jpeg', '.png')):
+                     image_url = f'https://{face_handler.blob_service_client.account_name}.blob.core.windows.net/{face_handler.container_name}/{blob.name}'
+                     training_images.append(image_url)
+         return {"training_images": training_images[:10], "status": "success"}
+     except Exception as e:
+         return JSONResponse(status_code=500, content={"error": str(e)})
+
+ @app.post("/api/record_preference")
+ async def record_preference(request: Request):
+     try:
+         data = await request.json()
+         user_id = data.get('user_id')
+         image_url = data.get('image_url')
+         preference = data.get('preference')
+         if not user_id or not image_url or not preference:
+             raise HTTPException(status_code=400, detail="Missing required parameters")
+         preferences = load_user_preferences()
+         if user_id not in preferences:
+             raise HTTPException(status_code=404, detail="User not found")
+         if preference == 'like':
+             if image_url not in preferences[user_id]['liked_images']:
+                 preferences[user_id]['liked_images'].append(image_url)
+         elif preference == 'dislike':
+             if image_url not in preferences[user_id]['disliked_images']:
+                 preferences[user_id]['disliked_images'].append(image_url)
+         save_user_preferences(preferences)
+         return {"status": "preference_recorded"}
+     except HTTPException:
+         # Re-raise so FastAPI returns the intended 4xx status, not a blanket 500
+         raise
+     except Exception as e:
+         return JSONResponse(status_code=500, content={"error": str(e)})
+
+ @app.post("/api/get_matches")
+ async def get_matches(request: Request):
+     try:
+         data = await request.json()
+         user_id = data.get('user_id')
+         gender = data.get('gender', 'all')
+         top_n = data.get('top_n', 10)
+         if not user_id:
+             raise HTTPException(status_code=400, detail="Missing user_id")
+         preferences = load_user_preferences()
+         if user_id not in preferences:
+             raise HTTPException(status_code=404, detail="User preferences not found")
+         user_prefs = preferences[user_id]
+         if user_prefs['liked_images']:
+             liked_embeddings = []
+             for image_url in user_prefs['liked_images']:
+                 try:
+                     img = face_handler.load_image_from_url(image_url)
+                     faces = face_handler.app.get(img)
+                     if len(faces) > 0:
+                         liked_embeddings.append(faces[0].embedding)
+                 except Exception:
+                     continue
+             if liked_embeddings:
+                 # The preference profile is the mean of all liked-face embeddings
+                 preference_embedding = np.mean(liked_embeddings, axis=0)
+                 user_prefs['preference_embedding'] = preference_embedding.tolist()
+                 save_user_preferences(preferences)
+                 similar_images = face_handler.find_similar_images_by_embedding(
+                     preference_embedding, gender, top_n, user_prefs['disliked_images']
+                 )
+                 return {"similar_images": similar_images}
+         return {"similar_images": []}
+     except HTTPException:
+         raise
+     except Exception as e:
+         return JSONResponse(status_code=500, content={"error": str(e)})
+
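The preference profile built in `get_matches` is simply the mean of the liked-face embeddings, ranked against candidates by cosine similarity. A minimal pure-Python sketch with toy 3-d vectors in place of the 512-d InsightFace embeddings (function names are illustrative):

```python
import math

def cosine_sim(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rank_by_preference(liked, candidates, top_n=2):
    """Average the liked embeddings into a profile, then rank candidates against it."""
    profile = [sum(dim) / len(liked) for dim in zip(*liked)]
    scored = sorted(candidates, key=lambda c: cosine_sim(profile, c[1]), reverse=True)
    return [url for url, _ in scored[:top_n]]

liked = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
candidates = [
    ("a.jpg", [0.9, 0.1, 0.0]),  # close to the averaged profile
    ("b.jpg", [0.0, 0.0, 1.0]),  # orthogonal to the profile
]
assert rank_by_preference(liked, candidates, top_n=1) == ["a.jpg"]
```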
+ @app.post("/api/get_recommendations")
+ async def get_recommendations(
+     body: Union[RecommendationRequest, HuggingFaceRequest] = Body(...)
+ ):
+     try:
+         # Accept both the direct format and the Hugging Face {"inputs": {...}} format
+         req = body.inputs if isinstance(body, HuggingFaceRequest) else body
+         query_images = req.query_images
+         gender = req.gender or 'all'
+         top_n = req.top_n or 5
+
+         if not query_images:
+             raise HTTPException(status_code=400, detail="No query images provided")
+
+         similar_images = face_handler.find_similar_images_aggregate(query_images, gender, top_n)
+         return {"similar_images": similar_images}
+     except HTTPException:
+         raise
+     except Exception as e:
+         return JSONResponse(status_code=500, content={"error": str(e)})
+
+ @app.post("/api/extract_embeddings")
+ def extract_embeddings():
+     # Admin endpoint; runs synchronously and may take a long time on large containers
+     try:
+         face_handler.extract_and_save_embeddings()
+         return {"status": "Embeddings extraction completed"}
+     except Exception as e:
+         return JSONResponse(status_code=500, content={"error": str(e)})
requirements.txt ADDED
@@ -0,0 +1,15 @@
+ azure-storage-blob
+ onnxruntime
+ insightface
+ opencv-python
+ flask
+ flask-cors
+ numpy
+ scipy
+ pillow
+ requests
+ scikit-learn
+ pandas
+ fastapi
+ uvicorn