# IRIS Detailed System Architecture

This document provides a comprehensive look at the IRIS architecture, broken down by functional layers and individual process steps.

## Overall System Flow

This tiered diagram shows how data flows through the three main layers of the system.

```mermaid
graph TD
    subgraph "1. Ingestion & Preprocessing"
        UC[User/Admin] -->|Upload| SS[Supabase Storage]
        SS -->|Webhook| BE[FastAPI Backend]
        BE -->|Download| PC[Text Cleaning]
        PC -->|Anonymize| PA[PII Removal]
    end
    subgraph "2. NLP Processing Layer"
        PA -->|Raw Text| EX[Gemini Extraction]
        EX -->|JSON| DB[(Supabase DB)]
        DB -->|Text Fields| EM[BGE-M3 Embedding]
        EM -->|Vectors| DB
    end
    subgraph "3. Matching & AI Analysis"
        DB -->|Job vs Resume| MS[Semantic Matching]
        MS -->|Score| MG[Skill Gap Analysis]
        MG -->|Insights| AI[Gemini Analysis]
        AI -->|Final Report| UI[Admin Dashboard]
    end
```

---

## 1. Data Ingestion & Preprocessing

This layer ensures that incoming data is clean, anonymized, and ready for AI processing.

* **File Upload**: Resumes and job descriptions are uploaded to Supabase Storage buckets.
* **Event Trigger**: Database Webhooks notify the FastAPI backend when a new file arrives, and the backend then downloads the file.
* **Text Cleaning**: Standardizes encoding, removes stray special characters, and normalizes whitespace.
* **PII Anonymization**: Uses regex and NLP patterns to detect and mask sensitive personal information (such as phone numbers and addresses) before the text is passed to the NLP layer.
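
The cleaning and anonymization steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the actual IRIS implementation: the function names, the placeholder tokens, and both regular expressions are assumptions. The email pattern is an extra example beyond the phone and address cases named above, and address detection is omitted because it usually needs NLP rather than regex.

```python
import re
import unicodedata

# Hypothetical patterns for illustration only; production data needs broader coverage.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"(?<!\w)\+?\d[\d ().-]{7,}\d(?!\w)")


def clean_text(raw: str) -> str:
    """Normalize Unicode, drop control characters, and tidy whitespace."""
    text = unicodedata.normalize("NFKC", raw).replace("\r\n", "\n")
    # Keep newlines and tabs; remove every other control/format character.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch)[0] != "C"
    )
    text = re.sub(r"[ \t]+", " ", text)      # collapse runs of spaces and tabs
    text = re.sub(r"[ \t]*\n", "\n", text)   # strip trailing spaces on each line
    text = re.sub(r"\n{3,}", "\n\n", text)   # allow at most one blank line
    return text.strip()


def mask_pii(text: str) -> str:
    """Replace email addresses and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    sample = "Jane  Doe\u00a0\x00\n\n\n\nPhone: +1 (555) 123-4567\nEmail: jane.doe@example.com"
    print(mask_pii(clean_text(sample)))
```

Masking with fixed tokens rather than deleting the match keeps the sentence structure intact, which may help the downstream extraction step.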
## 2. NLP Processing Pipeline

The "intelligence" layer, which extracts structure and meaning from the text.

* **Structured Extraction**: Google Gemini parses the unstructured text into structured JSON objects (Skills, Experience, Education).
* **Relational Storage**: The structured data is saved to dedicated PostgreSQL tables so it can be queried directly.
* **Vector Embedding**: The BGE-M3 model encodes the text fields of the candidate's profile and of the job requirements as embedding vectors, which are written back to the database.
* **Vector Search Index**: These vectors allow the system to find matches based on *meaning* rather than just keywords (e.g., matching "Software Engineer" with "Full Stack Developer").
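
As a rough sketch of the hand-off between extraction and embedding, the snippet below parses an LLM JSON response into typed objects and flattens them into the text that would be sent to the embedding model. The schema (`skills`, `experience`, `education`, and the `title`/`company`/`years` fields) and the function names are assumptions made for illustration; the real IRIS schema and the Gemini and BGE-M3 calls are not shown here.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class Experience:
    title: str
    company: str
    years: float


@dataclass
class Profile:
    skills: list[str] = field(default_factory=list)
    experience: list[Experience] = field(default_factory=list)
    education: list[str] = field(default_factory=list)


def parse_profile(llm_json: str) -> Profile:
    """Validate a (hypothetical) extraction response and normalize its fields."""
    data = json.loads(llm_json)
    # Lowercase and de-duplicate skills so later comparisons are case-insensitive.
    skills = sorted({s.strip().lower() for s in data.get("skills", []) if s.strip()})
    experience = [
        Experience(
            title=str(item.get("title", "")),
            company=str(item.get("company", "")),
            years=float(item.get("years", 0)),
        )
        for item in data.get("experience", [])
    ]
    education = [str(e) for e in data.get("education", [])]
    return Profile(skills, experience, education)


def embedding_text(profile: Profile) -> str:
    """Flatten the structured fields into one string for an embedding model."""
    parts = []
    if profile.skills:
        parts.append("Skills: " + ", ".join(profile.skills))
    for exp in profile.experience:
        parts.append(f"Experience: {exp.title} at {exp.company} ({exp.years:g} years)")
    if profile.education:
        parts.append("Education: " + "; ".join(profile.education))
    return "\n".join(parts)
```

Parsing into typed objects before storage means a malformed LLM response fails early (for example with a `json.JSONDecodeError`) instead of reaching the database.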
## 3. Matching & AI Analysis Layer

The decision-making layer, which produces the results the recruiter sees.

* **Semantic Scoring**: Computes a similarity score from the distance between a candidate's vector and a job's vector.
* **Skill Gap Analysis**: Compares the extracted skill sets to identify which required skills are missing and where the candidate exceeds the requirements.
* **AI Insight Generation**: A second pass with Gemini generates a human-readable summary with candidate-specific strengths and potential weaknesses.
* **Final Ranking**: Aggregates all scores into a prioritized candidate list for the Admin Dashboard.
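
The scoring, gap analysis, and ranking steps could look something like the sketch below. It assumes cosine similarity as the distance measure and a simple weighted sum of semantic similarity and skill coverage. The document does not specify either choice, so the 0.7 weight, the dictionary keys, and the function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def skill_gap(candidate_skills, job_skills):
    """Case-insensitive set comparison of candidate skills against job skills."""
    cand = {s.lower() for s in candidate_skills}
    job = {s.lower() for s in job_skills}
    return {
        "matched": sorted(cand & job),
        "missing": sorted(job - cand),
        "extra": sorted(cand - job),
    }


def rank_candidates(job, candidates, semantic_weight=0.7):
    """Blend semantic similarity with skill coverage and sort best-first.

    The weighting scheme is an illustrative assumption, not the IRIS formula.
    """
    required = {s.lower() for s in job["skills"]}
    ranked = []
    for cand in candidates:
        semantic = cosine_similarity(cand["vector"], job["vector"])
        gap = skill_gap(cand["skills"], job["skills"])
        coverage = len(gap["matched"]) / len(required) if required else 0.0
        score = semantic_weight * semantic + (1 - semantic_weight) * coverage
        ranked.append(
            {"id": cand["id"], "score": round(score, 3), "missing": gap["missing"]}
        )
    return sorted(ranked, key=lambda r: r["score"], reverse=True)


if __name__ == "__main__":
    job = {"vector": [1.0, 0.0], "skills": ["Python", "SQL"]}
    candidates = [
        {"id": "a", "vector": [0.0, 1.0], "skills": ["python", "sql"]},
        {"id": "b", "vector": [1.0, 0.0], "skills": ["Python", "Docker"]},
    ]
    for row in rank_candidates(job, candidates):
        print(row)
```

In a deployment backed by pgvector, the similarity computation would more likely run inside the database query than in Python; the pure-Python version here only makes the arithmetic explicit.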

## Technology Stack

| Layer | Technologies |
| :--- | :--- |
| **Frontend** | React, Vite, Framer Motion, Lucide Icons |
| **Backend** | FastAPI, Python, SQLAlchemy/Supabase-py |
| **Data** | Supabase (Postgres), pgvector, Supabase Storage |
| **AI/ML** | Google Gemini (LLM), BGE-M3 (Embeddings), Sentence Transformers |