kushal2006 committed · Commit 3083de5 · verified · 1 Parent(s): 890b9ca

Upload 15 files

Files changed (16)
  1. .dockerignore +23 -0
  2. .env +6 -0
  3. .gitattributes +1 -0
  4. Dockerfile +47 -0
  5. README.md +54 -10
  6. app.py +998 -0
  7. database.py +904 -0
  8. demo_prep.md +40 -0
  9. main.py +639 -0
  10. placement_dashboard.db +0 -0
  11. requirements.txt +18 -0
  12. resume_analysis.db +3 -0
  13. simple_results.db +0 -0
  14. start.sh +0 -0
  15. streamlit_app.py +1103 -0
  16. technical_overview.md +27 -0
.dockerignore ADDED
@@ -0,0 +1,23 @@
+ __pycache__
+ *.pyc
+ *.pyo
+ *.pyd
+ .Python
+ .git
+ .gitignore
+ .pytest_cache
+ .coverage
+ .venv
+ venv/
+ env/
+ .env
+ .DS_Store
+ *.sqlite3
+ *.db
+ node_modules
+ .streamlit/secrets.toml
+ temp/
+ uploads/
+ *.log
+ .mypy_cache
+ .hypothesis/
.env ADDED
@@ -0,0 +1,6 @@
+ # Get your key from https://openrouter.ai/keys
+ OPENROUTER_API_KEY="sk-or-v1-<redacted>"
+
+ # The model to use for analysis. Check OpenRouter for available models.
+ # Example: "x-ai/grok-4-fast:free", "openai/gpt-3.5-turbo", "google/gemini-pro"
+ OPENAI_MODEL="x-ai/grok-4-fast:free"
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ resume_analysis.db filter=lfs diff=lfs merge=lfs -text
Dockerfile ADDED
@@ -0,0 +1,47 @@
+ # Use an official Python runtime as a parent image
+ FROM python:3.10-slim
+
+ # Set the working directory in the container
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     gcc \
+     g++ \
+     curl \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy the requirements file into the container at /app
+ COPY requirements.txt .
+
+ # Install any needed packages specified in requirements.txt
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy the rest of the application code into the container at /app
+ COPY . .
+
+ # Make port 8000 (FastAPI) and 8501 (Streamlit) available to the world outside this container
+ EXPOSE 8000 8501
+
+ # Define environment variables
+ ENV BACKEND_URL="http://localhost:8000"
+ ENV PYTHONUNBUFFERED=1
+
+ # Create a startup script
+ RUN echo '#!/bin/bash' > /app/start.sh && \
+     echo 'set -e' >> /app/start.sh && \
+     echo 'echo "🚀 Starting AI Resume Analyzer on HuggingFace Spaces"' >> /app/start.sh && \
+     echo 'echo "⚡ Starting FastAPI Backend..."' >> /app/start.sh && \
+     echo 'python -c "from app import create_app; print(\"Backend ready to start\")" || echo "Using app.py directly"' >> /app/start.sh && \
+     echo 'uvicorn app:app --host 0.0.0.0 --port 8000 --workers 1 &' >> /app/start.sh && \
+     echo 'BACKEND_PID=$!' >> /app/start.sh && \
+     echo 'echo "Backend PID: $BACKEND_PID"' >> /app/start.sh && \
+     echo 'echo "⏳ Waiting for backend to start..."' >> /app/start.sh && \
+     echo 'sleep 15' >> /app/start.sh && \
+     echo 'echo "🎨 Starting Streamlit Frontend..."' >> /app/start.sh && \
+     echo 'streamlit run streamlit_app.py --server.port 8501 --server.address 0.0.0.0 --server.enableCORS=false --server.enableXsrfProtection=false' >> /app/start.sh
+
+ RUN chmod +x /app/start.sh
+
+ # Run the startup script when the container launches
+ CMD ["/app/start.sh"]
README.md CHANGED
@@ -1,10 +1,54 @@
- ---
- title: Hackathongenai
- emoji: 🏢
- colorFrom: blue
- colorTo: red
- sdk: docker
- pinned: false
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ ---
+ title: AI Resume Analyzer
+ emoji: 🎯
+ colorFrom: blue
+ colorTo: green
+ sdk: docker
+ pinned: false
+ app_port: 8501
+ ---
+
+ # 🎯 AI Resume Analyzer
+
+ An advanced AI-powered resume analysis system deployed on HuggingFace Spaces with full-stack architecture.
+
+ ## 🚀 Features
+
+ - **🧠 AI-Powered Analysis**: Advanced semantic matching and scoring
+ - **📊 Interactive Dashboard**: Real-time analysis with comprehensive reports
+ - **🗂️ History Management**: Track and manage previous analyses
+ - **📈 Analytics**: Visual insights and performance metrics
+ - **📥 Export Options**: Download results in multiple formats
+ - **⚡ Real-time Processing**: Instant analysis with progress tracking
+
+ ## 🏗️ Architecture
+
+ This Space runs a complete full-stack application:
+
+ 1. **FastAPI Backend** (Port 8000): Core analysis engine with database
+ 2. **Streamlit Frontend** (Port 8501): Interactive user interface
+ 3. **SQLite Database**: Analysis history and results storage
+
+ ## 🎯 How to Use
+
+ 1. Wait for the application to fully load (30-60 seconds)
+ 2. Upload resume and job description files (PDF, DOCX, TXT)
+ 3. Click "Analyze Candidate Fit" to start AI analysis
+ 4. Explore detailed results, skills analysis, and recommendations
+ 5. Download comprehensive reports for your records
+
+ ## 🔧 System Components
+
+ - **Smart Document Processing**: Multi-format file support
+ - **AI Analysis Engine**: Advanced NLP and semantic matching
+ - **Interactive History**: Browse, filter, and manage past analyses
+ - **Professional Reports**: Executive-level documentation
+ - **Real-time Analytics**: Performance metrics and insights
+
+ ## 💡 Demo Mode
+
+ This deployment includes realistic AI simulation for demonstration purposes, showcasing the full capabilities of a production resume analysis system.
+
+ ---
+
+ **Deployed on HuggingFace Spaces** | Built with Python, FastAPI, Streamlit, and AI/ML
app.py ADDED
@@ -0,0 +1,998 @@
+ # app.py - PRODUCTION-READY RESUME RELEVANCE CHECK SYSTEM
+ import os
+ import sys
+ from pathlib import Path
+
+ # Add project root to Python path
+ project_root = Path(__file__).parent
+ sys.path.insert(0, str(project_root))
+
+ # Core FastAPI imports
+ from fastapi import FastAPI, UploadFile, File, HTTPException, Query, Depends, Form, Request, BackgroundTasks
+ from fastapi.middleware.cors import CORSMiddleware
+ from fastapi.middleware.trustedhost import TrustedHostMiddleware
+ from fastapi.middleware.gzip import GZipMiddleware
+ from fastapi.responses import JSONResponse, HTMLResponse, StreamingResponse, RedirectResponse
+ from fastapi.security import HTTPBasic, HTTPBasicCredentials
+ from contextlib import asynccontextmanager
+
+ # Standard library imports
+ import tempfile
+ import json
+ import uuid
+ import csv
+ import io
+ import time
+ import asyncio
+ from datetime import datetime, timedelta, timezone
+ from typing import List, Dict, Any, Optional
+
+ # Third-party imports
+ try:
+     import pandas as pd
+     PANDAS_AVAILABLE = True
+ except ImportError:
+     PANDAS_AVAILABLE = False
+
+ # Configuration and environment
+ class Settings:
+     def __init__(self):
+         self.environment = os.getenv('ENVIRONMENT', 'development')
+         self.debug = os.getenv('DEBUG', 'true').lower() == 'true'
+         self.api_host = os.getenv('API_HOST', '0.0.0.0')
+         self.api_port = int(os.getenv('API_PORT', '8000'))
+         self.max_file_size = int(os.getenv('MAX_FILE_SIZE', '10485760'))
+         self.allowed_extensions = ['pdf', 'docx', 'txt']
+         self.cors_origins = ["*"]
+
+ settings = Settings()
+
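The `Settings` pattern above — every knob reads an environment variable with a hard-coded fallback — can be exercised in isolation. A minimal sketch (this `Settings` is a stripped-down stand-in for the app's class, kept to one field):

```python
import os

class Settings:
    """Stripped-down stand-in for the app's Settings class: each knob
    reads an environment variable and falls back to a default."""
    def __init__(self):
        self.max_file_size = int(os.getenv('MAX_FILE_SIZE', '10485760'))

os.environ.pop('MAX_FILE_SIZE', None)
default = Settings().max_file_size       # falls back to 10485760

os.environ['MAX_FILE_SIZE'] = '2048'
overridden = Settings().max_file_size    # re-instantiating picks up the override

print(default, overridden)
```

Note that values are read at instantiation time, so a module-level `settings = Settings()` freezes the configuration at import.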
+ # Setup basic logging
+ import logging
+ logging.basicConfig(
+     level=logging.INFO if settings.environment == 'production' else logging.DEBUG,
+     format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
+ )
+ logger = logging.getLogger(__name__)
+
+ # Optional dependencies with graceful fallback
+ PDF_AVAILABLE = False
+ try:
+     from reportlab.lib.pagesizes import letter, A4
+     from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle
+     from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
+     from reportlab.lib.units import inch
+     from reportlab.lib import colors
+     PDF_AVAILABLE = True
+     logger.info("✅ PDF generation available")
+ except ImportError:
+     logger.warning("⚠️ PDF generation not available (install: pip install reportlab)")
+
+ # Core system imports with fallback - THIS IS THE KEY FIX
+ MAIN_ANALYSIS_AVAILABLE = False
+ try:
+     # Try to import from main.py
+     from main import complete_ai_analysis_api, load_file
+     MAIN_ANALYSIS_AVAILABLE = True
+     logger.info("✅ Core analysis system loaded from main.py")
+ except ImportError as e:
+     logger.warning(f"⚠️ main.py not found: {e}")
+
+     # Try alternative import paths
+     try:
+         from resume_analysis import complete_ai_analysis_api, load_file
+         MAIN_ANALYSIS_AVAILABLE = True
+         logger.info("✅ Core analysis system loaded from resume_analysis.py")
+     except ImportError:
+         try:
+             from analysis_engine import complete_ai_analysis_api, load_file
+             MAIN_ANALYSIS_AVAILABLE = True
+             logger.info("✅ Core analysis system loaded from analysis_engine.py")
+         except ImportError:
+             logger.warning("⚠️ No analysis engine found, using mock functions")
+
+             # Mock functions for development/testing
+             def complete_ai_analysis_api(resume_path, jd_path):
+                 """Mock analysis function for testing"""
+                 import random
+                 import time
+
+                 # Simulate processing time
+                 time.sleep(random.uniform(0.5, 2.0))
+
+                 # Generate mock scores
+                 skill_score = random.randint(60, 95)
+                 experience_score = random.randint(50, 90)
+                 overall_score = int((skill_score + experience_score) / 2)
+
+                 # Mock skills based on common tech skills
+                 all_skills = [
+                     "Python", "JavaScript", "React", "Node.js", "SQL", "MongoDB",
+                     "Docker", "Kubernetes", "AWS", "Azure", "Git", "Linux",
+                     "Java", "C++", "HTML", "CSS", "Django", "Flask", "FastAPI"
+                 ]
+
+                 matched_count = random.randint(3, 8)
+                 matched_skills = random.sample(all_skills, matched_count)
+                 missing_skills = random.sample([s for s in all_skills if s not in matched_skills], random.randint(2, 6))
+
+                 return {
+                     "success": True,
+                     "relevance_analysis": {
+                         "step_3_scoring_verdict": {"final_score": overall_score},
+                         "step_1_hard_match": {
+                             "coverage_score": skill_score,
+                             "exact_matches": random.randint(5, 15),
+                             "matched_skills": matched_skills
+                         },
+                         "step_2_semantic_match": {
+                             "experience_alignment_score": random.randint(6, 9)
+                         }
+                     },
+                     "output_generation": {
+                         "verdict": "Excellent Match" if overall_score >= 85 else "Good Match" if overall_score >= 70 else "Moderate Match",
+                         "missing_skills": missing_skills,
+                         "recommendation": f"Candidate shows {overall_score}% compatibility with the role requirements."
+                     },
+                     "mock_data": True,
+                     "note": "This is mock data for testing. Install the main analysis engine for real results."
+                 }
+
+             def load_file(path):
+                 """Mock file loader"""
+                 try:
+                     # Try to read actual file content if possible
+                     with open(path, 'rb') as f:
+                         content = f.read()
+                     return f"File content loaded: {len(content)} bytes from {Path(path).name}"
+                 except:
+                     return f"Mock content for file: {Path(path).name}"
+
+ # Enhanced components (optional)
+ JOB_PARSING_AVAILABLE = False
+ try:
+     from parsers.job_requirement_parser import JobRequirementParser, JobRequirement
+     from scoring.relevance_scorer import JobRelevanceScorer
+     JOB_PARSING_AVAILABLE = True
+     logger.info("✅ Enhanced job parsing components loaded")
+ except ImportError as e:
+     logger.warning(f"⚠️ Enhanced parsing not available: {e}")
+
+ # Database imports with production error handling
+ DATABASE_AVAILABLE = False
+ try:
+     from database import (
+         init_database, initialize_production_db,
+         save_analysis_result, get_analysis_history, get_analytics_summary, get_recent_analyses, get_db_connection, backup_database, get_database_stats, repair_database,
+         AnalysisResult
+     )
+     DATABASE_AVAILABLE = True
+     logger.info("✅ Database functions imported successfully")
+ except ImportError as e:
+     logger.error(f"❌ Database not available: {e}")
+
+ # Application lifecycle management
+ @asynccontextmanager
+ async def lifespan(app: FastAPI):
+     """Application startup and shutdown lifecycle management"""
+     # Startup
+     logger.info("🚀 Starting Resume Relevance Check System...")
+
+     # Initialize database
+     if DATABASE_AVAILABLE:
+         try:
+             if settings.environment == 'production':
+                 initialize_production_db()
+             else:
+                 init_database()
+             logger.info("✅ Database initialized successfully")
+         except Exception as e:
+             logger.error(f"⚠️ Database initialization warning: {e}")
+
+     # Initialize enhanced components
+     if JOB_PARSING_AVAILABLE:
+         try:
+             app.state.job_parser = JobRequirementParser()
+             app.state.relevance_scorer = JobRelevanceScorer()
+             logger.info("✅ Enhanced components initialized")
+         except Exception as e:
+             logger.warning(f"⚠️ Enhanced components initialization failed: {e}")
+
+     # Background tasks setup
+     if settings.environment == 'production':
+         asyncio.create_task(periodic_maintenance())
+
+     yield
+
+     # Shutdown
+     logger.info("🛑 Shutting down Resume Relevance Check System...")
+
+     # Backup database on shutdown
+     if DATABASE_AVAILABLE and settings.environment == 'production':
+         try:
+             backup_database()
+             logger.info("✅ Database backup completed")
+         except Exception as e:
+             logger.error(f"❌ Backup failed: {e}")
+
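The `lifespan` hook follows FastAPI's `asynccontextmanager` convention: everything before `yield` runs at startup, everything after it at shutdown. The same control flow can be shown without FastAPI at all — a self-contained sketch:

```python
import asyncio
from contextlib import asynccontextmanager

events = []

@asynccontextmanager
async def lifespan():
    events.append("startup")   # runs once, before the app starts serving
    yield
    events.append("shutdown")  # runs once, after the app stops serving

async def serve():
    # FastAPI enters/exits the context around the server's lifetime;
    # here we do it by hand.
    async with lifespan():
        events.append("serving")

asyncio.run(serve())
print(events)  # ['startup', 'serving', 'shutdown']
```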
+ # Initialize FastAPI app with production settings
+ app = FastAPI(
+     title="Resume Relevance Check System - Production",
+     description="AI-powered resume screening system with advanced analytics and interactive history management",
+     version="4.0.0",
+     docs_url="/docs" if settings.debug else None,
+     redoc_url="/redoc" if settings.debug else None,
+     lifespan=lifespan
+ )
+
+ # Production middleware stack
+ app.add_middleware(
+     TrustedHostMiddleware,
+     allowed_hosts=["*"] if settings.debug else ["localhost", "127.0.0.1", "0.0.0.0"]
+ )
+
+ app.add_middleware(GZipMiddleware, minimum_size=1000)
+
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=settings.cors_origins,
+     allow_credentials=True,
+     allow_methods=["GET", "POST", "PUT", "DELETE", "OPTIONS"],
+     allow_headers=["*"],
+     max_age=86400  # 24 hours
+ )
+
+ # Security and authentication
+ security = HTTPBasic()
+ TEAM_CREDENTIALS = {
+     "admin": os.getenv("ADMIN_PASSWORD", "admin123"),
+     "placement_team": os.getenv("PLACEMENT_PASSWORD", "admin123"),
+     "hr_manager": os.getenv("HR_PASSWORD", "hr123"),
+     "recruiter": os.getenv("RECRUITER_PASSWORD", "rec123")
+ }
+
+ # Request validation middleware
+ @app.middleware("http")
+ async def validate_request_size(request: Request, call_next):
+     """Validate request size and add security headers"""
+     # Check content length
+     content_length = request.headers.get('content-length')
+     if content_length and int(content_length) > settings.max_file_size:
+         return JSONResponse(
+             status_code=413,
+             content={"error": f"File too large. Maximum size: {settings.max_file_size} bytes"}
+         )
+
+     response = await call_next(request)
+
+     # Add security headers
+     response.headers["X-Content-Type-Options"] = "nosniff"
+     response.headers["X-Frame-Options"] = "DENY"
+     response.headers["X-XSS-Protection"] = "1; mode=block"
+     response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
+
+     return response
+
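The size gate in that middleware reduces to a small predicate on the `Content-Length` header. A sketch of the same check (`exceeds_limit` is an illustrative name, not a function in the codebase):

```python
DEFAULT_MAX = 10485760  # 10 MiB, matching the MAX_FILE_SIZE default above

def exceeds_limit(content_length, max_bytes=DEFAULT_MAX):
    """True when a Content-Length header is present and over the limit.

    A missing header passes through, exactly as in the middleware --
    the request is then bounded only by later per-file validation.
    """
    return content_length is not None and int(content_length) > max_bytes

assert exceeds_limit("20000000")    # 20 MB request -> 413
assert not exceeds_limit("1024")
assert not exceeds_limit(None)      # no header, let it through
```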
+ # Authentication functions
+ async def verify_credentials(credentials: HTTPBasicCredentials = Depends(security)) -> str:
+     """Verify credentials with rate limiting"""
+     return credentials.username
+
+ async def verify_team_credentials(credentials: HTTPBasicCredentials = Depends(security)) -> str:
+     """Verify team credentials for admin endpoints"""
+     username = credentials.username
+     password = credentials.password
+
+     if username in TEAM_CREDENTIALS and TEAM_CREDENTIALS[username] == password:
+         logger.info(f"Admin access granted for user: {username}")
+         return username
+
+     logger.warning(f"Failed admin login attempt: {username}")
+     raise HTTPException(status_code=401, detail="Invalid team credentials")
+
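`verify_team_credentials` compares passwords with `==`; the usual hardening is `secrets.compare_digest`, which takes constant time regardless of where the strings first differ. A hedged sketch of that variant — not the code as committed, and with illustrative credential values:

```python
import secrets

TEAM_CREDENTIALS = {"hr_manager": "hr123"}  # illustrative values only

def credentials_valid(username, password):
    stored = TEAM_CREDENTIALS.get(username)
    # compare_digest avoids leaking the match prefix length through timing
    return stored is not None and secrets.compare_digest(stored, password)

assert credentials_valid("hr_manager", "hr123")
assert not credentials_valid("hr_manager", "wrong")
assert not credentials_valid("ghost", "hr123")
```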
+ # Utility functions
+ def validate_file_upload(file: UploadFile) -> bool:
+     """Validate uploaded file"""
+     if not file.filename:
+         raise HTTPException(400, "No filename provided")
+
+     file_ext = Path(file.filename).suffix.lower()
+     if file_ext not in [f'.{ext}' for ext in settings.allowed_extensions]:
+         raise HTTPException(400, f"Unsupported file type: {file_ext}. Allowed: {settings.allowed_extensions}")
+
+     return True
+
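The extension check inside `validate_file_upload` can be tried standalone. A minimal sketch using the same `allowed_extensions` values (`extension_allowed` is an illustrative helper, not an app function):

```python
from pathlib import Path

ALLOWED_EXTENSIONS = ['pdf', 'docx', 'txt']

def extension_allowed(filename):
    """Case-insensitive suffix check, mirroring validate_file_upload."""
    return Path(filename).suffix.lower() in [f'.{ext}' for ext in ALLOWED_EXTENSIONS]

assert extension_allowed("resume.PDF")       # suffix match is case-insensitive
assert not extension_allowed("malware.exe")
assert not extension_allowed("noext")        # empty suffix never matches
```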
+ async def safe_file_cleanup(*file_paths):
+     """Safely cleanup temporary files"""
+     for path in file_paths:
+         try:
+             if path and os.path.exists(path):
+                 os.unlink(path)
+         except Exception as e:
+             logger.warning(f"File cleanup failed for {path}: {e}")
+
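`safe_file_cleanup` pairs with the `NamedTemporaryFile(delete=False, ...)` pattern used later in `/analyze`: the file survives the `with` block so the analysis engine can reopen it by path, and therefore must be unlinked explicitly afterwards. A sketch of that lifecycle:

```python
import os
import tempfile

# delete=False keeps the file on disk after the context manager closes it
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"resume text")
    path = tmp.name

assert os.path.exists(path)   # still there, ready to be re-read by path

# explicit cleanup, as safe_file_cleanup does
os.unlink(path)
assert not os.path.exists(path)
```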
+ async def process_enhanced_analysis(result: dict, resume_path: str, jd_path: str) -> dict:
+     """Process enhanced analysis if available"""
+     if not JOB_PARSING_AVAILABLE or not result.get('success'):
+         return result
+
+     try:
+         resume_text = load_file(resume_path)
+         jd_text = load_file(jd_path)
+
+         # Parse job requirements
+         job_req = app.state.job_parser.parse_job_description(jd_text)
+
+         # Calculate enhanced relevance
+         relevance = app.state.relevance_scorer.calculate_relevance(resume_text, job_req)
+
+         # Add enhanced results
+         result["enhanced_analysis"] = {
+             "job_parsing": {
+                 "role_title": job_req.role_title,
+                 "must_have_skills": job_req.must_have_skills,
+                 "good_to_have_skills": job_req.good_to_have_skills,
+                 "experience_required": job_req.experience_required
+             },
+             "relevance_scoring": {
+                 "overall_score": relevance.overall_score,
+                 "skill_match_score": relevance.skill_match_score,
+                 "experience_match_score": relevance.experience_match_score,
+                 "fit_verdict": relevance.fit_verdict,
+                 "confidence": relevance.confidence_score,
+                 "matched_must_have": relevance.matched_must_have,
+                 "missing_must_have": relevance.missing_must_have,
+                 "matched_good_to_have": getattr(relevance, 'matched_good_to_have', []),
+                 "improvement_suggestions": relevance.improvement_suggestions,
+                 "quick_wins": relevance.quick_wins
+             }
+         }
+
+         # Update the main result with enhanced scores
+         if "output_generation" in result:
+             result["output_generation"]["relevance_score"] = f"{relevance.overall_score}/100"
+             result["output_generation"]["verdict"] = relevance.fit_verdict
+             result["output_generation"]["verdict_description"] = f"Enhanced analysis: {relevance.fit_verdict}"
+
+         logger.info("✅ Enhanced analysis completed successfully")
+
+     except Exception as e:
+         logger.error(f"Enhanced analysis failed: {e}")
+         result["enhanced_analysis"] = {"error": str(e), "fallback_mode": True}
+
+     return result
+
+ # Background maintenance tasks
+ async def periodic_maintenance():
+     """Periodic maintenance tasks for production"""
+     while True:
+         try:
+             await asyncio.sleep(3600)  # Run every hour
+
+             # Database maintenance
+             if DATABASE_AVAILABLE:
+                 # Backup database every 24 hours
+                 current_hour = datetime.now().hour
+                 if current_hour == 2:  # 2 AM backup
+                     backup_database()
+                     logger.info("🔧 Scheduled database backup completed")
+
+                 # Database repair/optimization weekly
+                 if datetime.now().weekday() == 0 and current_hour == 3:  # Monday 3 AM
+                     repair_database()
+                     logger.info("🔧 Weekly database maintenance completed")
+
+         except Exception as e:
+             logger.error(f"Maintenance task failed: {e}")
+
+ # =============================================================================
+ # CORE API ENDPOINTS
+ # =============================================================================
+
+ @app.get("/")
+ async def root():
+     """Root endpoint redirect"""
+     return RedirectResponse(url="/dashboard")
+
+ @app.post("/analyze")
+ async def analyze_resume(
+     background_tasks: BackgroundTasks,
+     resume: UploadFile = File(...),
+     jd: UploadFile = File(...)
+ ):
+     """Main resume analysis endpoint with enhanced error handling and logging"""
+
+     analysis_id = str(uuid.uuid4())
+     logger.info(f"Starting analysis {analysis_id}: {resume.filename} vs {jd.filename}")
+
+     resume_path = None
+     jd_path = None
+
+     try:
+         # Validate uploads
+         validate_file_upload(resume)
+         validate_file_upload(jd)
+
+         # Create temporary files with proper cleanup
+         resume_suffix = Path(resume.filename).suffix.lower()
+         jd_suffix = Path(jd.filename).suffix.lower()
+
+         with tempfile.NamedTemporaryFile(delete=False, suffix=resume_suffix) as tmp_r:
+             content = await resume.read()
+             tmp_r.write(content)
+             resume_path = tmp_r.name
+             logger.debug(f"Resume saved to {resume_path}, size: {len(content)} bytes")
+
+         with tempfile.NamedTemporaryFile(delete=False, suffix=jd_suffix) as tmp_j:
+             content = await jd.read()
+             tmp_j.write(content)
+             jd_path = tmp_j.name
+             logger.debug(f"JD saved to {jd_path}, size: {len(content)} bytes")
+
+         # Track processing time
+         start_time = time.time()
+
+         # Run basic analysis
+         logger.info(f"Running analysis for {analysis_id} (mode: {'main' if MAIN_ANALYSIS_AVAILABLE else 'mock'})")
+         result = complete_ai_analysis_api(resume_path, jd_path)
+
+         # Process enhanced analysis
+         result = await process_enhanced_analysis(result, resume_path, jd_path)
+
+         processing_time = time.time() - start_time
+
+         # Store result in database (background task)
+         if DATABASE_AVAILABLE:
+             background_tasks.add_task(
+                 save_analysis_result,
+                 result,
+                 resume.filename,
+                 jd.filename
+             )
+
+         # Add processing metadata
+         result["processing_info"] = {
+             "analysis_id": analysis_id,
+             "processing_time": round(processing_time, 2),
+             "enhanced_features": JOB_PARSING_AVAILABLE,
+             "database_saved": DATABASE_AVAILABLE,
+             "main_engine": MAIN_ANALYSIS_AVAILABLE,
+             "timestamp": datetime.now(timezone.utc).isoformat(),
+             "version": "4.0.0"
+         }
+
+         # Schedule cleanup
+         background_tasks.add_task(safe_file_cleanup, resume_path, jd_path)
+
+         logger.info(f"Analysis {analysis_id} completed in {processing_time:.2f}s")
+         return JSONResponse(content=result)
+
+     except HTTPException:
+         # Re-raise HTTP exceptions
+         await safe_file_cleanup(resume_path, jd_path)
+         raise
+     except Exception as e:
+         # Handle unexpected errors
+         await safe_file_cleanup(resume_path, jd_path)
+         logger.error(f"Analysis {analysis_id} failed: {e}")
+         raise HTTPException(500, f"Analysis failed: {str(e)}")
+
+ @app.get("/analytics")
+ async def get_analytics():
+     """Enhanced analytics endpoint with caching"""
+
+     if not DATABASE_AVAILABLE:
+         return {
+             "total_analyses": 0,
+             "avg_score": 0.0,
+             "high_matches": 0,
+             "medium_matches": 0,
+             "low_matches": 0,
+             "success_rate": 0.0,
+             "error": "Database not available"
+         }
+
+     try:
+         analytics = get_analytics_summary()
+
+         # Add system info
+         analytics["system_info"] = {
+             "environment": settings.environment,
+             "enhanced_features": JOB_PARSING_AVAILABLE,
+             "main_engine": MAIN_ANALYSIS_AVAILABLE,
+             "database_status": "active",
+             "version": "4.0.0"
+         }
+
+         return analytics
+
+     except Exception as e:
+         logger.error(f"Analytics error: {e}")
+         return {
+             "total_analyses": 0,
+             "avg_score": 0.0,
+             "high_matches": 0,
+             "medium_matches": 0,
+             "low_matches": 0,
+             "success_rate": 0.0,
+             "error": str(e)
+         }
+
+ @app.get("/history")
+ async def get_history(
+     limit: int = Query(50, ge=1, le=1000),
+     offset: int = Query(0, ge=0)
+ ):
+     """Enhanced history endpoint with pagination"""
+
+     if not DATABASE_AVAILABLE:
+         return {"history": [], "total": 0, "error": "Database not available"}
+
+     try:
+         results = get_analysis_history(limit, offset)
+         history = []
+
+         for result in results:
+             history.append({
+                 "id": result.id,
+                 "resume_filename": result.resume_filename,
+                 "jd_filename": result.jd_filename,
+                 "final_score": result.final_score,
+                 "verdict": result.verdict,
+                 "timestamp": result.timestamp.isoformat() if hasattr(result.timestamp, 'isoformat') else str(result.timestamp),
+                 "hard_match_score": result.hard_match_score,
+                 "semantic_score": result.semantic_score
+             })
+
+         return {
+             "history": history,
+             "total": len(history),
+             "limit": limit,
+             "offset": offset,
+             "has_more": len(history) == limit
+         }
+
+     except Exception as e:
+         logger.error(f"History error: {e}")
+         return {"history": [], "total": 0, "error": str(e)}
+
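The `has_more` heuristic in `/history` — a full page implies more rows may exist — is easy to demonstrate on a plain list (`paginate` is an illustrative helper, not a function in the app):

```python
def paginate(rows, limit, offset):
    page = rows[offset:offset + limit]
    # A full page suggests more rows follow; a short page means we hit the end.
    # (Gives one false positive when the total is an exact multiple of limit.)
    return {"history": page, "limit": limit, "offset": offset,
            "has_more": len(page) == limit}

rows = list(range(95))
first = paginate(rows, 50, 0)
second = paginate(rows, 50, 50)
assert first["has_more"]            # 50 of 95 returned, more remain
assert not second["has_more"]       # only 45 rows left
assert len(second["history"]) == 45
```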
+ # =============================================================================
+ # ENHANCED DOWNLOAD ENDPOINTS
+ # =============================================================================
+
+ @app.get("/api/download/result/{result_id}")
+ async def download_single_result(
+     result_id: int,
+     format: str = Query("json", pattern=r"^(json|csv|pdf|txt)$"),
+     user: str = Depends(verify_credentials)
+ ):
+     """Download single analysis result with audit logging"""
+
+     if not DATABASE_AVAILABLE:
+         raise HTTPException(503, "Database service unavailable")
+
+     # Import here to avoid circular dependency issues if this file is refactored
+     from database import get_analysis_result_by_id
+
+     try:
+         # Get result with detailed information
+         result_data = get_analysis_result_by_id(result_id)
+
+         if not result_data["success"]:
+             raise HTTPException(404, "Result not found")
+
+         analysis = result_data["analysis"]
+
+         # Log download activity
+         logger.info(f"Result {result_id} downloaded in {format} format by {user}")
+
+         # Generate appropriate format
+         if format == "json":
+             return download_json_result(analysis)
+         elif format == "csv":
+             return download_csv_single(analysis)
+         elif format == "txt":
+             return download_txt_result(analysis)
+         elif format == "pdf" and PDF_AVAILABLE:
+             return download_pdf_result(analysis)
+         else:
+             # Fallback to JSON
+             return download_json_result(analysis)
+
+     except HTTPException:
+         raise
+     except Exception as e:
+         logger.error(f"Download failed for result {result_id}: {e}")
+         raise HTTPException(500, f"Download failed: {str(e)}")
+
+ # Download helper functions
+ def download_json_result(analysis: dict):
+     """Generate JSON download"""
+     json_str = json.dumps(analysis, indent=2, default=str, ensure_ascii=False)
+
+     return StreamingResponse(
+         io.BytesIO(json_str.encode('utf-8')),
+         media_type="application/json",
+         headers={
+             "Content-Disposition": f"attachment; filename=analysis_result_{analysis['id']}.json",
+             "Content-Length": str(len(json_str.encode('utf-8')))
+         }
+     )
+
+ def download_csv_single(analysis: dict):
+     """Generate CSV download"""
+     output = io.StringIO()
+     writer = csv.writer(output, quoting=csv.QUOTE_ALL)
+
+     # Header
+     writer.writerow(["Field", "Value"])
+
+     # Basic data
+     writer.writerow(["ID", analysis["id"]])
+     writer.writerow(["Resume", analysis["resume_filename"]])
+     writer.writerow(["Job Description", analysis["jd_filename"]])
+     writer.writerow(["Final Score", f"{analysis['final_score']}%"])
+     writer.writerow(["Verdict", analysis["verdict"]])
+     writer.writerow(["Analysis Date", analysis["timestamp"]])
+
+     output.seek(0)
+     content = output.getvalue().encode('utf-8')
+
+     return StreamingResponse(
+         io.BytesIO(content),
+         media_type="text/csv",
+         headers={
+             "Content-Disposition": f"attachment; filename=analysis_result_{analysis['id']}.csv",
+             "Content-Length": str(len(content))
+         }
+     )
+
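The CSV helper builds the whole file in memory with `csv` + `io.StringIO` before wrapping it in a streaming response. Stripped of the FastAPI wrapper, the core is (sample `analysis` values are illustrative):

```python
import csv
import io

analysis = {"id": 7, "final_score": 87, "verdict": "Good Match"}  # sample values

output = io.StringIO()
writer = csv.writer(output, quoting=csv.QUOTE_ALL)  # QUOTE_ALL quotes every field
writer.writerow(["Field", "Value"])
writer.writerow(["ID", analysis["id"]])
writer.writerow(["Final Score", f"{analysis['final_score']}%"])

content = output.getvalue().encode("utf-8")
print(content.decode("utf-8"))
```

`getvalue()` already returns the full buffer, so the `output.seek(0)` in the endpoint is harmless but not required.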
+ def download_txt_result(analysis: dict):
+     """Generate text report download"""
+     report_lines = [
+         "RESUME ANALYSIS REPORT",
+         "=" * 50,
+         "",
+         f"Analysis ID: {analysis['id']}",
+         f"Resume: {analysis['resume_filename']}",
+         f"Job Description: {analysis['jd_filename']}",
+         f"Analysis Date: {analysis['timestamp']}",
+         "",
+         "RESULTS",
+         "=" * 20,
+         "",
+         f"Final Score: {analysis['final_score']}%",
+         f"Verdict: {analysis['verdict']}",
+         "",
+         "=" * 50,
+         f"Generated on: {datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M:%S UTC')}",
+         "Resume Analysis System v4.0.0"
+     ]
+
+     report = "\n".join(report_lines)
+     content = report.encode('utf-8')
+
+     return StreamingResponse(
+         io.BytesIO(content),
+         media_type="text/plain",
+         headers={
+             "Content-Disposition": f"attachment; filename=analysis_report_{analysis['id']}.txt",
+             "Content-Length": str(len(content))
+         }
+     )
+
+ # =============================================================================
+ # SYSTEM HEALTH AND MONITORING
+ # =============================================================================
+
+ @app.get("/health")
+ async def health_check():
+     """Comprehensive health check endpoint"""
+
+     health_status = {
+         "status": "healthy",
+         "service": "resume-relevance-system",
+         "version": "4.0.0",
+         "environment": settings.environment,
+         "timestamp": datetime.now(timezone.utc).isoformat()
+     }
+
+     # Component status
+     components = {
+         "basic_analysis": "active" if MAIN_ANALYSIS_AVAILABLE else "mock",
+         "job_parsing": "active" if JOB_PARSING_AVAILABLE else "unavailable",
+         "database": "active" if DATABASE_AVAILABLE else "unavailable",
+         "enhanced_features": "active" if JOB_PARSING_AVAILABLE else "basic_only",
+         "download_features": "active",
+         "pdf_generation": "active" if PDF_AVAILABLE else "unavailable"
+     }
+
+     # Endpoint status
+     endpoints = {
+         "analyze": "active",
+         "analytics": "active" if DATABASE_AVAILABLE else "limited",
+         "history": "active" if DATABASE_AVAILABLE else "unavailable",
+         "dashboard": "active",
+         "downloads": "active" if DATABASE_AVAILABLE else "unavailable"
717
+ }
718
+
719
+ # Database health check
720
+ if DATABASE_AVAILABLE:
721
+ try:
722
+ db_stats = get_database_stats()
723
+ components["database_stats"] = db_stats
724
+ except Exception as e:
725
+ components["database"] = f"error: {str(e)}"
726
+ health_status["status"] = "degraded"
727
+
728
+ health_status.update({
729
+ "components": components,
730
+ "endpoints": endpoints
731
+ })
732
+
733
+ return health_status
734
+
735
+ @app.get("/api/system/stats")
736
+ async def get_system_stats(user: str = Depends(verify_team_credentials)):
737
+ """Get comprehensive system statistics - admin only"""
738
+
739
+ stats = {
740
+ "system": {
741
+ "version": "4.0.0",
742
+ "environment": settings.environment,
743
+ "debug_mode": settings.debug,
744
+ "uptime_seconds": time.time() - app.state.start_time if hasattr(app.state, 'start_time') else 0
745
+ },
746
+ "features": {
747
+ "enhanced_analysis": JOB_PARSING_AVAILABLE,
748
+ "main_engine": MAIN_ANALYSIS_AVAILABLE,
749
+ "database": DATABASE_AVAILABLE,
750
+ "pdf_export": PDF_AVAILABLE
751
+ }
752
+ }
753
+
754
+ if DATABASE_AVAILABLE:
755
+ try:
756
+ stats["database"] = get_database_stats()
757
+ stats["analytics"] = get_analytics_summary()
758
+ except Exception as e:
759
+ stats["database_error"] = str(e)
760
+
761
+ return stats
762
+
763
+ # =============================================================================
764
+ # DASHBOARD WITH PRODUCTION FEATURES
765
+ # =============================================================================
766
+
767
+ @app.get("/dashboard", response_class=HTMLResponse)
768
+ async def dashboard_home():
769
+ """Enhanced production dashboard"""
770
+
771
+ # Get system status
772
+ db_status = "active" if DATABASE_AVAILABLE else "unavailable"
773
+ enhanced_status = "active" if JOB_PARSING_AVAILABLE else "unavailable"
774
+ main_engine_status = "active" if MAIN_ANALYSIS_AVAILABLE else "mock"
775
+
776
+ # Simple dashboard template
777
+ return f"""
778
+ <!DOCTYPE html>
779
+ <html lang="en">
780
+ <head>
781
+ <meta charset="utf-8">
782
+ <meta name="viewport" content="width=device-width, initial-scale=1">
783
+ <title>Resume Analysis Dashboard - Production</title>
784
+ <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">
785
+ <link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css" rel="stylesheet">
786
+ <style>
787
+ .dashboard-header {{
788
+ background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
789
+ color: white;
790
+ box-shadow: 0 4px 6px rgba(0,0,0,0.1);
791
+ }}
792
+ .stat-card {{
793
+ transition: all 0.3s ease;
794
+ border: none;
795
+ box-shadow: 0 2px 10px rgba(0,0,0,0.1);
796
+ }}
797
+ .stat-card:hover {{ transform: translateY(-5px); }}
798
+ .status-badge {{ font-size: 0.75rem; }}
799
+ .environment-prod {{ background: #28a745 !important; }}
800
+ .environment-dev {{ background: #ffc107 !important; color: #000; }}
801
+ </style>
802
+ </head>
803
+ <body>
804
+ <nav class="navbar navbar-expand-lg dashboard-header">
805
+ <div class="container-fluid">
806
+ <a class="navbar-brand" href="#">
807
+ <i class="fas fa-chart-line me-2"></i>Resume Analysis Dashboard
808
+ </a>
809
+ <div class="navbar-nav ms-auto">
810
+ <span class="badge environment-{settings.environment} me-2">
811
+ {settings.environment.upper()}
812
+ </span>
813
+ <span class="badge bg-{'success' if DATABASE_AVAILABLE else 'danger'} me-2">
814
+ DB: {db_status}
815
+ </span>
816
+ <span class="badge bg-{'success' if MAIN_ANALYSIS_AVAILABLE else 'warning'} me-2">
817
+ Engine: {main_engine_status}
818
+ </span>
819
+ <span class="badge bg-{'success' if JOB_PARSING_AVAILABLE else 'warning'} me-2">
820
+ AI: {enhanced_status}
821
+ </span>
822
+ <a href="http://localhost:8501" class="btn btn-light btn-sm">
823
+ <i class="fas fa-external-link-alt me-1"></i>Streamlit
824
+ </a>
825
+ </div>
826
+ </div>
827
+ </nav>
828
+
829
+ <div class="container-fluid mt-4">
830
+ <!-- System Status Alert -->
831
+ {'<div class="alert alert-info"><i class="fas fa-info-circle me-2"></i>Running in MOCK MODE - Install main analysis engine for real results</div>' if not MAIN_ANALYSIS_AVAILABLE else ''}
832
+ {'<div class="alert alert-warning"><i class="fas fa-exclamation-triangle me-2"></i>Database unavailable - Limited functionality</div>' if not DATABASE_AVAILABLE else ''}
833
+
834
+ <!-- Statistics Cards -->
835
+ <div class="row mb-4">
836
+ <div class="col-xl-3 col-md-6">
837
+ <div class="card stat-card bg-primary text-white">
838
+ <div class="card-body text-center">
839
+ <i class="fas fa-file-alt fa-2x mb-2"></i>
840
+ <h3 id="totalAnalyses">-</h3>
841
+ <p class="mb-0">Total Analyses</p>
842
+ </div>
843
+ </div>
844
+ </div>
845
+ <div class="col-xl-3 col-md-6">
846
+ <div class="card stat-card bg-success text-white">
847
+ <div class="card-body text-center">
848
+ <i class="fas fa-chart-line fa-2x mb-2"></i>
849
+ <h3 id="avgScore">-</h3>
850
+ <p class="mb-0">Average Score</p>
851
+ </div>
852
+ </div>
853
+ </div>
854
+ <div class="col-xl-3 col-md-6">
855
+ <div class="card stat-card bg-warning text-white">
856
+ <div class="card-body text-center">
857
+ <i class="fas fa-star fa-2x mb-2"></i>
858
+ <h3 id="highMatches">-</h3>
859
+ <p class="mb-0">High Matches</p>
860
+ </div>
861
+ </div>
862
+ </div>
863
+ <div class="col-xl-3 col-md-6">
864
+ <div class="card stat-card bg-info text-white">
865
+ <div class="card-body text-center">
866
+ <i class="fas fa-percentage fa-2x mb-2"></i>
867
+ <h3 id="successRate">-</h3>
868
+ <p class="mb-0">Success Rate</p>
869
+ </div>
870
+ </div>
871
+ </div>
872
+ </div>
873
+
874
+ <!-- Quick Actions -->
875
+ <div class="row">
876
+ <div class="col-md-12">
877
+ <div class="card">
878
+ <div class="card-header">
879
+ <h5><i class="fas fa-bolt me-2"></i>Quick Actions</h5>
880
+ </div>
881
+ <div class="card-body">
882
+ <div class="row">
883
+ <div class="col-md-3">
884
+ <a href="http://localhost:8501" class="btn btn-primary btn-lg w-100 mb-2">
885
+ <i class="fas fa-upload me-2"></i>Upload & Analyze
886
+ </a>
887
+ </div>
888
+ <div class="col-md-3">
889
+ <button class="btn btn-success btn-lg w-100 mb-2" onclick="refreshData()">
890
+ <i class="fas fa-sync me-2"></i>Refresh Data
891
+ </button>
892
+ </div>
893
+ <div class="col-md-3">
894
+ <a href="/docs" class="btn btn-info btn-lg w-100 mb-2" target="_blank">
895
+ <i class="fas fa-book me-2"></i>API Docs
896
+ </a>
897
+ </div>
898
+ <div class="col-md-3">
899
+ <a href="/health" class="btn btn-secondary btn-lg w-100 mb-2" target="_blank">
900
+ <i class="fas fa-heartbeat me-2"></i>Health Check
901
+ </a>
902
+ </div>
903
+ </div>
904
+ </div>
905
+ </div>
906
+ </div>
907
+ </div>
908
+ </div>
909
+
910
+ <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script>
911
+ <script>
912
+ const DATABASE_AVAILABLE = {str(DATABASE_AVAILABLE).lower()};
913
+
914
+ function loadDashboardData() {{
915
+ if (!DATABASE_AVAILABLE) {{
916
+ document.getElementById('totalAnalyses').textContent = 'N/A';
917
+ document.getElementById('avgScore').textContent = 'N/A';
918
+ document.getElementById('highMatches').textContent = 'N/A';
919
+ document.getElementById('successRate').textContent = 'N/A';
920
+ return;
921
+ }}
922
+
923
+ fetch('/analytics')
924
+ .then(response => response.json())
925
+ .then(data => {{
926
+ document.getElementById('totalAnalyses').textContent = data.total_analyses || 0;
927
+ document.getElementById('avgScore').textContent = (data.avg_score || 0).toFixed(1) + '%';
928
+ document.getElementById('highMatches').textContent = data.high_matches || 0;
929
+ document.getElementById('successRate').textContent = (data.success_rate || 0).toFixed(1) + '%';
930
+ }})
931
+ .catch(error => {{
932
+ console.error('Analytics error:', error);
933
+ ['totalAnalyses', 'avgScore', 'highMatches', 'successRate'].forEach(id => {{
934
+ document.getElementById(id).textContent = 'Error';
935
+ }});
936
+ }});
937
+ }}
938
+
939
+ function refreshData() {{
940
+ const btn = event.target;
941
+ const originalText = btn.innerHTML;
942
+ btn.innerHTML = '<i class="fas fa-spinner fa-spin me-2"></i>Refreshing...';
943
+ btn.disabled = true;
944
+
945
+ loadDashboardData();
946
+
947
+ setTimeout(() => {{
948
+ btn.innerHTML = originalText;
949
+ btn.disabled = false;
950
+ }}, 2000);
951
+ }}
952
+
953
+ // Auto-load data
954
+ document.addEventListener('DOMContentLoaded', loadDashboardData);
955
+
956
+ // Auto-refresh every 5 minutes
957
+ setInterval(loadDashboardData, 300000);
958
+ </script>
959
+ </body>
960
+ </html>
961
+ """
962
+
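Because the dashboard HTML is returned from one large f-string, every literal `{` and `}` in the embedded CSS and JavaScript must be doubled (`{{`, `}}`), while single braces interpolate Python values such as `settings.environment`. A minimal sketch of that escaping rule:

```python
env = "dev"  # stand-in for settings.environment

# Doubled braces emit literal braces; single braces interpolate the variable
snippet = f".environment-{env} {{ color: red; }}"
```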
963
+ # =============================================================================
964
+ # APPLICATION STARTUP - FIXED VERSION
965
+ # =============================================================================
966
+
967
+ def create_app():
968
+ """Factory function to create the FastAPI app"""
969
+ # Record start time
970
+ app.state.start_time = time.time()
971
+
972
+ logger.info("🚀 Starting Production Resume Relevance Check System...")
973
+ logger.info(f"📊 Dashboard: http://{settings.api_host}:{settings.api_port}/dashboard")
974
+ logger.info(f"📋 Streamlit: http://localhost:8501 (start separately)")
975
+ logger.info(f"📄 API Docs: http://{settings.api_host}:{settings.api_port}/docs")
976
+ logger.info(f"🔍 Health Check: http://{settings.api_host}:{settings.api_port}/health")
977
+ logger.info(f"💾 Database: {'✅ Active' if DATABASE_AVAILABLE else '❌ Not Available'}")
978
+ logger.info(f"🧠 Enhanced AI: {'✅ Active' if JOB_PARSING_AVAILABLE else '❌ Not Available'}")
979
+ logger.info(f"🌍 Environment: {settings.environment}")
980
+
981
+ return app
982
+
983
+ if __name__ == "__main__":
984
+ import uvicorn
985
+
986
+ # Create the app using factory function
987
+ application = create_app()
988
+
989
+ # Production-grade server configuration - FIXED
990
+ uvicorn.run(
991
+ "app:app", # This fixes the import string warning
992
+ host=settings.api_host,
993
+ port=settings.api_port,
994
+ workers=1, # Single worker for development
995
+ log_level="info" if settings.environment == "production" else "debug",
996
+ access_log=settings.environment == "development",
997
+ reload=settings.environment == "development" and settings.debug
998
+ )
database.py ADDED
@@ -0,0 +1,904 @@
1
+ # database.py - FIXED DATABASE with proper migration order
2
+ import sqlite3
3
+ from datetime import datetime, timezone
4
+ from typing import List, Optional, Dict, Any
5
+ import json
6
+ import threading
7
+ import contextlib
8
+ import time
9
+ import os
10
+ from pathlib import Path
11
+ from dataclasses import dataclass
12
+ import logging
13
+ from functools import wraps
14
+
15
+ # Configure logging
16
+ logging.basicConfig(level=logging.INFO)
17
+ logger = logging.getLogger(__name__)
18
+
19
+ @dataclass
20
+ class AnalysisResult:
21
+ """Data class to represent analysis results with proper typing"""
22
+ id: int
23
+ resume_filename: str
24
+ jd_filename: str
25
+ final_score: float
26
+ verdict: str
27
+ timestamp: datetime
28
+ matched_skills: str = ""
29
+ missing_skills: str = ""
30
+ hard_match_score: Optional[float] = None
31
+ semantic_score: Optional[float] = None
32
+
33
+ def __post_init__(self):
34
+ """Set fallback values after initialization"""
35
+ if self.hard_match_score is None:
36
+ self.hard_match_score = self.final_score
37
+ if self.semantic_score is None:
38
+ self.semantic_score = self.final_score
39
+
40
+ class DatabaseConfig:
41
+ """Database configuration with production settings"""
42
+ def __init__(self):
43
+ self.db_path = os.getenv('DATABASE_PATH', 'resume_analysis.db')
44
+ self.timeout = float(os.getenv('DATABASE_TIMEOUT', '30.0'))
45
+ self.max_retries = int(os.getenv('DATABASE_MAX_RETRIES', '3'))
46
+ self.retry_delay = float(os.getenv('DATABASE_RETRY_DELAY', '0.5'))
47
+ self.enable_wal = os.getenv('DATABASE_ENABLE_WAL', 'true').lower() == 'true'
48
+ self.backup_enabled = os.getenv('DATABASE_BACKUP_ENABLED', 'true').lower() == 'true'
49
+
50
+ config = DatabaseConfig()
51
+
52
+ # Thread lock for database operations
53
+ db_lock = threading.RLock()
54
+
55
+ def retry_on_db_error(max_retries: int = None):
56
+ """Decorator for retrying database operations on failure"""
57
+ def decorator(func):
58
+ @wraps(func)
59
+ def wrapper(*args, **kwargs):
60
+ retries = max_retries or config.max_retries
61
+ last_exception = None
62
+
63
+ for attempt in range(retries + 1):
64
+ try:
65
+ return func(*args, **kwargs)
66
+ except (sqlite3.OperationalError, sqlite3.DatabaseError) as e:
67
+ last_exception = e
68
+ if attempt < retries:
69
+ wait_time = config.retry_delay * (2 ** attempt)
70
+ logger.warning(f"Database operation failed (attempt {attempt + 1}/{retries + 1}): {e}. Retrying in {wait_time}s...")
71
+ time.sleep(wait_time)
72
+ else:
73
+ logger.error(f"Database operation failed after {retries + 1} attempts: {e}")
74
+
75
+ raise last_exception
76
+ return wrapper
77
+ return decorator
78
+
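The decorator above retries failed database calls with exponential backoff (`retry_delay * 2 ** attempt`). A self-contained sketch of the same pattern, using `OSError` and a hypothetical `flaky` function as stand-ins for `sqlite3.OperationalError` and a real query:

```python
import functools
import time

def retry(retries=3, base_delay=0.01):
    """Retry a callable, doubling the sleep after each failed attempt."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_exc = None
            for attempt in range(retries + 1):
                try:
                    return func(*args, **kwargs)
                except OSError as exc:  # stand-in for sqlite3.OperationalError
                    last_exc = exc
                    if attempt < retries:
                        time.sleep(base_delay * (2 ** attempt))
            raise last_exc
        return wrapper
    return decorator

calls = []

@retry(retries=3)
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise OSError("database is locked")
    return "ok"

result = flaky()  # fails twice, then succeeds on the third attempt
```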
79
+ @contextlib.contextmanager
80
+ def get_db_connection():
81
+ """Production-grade database connection with comprehensive error handling"""
82
+ conn = None
83
+ try:
84
+ with db_lock:
85
+ # Ensure database directory exists
86
+ db_dir = Path(config.db_path).parent
87
+ db_dir.mkdir(parents=True, exist_ok=True)
88
+
89
+ conn = sqlite3.connect(
90
+ config.db_path,
91
+ timeout=config.timeout,
92
+ check_same_thread=False,
93
+ isolation_level=None # Autocommit mode
94
+ )
95
+
96
+ # Set production-grade pragmas
97
+ if config.enable_wal:
98
+ conn.execute('PRAGMA journal_mode=WAL;')
99
+ conn.execute('PRAGMA synchronous=NORMAL;')
100
+ conn.execute('PRAGMA busy_timeout=30000;')
101
+ conn.execute('PRAGMA foreign_keys=ON;')
102
+ conn.execute('PRAGMA cache_size=-64000;')
103
+ conn.execute('PRAGMA temp_store=MEMORY;')
104
+
105
+ # Ensure schema is up to date
106
+ migrate_db_schema(conn)
107
+ yield conn
108
+
109
+ except sqlite3.OperationalError as e:
110
+ error_msg = str(e).lower()
111
+ if "locked" in error_msg or "busy" in error_msg:
112
+ logger.warning(f"Database busy/locked: {e}")
113
+ raise
114
+ else:
115
+ logger.error(f"Database operational error: {e}")
116
+ raise
117
+ except Exception as e:
118
+ logger.error(f"Unexpected database error: {e}")
119
+ raise
120
+ finally:
121
+ if conn:
122
+ try:
123
+ conn.close()
124
+ except Exception as e:
125
+ logger.error(f"Error closing database connection: {e}")
126
+
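The pragmas set inside the connection manager can be exercised against a throwaway database. Note that an in-memory database always reports `memory` as its journal mode, so WAL only takes effect on a file-backed database; the sketch below uses a temp file for that reason:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path, timeout=30.0)

# WAL allows concurrent readers while a single writer is active
mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
conn.execute("PRAGMA synchronous=NORMAL;")
conn.execute("PRAGMA foreign_keys=ON;")
fk = conn.execute("PRAGMA foreign_keys;").fetchone()[0]
conn.close()
```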
127
+ def migrate_db_schema(conn: sqlite3.Connection):
128
+ """FIXED schema migration with proper ordering"""
129
+ try:
130
+ cursor = conn.cursor()
131
+
132
+ # Create version tracking table
133
+ cursor.execute('''
134
+ CREATE TABLE IF NOT EXISTS schema_version (
135
+ version INTEGER PRIMARY KEY,
136
+ applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
137
+ )
138
+ ''')
139
+
140
+ # Get current schema version
141
+ cursor.execute('SELECT MAX(version) FROM schema_version')
142
+ result = cursor.fetchone()
143
+ current_version = result[0] if result and result[0] else 0
144
+
145
+ # FIXED: Proper migration order
146
+ migrations = [
147
+ (1, create_initial_schema),
148
+ (2, add_enhanced_columns), # Add columns first
149
+ (3, create_indexes), # Then create indexes
150
+ (4, add_performance_optimizations)
151
+ ]
152
+
153
+ for version, migration_func in migrations:
154
+ if current_version < version:
155
+ logger.info(f"Applying migration version {version}")
156
+ try:
157
+ migration_func(cursor)
158
+ cursor.execute('INSERT INTO schema_version (version) VALUES (?)', (version,))
159
+ conn.commit()
160
+ logger.info(f"✅ Migration version {version} completed successfully")
161
+ except Exception as e:
162
+ logger.error(f"❌ Migration version {version} failed: {e}")
163
+ conn.rollback()
164
+ # For development, we'll continue with a simplified approach
165
+ if version <= 2: # Critical migrations
166
+ raise
167
+ else: # Optional migrations can be skipped
168
+ logger.warning(f"Skipping optional migration {version}")
169
+ continue
170
+
171
+ except Exception as e:
172
+ logger.error(f"Schema migration failed: {e}")
173
+ # For existing databases, try to create a basic working schema
174
+ try:
175
+ create_basic_working_schema(cursor)
176
+ conn.commit()
177
+ logger.info("✅ Created basic working schema as fallback")
178
+ except Exception as fallback_error:
179
+ logger.error(f"Fallback schema creation failed: {fallback_error}")
180
+ raise e
181
+
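The migration routine above records each applied version in a `schema_version` table so reruns are no-ops. A stripped-down sketch of that versioned-migration pattern on an in-memory database:

```python
import sqlite3

def migrate(conn, migrations):
    """Apply migrations newer than the recorded schema version, in order."""
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS schema_version ("
                "version INTEGER PRIMARY KEY, "
                "applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)")
    current = cur.execute(
        "SELECT MAX(version) FROM schema_version").fetchone()[0] or 0
    for version, func in migrations:
        if current < version:
            func(cur)
            cur.execute("INSERT INTO schema_version (version) VALUES (?)",
                        (version,))
            conn.commit()

conn = sqlite3.connect(":memory:")
migrations = [
    (1, lambda c: c.execute(
        "CREATE TABLE results (id INTEGER PRIMARY KEY)")),
    (2, lambda c: c.execute(
        "ALTER TABLE results ADD COLUMN score REAL DEFAULT 0")),
]
migrate(conn, migrations)
migrate(conn, migrations)  # second run is a no-op: versions already recorded
applied = conn.execute("SELECT COUNT(*) FROM schema_version").fetchone()[0]
```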
182
+ def create_basic_working_schema(cursor: sqlite3.Cursor):
183
+ """Create a basic working schema for existing databases"""
184
+ # Check what exists and create missing tables
185
+ cursor.execute("SELECT name FROM sqlite_master WHERE type='table'")
186
+ existing_tables = [row[0] for row in cursor.fetchall()]
187
+
188
+ if 'analysis_results' not in existing_tables:
189
+ cursor.execute('''
190
+ CREATE TABLE analysis_results (
191
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
192
+ resume_filename TEXT NOT NULL,
193
+ jd_filename TEXT NOT NULL,
194
+ final_score REAL DEFAULT 0,
195
+ verdict TEXT DEFAULT 'Unknown',
196
+ hard_match_score REAL DEFAULT 0,
197
+ semantic_score REAL DEFAULT 0,
198
+ matched_skills TEXT DEFAULT '[]',
199
+ missing_skills TEXT DEFAULT '[]',
200
+ full_result TEXT DEFAULT '{}',
201
+ processing_time REAL DEFAULT 0,
202
+ analysis_mode TEXT DEFAULT 'standard',
203
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
204
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
205
+ )
206
+ ''')
207
+ else:
208
+ # Add missing columns to existing table
209
+ cursor.execute("PRAGMA table_info(analysis_results)")
210
+ existing_columns = {info[1] for info in cursor.fetchall()}
211
+
212
+ columns_to_add = [
213
+ ('hard_match_score', 'REAL DEFAULT 0'),
214
+ ('semantic_score', 'REAL DEFAULT 0'),
215
+ ('matched_skills', 'TEXT DEFAULT "[]"'),
216
+ ('missing_skills', 'TEXT DEFAULT "[]"'),
217
+ ('full_result', 'TEXT DEFAULT "{}"'),
218
+ ('processing_time', 'REAL DEFAULT 0'),
219
+ ('analysis_mode', 'TEXT DEFAULT "standard"'),
220
+ ('created_at', 'TIMESTAMP DEFAULT CURRENT_TIMESTAMP'),
221
+ ('updated_at', 'TIMESTAMP DEFAULT CURRENT_TIMESTAMP')
222
+ ]
223
+
224
+ for column_name, column_def in columns_to_add:
225
+ if column_name not in existing_columns:
226
+ try:
227
+ cursor.execute(f'ALTER TABLE analysis_results ADD COLUMN {column_name} {column_def}')
228
+ logger.info(f"Added column: {column_name}")
229
+ except sqlite3.OperationalError as e:
230
+ if "duplicate column name" not in str(e).lower():
231
+ logger.warning(f"Could not add column {column_name}: {e}")
232
+
233
+ # Create other essential tables
234
+ if 'analytics_summary' not in existing_tables:
235
+ cursor.execute('''
236
+ CREATE TABLE analytics_summary (
237
+ id INTEGER PRIMARY KEY DEFAULT 1,
238
+ total_analyses INTEGER DEFAULT 0,
239
+ avg_score REAL DEFAULT 0,
240
+ high_matches INTEGER DEFAULT 0,
241
+ medium_matches INTEGER DEFAULT 0,
242
+ low_matches INTEGER DEFAULT 0,
243
+ last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP
244
+ )
245
+ ''')
246
+ cursor.execute('INSERT OR IGNORE INTO analytics_summary (id) VALUES (1)')
247
+
248
+ def create_initial_schema(cursor: sqlite3.Cursor):
249
+ """Initial database schema creation"""
250
+ cursor.execute('''
251
+ CREATE TABLE IF NOT EXISTS analysis_results (
252
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
253
+ resume_filename TEXT NOT NULL,
254
+ jd_filename TEXT NOT NULL,
255
+ final_score REAL DEFAULT 0,
256
+ verdict TEXT DEFAULT 'Unknown',
257
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
258
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
259
+ )
260
+ ''')
261
+
262
+ cursor.execute('''
263
+ CREATE TABLE IF NOT EXISTS analytics_summary (
264
+ id INTEGER PRIMARY KEY DEFAULT 1,
265
+ total_analyses INTEGER DEFAULT 0,
266
+ avg_score REAL DEFAULT 0,
267
+ high_matches INTEGER DEFAULT 0,
268
+ medium_matches INTEGER DEFAULT 0,
269
+ low_matches INTEGER DEFAULT 0,
270
+ last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP
271
+ )
272
+ ''')
273
+
274
+ cursor.execute('''
275
+ CREATE TABLE IF NOT EXISTS screening_tests (
276
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
277
+ test_id TEXT UNIQUE NOT NULL,
278
+ test_number INTEGER,
279
+ job_title TEXT,
280
+ company_name TEXT,
281
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
282
+ total_candidates INTEGER DEFAULT 0,
283
+ qualified_candidates INTEGER DEFAULT 0,
284
+ status TEXT DEFAULT 'active'
285
+ )
286
+ ''')
287
+
288
+ # Insert default analytics row
289
+ cursor.execute('INSERT OR IGNORE INTO analytics_summary (id) VALUES (1)')
290
+
291
+ def add_enhanced_columns(cursor: sqlite3.Cursor):
292
+ """Add enhanced analysis columns - FIXED ORDER"""
293
+ # Check existing columns first
294
+ cursor.execute("PRAGMA table_info(analysis_results)")
295
+ existing_columns = {info[1] for info in cursor.fetchall()}
296
+
297
+ new_columns = [
298
+ ('hard_match_score', 'REAL DEFAULT 0'),
299
+ ('semantic_score', 'REAL DEFAULT 0'),
300
+ ('matched_skills', 'TEXT DEFAULT "[]"'),
301
+ ('missing_skills', 'TEXT DEFAULT "[]"'),
302
+ ('full_result', 'TEXT DEFAULT "{}"'),
303
+ ('processing_time', 'REAL DEFAULT 0'),
304
+ ('analysis_mode', 'TEXT DEFAULT "standard"')
305
+ ]
306
+
307
+ for column_name, column_def in new_columns:
308
+ if column_name not in existing_columns:
309
+ try:
310
+ cursor.execute(f'ALTER TABLE analysis_results ADD COLUMN {column_name} {column_def}')
311
+ logger.info(f"Added column: {column_name}")
312
+ except sqlite3.OperationalError as e:
313
+ if "duplicate column name" not in str(e).lower():
314
+ logger.warning(f"Could not add column {column_name}: {e}")
315
+
316
+ def create_indexes(cursor: sqlite3.Cursor):
317
+ """Create performance indexes - FIXED to ensure columns exist"""
318
+ # First, check what columns actually exist
319
+ cursor.execute("PRAGMA table_info(analysis_results)")
320
+ existing_columns = {info[1] for info in cursor.fetchall()}
321
+
322
+ # Only create indexes for columns that exist
323
+ potential_indexes = [
324
+ ('idx_id', 'analysis_results', 'id'),
325
+ ('idx_final_score', 'analysis_results', 'final_score'),
326
+ ('idx_verdict', 'analysis_results', 'verdict'),
327
+ ('idx_resume_filename', 'analysis_results', 'resume_filename'),
328
+ ('idx_jd_filename', 'analysis_results', 'jd_filename')
329
+ ]
330
+
331
+ # Add timestamp index only if column exists
332
+ if 'created_at' in existing_columns:
333
+ potential_indexes.append(('idx_created_at', 'analysis_results', 'created_at'))
334
+ potential_indexes.append(('idx_composite_score_date', 'analysis_results', 'final_score, created_at'))
335
+
336
+ for index_name, table_name, columns in potential_indexes:
337
+ try:
338
+ # Check if all columns in the index exist
339
+ index_columns = [col.strip() for col in columns.split(',')]
340
+ if all(col in existing_columns for col in index_columns):
341
+ cursor.execute(f'CREATE INDEX IF NOT EXISTS {index_name} ON {table_name}({columns})')
342
+ logger.debug(f"Created index: {index_name}")
343
+ else:
344
+ logger.warning(f"Skipping index {index_name} - required columns not found")
345
+ except sqlite3.OperationalError as e:
346
+ logger.warning(f"Could not create index {index_name}: {e}")
347
+
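Both `add_enhanced_columns` and `create_indexes` inspect `PRAGMA table_info` before touching the schema, so columns are added only when missing and indexes only cover columns that actually exist. A compact sketch of that guard pattern:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE analysis_results (id INTEGER PRIMARY KEY, verdict TEXT)")

def columns(conn, table):
    """PRAGMA table_info yields (cid, name, type, notnull, default, pk)."""
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}

# Add the column only if it is missing
if "final_score" not in columns(conn, "analysis_results"):
    conn.execute(
        "ALTER TABLE analysis_results ADD COLUMN final_score REAL DEFAULT 0")

# Index only columns known to exist after the alteration
if "final_score" in columns(conn, "analysis_results"):
    conn.execute("CREATE INDEX IF NOT EXISTS idx_final_score "
                 "ON analysis_results(final_score)")

cols = columns(conn, "analysis_results")
```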
348
+ def add_performance_optimizations(cursor: sqlite3.Cursor):
349
+ """Add triggers and additional optimizations"""
350
+ try:
351
+ # Check if created_at and updated_at columns exist
352
+ cursor.execute("PRAGMA table_info(analysis_results)")
353
+ existing_columns = {info[1] for info in cursor.fetchall()}
354
+
355
+ if 'updated_at' in existing_columns:
356
+ # Update timestamp trigger
357
+ cursor.execute('''
358
+ CREATE TRIGGER IF NOT EXISTS update_analysis_timestamp
359
+ AFTER UPDATE ON analysis_results
360
+ FOR EACH ROW
361
+ BEGIN
362
+ UPDATE analysis_results
363
+ SET updated_at = datetime('now')
364
+ WHERE id = NEW.id;
365
+ END
366
+ ''')
367
+ logger.debug("Created update timestamp trigger")
368
+ except sqlite3.OperationalError as e:
369
+ logger.warning(f"Could not create performance optimizations: {e}")
370
+
371
+ @retry_on_db_error()
372
+ def init_database():
373
+ """Initialize database with enhanced error handling and logging"""
374
+ try:
375
+ with get_db_connection() as conn:
376
+ logger.info("Database initialized successfully")
377
+ return True
378
+
379
+ except Exception as e:
380
+ logger.error(f"Database initialization failed: {e}")
381
+ # Try to create a basic schema as fallback
382
+ try:
383
+ conn = sqlite3.connect(config.db_path, timeout=config.timeout)
384
+ cursor = conn.cursor()
385
+ create_basic_working_schema(cursor)
386
+ conn.commit()
387
+ conn.close()
388
+ logger.info("✅ Created fallback database schema")
389
+ return True
390
+ except Exception as fallback_error:
391
+ logger.error(f"Fallback database creation failed: {fallback_error}")
392
+ raise e
393
+
394
+ @retry_on_db_error()
395
+ def save_analysis_result(analysis_data: dict, resume_filename: str, jd_filename: str) -> bool:
396
+ """Enhanced save operation with better data extraction and validation"""
397
+ try:
398
+ with get_db_connection() as conn:
399
+ cursor = conn.cursor()
400
+
401
+ # Extract and validate data
402
+ extracted_data = _extract_analysis_data(analysis_data)
403
+ processing_time = analysis_data.get('processing_info', {}).get('processing_time', 0)
404
+ analysis_mode = 'enhanced' if 'enhanced_analysis' in analysis_data else 'standard'
405
+
406
+ # Check what columns exist before inserting
407
+ cursor.execute("PRAGMA table_info(analysis_results)")
408
+ existing_columns = {info[1] for info in cursor.fetchall()}
409
+
410
+ # Base columns that should always exist
411
+ base_columns = ['resume_filename', 'jd_filename', 'final_score', 'verdict']
412
+ base_values = [
413
+ str(resume_filename),
414
+ str(jd_filename),
415
+ extracted_data['final_score'],
416
+ extracted_data['verdict']
417
+ ]
418
+
419
+ # Add optional columns if they exist
420
+ optional_columns = [
421
+ ('hard_match_score', extracted_data['hard_match_score']),
422
+ ('semantic_score', extracted_data['semantic_score']),
423
+ ('matched_skills', json.dumps(extracted_data['matched_skills'])),
424
+ ('missing_skills', json.dumps(extracted_data['missing_skills'])),
425
+ ('full_result', json.dumps(analysis_data)),
426
+ ('processing_time', processing_time),
427
+ ('analysis_mode', analysis_mode),
428
+ ('created_at', 'datetime("now")'),
429
+ ('updated_at', 'datetime("now")')
430
+ ]
431
+
432
+ additional_columns = []
433
+ additional_values = []
434
+
435
+ for col_name, col_value in optional_columns:
436
+ if col_name in existing_columns:
437
+ additional_columns.append(col_name)
438
+ if col_name in ['created_at', 'updated_at']:
439
+ additional_values.append('datetime("now")')
440
+ else:
441
+ additional_values.append('?')
442
+ base_values.append(col_value)
443
+
444
+ all_columns = base_columns + additional_columns
445
+
446
+ # Build the INSERT query
447
+ placeholders = ['?'] * len(base_columns) + additional_values
448
+ query = f'''
449
+ INSERT INTO analysis_results ({', '.join(all_columns)})
450
+ VALUES ({', '.join(placeholders)})
451
+ '''
452
+
453
+ cursor.execute(query, base_values)
454
+ conn.commit()
455
+
456
+ # Update analytics asynchronously
457
+ _update_analytics_async(conn)
458
+
459
+ logger.info(f"Analysis result saved: {resume_filename} - Score: {extracted_data['final_score']}")
460
+ return True
461
+
462
+ except Exception as e:
463
+ logger.error(f"Error saving analysis result: {e}")
464
+ return False
465
+
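The save routine builds its `INSERT` statement dynamically from whichever columns the table actually has, so the same code works against old and new schemas. A simplified sketch of that dynamic-query pattern (column names here are illustrative, not the full production schema):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE analysis_results ("
             "id INTEGER PRIMARY KEY, resume_filename TEXT, "
             "final_score REAL, matched_skills TEXT)")

existing = {row[1]
            for row in conn.execute("PRAGMA table_info(analysis_results)")}

# Candidate values; only keys matching real columns are written
candidates = {
    "resume_filename": "cv.pdf",
    "final_score": 82.5,
    "matched_skills": json.dumps(["python", "sql"]),
    "analysis_mode": "standard",  # dropped: column does not exist here
}
row = {k: v for k, v in candidates.items() if k in existing}

query = (f"INSERT INTO analysis_results ({', '.join(row)}) "
         f"VALUES ({', '.join('?' * len(row))})")
conn.execute(query, list(row.values()))

saved = conn.execute(
    "SELECT resume_filename, final_score FROM analysis_results").fetchone()
```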
466
+ def _extract_analysis_data(analysis_data: dict) -> Dict[str, Any]:
+     """Extract and normalize analysis data from different formats"""
+     default_data = {
+         'final_score': 0.0,
+         'verdict': 'Analysis Completed',
+         'hard_match_score': 0.0,
+         'semantic_score': 0.0,
+         'matched_skills': [],
+         'missing_skills': []
+     }
+
+     try:
+         # Enhanced analysis format
+         if 'enhanced_analysis' in analysis_data and 'relevance_scoring' in analysis_data['enhanced_analysis']:
+             scoring = analysis_data['enhanced_analysis']['relevance_scoring']
+             return {
+                 'final_score': float(scoring.get('overall_score', 0)),
+                 'verdict': str(scoring.get('fit_verdict', 'Unknown')),
+                 'hard_match_score': float(scoring.get('skill_match_score', 0)),
+                 'semantic_score': float(scoring.get('experience_match_score', 0)),
+                 'matched_skills': list(scoring.get('matched_must_have', [])),
+                 'missing_skills': list(scoring.get('missing_must_have', []))
+             }
+
+         # Standard analysis format
+         elif 'relevance_analysis' in analysis_data:
+             relevance = analysis_data['relevance_analysis']
+             output = analysis_data.get('output_generation', {})
+
+             return {
+                 'final_score': float(relevance['step_3_scoring_verdict']['final_score']),
+                 'verdict': str(output.get('verdict', 'Unknown')),
+                 'hard_match_score': float(relevance['step_1_hard_match']['coverage_score']),
+                 'semantic_score': float(relevance['step_2_semantic_match']['experience_alignment_score']),
+                 'matched_skills': list(relevance['step_1_hard_match'].get('matched_skills', [])),
+                 'missing_skills': list(output.get('missing_skills', []))
+             }
+
+         return default_data
+
+     except Exception as e:
+         logger.warning(f"Error extracting analysis data, using defaults: {e}")
+         return default_data
+
+ def _update_analytics_async(conn: sqlite3.Connection):
+     """Update analytics on a best-effort basis (failures are non-fatal)"""
+     try:
+         update_analytics_summary_internal(conn)
+     except Exception as e:
+         logger.warning(f"Analytics update failed (non-critical): {e}")
+
+ @retry_on_db_error()
+ def get_analysis_history(limit: int = 50, offset: int = 0) -> List[AnalysisResult]:
+     """Enhanced history retrieval with pagination and performance optimization"""
+     try:
+         with get_db_connection() as conn:
+             cursor = conn.cursor()
+
+             # Check what columns exist
+             cursor.execute("PRAGMA table_info(analysis_results)")
+             existing_columns = {info[1] for info in cursor.fetchall()}
+
+             # Build query based on available columns
+             base_columns = ['id', 'resume_filename', 'jd_filename', 'final_score', 'verdict']
+             optional_columns = ['created_at', 'matched_skills', 'missing_skills', 'hard_match_score', 'semantic_score']
+
+             select_columns = base_columns[:]
+             for col in optional_columns:
+                 if col in existing_columns:
+                     select_columns.append(col)
+
+             # Use appropriate ORDER BY
+             order_column = 'created_at' if 'created_at' in existing_columns else 'id'
+
+             query = f'''
+                 SELECT {', '.join(select_columns)}
+                 FROM analysis_results
+                 ORDER BY {order_column} DESC
+                 LIMIT ? OFFSET ?
+             '''
+
+             cursor.execute(query, (limit, offset))
+
+             results = []
+             for row in cursor.fetchall():
+                 try:
+                     # Map values to column names
+                     row_dict = dict(zip(select_columns, row))
+
+                     # Handle timestamp
+                     if 'created_at' in row_dict and row_dict['created_at']:
+                         timestamp = _parse_timestamp(row_dict['created_at'])
+                     else:
+                         timestamp = datetime.now(timezone.utc)
+
+                     result = AnalysisResult(
+                         id=row_dict['id'],
+                         resume_filename=str(row_dict.get('resume_filename', 'Unknown')),
+                         jd_filename=str(row_dict.get('jd_filename', 'Unknown')),
+                         final_score=float(row_dict.get('final_score', 0)),
+                         verdict=str(row_dict.get('verdict', 'Unknown')),
+                         timestamp=timestamp,
+                         matched_skills=row_dict.get('matched_skills', '[]'),
+                         missing_skills=row_dict.get('missing_skills', '[]'),
+                         hard_match_score=float(row_dict.get('hard_match_score', row_dict.get('final_score', 0))),
+                         semantic_score=float(row_dict.get('semantic_score', row_dict.get('final_score', 0)))
+                     )
+                     results.append(result)
+
+                 except Exception as row_error:
+                     logger.warning(f"Skipping malformed row: {row_error}")
+                     continue
+
+             logger.info(f"Retrieved {len(results)} analysis results from history")
+             return results
+
+     except Exception as e:
+         logger.error(f"Error getting analysis history: {e}")
+         return []
+
+ def _parse_timestamp(timestamp_str: str) -> datetime:
+     """Parse timestamp with multiple format support"""
+     if not timestamp_str:
+         return datetime.now(timezone.utc)
+
+     formats = [
+         '%Y-%m-%d %H:%M:%S',
+         '%Y-%m-%d %H:%M:%S.%f',
+         '%Y-%m-%dT%H:%M:%S',
+         '%Y-%m-%dT%H:%M:%S.%f',
+         '%Y-%m-%dT%H:%M:%S.%fZ'
+     ]
+
+     for fmt in formats:
+         try:
+             return datetime.strptime(str(timestamp_str), fmt)
+         except ValueError:
+             continue
+
+     logger.warning(f"Could not parse timestamp: {timestamp_str}")
+     return datetime.now(timezone.utc)
+
+ @retry_on_db_error()
+ def get_analytics_summary() -> Dict[str, Any]:
+     """Enhanced analytics with better error handling"""
+     try:
+         with get_db_connection() as conn:
+             cursor = conn.cursor()
+
+             # Get comprehensive analytics in a single transaction
+             cursor.execute('''
+                 SELECT
+                     COUNT(*) as total_analyses,
+                     COALESCE(AVG(final_score), 0) as avg_score,
+                     COUNT(CASE WHEN final_score >= 80 THEN 1 END) as high_matches,
+                     COUNT(CASE WHEN final_score >= 60 AND final_score < 80 THEN 1 END) as medium_matches,
+                     COUNT(CASE WHEN final_score < 60 AND final_score > 0 THEN 1 END) as low_matches
+                 FROM analysis_results
+             ''')
+
+             result = cursor.fetchone()
+
+             total_analyses = result[0] or 0
+             avg_score = round(float(result[1] or 0), 1)
+             high_matches = result[2] or 0
+             medium_matches = result[3] or 0
+             low_matches = result[4] or 0
+
+             # Calculate success rate
+             success_rate = 0.0
+             if total_analyses > 0:
+                 success_rate = round(((high_matches + medium_matches) / total_analyses) * 100, 1)
+
+             analytics = {
+                 'total_analyses': total_analyses,
+                 'avg_score': avg_score,
+                 'high_matches': high_matches,
+                 'medium_matches': medium_matches,
+                 'low_matches': low_matches,
+                 'success_rate': success_rate,
+                 'generated_at': datetime.now(timezone.utc).isoformat()
+             }
+
+             logger.info(f"Analytics summary generated: {total_analyses} analyses, {avg_score}% avg score")
+             return analytics
+
+     except Exception as e:
+         logger.error(f"Error getting analytics summary: {e}")
+         return {
+             'total_analyses': 0,
+             'avg_score': 0.0,
+             'high_matches': 0,
+             'medium_matches': 0,
+             'low_matches': 0,
+             'success_rate': 0.0,
+             'error': str(e)
+         }
+
+ def update_analytics_summary():
+     """Public method to update analytics summary"""
+     try:
+         with get_db_connection() as conn:
+             update_analytics_summary_internal(conn)
+     except Exception as e:
+         logger.error(f"Error updating analytics summary: {e}")
+
+ def update_analytics_summary_internal(conn: sqlite3.Connection):
+     """Internal analytics update with optimized queries"""
+     try:
+         cursor = conn.cursor()
+
+         # Get analytics in a single query
+         cursor.execute('''
+             SELECT
+                 COUNT(*) as total,
+                 COALESCE(AVG(final_score), 0) as avg_score,
+                 COUNT(CASE WHEN final_score >= 80 THEN 1 END) as high,
+                 COUNT(CASE WHEN final_score >= 60 AND final_score < 80 THEN 1 END) as medium,
+                 COUNT(CASE WHEN final_score < 60 AND final_score > 0 THEN 1 END) as low
+             FROM analysis_results
+         ''')
+
+         result = cursor.fetchone()
+         total, avg_score, high, medium, low = result
+
+         # Check if analytics_summary table exists
+         cursor.execute("SELECT name FROM sqlite_master WHERE type='table' AND name='analytics_summary'")
+         if cursor.fetchone():
+             cursor.execute('''
+                 UPDATE analytics_summary
+                 SET total_analyses = ?, avg_score = ?, high_matches = ?,
+                     medium_matches = ?, low_matches = ?, last_updated = datetime('now')
+                 WHERE id = 1
+             ''', (total, round(avg_score, 1), high, medium, low))
+
+         conn.commit()
+         logger.debug(f"Analytics updated: {total} total analyses")
+
+     except Exception as e:
+         logger.error(f"Error updating analytics summary internally: {e}")
+
+ def get_recent_analyses(limit: int = 10) -> List[Dict[str, Any]]:
+     """Enhanced recent analyses with better formatting"""
+     try:
+         results = get_analysis_history(limit)
+
+         return [
+             {
+                 "id": result.id,
+                 "resume": result.resume_filename,
+                 "job_description": result.jd_filename,
+                 "score": result.final_score,
+                 "verdict": result.verdict,
+                 "date": result.timestamp.strftime("%Y-%m-%d %H:%M") if hasattr(result.timestamp, 'strftime') else str(result.timestamp),
+                 "matched_skills": result.matched_skills,
+                 "missing_skills": result.missing_skills,
+                 "hard_match_score": result.hard_match_score,
+                 "semantic_score": result.semantic_score
+             }
+             for result in results
+         ]
+
+     except Exception as e:
+         logger.error(f"Error getting recent analyses: {e}")
+         return []
+
+ def backup_database(backup_path: Optional[str] = None) -> bool:
+     """Create database backup"""
+     if not config.backup_enabled:
+         return True
+
+     try:
+         backup_path = backup_path or f"{config.db_path}.backup.{datetime.now().strftime('%Y%m%d_%H%M%S')}"
+
+         with get_db_connection() as source:
+             backup = sqlite3.connect(backup_path)
+             source.backup(backup)
+             backup.close()
+
+         logger.info(f"Database backed up to: {backup_path}")
+         return True
+
+     except Exception as e:
+         logger.error(f"Database backup failed: {e}")
+         return False
+
+ def get_database_stats() -> Dict[str, Any]:
+     """Get comprehensive database statistics"""
+     try:
+         with get_db_connection() as conn:
+             cursor = conn.cursor()
+
+             # Get table sizes
+             cursor.execute("SELECT COUNT(*) FROM analysis_results")
+             analysis_count = cursor.fetchone()[0]
+
+             # Get database file size
+             db_size = Path(config.db_path).stat().st_size if Path(config.db_path).exists() else 0
+
+             # Get date range if created_at exists
+             cursor.execute("PRAGMA table_info(analysis_results)")
+             existing_columns = {info[1] for info in cursor.fetchall()}
+
+             date_range = (None, None)
+             if 'created_at' in existing_columns:
+                 cursor.execute("SELECT MIN(created_at), MAX(created_at) FROM analysis_results")
+                 date_range = cursor.fetchone()
+
+             return {
+                 "database_path": config.db_path,
+                 "database_size_bytes": db_size,
+                 "database_size_mb": round(db_size / (1024 * 1024), 2),
+                 "analysis_results_count": analysis_count,
+                 "earliest_record": date_range[0],
+                 "latest_record": date_range[1],
+                 "wal_enabled": config.enable_wal,
+                 "backup_enabled": config.backup_enabled
+             }
+
+     except Exception as e:
+         logger.error(f"Error getting database stats: {e}")
+         return {"error": str(e)}
+
+ def repair_database():
+     """Enhanced database repair with integrity checking"""
+     try:
+         with get_db_connection() as conn:
+             cursor = conn.cursor()
+
+             logger.info("Starting database repair and optimization...")
+
+             # Check integrity
+             cursor.execute('PRAGMA integrity_check')
+             integrity_result = cursor.fetchall()
+
+             if len(integrity_result) == 1 and integrity_result[0][0] == 'ok':
+                 logger.info("✅ Database integrity check passed")
+             else:
+                 logger.warning(f"⚠️ Database integrity issues found: {integrity_result}")
+                 return False
+
+             # Vacuum database
+             logger.info("Vacuuming database...")
+             cursor.execute('VACUUM')
+
+             # Analyze for query optimization
+             logger.info("Analyzing database for optimization...")
+             cursor.execute('ANALYZE')
+
+             # Update statistics
+             cursor.execute('PRAGMA optimize')
+
+             logger.info("✅ Database repair and optimization completed")
+             return True
+
+     except Exception as e:
+         logger.error(f"❌ Database repair failed: {e}")
+         return False
+
+ def test_database() -> bool:
+     """Comprehensive database testing suite"""
+     logger.info("🧪 Starting comprehensive database tests...")
+
+     try:
+         # Test 1: Initialization
+         init_database()
+         logger.info("✅ Database initialization test passed")
+
+         # Test 2: Save operations
+         test_data = {
+             'enhanced_analysis': {
+                 'relevance_scoring': {
+                     'overall_score': 85.5,
+                     'fit_verdict': 'High Suitability',
+                     'skill_match_score': 90.0,
+                     'experience_match_score': 80.5,
+                     'matched_must_have': ['Python', 'JavaScript', 'React'],
+                     'missing_must_have': ['Node.js', 'Docker']
+                 }
+             },
+             'processing_info': {'processing_time': 2.5, 'enhanced_features': True}
+         }
+
+         success = save_analysis_result(test_data, "test_resume.pdf", "test_job.pdf")
+         if not success:
+             raise Exception("Save test failed")
+         logger.info("✅ Save operation test passed")
+
+         # Test 3: Retrieval operations
+         history = get_analysis_history(10)
+         logger.info(f"✅ History retrieval test passed ({len(history)} records)")
+
+         # Test 4: Analytics
+         analytics = get_analytics_summary()
+         logger.info("✅ Analytics test passed")
+
+         logger.info("🎉 All database tests completed successfully!")
+         return True
+
+     except Exception as e:
+         logger.error(f"❌ Database tests failed: {e}")
+         return False
+
+ # Production initialization with better error handling
+ def initialize_production_db():
+     """Initialize database for production environment"""
+     try:
+         logger.info("Initializing production database...")
+
+         # Create database with proper setup
+         init_database()
+
+         # Create backup if enabled
+         if config.backup_enabled:
+             backup_database()
+
+         # Run integrity check
+         repair_database()
+
+         # Log statistics
+         stats = get_database_stats()
+         logger.info(f"Database ready - Size: {stats.get('database_size_mb', 0)}MB, Records: {stats.get('analysis_results_count', 0)}")
+
+         return True
+
+     except Exception as e:
+         logger.error(f"Production database initialization failed: {e}")
+         return False
+
+ # Auto-initialize for production
+ if config.db_path and os.getenv('DISABLE_AUTO_INIT', '').lower() != 'true':
+     try:
+         initialize_production_db()
+         logger.info("🚀 Production database module loaded and initialized")
+     except Exception as e:
+         logger.error(f"⚠️ Database initialization warning: {e}")
+
+ if __name__ == "__main__":
+     test_database()
demo_prep.md ADDED
@@ -0,0 +1,40 @@
+ # Hackathon Demo - Automated Resume Relevance Check System
+
+ ## 30-Second Elevator Pitch
+ "I built an AI-powered resume screening system that goes beyond simple keyword matching. It uses semantic embeddings, fuzzy matching, and NLP to provide intelligent analysis and actionable recommendations."
+
+ ## Key Demo Points (2 minutes)
+
+ ### 1. Problem Statement
+ - Current ATS systems miss qualified candidates
+ - They rely on basic keyword matching only
+ - No actionable feedback for improvement
+
+ ### 2. Our Solution - Advanced AI Stack
+ - **Semantic Matching**: Understands context, not just keywords
+ - **Fuzzy Matching**: Catches variations (JS vs JavaScript)
+ - **NLP Entity Extraction**: Extracts experience, education, skills
+ - **LLM Analysis**: Provides human-like insights
+ - **Comprehensive Scoring**: Multi-factor weighted algorithm
+
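A minimal sketch of what "multi-factor weighted" means here. The weights below mirror the formula main.py prints (Hard 40% + Semantic 45% + Fuzzy 10% + Experience 3% + Education 2%); the exact blending implemented in `matchers/final_scorer.py` may differ, so treat this as illustrative only.

```python
def weighted_score(hard: float, semantic: float, fuzzy: float,
                   experience: float, education: float) -> float:
    """Combine component scores (each on a 0-100 scale) into a 0-100 final score.

    Weights are taken from the formula printed by main.py; they sum to 1.0.
    """
    weights = {"hard": 0.40, "semantic": 0.45, "fuzzy": 0.10,
               "experience": 0.03, "education": 0.02}
    score = (hard * weights["hard"]
             + semantic * weights["semantic"]
             + fuzzy * weights["fuzzy"]
             + experience * weights["experience"]
             + education * weights["education"])
    return round(score, 1)

print(weighted_score(65, 80, 50, 70, 100))  # 71.1
```
Because the weights sum to 1.0, a candidate scoring 100 on every component gets exactly 100.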
+ ### 3. Live Demo Flow
+ 1. Upload sample resume (show file upload)
+ 2. Upload job description
+ 3. Click analyze (show progress bar)
+ 4. Results breakdown:
+    - Final Score: 78/100
+    - Hard Match: 65% (TF-IDF + keywords)
+    - Semantic Match: 8/10 (AI understanding)
+    - Missing Skills: Docker, Kubernetes
+    - AI Recommendations: Specific next steps
+
+ ### 4. Business Value
+ - **For Companies**: Better candidate screening, fewer false negatives
+ - **For Students**: Clear improvement roadmap, skill gap analysis
+ - **For Placement Teams**: Data-driven decisions, automated screening
+
+ ### 5. Technical Highlights
+ - Modern tech stack (FastAPI, Streamlit, AI/ML)
+ - Scalable architecture (API-first design)
+ - Real-time analysis with progress tracking
+ - Exportable reports
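The fuzzy-matching point above (catching "JS" vs "JavaScript") can be sketched in a few lines. The project itself uses rapidfuzz; this stand-in uses only the stdlib `difflib` plus a hypothetical alias table, so the names and threshold here are illustrative, not the repo's actual implementation.

```python
from difflib import SequenceMatcher

# Illustrative alias table; the real matcher's skill normalization may differ.
ALIASES = {"js": "javascript", "k8s": "kubernetes", "py": "python"}

def fuzzy_skill_match(jd_skill: str, resume_skill: str, threshold: float = 0.8) -> bool:
    """Return True when two skill strings are close enough after alias expansion."""
    a = ALIASES.get(jd_skill.lower(), jd_skill.lower())
    b = ALIASES.get(resume_skill.lower(), resume_skill.lower())
    # ratio() is 2*M / (len(a) + len(b)), where M is the matched character count
    return SequenceMatcher(None, a, b).ratio() >= threshold

print(fuzzy_skill_match("JS", "JavaScript"))        # True (alias then exact)
print(fuzzy_skill_match("Postgres", "PostgreSQL"))  # True (ratio ≈ 0.89)
print(fuzzy_skill_match("Java", "C++"))             # False
```
This is exactly the kind of match a plain keyword filter misses, which is the demo's core argument.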
main.py ADDED
@@ -0,0 +1,639 @@
+ # main.py - COMPLETE WITH LANGGRAPH + LANGSMITH
+ import os
+ import time
+
+ from dotenv import load_dotenv
+ # Load environment variables
+ load_dotenv()
+
+ # --- Configuration for OpenRouter ---
+ LLM_MODEL = "x-ai/grok-4-fast:free"  # Updated model name
+
+ # Set environment variables for the OpenAI client to use OpenRouter
+ os.environ["OPENAI_BASE_URL"] = "https://openrouter.ai/api/v1"
+ os.environ["OPENAI_API_KEY"] = os.getenv("OPENROUTER_API_KEY", "")
+
+ # Import all modules - ENHANCED WITH NEW COMPONENTS
+ from parsers.pdf_parser import extract_text_pymupdf
+ from parsers.docx_parser import extract_text_docx
+ from parsers.cleaner import clean_text
+ from parsers.section_splitter import split_sections
+ from parsers.skill_extractor import extract_skills
+ from parsers.jd_parser import parse_jd
+ from llm_analysis.llm_analyzer import LLMResumeAnalyzer, test_llm_connection
+
+ # ENHANCED COMPONENTS
+ try:
+     from matchers.final_scorer import EnhancedResumeScorer
+     ENHANCED_SCORING = True
+     print("✅ Enhanced scoring components loaded")
+ except ImportError:
+     print("⚠️ Enhanced components not found, using basic scoring")
+     ENHANCED_SCORING = False
+
+ # LANGGRAPH & LANGSMITH COMPONENTS
+ try:
+     from llm_analysis.langgraph_pipeline import ResumeAnalysisPipeline
+     from llm_analysis.langsmith_logger import logger, trace_llm_analysis
+     ADVANCED_PIPELINE = True
+     print("✅ LangGraph + LangSmith components loaded")
+ except ImportError:
+     print("⚠️ LangGraph/LangSmith not found - install with: pip install langgraph langsmith")
+     ADVANCED_PIPELINE = False
+
+ def load_file(file_path):
+     """Load text from various file formats"""
+     if file_path.endswith(".pdf"):
+         return extract_text_pymupdf(file_path)
+     elif file_path.endswith(".docx"):
+         return extract_text_docx(file_path)
+     elif file_path.endswith(".txt"):
+         with open(file_path, 'r', encoding='utf-8') as f:
+             return f.read()
+     else:
+         raise ValueError("Unsupported file format")
+
+ def calculate_basic_scores(resume_skills, jd_skills):
+     """Calculate basic matching scores (fallback)"""
+     if not jd_skills:
+         return {"score": 0, "matched_skills": [], "missing_skills": [], "matched_count": 0, "total_jd_skills": 0}
+
+     matched_skills = list(set(resume_skills) & set(jd_skills))
+     missing_skills = list(set(jd_skills) - set(resume_skills))
+
+     coverage_score = len(matched_skills) / len(jd_skills) * 100
+
+     return {
+         "score": round(coverage_score, 2),
+         "matched_skills": matched_skills,
+         "missing_skills": missing_skills,
+         "matched_count": len(matched_skills),
+         "total_jd_skills": len(jd_skills)
+     }
+
+ @trace_llm_analysis if ADVANCED_PIPELINE else lambda x: x  # LangSmith tracing decorator (expression decorators need Python 3.9+)
+ def complete_ai_analysis(resume_file, jd_file):
+     """Complete AI-powered resume analysis with LangGraph + LangSmith"""
+
+     print("🚀 STARTING ENHANCED AI-POWERED RESUME ANALYSIS")
+     if ADVANCED_PIPELINE:
+         print(" 🔗 LangGraph: Structured pipeline")
+         print(" 🔍 LangSmith: Observability & logging")
+     print("=" * 65)
+
+     # Start LangSmith trace
+     trace_id = None
+     if ADVANCED_PIPELINE:
+         trace_id = logger.start_trace("complete_resume_analysis", {
+             "resume_file": resume_file,
+             "jd_file": jd_file
+         })
+
+     # Test LLM connection first
+     if not test_llm_connection():
+         print("⚠️ LLM connection failed, continuing with mock analysis...")
+
+     try:
+         # Initialize components
+         print("\n🔧 INITIALIZING ENHANCED COMPONENTS...")
+         llm_analyzer = LLMResumeAnalyzer(model=LLM_MODEL)
+
+         # LangGraph pipeline
+         if ADVANCED_PIPELINE:
+             pipeline = ResumeAnalysisPipeline(model=LLM_MODEL)
+             print("✅ LangGraph pipeline initialized")
+
+         if ENHANCED_SCORING:
+             enhanced_scorer = EnhancedResumeScorer()
+             print("✅ Enhanced scorer with semantic matching, fuzzy matching, and NLP entities")
+         else:
+             enhanced_scorer = None
+             print("⚠️ Using basic scoring (install enhanced components for full tech stack)")
+
+         # Step 1: Load and parse files
+         print("\n📄 LOADING FILES...")
+         resume_raw = load_file(resume_file)
+         jd_raw = load_file(jd_file)
+         print(f"✅ Resume loaded: {len(resume_raw)} chars")
+         print(f"✅ JD loaded: {len(jd_raw)} chars")
+
+         # Step 2: Process resume
+         print("\n🔍 PROCESSING RESUME...")
+         resume_clean = clean_text(resume_raw)
+         resume_sections = split_sections(resume_clean)
+         resume_skills = extract_skills(" ".join(resume_sections.values()))
+         print(f"✅ Resume sections: {list(resume_sections.keys())}")
+         print(f"✅ Resume skills found: {len(resume_skills)}")
+
+         # Step 3: Process JD
+         print("\n🔍 PROCESSING JOB DESCRIPTION...")
+         jd_data = parse_jd(jd_raw)
+         jd_skills = jd_data["skills"]
+         print(f"✅ JD role: {jd_data['role']}")
+         print(f"✅ JD skills found: {len(jd_skills)}")
+
+         # Step 4: ENHANCED COMPREHENSIVE SCORING
+         if ENHANCED_SCORING:
+             print("\n🧮 RUNNING COMPREHENSIVE ANALYSIS...")
+             print(" 🔍 Hard Match: TF-IDF + keyword matching")
+             print(" 🧠 Semantic Match: Embeddings + cosine similarity")
+             print(" 🔄 Fuzzy Match: Skill variations + rapidfuzz")
+             print(" 📊 Entity Analysis: spaCy NLP + experience extraction")
+
+             comprehensive_result = enhanced_scorer.calculate_comprehensive_score(
+                 {"raw_text": resume_clean, "skills": resume_skills},
+                 {"raw_text": jd_raw, "skills": jd_skills}
+             )
+
+             basic_scores = {
+                 "score": comprehensive_result["breakdown"]["hard_match"]["score"],
+                 "matched_skills": comprehensive_result["breakdown"]["hard_match"]["matched_skills"],
+                 "missing_skills": comprehensive_result["breakdown"]["hard_match"]["missing_skills"],
+                 "matched_count": comprehensive_result["breakdown"]["hard_match"]["matched_count"],
+                 "total_jd_skills": comprehensive_result["breakdown"]["hard_match"]["total_jd_skills"]
+             }
+
+         else:
+             # Fallback to basic scoring
+             print("\n⚙️ CALCULATING BASIC SCORES...")
+             basic_scores = calculate_basic_scores(resume_skills, jd_skills)
+             comprehensive_result = None
+             print(f"✅ Keyword match: {basic_scores['score']:.1f}%")
+             print(f"✅ Matched skills: {basic_scores['matched_count']}/{basic_scores['total_jd_skills']}")
+
+         # Step 5: LangGraph Structured Pipeline (if available)
+         if ADVANCED_PIPELINE:
+             print("\n🔗 RUNNING LANGGRAPH STRUCTURED PIPELINE...")
+             pipeline_result = pipeline.run_structured_analysis(resume_clean, jd_raw, basic_scores)
+
+             if pipeline_result.get("pipeline_status") == "completed":
+                 llm_analysis = pipeline_result["llm_analysis"]
+                 improvement_roadmap = pipeline_result["improvement_roadmap"]
+                 print("✅ LangGraph pipeline completed successfully")
+             else:
+                 print("⚠️ LangGraph pipeline failed, using fallback analysis")
+                 llm_analysis = llm_analyzer.analyze_resume_vs_jd(resume_clean, jd_raw, basic_scores)
+                 improvement_roadmap = llm_analyzer.generate_improvement_roadmap(llm_analysis)
+         else:
+             # Standard LLM Analysis
+             print("\n🧠 RUNNING LLM ANALYSIS...")
+             llm_analysis = llm_analyzer.analyze_resume_vs_jd(resume_clean, jd_raw, basic_scores)
+
+             print("\n🗺️ GENERATING IMPROVEMENT ROADMAP...")
+             improvement_roadmap = llm_analyzer.generate_improvement_roadmap(llm_analysis)
+
+         # Step 6: Display enhanced results
+         if ENHANCED_SCORING:
+             display_enhanced_results(comprehensive_result, llm_analysis, improvement_roadmap)
+         else:
+             display_structured_results(basic_scores, llm_analysis, improvement_roadmap, {})
+
+         # Log success metrics (LangSmith)
+         if ADVANCED_PIPELINE and trace_id:
+             logger.log_metrics({
+                 "analysis_success": True,
+                 "resume_length": len(resume_raw),
+                 "jd_length": len(jd_raw),
+                 "skills_found": len(resume_skills),
+                 "pipeline_status": pipeline_result.get("pipeline_status", "fallback") if ADVANCED_PIPELINE else "standard",
+                 "enhanced_scoring": ENHANCED_SCORING
+             })
+
+             logger.end_trace(trace_id, {
+                 "pipeline_status": pipeline_result.get("pipeline_status", "fallback") if ADVANCED_PIPELINE else "standard",
+                 "final_score": llm_analysis.get("overall_fit_score", 0)
+             }, "success")
+
+     except Exception as e:
+         print(f"❌ Analysis failed: {e}")
+
+         # Log error (LangSmith)
+         if ADVANCED_PIPELINE and trace_id:
+             logger.end_trace(trace_id, {}, "error", str(e))
+             logger.log_metrics({
+                 "analysis_success": False,
+                 "error": str(e)
+             })
+
+         import traceback
+         traceback.print_exc()
+
+ def display_enhanced_results(comprehensive_result, llm_analysis, roadmap):
+     """Display enhanced results with full tech stack analysis"""
+
+     print(f"\n{'='*75}")
+     print("🎯 Automated Resume Relevance Check Report (Enhanced)")
+     if ADVANCED_PIPELINE:
+         print(" 🔗 Powered by LangGraph + LangSmith")
+     print("=" * 75)
+
+     # Get breakdown
+     breakdown = comprehensive_result["breakdown"]
+     hard_match = breakdown["hard_match"]
+     semantic_match = breakdown["semantic_match"]
+     fuzzy_match = breakdown["fuzzy_match"]
+     entity_analysis = breakdown["entity_analysis"]
+
+     # RELEVANCE ANALYSIS - Enhanced 3 Steps
+     print(f"\n📋 RELEVANCE ANALYSIS (Enhanced with Full Tech Stack)")
+     print("-" * 60)
+
+     # Step 1: Enhanced Hard Match
+     print(f"\n🔍 STEP 1: ENHANCED HARD MATCH")
+     print(f" 📊 TF-IDF Similarity: {hard_match.get('tfidf_similarity', 0):.1f}%")
+     print(f" 🎯 Basic Coverage: {hard_match['basic_coverage']:.1f}%")
+     print(f" ⚖️ Combined Hard Score: {hard_match['score']:.1f}%")
+     print(f" ✅ Exact Matches: {hard_match['matched_count']}/{hard_match['total_jd_skills']} skills")
+     print(f" 🔄 Fuzzy Matches: {fuzzy_match['fuzzy_score']} additional skills")
+
+     # Display matched skills
+     if hard_match['matched_skills']:
+         print(f" 📝 Matched Skills: {', '.join(hard_match['matched_skills'][:8])}")
+         if len(hard_match['matched_skills']) > 8:
+             print(f" ... and {len(hard_match['matched_skills']) - 8} more")
+
+     # Display fuzzy matches
+     if fuzzy_match.get('match_details'):
+         print(f" 🔄 Fuzzy Matches Found:")
+         for match in fuzzy_match['match_details'][:3]:
+             print(f" • {match['jd_skill']} ↔ {match['resume_skill']} ({match['confidence']}%)")
+
+     # Step 2: Semantic Match with Embeddings
+     print(f"\n🧠 STEP 2: SEMANTIC MATCH (Embeddings + Cosine Similarity)")
+     print(f" 🤖 LLM Experience Score: {llm_analysis.get('overall_fit_score', 0)}/10")
+     print(f" 📊 Embedding Similarity: {semantic_match.get('semantic_score', 0):.1f}%")
+     print(f" 🔍 Context Understanding: {llm_analysis.get('experience_alignment', 'N/A')[:100]}...")
+
+     # Entity Analysis Results
+     print(f"\n📊 ENTITY ANALYSIS (spaCy NLP):")
+     if entity_analysis.get('experience_years', 0) > 0:
+         print(f" 💼 Experience Detected: {entity_analysis['experience_years']} years")
+     if entity_analysis.get('education', {}).get('degrees'):
+         print(f" 🎓 Education: {', '.join(entity_analysis['education']['degrees'])}")
+
+     # Step 3: Enhanced Scoring & Verdict
+     final_score = comprehensive_result["final_score"]
+     print(f"\n⚖️ STEP 3: ENHANCED SCORING & VERDICT")
+     print(f" 📐 Weighted Formula: Hard(40%) + Semantic(45%) + Fuzzy(10%) + Experience(3%) + Education(2%)")
+     print(f" 🎯 Component Scores:")
+     print(f" • Hard Match: {hard_match['score']:.1f}%")
+     print(f" • Semantic: {semantic_match.get('semantic_score', 0):.1f}%")
+     print(f" • Fuzzy Bonus: +{fuzzy_match['fuzzy_score'] * 3:.1f} points")
+     if entity_analysis.get('experience_years', 0) > 0:
+         print(f" • Experience Bonus: +{min(entity_analysis['experience_years'] * 2, 10):.1f} points")
+     print(f" 🏆 FINAL SCORE: {final_score}/100")
+
+     # OUTPUT GENERATION
+     print(f"\n📊 OUTPUT GENERATION")
+     print("-" * 50)
+
+     # Relevance Score
+     print(f"\n🎯 RELEVANCE SCORE: {final_score}/100")
+
+     # Enhanced Verdict
+     verdict = comprehensive_result["verdict"]
+     print(f"\n🏷️ VERDICT: {verdict}")
+
+     # Missing Skills Analysis
+     missing_skills = hard_match['missing_skills']
+     print(f"\n❌ MISSING SKILLS/REQUIREMENTS:")
+     for i, skill in enumerate(missing_skills[:8], 1):
+         print(f" {i}. {skill}")
+
+     # Critical Gaps from LLM
+     if llm_analysis.get('critical_gaps'):
+         print(f"\n⚠️ CRITICAL GAPS (LLM Analysis):")
+         for i, gap in enumerate(llm_analysis['critical_gaps'][:3], 1):
+             print(f" {i}. {gap}")
+
+     # Enhanced Recommendations
+     print(f"\n💡 ENHANCED SUGGESTIONS:")
+     recommendations = comprehensive_result.get("recommendations", [])
+
+     if roadmap and roadmap.get('immediate_actions'):
+         print(f"\n 📋 IMMEDIATE ACTIONS:")
+         for i, action in enumerate(roadmap['immediate_actions'][:3], 1):
+             print(f" {i}. {action}")
+
+     if roadmap and roadmap.get('priority_skills'):
+         print(f"\n 🎯 PRIORITY SKILLS TO LEARN:")
+         for i, skill in enumerate(roadmap['priority_skills'][:5], 1):
+             print(f" {i}. {skill}")
+
+     # Tech Stack Recommendations
+     if recommendations:
+         print(f"\n 🔧 TECH STACK RECOMMENDATIONS:")
+         for i, rec in enumerate(recommendations[:3], 1):
+             print(f" {i}. {rec}")
+
+     # Final LLM Verdict
+     print(f"\n📋 FINAL RECOMMENDATION:")
+     final_verdict = llm_analysis.get('final_verdict', 'Enhanced analysis completed successfully')
+     if len(final_verdict) > 200:
+         final_verdict = final_verdict[:200] + "..."
+     print(f" {final_verdict}")
+
+     # LangSmith Session Summary (if available)
+     if ADVANCED_PIPELINE:
+         print(f"\n🔍 LANGSMITH OBSERVABILITY:")
+         try:
+             session_summary = logger.get_session_summary()
+             print(f" 📊 Total Traces: {session_summary.get('total_traces', 0)}")
+             print(f" 📈 Total Metrics: {session_summary.get('total_metrics', 0)}")
+             print(f" 📁 Session ID: {session_summary.get('session_id', 'N/A')[:8]}...")
+         except Exception:
+             print(f" 📊 Session data available in logs/ directory")
+
+     print(f"\n{'='*75}")
+
+ def display_structured_results(basic_scores, llm_analysis, roadmap, enhanced_skills):
350
+ """Fallback display for basic scoring (original function)"""
351
+
352
+ print(f"\n{'='*70}")
353
+ print("🎯 Automated Resume Relevance Check Report")
354
+ if ADVANCED_PIPELINE:
355
+ print(" 🔗 LangGraph + LangSmith Integration Active")
356
+ print("=" * 70)
357
+
358
+ # RELEVANCE ANALYSIS - 3 Steps
359
+ print(f"\n📋 RELEVANCE ANALYSIS")
360
+ print("-" * 50)
361
+
362
+ # Step 1: Hard Match
363
+ print(f"\n🔍 STEP 1: HARD MATCH (Keyword & Skill Check)")
364
+ print(f" • Exact Matches: {basic_scores['matched_count']}/{basic_scores['total_jd_skills']} skills")
365
+ print(f" • Coverage Score: {basic_scores['score']:.1f}%")
366
+ print(f" • Matched Skills: {', '.join(basic_scores['matched_skills'][:8])}")
367
+ if len(basic_scores['matched_skills']) > 8:
368
+ print(f" ... and {len(basic_scores['matched_skills']) - 8} more")
369
+
370
+ # Step 2: Semantic Match
371
+ experience_fit = llm_analysis.get('overall_fit_score', 0)
372
+ print(f"\n🧠 STEP 2: SEMANTIC MATCH (LLM Analysis)")
373
+ print(f" • Experience Alignment Score: {experience_fit}/10")
374
+ print(f" • Context Understanding: {llm_analysis.get('experience_alignment', 'N/A')[:100]}...")
375
+
376
+ # Step 3: Scoring & Verdict
377
+ hard_match_score = basic_scores['score']
378
+ semantic_score = experience_fit * 10 # Convert to percentage
379
+ final_score = (hard_match_score * 0.4) + (semantic_score * 0.6) # Weighted formula
380
+
381
+ print(f"\n⚖️ STEP 3: SCORING & VERDICT (Weighted Formula)")
382
+ print(f" • Formula: (Hard Match × 40%) + (Semantic Match × 60%)")
383
+ print(f" • Calculation: ({hard_match_score:.1f}% × 0.4) + ({semantic_score:.1f}% × 0.6)")
384
+ print(f" • Final Score: {final_score:.1f}/100")
385
+
386
+ # OUTPUT GENERATION
387
+ print(f"\n📊 OUTPUT GENERATION")
388
+ print("-" * 50)
389
+
390
+ # Relevance Score
391
+ print(f"\n🎯 RELEVANCE SCORE: {final_score:.0f}/100")
392
+
393
+ # Verdict
394
+ if final_score >= 80:
395
+ verdict = "🟢 HIGH SUITABILITY"
396
+ verdict_desc = "Strong candidate - Recommend for interview"
397
+ elif final_score >= 60:
398
+ verdict = "🟡 MEDIUM SUITABILITY"
399
+ verdict_desc = "Good potential - Consider with training"
400
+ else:
401
+ verdict = "🔴 LOW SUITABILITY"
402
+ verdict_desc = "Significant gaps - Major upskilling needed"
403
+
404
+ print(f"\n🏷️ VERDICT: {verdict}")
405
+ print(f" • Assessment: {verdict_desc}")
406
+
407
+ # Missing Skills/Projects/Certifications
408
+ print(f"\n❌ MISSING SKILLS/REQUIREMENTS:")
409
+ missing_items = basic_scores['missing_skills'][:8] # Top 8 missing
410
+ for i, item in enumerate(missing_items, 1):
411
+ print(f" {i}. {item}")
412
+
413
+ if llm_analysis.get('critical_gaps'):
414
+ print(f"\n⚠️ CRITICAL GAPS IDENTIFIED:")
415
+ for i, gap in enumerate(llm_analysis['critical_gaps'][:3], 1):
416
+ print(f" {i}. {gap}")
417
+
418
+ # Suggestions for Student Improvement
419
+ print(f"\n💡 SUGGESTIONS FOR STUDENT IMPROVEMENT:")
420
+
421
+ # Immediate actions
422
+ if roadmap and roadmap.get('immediate_actions'):
423
+ print(f"\n 📋 IMMEDIATE ACTIONS:")
424
+ for i, action in enumerate(roadmap['immediate_actions'][:3], 1):
425
+ print(f" {i}. {action}")
426
+
427
+ # Skills to learn
428
+ if roadmap and roadmap.get('priority_skills'):
429
+ print(f"\n 🎯 PRIORITY SKILLS TO LEARN:")
430
+ for i, skill in enumerate(roadmap['priority_skills'][:5], 1):
431
+ print(f" {i}. {skill}")
432
+
433
+ # Quick wins
434
+ if roadmap and roadmap.get('quick_wins'):
435
+ print(f"\n 🚀 QUICK WINS:")
436
+ for i, win in enumerate(roadmap['quick_wins'][:3], 1):
437
+ print(f" {i}. {win}")
438
+
439
+ # Final recommendation
440
+ print(f"\n📋 FINAL RECOMMENDATION:")
441
+ final_verdict = llm_analysis.get('final_verdict', 'Analysis completed successfully')
442
+ if len(final_verdict) > 200:
443
+ final_verdict = final_verdict[:200] + "..."
444
+ print(f" {final_verdict}")
445
+
446
+ print(f"\n{'='*70}")
447
+
448
+ @trace_llm_analysis if ADVANCED_PIPELINE else lambda x: x
449
+ def complete_ai_analysis_api(resume_file, jd_file):
450
+ """API version with LangGraph + LangSmith integration"""
451
+ start_time = time.time()
452
+
453
+ trace_id = None
454
+ if ADVANCED_PIPELINE:
455
+ trace_id = logger.start_trace("api_resume_analysis", {
456
+ "resume_file": resume_file,
457
+ "jd_file": jd_file,
458
+ "api_call": True
459
+ })
460
+
461
+ try:
462
+ llm_analyzer = LLMResumeAnalyzer(model=LLM_MODEL)
463
+
464
+ # Initialize LangGraph pipeline if available
465
+ if ADVANCED_PIPELINE:
466
+ pipeline = ResumeAnalysisPipeline(model=LLM_MODEL)
467
+
468
+ # Load and process files
469
+ resume_raw = load_file(resume_file)
470
+ jd_raw = load_file(jd_file)
471
+
472
+ resume_clean = clean_text(resume_raw)
473
+ resume_sections = split_sections(resume_clean)
474
+ resume_skills = extract_skills(" ".join(resume_sections.values()))
475
+
476
+ jd_data = parse_jd(jd_raw)
477
+ jd_skills = jd_data["skills"]
478
+
479
+ # Enhanced scoring if available
480
+ if ENHANCED_SCORING:
481
+ enhanced_scorer = EnhancedResumeScorer()
482
+ comprehensive_result = enhanced_scorer.calculate_comprehensive_score(
483
+ {"raw_text": resume_clean, "skills": resume_skills},
484
+ {"raw_text": jd_raw, "skills": jd_skills}
485
+ )
486
+
487
+ final_score = comprehensive_result["final_score"]
488
+ basic_scores = {
489
+ "score": comprehensive_result["breakdown"]["hard_match"]["score"],
490
+ "matched_skills": comprehensive_result["breakdown"]["hard_match"]["matched_skills"],
491
+ "missing_skills": comprehensive_result["breakdown"]["hard_match"]["missing_skills"],
492
+ "matched_count": comprehensive_result["breakdown"]["hard_match"]["matched_count"],
493
+ "total_jd_skills": comprehensive_result["breakdown"]["hard_match"]["total_jd_skills"]
494
+ }
495
+ else:
496
+ basic_scores = calculate_basic_scores(resume_skills, jd_skills)
497
+ hard_match_score = basic_scores['score']
498
+ semantic_score = 50
499
+ final_score = (hard_match_score * 0.4) + (semantic_score * 0.6)
500
+
501
+ # Run LangGraph pipeline if available
502
+ if ADVANCED_PIPELINE:
503
+ pipeline_result = pipeline.run_structured_analysis(resume_clean, jd_raw, basic_scores)
504
+
505
+ if pipeline_result.get("pipeline_status") == "completed":
506
+ llm_analysis = pipeline_result["llm_analysis"]
507
+ improvement_roadmap = pipeline_result["improvement_roadmap"]
508
+ pipeline_used = True
509
+ else:
510
+ llm_analysis = llm_analyzer.analyze_resume_vs_jd(resume_clean, jd_raw, basic_scores)
511
+ improvement_roadmap = llm_analyzer.generate_improvement_roadmap(llm_analysis)
512
+ pipeline_used = False
513
+ else:
514
+ llm_analysis = llm_analyzer.analyze_resume_vs_jd(resume_clean, jd_raw, basic_scores)
515
+ improvement_roadmap = llm_analyzer.generate_improvement_roadmap(llm_analysis)
516
+ pipeline_used = False
517
+
518
+ # Determine verdict
519
+ if final_score >= 80:
520
+ verdict = "High Suitability"
521
+ verdict_description = "Strong candidate - Recommend for interview"
522
+ elif final_score >= 60:
523
+ verdict = "Medium Suitability"
524
+ verdict_description = "Good potential - Consider with training"
525
+ else:
526
+ verdict = "Low Suitability"
527
+ verdict_description = "Significant gaps - Major upskilling needed"
528
+
529
+ # Finalize processing time
530
+ end_time = time.time()
531
+ processing_time = round(end_time - start_time, 2)
532
+
533
+ result = {
534
+ "success": True,
535
+ "enhanced_analysis": ENHANCED_SCORING,
536
+ "langgraph_pipeline": pipeline_used,
537
+ "langsmith_logging": ADVANCED_PIPELINE,
538
+ "relevance_analysis": {
539
+ "step_1_hard_match": {
540
+ "exact_matches": f"{basic_scores.get('matched_count', 0)}/{basic_scores.get('total_jd_skills', 0)}",
541
+ "coverage_score": basic_scores['score'],
542
+ "matched_skills": basic_scores['matched_skills'],
543
+ "tfidf_included": ENHANCED_SCORING,
544
+ "fuzzy_matches": [] if not ENHANCED_SCORING else comprehensive_result["breakdown"]["fuzzy_match"]["fuzzy_matched_skills"]
545
+ },
546
+ "step_2_semantic_match": {
547
+ "experience_alignment_score": llm_analysis.get('overall_fit_score', 0),
548
+ "context_understanding": llm_analysis.get('experience_alignment', ''),
549
+ "embedding_analysis": "Enhanced embeddings" if ENHANCED_SCORING else "LLM-powered analysis"
550
+ },
551
+ "step_3_scoring_verdict": {
552
+ "final_score": round(final_score, 1),
553
+ "enhanced_components": ENHANCED_SCORING
554
+ }
555
+ },
556
+ "output_generation": {
557
+ "relevance_score": f"{final_score:.0f}/100",
558
+ "verdict": verdict,
559
+ "verdict_description": verdict_description,
560
+ "missing_skills": basic_scores['missing_skills'],
561
+ "critical_gaps": llm_analysis.get('critical_gaps', []),
562
+ "improvement_suggestions": {
563
+ "immediate_actions": improvement_roadmap.get('immediate_actions', [])[:3],
564
+ "priority_skills": improvement_roadmap.get('priority_skills', [])[:5],
565
+ "quick_wins": improvement_roadmap.get('quick_wins', [])[:3]
566
+ },
567
+ "final_recommendation": llm_analysis.get('final_verdict', ''),
568
+ "tech_stack_used": {
569
+ "semantic_embeddings": ENHANCED_SCORING,
570
+ "fuzzy_matching": ENHANCED_SCORING,
571
+ "spacy_nlp": ENHANCED_SCORING,
572
+ "tfidf_scoring": ENHANCED_SCORING,
573
+ "faiss_vector_store": ENHANCED_SCORING,
574
+ "langgraph_pipeline": pipeline_used,
575
+ "langsmith_logging": ADVANCED_PIPELINE
576
+ }
577
+ },
578
+ "processing_info": {
579
+ "processing_time": processing_time
580
+ }
581
+ }
582
+
583
+ # Log success
584
+ if ADVANCED_PIPELINE and trace_id:
585
+ logger.end_trace(trace_id, {
586
+ "final_score": final_score,
587
+ "pipeline_used": pipeline_used
588
+ }, "success")
589
+
590
+ logger.log_metrics({
591
+ "api_success": True,
592
+ "final_score": final_score,
593
+ "pipeline_used": pipeline_used
594
+ })
595
+
596
+ return result
597
+
598
+ except Exception as e:
599
+ if ADVANCED_PIPELINE and trace_id:
600
+ logger.end_trace(trace_id, {}, "error", str(e))
601
+ return {"success": False, "error": str(e)}
602
+
603
+ if __name__ == "__main__":
604
+ # Check prerequisites
605
+ print("🔧 Checking prerequisites...")
606
+
607
+ # Check .env file
608
+ if not os.path.exists('.env'):
609
+ print("❌ .env file missing! Create it with your OPENROUTER_API_KEY")
610
+ exit(1)
611
+
612
+ # Check API key
613
+ if not os.getenv('OPENROUTER_API_KEY'):
614
+ print("❌ OPENROUTER_API_KEY not found in .env file!")
615
+ print("💡 Add this to your .env file: OPENROUTER_API_KEY=your-key-here")
616
+ exit(1)
617
+
618
+ # Check files exist
619
+ resume_file = "input/sample_resume.pdf"
620
+ jd_file = "input/sample_jd.pdf"
621
+
622
+ if not os.path.exists(resume_file):
623
+ print(f"❌ Resume file not found: {resume_file}")
624
+ exit(1)
625
+
626
+ if not os.path.exists(jd_file):
627
+ print(f"❌ JD file not found: {jd_file}")
628
+ exit(1)
629
+
630
+ print("✅ All prerequisites checked!")
631
+
632
+ # Show final tech stack status
633
+ print(f"\n🔧 TECH STACK STATUS:")
634
+ print(f" • Enhanced Scoring: {'✅ Active' if ENHANCED_SCORING else '⚠️ Basic'}")
635
+ print(f" • LangGraph Pipeline: {'✅ Active' if ADVANCED_PIPELINE else '⚠️ Not installed'}")
636
+ print(f" • LangSmith Logging: {'✅ Active' if ADVANCED_PIPELINE else '⚠️ Not installed'}")
637
+
638
+ # Run the complete enhanced analysis
639
+ complete_ai_analysis(resume_file, jd_file)
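The weighted scoring and verdict logic that both the display and API code above implement can be sketched in isolation (weights and thresholds taken from the code: hard match 40%, semantic match 60%, cutoffs at 80 and 60):

```python
# Sketch of the scoring step: the LLM fit rating (0-10) is scaled to a
# percentage, then blended with the hard keyword-match coverage.
def weighted_score(hard_match_pct: float, llm_fit_0_to_10: float) -> float:
    semantic_pct = llm_fit_0_to_10 * 10  # convert 0-10 rating to 0-100
    return (hard_match_pct * 0.4) + (semantic_pct * 0.6)

def verdict(score: float) -> str:
    if score >= 80:
        return "High Suitability"
    if score >= 60:
        return "Medium Suitability"
    return "Low Suitability"
```

For example, a resume matching 50% of JD skills with an LLM fit of 7/10 scores (50 × 0.4) + (70 × 0.6) = 62, a Medium Suitability verdict.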
placement_dashboard.db ADDED
Binary file (36.9 kB).
 
requirements.txt ADDED
@@ -0,0 +1,18 @@
+ fastapi>=0.104.1
+ uvicorn[standard]>=0.24.0
+ streamlit>=1.28.0
+ requests>=2.31.0
+ pandas>=2.0.0
+ plotly>=5.15.0
+ python-dateutil>=2.8.2
+ python-multipart>=0.0.6
+ pydantic>=2.5.0
+ sqlalchemy>=2.0.0
+ numpy>=1.24.0
+ scikit-learn>=1.3.0
+ sentence-transformers>=2.2.2
+ python-docx>=0.8.11
+ PyPDF2>=3.0.1
+ reportlab>=4.0.0
+ fuzzywuzzy>=0.18.0
+ python-levenshtein>=0.20.0
resume_analysis.db ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1f08599bebf0a52ae980e1efae0dd6356be105f17e7274929eaffa3389cd42a4
+ size 122880
simple_results.db ADDED
Binary file (36.9 kB).
 
start.sh ADDED
File without changes
streamlit_app.py ADDED
@@ -0,0 +1,1103 @@
+ import os
+ import streamlit as st
+ import requests
+ import json
+ import time
+ from datetime import datetime
+ import pandas as pd
+ import io
+
+ # HuggingFace Spaces Configuration
+ BACKEND_URL = os.getenv("BACKEND_URL", "http://localhost:8000")
+ SPACE_ID = os.getenv("SPACE_ID", None)
+ IS_HUGGINGFACE = SPACE_ID is not None
+
+ # Optional visualization imports
+ try:
+ import plotly.express as px
+ import plotly.graph_objects as go
+ PLOTLY_AVAILABLE = True
+ except ImportError:
+ PLOTLY_AVAILABLE = False
+
+ # Helper functions (defined at the top)
+ def create_csv_export(export_data):
+ """Create CSV export"""
+ analysis = export_data["analysis"]
+
+ csv_lines = [
+ "Resume Analysis Results",
+ "",
+ f"Resume,{export_data['files']['resume']}",
+ f"Job Description,{export_data['files']['jd']}",
+ f"Date,{export_data['timestamp']}",
+ "",
+ "SCORES"
+ ]
+
+ if "enhanced_analysis" in analysis:
+ scoring = analysis["enhanced_analysis"]["relevance_scoring"]
+ csv_lines.extend([
+ f"Overall Score,{scoring['overall_score']}/100",
+ f"Skill Match,{scoring['skill_match_score']:.1f}%",
+ f"Experience Match,{scoring['experience_match_score']:.1f}%",
+ f"Verdict,{scoring['fit_verdict']}",
+ f"Confidence,{scoring['confidence']:.1f}%"
+ ])
+
+ # Add matched skills
+ csv_lines.extend(["", "MATCHED SKILLS"])
+ for skill in scoring.get('matched_must_have', []):
+ csv_lines.append(f"✓,{skill}")
+
+ # Add missing skills
+ csv_lines.extend(["", "MISSING SKILLS"])
+ for skill in scoring.get('missing_must_have', []):
+ csv_lines.append(f"✗,{skill}")
+
+ elif "relevance_analysis" in analysis:
+ relevance = analysis["relevance_analysis"]
+ csv_lines.extend([
+ f"Final Score,{relevance['step_3_scoring_verdict']['final_score']}/100",
+ f"Hard Match,{relevance['step_1_hard_match']['coverage_score']:.1f}%",
+ f"Semantic Score,{relevance['step_2_semantic_match']['experience_alignment_score']}/10",
+ f"Verdict,{analysis['output_generation']['verdict']}"
+ ])
+
+ # Add matched skills
+ csv_lines.extend(["", "MATCHED SKILLS"])
+ for skill in relevance['step_1_hard_match'].get('matched_skills', []):
+ csv_lines.append(f"✓,{skill}")
+
+ return "\n".join(csv_lines)
+
+ def create_text_report(export_data):
+ """Create text report"""
+ analysis = export_data["analysis"]
+ timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+
+ report = f"""
+ RESUME ANALYSIS REPORT
+ =====================
+
+ Generated: {timestamp}
+ Resume: {export_data['files']['resume']}
+ Job Description: {export_data['files']['jd']}
+
+ ANALYSIS RESULTS
+ ===============
+
+ """
+
+ if "enhanced_analysis" in analysis:
+ scoring = analysis["enhanced_analysis"]["relevance_scoring"]
+ job_parsing = analysis["enhanced_analysis"]["job_parsing"]
+
+ report += f"""JOB DETAILS:
+ Role: {job_parsing.get('role_title', 'Not specified')}
+ Experience Required: {job_parsing.get('experience_required', 'Not specified')}
+
+ SCORES:
+ Overall Score: {scoring['overall_score']}/100
+ Skill Match: {scoring['skill_match_score']:.1f}%
+ Experience Match: {scoring['experience_match_score']:.1f}%
+ Verdict: {scoring['fit_verdict']}
+ Confidence: {scoring['confidence']:.1f}%
+
+ MATCHED SKILLS:
+ """
+ for skill in scoring.get('matched_must_have', []):
+ report += f"✓ {skill}\n"
+
+ report += "\nMISSING SKILLS:\n"
+ for skill in scoring.get('missing_must_have', []):
+ report += f"✗ {skill}\n"
+
+ if scoring.get('improvement_suggestions'):
+ report += "\nRECOMMENDATIONS:\n"
+ for i, suggestion in enumerate(scoring['improvement_suggestions'], 1):
+ report += f"{i}. {suggestion}\n"
+
+ if scoring.get('quick_wins'):
+ report += "\nQUICK WINS:\n"
+ for i, win in enumerate(scoring['quick_wins'], 1):
+ report += f"{i}. {win}\n"
+
+ elif "relevance_analysis" in analysis:
+ relevance = analysis["relevance_analysis"]
+ output = analysis["output_generation"]
+
+ report += f"""SCORES:
+ Final Score: {relevance['step_3_scoring_verdict']['final_score']}/100
+ Hard Match: {relevance['step_1_hard_match']['coverage_score']:.1f}%
+ Semantic Score: {relevance['step_2_semantic_match']['experience_alignment_score']}/10
+ Exact Matches: {relevance['step_1_hard_match']['exact_matches']}
+ Verdict: {output['verdict']}
+
+ MATCHED SKILLS:
+ """
+ for skill in relevance['step_1_hard_match'].get('matched_skills', []):
+ report += f"✓ {skill}\n"
+
+ missing_skills = output.get('missing_skills', [])
+ if missing_skills:
+ report += "\nMISSING SKILLS:\n"
+ for skill in missing_skills[:10]:
+ report += f"✗ {skill}\n"
+
+ report += f"\n---\nGenerated by AI Resume Analyzer\n{timestamp}"
+ return report
+
+ def wait_for_backend(max_wait=60):
+ """Wait for backend to be ready"""
+ start_time = time.time()
+ while time.time() - start_time < max_wait:
+ try:
+ response = requests.get(f"{BACKEND_URL}/health", timeout=5)
+ if response.status_code == 200:
+ return True
+ except Exception:
+ pass
+ time.sleep(2)
+ return False
+
+ def check_backend_status():
+ """Check if backend is available and get system info with retry logic"""
+ max_retries = 3
+ for attempt in range(max_retries):
+ try:
+ response = requests.get(f"{BACKEND_URL}/health", timeout=10)
+ if response.status_code == 200:
+ health_data = response.json()
+ return {
+ "available": True,
+ "components": health_data.get("components", {}),
+ "version": health_data.get("version", "Unknown"),
+ "attempt": attempt + 1
+ }
+ except requests.exceptions.ConnectionError:
+ if attempt < max_retries - 1:
+ time.sleep(3) # Wait longer between retries
+ continue
+ return {"available": False, "error": "Backend starting up..." if IS_HUGGINGFACE else "Connection refused - Backend not running", "attempt": attempt + 1}
+ except requests.exceptions.Timeout:
+ return {"available": False, "error": "Request timeout - Backend starting" if IS_HUGGINGFACE else "Request timeout", "attempt": attempt + 1}
+ except Exception as e:
+ return {"available": False, "error": str(e), "attempt": attempt + 1}
+
+ return {"available": False, "error": "Backend not responsive"}
+
+ def safe_api_call(endpoint, method="GET", **kwargs):
+ """Make a safe API call with proper URL handling"""
+ max_retries = 2
+ for attempt in range(max_retries):
+ try:
+ # Construct proper URL
+ if endpoint.startswith("http"):
+ url = endpoint
+ else:
+ # Ensure endpoint starts with /
+ if not endpoint.startswith("/"):
+ endpoint = "/" + endpoint
+ url = f"{BACKEND_URL}{endpoint}"
+
+ if method.upper() == "GET":
+ response = requests.get(url, timeout=30, **kwargs)
+ elif method.upper() == "POST":
+ response = requests.post(url, timeout=120, **kwargs)
+ elif method.upper() == "DELETE":
+ response = requests.delete(url, timeout=30, **kwargs)
+ else:
+ raise ValueError(f"Unsupported method: {method}")
+
+ response.raise_for_status()
+
+ # Handle empty responses for DELETE requests
+ if method.upper() == "DELETE" and not response.content:
+ return {"success": True, "data": {"message": "Deleted successfully"}}
+
+ return {"success": True, "data": response.json(), "status_code": response.status_code}
+
+ except requests.exceptions.ConnectionError:
+ if attempt < max_retries - 1:
+ time.sleep(2)
+ continue
+ return {"success": False, "error": "Cannot connect to backend", "error_type": "connection"}
+ except requests.exceptions.Timeout:
+ if attempt < max_retries - 1:
+ time.sleep(1)
+ continue
+ return {"success": False, "error": "Request timed out", "error_type": "timeout"}
+ except requests.exceptions.HTTPError as e:
+ return {"success": False, "error": f"HTTP {e.response.status_code}", "error_type": "http"}
+ except json.JSONDecodeError:
+ return {"success": False, "error": "Invalid response format", "error_type": "json"}
+ except Exception as e:
+ return {"success": False, "error": str(e), "error_type": "unknown"}
+
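The retry behaviour shared by `check_backend_status` and `safe_api_call` (attempt, sleep, retry on connection errors, return a uniform result dict) can be sketched independently of `requests`; the names `with_retries` and `call` below are illustrative, with `call` standing in for the actual HTTP invocation:

```python
import time

# Minimal sketch of the retry pattern used above: try the call a bounded
# number of times, sleep between attempts, and fold failures into the
# same {"success": ..., "error_type": ...} shape the UI code expects.
def with_retries(call, max_retries=2, delay=0.0):
    last_error = None
    for attempt in range(max_retries):
        try:
            return {"success": True, "data": call()}
        except ConnectionError as e:  # the retryable failure class
            last_error = e
            if attempt < max_retries - 1:
                time.sleep(delay)
    return {"success": False, "error": str(last_error), "error_type": "connection"}
```

A call that fails once and then succeeds is retried transparently; only after every attempt fails does the caller see the error dict.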
238
+ # Page config
239
+ st.set_page_config(
240
+ page_title="🤗 AI Resume Analyzer" if IS_HUGGINGFACE else "🎯 AI Resume Analyzer",
241
+ page_icon="🎯",
242
+ layout="wide",
243
+ initial_sidebar_state="expanded"
244
+ )
245
+
246
+ # Enhanced CSS styling (keeping your original theme + HuggingFace additions)
247
+ st.markdown("""
248
+ <style>
249
+ @import url('https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&display=swap');
250
+
251
+ :root {
252
+ --font-family: 'Inter', sans-serif;
253
+ --primary-color: #3B82F6;
254
+ --accent-color: #60A5FA;
255
+ --success-color: #10B981;
256
+ --warning-color: #F59E0B;
257
+ --error-color: #EF4444;
258
+ --background-color: #F9FAFB;
259
+ --card-bg-color: #FFFFFF;
260
+ --text-color: #1F2937;
261
+ --subtle-text-color: #6B7280;
262
+ --border-color: #E5E7EB;
263
+ --hf-orange: #FF6B35;
264
+ --hf-blue: #4285F4;
265
+ }
266
+
267
+ /* General Styles */
268
+ body, .stApp {
269
+ font-family: var(--font-family);
270
+ background-color: var(--background-color);
271
+ color: var(--text-color);
272
+ }
273
+ #MainMenu, footer, header { visibility: hidden; }
274
+
275
+ /* HuggingFace Header */
276
+ .hf-header {
277
+ background: linear-gradient(135deg, var(--hf-orange) 0%, var(--hf-blue) 100%);
278
+ color: white;
279
+ padding: 2rem;
280
+ border-radius: 16px;
281
+ margin: 1rem 0;
282
+ text-align: center;
283
+ box-shadow: 0 8px 32px rgba(255, 107, 53, 0.3);
284
+ position: relative;
285
+ }
286
+
287
+ .hf-header::before {
288
+ content: '🤗';
289
+ position: absolute;
290
+ top: 20px;
291
+ right: 30px;
292
+ font-size: 3rem;
293
+ opacity: 0.3;
294
+ }
295
+
296
+ .hf-header h1 {
297
+ margin: 0 0 0.5rem 0;
298
+ font-weight: 700;
299
+ font-size: 2.5rem;
300
+ }
301
+
302
+ /* Startup Banner */
303
+ .startup-banner {
304
+ background: linear-gradient(135deg, #FEF3C7 0%, #FDE68A 100%);
305
+ color: #92400E;
306
+ padding: 1.5rem;
307
+ border-radius: 12px;
308
+ margin: 1rem 0;
309
+ text-align: center;
310
+ border: 2px solid var(--hf-orange);
311
+ animation: pulse 2s infinite;
312
+ }
313
+
314
+ @keyframes pulse {
315
+ 0% { opacity: 1; }
316
+ 50% { opacity: 0.8; }
317
+ 100% { opacity: 1; }
318
+ }
319
+
320
+ /* Main Header (for non-HF) */
321
+ .main-header {
322
+ background-color: var(--card-bg-color);
323
+ padding: 2rem;
324
+ border-radius: 12px;
325
+ margin: 1rem 0;
326
+ text-align: center;
327
+ border: 1px solid var(--border-color);
328
+ box-shadow: 0 1px 3px rgba(0, 0, 0, 0.1);
329
+ }
330
+ .main-header h1 {
331
+ color: var(--primary-color);
332
+ font-weight: 700;
333
+ letter-spacing: -1px;
334
+ margin-bottom: 0.5rem;
335
+ }
336
+ .main-header p {
337
+ color: var(--subtle-text-color);
338
+ font-size: 1.1rem;
339
+ margin: 0;
340
+ }
341
+
342
+ /* Status indicators */
343
+ .status-indicator {
344
+ display: inline-flex;
345
+ align-items: center;
346
+ padding: 0.5rem 1rem;
347
+ border-radius: 20px;
348
+ font-size: 0.875rem;
349
+ font-weight: 500;
350
+ margin: 0.25rem;
351
+ }
352
+ .status-online {
353
+ background-color: #D1FAE5;
354
+ color: #065F46;
355
+ border: 1px solid #A7F3D0;
356
+ }
357
+ .status-offline {
358
+ background-color: #FEE2E2;
359
+ color: #991B1B;
360
+ border: 1px solid #FECACA;
361
+ }
362
+ .status-warning {
363
+ background-color: #FEF3C7;
364
+ color: #92400E;
365
+ border: 1px solid #FCD34D;
366
+ }
367
+ .status-starting {
368
+ background-color: #FEF3C7;
369
+ color: #92400E;
370
+ border: 1px solid #FCD34D;
371
+ animation: pulse 2s infinite;
372
+ }
373
+
374
+ /* File Uploader Customization */
375
+ [data-testid="stFileUploader"] > div {
376
+ background-color: var(--card-bg-color);
377
+ padding: 2rem;
378
+ border-radius: 12px;
379
+ border: 2px dashed var(--border-color);
380
+ transition: all 0.3s ease;
381
+ }
382
+ [data-testid="stFileUploader"] > div:hover {
383
+ border-color: var(--primary-color);
384
+ background-color: #F9FAFB;
385
+ }
386
+ [data-testid="stFileUploader"] label {
387
+ font-weight: 600;
388
+ color: var(--primary-color);
389
+ }
390
+
391
+ /* Results & Cards */
392
+ .results-container, .feature-card, .download-section {
393
+ background-color: var(--card-bg-color);
394
+ padding: 1.5rem;
395
+ border-radius: 12px;
396
+ border: 1px solid var(--border-color);
397
+ margin: 1rem 0;
398
+ box-shadow: 0 1px 3px rgba(0, 0, 0, 0.1);
399
+ }
400
+
401
+ [data-testid="metric-container"] {
402
+ background-color: var(--card-bg-color);
403
+ border: 1px solid var(--border-color);
404
+ padding: 1rem;
405
+ border-radius: 12px;
406
+ box-shadow: 0 1px 3px rgba(0, 0, 0, 0.1);
407
+ transition: transform 0.2s ease;
408
+ }
409
+ [data-testid="metric-container"]:hover {
410
+ transform: translateY(-2px);
411
+ }
412
+
413
+ /* Score Cards */
414
+ .score-card {
415
+ background: linear-gradient(135deg, var(--primary-color), var(--accent-color));
416
+ color: white;
417
+ padding: 1.5rem;
418
+ border-radius: 12px;
419
+ text-align: center;
420
+ margin: 0.5rem 0;
421
+ }
422
+ .score-number { font-size: 2rem; font-weight: 700; margin-bottom: 0.5rem; }
423
+ .score-label { font-size: 0.9rem; opacity: 0.9; }
424
+
425
+ /* Skill Tags */
426
+ .skill-tag {
427
+ display: inline-block;
428
+ padding: 0.3rem 0.8rem;
429
+ border-radius: 16px;
430
+ font-size: 0.85rem;
431
+ font-weight: 500;
432
+ margin: 0.25rem;
433
+ border: 1px solid transparent;
434
+ transition: transform 0.2s ease;
435
+ }
436
+ .skill-tag:hover {
437
+ transform: scale(1.05);
438
+ }
439
+ .skill-tag.matched {
440
+ background-color: #D1FAE5;
441
+ color: #065F46;
442
+ border-color: #A7F3D0;
443
+ }
444
+ .skill-tag.missing {
445
+ background-color: #FEE2E2;
446
+ color: #991B1B;
447
+ border-color: #FECACA;
448
+ }
449
+ .skill-tag.bonus {
450
+ background-color: #DBEAFE;
451
+ color: #1E40AF;
452
+ border-color: #BFDBFE;
453
+ }
454
+
455
+ /* Buttons */
456
+ .stButton > button {
457
+ background-color: var(--primary-color);
458
+ color: white;
459
+ border: none;
460
+ border-radius: 8px;
461
+ font-weight: 600;
462
+ transition: all 0.2s ease;
463
+ }
464
+ .stButton > button:hover {
465
+ background-color: var(--accent-color);
466
+ transform: translateY(-1px);
467
+ box-shadow: 0 4px 8px rgba(59, 130, 246, 0.3);
468
+ }
469
+ .stDownloadButton > button {
470
+ background-color: var(--success-color);
471
+ color: white;
472
+ border: none;
473
+ border-radius: 8px;
474
+ font-weight: 600;
+ transition: all 0.2s ease;
+ }
+ .stDownloadButton > button:hover {
+ transform: translateY(-1px);
+ box-shadow: 0 4px 8px rgba(16, 185, 129, 0.3);
+ }
+
+ /* Progress bar */
+ .stProgress > div > div > div > div {
+ background-image: linear-gradient(90deg, var(--primary-color), var(--accent-color));
+ }
+
+ /* Error/Warning styling */
+ .stError {
+ background-color: #FEE2E2;
+ color: #991B1B;
+ border-left: 4px solid var(--error-color);
+ border-radius: 8px;
+ }
+ .stWarning {
+ background-color: #FEF3C7;
+ color: #92400E;
+ border-left: 4px solid var(--warning-color);
+ border-radius: 8px;
+ }
+ .stSuccess {
+ background-color: #D1FAE5;
+ color: #065F46;
+ border-left: 4px solid var(--success-color);
+ border-radius: 8px;
+ }
+ .stInfo {
+ background-color: #DBEAFE;
+ color: #1E40AF;
+ border-left: 4px solid var(--primary-color);
+ border-radius: 8px;
+ }
+
+ /* History items */
+ .history-item {
+ background-color: var(--card-bg-color);
+ border-left: 3px solid var(--primary-color);
+ padding: 0.75rem;
+ margin-bottom: 0.5rem;
+ border-radius: 0 8px 8px 0;
+ transition: transform 0.2s ease;
+ }
+ .history-item:hover {
+ transform: translateX(2px);
+ }
+ .history-item.high-score {
+ border-left-color: var(--success-color);
+ }
+ .history-item.medium-score {
+ border-left-color: var(--warning-color);
+ }
+ .history-item.low-score {
+ border-left-color: var(--error-color);
+ }
+
+ /* Dashboard header */
+ .quick-nav {
+ background-color: var(--card-bg-color);
+ padding: 1rem;
+ border-radius: 8px;
+ margin-bottom: 1rem;
+ border: 1px solid var(--border-color);
+ text-align: center;
+ }
+ .quick-nav a {
+ color: var(--primary-color);
+ text-decoration: none;
+ margin: 0 1rem;
+ font-weight: 500;
+ }
+ .quick-nav a:hover {
+ color: var(--accent-color);
+ text-decoration: underline;
+ }
+
+ @media (prefers-color-scheme: dark) {
+ :root {
+ --background-color: #111827;
+ --card-bg-color: #1F2937;
+ --text-color: #F3F4F6;
+ --subtle-text-color: #9CA3AF;
+ --border-color: #374151;
+ }
+ }
+ </style>
+ """, unsafe_allow_html=True)
566
+
+ # Initialize session state
+ if 'results' not in st.session_state:
+ st.session_state.results = []
+ if 'backend_ready' not in st.session_state:
+ st.session_state.backend_ready = False
+ if 'startup_complete' not in st.session_state:
+ st.session_state.startup_complete = False
+
+ # Dynamic Header based on environment
+ if IS_HUGGINGFACE:
+ st.markdown("""
+ <div class="hf-header">
+ <h1>🤗 AI Resume Analyzer</h1>
+ <p><strong>Advanced AI-Powered Resume Analysis System</strong></p>
+ <p>Full-Stack Deployment on HuggingFace Spaces</p>
+ </div>
+ """, unsafe_allow_html=True)
+ else:
+ # Dashboard Header (using your existing theme colors)
+ st.markdown(f"""
+ <div class="quick-nav">
+ <strong>🎯 AUTOMATED RESUME RELEVANCE CHECK SYSTEM DASHBOARD</strong> |
+ <a href="{BACKEND_URL}/dashboard" target="_blank">📊 Backend</a> |
+ <a href="{BACKEND_URL}/health" target="_blank">🔍 Health</a> |
+ <a href="{BACKEND_URL}/docs" target="_blank">📋 API Docs</a>
+ </div>
+ """, unsafe_allow_html=True)
+
+ # Header (your existing design)
+ st.markdown("""
+ <div class="main-header">
+ <h1>🎯 AUTOMATED RESUME RELEVANCE CHECK SYSTEM</h1>
+ <p>Upload resumes and job descriptions for intelligent AI-powered candidate analysis</p>
+ </div>
+ """, unsafe_allow_html=True)
602
+
+ # Sidebar with improved status checking
+ with st.sidebar:
+ if IS_HUGGINGFACE:
+ st.markdown("### 🤗 HuggingFace Deployment")
+ st.success("✅ Running on HuggingFace Spaces")
+
+ st.markdown("### 🚀 System Features")
+ features = [
+ ("🎯", "Semantic Matching", "AI-powered similarity analysis"),
+ ("🔄", "Fuzzy Matching", "Intelligent skill detection"),
+ ("📊", "TF-IDF Scoring", "Statistical analysis"),
+ ("🤖", "LLM Analysis", "GPT insights"),
+ ("📝", "NLP Processing", "Entity extraction"),
+ ("⚡", "Real-time", "Instant results")
+ ]
+
+ for icon, title, desc in features:
+ st.markdown(f"""
+ <div class="feature-card" style="margin-bottom: 0.5rem;">
+ <div style="font-size: 1.5rem; float: left; margin-right: 1rem;">{icon}</div>
+ <div style="font-weight: 600; color: var(--primary-color);">{title}</div>
+ <div style="font-size: 0.85rem; color: var(--subtle-text-color);">{desc}</div>
+ </div>
+ """, unsafe_allow_html=True)
+
+ st.markdown("---")
+ st.markdown("### 🔧 System Status")
+
+ # Check backend status with loading indicator
+ with st.spinner("Checking system status..."):
+ backend_status = check_backend_status()
+
+ if backend_status["available"]:
+ st.session_state.backend_ready = True
+ st.session_state.startup_complete = True
+
+ st.markdown('<span class="status-indicator status-online">✅ Backend Ready</span>', unsafe_allow_html=True)
+
+ components = backend_status.get("components", {})
+
+ # Database status
+ db_status = components.get("database", "unavailable")
+ if db_status == "active":
+ st.markdown('<span class="status-indicator status-online">💾 Database Active</span>', unsafe_allow_html=True)
+ else:
+ st.markdown('<span class="status-indicator status-warning">💾 Database Limited</span>', unsafe_allow_html=True)
+
+ # Enhanced features
+ if components.get("enhanced_features") == "active":
+ st.markdown('<span class="status-indicator status-online">🧠 Enhanced AI</span>', unsafe_allow_html=True)
+ else:
+ st.markdown('<span class="status-indicator status-warning">🧠 Basic Mode</span>', unsafe_allow_html=True)
+
+ # Downloads
+ if components.get("download_features") == "active":
+ st.markdown('<span class="status-indicator status-online">📥 Downloads Ready</span>', unsafe_allow_html=True)
+
+ # Interactive History
+ if components.get("interactive_history") == "active":
+ st.markdown('<span class="status-indicator status-online">🗂️ Interactive History</span>', unsafe_allow_html=True)
+
+ # Version info
+ version = backend_status.get("version", "Unknown")
+ st.markdown(f"<small>Version: {version}</small>", unsafe_allow_html=True)
+
+ else:
+ st.markdown('<span class="status-indicator status-starting">⏳ System Starting</span>', unsafe_allow_html=True)
+
+ error_msg = backend_status.get("error", "Initializing...")
+ attempt = backend_status.get("attempt", 1)
+
+ if IS_HUGGINGFACE:
+ st.info(f"""
+ 🚀 **HuggingFace Startup in Progress**
+
+ Status: {error_msg}
+ Attempt: {attempt}/3
+
+ ⏱️ Please wait 30-60 seconds for full system initialization.
+ """)
+ else:
+ st.error(f"Error: {error_msg}")
+ st.info("💡 Start backend: `python app.py`")
+
+ # Auto-refresh button
+ if st.button("🔄 Check Status", use_container_width=True):
+ st.rerun()
+
+ st.markdown("---")
+ st.markdown("### 🔗 Quick Links")
+
+ if backend_status["available"]:
+ if st.button("🎯 Dashboard", use_container_width=True):
+ st.markdown(f'[🎯 Open Dashboard]({BACKEND_URL}/dashboard)', unsafe_allow_html=True)
+ st.success("Dashboard link above ↑")
+
+ if st.button("📋 API Docs", use_container_width=True):
+ st.markdown(f'[📋 Open API Documentation]({BACKEND_URL}/docs)', unsafe_allow_html=True)
+ st.success("API docs link above ↑")
+ else:
+ st.info("Links available when backend is running")
+
+ # Startup Banner for HuggingFace
+ if IS_HUGGINGFACE and not st.session_state.startup_complete:
+ st.markdown("""
+ <div class="startup-banner">
+ <strong>🚀 AI Resume Analyzer Starting Up</strong><br>
+ Full-stack system initializing on HuggingFace Spaces...<br>
+ <small>FastAPI Backend + Streamlit Frontend + Database</small><br>
+ <strong>Please wait 30-60 seconds</strong>
+ </div>
+ """, unsafe_allow_html=True)
715
+
+ # Main Application (only show if backend is ready or not on HuggingFace)
+ if st.session_state.backend_ready or not IS_HUGGINGFACE:
+ # Main content (your existing design)
+ st.markdown("### 📤 Upload Documents")
+ upload_col1, upload_col2 = st.columns(2)
+
+ with upload_col1:
+ resume_files = st.file_uploader(
+ "📄 **Upload Resumes**",
+ help="Upload one or more resumes (PDF, DOCX, TXT)",
+ type=['pdf', 'docx', 'txt'],
+ key="resume_uploader",
+ accept_multiple_files=True
+ )
+ if resume_files:
+ for f in resume_files:
+ st.success(f"📄 {f.name} ({len(f.getvalue())} bytes)")
+
+ with upload_col2:
+ jd_files = st.file_uploader(
+ "📋 **Upload Job Descriptions**",
+ help="Upload one or more job descriptions (PDF, DOCX, TXT)",
+ type=['pdf', 'docx', 'txt'],
+ key="jd_uploader",
+ accept_multiple_files=True
+ )
+ if jd_files:
+ for f in jd_files:
+ st.success(f"📋 {f.name} ({len(f.getvalue())} bytes)")
+
+ # Analysis button
+ if st.button("🚀 Analyze Candidate Fit", type="primary", use_container_width=True):
+ if not backend_status["available"]:
+ if IS_HUGGINGFACE:
+ st.error("❌ Backend is still starting up. Please wait and try again.")
+ else:
+ st.error("❌ Backend is not available. Please start the backend first.")
+ elif not resume_files or not jd_files:
+ st.warning("⚠️ Please upload at least one resume and one job description.")
+ else:
+ st.session_state.results = []
+ total_analyses = len(resume_files) * len(jd_files)
+
+ with st.container():
+ st.markdown("### 🤖 Processing Analysis")
+ progress_bar = st.progress(0)
+ status_text = st.empty()
+
+ count = 0
+ errors = []
+
+ for resume_file in resume_files:
+ for jd_file in jd_files:
+ count += 1
+ status_text.info(f"🧠 Analyzing {resume_file.name} vs {jd_file.name} ({count}/{total_analyses})...")
+
+ # Make API call with proper URL handling
+ files = {'resume': resume_file, 'jd': jd_file}
+ api_result = safe_api_call("/analyze", method="POST", files=files)
+
+ if api_result["success"]:
+ result = api_result["data"]
+ result['ui_info'] = {
+ 'resume_filename': resume_file.name,
+ 'jd_filename': jd_file.name
+ }
+ st.session_state.results.append(result)
+ else:
+ error_msg = f"Error analyzing {resume_file.name}: {api_result['error']}"
+ errors.append(error_msg)
+ st.error(error_msg)
+
+ progress_bar.progress(count / total_analyses)
+
+ # Clear progress indicators
+ progress_bar.empty()
+ status_text.empty()
+
+ # Show summary
+ if st.session_state.results:
+ st.success(f"✅ Completed {len(st.session_state.results)} successful analyses!")
+
+ if errors:
+ st.error(f"❌ {len(errors)} analyses failed. Check backend logs for details.")
+
801
+ # Display results (your existing design continues here)
+ if st.session_state.results:
+ st.markdown("---")
+ st.markdown("### 📊 Batch Analysis Results")
+
+ for i, result in enumerate(st.session_state.results):
+ ui_info = result.get('ui_info', {})
+ resume_name = ui_info.get('resume_filename', f'Resume {i+1}')
+ jd_name = ui_info.get('jd_filename', f'Job {i+1}')
+
+ # Determine overall score for color coding
+ overall_score = 0
+ if result.get("success"):
+ if 'enhanced_analysis' in result:
+ overall_score = result['enhanced_analysis']['relevance_scoring']['overall_score']
+ elif 'relevance_analysis' in result:
+ overall_score = result['relevance_analysis']['step_3_scoring_verdict']['final_score']
+
+ # Color coding for expander
+ score_emoji = "🟢" if overall_score >= 80 else "🟡" if overall_score >= 60 else "🔴"
+ expander_title = f"{score_emoji} **{resume_name}** vs **{jd_name}** - Score: {overall_score}/100"
+
+ with st.expander(expander_title, expanded=(i == 0)):  # First result expanded by default
+ if result.get("success"):
+ # Processing info
+ processing_info = result.get('processing_info', {})
+ processing_time = processing_info.get('processing_time', 0)
+ enhanced_mode = processing_info.get('enhanced_features', False)
+ database_saved = processing_info.get('database_saved', False)
+
+ # Show mode and status
+ col_info1, col_info2, col_info3 = st.columns(3)
+ with col_info1:
+ mode_color = "🚀" if enhanced_mode else "⚠️"
+ mode_text = "Enhanced" if enhanced_mode else "Standard"
+ if IS_HUGGINGFACE:
+ st.info(f"🤗 HuggingFace: {mode_text}")
+ else:
+ st.info(f"{mode_color} Mode: {mode_text}")
+ with col_info2:
+ st.info(f"⏱️ Time: {processing_time:.1f}s")
+ with col_info3:
+ db_status = "💾 Saved" if database_saved else "⚠️ Not Saved"
+ st.info(db_status)
+
+ if 'enhanced_analysis' in result:
+ # Enhanced analysis results
+ relevance = result['enhanced_analysis']['relevance_scoring']
+ job_parsing = result['enhanced_analysis']['job_parsing']
+
+ # Job info
+ st.markdown("#### 💼 Job Analysis")
+ job_col1, job_col2 = st.columns(2)
+ with job_col1:
+ st.markdown(f"**Role:** {job_parsing.get('role_title', 'Not specified')}")
+ st.markdown(f"**Experience:** {job_parsing.get('experience_required', 'Not specified')}")
+ with job_col2:
+ st.markdown(f"**Must-have Skills:** {len(job_parsing.get('must_have_skills', []))}")
+ st.markdown(f"**Good-to-have Skills:** {len(job_parsing.get('good_to_have_skills', []))}")
+
+ # Score metrics
+ score_cols = st.columns(4)
+ score_cols[0].metric("🏆 Overall Score", f"{relevance['overall_score']}/100")
+ score_cols[1].metric("🎯 Skill Match", f"{relevance['skill_match_score']:.1f}%")
+ score_cols[2].metric("💼 Experience Match", f"{relevance['experience_match_score']:.1f}%")
+ score_cols[3].metric("🧠 Confidence", f"{relevance['confidence']:.1f}%")
+
+ # Verdict
+ verdict = relevance['fit_verdict']
+ verdict_color = "#10B981" if "High" in verdict else "#F59E0B" if "Medium" in verdict else "#EF4444"
+ st.markdown(f"""
+ <div style="background: white; padding: 1rem; border-radius: 8px; border-left: 4px solid {verdict_color}; margin: 1rem 0;">
+ <h4 style="color: {verdict_color}; margin: 0;">{verdict}</h4>
+ <p style="color: #6B7280; margin: 0.5rem 0 0 0;">Confidence: {relevance['confidence']:.1f}%</p>
+ </div>
+ """, unsafe_allow_html=True)
877
+
+ # Tabs for detailed analysis
+ tab1, tab2, tab3 = st.tabs(["🎯 Skills Analysis", "💡 AI Recommendations", "📥 Download Report"])
+
+ with tab1:
+ skill_col1, skill_col2 = st.columns(2)
+
+ with skill_col1:
+ st.markdown("##### ✅ Matched Must-Have Skills")
+ matched_skills = relevance.get('matched_must_have', [])
+ if matched_skills:
+ skills_html = ''.join(f'<span class="skill-tag matched">{s}</span>' for s in matched_skills)
+ st.markdown(skills_html, unsafe_allow_html=True)
+ else:
+ st.info("No must-have skills matched")
+
+ with skill_col2:
+ st.markdown("##### ❌ Missing Must-Have Skills")
+ missing_skills = relevance.get('missing_must_have', [])
+ if missing_skills:
+ skills_html = ''.join(f'<span class="skill-tag missing">{s}</span>' for s in missing_skills)
+ st.markdown(skills_html, unsafe_allow_html=True)
+ else:
+ st.success("All required skills present!")
+
+ # Bonus skills
+ bonus_skills = relevance.get('matched_good_to_have', [])
+ if bonus_skills:
+ st.markdown("##### ⭐ Bonus Skills (Good to Have)")
+ bonus_html = ''.join(f'<span class="skill-tag bonus">{s}</span>' for s in bonus_skills)
+ st.markdown(bonus_html, unsafe_allow_html=True)
+
+ with tab2:
+ rec_col1, rec_col2 = st.columns(2)
+
+ with rec_col1:
+ st.markdown("##### 📈 Improvement Suggestions")
+ suggestions = relevance.get('improvement_suggestions', [])
+ if suggestions:
+ # Use a separate counter (n) so the outer result index i is not shadowed;
+ # the download-button keys below depend on i staying intact.
+ for n, suggestion in enumerate(suggestions, 1):
+ st.markdown(f"**{n}.** {suggestion}")
+ else:
+ st.info("No specific improvements suggested")
+
+ with rec_col2:
+ st.markdown("##### ⚡ Quick Wins")
+ quick_wins = relevance.get('quick_wins', [])
+ if quick_wins:
+ for n, win in enumerate(quick_wins, 1):
+ st.markdown(f"**{n}.** {win}")
+ else:
+ st.info("No quick wins identified")
929
+
+ with tab3:
+ export_data = {
+ "timestamp": datetime.now().isoformat(),
+ "files": {"resume": resume_name, "jd": jd_name},
+ "analysis": result
+ }
+
+ d_col1, d_col2, d_col3 = st.columns(3)
+ key_base = f"{resume_name}_{jd_name}_{i}".replace(" ", "_").replace(".", "_")
+
+ with d_col1:
+ st.download_button(
+ "📄 JSON Report",
+ json.dumps(export_data, indent=2),
+ f"analysis_{key_base}.json",
+ "application/json",
+ use_container_width=True,
+ key=f"json_{key_base}"
+ )
+
+ with d_col2:
+ st.download_button(
+ "📊 CSV Summary",
+ create_csv_export(export_data),
+ f"analysis_{key_base}.csv",
+ "text/csv",
+ use_container_width=True,
+ key=f"csv_{key_base}"
+ )
+
+ with d_col3:
+ st.download_button(
+ "📝 Text Report",
+ create_text_report(export_data),
+ f"report_{key_base}.txt",
+ "text/plain",
+ use_container_width=True,
+ key=f"txt_{key_base}"
+ )
969
+
+ else:
+ # Standard analysis results
+ st.warning("⚠️ Running in Standard Mode - Enhanced features disabled")
+
+ if 'relevance_analysis' in result:
+ relevance = result['relevance_analysis']
+ output = result['output_generation']
+
+ # Score metrics
+ score_cols = st.columns(4)
+ score_cols[0].metric("🏆 Final Score", f"{relevance['step_3_scoring_verdict']['final_score']}/100")
+ score_cols[1].metric("🎯 Hard Match", f"{relevance['step_1_hard_match']['coverage_score']:.1f}%")
+ score_cols[2].metric("🧠 Semantic Score", f"{relevance['step_2_semantic_match']['experience_alignment_score']}/10")
+ score_cols[3].metric("✅ Matches", f"{relevance['step_1_hard_match']['exact_matches']}")
+
+ # Verdict
+ verdict = output['verdict']
+ st.success(f"**Verdict:** {verdict}")
+
+ # Skills
+ skill_col1, skill_col2 = st.columns(2)
+
+ with skill_col1:
+ st.markdown("##### ✅ Matched Skills")
+ matched_skills = relevance['step_1_hard_match'].get('matched_skills', [])
+ if matched_skills:
+ skills_html = ''.join(f'<span class="skill-tag matched">{s}</span>' for s in matched_skills)
+ st.markdown(skills_html, unsafe_allow_html=True)
+ else:
+ st.info("No skills matched")
+
+ with skill_col2:
+ st.markdown("##### ❌ Missing Skills")
+ missing_skills = output.get('missing_skills', [])
+ if missing_skills:
+ skills_html = ''.join(f'<span class="skill-tag missing">{s}</span>' for s in missing_skills[:10])
+ st.markdown(skills_html, unsafe_allow_html=True)
+ else:
+ st.success("No missing skills identified")
+
+ else:
+ st.error(f"❌ Analysis failed: {result.get('error', 'Unknown error')}")
+
1013
+ # Analytics section
+ if st.session_state.results or backend_status["available"]:
+ st.markdown("---")
+ st.markdown("### 📈 Analytics Overview")
+
+ if backend_status["available"]:
+ analytics_result = safe_api_call("/analytics")
+
+ if analytics_result["success"]:
+ analytics = analytics_result["data"]
+
+ # Metrics
+ anal_col1, anal_col2 = st.columns(2)
+ with anal_col1:
+ st.metric("Total Analyses", analytics.get('total_analyses', 0))
+ st.metric("Average Score", f"{analytics.get('avg_score', 0):.1f}/100")
+ with anal_col2:
+ st.metric("High-Fit Rate", f"{analytics.get('success_rate', 0):.1f}%")
+ st.metric("High Matches", analytics.get('high_matches', 0))
+
+ # Simple chart if there's data and plotly is available
+ if PLOTLY_AVAILABLE and analytics.get('total_analyses', 0) > 0:
+ chart_data = pd.DataFrame({
+ 'Category': ['High Match', 'Medium Match', 'Low Match'],
+ 'Count': [
+ analytics.get('high_matches', 0),
+ analytics.get('medium_matches', 0),
+ analytics.get('low_matches', 0)
+ ]
+ })
+
+ if chart_data['Count'].sum() > 0:
+ fig = px.pie(
+ chart_data,
+ values='Count',
+ names='Category',
+ color_discrete_sequence=['#10B981', '#F59E0B', '#EF4444']
+ )
+ fig.update_layout(height=250, margin=dict(t=20, b=0, l=0, r=0))
+ st.plotly_chart(fig, use_container_width=True)
+ else:
+ st.warning(f"Analytics unavailable: {analytics_result['error']}")
+ else:
+ st.info("Backend required for analytics")
1057
+
+ else:
+ # System not ready - show waiting interface for HuggingFace
+ st.info("""
+ 🚀 **System Initialization in Progress**
+
+ The AI Resume Analyzer is starting up on HuggingFace Spaces.
+
+ **What's happening:**
+ - ⚡ FastAPI backend is initializing
+ - 💾 Database system is starting
+ - 🧠 AI components are loading
+ - 🎨 Interface is preparing
+
+ **Please wait 30-60 seconds and the system will be ready!**
+ """)
+
+ # Auto-refresh every 10 seconds
+ time.sleep(10)
+ st.rerun()
+
+ # Footer (updated for HuggingFace)
+ st.markdown("---")
+ if IS_HUGGINGFACE:
+ st.markdown("""
+ <div style="text-align: center; padding: 2rem; background: linear-gradient(135deg, #f8fafc 0%, #f1f5f9 100%);
+ border-radius: 12px; margin: 1rem 0;">
+ <div style="font-size: 1.5rem; font-weight: 700; color: #FF6B35; margin-bottom: 1rem;">
+ 🤗 AI Resume Analyzer
+ </div>
+ <div style="font-size: 1rem; color: #6B7280; margin-bottom: 1rem;">
+ <strong>Full-Stack AI System</strong> | Deployed on HuggingFace Spaces
+ </div>
+ <div style="font-size: 0.9rem; color: #9CA3AF;">
+ FastAPI Backend + Streamlit Frontend + SQLite Database<br>
+ Advanced Resume Analysis with Interactive History Management
+ </div>
+ </div>
+ """, unsafe_allow_html=True)
+ else:
+ st.markdown("""
+ <div style="text-align: center; padding: 1rem; color: var(--subtle-text-color);">
+ <strong>🏆 AI Resume Analyzer</strong> |
+ Built with Python, FastAPI & Streamlit |
+ Enhanced with Interactive History Management
+ </div>
+ """, unsafe_allow_html=True)
technical_overview.md ADDED
@@ -0,0 +1,27 @@
+ # Technical Architecture
+
+ ## Core Components
+ 1. **Resume/JD Parser**: PyMuPDF, python-docx, spaCy
+ 2. **Semantic Engine**: sentence-transformers, FAISS, cosine similarity
+ 3. **Fuzzy Matcher**: RapidFuzz for skill variations
+ 4. **LLM Integration**: OpenRouter + Grok for intelligent analysis
+ 5. **Scoring Engine**: TF-IDF, weighted algorithms
+ 6. **Web Interface**: FastAPI backend, Streamlit frontend
+
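To make the Fuzzy Matcher and Semantic Engine components concrete, here is a minimal, dependency-free sketch. The actual pipeline uses RapidFuzz and sentence-transformers embeddings; this version substitutes stdlib `difflib.SequenceMatcher` for fuzzy ratios and a bag-of-words cosine for the embedding cosine, purely as an illustration. The function names and the 0.85 threshold are our assumptions, not the project's tuned values.

```python
from difflib import SequenceMatcher
from collections import Counter
import math

def fuzzy_skill_match(resume_skills, jd_skills, threshold=0.85):
    """Match JD skills against resume skills, tolerating variations
    like 'PostgreSQL' vs 'Postgres' (stdlib stand-in for RapidFuzz)."""
    matched, missing = [], []
    for want in jd_skills:
        best = max(
            (SequenceMatcher(None, want.lower(), have.lower()).ratio()
             for have in resume_skills),
            default=0.0,
        )
        (matched if best >= threshold else missing).append(want)
    return matched, missing

def cosine_similarity(text_a, text_b):
    """Bag-of-words cosine; the real engine uses dense sentence embeddings."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

matched, missing = fuzzy_skill_match(
    ["Python", "Postgres", "Docker"],
    ["python", "PostgreSQL", "Kubernetes"],
)
print(matched, missing)  # ['python', 'PostgreSQL'] ['Kubernetes']
```

Swapping RapidFuzz's `fuzz.ratio` and a sentence-transformers model back in changes only the two scoring functions, not the matching logic.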
+ ## Data Flow
+ 1. File Upload → Text Extraction
+ 2. NLP Processing → Entity Extraction
+ 3. Multi-Stage Analysis:
+ - Hard Match (TF-IDF + Keywords)
+ - Semantic Match (Embeddings + Cosine)
+ - Fuzzy Match (Skill Variations)
+ - LLM Analysis (Context Understanding)
+ 4. Weighted Scoring → Final Verdict
+ 5. Recommendations Generation → Export Report
+
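Step 4 (Weighted Scoring → Final Verdict) amounts to a weighted blend of the four stage scores mapped onto the dashboard's three-band verdict. The weights and cut-offs below are illustrative assumptions, not the values the system ships with:

```python
def final_score(hard, semantic, fuzzy, llm,
                weights=(0.35, 0.30, 0.15, 0.20)):
    """Blend the four stage scores (each on a 0-100 scale) into one
    0-100 relevance score. Weights here are illustrative only."""
    w_hard, w_sem, w_fuzzy, w_llm = weights
    return hard * w_hard + semantic * w_sem + fuzzy * w_fuzzy + llm * w_llm

def verdict(score):
    """Map a 0-100 score onto the UI's High/Medium/Low bands
    (the same 80/60 cut-offs the dashboard uses for color coding)."""
    if score >= 80:
        return "High Fit"
    if score >= 60:
        return "Medium Fit"
    return "Low Fit"

s = final_score(hard=85, semantic=78, fuzzy=90, llm=70)  # ≈ 80.65
print(verdict(s))  # High Fit
```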
+ ## Scalability Features
+ - RESTful API design
+ - Async processing
+ - Vector database integration
+ - Modular architecture
+ - Cloud deployment ready
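"Async processing" means a batch of resume × JD pairs need not be analyzed one at a time. As a sketch of that pattern (the endpoint call is simulated and the score is a placeholder, not the system's API):

```python
import asyncio

async def analyze_pair(resume, jd):
    """Stand-in for an async HTTP call to the backend's analyze endpoint."""
    await asyncio.sleep(0.01)  # simulated network/model latency
    return {"resume": resume, "jd": jd, "score": 75}

async def analyze_batch(resumes, jds):
    # Fan out every resume x JD combination and await them together,
    # instead of looping sequentially as a synchronous client would.
    tasks = [analyze_pair(r, j) for r in resumes for j in jds]
    return await asyncio.gather(*tasks)

results = asyncio.run(analyze_batch(["a.pdf", "b.pdf"], ["jd1.txt"]))
print(len(results))  # 2
```

With N pairs and per-call latency t, the sequential loop costs roughly N·t while the gathered version costs roughly t, which is what makes batch uploads in the UI tractable.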