Spaces:
Running
Running
Update app.py
Browse files
app.py
CHANGED
|
@@ -3,19 +3,47 @@
|
|
| 3 |
|
| 4 |
import streamlit as st
|
| 5 |
import os
|
| 6 |
-
|
| 7 |
-
import
|
| 8 |
-
|
| 9 |
import hashlib
|
| 10 |
import json
|
|
|
|
|
|
|
| 11 |
from datetime import datetime
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 12 |
|
| 13 |
# Loading the .env keys
|
| 14 |
load_dotenv()
|
| 15 |
|
| 16 |
-
#
|
| 17 |
-
|
|
|
|
| 18 |
|
|
|
|
| 19 |
UNIVERSAL_MASTER_PROMPT = """
|
| 20 |
You are the ULTIMATE ATS OPTIMIZATION ENGINE 3.0 - A state-of-the-art AI system designed to provide CONSISTENT, PRECISE, and GLOBALLY APPLICABLE resume analysis across ALL industries, roles, and experience levels.
|
| 21 |
|
|
@@ -112,37 +140,17 @@ Always provide results in this EXACT format for consistency:
|
|
| 112 |
**π FINAL VERDICT:**
|
| 113 |
[EXCEPTIONAL 90-100 | STRONG 75-89 | GOOD 60-74 | DEVELOPING 45-59 | NEEDS WORK <45]
|
| 114 |
|
| 115 |
-
**GLOBAL INDUSTRY ADAPTATION MATRIX:**
|
| 116 |
-
Automatically adapt analysis based on detected industry context:
|
| 117 |
-
- Technology: Focus on technical skills, certifications, project impact
|
| 118 |
-
- Healthcare: Emphasize compliance, patient outcomes, clinical expertise
|
| 119 |
-
- Finance: Highlight risk management, regulatory knowledge, quantitative skills
|
| 120 |
-
- Manufacturing: Assess process improvement, safety, operational efficiency
|
| 121 |
-
- Marketing: Evaluate campaign results, digital proficiency, creative impact
|
| 122 |
-
- Education: Focus on learning outcomes, curriculum development, mentoring
|
| 123 |
-
- Legal: Emphasize case outcomes, regulatory expertise, research capabilities
|
| 124 |
-
- Consulting: Highlight client impact, analytical skills, strategic thinking
|
| 125 |
-
|
| 126 |
**CONSISTENCY GUARANTEES:**
|
| 127 |
- Same resume + same job description = identical analysis (Β±2 points variation max)
|
| 128 |
- Standardized language and terminology across all evaluations
|
| 129 |
- Reproducible scoring methodology regardless of domain
|
| 130 |
- Time-consistent results (same analysis today and tomorrow)
|
| 131 |
-
|
| 132 |
-
**QUALITY ASSURANCE CHECKS:**
|
| 133 |
-
- Bias detection and mitigation protocols
|
| 134 |
-
- Cultural sensitivity and inclusive language
|
| 135 |
-
- Legal compliance verification
|
| 136 |
-
- Ethical evaluation standards
|
| 137 |
-
|
| 138 |
-
Proceed with analysis using this framework while maintaining absolute consistency and global applicability.
|
| 139 |
"""
|
| 140 |
|
| 141 |
# Specialized prompts that extend the master prompt for specific use cases
|
| 142 |
SPECIALIZED_PROMPTS = {
|
| 143 |
"evaluate_resume": f"""
|
| 144 |
{UNIVERSAL_MASTER_PROMPT}
|
| 145 |
-
|
| 146 |
**SPECIFIC TASK: COMPREHENSIVE RESUME EVALUATION**
|
| 147 |
Apply the Universal Evaluation Framework above to provide a complete assessment.
|
| 148 |
Focus on overall candidacy evaluation with balanced perspective on strengths and development areas.
|
|
@@ -151,108 +159,87 @@ Maintain professional tone suitable for HR professionals and hiring managers.
|
|
| 151 |
|
| 152 |
"improve_skills": f"""
|
| 153 |
{UNIVERSAL_MASTER_PROMPT}
|
| 154 |
-
|
| 155 |
**SPECIFIC TASK: SKILL ENHANCEMENT STRATEGY**
|
| 156 |
After completing the standard evaluation, provide additional guidance:
|
| 157 |
-
|
| 158 |
**π SKILL DEVELOPMENT ROADMAP:**
|
| 159 |
- **Immediate Actions (0-3 months):** Quick wins and foundational improvements
|
| 160 |
- **Short-term Goals (3-12 months):** Structured learning and certification paths
|
| 161 |
- **Long-term Vision (1-3 years):** Strategic career advancement opportunities
|
| 162 |
-
|
| 163 |
**π LEARNING RESOURCES:**
|
| 164 |
- Recommended courses, certifications, and training programs
|
| 165 |
- Industry conferences and networking opportunities
|
| 166 |
- Practical projects and portfolio development suggestions
|
| 167 |
-
|
| 168 |
Focus on actionable, measurable improvement strategies with clear timelines.
|
| 169 |
""",
|
| 170 |
|
| 171 |
"missing_keywords": f"""
|
| 172 |
{UNIVERSAL_MASTER_PROMPT}
|
| 173 |
-
|
| 174 |
**SPECIFIC TASK: ATS KEYWORD OPTIMIZATION**
|
| 175 |
After completing the standard evaluation, provide enhanced keyword analysis:
|
| 176 |
-
|
| 177 |
**π ADVANCED KEYWORD ANALYSIS:**
|
| 178 |
- **CRITICAL MISSING (High Impact):** Essential terms significantly affecting ATS ranking
|
| 179 |
- **IMPORTANT ADDITIONS (Medium Impact):** Valuable terms improving visibility
|
| 180 |
- **OPTIMIZATION OPPORTUNITIES (Low Impact):** Supplementary terms for comprehensive coverage
|
| 181 |
-
|
| 182 |
**π INTEGRATION STRATEGY:**
|
| 183 |
- Specific resume sections for keyword placement
|
| 184 |
- Natural integration techniques avoiding keyword stuffing
|
| 185 |
- Industry-appropriate phrasing and terminology
|
| 186 |
-
|
| 187 |
**π€ ATS COMPATIBILITY SCORE:** [Detailed breakdown of parsing efficiency]
|
| 188 |
""",
|
| 189 |
|
| 190 |
"percentage_match": f"""
|
| 191 |
{UNIVERSAL_MASTER_PROMPT}
|
| 192 |
-
|
| 193 |
**SPECIFIC TASK: PRECISE MATCHING ANALYSIS**
|
| 194 |
Provide the standard evaluation with enhanced quantitative focus:
|
| 195 |
-
|
| 196 |
**π DETAILED SCORING BREAKDOWN:**
|
| 197 |
Present exact point allocation for each category with clear justification.
|
| 198 |
Include competitive benchmarking and market positioning analysis.
|
| 199 |
Provide specific improvement strategies for 10-15% score increase.
|
| 200 |
-
|
| 201 |
**π― MATCH PERCENTAGE: [XX%]**
|
| 202 |
Tier Classification with detailed rationale and next steps.
|
| 203 |
""",
|
| 204 |
|
| 205 |
"answer_query": f"""
|
| 206 |
{UNIVERSAL_MASTER_PROMPT}
|
| 207 |
-
|
| 208 |
**SPECIFIC TASK: EXPERT CONSULTATION**
|
| 209 |
Apply domain expertise to answer the specific query while considering:
|
| 210 |
- Resume content and job description context
|
| 211 |
- Industry best practices and current market trends
|
| 212 |
- Practical, actionable guidance
|
| 213 |
- Evidence-based recommendations
|
| 214 |
-
|
| 215 |
Provide thorough, well-researched responses with specific examples and multiple solution approaches when applicable.
|
| 216 |
""",
|
| 217 |
|
| 218 |
"executive_assessment": f"""
|
| 219 |
{UNIVERSAL_MASTER_PROMPT}
|
| 220 |
-
|
| 221 |
**SPECIFIC TASK: EXECUTIVE-LEVEL EVALUATION**
|
| 222 |
Apply enhanced criteria for senior leadership positions:
|
| 223 |
-
|
| 224 |
**π EXECUTIVE COMPETENCY FRAMEWORK:**
|
| 225 |
- Strategic thinking and vision development
|
| 226 |
- Change management and transformation leadership
|
| 227 |
- Financial acumen and business impact
|
| 228 |
- Board readiness and governance experience
|
| 229 |
-
|
| 230 |
**π LEADERSHIP IMPACT ANALYSIS:**
|
| 231 |
- Quantifiable business results and achievements
|
| 232 |
- Market expansion and competitive positioning
|
| 233 |
- Organizational culture and talent development
|
| 234 |
- Crisis leadership and resilience
|
| 235 |
-
|
| 236 |
Provide insights suitable for C-suite and board-level discussions.
|
| 237 |
""",
|
| 238 |
|
| 239 |
"career_transition": f"""
|
| 240 |
{UNIVERSAL_MASTER_PROMPT}
|
| 241 |
-
|
| 242 |
**SPECIFIC TASK: CAREER PIVOT ANALYSIS**
|
| 243 |
Evaluate career change feasibility with:
|
| 244 |
-
|
| 245 |
**π TRANSITION ASSESSMENT:**
|
| 246 |
- Transferable skills mapping across industries
|
| 247 |
- Market positioning strategy for career change
|
| 248 |
- Risk mitigation and success probability analysis
|
| 249 |
- Timeline and milestone planning
|
| 250 |
-
|
| 251 |
**π― TRANSITION ROADMAP:**
|
| 252 |
- Phase-wise transition strategy
|
| 253 |
- Skill development priorities
|
| 254 |
- Network building and industry immersion plan
|
| 255 |
-
|
| 256 |
Provide strategic guidance maximizing transition success while minimizing career risks.
|
| 257 |
"""
|
| 258 |
}
|
|
@@ -267,20 +254,218 @@ GENERATION_CONFIG = {
|
|
| 267 |
}
|
| 268 |
|
| 269 |
# Model options optimized for consistency and performance
|
| 270 |
-
|
| 271 |
-
"gemini-2.
|
| 272 |
-
"gemini-
|
|
|
|
|
|
|
| 273 |
]
|
| 274 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 275 |
def create_consistency_hash(resume_text, job_description, prompt_type):
    """Create a hash for identical inputs to ensure consistent outputs"""
    # Only the first 1000 chars of each document participate in the key, so
    # very long documents that differ beyond that point share a hash —
    # acceptable for a best-effort consistency cache, but worth knowing.
    content = f"{resume_text[:1000]}{job_description[:1000]}{prompt_type}"
    # MD5 is used as a cache key here, not as a security boundary.
    return hashlib.md5(content.encode()).hexdigest()
|
| 279 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 280 |
def get_consistent_gemini_response(model_id, prompt, pdf_content, input_text, consistency_hash):
|
| 281 |
-
"""Enhanced response generation with
|
| 282 |
try:
|
| 283 |
-
model = genai.GenerativeModel(
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 284 |
|
| 285 |
# Add consistency instruction to prompt
|
| 286 |
enhanced_prompt = f"""
|
|
@@ -289,43 +474,256 @@ def get_consistent_gemini_response(model_id, prompt, pdf_content, input_text, co
|
|
| 289 |
**CONSISTENCY PROTOCOL ACTIVE:**
|
| 290 |
Session ID: {consistency_hash}
|
| 291 |
Evaluation Date: {datetime.now().strftime('%Y-%m-%d')}
|
| 292 |
-
|
| 293 |
Apply identical methodology and scoring for consistent results.
|
| 294 |
Use deterministic analysis patterns and standardized language.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 295 |
"""
|
| 296 |
|
| 297 |
response = model.generate_content(
|
| 298 |
-
|
| 299 |
generation_config=genai.types.GenerationConfig(**GENERATION_CONFIG)
|
| 300 |
)
|
| 301 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 302 |
|
| 303 |
except Exception as e:
|
| 304 |
st.error(f"β οΈ Analysis Error: {str(e)}")
|
| 305 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 306 |
|
| 307 |
-
def
|
| 308 |
-
"""Enhanced PDF
|
| 309 |
text = ""
|
| 310 |
-
|
| 311 |
-
|
|
|
|
| 312 |
if doc.name.endswith(".pdf"):
|
| 313 |
-
|
| 314 |
-
|
| 315 |
-
|
| 316 |
-
|
|
|
|
| 317 |
try:
|
| 318 |
-
|
| 319 |
-
|
| 320 |
-
|
| 321 |
-
|
| 322 |
-
|
| 323 |
-
|
| 324 |
-
|
| 325 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 326 |
|
|
|
|
|
|
|
| 327 |
return text
|
| 328 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 329 |
# Streamlit App Configuration
|
| 330 |
st.set_page_config(
|
| 331 |
page_title="Ultimate Smart ATS System 2025",
|
|
@@ -371,6 +769,10 @@ st.markdown("""
|
|
| 371 |
display: inline-block;
|
| 372 |
margin: 0.5rem 0;
|
| 373 |
}
|
|
|
|
|
|
|
|
|
|
|
|
|
| 374 |
</style>
|
| 375 |
""", unsafe_allow_html=True)
|
| 376 |
|
|
@@ -383,18 +785,32 @@ st.markdown("""
|
|
| 383 |
</div>
|
| 384 |
""", unsafe_allow_html=True)
|
| 385 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 386 |
# Sidebar Configuration
|
| 387 |
with st.sidebar:
|
| 388 |
st.markdown("### π Configuration")
|
| 389 |
st.markdown("[Get your Google API Key](https://aistudio.google.com/app/apikey)")
|
| 390 |
|
| 391 |
api_key = st.text_input("π Google API Key", type="password", help="Your Gemini API key for AI analysis")
|
|
|
|
| 392 |
|
| 393 |
-
|
| 394 |
-
|
| 395 |
-
|
| 396 |
-
|
| 397 |
-
|
|
|
|
|
|
|
| 398 |
|
| 399 |
st.markdown("### π Document Upload")
|
| 400 |
uploaded_files = st.file_uploader(
|
|
@@ -413,12 +829,9 @@ with st.sidebar:
|
|
| 413 |
πΉ **Global Domain Support**: Works across all industries
|
| 414 |
πΉ **Advanced ATS Optimization**: 85% better callback rates
|
| 415 |
πΉ **Real-time Market Insights**: June 2025 standards
|
|
|
|
| 416 |
""")
|
| 417 |
|
| 418 |
-
# Set API key
|
| 419 |
-
if api_key:
|
| 420 |
-
genai.configure(api_key=api_key)
|
| 421 |
-
|
| 422 |
# Main Interface
|
| 423 |
st.markdown("### π Job Description Input")
|
| 424 |
input_text = st.text_area(
|
|
@@ -468,47 +881,72 @@ if analysis_triggered:
|
|
| 468 |
else:
|
| 469 |
# Process analysis
|
| 470 |
with st.spinner("π Analyzing with advanced AI algorithms..."):
|
| 471 |
-
pdf_content =
|
| 472 |
|
| 473 |
-
#
|
| 474 |
-
|
| 475 |
-
|
| 476 |
|
| 477 |
-
#
|
| 478 |
if evaluate_btn:
|
| 479 |
-
|
| 480 |
elif improve_btn:
|
| 481 |
-
|
| 482 |
elif keywords_btn:
|
| 483 |
-
|
| 484 |
elif match_btn:
|
| 485 |
-
|
| 486 |
elif executive_btn:
|
| 487 |
-
|
| 488 |
elif transition_btn:
|
| 489 |
-
|
| 490 |
elif query_btn:
|
| 491 |
-
|
| 492 |
-
|
| 493 |
-
# Generate response
|
| 494 |
-
response = get_consistent_gemini_response(
|
| 495 |
-
selected_model, prompt, pdf_content, input_text, consistency_hash
|
| 496 |
-
)
|
| 497 |
|
| 498 |
-
|
| 499 |
-
|
| 500 |
-
|
| 501 |
-
|
| 502 |
-
|
| 503 |
-
|
| 504 |
-
|
| 505 |
-
|
| 506 |
-
|
| 507 |
-
|
| 508 |
-
|
| 509 |
-
|
| 510 |
-
|
| 511 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 512 |
|
| 513 |
# Footer
|
| 514 |
st.markdown("---")
|
|
@@ -516,5 +954,10 @@ st.markdown("""
|
|
| 516 |
<div style="text-align: center; color: #666;">
|
| 517 |
<p>π Ultimate Smart ATS System 2025 | Powered by Advanced AI | Consistent β’ Reliable β’ Universal</p>
|
| 518 |
<p>Built with cutting-edge strategies for maximum ATS compatibility and career success</p>
|
|
|
|
| 519 |
</div>
|
| 520 |
-
""", unsafe_allow_html=True)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 3 |
|
| 4 |
import streamlit as st
|
| 5 |
import os
|
| 6 |
+
import re
|
| 7 |
+
import tempfile
|
| 8 |
+
import sqlite3
|
| 9 |
import hashlib
|
| 10 |
import json
|
| 11 |
+
import time
|
| 12 |
+
import pickle
|
| 13 |
from datetime import datetime
|
| 14 |
+
from pathlib import Path
|
| 15 |
+
from functools import wraps
|
| 16 |
+
from PyPDF2 import PdfReader
|
| 17 |
+
import google.generativeai as genai
|
| 18 |
+
from dotenv import load_dotenv
|
| 19 |
+
|
| 20 |
+
# Optional imports with fallbacks
|
| 21 |
+
try:
|
| 22 |
+
import pdfplumber
|
| 23 |
+
HAS_PDFPLUMBER = True
|
| 24 |
+
except ImportError:
|
| 25 |
+
HAS_PDFPLUMBER = False
|
| 26 |
+
|
| 27 |
+
try:
|
| 28 |
+
import docx
|
| 29 |
+
HAS_DOCX = True
|
| 30 |
+
except ImportError:
|
| 31 |
+
HAS_DOCX = False
|
| 32 |
+
|
| 33 |
+
try:
|
| 34 |
+
import tiktoken
|
| 35 |
+
HAS_TIKTOKEN = True
|
| 36 |
+
except ImportError:
|
| 37 |
+
HAS_TIKTOKEN = False
|
| 38 |
|
| 39 |
# Loading the .env keys
|
| 40 |
load_dotenv()
|
| 41 |
|
| 42 |
+
# Create cache directory
|
| 43 |
+
CACHE_DIR = Path("ats_cache")
|
| 44 |
+
CACHE_DIR.mkdir(exist_ok=True)
|
| 45 |
|
| 46 |
+
# MASTER UNIVERSAL SYSTEM PROMPT - Designed for Maximum Consistency & Global Applicability
|
| 47 |
UNIVERSAL_MASTER_PROMPT = """
|
| 48 |
You are the ULTIMATE ATS OPTIMIZATION ENGINE 3.0 - A state-of-the-art AI system designed to provide CONSISTENT, PRECISE, and GLOBALLY APPLICABLE resume analysis across ALL industries, roles, and experience levels.
|
| 49 |
|
|
|
|
| 140 |
**π FINAL VERDICT:**
|
| 141 |
[EXCEPTIONAL 90-100 | STRONG 75-89 | GOOD 60-74 | DEVELOPING 45-59 | NEEDS WORK <45]
|
| 142 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 143 |
**CONSISTENCY GUARANTEES:**
|
| 144 |
- Same resume + same job description = identical analysis (Β±2 points variation max)
|
| 145 |
- Standardized language and terminology across all evaluations
|
| 146 |
- Reproducible scoring methodology regardless of domain
|
| 147 |
- Time-consistent results (same analysis today and tomorrow)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 148 |
"""
|
| 149 |
|
| 150 |
# Specialized prompts that extend the master prompt for specific use cases
|
| 151 |
SPECIALIZED_PROMPTS = {
|
| 152 |
"evaluate_resume": f"""
|
| 153 |
{UNIVERSAL_MASTER_PROMPT}
|
|
|
|
| 154 |
**SPECIFIC TASK: COMPREHENSIVE RESUME EVALUATION**
|
| 155 |
Apply the Universal Evaluation Framework above to provide a complete assessment.
|
| 156 |
Focus on overall candidacy evaluation with balanced perspective on strengths and development areas.
|
|
|
|
| 159 |
|
| 160 |
"improve_skills": f"""
|
| 161 |
{UNIVERSAL_MASTER_PROMPT}
|
|
|
|
| 162 |
**SPECIFIC TASK: SKILL ENHANCEMENT STRATEGY**
|
| 163 |
After completing the standard evaluation, provide additional guidance:
|
|
|
|
| 164 |
**π SKILL DEVELOPMENT ROADMAP:**
|
| 165 |
- **Immediate Actions (0-3 months):** Quick wins and foundational improvements
|
| 166 |
- **Short-term Goals (3-12 months):** Structured learning and certification paths
|
| 167 |
- **Long-term Vision (1-3 years):** Strategic career advancement opportunities
|
|
|
|
| 168 |
**π LEARNING RESOURCES:**
|
| 169 |
- Recommended courses, certifications, and training programs
|
| 170 |
- Industry conferences and networking opportunities
|
| 171 |
- Practical projects and portfolio development suggestions
|
|
|
|
| 172 |
Focus on actionable, measurable improvement strategies with clear timelines.
|
| 173 |
""",
|
| 174 |
|
| 175 |
"missing_keywords": f"""
|
| 176 |
{UNIVERSAL_MASTER_PROMPT}
|
|
|
|
| 177 |
**SPECIFIC TASK: ATS KEYWORD OPTIMIZATION**
|
| 178 |
After completing the standard evaluation, provide enhanced keyword analysis:
|
|
|
|
| 179 |
**π ADVANCED KEYWORD ANALYSIS:**
|
| 180 |
- **CRITICAL MISSING (High Impact):** Essential terms significantly affecting ATS ranking
|
| 181 |
- **IMPORTANT ADDITIONS (Medium Impact):** Valuable terms improving visibility
|
| 182 |
- **OPTIMIZATION OPPORTUNITIES (Low Impact):** Supplementary terms for comprehensive coverage
|
|
|
|
| 183 |
**π INTEGRATION STRATEGY:**
|
| 184 |
- Specific resume sections for keyword placement
|
| 185 |
- Natural integration techniques avoiding keyword stuffing
|
| 186 |
- Industry-appropriate phrasing and terminology
|
|
|
|
| 187 |
**π€ ATS COMPATIBILITY SCORE:** [Detailed breakdown of parsing efficiency]
|
| 188 |
""",
|
| 189 |
|
| 190 |
"percentage_match": f"""
|
| 191 |
{UNIVERSAL_MASTER_PROMPT}
|
|
|
|
| 192 |
**SPECIFIC TASK: PRECISE MATCHING ANALYSIS**
|
| 193 |
Provide the standard evaluation with enhanced quantitative focus:
|
|
|
|
| 194 |
**π DETAILED SCORING BREAKDOWN:**
|
| 195 |
Present exact point allocation for each category with clear justification.
|
| 196 |
Include competitive benchmarking and market positioning analysis.
|
| 197 |
Provide specific improvement strategies for 10-15% score increase.
|
|
|
|
| 198 |
**π― MATCH PERCENTAGE: [XX%]**
|
| 199 |
Tier Classification with detailed rationale and next steps.
|
| 200 |
""",
|
| 201 |
|
| 202 |
"answer_query": f"""
|
| 203 |
{UNIVERSAL_MASTER_PROMPT}
|
|
|
|
| 204 |
**SPECIFIC TASK: EXPERT CONSULTATION**
|
| 205 |
Apply domain expertise to answer the specific query while considering:
|
| 206 |
- Resume content and job description context
|
| 207 |
- Industry best practices and current market trends
|
| 208 |
- Practical, actionable guidance
|
| 209 |
- Evidence-based recommendations
|
|
|
|
| 210 |
Provide thorough, well-researched responses with specific examples and multiple solution approaches when applicable.
|
| 211 |
""",
|
| 212 |
|
| 213 |
"executive_assessment": f"""
|
| 214 |
{UNIVERSAL_MASTER_PROMPT}
|
|
|
|
| 215 |
**SPECIFIC TASK: EXECUTIVE-LEVEL EVALUATION**
|
| 216 |
Apply enhanced criteria for senior leadership positions:
|
|
|
|
| 217 |
**π EXECUTIVE COMPETENCY FRAMEWORK:**
|
| 218 |
- Strategic thinking and vision development
|
| 219 |
- Change management and transformation leadership
|
| 220 |
- Financial acumen and business impact
|
| 221 |
- Board readiness and governance experience
|
|
|
|
| 222 |
**π LEADERSHIP IMPACT ANALYSIS:**
|
| 223 |
- Quantifiable business results and achievements
|
| 224 |
- Market expansion and competitive positioning
|
| 225 |
- Organizational culture and talent development
|
| 226 |
- Crisis leadership and resilience
|
|
|
|
| 227 |
Provide insights suitable for C-suite and board-level discussions.
|
| 228 |
""",
|
| 229 |
|
| 230 |
"career_transition": f"""
|
| 231 |
{UNIVERSAL_MASTER_PROMPT}
|
|
|
|
| 232 |
**SPECIFIC TASK: CAREER PIVOT ANALYSIS**
|
| 233 |
Evaluate career change feasibility with:
|
|
|
|
| 234 |
**π TRANSITION ASSESSMENT:**
|
| 235 |
- Transferable skills mapping across industries
|
| 236 |
- Market positioning strategy for career change
|
| 237 |
- Risk mitigation and success probability analysis
|
| 238 |
- Timeline and milestone planning
|
|
|
|
| 239 |
**π― TRANSITION ROADMAP:**
|
| 240 |
- Phase-wise transition strategy
|
| 241 |
- Skill development priorities
|
| 242 |
- Network building and industry immersion plan
|
|
|
|
| 243 |
Provide strategic guidance maximizing transition success while minimizing career risks.
|
| 244 |
"""
|
| 245 |
}
|
|
|
|
| 254 |
}
|
| 255 |
|
| 256 |
# Model options optimized for consistency and performance
|
| 257 |
+
MODEL_FALLBACK_CHAIN = [
|
| 258 |
+
"gemini-2.0-flash-exp",
|
| 259 |
+
"gemini-1.5-pro",
|
| 260 |
+
"gemini-1.5-flash",
|
| 261 |
+
"gemini-pro"
|
| 262 |
]
|
| 263 |
|
| 264 |
+
# Rate limiting decorator
|
| 265 |
+
def rate_limit(min_interval=2):
    """Decorator factory that enforces a minimum delay between calls.

    Args:
        min_interval: Minimum number of seconds that must elapse between
            the end of one invocation and the start of the next; callers
            arriving early sleep for the remainder.

    Returns:
        A decorator that throttles the wrapped function.
    """
    def decorator(func):
        # Mutable one-element list acts as a closure cell so the wrapper can
        # update the timestamp without `nonlocal`. -inf guarantees the very
        # first call never sleeps.
        last_called = [float("-inf")]

        @wraps(func)
        def wrapper(*args, **kwargs):
            # time.monotonic() is immune to system clock adjustments
            # (NTP steps, DST, manual changes); time.time() could make the
            # computed wait wrong or negative when the wall clock jumps.
            elapsed = time.monotonic() - last_called[0]
            left_to_wait = min_interval - elapsed
            if left_to_wait > 0:
                time.sleep(left_to_wait)
            result = func(*args, **kwargs)
            last_called[0] = time.monotonic()
            return result

        return wrapper
    return decorator
|
| 279 |
+
|
| 280 |
+
# Cache management functions
|
| 281 |
+
def init_cache():
    """Initialize the SQLite cache used for consistent repeat analyses.

    Creates the `analysis_cache` table if it does not exist; safe to call
    on every app start. The database file lives under CACHE_DIR.
    """
    conn = sqlite3.connect(CACHE_DIR / "ats_cache.db")
    try:
        # Using the connection as a context manager commits the transaction
        # on success (and rolls back on error) but does NOT close it — hence
        # the explicit finally, which the original omitted (connection leak
        # if execute raised).
        with conn:
            conn.execute("""
                CREATE TABLE IF NOT EXISTS analysis_cache (
                    hash_key TEXT PRIMARY KEY,
                    response TEXT,
                    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
                    model_used TEXT
                )
            """)
    finally:
        conn.close()
|
| 295 |
+
|
| 296 |
+
def get_cached_response(consistency_hash, model_id):
    """Return a previously cached analysis, or None on a miss or error.

    Args:
        consistency_hash: Key produced by create_consistency_hash().
        model_id: Model that produced the cached response; a hit requires
            both the hash and the model to match.

    Returns:
        The cached response text, or None when absent or the cache is
        unusable (a broken cache must never break the analysis flow).
    """
    try:
        conn = sqlite3.connect(CACHE_DIR / "ats_cache.db")
        try:
            row = conn.execute(
                "SELECT response FROM analysis_cache WHERE hash_key = ? AND model_used = ?",
                (consistency_hash, model_id),
            ).fetchone()
        finally:
            # Close deterministically even when the query fails; the
            # original leaked the connection on any error path.
            conn.close()
        return row[0] if row else None
    except Exception:
        # Narrowed from a bare `except:` so KeyboardInterrupt/SystemExit
        # are no longer swallowed; any ordinary failure is a cache miss.
        return None
|
| 310 |
+
|
| 311 |
+
def cache_response(consistency_hash, response, model_id):
    """Persist an analysis response so identical inputs reuse it later.

    Args:
        consistency_hash: Key produced by create_consistency_hash().
        response: The generated analysis text to store.
        model_id: Model that produced the response (part of the cache key).

    Failures are reported as a non-fatal Streamlit warning: caching is an
    optimization, never a requirement for the analysis flow.
    """
    try:
        conn = sqlite3.connect(CACHE_DIR / "ats_cache.db")
        try:
            # Connection-as-context-manager commits on success and rolls
            # back on error; the finally closes the handle, which the
            # original leaked whenever execute raised.
            with conn:
                conn.execute(
                    "INSERT OR REPLACE INTO analysis_cache (hash_key, response, model_used) VALUES (?, ?, ?)",
                    (consistency_hash, response, model_id),
                )
        finally:
            conn.close()
    except Exception as e:
        st.warning(f"Cache save failed: {e}")
|
| 324 |
+
|
| 325 |
+
# Token estimation and content optimization
|
| 326 |
+
def estimate_tokens(text, model="gpt-3.5-turbo"):
    """Estimate how many LLM tokens *text* will consume.

    Args:
        text: The text to measure.
        model: Model name passed to tiktoken's encoder lookup.

    Returns:
        An (approximate) token count. Uses tiktoken when installed for an
        accurate count; otherwise falls back to the common
        ~4-characters-per-token heuristic.
    """
    if HAS_TIKTOKEN:
        try:
            encoding = tiktoken.encoding_for_model(model)
            return len(encoding.encode(text))
        except Exception:
            # Narrowed from a bare `except:`: an unknown model name or
            # encoder failure falls through to the heuristic instead of
            # also swallowing KeyboardInterrupt/SystemExit.
            pass
    # Fallback estimation: roughly 4 characters per token
    return len(text) // 4
|
| 336 |
+
|
| 337 |
+
def optimize_content_length(resume_text, job_description, max_resume_tokens=2000, max_job_tokens=1500):
    """Optimize content length to stay within token limits.

    Args:
        resume_text: Raw resume text (newline-separated lines).
        job_description: Raw job description text.
        max_resume_tokens: Token budget for the condensed resume.
        max_job_tokens: Token budget for the condensed job description.

    Returns:
        Tuple of (optimized_resume_text, optimized_job) — both possibly
        truncated with a "... [truncated]" marker.
    """

    # Prioritize key sections in resume
    resume_sections = {
        'experience': [],
        'skills': [],
        'education': [],
        'summary': []
    }

    # Simple section detection: lines before any recognized heading land in
    # 'summary'; a line merely *containing* a keyword switches the section,
    # so a bullet mentioning "skills" mid-resume also switches — heuristic.
    lines = resume_text.split('\n')
    current_section = 'summary'

    for line in lines:
        line_lower = line.lower().strip()
        if any(keyword in line_lower for keyword in ['experience', 'work', 'employment']):
            current_section = 'experience'
        elif any(keyword in line_lower for keyword in ['skills', 'technical', 'competencies']):
            current_section = 'skills'
        elif any(keyword in line_lower for keyword in ['education', 'academic', 'degree']):
            current_section = 'education'

        # Blank lines are dropped; the heading line itself is kept in its
        # own section.
        if line.strip():
            resume_sections[current_section].append(line)

    # Build optimized resume content with per-section line and char caps.
    optimized_resume = []

    # Add summary (first 5 lines, capped at 300 chars)
    if resume_sections['summary']:
        summary_text = '\n'.join(resume_sections['summary'][:5])
        optimized_resume.append(f"PROFESSIONAL SUMMARY:\n{summary_text[:300]}")

    # Add experience (first 15 lines — prioritizes whatever appears first,
    # presumably the most recent roles; capped at 800 chars)
    if resume_sections['experience']:
        exp_text = '\n'.join(resume_sections['experience'][:15])
        optimized_resume.append(f"WORK EXPERIENCE:\n{exp_text[:800]}")

    # Add skills (first 8 lines, capped at 400 chars)
    if resume_sections['skills']:
        skills_text = '\n'.join(resume_sections['skills'][:8])
        optimized_resume.append(f"SKILLS:\n{skills_text[:400]}")

    # Add education (first 5 lines, capped at 200 chars)
    if resume_sections['education']:
        edu_text = '\n'.join(resume_sections['education'][:5])
        optimized_resume.append(f"EDUCATION:\n{edu_text[:200]}")

    optimized_resume_text = '\n\n'.join(optimized_resume)

    # Ensure we're within token limits
    resume_tokens = estimate_tokens(optimized_resume_text)
    if resume_tokens > max_resume_tokens:
        # Truncate if still too long; scale by the observed chars-per-token
        # ratio so the cut lands near the token budget.
        chars_per_token = len(optimized_resume_text) / resume_tokens
        max_chars = int(max_resume_tokens * chars_per_token)
        optimized_resume_text = optimized_resume_text[:max_chars] + "... [truncated]"

    # Optimize job description
    job_lines = job_description.split('\n')
    important_lines = []

    for line in job_lines:
        line_lower = line.lower()
        # Prioritize lines with key information; keyword lines are always
        # kept, while filler lines are kept only until 20 lines accumulate.
        if any(keyword in line_lower for keyword in [
            'require', 'must', 'essential', 'experience', 'skill',
            'qualification', 'bachelor', 'master', 'year', 'certification'
        ]):
            important_lines.append(line)
        elif line.strip() and len(important_lines) < 20:
            important_lines.append(line)

    optimized_job = '\n'.join(important_lines)

    # Ensure job description is within limits (same ratio-based truncation)
    job_tokens = estimate_tokens(optimized_job)
    if job_tokens > max_job_tokens:
        chars_per_token = len(optimized_job) / job_tokens
        max_chars = int(max_job_tokens * chars_per_token)
        optimized_job = optimized_job[:max_chars] + "... [truncated]"

    return optimized_resume_text, optimized_job
|
| 422 |
+
|
| 423 |
def create_consistency_hash(resume_text, job_description, prompt_type):
    """Derive a deterministic cache key from the analysis inputs.

    Only the first 1000 characters of the resume and of the job description
    participate, so small trailing differences in very long documents map
    to the same key. The MD5 digest serves as an opaque cache key, not as
    a security measure.
    """
    fingerprint = "".join((resume_text[:1000], job_description[:1000], prompt_type))
    return hashlib.md5(fingerprint.encode()).hexdigest()
|
| 427 |
|
| 428 |
+
def get_available_model():
    """Probe MODEL_FALLBACK_CHAIN and return the first usable model id.

    Each candidate is instantiated with permissive safety settings and
    issued a tiny deterministic test prompt; the first model that returns
    text wins. Note this performs one live API call per failing candidate.

    Returns:
        The model id string of the first responsive Gemini model.

    Raises:
        RuntimeError: If every model in the fallback chain fails.
    """
    for model in MODEL_FALLBACK_CHAIN:
        try:
            test_model = genai.GenerativeModel(
                model,
                safety_settings={
                    genai.types.HarmCategory.HARM_CATEGORY_HARASSMENT: genai.types.HarmBlockThreshold.BLOCK_NONE,
                    genai.types.HarmCategory.HARM_CATEGORY_HATE_SPEECH: genai.types.HarmBlockThreshold.BLOCK_NONE,
                    genai.types.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: genai.types.HarmBlockThreshold.BLOCK_NONE,
                    genai.types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: genai.types.HarmBlockThreshold.BLOCK_NONE,
                }
            )
            # Cheap liveness probe: short prompt, low temperature, tiny
            # output cap keeps the check fast and nearly free in tokens.
            test_response = test_model.generate_content(
                "Say 'OK'",
                generation_config=genai.types.GenerationConfig(
                    temperature=0.1,
                    max_output_tokens=10
                )
            )
            if test_response.text:
                return model
        except Exception:
            # Any failure (unknown model, quota, network) just moves us
            # down the fallback chain.
            continue

    # RuntimeError instead of bare Exception: callers can distinguish this
    # from programming errors, while existing `except Exception` handlers
    # still catch it.
    raise RuntimeError("No available Gemini models found")
|
| 455 |
+
|
| 456 |
+
@rate_limit(min_interval=2)
def get_consistent_gemini_response(model_id, prompt, pdf_content, input_text, consistency_hash):
    """Generate an analysis response with consistency framing and layered error handling.

    Args:
        model_id: Gemini model identifier (from get_available_model()).
        prompt: Base specialized analysis prompt for the selected analysis type.
        pdf_content: Extracted resume text (only the first 3000 chars are sent).
        input_text: Job description text (only the first 2000 chars are sent).
        consistency_hash: Session id embedded in the prompt so identical
            inputs are framed identically.

    Returns:
        The model's analysis text, or a human-readable warning string
        (prefixed with an alert marker) describing why no analysis was produced.
    """
    try:
        model = genai.GenerativeModel(
            model_id,
            safety_settings={
                genai.types.HarmCategory.HARM_CATEGORY_HARASSMENT: genai.types.HarmBlockThreshold.BLOCK_NONE,
                genai.types.HarmCategory.HARM_CATEGORY_HATE_SPEECH: genai.types.HarmBlockThreshold.BLOCK_NONE,
                genai.types.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: genai.types.HarmBlockThreshold.BLOCK_NONE,
                genai.types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: genai.types.HarmBlockThreshold.BLOCK_NONE,
            }
        )

        # Add consistency instruction to prompt.
        # NOTE(review): the diff view elided the lines between the f-string
        # opening and the consistency header; `{prompt}` is the reconstruction
        # (the parameter is otherwise unused) -- confirm against the original.
        enhanced_prompt = f"""
{prompt}

**CONSISTENCY PROTOCOL ACTIVE:**
Session ID: {consistency_hash}
Evaluation Date: {datetime.now().strftime('%Y-%m-%d')}
Apply identical methodology and scoring for consistent results.
Use deterministic analysis patterns and standardized language.

**RESUME CONTENT:**
{pdf_content[:3000]}

**JOB DESCRIPTION:**
{input_text[:2000]}
"""

        response = model.generate_content(
            enhanced_prompt,
            generation_config=genai.types.GenerationConfig(**GENERATION_CONFIG)
        )

        # Inspect the first candidate's finish_reason for targeted feedback.
        if hasattr(response, 'candidates') and response.candidates:
            candidate = response.candidates[0]

            if hasattr(candidate, 'finish_reason'):
                if candidate.finish_reason == 1:  # STOP - normal completion
                    return response.text if hasattr(response, 'text') and response.text else "Analysis completed but no content returned."
                elif candidate.finish_reason == 2:  # MAX_TOKENS
                    return "β οΈ Analysis truncated due to length. Please try with a shorter resume or job description."
                elif candidate.finish_reason == 3:  # SAFETY
                    return "β οΈ Content filtered for safety. Please review your input for any potentially problematic content."
                elif candidate.finish_reason == 4:  # RECITATION
                    return "β οΈ Content blocked due to recitation concerns. Please try rephrasing your input."
                else:
                    return f"β οΈ Generation stopped with reason: {candidate.finish_reason}"

            # No finish_reason attribute: try to get text anyway.
            try:
                return response.text if response.text else "No analysis content generated."
            except Exception:  # was a bare except: don't swallow SystemExit/KeyboardInterrupt
                return "Analysis completed but content could not be retrieved."

        return "No response candidates generated. Please try again."

    except Exception as e:
        st.error(f"β οΈ Analysis Error: {str(e)}")

        # Fallback with a simpler model configuration.
        # NOTE(review): "gemini-pro" is a legacy model name and may be
        # rejected by current API versions -- consider falling back to
        # MODEL_FALLBACK_CHAIN instead.
        try:
            simple_model = genai.GenerativeModel("gemini-pro")
            simple_prompt = f"Analyze this resume against the job description:\n\nResume: {pdf_content[:1000]}\n\nJob: {input_text[:1000]}"

            fallback_response = simple_model.generate_content(simple_prompt)
            return f"β οΈ Using fallback analysis:\n\n{fallback_response.text}"
        except Exception:  # was a bare except
            return "Unable to complete analysis. Please check your API key, reduce content length, and try again."
|
| 529 |
+
|
| 530 |
+
def clean_extracted_text(text):
    """Normalize text pulled out of PDF/DOCX extraction.

    Collapses whitespace, re-inserts spaces lost between fused words and
    around bullet glyphs, strips page markers, and unifies line endings
    before trimming the result.
    """
    substitutions = (
        (r'\n\s*\n\s*\n', '\n\n'),            # squash 3+ newlines to one blank line
        (r'[ \t]+', ' '),                      # collapse horizontal whitespace runs
        (r'([a-z])([A-Z])', r'\1 \2'),         # split words fused at case changes
        (r'(\w)([β’Β·βͺβ«])', r'\1 \2'),      # space before a bullet glyph
        (r'([β’Β·βͺβ«])(\w)', r'\1 \2'),      # space after a bullet glyph
        (r'--- Page \d+ ---', ''),             # drop markers added during extraction
    )
    for pattern, replacement in substitutions:
        text = re.sub(pattern, replacement, text)

    # Unify Windows/Mac line endings, then trim surrounding blank space.
    normalized = text.replace('\r\n', '\n').replace('\r', '\n')
    return normalized.strip()
|
| 551 |
|
| 552 |
+
def _extract_pdf_text(tmp_path):
    """Extract text from the PDF at *tmp_path*.

    Reads page by page with PdfReader, inserting page markers and repairing
    fused words; when the primary pass yields almost nothing (e.g. an
    image-heavy PDF) it falls back to pdfplumber if that package is present.
    """
    pdf_reader = PdfReader(tmp_path)
    extracted_text = ""

    for page_num, page in enumerate(pdf_reader.pages):
        # extract_text() can return None/empty for image-only pages --
        # guard it so the regex cleanup below cannot raise TypeError.
        page_text = page.extract_text() or ""

        # Clean up common PDF extraction issues.
        page_text = re.sub(r'\s+', ' ', page_text)  # normalize whitespace
        page_text = re.sub(r'([a-z])([A-Z])', r'\1 \2', page_text)  # split fused words

        extracted_text += f"\n--- Page {page_num + 1} ---\n{page_text}\n"

    # If extraction is poor, try the alternative extractor (best effort).
    if len(extracted_text.strip()) < 100 and HAS_PDFPLUMBER:
        try:
            with pdfplumber.open(tmp_path) as pdf:
                for page in pdf.pages:
                    page_text = page.extract_text()
                    if page_text:
                        extracted_text += page_text + "\n"
        except Exception:
            pass  # deliberate best-effort fallback; keep whatever we already have

    return extracted_text


def _extract_docx_text(tmp_path):
    """Extract paragraph and table text from the DOCX at *tmp_path*.

    Tables are flattened one row per line with cells joined by " | ".
    """
    doc_reader = docx.Document(tmp_path)
    text = ""

    for para in doc_reader.paragraphs:
        if para.text.strip():
            text += para.text + "\n"

    for table in doc_reader.tables:
        for row in table.rows:
            row_text = " | ".join([cell.text.strip() for cell in row.cells])
            if row_text.strip():
                text += row_text + "\n"

    return text


def enhanced_pdf_processing(pdf_docs):
    """Extract and clean text from uploaded PDF/DOCX files.

    Args:
        pdf_docs: Iterable of Streamlit UploadedFile objects.

    Returns:
        Cleaned, concatenated text of all successfully processed files.
        Per-file failures are reported via st.error and the file is skipped.
    """
    text = ""

    for doc in pdf_docs:
        try:
            # Case-insensitive extension check: ".PDF"/".Docx" uploads were
            # previously skipped silently.
            name = doc.name.lower()

            if name.endswith(".pdf"):
                # Persist the upload to disk so the PDF libraries can open it.
                with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp_file:
                    tmp_file.write(doc.getvalue())
                    tmp_path = tmp_file.name
                try:
                    text += _extract_pdf_text(tmp_path)
                finally:
                    os.unlink(tmp_path)  # always remove the temp copy

            elif name.endswith(".docx") and HAS_DOCX:
                try:
                    with tempfile.NamedTemporaryFile(delete=False, suffix=".docx") as tmp_file:
                        tmp_file.write(doc.getvalue())
                        tmp_path = tmp_file.name
                    try:
                        text += _extract_docx_text(tmp_path)
                    finally:
                        os.unlink(tmp_path)
                except Exception as e:
                    st.error(f"π Error processing DOCX {doc.name}: {str(e)}")

        except Exception as e:
            st.error(f"π Error processing {doc.name}: {str(e)}")
            continue

    # Clean and format the combined extracted text.
    return clean_extracted_text(text)
|
| 630 |
|
| 631 |
+
def validate_resume_content(text):
    """Sanity-check that extracted text plausibly came from a resume.

    Looks for common resume vocabulary and a minimum amount of content;
    surfaces a Streamlit warning and returns False when either check fails.
    """
    lowered = text.lower()

    # Vocabulary commonly present in resumes across industries.
    indicators = (
        'experience', 'education', 'skills', 'work', 'employment',
        'university', 'college', 'degree', 'certification', 'project',
        'email', 'phone', 'address', 'linkedin',
    )
    hits = sum(term in lowered for term in indicators)

    if hits < 3:
        st.warning("β οΈ The uploaded file may not be a resume. Please verify the content.")
        return False

    if len(text.strip()) < 200:
        st.warning("β οΈ The extracted text seems too short. Please check your file.")
        return False

    return True
|
| 653 |
+
|
| 654 |
+
def validate_configuration():
    """Collect configuration problems as user-facing status messages.

    Returns a list of strings: hard errors carry the error marker,
    optional-dependency hints carry the warning marker. An empty list
    means the system is fully configured.
    """
    issues = []

    # The API key may come from the environment or from the sidebar input.
    if not os.getenv("GOOGLE_API_KEY") and not st.session_state.get("api_key"):
        issues.append("β Google API Key not configured")

    # Optional extraction/estimation packages -- nice to have, not required.
    optional_checks = (
        (HAS_DOCX, "β οΈ Optional: Install python-docx for better DOCX support (pip install python-docx)"),
        (HAS_PDFPLUMBER, "β οΈ Optional: Install pdfplumber for better PDF extraction (pip install pdfplumber)"),
        (HAS_TIKTOKEN, "β οΈ Optional: Install tiktoken for better token estimation (pip install tiktoken)"),
    )
    for available, message in optional_checks:
        if not available:
            issues.append(message)

    return issues
|
| 673 |
+
|
| 674 |
+
@st.cache_data
def load_system_status():
    """Return the (cached) list of configuration issues.

    NOTE(review): st.cache_data memoizes this with no arguments, so the
    result will not refresh when st.session_state (e.g. the sidebar API
    key field) changes during a session -- confirm this is intended.
    """
    return validate_configuration()
|
| 679 |
+
|
| 680 |
+
def perform_enhanced_analysis(resume_text, job_description, analysis_type, custom_query=None):
    """Run a full ATS analysis: optimize inputs, consult the cache, query the model.

    Args:
        resume_text: Raw extracted resume text.
        job_description: Raw job description text.
        analysis_type: One of 'evaluate', 'improve', 'keywords', 'match',
            'executive', 'transition', 'custom' (unknown values fall back
            to the standard evaluation prompt).
        custom_query: Extra question appended to the prompt for 'custom' runs.

    Returns:
        Tuple of (response_text, consistency_hash).
    """
    init_cache()

    # Trim both inputs to their token budgets, then derive the cache key.
    trimmed_resume, trimmed_job = optimize_content_length(resume_text, job_description)
    cache_key = create_consistency_hash(trimmed_resume, trimmed_job, analysis_type)

    model_id = get_available_model()

    # Cache hit: identical inputs always return the stored analysis.
    prior = get_cached_response(cache_key, model_id)
    if prior:
        st.success("β‘ Retrieved from cache for consistency")
        return prior, cache_key

    # Resolve the specialized prompt for this analysis type.
    prompt_key = {
        "evaluate": "evaluate_resume",
        "improve": "improve_skills",
        "keywords": "missing_keywords",
        "match": "percentage_match",
        "executive": "executive_assessment",
        "transition": "career_transition",
        "custom": "answer_query",
    }.get(analysis_type, "evaluate_resume")
    base_prompt = SPECIALIZED_PROMPTS[prompt_key]

    if analysis_type == "custom" and custom_query:
        base_prompt = f"{base_prompt}\n\nSPECIFIC QUERY: {custom_query}"

    response = get_consistent_gemini_response(
        model_id, base_prompt, trimmed_resume, trimmed_job, cache_key
    )

    # Only store successful analyses; warning strings begin with the alert marker.
    if response and not response.startswith("β οΈ"):
        cache_response(cache_key, response, model_id)

    return response, cache_key
|
| 726 |
+
|
| 727 |
# Streamlit App Configuration
|
| 728 |
st.set_page_config(
|
| 729 |
page_title="Ultimate Smart ATS System 2025",
|
|
|
|
| 769 |
display: inline-block;
|
| 770 |
margin: 0.5rem 0;
|
| 771 |
}
|
| 772 |
+
.stButton > button {
|
| 773 |
+
height: 3rem;
|
| 774 |
+
font-weight: 600;
|
| 775 |
+
}
|
| 776 |
</style>
|
| 777 |
""", unsafe_allow_html=True)
|
| 778 |
|
|
|
|
| 785 |
</div>
|
| 786 |
""", unsafe_allow_html=True)
|
| 787 |
|
| 788 |
+
# Check system status
|
| 789 |
+
config_issues = load_system_status()
|
| 790 |
+
|
| 791 |
+
if config_issues:
|
| 792 |
+
with st.expander("β οΈ System Status", expanded=any("β" in issue for issue in config_issues)):
|
| 793 |
+
for issue in config_issues:
|
| 794 |
+
if "β" in issue:
|
| 795 |
+
st.error(issue)
|
| 796 |
+
else:
|
| 797 |
+
st.info(issue)
|
| 798 |
+
|
| 799 |
# Sidebar Configuration
|
| 800 |
with st.sidebar:
|
| 801 |
st.markdown("### π Configuration")
|
| 802 |
st.markdown("[Get your Google API Key](https://aistudio.google.com/app/apikey)")
|
| 803 |
|
| 804 |
api_key = st.text_input("π Google API Key", type="password", help="Your Gemini API key for AI analysis")
|
| 805 |
+
st.session_state["api_key"] = api_key
|
| 806 |
|
| 807 |
+
if api_key:
|
| 808 |
+
try:
|
| 809 |
+
genai.configure(api_key=api_key)
|
| 810 |
+
model_id = get_available_model()
|
| 811 |
+
st.success(f"β
Connected to {model_id}")
|
| 812 |
+
except Exception as e:
|
| 813 |
+
st.error(f"β API Key Error: {str(e)}")
|
| 814 |
|
| 815 |
st.markdown("### π Document Upload")
|
| 816 |
uploaded_files = st.file_uploader(
|
|
|
|
| 829 |
πΉ **Global Domain Support**: Works across all industries
|
| 830 |
πΉ **Advanced ATS Optimization**: 85% better callback rates
|
| 831 |
πΉ **Real-time Market Insights**: June 2025 standards
|
| 832 |
+
πΉ **Smart Caching**: Instant results for repeated analyses
|
| 833 |
""")
|
| 834 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 835 |
# Main Interface
|
| 836 |
st.markdown("### π Job Description Input")
|
| 837 |
input_text = st.text_area(
|
|
|
|
| 881 |
else:
|
| 882 |
# Process analysis
|
| 883 |
with st.spinner("π Analyzing with advanced AI algorithms..."):
|
| 884 |
+
pdf_content = enhanced_pdf_processing(uploaded_files)
|
| 885 |
|
| 886 |
+
# Validate content
|
| 887 |
+
if not validate_resume_content(pdf_content):
|
| 888 |
+
st.warning("β οΈ Please verify that your uploaded file is a valid resume.")
|
| 889 |
|
| 890 |
+
# Determine analysis type
|
| 891 |
if evaluate_btn:
|
| 892 |
+
analysis_type = "evaluate"
|
| 893 |
elif improve_btn:
|
| 894 |
+
analysis_type = "improve"
|
| 895 |
elif keywords_btn:
|
| 896 |
+
analysis_type = "keywords"
|
| 897 |
elif match_btn:
|
| 898 |
+
analysis_type = "match"
|
| 899 |
elif executive_btn:
|
| 900 |
+
analysis_type = "executive"
|
| 901 |
elif transition_btn:
|
| 902 |
+
analysis_type = "transition"
|
| 903 |
elif query_btn:
|
| 904 |
+
analysis_type = "custom"
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 905 |
|
| 906 |
+
try:
|
| 907 |
+
# Perform analysis
|
| 908 |
+
response, consistency_hash = perform_enhanced_analysis(
|
| 909 |
+
pdf_content, input_text, analysis_type, custom_query
|
| 910 |
+
)
|
| 911 |
+
|
| 912 |
+
# Display results
|
| 913 |
+
st.markdown("## π Analysis Results")
|
| 914 |
+
|
| 915 |
+
# Show metadata
|
| 916 |
+
col1, col2 = st.columns(2)
|
| 917 |
+
with col1:
|
| 918 |
+
st.markdown(f"**Consistency ID:** `{consistency_hash[:8]}`")
|
| 919 |
+
with col2:
|
| 920 |
+
st.markdown(f"**Analysis Type:** {analysis_type.title()}")
|
| 921 |
+
|
| 922 |
+
st.markdown("---")
|
| 923 |
+
st.markdown(response)
|
| 924 |
+
|
| 925 |
+
# Additional insights
|
| 926 |
+
st.markdown("### π‘ Pro Tips")
|
| 927 |
+
st.info("""
|
| 928 |
+
πΉ **Consistency**: Running the same analysis will yield identical results
|
| 929 |
+
πΉ **Optimization**: Use keyword suggestions to improve ATS compatibility
|
| 930 |
+
πΉ **Multi-Domain**: This system works across all industries and roles
|
| 931 |
+
πΉ **Latest Standards**: Analysis based on June 2025 best practices
|
| 932 |
+
πΉ **Caching**: Repeated analyses are retrieved instantly from cache
|
| 933 |
+
""")
|
| 934 |
+
|
| 935 |
+
# Show content optimization info
|
| 936 |
+
if st.checkbox("π Show Content Optimization Details"):
|
| 937 |
+
optimized_resume, optimized_job = optimize_content_length(pdf_content, input_text)
|
| 938 |
+
|
| 939 |
+
col1, col2 = st.columns(2)
|
| 940 |
+
with col1:
|
| 941 |
+
st.metric("Resume Tokens", estimate_tokens(optimized_resume))
|
| 942 |
+
st.metric("Original Resume Length", len(pdf_content))
|
| 943 |
+
with col2:
|
| 944 |
+
st.metric("Job Description Tokens", estimate_tokens(optimized_job))
|
| 945 |
+
st.metric("Original Job Length", len(input_text))
|
| 946 |
+
|
| 947 |
+
except Exception as e:
|
| 948 |
+
st.error(f"Analysis failed: {str(e)}")
|
| 949 |
+
st.info("Please try again with a shorter document or check your API key.")
|
| 950 |
|
| 951 |
# Footer
|
| 952 |
st.markdown("---")
|
|
|
|
| 954 |
<div style="text-align: center; color: #666;">
|
| 955 |
<p>π Ultimate Smart ATS System 2025 | Powered by Advanced AI | Consistent β’ Reliable β’ Universal</p>
|
| 956 |
<p>Built with cutting-edge strategies for maximum ATS compatibility and career success</p>
|
| 957 |
+
<small>Version 3.0 - Enhanced with Error Handling, Caching, and Content Optimization</small>
|
| 958 |
</div>
|
| 959 |
+
""", unsafe_allow_html=True)
|
| 960 |
+
|
| 961 |
+
# Initialize cache on startup
|
| 962 |
+
# Warm the response cache on direct execution; perform_enhanced_analysis
# also calls init_cache() itself, so this is a convenience, not a requirement.
if __name__ == "__main__":
    init_cache()
|