#!/usr/bin/env python3
"""
Diagnostic Script - Understanding Why Empty/Minimal Resumes Get High Scores
"""
from app import ATSCompatibilityAnalyzer

analyzer = ATSCompatibilityAnalyzer()

# Test cases to diagnose: (label, resume text, job description)
test_cases = [
    ("Empty Resume", "", "Software Engineer with Python experience"),
    ("Just Name", "John Doe", "Looking for a Data Analyst with SQL and Python"),
    ("Just 'Hi'", "Hi", "Senior Software Engineer"),
    ("Random Gibberish", "asdfghjkl qwertyuiop zxcvbnm", "Machine Learning Engineer"),
    ("Chef for ML Role", "Chef John - 10 years cooking - French cuisine, pastry", "Machine Learning Engineer with PhD and PyTorch"),
]

print("=" * 80)
print("DIAGNOSIS: WHY ARE EMPTY/MINIMAL RESUMES GETTING HIGH SCORES?")
print("=" * 80)

for name, resume, jd in test_cases:
    print(f"\n{'='*60}")
    print(f"TEST: {name}")
    # Truncate long inputs to 50 characters for display only
    print(f"Resume: '{resume[:50]}...' ({len(resume)} chars)" if len(resume) > 50 else f"Resume: '{resume}' ({len(resume)} chars)")
    print(f"JD: '{jd[:50]}...'" if len(jd) > 50 else f"JD: '{jd}'")
    print("-" * 60)

    result = analyzer.analyze(resume, jd)
    print(f"\nTOTAL SCORE: {result['total_score']}% <-- THIS IS THE PROBLEM!")

    # Show each metric's raw score and its weighted contribution
    print("\nBREAKDOWN (with weights):")
    weights = analyzer.weights
    for metric, score in result['breakdown'].items():
        weight = weights.get(metric, 0)
        weighted = score * weight
        print(f" {metric:20} = {score:5.1f}% × {weight:.2f} = {weighted:5.1f}")
    print(f"\n {'='*40}")
    print(f" WEIGHTED TOTAL: {result['total_score']}%")

print("\n\n" + "=" * 80)
print("ROOT CAUSE ANALYSIS")
print("=" * 80)
print("""
The scoring functions have ARTIFICIALLY HIGH BASELINES:
1. _format_score: baseline = 80 (even empty resume gets 80)
2. _section_score: baseline = 80 (even no sections gets 80)
3. _action_verb_score: baseline = 75 (0 verbs = 75%)
4. _quantification: baseline = 68 (0 numbers = 68%)
5. _tfidf_score: 60 + (raw * 0.45) (0% match = 60%)
6. _skills_match: baseline = 75 (0 matches = 75%)
7. _semantic_match: returns 75-85 default with no match

This design was meant to prevent "harsh" scoring but it's BROKEN:
- Empty resumes should score 0-10%, not 70%+
- Completely irrelevant resumes should score <30%, not 80%+

RECOMMENDED FIX:
- Remove artificial baselines
- Score from 0, not from 60-80
- Apply minimum thresholds for valid input
""")

print("\n" + "=" * 80)
print("WEIGHTS BEING USED:")
print("=" * 80)
for metric, weight in analyzer.weights.items():
    print(f" {metric:20} = {weight:.2f}")
print(f"\nTotal weights sum: {sum(analyzer.weights.values()):.2f}")
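
# The RECOMMENDED FIX printed above could be sketched roughly as follows.
# This is a minimal illustration and not the actual code in app.py: the
# helper names (is_scorable, remove_baseline, zero_based_total), the
# thresholds, and the metric keys are all hypothetical. It also assumes
# each metric currently spans [baseline, 100], which may not hold for
# every scorer (for example the TF-IDF formula listed above).

```python
from typing import Dict

# Hypothetical minimum-input thresholds; these values are assumptions
# and would need tuning against real resumes.
MIN_RESUME_CHARS = 200
MIN_RESUME_WORDS = 30


def is_scorable(resume: str) -> bool:
    """Return True only if the resume has enough text to be worth scoring."""
    text = resume.strip()
    return len(text) >= MIN_RESUME_CHARS and len(text.split()) >= MIN_RESUME_WORDS


def remove_baseline(score: float, baseline: float) -> float:
    """Map a score from the range [baseline, 100] back onto [0, 100]."""
    if baseline >= 100:
        raise ValueError("baseline must be below 100")
    rescaled = (score - baseline) / (100.0 - baseline) * 100.0
    # Clamp so scores below the old baseline do not go negative.
    return max(0.0, min(100.0, rescaled))


def zero_based_total(
    breakdown: Dict[str, float],
    weights: Dict[str, float],
    baselines: Dict[str, float],
    resume: str,
) -> float:
    """Weighted total that starts from 0 and rejects too-short input."""
    if not is_scorable(resume):
        return 0.0
    total_weight = sum(weights.get(metric, 0.0) for metric in breakdown)
    if total_weight == 0:
        return 0.0
    weighted = sum(
        remove_baseline(score, baselines.get(metric, 0.0)) * weights.get(metric, 0.0)
        for metric, score in breakdown.items()
    )
    # Normalise in case the weights do not sum to exactly 1.
    return weighted / total_weight
```

# With this shape, an empty resume returns 0.0 before any metric runs, and
# a metric that sits exactly at its old baseline contributes nothing to
# the total.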