metadata
title: Resume Verification System
emoji: 💻
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
Resume Verification System
An AI-powered CV analysis tool that identifies claims, verifies evidence, detects red flags, and generates targeted interview questions. Built with Gradio and Google Gemini 2.5 Flash for deployment on Hugging Face Spaces.
🎯 Key Features
Core Capabilities
- Smart Claim Extraction: Automatically identifies and categorizes all factual claims from resumes
- Multi-Tier Evidence Validation: Verifies claims through link checking, repository forensics, and cross-section triangulation
- Advanced Red Flag Detection: Identifies role-achievement mismatches, timeline inconsistencies, and implausible metrics
- SOTA Verification: Validates research claims against known state-of-the-art benchmarks
- Interview Question Generation: Creates targeted questions based on unverified claims and red flags
- Comprehensive Reporting: Exports detailed analysis in PDF, HTML, CSV, JSON, and interview checklist formats
Advanced Features
- Dual-Score Model: Credibility Score + Consistency Score with weighted final assessment
- Seniority-Aware Analysis: Adaptive thresholds based on candidate level (Intern/Junior/Mid/Senior/Lead)
- Repository Forensics: Deep analysis of GitHub/GitLab repositories including commit history and authorship
- Artifact Credibility Tiers: Weighted evidence scoring (DOI/arXiv > Corporate Blog > Personal Blog)
- Buzzword Detection: Identifies and penalizes vague claims and excessive buzzwords
- Timeline Validation: Detects employment gaps, overlapping positions, and technology anachronisms
- Bias Mitigation: Strips protected attributes and ensures fair assessment
Quick Start
Prerequisites
- Python 3.8+ (Gradio 5.x, the SDK version listed in the Space metadata, itself requires Python 3.10 or later)
- Google Gemini API key (get one from https://makersuite.google.com/app/apikey)
Installation
- Clone the repository:
git clone https://github.com/yourusername/resume_verifier.git
cd resume_verifier
- Install dependencies:
pip install -r requirements.txt
- Run the application:
python app.py
- Open a browser to http://localhost:7860
Usage Guide
Step 1: Initialize Session
- Enter your Gemini API key in the Setup tab
- Click "Initialize Session"
- Wait for confirmation message
Step 2: Upload Resume
- Go to the Analysis tab
- Upload a resume (PDF, DOCX, or TXT)
- Select seniority level
- Choose analysis strictness (Low/Medium/High)
- Enable deep analysis for thorough verification
Step 3: Run Analysis
- Click "Analyze Resume"
- Wait for progress completion (typically 30-60 seconds)
- Review the summary in the Analysis tab
Step 4: Review Results
- Results Dashboard: View credibility scores and evidence heatmap
- Interview Prep: Review red flags and generated interview questions
- Export: Download comprehensive reports in various formats
🏗️ System Architecture
resume_verifier/
├── app.py                      # Main Gradio application
├── requirements.txt            # Python dependencies
├── README.md                   # Documentation
├── config/
│   ├── __init__.py
│   └── prompts.py              # Gemini prompt templates
├── modules/
│   ├── __init__.py
│   ├── cv_parser.py            # Document parsing and text extraction
│   ├── claim_extractor.py      # Claim identification via Gemini
│   ├── evidence_validator.py   # Evidence scoring and validation
│   ├── red_flag_detector.py    # Red flag and inconsistency detection
│   └── sota_checker.py         # Research claim verification
├── utils/
│   ├── __init__.py
│   └── gemini_client.py        # Gemini API wrapper with caching
└── visualization/
    ├── __init__.py
    ├── evidence_heatmap.py     # Interactive visualizations
    └── report_generator.py     # Multi-format report generation
🔧 Configuration
Seniority Levels
- Intern: Lenient thresholds, expects limited evidence
- Junior: Basic verification, 1-3 years experience
- Mid: Standard verification, 3-5 years experience
- Senior: Strict verification, 5+ years experience
- Lead: Highest scrutiny for leadership claims
Strictness Levels
- Low: 0.7x severity multiplier, fewer red flags
- Medium: 1.0x standard detection (recommended)
- High: 1.3x severity multiplier, aggressive flagging
Evidence Tiers
| Tier | Weight | Examples |
|---|---|---|
| DOI/arXiv | 1.0 | doi.org, arxiv.org, ACM, IEEE |
| GitHub Active | 0.9 | GitHub/GitLab repositories with activity |
| Corporate Blog | 0.8 | Company engineering blogs |
| Portfolio | 0.7 | Personal portfolio sites |
| Personal Blog | 0.6 | Medium, dev.to, personal blogs |
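The tier table above can be read as a simple lookup from source type to weight. The following is a minimal sketch under stated assumptions, not the project's actual implementation: the function name `tier_weight`, the domain-to-tier mapping, and the `0.0` fallback for unknown hosts are all hypothetical, and the sketch does not check whether a repository actually has activity.

```python
from urllib.parse import urlparse

# Weights copied from the Evidence Tiers table.
TIER_WEIGHTS = {
    "doi_arxiv": 1.0,
    "github_active": 0.9,
    "corporate_blog": 0.8,
    "portfolio": 0.7,
    "personal_blog": 0.6,
}

# Illustrative (assumed) mapping from host to tier; a real classifier
# would need more rules, e.g. for corporate blogs and portfolio sites.
DOMAIN_TIERS = {
    "doi.org": "doi_arxiv",
    "arxiv.org": "doi_arxiv",
    "github.com": "github_active",
    "gitlab.com": "github_active",
    "medium.com": "personal_blog",
    "dev.to": "personal_blog",
}


def tier_weight(url: str) -> float:
    """Return the evidence weight for a URL, or 0.0 if the host is unknown."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    tier = DOMAIN_TIERS.get(host)
    return TIER_WEIGHTS[tier] if tier else 0.0
```

For example, `tier_weight("https://arxiv.org/abs/1706.03762")` falls in the DOI/arXiv tier, while an unrecognized host contributes no weight in this sketch.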
Scoring System
Final Score Calculation
Final Score = (Credibility × 0.6) + (Consistency × 0.4)
Risk Assessment
- Low Risk (75-100): Strong evidence, minimal concerns
- Medium Risk (50-74): Some unverified claims, standard verification needed
- High Risk (25-49): Multiple red flags, detailed verification required
- Critical Risk (0-24): Major concerns, consider rejection
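As a worked example of the weighting and the risk bands, a Credibility Score of 80 and a Consistency Score of 60 give 80 × 0.6 + 60 × 0.4 = 72, which falls in the Medium Risk band. A minimal sketch follows; the function names are hypothetical, both inputs are assumed to be on a 0-100 scale, and the handling of fractional scores between bands (for example 74.5) is an assumption.

```python
def final_score(credibility: float, consistency: float) -> float:
    """Weighted final score: 60% credibility, 40% consistency."""
    return credibility * 0.6 + consistency * 0.4


def risk_level(score: float) -> str:
    """Map a 0-100 score to the risk bands listed in this README."""
    if score >= 75:
        return "Low Risk"
    if score >= 50:
        return "Medium Risk"
    if score >= 25:
        return "High Risk"
    return "Critical Risk"


if __name__ == "__main__":
    score = final_score(80, 60)
    print(round(score, 1), risk_level(score))
```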
Red Flag Severity
- Critical (-30 points): Major inconsistencies or fabrications
- High (-20 points): Significant credibility issues
- Medium (-10 points): Moderate concerns requiring clarification
- Low (-5 points): Minor issues or vagueness
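One plausible way the severity penalties could combine with the strictness multipliers from the Configuration section is sketched below. This README does not specify how the two interact, so scaling the summed penalty by the multiplier and clamping the result at zero are assumptions, as are the function names.

```python
# Point deductions from the Red Flag Severity list.
SEVERITY_PENALTY = {"critical": 30, "high": 20, "medium": 10, "low": 5}

# Multipliers from the Strictness Levels list.
STRICTNESS_MULTIPLIER = {"low": 0.7, "medium": 1.0, "high": 1.3}


def total_penalty(flags, strictness="medium"):
    """Sum the penalties for a list of severities, scaled by strictness."""
    multiplier = STRICTNESS_MULTIPLIER[strictness]
    return sum(SEVERITY_PENALTY[severity] for severity in flags) * multiplier


def apply_penalties(base_score, flags, strictness="medium"):
    """Subtract the scaled penalty from a score, never going below zero."""
    return max(0.0, base_score - total_penalty(flags, strictness))
```

Under these assumptions, one high and one low flag at medium strictness deduct 25 points, and a single critical flag at high strictness deducts 39.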
Red Flag Categories
1. Role-Achievement Mismatch
- Leadership claims in junior roles
- Senior achievements without corresponding titles
- Complex projects with impossibly short tenures
2. Timeline Issues
- Overlapping full-time positions
- Employment gaps > 3 months
- Technologies used before public release
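The overlap and gap checks can be sketched with plain date arithmetic. This is an illustration only: the `Position` type and `find_timeline_issues` function are hypothetical names, "3 months" is approximated as 90 days, every position is treated as full-time, and the technology-anachronism check is not covered.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Position:
    title: str
    start: date
    end: date


def find_timeline_issues(positions, max_gap_days=90):
    """Return ("overlap" | "gap", earlier_title, later_title) tuples."""
    issues = []
    ordered = sorted(positions, key=lambda p: p.start)
    if not ordered:
        return issues
    # Track the position with the latest end date seen so far, so a long
    # role that spans several shorter ones is compared correctly.
    latest = ordered[0]
    for current in ordered[1:]:
        if current.start < latest.end:
            issues.append(("overlap", latest.title, current.title))
        elif (current.start - latest.end).days > max_gap_days:
            issues.append(("gap", latest.title, current.title))
        if current.end > latest.end:
            latest = current
    return issues
```

A real detector would also need to handle part-time roles, ongoing positions without an end date, and month-only dates as they typically appear on resumes.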
3. Metric Implausibility
- Improvements > 200% in short timeframes
- User numbers exceeding company scale
- SOTA claims beyond published benchmarks
4. Vagueness Indicators
- High buzzword density (>20%)
- No quantifiable metrics
- Generic descriptions without specifics
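Buzzword density can be estimated as the share of words that appear in a buzzword list. The sketch below assumes that definition; the word list, the tokenization, and the function names are hypothetical, and only the 20% threshold comes from this README.

```python
import re

# Small illustrative list; the project's real list is not documented here.
BUZZWORDS = {
    "synergy", "innovative", "dynamic", "passionate", "visionary",
    "leveraged", "results-driven", "cutting-edge", "world-class",
}


def buzzword_density(text: str) -> float:
    """Fraction of alphabetic tokens that are in BUZZWORDS (0.0 if no tokens)."""
    words = re.findall(r"[a-z][a-z\-]*", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in BUZZWORDS)
    return hits / len(words)


def is_vague(text: str, threshold: float = 0.20) -> bool:
    """Flag text whose buzzword density exceeds the threshold."""
    return buzzword_density(text) > threshold
```

A sentence such as "Reduced API latency from 120 ms to 45 ms" contains no listed buzzwords and is not flagged, while a line made up only of listed buzzwords has a density of 1.0.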
5. Over-claiming Patterns
- More than 15 "expert" level skills
- All projects claimed as "successful"
- Sole credit for team achievements
SOTA Benchmarks (2025)
Computer Vision
- ImageNet Accuracy: 92.8%
- COCO mAP: 65.5%
- CIFAR-10: 99.5%
NLP
- SQuAD F1: 97.8%
- GLUE Average: 94.2%
- WMT BLEU: 43.1
Speech
- LibriSpeech WER (clean): 0.7%
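A claim can be compared against these reference values with a small lookup. The sketch below is an assumption about how such a check might look, not the contents of `sota_checker.py`: the dictionary keys and the `exceeds_sota` function are hypothetical, and the direction flag encodes that word error rate improves downward while the other metrics improve upward.

```python
# (reference value, higher_is_better) using the numbers listed in this README.
SOTA = {
    "imagenet_accuracy": (92.8, True),
    "coco_map": (65.5, True),
    "cifar10_accuracy": (99.5, True),
    "squad_f1": (97.8, True),
    "glue_average": (94.2, True),
    "wmt_bleu": (43.1, True),
    "librispeech_wer_clean": (0.7, False),  # lower WER is better
}


def exceeds_sota(benchmark: str, claimed: float) -> bool:
    """True if the claimed value is better than the listed reference value."""
    reference, higher_is_better = SOTA[benchmark]
    return claimed > reference if higher_is_better else claimed < reference
```

A claim that exceeds the reference is not necessarily false, so a check like this is better treated as a prompt for an interview question than as proof of fabrication.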
🛡️ Security & Privacy
API Key Security
- Session-scoped storage only
- No persistent storage
- In-memory TTL management
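Session-scoped, in-memory storage with a time-to-live can be sketched as a dictionary of keys with expiry times. The class below is hypothetical and only illustrates the idea: the name `SessionKeyStore`, the one-hour default TTL, and the injectable clock are assumptions.

```python
import time


class SessionKeyStore:
    """Keep API keys in memory only, expiring them after ttl_seconds."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}   # session_id -> (api_key, expires_at)

    def put(self, session_id, api_key):
        self._entries[session_id] = (api_key, self._clock() + self._ttl)

    def get(self, session_id):
        entry = self._entries.get(session_id)
        if entry is None:
            return None
        api_key, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the key so it no longer lives in memory.
            del self._entries[session_id]
            return None
        return api_key
```

Because nothing is written to disk, restarting the process discards every stored key, which matches the "no persistent storage" point above.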
PII Protection
- Automatic redaction of phone/email/address
- Protected attribute removal
- RBAC for multi-user deployments
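Redaction of emails and phone numbers is commonly done with regular expressions. The following is a heuristic sketch, not the project's redaction code: the patterns, the `[EMAIL]` and `[PHONE]` placeholders, and the 10-digit minimum used to avoid masking year ranges such as "2019-2021" are all assumptions, and street addresses are not handled.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Candidate phone spans: digits mixed with spaces, dots, dashes, parentheses.
PHONE_RE = re.compile(r"(?<!\w)\+?\d[\d ().-]{7,}\d(?!\w)")


def _mask_phone(match):
    # Only mask spans with at least 10 digits (assumed threshold), so that
    # date ranges like "2019-2021" are left untouched.
    digits = sum(ch.isdigit() for ch in match.group())
    return "[PHONE]" if digits >= 10 else match.group()


def redact_pii(text: str) -> str:
    """Replace email addresses and likely phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub(_mask_phone, text)
```

Patterns like these miss unusual formats and can over-match long digit sequences, so they are best treated as a first pass rather than a guarantee.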
Rate Limiting
- 60 requests/minute to Gemini API
- Exponential backoff on failures
- Response caching to minimize API calls
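Exponential backoff and response caching can be combined in a thin wrapper around any API call. The sketch below is generic and does not reflect the actual `GeminiClient`: the function names, the retry count, the one-second base delay, and the prompt-keyed cache are assumptions, and the 60 requests/minute limit itself is not enforced here.

```python
import time


def call_with_backoff(fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with delays of base_delay * 2**attempt on failure."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:  # real code should catch only transient API errors
            if attempt == max_retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)


_cache = {}  # prompt -> response


def cached_call(prompt, fn, **backoff_kwargs):
    """Return a cached response for prompt, calling fn(prompt) only on a miss."""
    if prompt not in _cache:
        _cache[prompt] = call_with_backoff(lambda: fn(prompt), **backoff_kwargs)
    return _cache[prompt]
```

With the defaults, the waits between attempts are 1, 2, 4, and 8 seconds, and a repeated prompt is served from the cache without another API call.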
Deployment
Hugging Face Spaces
- Create new Space
- Select Gradio SDK
- Upload repository files
- Set environment variables:
GEMINI_API_KEY=your_key_here
- Deploy and share URL
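When the key is supplied as an environment variable, the application can read it at startup. The helper below is a hypothetical sketch; whether `app.py` reads `GEMINI_API_KEY` this way, or only accepts the key through the Setup tab, is not stated in this README.

```python
import os


def load_api_key(env=os.environ) -> str:
    """Read GEMINI_API_KEY from the environment, failing loudly if absent."""
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set")
    return key
```

Failing at startup with a clear message is usually easier to debug than a later authentication error from the API.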
Docker Deployment
FROM python:3.9-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD ["python", "app.py"]
API Usage
Programmatic Analysis
from modules.cv_parser import CVParser
from modules.claim_extractor import ClaimExtractor
from utils.gemini_client import GeminiClient
# Initialize
client = GeminiClient(api_key="your_key")
parser = CVParser()
extractor = ClaimExtractor(client)
# Analyze
parsed = parser.parse("resume.pdf")
claims = extractor.extract_claims(parsed, seniority_level="mid")
🧪 Testing
Run Tests
pytest tests/
Built with ❤️ for better hiring decisions