# AI Resume ATS Analyzer
A production-ready, AI-powered Resume Screening & ATS System built entirely on free and open-source tools.
## Features
| Feature | Technology Used |
|---|---|
| PDF/DOCX Resume Parsing | PyMuPDF + python-docx |
| Named Entity Recognition | spaCy en_core_web_sm |
| Skills Extraction | Custom keyword taxonomy |
| Section Detection | Keyword + NLP heuristics |
| Resume Base Scoring | Custom rubric (100 pts) |
| Job Description Matching | Sentence-BERT (all-MiniLM-L6-v2) |
| ATS Score Calculation | Weighted formula |
| AI Resume Rewriting | FLAN-T5-base (Instruction-tuned) |
| Suggestions Engine | Rule-based + score-driven |
## Project Structure
```
resume_ats/
├── app.py               # Main Streamlit application
├── requirements.txt     # All dependencies
├── README.md            # This file
└── utils/
    ├── __init__.py
    ├── parser.py        # PDF/DOCX text extraction
    ├── nlp_utils.py     # NER, section detection, skills, suggestions
    ├── scorer.py        # Resume base score + ATS score
    ├── similarity.py    # Sentence-BERT job matching
    └── generator.py     # FLAN-T5 resume rewriting
```
## ATS Scoring Formula
```
Resume Base Score (0-100):
  Skills richness : up to 20 pts
  Experience      : up to 30 pts
  Projects        : up to 20 pts
  Education       : up to 10 pts
  Resume length   : up to 10 pts
  Diversity       : up to 10 pts

ATS Score = (0.6 × Resume Score) + (0.4 × Job Match %)
            → capped at 100

Classification:
  ATS ≥ 70       → Good
  45 ≤ ATS < 70  → Average
  ATS < 45       → Poor
```
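
The formula and bands above can be sketched in a few lines of Python. This is a minimal illustration rather than the code in `utils/scorer.py`: the function names `ats_score` and `classify` are assumptions, and both inputs are taken to be on a 0-100 scale.

```python
def ats_score(resume_score: float, job_match_pct: float) -> float:
    """Combine the base score and job match (both 0-100) into an ATS score."""
    raw = 0.6 * resume_score + 0.4 * job_match_pct
    # Cap at 100 as described above; also guard against negative inputs.
    return max(0.0, min(100.0, raw))


def classify(score: float) -> str:
    """Map an ATS score to the three bands listed above."""
    if score >= 70:
        return "Good"
    if score >= 45:
        return "Average"
    return "Poor"


if __name__ == "__main__":
    score = ats_score(resume_score=75, job_match_pct=50)
    print(f"ATS score: {score:.1f} ({classify(score)})")
```

Because the two weights sum to 1.0, the cap only matters if an input falls outside the 0-100 range.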
## Run Locally
### Prerequisites
- Python 3.10 or 3.11
- pip
### Installation
```bash
# Clone / download the project
cd resume_ats
# Install dependencies (takes ~3-5 minutes the first time)
pip install -r requirements.txt
# Download spaCy language model
python -m spacy download en_core_web_sm
# Launch the app
streamlit run app.py
```
Then open your browser at http://localhost:8501.
## Deploy on Hugging Face Spaces (Step-by-Step)
### Step 1: Create a Hugging Face Account
Go to https://huggingface.co/join and sign up for a free account.
### Step 2: Create a New Space
1. Go to https://huggingface.co/new-space
2. Fill in:
   - Space name: `ai-resume-ats-analyzer` (or any name)
   - License: MIT
   - SDK: select Streamlit (important!)
   - Hardware: CPU Basic (free) is sufficient
3. Click Create Space
### Step 3: Upload Your Files
Option A: using the web UI
1. Click the Files tab in your Space
2. Click Add file → Upload files
3. Upload all files, maintaining this structure:
   ```
   app.py
   requirements.txt
   utils/__init__.py
   utils/parser.py
   utils/nlp_utils.py
   utils/scorer.py
   utils/similarity.py
   utils/generator.py
   ```
4. Commit the changes

Option B: using Git
```bash
# Install git-lfs (for large model files if needed)
git lfs install
# Clone your Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/ai-resume-ats-analyzer
# Copy project files into the cloned directory
cp -r resume_ats/* ai-resume-ats-analyzer/
# Push to Hugging Face
cd ai-resume-ats-analyzer
git add .
git commit -m "Initial deployment"
git push
```
### Step 4: Configure for Hugging Face Spaces
Create a file called `packages.txt` in the root with:
```
# No system packages needed - all Python
```
> Note: The spaCy model (`en_core_web_sm`) is downloaded automatically
> by `nlp_utils.py` on first run via subprocess. No manual step is needed.
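
A download-on-first-run check of this kind might look like the sketch below. It is an assumption about the general pattern, not a copy of `nlp_utils.py`; the helper name `ensure_spacy_model` is hypothetical. It relies on the fact that spaCy models are installed as ordinary Python packages, so their presence can be checked with `importlib`.

```python
import importlib.util
import subprocess
import sys


def ensure_spacy_model(name: str = "en_core_web_sm") -> bool:
    """Download the spaCy model if it is missing; return True if a download ran."""
    if importlib.util.find_spec(name) is not None:
        return False  # already installed as a Python package
    # Same command as the manual install step, run with the current interpreter.
    subprocess.run([sys.executable, "-m", "spacy", "download", name], check=True)
    return True
```

Calling `ensure_spacy_model()` once at import time, before `spacy.load("en_core_web_sm")`, keeps later runs from repeating the download.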
### Step 5: Monitor the Build
1. Go to your Space URL
2. Click the Logs tab to watch the build progress
3. The first build takes ~5-10 minutes (installing packages and downloading models)
4. Once the status shows Running, your app is live!
### Step 6: Access Your App
Your app will be available at:
https://huggingface.co/spaces/YOUR_USERNAME/ai-resume-ats-analyzer
## Runtime Considerations
### First Run (Cold Start)
The first time the app runs, it will download:
- `en_core_web_sm`: 12 MB (spaCy model)
- `all-MiniLM-L6-v2`: ~80 MB (Sentence-BERT)
- `google/flan-t5-base`: ~250 MB (text generation)

Total: ~342 MB, downloaded once and then cached.

On Hugging Face Spaces (CPU Basic):
- Model loading: ~30-60 seconds on first analysis
- Subsequent analyses: ~5-15 seconds (models cached in memory)
- AI rewrite generation: ~30-60 seconds on CPU
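
The gap between the first and later analyses comes from loading each model once and reusing it. The sketch below shows that load-once pattern with the standard library's `functools.lru_cache`. The `load_model` function and its fake loader are hypothetical stand-ins, and whether `app.py` uses this exact mechanism is not stated here; in a Streamlit app the `st.cache_resource` decorator plays a similar role.

```python
import time
from functools import lru_cache


@lru_cache(maxsize=None)
def load_model(name: str) -> dict:
    """Pretend to load a model; the sleep stands in for reading weights from disk."""
    time.sleep(0.05)  # placeholder for the slow first load
    return {"name": name}


if __name__ == "__main__":
    for attempt in ("first", "second"):
        start = time.perf_counter()
        load_model("all-MiniLM-L6-v2")
        print(f"{attempt} call took {time.perf_counter() - start:.3f}s")
```

The second call returns the same object without running the loader again, which is why only the first analysis pays the loading cost.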
### Memory Usage
- CPU Basic (16 GB RAM) on HF Spaces is sufficient
- All models run on CPU; no GPU is required
### Caching
Hugging Face caches models to `~/.cache/huggingface/`. Models persist between restarts on Spaces with persistent storage.
## Important Notes
- Image-based PDFs (scanned documents) yield very little extracted text. Use text-based PDFs.
- FLAN-T5 rewriting is limited to ~400 words of input due to the model's context window. For longer resumes, only the first section is rewritten.
- Privacy: No data is sent to any external server. All processing happens locally or on your HF Space.
- Sentence-BERT similarity uses semantic understanding, so it catches synonym matches that keyword overlap would miss.
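
The ~400-word input limit noted above can be applied with a simple word-level cut before generation. This is a minimal sketch: `truncate_words` is a hypothetical helper, not necessarily how `utils/generator.py` handles long input, and a token-based cut using the model's tokenizer would track the real limit more closely.

```python
def truncate_words(text: str, max_words: int = 400) -> str:
    """Return at most `max_words` whitespace-separated words of `text`."""
    words = text.split()
    if len(words) <= max_words:
        return text  # short input is passed through unchanged
    return " ".join(words[:max_words])
```

Short resumes pass through untouched; longer ones keep only their first `max_words` words.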
## Customization
### Add more skills
Edit the `TECHNICAL_SKILLS` and `SOFT_SKILLS` sets in `utils/nlp_utils.py`.
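
To show how entries in these sets might be matched against resume text, here is a minimal keyword-matching sketch. The set contents and the `extract_skills` helper are illustrative assumptions; the real taxonomy and matching logic in `utils/nlp_utils.py` may differ.

```python
import re

# Illustrative contents only; the real sets in utils/nlp_utils.py may differ.
TECHNICAL_SKILLS = {"python", "sql", "docker", "machine learning"}
SOFT_SKILLS = {"communication", "leadership"}


def extract_skills(text: str, taxonomy: set[str]) -> set[str]:
    """Return taxonomy entries that appear in `text` as whole words or phrases."""
    lowered = text.lower()
    found = set()
    for skill in taxonomy:
        # Lookarounds act like word boundaries but also work for skills such as "c++".
        pattern = r"(?<!\w)" + re.escape(skill) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.add(skill)
    return found


if __name__ == "__main__":
    sample = "Built Python and SQL pipelines; strong communication."
    print(sorted(extract_skills(sample, TECHNICAL_SKILLS | SOFT_SKILLS)))
```

With this approach, adding a skill is a one-line change to the relevant set. Entries should be lowercase because the text is lowercased before matching.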
### Adjust scoring weights
Modify the `compute_base_score()` function in `utils/scorer.py`.
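
One way to keep the weights easy to adjust is to hold the per-category caps in a single dictionary. The sketch below uses the caps from the ATS Scoring Formula section; the function name `compute_base_score_sketch`, its signature, and the category keys are assumptions for illustration and do not reflect the actual body of `compute_base_score()`.

```python
# Per-category caps taken from the rubric in this README; they sum to 100.
MAX_POINTS = {
    "skills": 20,
    "experience": 30,
    "projects": 20,
    "education": 10,
    "length": 10,
    "diversity": 10,
}


def compute_base_score_sketch(raw_points: dict[str, float]) -> float:
    """Sum per-category points, clamping each to the range 0..cap."""
    total = 0.0
    for category, cap in MAX_POINTS.items():
        points = raw_points.get(category, 0.0)  # missing categories score 0
        total += max(0.0, min(float(cap), points))
    return total
```

If you change a cap, adjust the others so the total stays at 100; otherwise the ATS formula's 0-100 assumption no longer holds.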
### Change the AI model
In `utils/generator.py`, replace `"google/flan-t5-base"` with a larger model such as `"google/flan-t5-large"` for better quality (requires more RAM).
## License
MIT: free for personal and commercial use.