---
sdk: streamlit
---
# 🎯 AI Resume ATS Analyzer
A production-ready, AI-powered Resume Screening & ATS System built entirely on free and open-source tools.
---
## 🚀 Features
| Feature | Technology Used |
|---|---|
| PDF/DOCX Resume Parsing | PyMuPDF + python-docx |
| Named Entity Recognition | spaCy en_core_web_sm |
| Skills Extraction | Custom keyword taxonomy |
| Section Detection | Keyword + NLP heuristics |
| Resume Base Scoring | Custom rubric (100 pts) |
| Job Description Matching | Sentence-BERT (all-MiniLM-L6-v2) |
| ATS Score Calculation | Weighted formula |
| AI Resume Rewriting | FLAN-T5-base (Instruction-tuned) |
| Suggestions Engine | Rule-based + score-driven |
---
## 📁 Project Structure
```
resume_ats/
├── app.py              # Main Streamlit application
├── requirements.txt    # All dependencies
├── README.md           # This file
└── utils/
    ├── __init__.py
    ├── parser.py       # PDF/DOCX text extraction
    ├── nlp_utils.py    # NER, section detection, skills, suggestions
    ├── scorer.py       # Resume base score + ATS score
    ├── similarity.py   # Sentence-BERT job matching
    └── generator.py    # FLAN-T5 resume rewriting
```
---
## ⚙️ ATS Scoring Formula
```
Resume Base Score (0–100):
  Skills richness : up to 20 pts
  Experience      : up to 30 pts
  Projects        : up to 20 pts
  Education       : up to 10 pts
  Resume length   : up to 10 pts
  Diversity       : up to 10 pts

ATS Score = (0.6 × Resume Score) + (0.4 × Job Match %)
            (capped at 100)

Classification:
  ATS ≥ 70      → Good ✅
  45 ≤ ATS < 70 → Average ⚠️
  ATS < 45      → Poor ❌
```
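The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `utils/scorer.py`: the names `BASE_SCORE_CAPS`, `base_score`, `ats_score`, and `classify` are hypothetical, and how each component earns its points is not specified here, so the sketch only clamps and combines values it is given.

```python
# Maximum points per rubric component, copied from the formula above.
# The dictionary keys are illustrative names, not identifiers from the project.
BASE_SCORE_CAPS = {
    "skills": 20,
    "experience": 30,
    "projects": 20,
    "education": 10,
    "length": 10,
    "diversity": 10,
}


def base_score(points: dict[str, float]) -> float:
    """Sum the component points, clamping each one to the range [0, cap]."""
    total = 0.0
    for name, cap in BASE_SCORE_CAPS.items():
        total += min(max(points.get(name, 0.0), 0.0), cap)
    return total


def ats_score(resume_score: float, job_match_pct: float) -> float:
    """Blend base score (60%) and job match (40%), capped at 100."""
    return min(0.6 * resume_score + 0.4 * job_match_pct, 100.0)


def classify(score: float) -> str:
    """Map an ATS score to the three bands listed above."""
    if score >= 70:
        return "Good"
    if score >= 45:
        return "Average"
    return "Poor"


if __name__ == "__main__":
    score = ats_score(resume_score=80, job_match_pct=50)
    print(round(score, 1), classify(score))
```

With a base score of 80 and a job match of 50%, the blend is 0.6 × 80 + 0.4 × 50 = 68, which falls in the Average band.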
---
## 🖥️ Run Locally
### Prerequisites
- Python 3.10 or 3.11
- pip
### Installation
```bash
# Clone / download the project
cd resume_ats
# Install dependencies (takes ~3–5 minutes first time)
pip install -r requirements.txt
# Download spaCy language model
python -m spacy download en_core_web_sm
# Launch the app
streamlit run app.py
```
Open your browser at `http://localhost:8501`
---
## ☁️ Deploy on Hugging Face Spaces (Step-by-Step)
### Step 1: Create a Hugging Face Account
Go to [https://huggingface.co/join](https://huggingface.co/join) and sign up for a free account.
### Step 2: Create a New Space
1. Go to [https://huggingface.co/new-space](https://huggingface.co/new-space)
2. Fill in:
- **Space name:** `ai-resume-ats-analyzer` (or any name)
- **License:** MIT
- **SDK:** Select **Streamlit** ← Important!
- **Hardware:** CPU Basic (Free) is sufficient
3. Click **Create Space**
### Step 3: Upload Your Files
**Option A: Using the Web UI**
1. Click the **Files** tab in your Space
2. Click **Add file → Upload files**
3. Upload all files, maintaining this structure:
```
app.py
requirements.txt
utils/__init__.py
utils/parser.py
utils/nlp_utils.py
utils/scorer.py
utils/similarity.py
utils/generator.py
```
4. Commit the changes
**Option B: Using Git**
```bash
# Install git-lfs (for large model files if needed)
git lfs install
# Clone your Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/ai-resume-ats-analyzer
# Copy project files into the cloned directory
cp -r resume_ats/* ai-resume-ats-analyzer/
# Push to Hugging Face
cd ai-resume-ats-analyzer
git add .
git commit -m "Initial deployment"
git push
```
### Step 4: Configure for Hugging Face Spaces
Create a file called `packages.txt` in the root with:
```
# No system packages needed: all dependencies are Python packages
```
> **Note:** The spaCy model (`en_core_web_sm`) is downloaded automatically
> by `nlp_utils.py` on first run via `subprocess`. No manual step needed.
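
The load-or-download pattern described in the note could look roughly like the sketch below. It is an assumption about the approach, not a copy of `nlp_utils.py`: the function names `download_command` and `load_spacy_model` are hypothetical. It relies on `spacy.load` raising `OSError` when the model package is missing, and it runs the same `python -m spacy download` command used in the local setup.

```python
import importlib
import subprocess
import sys


def download_command(model_name: str) -> list[str]:
    """Build the command `python -m spacy download <model>` for the current interpreter."""
    return [sys.executable, "-m", "spacy", "download", model_name]


def load_spacy_model(model_name: str = "en_core_web_sm"):
    """Load a spaCy model, downloading it once if it is not installed yet."""
    import spacy  # imported lazily so this module can be imported without spaCy

    try:
        return spacy.load(model_name)
    except OSError:
        # spacy.load raises OSError when the model package cannot be found.
        subprocess.run(download_command(model_name), check=True)
        # Make the freshly installed package visible to the running process.
        importlib.invalidate_caches()
        return spacy.load(model_name)
```

Using `sys.executable` rather than a bare `python` keeps the download in the same environment the app runs in, which matters inside a Space container or a virtual environment.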
### Step 5: Monitor the Build
1. Go to your Space URL
2. Click the **Logs** tab to watch the build progress
3. First build takes ~5–10 minutes (installing packages + downloading models)
4. Once the status shows **Running ✅**, your app is live!
### Step 6: Access Your App
Your app will be available at:
```
https://huggingface.co/spaces/YOUR_USERNAME/ai-resume-ats-analyzer
```
---
## 🔧 Runtime Considerations
### First Run (Cold Start)
The first time the app runs, it will download:
- `en_core_web_sm`: ~12 MB (spaCy model)
- `all-MiniLM-L6-v2`: ~80 MB (Sentence-BERT)
- `google/flan-t5-base`: ~990 MB (text generation; about 250M parameters stored as 32-bit floats)
**Total: roughly 1.1 GB**, downloaded once, then cached.
On Hugging Face Spaces (CPU Basic):
- Model loading: ~30–60 seconds on first analysis
- Subsequent analyses: ~5–15 seconds (models cached in memory)
- AI rewrite generation: ~30–60 seconds on CPU
### Memory Usage
- CPU Basic (16GB RAM) on HF Spaces is sufficient
- All models run on CPU β€” no GPU required
### Caching
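Hugging Face libraries cache downloaded models under `~/.cache/huggingface/` by default. On Spaces, this cache survives restarts only if it lives on persistent storage; on the default ephemeral disk, the models are downloaded again after a restart or rebuild.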
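To check what is actually cached, a small standard-library helper can report the size of the cache directory. This is a sketch with hypothetical function names (`hf_cache_dir`, `dir_size_mb`). It assumes the cache root is `$HF_HOME` when that variable is set and `~/.cache/huggingface` otherwise, and it ignores other overrides such as `XDG_CACHE_HOME`.

```python
import os
from pathlib import Path


def hf_cache_dir() -> Path:
    """Return the cache root: $HF_HOME if set, else ~/.cache/huggingface."""
    default = Path.home() / ".cache" / "huggingface"
    return Path(os.environ.get("HF_HOME", default)).expanduser()


def dir_size_mb(root: Path) -> float:
    """Total size of regular files under root in MB (0.0 if root does not exist)."""
    if not root.exists():
        return 0.0
    # Skip symlinks so files that are linked from several places are counted once.
    total = sum(
        p.stat().st_size
        for p in root.rglob("*")
        if p.is_file() and not p.is_symlink()
    )
    return total / 1_000_000


if __name__ == "__main__":
    cache = hf_cache_dir()
    print(f"{cache}: {dir_size_mb(cache):.1f} MB")
```

Running it before and after the first analysis shows whether the models were downloaded and whether they are still present after a restart.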
---
## 📝 Important Notes
1. **Image-based PDFs** (scanned documents) will yield very little extracted text. Use text-based PDFs.
2. **FLAN-T5 rewriting** is limited to ~400 words of input because of the model's context window. For longer resumes, only the first section is rewritten.
3. **Privacy:** No data is sent to any external service. All processing happens where the app runs: on your machine or on your HF Space.
4. **Sentence-BERT similarity** compares meaning rather than exact wording, so it catches synonym matches that keyword overlap would miss.
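The ~400-word input limit from note 2 can be enforced with a simple word-level cut before text is passed to the model. This is an illustrative sketch; `truncate_words` is a hypothetical helper, not a function from `utils/generator.py`. The model's real limit is measured in tokens rather than words, so the word count is only an approximation.

```python
def truncate_words(text: str, max_words: int = 400) -> str:
    """Keep at most max_words whitespace-separated words.

    Runs of whitespace (including newlines) are collapsed to single spaces.
    """
    words = text.split()
    return " ".join(words[:max_words])


if __name__ == "__main__":
    long_text = "word " * 1000
    print(len(truncate_words(long_text).split()))
```

Text shorter than the limit passes through with only its whitespace normalized.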
---
## 🛠️ Customization
### Add more skills
Edit `TECHNICAL_SKILLS` and `SOFT_SKILLS` sets in `utils/nlp_utils.py`.
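For orientation, keyword-based skill extraction over such sets might work as in the sketch below. The set contents and the `find_skills` function are illustrative assumptions; the actual taxonomy and matching logic live in `utils/nlp_utils.py` and may differ.

```python
import re

# Tiny illustrative sets; the real ones are defined in utils/nlp_utils.py.
TECHNICAL_SKILLS = {"python", "sql", "docker", "machine learning", "c++"}
SOFT_SKILLS = {"leadership", "communication", "teamwork"}


def find_skills(text: str, vocabulary: set[str]) -> set[str]:
    """Return the vocabulary entries that occur in text as whole words or phrases."""
    lowered = text.lower()
    found = set()
    for skill in vocabulary:
        # Lookarounds instead of \b so entries ending in symbols (e.g. "c++")
        # still match, while "java" does not match inside "javascript".
        pattern = r"(?<![\w+#])" + re.escape(skill) + r"(?![\w+#])"
        if re.search(pattern, lowered):
            found.add(skill)
    return found


if __name__ == "__main__":
    resume = "Built ETL jobs in Python and SQL; strong communication skills."
    print(sorted(find_skills(resume, TECHNICAL_SKILLS)))
    print(sorted(find_skills(resume, SOFT_SKILLS)))
```

Entries are compared in lowercase, so under this assumption new skills would be added to the sets in lowercase.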
### Adjust scoring weights
Modify the `compute_base_score()` function in `utils/scorer.py`.
### Change the AI model
In `utils/generator.py`, replace `"google/flan-t5-base"` with a larger model like `"google/flan-t5-large"` for better quality (requires more RAM).
---
## 📄 License
MIT: free for personal and commercial use.