---
sdk: streamlit
---

# 🎯 AI Resume ATS Analyzer

A production-ready, AI-powered Resume Screening & ATS System built entirely on free and open-source tools.

---

## 🚀 Features

| Feature | Technology Used |
|---|---|
| PDF/DOCX Resume Parsing | PyMuPDF + python-docx |
| Named Entity Recognition | spaCy en_core_web_sm |
| Skills Extraction | Custom keyword taxonomy |
| Section Detection | Keyword + NLP heuristics |
| Resume Base Scoring | Custom rubric (100 pts) |
| Job Description Matching | Sentence-BERT (all-MiniLM-L6-v2) |
| ATS Score Calculation | Weighted formula |
| AI Resume Rewriting | FLAN-T5-base (Instruction-tuned) |
| Suggestions Engine | Rule-based + score-driven |

---

## 📁 Project Structure

```
resume_ats/
├── app.py              # Main Streamlit application
├── requirements.txt    # All dependencies
├── README.md           # This file
└── utils/
    ├── __init__.py
    ├── parser.py       # PDF/DOCX text extraction
    ├── nlp_utils.py    # NER, section detection, skills, suggestions
    ├── scorer.py       # Resume base score + ATS score
    ├── similarity.py   # Sentence-BERT job matching
    └── generator.py    # FLAN-T5 resume rewriting
```

---

## ⚙️ ATS Scoring Formula

```
Resume Base Score (0–100):
  Skills richness : up to 20 pts
  Experience      : up to 30 pts
  Projects        : up to 20 pts
  Education       : up to 10 pts
  Resume length   : up to 10 pts
  Diversity       : up to 10 pts

ATS Score = (0.6 × Resume Score) + (0.4 × Job Match %), capped at 100%

Classification:
  ATS ≥ 70       → Good ✅
  45 ≤ ATS < 70  → Average ⚠️
  ATS < 45       → Poor ❌
```

---

## 🖥️ Run Locally

### Prerequisites

- Python 3.10 or 3.11
- pip

### Installation

```bash
# Clone / download the project
cd resume_ats

# Install dependencies (takes ~3–5 minutes first time)
pip install -r requirements.txt

# Download spaCy language model
python -m spacy download en_core_web_sm

# Launch the app
streamlit run app.py
```

Open your browser at `http://localhost:8501`.

---

## ☁️ Deploy on Hugging Face Spaces (Step-by-Step)

### Step 1: Create a Hugging Face Account

Go to
[https://huggingface.co/join](https://huggingface.co/join) and sign up for a free account.

### Step 2: Create a New Space

1. Go to [https://huggingface.co/new-space](https://huggingface.co/new-space)
2. Fill in:
   - **Space name:** `ai-resume-ats-analyzer` (or any name)
   - **License:** MIT
   - **SDK:** Select **Streamlit** ← Important!
   - **Hardware:** CPU Basic (Free) is sufficient
3. Click **Create Space**

### Step 3: Upload Your Files

**Option A: Using the Web UI**

1. Click the **Files** tab in your Space
2. Click **Add file → Upload files**
3. Upload all files, maintaining this structure:
   ```
   app.py
   requirements.txt
   utils/__init__.py
   utils/parser.py
   utils/nlp_utils.py
   utils/scorer.py
   utils/similarity.py
   utils/generator.py
   ```
4. Commit the changes

**Option B: Using Git**

```bash
# Install git-lfs (for large model files if needed)
git lfs install

# Clone your Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/ai-resume-ats-analyzer

# Copy project files into the cloned directory
cp -r resume_ats/* ai-resume-ats-analyzer/

# Push to Hugging Face
cd ai-resume-ats-analyzer
git add .
git commit -m "Initial deployment"
git push
```

### Step 4: Configure for Hugging Face Spaces

Create a file called `packages.txt` in the root with:

```
# No system packages needed (all Python)
```

> **Note:** The spaCy model (`en_core_web_sm`) is downloaded automatically
> by `nlp_utils.py` on first run via `subprocess`. No manual step needed.

### Step 5: Monitor the Build

1. Go to your Space URL
2. Click the **Logs** tab to watch the build progress
3. The first build takes ~5–10 minutes (installing packages + downloading models)
4. Once the status shows **Running ✅**, your app is live!

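
The note in Step 4 says that `nlp_utils.py` fetches the spaCy model on first run via `subprocess`. A minimal sketch of how such a fallback might look is given here. The function names `download_model` and `load_model`, and the injectable `load` and `download` parameters, are hypothetical and not taken from the project's code; the download command is the same one used in the local installation steps.

```python
import subprocess
import sys


def download_model(name: str) -> None:
    """Install a spaCy model via `python -m spacy download <name>`."""
    subprocess.run([sys.executable, "-m", "spacy", "download", name], check=True)


def load_model(name: str = "en_core_web_sm", load=None, download=download_model):
    """Return a loaded pipeline, downloading the model once if it is missing.

    `load` and `download` are hypothetical hooks so the logic can be
    exercised without spaCy installed; by default `spacy.load` is used.
    """
    if load is None:
        import spacy  # imported lazily so this module imports without spaCy

        load = spacy.load
    try:
        return load(name)
    except OSError:
        # spacy.load raises OSError when the model package is not installed
        download(name)
        return load(name)
```

In an app this would typically be called once at startup (for example `nlp = load_model()`), so the download cost is paid only during the first cold start.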
### Step 6: Access Your App

Your app will be available at:

```
https://huggingface.co/spaces/YOUR_USERNAME/ai-resume-ats-analyzer
```

---

## 🔧 Runtime Considerations

### First Run (Cold Start)

The first time the app runs, it will download:

- `en_core_web_sm`: ~12 MB (spaCy model)
- `all-MiniLM-L6-v2`: ~80 MB (Sentence-BERT)
- `google/flan-t5-base`: ~990 MB (text generation, about 250M parameters)

**Total: ~1.1 GB**, downloaded once, then cached.

On Hugging Face Spaces (CPU Basic):

- Model loading: ~30–60 seconds on first analysis
- Subsequent analyses: ~5–15 seconds (models cached in memory)
- AI rewrite generation: ~30–60 seconds on CPU

### Memory Usage

- CPU Basic (16 GB RAM) on HF Spaces is sufficient
- All models run on CPU; no GPU is required

### Caching

Hugging Face caches models to `~/.cache/huggingface/`. The cache survives restarts only if the Space has persistent storage; otherwise the models are downloaded again after a restart.

---

## 📝 Important Notes

1. **Image-based PDFs** (scanned documents) will yield very little extracted text. Use text-based PDFs.
2. **FLAN-T5 rewriting** is limited to ~400 words of input due to the model's context window. For longer resumes, only the first section is rewritten.
3. **Privacy:** No data is sent to any third-party API. All processing happens on the machine running the app (your computer or your HF Space).
4. **Sentence-BERT similarity** compares meaning rather than exact words, so it catches synonym matches that keyword overlap would miss.

---

## 🛠️ Customization

### Add more skills

Edit the `TECHNICAL_SKILLS` and `SOFT_SKILLS` sets in `utils/nlp_utils.py`.

### Adjust scoring weights

Modify the `compute_base_score()` function in `utils/scorer.py`.

### Change the AI model

In `utils/generator.py`, replace `"google/flan-t5-base"` with a larger model such as `"google/flan-t5-large"` for better quality (requires more RAM).

---

## 📄 License

MIT. Free for personal and commercial use.