---
title: AI Recruitment Agent
emoji:
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
---
# ⚡ AI Recruitment Agent
A production-grade hybrid candidate-matching pipeline that combines a **Groq LLM**, a **Pinecone vector DB**, and a **Gradio** UI.
## Architecture
```
CSV Input → Stage 1: Normalize (Groq)
→ Stage 2: Embed + Match (Pinecone + SentenceTransformers) → Top 20
→ Stage 3: Deterministic Rerank (Groq) → Top 10
→ Stage 4: LLM Deep Review (Groq) → Top 5
→ Stage 5: Final Synthesis (Groq) → Shortlist
```
## Setup (Local)
### 1. Install dependencies
```bash
pip install -r requirements.txt
```
### 2. Configure environment
```bash
cp .env.example .env
# Edit .env and fill in your API keys
```
### 3. Create Pinecone index
In your Pinecone console:
- Create an index named `recruitment-index` (or the name you set in `PINECONE_INDEX`)
- Dimension: **384** for `all-MiniLM-L6-v2`, **1024** for `BAAI/bge-m3`
- Metric: **cosine**
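
The index dimension has to match the output size of the embedding model, so it can help to check the pairing before creating the index. Below is a minimal sketch of such a check. The `EMBEDDING_DIMENSIONS` table and the `expected_dimension` helper are hypothetical names, not part of this repository, and the table covers only the two models listed above.

```python
# Hypothetical pre-flight helper (an assumption, not code from this repo).
# It maps the embedding models named in this README to the Pinecone index
# dimension each one requires.
EMBEDDING_DIMENSIONS = {
    "all-MiniLM-L6-v2": 384,
    "BAAI/bge-m3": 1024,
}


def expected_dimension(model_name: str) -> int:
    """Return the index dimension required by a known embedding model."""
    try:
        return EMBEDDING_DIMENSIONS[model_name]
    except KeyError:
        known = ", ".join(sorted(EMBEDDING_DIMENSIONS))
        raise ValueError(
            f"Unknown embedding model {model_name!r}; known models: {known}"
        ) from None


if __name__ == "__main__":
    print(expected_dimension("all-MiniLM-L6-v2"))
```

If you switch `EMBEDDING_MODEL` later, the existing index will no longer fit and a new one with the matching dimension is needed.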
### 4. Run
```bash
python app.py
```
Then open http://localhost:7860 in your browser.
## Setup (Hugging Face Spaces)
Do **not** commit a `.env` file. Instead, go to your Space → **Settings → Repository Secrets** and add:
| Secret | Example value |
|--------|--------------|
| `GROQ_API_KEYS` | `gsk_xxx,gsk_yyy` |
| `GROQ_MODEL` | `llama3-70b-8192` |
| `PINECONE_API_KEY` | `pcsk_xxx` |
| `PINECONE_INDEX` | `recruitment-index` |
| `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` |
| `STAGE2_TOP_K` | `20` |
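
The settings in the table can be read from the environment in one place. The following is a rough sketch only: the `Settings` dataclass and `load_settings` function are hypothetical names, the fallback values are simply the example values from the table, and treating `GROQ_API_KEYS` as a comma-separated list is an assumption based on its example value. The actual `app.py` may handle these differently.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the settings listed in the table above."""
    groq_api_keys: List[str]
    groq_model: str
    pinecone_api_key: str
    pinecone_index: str
    embedding_model: str
    stage2_top_k: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # Assumption: GROQ_API_KEYS holds one or more keys separated by commas.
    keys = [k.strip() for k in env.get("GROQ_API_KEYS", "").split(",") if k.strip()]
    if not keys:
        raise RuntimeError("GROQ_API_KEYS is not set")
    pinecone_key = env.get("PINECONE_API_KEY", "").strip()
    if not pinecone_key:
        raise RuntimeError("PINECONE_API_KEY is not set")
    # Assumption: the fallbacks below mirror the README's example values.
    return Settings(
        groq_api_keys=keys,
        groq_model=env.get("GROQ_MODEL", "llama3-70b-8192"),
        pinecone_api_key=pinecone_key,
        pinecone_index=env.get("PINECONE_INDEX", "recruitment-index"),
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        stage2_top_k=int(env.get("STAGE2_TOP_K", "20")),
    )
```

Failing fast on a missing API key gives a clearer error at startup than a failed request halfway through the pipeline.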
## CSV Format
| Column | Accepted variants |
|--------|----------|
| `name` | `full_name`, `candidate_name` |
| `email` | `email_address` |
| `skills` | `parsed_skills`, `technical_skills` |
| `experience` | `parsed_work_experience`, `years_of_experience` |
| `education` | `parsed_metadata_education` |
| `resume_text` | `parsed_summary`, `summary` |
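
One way to accept these variants is to map every alias back to its canonical column before further processing. The sketch below uses only the standard library. `COLUMN_ALIASES`, `normalize_row`, and `read_candidates` are hypothetical names, and the case-insensitive header matching is an assumption rather than documented behaviour of this project.

```python
import csv
import io
from typing import Dict, List

# Canonical column -> accepted variants, copied from the table above.
COLUMN_ALIASES = {
    "name": ["full_name", "candidate_name"],
    "email": ["email_address"],
    "skills": ["parsed_skills", "technical_skills"],
    "experience": ["parsed_work_experience", "years_of_experience"],
    "education": ["parsed_metadata_education"],
    "resume_text": ["parsed_summary", "summary"],
}

# Reverse lookup: every accepted header (canonical or variant) -> canonical.
_CANONICAL = {
    header: canonical
    for canonical, variants in COLUMN_ALIASES.items()
    for header in [canonical, *variants]
}


def normalize_row(row: Dict[str, str]) -> Dict[str, str]:
    """Rename known headers to canonical names and drop unknown columns."""
    out: Dict[str, str] = {}
    for header, value in row.items():
        if header is None:  # DictReader stores surplus fields under None
            continue
        # Assumption: headers are matched case-insensitively.
        canonical = _CANONICAL.get(header.strip().lower())
        if canonical and canonical not in out:  # first matching column wins
            out[canonical] = value
    return out


def read_candidates(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text into a list of rows with canonical column names."""
    return [normalize_row(r) for r in csv.DictReader(io.StringIO(csv_text))]
```

For example, a file with the header `full_name,email_address,technical_skills` would yield rows keyed by `name`, `email`, and `skills`.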
## Pipeline Stages
| Stage | Method | Input | Output |
|-------|--------|-------|--------|
| 1. Normalize | Groq LLM | All candidates | Structured features |
| 2. Embed & Match | Pinecone + SentenceTransformers | All candidates | Top 20 by similarity |
| 3. Rerank | Groq LLM (deterministic scoring) | Top 20 | Top 10 with scores |
| 4. Deep Review | Groq LLM | Top 10 | Top 5 with verdicts + signals |
| 5. Final Synthesis | Groq LLM | Top 5 reviews | Final ranked shortlist |
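
The narrowing from all candidates to a shortlist can be pictured as a chain of top-N filters. The sketch below illustrates only that funnel shape. `top_n` and `run_funnel` are hypothetical names, the three scoring callables stand in for the Pinecone similarity search and the Groq calls, and Stage 1 (normalization) and Stage 5 (synthesis) are left out. It is not the implementation in `app.py`.

```python
from typing import Any, Callable, Dict, List

Candidate = Dict[str, Any]
Scorer = Callable[[Candidate], float]


def top_n(candidates: List[Candidate], score: Scorer, n: int) -> List[Candidate]:
    """Keep the n highest-scoring candidates, best first."""
    return sorted(candidates, key=score, reverse=True)[:n]


def run_funnel(
    candidates: List[Candidate],
    similarity: Scorer,   # stand-in for Stage 2 vector similarity
    rerank: Scorer,       # stand-in for Stage 3 rerank score
    review: Scorer,       # stand-in for Stage 4 deep-review score
    stage2_top_k: int = 20,
) -> List[Candidate]:
    stage2 = top_n(candidates, similarity, stage2_top_k)  # all -> Top 20
    stage3 = top_n(stage2, rerank, 10)                     # Top 20 -> Top 10
    stage4 = top_n(stage3, review, 5)                      # Top 10 -> Top 5
    return stage4


if __name__ == "__main__":
    pool = [{"id": i} for i in range(30)]
    by_id = lambda c: float(c["id"])  # toy scorer for demonstration only
    print([c["id"] for c in run_funnel(pool, by_id, by_id, by_id)])
```

Each stage sees only the survivors of the previous one, so the more expensive LLM-based steps run on progressively fewer candidates.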