---
title: ResumeDataExtractor
emoji: πŸ¦€
colorFrom: indigo
colorTo: pink
sdk: docker
pinned: false
app_port: 7860
---

# 🎯 AI Smart Resume Screen & Extractor

**ResumeDataExtractor** is an intelligent Applicant Tracking System (ATS) tool powered by **Google Gemini AI**. It parses PDF resumes, extracts structured data, and compares candidates against a specific job description to produce a match score, reasoning, and a skill gap analysis.

πŸ”— **Live Demo:** [Hugging Face Space](https://huggingface.co/spaces/LovnishVerma/ResumeDataExtractor)

---

## πŸš€ Key Features

* **πŸ“„ PDF Parsing**: Extracts raw text from PDF resumes reliably.
* **πŸ€– AI Analysis**: Uses Google's **Gemini 1.5 Pro/Flash** to interpret candidate data.
* **πŸ“Š Smart Scoring**: Compares a resume against a job description (JD) to produce a 0-100% match score.
* **🧩 Skill Gap Analysis**: Automatically identifies **Matching Skills** and **Missing Skills** based on the JD.
* **⚑ Hybrid Architecture**: Runs a **FastAPI** backend for logic/API processing and a **Streamlit** frontend for the UI in a single container.
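The skill-gap output can be pictured as simple set arithmetic over normalized skill names. The real comparison is delegated to Gemini, so this is only an illustrative sketch; the function name and exact output shape here are assumptions, not the project's actual code.

```python
def skill_gap(resume_skills, jd_skills):
    """Illustrative only: split JD skills into matching vs. missing
    relative to the resume (the app delegates this reasoning to Gemini)."""
    resume = {s.strip().lower() for s in resume_skills}
    jd = {s.strip().lower() for s in jd_skills}
    matching = sorted(s for s in jd if s in resume)
    missing = sorted(s for s in jd if s not in resume)
    return matching, missing

# Example: a JD asking for Python, SQL and Docker vs. a Python/SQL resume
print(skill_gap(["Python", "SQL"], ["Python", "SQL", "Docker"]))
# → (['python', 'sql'], ['docker'])
```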

---

## πŸ› οΈ How It Works

This project uses a microservices-in-a-box approach:
1.  **Backend (`main.py`)**: A **FastAPI** server running on port `8000`. It handles file uploads, text extraction, and communicates with the Google Gemini API.
2.  **Frontend (`app.py`)**: A **Streamlit** dashboard running on port `7860`. It accepts user input and sends requests to the local backend.
3.  **AI Engine**: The system dynamically selects the best available Gemini model (e.g., `gemini-1.5-flash` or `gemini-pro`) to process the text.
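The fallback behaviour in step 3 can be sketched as a preference list checked against whatever models the API reports as available. The list contents and function name below are assumptions for illustration, not the actual selection code.

```python
# Hypothetical sketch of "pick the best available Gemini model".
PREFERRED = ["gemini-1.5-pro", "gemini-1.5-flash", "gemini-pro"]

def pick_model(available):
    """Return the first preferred model present in the available list."""
    for name in PREFERRED:
        if name in available:
            return name
    raise RuntimeError("no supported Gemini model available")

print(pick_model(["gemini-pro", "gemini-1.5-flash"]))  # → gemini-1.5-flash
```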

---

## πŸ’» Local Setup & Installation

### Prerequisites
* Python 3.11+
* A Google Gemini API Key ([Get it here](https://aistudio.google.com/app/apikey))

### 1. Clone the Repository
```bash
git clone https://huggingface.co/spaces/LovnishVerma/ResumeDataExtractor
cd ResumeDataExtractor
```

### 2. Install Dependencies

```bash
pip install -r requirements.txt
```

### 3. Set Environment Variables

Create a `.env` file in the root directory:

```env
GEMINI_API_KEY=your_actual_api_key_here
```
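The backend presumably reads this key from the environment at startup. A minimal sketch of that lookup, failing fast with a clear message when the key is missing (the helper name is hypothetical):

```python
import os

def load_api_key(env=os.environ):
    """Fetch the Gemini key, raising a clear error if it is unset."""
    key = env.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; create the .env file described above")
    return key
```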

### 4. Run the Application

You can run the startup script (Linux/Mac/WSL) which launches both services:

```bash
chmod +x start.sh
./start.sh
```

**Or run them manually in separate terminals:**

*Terminal 1 (Backend):*

```bash
uvicorn main:app --host 0.0.0.0 --port 8000
```

*Terminal 2 (Frontend):*

```bash
streamlit run app.py --server.port 7860
```

Access the UI at: `http://localhost:7860`

---

## 🐳 Docker Deployment

This project is configured to run in Docker, the standard runtime for Hugging Face Spaces.

```bash
# Build the image
docker build -t resume-extractor .

# Run the container (Pass your API Key)
docker run -p 7860:7860 -e GEMINI_API_KEY="your_key" resume-extractor
```

---

## πŸ“‚ Project Structure

* **`app.py`**: Streamlit frontend interface.
* **`main.py`**: FastAPI backend server.
* **`parser_logic.py`**: Core logic for PDF extraction and interaction with Google Gemini.
* **`start.sh`**: Entry point script to run both servers simultaneously.
* **`Dockerfile`**: Container configuration.

---

## πŸ›‘οΈ License & Disclaimer

This project uses Google Generative AI. Ensure you comply with their usage policies. Resume data is processed in memory and not permanently stored on the server.