---
title: TalentPulse AI Candidate Matching
emoji: 
colorFrom: purple
colorTo: indigo
sdk: docker
pinned: false
app_port: 7860
---


# TalentPulse: AI-Powered Candidate Matching System


## Overview

TalentPulse is a production-grade, full-stack AI system for matching job descriptions against large candidate pools. It replaces manual resume screening with semantic retrieval, neural reranking, structured gap analysis, and LLM-generated explanations.

The platform is built for recruiters and hiring teams who need fast, explainable, and configurable candidate matching. It supports session-based candidate batches, dynamic scoring weights, trajectory analysis, and reusable matching workflows for A/B testing and precision hiring.

## Key Features

### Session-based Candidate Management
Group candidates into named sessions for isolated workflows and repeatable matching experiments.

### Two-stage AI Matching Pipeline
- **Stage 1: Retrieval** — Fast vector search in Qdrant with structured scoring for skills, experience, and other signals.
- **Stage 2: Reranking** — Cross-encoder reranking of the shortlist, fused with Reciprocal Rank Fusion.
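
The fusion step in Stage 2 can be illustrated with a minimal sketch of Reciprocal Rank Fusion. The function name, the candidate IDs, and the constant `k=60` (a commonly used default) are assumptions for illustration, not TalentPulse's actual code or configuration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of candidate IDs into one ordering.

    Each candidate scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, candidate_id in enumerate(ranking, start=1):
            scores[candidate_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical orderings from the two stages.
retrieval_order = ["c1", "c2", "c3", "c4"]
reranker_order = ["c3", "c1", "c4", "c2"]
fused = reciprocal_rank_fusion([retrieval_order, reranker_order])
print(fused)  # ['c1', 'c3', 'c2', 'c4']
```

Because RRF uses only rank positions, it combines the vector-search and cross-encoder orderings without needing their raw scores to be on the same scale.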

### Live Weight Sliders
Adjust matching priorities in real time and rerank results in memory without running new model inference.
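
One way such an in-memory rerank could work is to keep each candidate's component scores from the last full run and recompute only the weighted sum when a slider moves. The score names (`semantic`, `skills`, `trajectory`) and the data shape below are hypothetical.

```python
def rerank_with_weights(candidates, weights):
    """Re-order candidates from stored component scores; no model is called."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    def combined(candidate):
        # Weighted average of the component scores, normalized by total weight.
        return sum(
            weight * candidate["scores"].get(name, 0.0)
            for name, weight in weights.items()
        ) / total

    return sorted(candidates, key=combined, reverse=True)


# Hypothetical component scores kept from a previous full matching run.
candidates = [
    {"id": "a", "scores": {"semantic": 0.9, "skills": 0.5, "trajectory": 0.2}},
    {"id": "b", "scores": {"semantic": 0.6, "skills": 0.9, "trajectory": 0.7}},
]
semantic_heavy = rerank_with_weights(
    candidates, {"semantic": 0.8, "skills": 0.1, "trajectory": 0.1}
)
skills_heavy = rerank_with_weights(
    candidates, {"semantic": 0.2, "skills": 0.6, "trajectory": 0.2}
)
print([c["id"] for c in semantic_heavy])  # ['a', 'b']
print([c["id"] for c in skills_heavy])    # ['b', 'a']
```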

### Structured Gap Analysis
Detect missing skills, experience gaps, and mismatches to generate grounded candidate explanations.
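
As a rough sketch of what a structured gap report might contain, the example below compares required and candidate skills as normalized sets and computes an experience shortfall. The `GapReport` fields and the `analyze_gaps` signature are assumptions, not the project's schema.

```python
from dataclasses import dataclass


@dataclass
class GapReport:
    matched_skills: list
    missing_skills: list
    experience_gap_years: float


def analyze_gaps(required_skills, min_years, candidate_skills, candidate_years):
    """Compare JD requirements with a candidate profile (hypothetical shape)."""
    # Normalize case and whitespace so "Python" and "python " compare equal.
    required = {skill.strip().lower() for skill in required_skills}
    have = {skill.strip().lower() for skill in candidate_skills}
    return GapReport(
        matched_skills=sorted(required & have),
        missing_skills=sorted(required - have),
        experience_gap_years=max(0.0, min_years - candidate_years),
    )


report = analyze_gaps(
    required_skills=["Python", "FastAPI", "Qdrant"],
    min_years=5,
    candidate_skills=["python", "Docker", "fastapi"],
    candidate_years=3.5,
)
print(report.missing_skills)        # ['qdrant']
print(report.experience_gap_years)  # 1.5
```

Passing a report like this to the LLM, rather than raw resumes, is what keeps the generated explanation tied to facts that were computed deterministically.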

### LLM-generated Explanations
Generate natural-language candidate explanations with a Groq-hosted LLM, grounded in the precomputed gap analysis.

### Trajectory Scoring
Estimate career growth velocity from work history and reward strong advancement patterns.
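
The exact formula is not documented here. One simple, assumed definition of growth velocity is seniority levels gained per year between the first and most recent role; the numeric seniority scale and the `(start_year, level)` input shape are hypothetical.

```python
def growth_velocity(work_history):
    """Seniority levels gained per year across a work history.

    work_history is a list of (start_year, seniority_level) tuples; both the
    shape and the numeric seniority scale are assumptions for illustration.
    """
    if len(work_history) < 2:
        return 0.0
    ordered = sorted(work_history)  # chronological by start year
    first_year, first_level = ordered[0]
    last_year, last_level = ordered[-1]
    span = last_year - first_year
    if span <= 0:
        return 0.0
    return (last_level - first_level) / span


# Junior (1) in 2016, mid-level (2) in 2019, lead (4) in 2022.
velocity = growth_velocity([(2016, 1), (2019, 2), (2022, 4)])
print(velocity)  # 0.5
```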

### JD Quality Feedback
Evaluate job descriptions for clarity, breadth, and missing signals.

## Tech Stack

| Layer | Technology |
|------|------------|
| Frontend | Next.js 16, React, Tailwind CSS v4 |
| Backend | FastAPI, Uvicorn |
| Database | Neon Postgres, Asyncpg, SQLAlchemy, Alembic |
| Vector Search | Qdrant Cloud |
| Async Jobs | Celery |
| Cache | Redis Cloud |
| Embeddings | BAAI/bge-small-en-v1.5 via SentenceTransformers |
| Reranking | BAAI/bge-reranker-v2-m3 via FlagEmbedding |
| LLM Provider | Groq (llama-3.3-70b-versatile) |
| Deployment | Docker, Nginx, Supervisord, HuggingFace Spaces |

## Architecture Overview

```mermaid
graph TD
    UI[Next.js Frontend] -->|REST API| Proxy[Nginx Reverse Proxy]
    Proxy --> API[FastAPI Backend]

    API -->|Async Tasks| Queue[Redis / Celery Queue]
    Queue --> Worker[Celery Workers]

    API -->|Read / Write| DB[(Neon Postgres)]
    Worker -->|Persist Metadata| DB

    API -->|Vector Search| VectorDB[(Qdrant Cloud)]
    Worker -->|Store Embeddings| VectorDB

    API -->|In-Memory Rerank| LocalAI[Local Reranker Model]
    API -->|LLM Explanations| LLM[Groq API]
    Worker -->|LLM Jobs| LLM
```

## Project Structure

```text
/
├── backend/
│   ├── alembic/
│   ├── src/
│   │   ├── matching/
│   │   ├── ml/
│   │   ├── models/
│   │   ├── routers/
│   │   ├── schemas/
│   │   └── workers/
│   ├── main.py
│   └── requirements.txt
├── frontend/
│   ├── public/
│   ├── src/
│   │   ├── app/
│   │   └── lib/
│   ├── next.config.ts
│   └── globals.css
├── docker-compose.yml
├── Dockerfile
├── supervisord.conf
└── nginx.conf
```

## Core Modules & Responsibilities

### Backend

* **backend/src/ml**
  Handles model loading, text embedding, and feature extraction.

* **backend/src/matching**
  Implements retrieval, reranking, weighted scoring, and explanation logic.

* **backend/src/workers**
  Runs background jobs such as candidate ingestion and explanation generation.

* **backend/src/routers**
  Exposes API endpoints for sessions, JDs, candidates, matching, and health checks.

### Frontend

* **frontend/src/app**
  Contains user-facing routes such as sessions, JD details, and pipeline orchestration.

* **frontend/src/lib**
  Provides centralized API client wrappers.

## Application Flows

### Candidate Upload & Ingestion Flow

```mermaid
sequenceDiagram
    actor User
    participant UI as Next.js UI
    participant API as FastAPI Router
    participant Queue as Redis / Celery Queue
    participant Worker as Celery Worker
    participant Store as Postgres + Qdrant

    User->>UI: Upload candidate CSV/JSON
    UI->>API: POST /api/candidates/upload
    API->>Queue: Dispatch ingest_candidates_batch
    API-->>UI: Return task ID
    UI->>API: Poll /api/candidates/status/{task_id}
    Worker->>Queue: Fetch task
    Worker->>Worker: Parse candidate data
    Worker->>Worker: Compute embeddings and growth velocity
    Worker->>Store: Save metadata and vector points
    Worker-->>Queue: Mark task complete
    API-->>UI: Return success status
```
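
The polling step in this flow could be driven by a loop like the sketch below. The response shape (a dict with a `state` key) is an assumption; `SUCCESS` and `FAILURE` are standard Celery state names, but whether the status endpoint returns them verbatim is not confirmed by this README. The fetcher is injected so the loop can run without a live server.

```python
import time

# Assumed terminal states, named after Celery's built-in SUCCESS and FAILURE.
TERMINAL_STATES = {"SUCCESS", "FAILURE"}


def wait_for_task(fetch_status, task_id, interval=2.0, timeout=120.0, sleep=time.sleep):
    """Poll until the task reaches a terminal state or the timeout expires.

    fetch_status is any callable that takes a task ID and returns a dict with
    a "state" key (assumed shape of GET /api/candidates/status/{task_id}).
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(task_id)
        if status["state"] in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status['state']} after {timeout}s")
        sleep(interval)


# Usage with a stand-in fetcher that reports success on the third poll.
_responses = iter([{"state": "PENDING"}, {"state": "STARTED"}, {"state": "SUCCESS"}])
result = wait_for_task(lambda _task_id: next(_responses), "demo-task", sleep=lambda _s: None)
print(result)  # {'state': 'SUCCESS'}
```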

### Matching & Reranking Flow

```mermaid
sequenceDiagram
    actor User
    participant UI as Next.js UI
    participant API as FastAPI Router
    participant Qdrant as Vector DB
    participant Reranker as Local Reranker
    participant Cache as Redis Cache

    User->>UI: Open JD and click Match
    UI->>API: POST /api/match/{jd_id}
    API->>Qdrant: Retrieve top candidates
    Qdrant-->>API: Return top-K vectors
    API->>Reranker: Cross-encoder reranking
    Reranker-->>API: Return adjusted scores
    API->>API: Apply rank fusion and weights
    API->>Cache: Store result
    API-->>UI: Return ranked candidates

    User->>UI: Adjust weight sliders
    UI->>API: POST /api/match/{jd_id}/rerank
    API->>API: Recompute ranking in memory
    API-->>UI: Return updated ordering
```

### Explain & Refine Flow

```mermaid
sequenceDiagram
    actor User
    participant UI as Next.js UI
    participant API as FastAPI Router
    participant DB as Postgres
    participant LLM as Groq API

    User->>UI: Open candidate match details
    UI->>API: POST /api/match/{jd_id}/candidates/{candidate_id}/explain
    API->>DB: Load match data and gap analysis
    API->>LLM: Generate grounded explanation
    LLM-->>API: Return explanation text
    API-->>UI: Show explanation to user
```

## API Documentation

| Method | Path                                                 | Purpose                    |
| ------ | ---------------------------------------------------- | -------------------------- |
| POST   | /api/sessions                                        | Create a candidate session |
| GET    | /api/sessions                                        | List sessions              |
| POST   | /api/jds                                             | Create a job description   |
| GET    | /api/jds                                             | List job descriptions      |
| POST   | /api/candidates/upload?session_id=                   | Upload candidate files     |
| GET    | /api/candidates/status/{task_id}                     | Check task progress        |
| POST   | /api/match/{jd_id}?session_id=                       | Run full matching pipeline |
| POST   | /api/match/{jd_id}/rerank                            | Rerank in memory           |
| POST   | /api/match/{jd_id}/candidates/{candidate_id}/explain | Generate explanation       |
| GET    | /health                                              | Health check               |

## Database Models

* **Session** — Candidate batch container
* **JobDescription** — Stores JD text and parsed requirements
* **Candidate** — Stores profile, skills, work history, embeddings
* **MatchResult** — Stores scores, gaps, explanations, weights

## Authentication & Security

* No formal authentication yet
* CORS allows all origins
* Minimal admin utility route exists

## State Management

* React Hooks (`useState`, `useEffect`, `useCallback`)
* Local storage for persistence
* Redis for backend caching

## Caching & Performance

* Match results are cached in Redis, keyed by `jd_id` and `session_id`
* Models pre-downloaded into Docker image
* SQLAlchemy cache tuned for Neon pooling
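
A minimal sketch of the match-result caching idea follows, assuming a `match:{jd_id}:{session_id}` key format and JSON-serialized values. A plain dict stands in for Redis here; the key layout and helper names are hypothetical.

```python
import json


def match_cache_key(jd_id, session_id):
    """Build a cache key from the JD and session IDs (assumed format)."""
    return f"match:{jd_id}:{session_id}"


def get_or_compute_match(cache, jd_id, session_id, compute):
    """Return the cached result if present; otherwise compute and store it."""
    key = match_cache_key(jd_id, session_id)
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    result = compute()
    cache[key] = json.dumps(result)  # a Redis client would call set() here
    return result


cache = {}  # dict stand-in for the Redis cache
calls = []


def run_pipeline():
    # Placeholder for the full retrieval + reranking pipeline.
    calls.append(1)
    return [{"candidate_id": "c1", "score": 0.91}]


first = get_or_compute_match(cache, "jd-7", "sess-1", run_pipeline)
second = get_or_compute_match(cache, "jd-7", "sess-1", run_pipeline)
print(len(calls))  # 1
```

Including the session ID in the key keeps results for the same JD separate across candidate batches.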

## Setup & Installation

### Run Locally

```bash
docker-compose up --build
```

### Database Migration

```bash
cd backend
alembic upgrade head
```

## Environment Variables

```env
DATABASE_URL=
QDRANT_URL=
QDRANT_API_KEY=
REDIS_URL=
GROQ_API_KEY=
GROQ_MODEL=
EMBEDDING_MODEL=
RERANKER_MODEL=
NEXT_PUBLIC_API_URL=
```
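
A startup check for these variables might look like the sketch below. Which variables are required and which fall back to defaults is an assumption; the default model names are taken from the Tech Stack table above.

```python
import os

# Assumed split between required and optional variables.
REQUIRED = ("DATABASE_URL", "QDRANT_URL", "QDRANT_API_KEY", "REDIS_URL", "GROQ_API_KEY")
DEFAULTS = {
    "GROQ_MODEL": "llama-3.3-70b-versatile",
    "EMBEDDING_MODEL": "BAAI/bge-small-en-v1.5",
    "RERANKER_MODEL": "BAAI/bge-reranker-v2-m3",
}


def load_settings(env=None):
    """Read settings from the environment, failing fast on missing values."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        # Empty strings are treated the same as unset variables.
        settings[name] = env.get(name) or default
    return settings


# Usage with an explicit mapping instead of the real process environment.
example_env = {name: "placeholder" for name in REQUIRED}
example_env["GROQ_MODEL"] = ""
settings = load_settings(example_env)
print(settings["GROQ_MODEL"])  # llama-3.3-70b-versatile
```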

## Deployment

* Multi-stage Docker build
* Runs FastAPI + Next.js + Celery + Nginx
* Optimized for HuggingFace Spaces
* Exposes port `7860`

## Improvement Recommendations

* Add JWT auth + RBAC
* Replace polling with WebSockets / SSE
* Add object storage
* Add automated tests
* Add observability & metrics

## Quick Summary

TalentPulse combines semantic search, reranking, and LLM reasoning to help recruiters identify the best candidates faster, with explainable AI-powered hiring workflows.