---
title: TalentPulse AI Candidate Matching
emoji: ⚡
colorFrom: purple
colorTo: indigo
sdk: docker
pinned: false
app_port: 7860
---


# TalentPulse: AI-Powered Candidate Matching System


## Overview

TalentPulse is a production-grade, full-stack AI system for matching job descriptions against large candidate pools. It replaces manual resume screening with semantic retrieval, neural reranking, structured gap analysis, and LLM-generated explanations.

The platform is built for recruiters and hiring teams who need fast, explainable, and configurable candidate matching. It supports session-based candidate batches, dynamic scoring weights, trajectory analysis, and reusable matching workflows for A/B testing and precision hiring.

## Key Features

### Session-based Candidate Management
Group candidates into named sessions for isolated workflows and repeatable matching experiments.

### Two-stage AI Matching Pipeline
- **Stage 1: Retrieval** — Fast vector search in Qdrant with structured scoring for skills, experience, and other signals.
- **Stage 2: Reranking** — Cross-encoder reranking of the shortlist, fused with Reciprocal Rank Fusion.
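
To illustrate the fusion step, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists. The function name, the candidate IDs, and the constant `k=60` (a commonly used default) are assumptions for illustration, not taken from the TalentPulse codebase.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of candidate IDs into one ordering.

    Each candidate scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, candidate_id in enumerate(ranking, start=1):
            scores[candidate_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by candidate ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


retrieval_order = ["c1", "c2", "c3", "c4"]  # hypothetical Stage 1 ordering
reranker_order = ["c3", "c1", "c4", "c2"]   # hypothetical Stage 2 ordering
fused = reciprocal_rank_fusion([retrieval_order, reranker_order])
print(fused)  # ['c1', 'c3', 'c2', 'c4']
```

Because RRF uses only rank positions, it can combine vector similarity scores and cross-encoder scores without normalizing them to a common scale.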

### Live Weight Sliders
Adjust matching priorities in real time and rerank results in memory without running new model inference.
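
One way this can work is to keep each candidate's per-signal scores from the last full run and recompute only a weighted average when the sliders move. The sketch below assumes hypothetical signal names (`semantic`, `skills`, `trajectory`) and a hypothetical record shape; the actual scoring formula may differ.

```python
def rerank_in_memory(candidates, weights):
    """Re-order cached candidates with new weights; no model calls involved."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    def combined(candidate):
        # Weighted average of the per-signal scores stored from the last run.
        return sum(
            weight * candidate["scores"].get(name, 0.0)
            for name, weight in weights.items()
        ) / total

    return sorted(candidates, key=combined, reverse=True)


# Hypothetical cached per-signal scores for two candidates.
cached = [
    {"id": "c1", "scores": {"semantic": 0.9, "skills": 0.4, "trajectory": 0.5}},
    {"id": "c2", "scores": {"semantic": 0.6, "skills": 0.9, "trajectory": 0.7}},
]
semantic_first = rerank_in_memory(
    cached, {"semantic": 0.8, "skills": 0.1, "trajectory": 0.1}
)
skills_first = rerank_in_memory(
    cached, {"semantic": 0.2, "skills": 0.7, "trajectory": 0.1}
)
print([c["id"] for c in semantic_first])  # ['c1', 'c2']
print([c["id"] for c in skills_first])    # ['c2', 'c1']
```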

### Structured Gap Analysis
Detect missing skills, experience gaps, and mismatches to generate grounded candidate explanations.
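
As a rough illustration, a gap analysis can start from set differences over normalized skill names plus a simple experience comparison. The function and field names below are hypothetical, and the real analysis likely covers more signals.

```python
def skill_gaps(required_skills, candidate_skills, required_years, candidate_years):
    """Compare a candidate against JD requirements and return structured gaps."""
    # Normalize case and whitespace so "Python" and "python " compare equal.
    required = {s.strip().lower() for s in required_skills}
    have = {s.strip().lower() for s in candidate_skills}
    return {
        "matched_skills": sorted(required & have),
        "missing_skills": sorted(required - have),
        "experience_gap_years": max(0.0, required_years - candidate_years),
    }


gaps = skill_gaps(
    required_skills=["Python", "FastAPI", "Kubernetes"],
    candidate_skills=["python", "fastapi", "Docker"],
    required_years=5,
    candidate_years=3.5,
)
print(gaps)
```

Keeping the result as structured data, rather than free text, lets later steps cite specific gaps instead of guessing at them.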

### LLM-generated Explanations
Generate candidate explanations with a Groq-hosted LLM, grounded in the precomputed gap analysis.

### Trajectory Scoring
Estimate career growth velocity from work history and reward strong advancement patterns.
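
The README does not define growth velocity, so the sketch below uses one plausible definition: seniority levels gained per year between the first and latest role. The seniority ladder, field names, and formula are all assumptions for illustration.

```python
from datetime import date

# Hypothetical seniority ladder; the real system may use a different scale.
SENIORITY = {"intern": 0, "junior": 1, "mid": 2, "senior": 3, "lead": 4}


def growth_velocity(history):
    """Seniority levels gained per year between the first and latest role."""
    if len(history) < 2:
        return 0.0
    roles = sorted(history, key=lambda role: role["start"])
    years = (roles[-1]["start"] - roles[0]["start"]).days / 365.25
    if years <= 0:
        return 0.0
    gained = SENIORITY[roles[-1]["level"]] - SENIORITY[roles[0]["level"]]
    return gained / years


history = [
    {"level": "junior", "start": date(2018, 1, 1)},
    {"level": "mid", "start": date(2020, 1, 1)},
    {"level": "senior", "start": date(2022, 1, 1)},
]
velocity = growth_velocity(history)
print(round(velocity, 2))  # 0.5 levels per year
```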

### JD Quality Feedback
Evaluate job descriptions for clarity, breadth, and missing signals.

## Tech Stack

| Layer | Technology |
|------|------------|
| Frontend | Next.js 16, React, Tailwind CSS v4 |
| Backend | FastAPI, Uvicorn |
| Database | Neon Postgres, Asyncpg, SQLAlchemy, Alembic |
| Vector Search | Qdrant Cloud |
| Async Jobs | Celery |
| Cache | Redis Cloud |
| Embeddings | BAAI/bge-small-en-v1.5 via SentenceTransformers |
| Reranking | BAAI/bge-reranker-v2-m3 via FlagEmbedding |
| LLM Provider | Groq (llama-3.3-70b-versatile) |
| Deployment | Docker, Nginx, Supervisord, HuggingFace Spaces |

## Architecture Overview

```mermaid
graph TD
    UI[Next.js Frontend] -->|REST API| Proxy[Nginx Reverse Proxy]
    Proxy --> API[FastAPI Backend]

    API -->|Async Tasks| Queue[Redis / Celery Queue]
    Queue --> Worker[Celery Workers]

    API -->|Read / Write| DB[(Neon Postgres)]
    Worker -->|Persist Metadata| DB

    API -->|Vector Search| VectorDB[(Qdrant Cloud)]
    Worker -->|Store Embeddings| VectorDB

    API -->|In-Memory Rerank| LocalAI[Local Reranker Model]
    API -->|LLM Explanations| LLM[Groq API]
    Worker -->|LLM Jobs| LLM
```

## Project Structure

```text
/
├── backend/
│   ├── alembic/
│   ├── src/
│   │   ├── matching/
│   │   ├── ml/
│   │   ├── models/
│   │   ├── routers/
│   │   ├── schemas/
│   │   └── workers/
│   ├── main.py
│   └── requirements.txt
├── frontend/
│   ├── public/
│   ├── src/
│   │   ├── app/
│   │   └── lib/
│   ├── next.config.ts
│   └── globals.css
├── docker-compose.yml
├── Dockerfile
├── supervisord.conf
└── nginx.conf
```

## Core Modules & Responsibilities

### Backend

* **backend/src/ml**
  Handles model loading, text embedding, and feature extraction.

* **backend/src/matching**
  Implements retrieval, reranking, weighted scoring, and explanation logic.

* **backend/src/workers**
  Runs background jobs such as candidate ingestion and explanation generation.

* **backend/src/routers**
  Exposes API endpoints for sessions, JDs, candidates, matching, and health checks.

### Frontend

* **frontend/src/app**
  Contains user-facing routes such as sessions, JD details, and pipeline orchestration.

* **frontend/src/lib**
  Centralized API client wrappers.

## Application Flows

### Candidate Upload & Ingestion Flow

```mermaid
sequenceDiagram
    actor User
    participant UI as Next.js UI
    participant API as FastAPI Router
    participant Queue as Redis / Celery Queue
    participant Worker as Celery Worker
    participant Store as Postgres + Qdrant

    User->>UI: Upload candidate CSV/JSON
    UI->>API: POST /api/candidates/upload
    API->>Queue: Dispatch ingest_candidates_batch
    API-->>UI: Return task ID
    UI->>API: Poll /api/candidates/status/{task_id}
    Worker->>Queue: Fetch task
    Worker->>Worker: Parse candidate data
    Worker->>Worker: Compute embeddings and growth velocity
    Worker->>Store: Save metadata and vector points
    Worker-->>Queue: Mark task complete
    API-->>UI: Return success status
```
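
The polling step in the flow above can be sketched as a small client-side loop. The response shape (`{"state": ...}`) is an assumption, and the terminal states borrow Celery's built-in `SUCCESS` and `FAILURE` names; `fetch_status` is a hypothetical callable that would wrap a GET to `/api/candidates/status/{task_id}`.

```python
import time


def poll_until_done(fetch_status, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Call fetch_status() until the task reaches a terminal state or times out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status.get("state") in {"SUCCESS", "FAILURE"}:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("ingestion task did not finish in time")
        sleep(interval)


# Stand-in for real HTTP calls: three canned responses in sequence.
responses = iter([{"state": "PENDING"}, {"state": "STARTED"}, {"state": "SUCCESS"}])
final = poll_until_done(lambda: next(responses), interval=0.0)
print(final)  # {'state': 'SUCCESS'}
```

Passing the fetch function in as a parameter keeps the loop independent of any particular HTTP client and easy to exercise without a running backend.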

### Matching & Reranking Flow

```mermaid
sequenceDiagram
    actor User
    participant UI as Next.js UI
    participant API as FastAPI Router
    participant Qdrant as Vector DB
    participant Reranker as Local Reranker
    participant Cache as Redis Cache

    User->>UI: Open JD and click Match
    UI->>API: POST /api/match/{jd_id}
    API->>Qdrant: Retrieve top candidates
    Qdrant-->>API: Return top-K vectors
    API->>Reranker: Cross-encoder reranking
    Reranker-->>API: Return adjusted scores
    API->>API: Apply rank fusion and weights
    API->>Cache: Store result
    API-->>UI: Return ranked candidates

    User->>UI: Adjust weight sliders
    UI->>API: POST /api/match/{jd_id}/rerank
    API->>API: Recompute ranking in memory
    API-->>UI: Return updated ordering
```

### Explain & Refine Flow

```mermaid
sequenceDiagram
    actor User
    participant UI as Next.js UI
    participant API as FastAPI Router
    participant DB as Postgres
    participant LLM as Groq API

    User->>UI: Open candidate match details
    UI->>API: POST /api/match/{jd_id}/candidates/{candidate_id}/explain
    API->>DB: Load match data and gap analysis
    API->>LLM: Generate grounded explanation
    LLM-->>API: Return explanation text
    API-->>UI: Show explanation to user
```
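
A grounded explanation depends on giving the LLM only facts that were already computed. The sketch below shows one way to assemble such a prompt from a stored gap analysis; the function name, the field names, and the prompt wording are hypothetical, and the call to the Groq API is deliberately left out.

```python
def build_explanation_prompt(jd_title, candidate_name, gaps):
    """Build a prompt that restricts the LLM to precomputed gap-analysis facts."""
    lines = [
        f"Explain how well {candidate_name} fits the role '{jd_title}'.",
        "Use only the facts listed below; do not invent skills or experience.",
        "Matched skills: " + (", ".join(gaps["matched_skills"]) or "none"),
        "Missing skills: " + (", ".join(gaps["missing_skills"]) or "none"),
        f"Experience gap (years): {gaps['experience_gap_years']}",
    ]
    return "\n".join(lines)


# Hypothetical gap analysis loaded from the database.
stored_gaps = {
    "matched_skills": ["fastapi", "python"],
    "missing_skills": ["kubernetes"],
    "experience_gap_years": 1.5,
}
prompt = build_explanation_prompt("Backend Engineer", "Candidate A", stored_gaps)
print(prompt)
```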

## API Documentation

| Method | Path                                                 | Purpose                    |
| ------ | ---------------------------------------------------- | -------------------------- |
| POST   | /api/sessions                                        | Create a candidate session |
| GET    | /api/sessions                                        | List sessions              |
| POST   | /api/jds                                             | Create a job description   |
| GET    | /api/jds                                             | List job descriptions      |
| POST   | /api/candidates/upload?session_id=                   | Upload candidate files     |
| GET    | /api/candidates/status/{task_id}                     | Check task progress        |
| POST   | /api/match/{jd_id}?session_id=                       | Run full matching pipeline |
| POST   | /api/match/{jd_id}/rerank                            | Rerank in memory           |
| POST   | /api/match/{jd_id}/candidates/{candidate_id}/explain | Generate explanation       |
| GET    | /health                                              | Health check               |

## Database Models

* **Session** — Candidate batch container
* **JobDescription** — Stores JD text and parsed requirements
* **Candidate** — Stores profile, skills, work history, embeddings
* **MatchResult** — Stores scores, gaps, explanations, weights

## Authentication & Security

* No formal authentication yet
* CORS allows all origins
* Minimal admin utility route exists

## State Management

* React Hooks (`useState`, `useEffect`, `useCallback`)
* Local storage for persistence
* Redis for backend caching

## Caching & Performance

* Cached match results by `jd_id + session_id`
* Models pre-downloaded into Docker image
* SQLAlchemy cache tuned for Neon pooling
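
As a rough sketch of the first point, a cache-aside lookup keyed on `jd_id` and `session_id` might look like the following. The key format and helper names are assumptions, and a plain dict stands in for Redis so the example runs without a server.

```python
import json


def match_cache_key(jd_id, session_id):
    """Hypothetical key format combining the JD and session identifiers."""
    return f"match:{jd_id}:{session_id}"


def get_or_compute(cache, jd_id, session_id, compute):
    """Return cached match results, running the pipeline only on a miss."""
    key = match_cache_key(jd_id, session_id)
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    result = compute()
    cache[key] = json.dumps(result)  # stored as JSON text, as it would be in Redis
    return result


cache = {}   # stand-in for a Redis client
calls = []   # records how often the expensive pipeline runs


def run_pipeline():
    calls.append(1)
    return [{"candidate_id": "c1", "score": 0.91}]


first = get_or_compute(cache, "jd-1", "s-1", run_pipeline)
second = get_or_compute(cache, "jd-1", "s-1", run_pipeline)
print(len(calls))  # 1, the second call is served from the cache
```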

## Setup & Installation

### Run Locally

```bash
docker-compose up --build
```

### Database Migration

```bash
cd backend
alembic upgrade head
```

## Environment Variables

```env
DATABASE_URL=
QDRANT_URL=
QDRANT_API_KEY=
REDIS_URL=
GROQ_API_KEY=
GROQ_MODEL=
EMBEDDING_MODEL=
RERANKER_MODEL=
NEXT_PUBLIC_API_URL=
```
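
A backend could validate these variables at startup along the lines of the sketch below. Which variables are required and which have defaults is an assumption; the default model names are taken from the Tech Stack table.

```python
import os

# Assumed to be mandatory; the real backend may treat these differently.
REQUIRED = ("DATABASE_URL", "QDRANT_URL", "QDRANT_API_KEY", "REDIS_URL", "GROQ_API_KEY")


def load_settings(env=None):
    """Read settings from the environment, failing fast on missing values."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    # Optional overrides fall back to the models listed in the Tech Stack table.
    settings["GROQ_MODEL"] = env.get("GROQ_MODEL", "llama-3.3-70b-versatile")
    settings["EMBEDDING_MODEL"] = env.get("EMBEDDING_MODEL", "BAAI/bge-small-en-v1.5")
    settings["RERANKER_MODEL"] = env.get("RERANKER_MODEL", "BAAI/bge-reranker-v2-m3")
    return settings


# Placeholder values for demonstration only.
example_env = {name: "placeholder" for name in REQUIRED}
settings = load_settings(example_env)
print(settings["GROQ_MODEL"])  # llama-3.3-70b-versatile
```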

## Deployment

* Multi-stage Docker build
* Runs FastAPI, Next.js, Celery, and Nginx together, managed by Supervisord
* Optimized for HuggingFace Spaces
* Exposes port `7860`

## Improvement Recommendations

* Add JWT auth + RBAC
* Replace polling with WebSockets / SSE
* Add object storage (for example, for uploaded candidate files)
* Add automated tests
* Add observability & metrics

## Quick Summary

TalentPulse combines semantic search, reranking, and LLM reasoning to help recruiters identify the best candidates faster, with explainable AI-powered hiring workflows.