coderound / README.md

ketannnn

feat: implement multi-stage candidate ingestion and matching pipeline with UI tracking and backend schema support

72d1c14 26 days ago

metadata

title: TalentPulse AI Candidate Matching
emoji: ⚡
colorFrom: purple
colorTo: indigo
sdk: docker
pinned: false
app_port: 7860

TalentPulse: AI-Powered Candidate Matching System

Overview

TalentPulse is a full-stack AI system for matching job descriptions against large candidate pools. It replaces manual resume screening with semantic retrieval, neural reranking, structured gap analysis, and LLM-generated explanations.

The platform is built for recruiters and hiring teams who need fast, explainable, and configurable candidate matching. It supports session-based candidate batches, dynamic scoring weights, trajectory analysis, and reusable matching workflows for A/B testing and precision hiring.

Key Features

Session-based Candidate Management

Group candidates into named sessions for isolated workflows and repeatable matching experiments.

Two-stage AI Matching Pipeline

Stage 1: Retrieval — Fast vector search in Qdrant with structured scoring for skills, experience, and other signals.
Stage 2: Reranking — Cross-encoder reranking of the shortlist, fused with Reciprocal Rank Fusion.
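
As an illustration of the fusion step, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of candidate IDs. The function name, the sample IDs, and the constant k=60 (a commonly used default) are assumptions for this example, not values taken from the TalentPulse codebase.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of candidate IDs into one ordering.

    Each candidate receives sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, candidate_id in enumerate(ranking, start=1):
            scores[candidate_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical orderings from the retrieval stage and the cross-encoder.
retrieval_order = ["c1", "c2", "c3"]
reranker_order = ["c3", "c1", "c2"]
fused = reciprocal_rank_fusion([retrieval_order, reranker_order])
print(fused)  # ['c1', 'c3', 'c2']
```

Because RRF works on ranks rather than raw scores, it can combine a vector-similarity ordering and a cross-encoder ordering without calibrating their score scales against each other.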

Live Weight Sliders

Adjust matching priorities in real time and rerank results in memory without running new model inference.
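
One way such in-memory reranking could work is to keep each candidate's component scores from the full pipeline run and only recompute a weighted sum when the sliders move. The sketch below assumes hypothetical component names (semantic, skills, trajectory) and a hypothetical candidate shape; the actual scoring formula is not specified in this README.

```python
def rerank_with_weights(candidates, weights):
    """Re-sort candidates by a weighted sum of precomputed component scores."""
    # Normalise so the absolute scale of the sliders does not matter.
    total = sum(weights.values()) or 1.0

    def combined(candidate):
        parts = candidate["scores"]
        return sum(w * parts.get(name, 0.0) for name, w in weights.items()) / total

    return sorted(candidates, key=combined, reverse=True)


# Hypothetical candidates with scores cached from an earlier pipeline run.
candidates = [
    {"id": "c1", "scores": {"semantic": 0.9, "skills": 0.4, "trajectory": 0.5}},
    {"id": "c2", "scores": {"semantic": 0.6, "skills": 0.9, "trajectory": 0.7}},
]
semantic_first = rerank_with_weights(
    candidates, {"semantic": 0.8, "skills": 0.1, "trajectory": 0.1}
)
skills_first = rerank_with_weights(
    candidates, {"semantic": 0.2, "skills": 0.7, "trajectory": 0.1}
)
print([c["id"] for c in semantic_first])  # ['c1', 'c2']
print([c["id"] for c in skills_first])    # ['c2', 'c1']
```

No embedding or cross-encoder call happens here, which is why slider changes can be applied without new model inference.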

Structured Gap Analysis

Detect missing skills, experience gaps, and mismatches to generate grounded candidate explanations.
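
A minimal sketch of what a structured gap report might contain, assuming skills are compared as normalized strings and experience is measured in years. The GapReport fields and the analyze_gaps function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class GapReport:
    missing_skills: list = field(default_factory=list)
    matched_skills: list = field(default_factory=list)
    experience_gap_years: float = 0.0


def analyze_gaps(required_skills, min_years, candidate_skills, candidate_years):
    """Compare JD requirements with a candidate profile (case-insensitive)."""
    required = {s.strip().lower() for s in required_skills}
    have = {s.strip().lower() for s in candidate_skills}
    return GapReport(
        missing_skills=sorted(required - have),
        matched_skills=sorted(required & have),
        # A candidate who exceeds the minimum has no experience gap.
        experience_gap_years=max(0.0, min_years - candidate_years),
    )


report = analyze_gaps(
    ["Python", "FastAPI", "Docker"], 5, ["python", "docker", "React"], 3.5
)
print(report.missing_skills)        # ['fastapi']
print(report.matched_skills)        # ['docker', 'python']
print(report.experience_gap_years)  # 1.5
```

A real implementation would likely need synonym handling (for example "JS" versus "JavaScript"), which exact string matching like this does not cover.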

LLM-generated Explanations

Generate candidate explanations with a Groq-hosted LLM, grounded in the precomputed gap analysis.
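
To illustrate what grounding can mean in practice, the sketch below builds a prompt that contains only facts taken from a gap-analysis result. The dictionary keys, the prompt wording, and the function name are assumptions for this example; the call to the Groq API itself is deliberately left out.

```python
def build_explanation_prompt(jd_title, candidate_name, gaps):
    """Assemble an LLM prompt restricted to precomputed gap-analysis facts."""
    missing = ", ".join(gaps["missing_skills"]) or "none"
    matched = ", ".join(gaps["matched_skills"]) or "none"
    return (
        f"You are assisting a recruiter hiring for: {jd_title}.\n"
        f"Candidate: {candidate_name}\n"
        f"Matched skills: {matched}\n"
        f"Missing skills: {missing}\n"
        f"Experience gap (years): {gaps['experience_gap_years']}\n"
        "Explain the fit in 3 sentences using only the facts above."
    )


# Hypothetical gap-analysis result for one candidate.
prompt = build_explanation_prompt(
    "Backend Engineer",
    "Candidate A",
    {
        "missing_skills": ["fastapi"],
        "matched_skills": ["docker", "python"],
        "experience_gap_years": 1.5,
    },
)
print(prompt)
```

Passing precomputed facts instead of the raw resume keeps the prompt short and makes the resulting explanation easier to check against the stored analysis.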

Trajectory Scoring

Estimate career growth velocity from work history and reward strong advancement patterns.
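
The README does not define how growth velocity is computed. One plausible reading, sketched below, is seniority levels gained per year between the earliest and latest roles. The integer level scale (1 = junior, 2 = mid, 3 = senior, and so on) and the input shape are hypothetical.

```python
def growth_velocity(work_history):
    """Seniority levels gained per year between the first and last role.

    work_history: list of (start_year, level) tuples, where level is an
    assumed integer seniority rank.
    """
    if len(work_history) < 2:
        return 0.0
    ordered = sorted(work_history)
    (first_year, first_level), (last_year, last_level) = ordered[0], ordered[-1]
    years = last_year - first_year
    if years <= 0:
        return 0.0
    return (last_level - first_level) / years


# Junior in 2016, mid-level in 2018, staff-level (4) in 2022.
velocity = growth_velocity([(2016, 1), (2018, 2), (2022, 4)])
print(velocity)  # 0.5
```

A score like this could then feed into the weighted ranking as one more component alongside semantic and skill scores.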

JD Quality Feedback

Evaluate job descriptions for clarity, breadth, and missing signals.

Tech Stack

Layer	Technology
Frontend	Next.js 16, React, Tailwind CSS v4
Backend	FastAPI, Uvicorn
Database	Neon Postgres, Asyncpg, SQLAlchemy, Alembic
Vector Search	Qdrant Cloud
Async Jobs	Celery
Cache	Redis Cloud
Embeddings	BAAI/bge-small-en-v1.5 via SentenceTransformers
Reranking	BAAI/bge-reranker-v2-m3 via FlagEmbedding
LLM Provider	Groq (llama-3.3-70b-versatile)
Deployment	Docker, Nginx, Supervisord, HuggingFace Spaces

Architecture Overview

graph TD
    UI[Next.js Frontend] -->|REST API| Proxy[Nginx Reverse Proxy]
    Proxy --> API[FastAPI Backend]

    API -->|Async Tasks| Queue[Redis / Celery Queue]
    Queue --> Worker[Celery Workers]

    API -->|Read / Write| DB[(Neon Postgres)]
    Worker -->|Persist Metadata| DB

    API -->|Vector Search| VectorDB[(Qdrant Cloud)]
    Worker -->|Store Embeddings| VectorDB

    API -->|In-Memory Rerank| LocalAI[Local Reranker Model]
    API -->|LLM Explanations| LLM[Groq API]
    Worker -->|LLM Jobs| LLM

Project Structure

/
├── backend/
│   ├── alembic/
│   ├── src/
│   │   ├── matching/
│   │   ├── ml/
│   │   ├── models/
│   │   ├── routers/
│   │   ├── schemas/
│   │   └── workers/
│   ├── main.py
│   └── requirements.txt
├── frontend/
│   ├── public/
│   ├── src/
│   │   ├── app/
│   │   └── lib/
│   ├── next.config.ts
│   └── globals.css
├── docker-compose.yml
├── Dockerfile
├── supervisord.conf
└── nginx.conf

Core Modules & Responsibilities

Backend

backend/src/ml: Handles model loading, text embedding, and feature extraction.
backend/src/matching: Implements retrieval, reranking, weighted scoring, and explanation logic.
backend/src/workers: Runs background jobs such as candidate ingestion and explanation generation.
backend/src/routers: Exposes API endpoints for sessions, JDs, candidates, matching, and health checks.

Frontend

frontend/src/app: Contains user-facing routes such as sessions, JD details, and pipeline orchestration.
frontend/src/lib: Provides centralized API client wrappers.

Application Flows

Candidate Upload & Ingestion Flow

sequenceDiagram
    actor User
    participant UI as Next.js UI
    participant API as FastAPI Router
    participant Queue as Redis / Celery Queue
    participant Worker as Celery Worker
    participant Store as Postgres + Qdrant

    User->>UI: Upload candidate CSV/JSON
    UI->>API: POST /api/candidates/upload
    API->>Queue: Dispatch ingest_candidates_batch
    API-->>UI: Return task ID
    UI->>API: Poll /api/candidates/status/{task_id}
    Worker->>Queue: Fetch task
    Worker->>Worker: Parse candidate data
    Worker->>Worker: Compute embeddings and growth velocity
    Worker->>Store: Save metadata and vector points
    Worker-->>Queue: Mark task complete
    API-->>UI: Return success status
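
The polling step in this flow can be sketched as a small loop that stops on a terminal task state. The response shape ({"state": ...}) is an assumption; SUCCESS and FAILURE are standard Celery state names, but the actual payload returned by /api/candidates/status/{task_id} is not documented here. The status source is stubbed so the example runs without a server.

```python
import time


def poll_task(fetch_status, task_id, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll until the task reaches a terminal state or the timeout elapses.

    fetch_status is any callable returning a dict such as {"state": "SUCCESS"};
    in a real client it would wrap GET /api/candidates/status/{task_id}.
    """
    waited = 0.0
    while True:
        status = fetch_status(task_id)
        if status.get("state") in {"SUCCESS", "FAILURE"}:
            return status
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} still pending after {timeout}s")
        sleep(interval)
        waited += interval


# Stubbed status source standing in for the HTTP call.
responses = iter([{"state": "PENDING"}, {"state": "STARTED"}, {"state": "SUCCESS"}])
result = poll_task(lambda _id: next(responses), "demo-task", sleep=lambda s: None)
print(result)  # {'state': 'SUCCESS'}
```

Injecting fetch_status and sleep keeps the loop testable and makes it easy to swap polling for a push-based mechanism later.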

Matching & Reranking Flow

sequenceDiagram
    actor User
    participant UI as Next.js UI
    participant API as FastAPI Router
    participant Qdrant as Vector DB
    participant Reranker as Local Reranker
    participant Cache as Redis Cache

    User->>UI: Open JD and click Match
    UI->>API: POST /api/match/{jd_id}
    API->>Qdrant: Retrieve top candidates
    Qdrant-->>API: Return top-K vectors
    API->>Reranker: Cross-encoder reranking
    Reranker-->>API: Return adjusted scores
    API->>API: Apply rank fusion and weights
    API->>Cache: Store result
    API-->>UI: Return ranked candidates

    User->>UI: Adjust weight sliders
    UI->>API: POST /api/match/{jd_id}/rerank
    API->>API: Recompute ranking in memory
    API-->>UI: Return updated ordering

Explain & Refine Flow

sequenceDiagram
    actor User
    participant UI as Next.js UI
    participant API as FastAPI Router
    participant DB as Postgres
    participant LLM as Groq API

    User->>UI: Open candidate match details
    UI->>API: POST /api/match/{jd_id}/candidates/{candidate_id}/explain
    API->>DB: Load match data and gap analysis
    API->>LLM: Generate grounded explanation
    LLM-->>API: Return explanation text
    API-->>UI: Show explanation to user

API Documentation

Method	Path	Purpose
POST	/api/sessions	Create a candidate session
GET	/api/sessions	List sessions
POST	/api/jds	Create a job description
GET	/api/jds	List job descriptions
POST	/api/candidates/upload?session_id=	Upload candidate files
GET	/api/candidates/status/{task_id}	Check task progress
POST	/api/match/{jd_id}?session_id=	Run full matching pipeline
POST	/api/match/{jd_id}/rerank	Rerank in memory
POST	/api/match/{jd_id}/candidates/{candidate_id}/explain	Generate explanation
GET	/health	Health check
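
As a usage illustration for the endpoints above, the sketch below builds (but does not send) a request for the full matching pipeline using only the standard library. The base URL, the sample IDs, and the empty request body are assumptions; the real endpoint may expect a JSON payload.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_match_request(base_url, jd_id, session_id):
    """Build a POST request for /api/match/{jd_id}?session_id=..."""
    query = urlencode({"session_id": session_id})
    url = f"{base_url.rstrip('/')}/api/match/{jd_id}?{query}"
    # Empty body is an assumption made for this sketch.
    return Request(url, data=b"", method="POST")


# Hypothetical local deployment on the port exposed by the Space.
req = build_match_request("http://localhost:7860", "jd-123", "session-abc")
print(req.get_method(), req.full_url)
# POST http://localhost:7860/api/match/jd-123?session_id=session-abc
```

To actually send it, pass the request to urllib.request.urlopen(req) against a running instance.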

Database Models

Session — Candidate batch container
JobDescription — Stores JD text and parsed requirements
Candidate — Stores profile, skills, work history, embeddings
MatchResult — Stores scores, gaps, explanations, weights

Authentication & Security

No formal authentication yet
CORS allows all origins
Minimal admin utility route exists

State Management

React Hooks (useState, useEffect, useCallback)
Local storage for persistence
Redis for backend caching

Caching & Performance

Cached match results by jd_id + session_id
Models pre-downloaded into Docker image
SQLAlchemy cache tuned for Neon pooling
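
The caching of match results by jd_id + session_id can be sketched as a get-or-compute helper. The key format, the JSON serialization, and the InMemoryCache class (a stand-in for a Redis client, used so the example runs without a server) are assumptions for illustration.

```python
import json


class InMemoryCache:
    """Tiny stand-in for a Redis client exposing get/set on string keys."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def match_cache_key(jd_id, session_id):
    return f"match:{jd_id}:{session_id}"


def get_or_compute_match(cache, jd_id, session_id, compute):
    """Return cached results for this JD/session pair, computing them once."""
    key = match_cache_key(jd_id, session_id)
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    result = compute()
    cache.set(key, json.dumps(result))
    return result


cache = InMemoryCache()
calls = []


def run_pipeline():
    calls.append(1)  # track how often the expensive pipeline runs
    return [{"candidate_id": "c1", "score": 0.91}]


first = get_or_compute_match(cache, "jd-1", "s-1", run_pipeline)
second = get_or_compute_match(cache, "jd-1", "s-1", run_pipeline)
print(first == second, len(calls))  # True 1
```

With a real Redis client, an expiry on the cached entry would help keep results from going stale after new candidates are ingested into the session.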

Setup & Installation

Run Locally

docker-compose up --build

Database Migration

cd backend
alembic upgrade head

Environment Variables

DATABASE_URL=
QDRANT_URL=
QDRANT_API_KEY=
REDIS_URL=
GROQ_API_KEY=
GROQ_MODEL=
EMBEDDING_MODEL=
RERANKER_MODEL=
NEXT_PUBLIC_API_URL=
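
A backend could load and validate these variables at startup along the following lines. Which variables are required, and the fallback model names (taken from the Tech Stack table), are assumptions in this sketch; NEXT_PUBLIC_API_URL is omitted because it is read by the frontend.

```python
import os
from dataclasses import dataclass

# Assumed to be mandatory; the README does not say which variables are optional.
REQUIRED = ("DATABASE_URL", "QDRANT_URL", "QDRANT_API_KEY", "REDIS_URL", "GROQ_API_KEY")


@dataclass(frozen=True)
class Settings:
    database_url: str
    qdrant_url: str
    qdrant_api_key: str
    redis_url: str
    groq_api_key: str
    groq_model: str
    embedding_model: str
    reranker_model: str


def load_settings(env=os.environ):
    """Read settings from a mapping, failing fast on missing required values."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        database_url=env["DATABASE_URL"],
        qdrant_url=env["QDRANT_URL"],
        qdrant_api_key=env["QDRANT_API_KEY"],
        redis_url=env["REDIS_URL"],
        groq_api_key=env["GROQ_API_KEY"],
        # Fallbacks mirror the Tech Stack table; treating them as defaults is an assumption.
        groq_model=env.get("GROQ_MODEL", "llama-3.3-70b-versatile"),
        embedding_model=env.get("EMBEDDING_MODEL", "BAAI/bge-small-en-v1.5"),
        reranker_model=env.get("RERANKER_MODEL", "BAAI/bge-reranker-v2-m3"),
    )


# Placeholder values for demonstration only.
demo_env = {name: "placeholder" for name in REQUIRED}
settings = load_settings(demo_env)
print(settings.groq_model)  # llama-3.3-70b-versatile
```

Failing fast at startup surfaces a missing key immediately instead of at the first matching request.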

Deployment

Multi-stage Docker build
Runs FastAPI + Next.js + Celery + Nginx
Optimized for HuggingFace Spaces
Exposes port 7860

Improvement Recommendations

Add JWT auth + RBAC
Replace polling with WebSockets / SSE
Add object storage
Add automated tests
Add observability & metrics

Quick Summary

TalentPulse combines semantic search, reranking, and LLM reasoning to help recruiters identify the best candidates faster, with explainable AI-powered hiring workflows.