fix: enforce relative api routing in production
- README.md +3 -0
- frontend/src/lib/api.ts +3 -1
- implementation_plan.md +0 -48
README.md
CHANGED
@@ -8,6 +8,9 @@ pinned: false
 app_port: 7860
 ---
 
+
+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
 # TalentPulse — AI Candidate Matching System
 
 A production-grade two-stage AI pipeline for matching job descriptions against large candidate pools.

frontend/src/lib/api.ts
CHANGED
@@ -1,4 +1,6 @@
-const API_BASE = process.env.NEXT_PUBLIC_API_URL
+const API_BASE = typeof process.env.NEXT_PUBLIC_API_URL === "string"
+  ? process.env.NEXT_PUBLIC_API_URL
+  : "http://localhost:8000";
 
 async function request<T>(path: string, options?: RequestInit): Promise<T> {
   const url = path.startsWith("http") ? path : `${API_BASE}${path}`;

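Review note: the `typeof ... === "string"` check appears to be what lets an empty `NEXT_PUBLIC_API_URL` survive as the base, so that `${API_BASE}${path}` collapses to a relative path. The sketch below isolates that behaviour in two pure helpers. `resolveApiBase` and `buildUrl` are hypothetical names, not functions in the repository, and the idea that production sets the variable to an empty string is an assumption the diff does not confirm.

```typescript
// Hypothetical helpers that only mirror the two expressions in the diff above.
// Neither function exists in the repository.

/** Mirrors the new API_BASE expression: any string (even "") wins over the fallback. */
function resolveApiBase(envValue: string | undefined): string {
  return typeof envValue === "string" ? envValue : "http://localhost:8000";
}

/** Mirrors the `url` expression inside `request`. */
function buildUrl(apiBase: string, path: string): string {
  return path.startsWith("http") ? path : `${apiBase}${path}`;
}

// Assumed production setup: variable defined but empty, so routing stays relative.
const prodUrl = buildUrl(resolveApiBase(""), "/api/jds");
// Variable not defined at all: falls back to the local backend.
const devUrl = buildUrl(resolveApiBase(undefined), "/api/jds");

console.log(prodUrl); // "/api/jds"
console.log(devUrl); // "http://localhost:8000/api/jds"
```

By contrast, a `||` fallback would treat `""` as falsy and send the request to `http://localhost:8000`, which may be the behaviour this commit is meant to avoid.
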
implementation_plan.md
DELETED
@@ -1,48 +0,0 @@
-# Architectural Evolution: Linear Monolithic Pipeline & Frontend Inference
-
-Thank you for the strict scale constraints. 100,000 candidates is a massive load for a localized environment, but our architecture can handle it easily with the strategies outlined below. Let's break down your questions about execution times first, and then the structural redesign!
-
-## ⏱ Scale & Time Estimations (100,000 Candidates)
-1. **NeonDB Free Tier (Postgres Storage):** 100,000 candidates with basic text descriptions will consume roughly **40MB to 60MB** of space. Neon's free tier provides 500MB, therefore you will have exactly zero problems running this! After the test, clicking the "Delete Session" button will cleanly wipe the 50MB so you never run out of space.
-2. **Database Postgres Insertion Time:** Because your backend is using SQLAlchemy asynchronous insertion through Celery chunks, it will take approximately **2 to 3 minutes** to map 100k rows to NeonDB.
-3. **Vector Database Embedding Time (BAAI Model):** **This is your bottleneck.** Because you are running a Deep Learning model on a CPU, vectorizing 100k profiles usually runs at ~30 documents/second/core. On a standard 8-core laptop, this will take approximately **40 to 50 minutes** to finish background embedding.
-   - **How to minimize this?** The optimal minimization strategy is deploying the application to a machine with an NVIDIA GPU (T4 or L4). Embedding times would drop from 45 minutes to **~2 minutes**. Alternatively, I can add a flag to maximize Celery Concurrency up to your CPU's exact core limit to squeeze every drop of local performance simultaneously.
-4. **Matching Time (Top 250):** The Qdrant neural search combined with the cross-encoder Reranker will operate extremely fast, taking around **8 to 15 seconds** total to sift through the 100k space and yield the top 250.
-
-Here is exactly how I will redesign the system to meet your "One-Way Pipeline" constraints:
-
-## Proposed Changes
-
-### 1. The Monolithic Pipeline (Frontend Orchestrator)
-Currently, you have to click around to create sessions, upload data, create a JD, and click "Run Match". **I will destroy this fragmentation.**
-#### [NEW] `frontend/src/app/pipeline/page.tsx`
-- I will create a single "Run Pipeline" page mapping exactly to your vision.
-- **Form:** You enter Job Title, Job Description text, and upload the 100k CSV.
-- **Action:** Clicking "Start" initiates an intelligent, localized script:
-  1. Creates the JD API asynchronously.
-  2. Creates a Session automatically.
-  3. Uploads the CSV batch to Celery.
-  4. Automatically polls the embedding status.
-  5. The moment embeddings finish, it automatically executes the Match function.
-  6. Automatically redirects you to the definitive Top 250 visualization page, with the background Top 60 LLM thread running invisibly!
-
-### 2. Zero-Latency Frontend Weight Logic
-You correctly identified that network trips during weight-slider tweaks are unnecessary since we already pipe the data to the browser!
-#### [MODIFY] `frontend/src/app/jds/[id]/page.tsx`
-- I will rip out the `POST /api/rerank` network call that runs whenever you touch a slider.
-- I will write Javascript math logic that instantly calculates your normalized weight distribution directly inside the browser using the raw `component_scores`, sorting the DOM elements exactly into their new ranks at **0ms latency**.
-- To satisfy your requirement *"after sometime if user visit that so user can have whatever beta I have set for that weight"*, the frontend will quietly ping a new tiny `PATCH` endpoint simply to save the slider coordinates in Postgres behind the scenes, never waiting for it!
-
-### 3. Dedicated Backend Configurations
-#### [NEW] `backend/src/routers/jds.py` (Update)
-- I will add a tiny `PATCH /api/jds/{jd_id}/weights` route so the frontend can silently save the state of your sliders.
-#### [MODIFY] `backend/src/workers/celery_app.py`
-- I will explicitly configure Celery concurrency settings to forcefully max-out your laptop's multi-threading to guarantee the 100k embedding finishes as fast as theoretically possible on local hardware.
-
-## User Review Required
-
-> [!WARNING]
-> By shifting to the Monolithic Pipeline model, the current dashboard will just point to the "Run Universal Pipeline" flow instead of manual modular steps. Are you completely comfortable moving exactly to this one-way system?
-
-> [!TIP]
-> If you approve, I will strip away the fragmented pages and implement the unified flow with maximum multi-threading parameters!
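
Review note on section 2 of the deleted plan: the in-browser reweighting it describes (normalize the slider weights, recompute each candidate's score from the raw `component_scores`, re-sort) could look roughly like the sketch below. The plan names `component_scores` but not its fields, so the `Candidate` shape, the `skills`/`experience` keys, and the function names `normalizeWeights` and `rerank` are all assumptions made for illustration.

```typescript
// Hypothetical shapes: only the name `component_scores` comes from the plan.
type ComponentScores = Record<string, number>;

interface Candidate {
  id: string;
  component_scores: ComponentScores;
}

/** Scale raw slider values so they sum to 1; all-zero sliders fall back to equal weights. */
function normalizeWeights(raw: Record<string, number>): Record<string, number> {
  const keys = Object.keys(raw);
  const total = keys.reduce((sum, k) => sum + Math.max(0, raw[k]), 0);
  const out: Record<string, number> = {};
  for (const k of keys) {
    out[k] = total > 0 ? Math.max(0, raw[k]) / total : 1 / keys.length;
  }
  return out;
}

/** Return a new array sorted by weighted score, highest first; the input is not mutated. */
function rerank(candidates: Candidate[], rawWeights: Record<string, number>): Candidate[] {
  const w = normalizeWeights(rawWeights);
  const score = (c: Candidate): number =>
    Object.keys(w).reduce((s, k) => s + w[k] * (c.component_scores[k] ?? 0), 0);
  return [...candidates].sort((a, b) => score(b) - score(a));
}

const pool: Candidate[] = [
  { id: "a", component_scores: { skills: 0.9, experience: 0.2 } },
  { id: "b", component_scores: { skills: 0.4, experience: 0.8 } },
];

console.log(rerank(pool, { skills: 3, experience: 1 }).map((c) => c.id)); // ["a", "b"]
console.log(rerank(pool, { skills: 1, experience: 3 }).map((c) => c.id)); // ["b", "a"]
```

Sorting a copy of the data and letting the UI framework re-render from it is one option; the plan's wording about sorting DOM elements directly would be a different design choice.
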
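
Review note on item 3 of the deleted plan's time estimates: the quoted rate (~30 documents/second/core on an 8-core laptop) and the quoted duration (40 to 50 minutes for 100,000 profiles) can be checked against each other with simple arithmetic. The sketch below takes the plan's figures at face value; the two scaling scenarios are assumptions added here, not measurements.

```typescript
/** Minutes needed to embed `docs` documents at `docsPerSecond` aggregate throughput. */
function embeddingMinutes(docs: number, docsPerSecond: number): number {
  return docs / docsPerSecond / 60;
}

const DOCS = 100000; // from the plan
const PER_CORE_RATE = 30; // documents/second/core, from the plan
const CORES = 8; // "standard 8-core laptop", from the plan

// Assumption A: throughput scales linearly across all 8 cores.
const linear = embeddingMinutes(DOCS, PER_CORE_RATE * CORES);
// Assumption B: no parallel speed-up at all (30 documents/second in total).
const serial = embeddingMinutes(DOCS, PER_CORE_RATE);

console.log(linear.toFixed(1)); // "6.9"
console.log(serial.toFixed(1)); // "55.6"
```

The plan's 40 to 50 minute figure corresponds to an aggregate rate of roughly 33 to 42 documents/second, which is close to the no-speed-up scenario, so the per-core rate and the total estimate do not appear to describe the same configuration.
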