Background batch jobs (launcher)
Large launcher queues can be processed server-side in chunks: SerpAPI for Google results, then fetch + score (OpenRouter AI or heuristic). State lives in Upstash Redis; Upstash QStash wakes your deployment (e.g. Hugging Face Space) with HTTPS callbacks so work continues after you close the tab.
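The chunked flow can be pictured with a small in-memory simulation. This is a sketch under stated assumptions, not the app's actual worker: `JobState`, `ChunkLimits`, `runChunk`, and the "search first, then score" ordering within a delivery are all hypothetical, and the `while` loop stands in for QStash re-enqueueing the next chunk.

```typescript
// Hypothetical job state; the real state is stored in Upstash Redis.
interface JobState {
  pendingQueries: string[];
  pendingListings: string[];
  done: boolean;
}

interface ChunkLimits {
  queriesPerChunk: number;
  scoresPerChunk: number;
}

// One delivery: run searches while queries remain, otherwise score listings.
function runChunk(
  state: JobState,
  limits: ChunkLimits,
  search: (q: string) => string[],
  score: (listingUrl: string) => void,
): JobState {
  if (state.pendingQueries.length > 0) {
    const batch = state.pendingQueries.slice(0, limits.queriesPerChunk);
    const found: string[] = [];
    for (const q of batch) found.push(...search(q));
    return {
      pendingQueries: state.pendingQueries.slice(batch.length),
      pendingListings: [...state.pendingListings, ...found],
      done: false,
    };
  }
  const batch = state.pendingListings.slice(0, limits.scoresPerChunk);
  batch.forEach((listingUrl) => score(listingUrl));
  const rest = state.pendingListings.slice(batch.length);
  return { pendingQueries: [], pendingListings: rest, done: rest.length === 0 };
}

// Simulated run: each loop iteration represents one QStash delivery.
const scored: string[] = [];
let state: JobState = { pendingQueries: ["a", "b", "c"], pendingListings: [], done: false };
let deliveries = 0;
while (!state.done) {
  state = runChunk(
    state,
    { queriesPerChunk: 2, scoresPerChunk: 2 },
    (q) => [`https://example.com/${q}`], // stand-in for a SerpAPI call
    (listingUrl) => scored.push(listingUrl), // stand-in for fetch + score
  );
  deliveries += 1;
}
```

Because each delivery does a bounded amount of work and persists its state, no single HTTP request has to outlive the platform's request timeout.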
Troubleshooting: “Job not found” right after starting a batch
The Upstash Redis client parses JSON on GET by default. This app stores each job as a JSON string with SET and expects a string back on GET. If deserialization is left on, GET returns an object, the loader treats the job as missing, and the status page shows Job not found even though Redis has the key. The code sets automaticDeserialization: false on the Redis client (lib/batch-jobs/redis.ts). After upgrading, restart the deployment so a new process picks up the client config.
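The failure mode can be reproduced without a Redis connection. In the sketch below, `StoredJob` and `parseStoredJob` are hypothetical stand-ins for the app's job shape and loader (the real code may differ); the point is only that a loader expecting a string rejects a value the client has already parsed.

```typescript
// Hypothetical job shape; the real stored payload has more fields.
interface StoredJob {
  id: string;
  status: string;
}

// Hypothetical loader step: jobs are stored as JSON strings, so anything
// other than a string is treated as "missing".
function parseStoredJob(raw: unknown): StoredJob | null {
  if (typeof raw !== "string") return null; // an already-parsed object lands here
  try {
    const parsed = JSON.parse(raw) as Partial<StoredJob>;
    return typeof parsed.id === "string" && typeof parsed.status === "string"
      ? { id: parsed.id, status: parsed.status }
      : null;
  } catch {
    return null; // not valid JSON
  }
}

const stored = JSON.stringify({ id: "job-1", status: "running" });

// With automaticDeserialization: false, GET yields the raw string.
const withFlagOff = parseStoredJob(stored);
// With deserialization left on, GET yields an object instead.
const withFlagOn = parseStoredJob(JSON.parse(stored));
```

Here `withFlagOff` is the job and `withFlagOn` is `null`, which is the condition the status page reports as Job not found.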
Required services
- SerpAPI: SERPAPI_API_KEY.
- Upstash Redis: REST URL and token:
  - UPSTASH_REDIS_REST_URL
  - UPSTASH_REDIS_REST_TOKEN
- Upstash QStash: publish token and signing keys (for verifying callbacks):
  - QSTASH_TOKEN: used to enqueue the next chunk after each run.
  - QSTASH_CURRENT_SIGNING_KEY: used to verify incoming requests to the chunk endpoint.
  - QSTASH_NEXT_SIGNING_KEY: optional; use during key rotation.
Public URL (critical for QStash)
QStash must call a stable HTTPS URL that reaches your app. Set:
- BATCH_PUBLIC_APP_URL: origin only, no trailing slash, e.g. https://your-space.hf.space or your Vercel URL.
Hugging Face Spaces (important)
Do not use the gallery link as the public URL:
- Wrong: https://huggingface.co/spaces/zimejin/Job-Scorer (Space page on hf.co; this is not your container’s API host).
- Right: https://<subdomain>.hf.space, the direct app URL shown when you open the Space (App tab / embedded app). It usually looks like https://zimejin-job-scorer.hf.space (check your Space’s Settings → Details for the exact *.hf.space value).
Use that https://....hf.space value (no path) for BATCH_PUBLIC_APP_URL, and use the same URL in the browser when you run batch jobs so /api/... calls hit your app.
The app publishes chunk work to:
{BATCH_PUBLIC_APP_URL}/api/internal/batch-jobs/chunk
Signing verification uses this same base URL. If verification fails, confirm the URL matches exactly what QStash calls (scheme, host, path).
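The URL rules above can be checked before anything is published to QStash. The helper below is a minimal sketch; `chunkEndpoint` is a hypothetical name and the app's own validation, if any, may be looser or stricter. It rejects non-HTTPS values, the huggingface.co Space page, and anything that is not a bare origin.

```typescript
// Hypothetical helper: validates BATCH_PUBLIC_APP_URL and derives the chunk endpoint.
function chunkEndpoint(publicAppUrl: string): string {
  const url = new URL(publicAppUrl); // throws on malformed input
  if (url.protocol !== "https:") {
    throw new Error("BATCH_PUBLIC_APP_URL must use https");
  }
  if (url.hostname === "huggingface.co") {
    throw new Error("Use the *.hf.space app URL, not the huggingface.co Space page");
  }
  // url.origin has no path and no trailing slash, so any difference means
  // the configured value carries something extra.
  if (publicAppUrl !== url.origin) {
    throw new Error("BATCH_PUBLIC_APP_URL must be an origin only (no path, no trailing slash)");
  }
  return `${url.origin}/api/internal/batch-jobs/chunk`;
}

const endpoint = chunkEndpoint("https://zimejin-job-scorer.hf.space");
```

The comparison is deliberately strict: a value such as https://your-space.hf.space/ fails fast at startup instead of surfacing later as a signature mismatch.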
Optional: OpenRouter (AI scoring)
OPENROUTER_API_KEY
If unset, scoring falls back to heuristics.
Optional tuning (environment)
| Variable | Purpose |
|---|---|
| BATCH_SERP_DELAY_MS | Delay between SerpAPI calls within a chunk (default 1200 ms). |
| BATCH_SCORE_DELAY_MS | Delay between listing fetch/score steps (default 1500 ms). |
| BATCH_QUERIES_PER_CHUNK | Max Serp queries processed per QStash delivery (default 5). |
| BATCH_SCORES_PER_CHUNK | Max listings scored per delivery (default 3). |
| BATCH_MAX_STORED_HITS | Cap on stored passing hits in Redis (default 500). |
| BATCH_MIN_ROLE_FIT | Default min role score (overridable in API body). |
| BATCH_MIN_GLOBAL_REMOTE | Default min remote score. |
| BATCH_ALLOWED_VERDICTS | Comma list: Strong Match, Possible, Skip. |
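As an illustration of how these variables and their defaults might be read, here is a minimal sketch. `intFromEnv` and `readBatchTuning` are hypothetical names, and falling back to the default on an invalid value is an assumption, not confirmed app behavior. The two minimum-score variables are omitted because their defaults are not listed above.

```typescript
type Env = Record<string, string | undefined>;

// Reads a non-negative integer variable, falling back when unset or invalid.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  return Number.isInteger(value) && value >= 0 ? value : fallback;
}

function readBatchTuning(env: Env) {
  return {
    serpDelayMs: intFromEnv(env, "BATCH_SERP_DELAY_MS", 1200),
    scoreDelayMs: intFromEnv(env, "BATCH_SCORE_DELAY_MS", 1500),
    queriesPerChunk: intFromEnv(env, "BATCH_QUERIES_PER_CHUNK", 5),
    scoresPerChunk: intFromEnv(env, "BATCH_SCORES_PER_CHUNK", 3),
    maxStoredHits: intFromEnv(env, "BATCH_MAX_STORED_HITS", 500),
    // Comma list, e.g. "Strong Match, Possible"; undefined when unset.
    allowedVerdicts: env["BATCH_ALLOWED_VERDICTS"]
      ?.split(",")
      .map((v) => v.trim())
      .filter((v) => v.length > 0),
  };
}

// In a Node deployment this would typically be called with process.env.
const defaults = readBatchTuning({});
const tuned = readBatchTuning({
  BATCH_QUERIES_PER_CHUNK: "10",
  BATCH_SERP_DELAY_MS: "abc", // invalid, so the default applies
  BATCH_ALLOWED_VERDICTS: "Strong Match, Possible",
});
```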
Manual resume (debug / recovery)
If QStash delivery fails, you can advance the job one chunk with:
POST /api/launcher/batch-jobs/{id}/resume
Header:
Authorization: Bearer {BATCH_RESUME_SECRET}
Set BATCH_RESUME_SECRET in the deployment environment. This runs the same logic as the QStash callback without signature verification.
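A manual resume call could be assembled as follows. This is a sketch: `buildResumeRequest` is a hypothetical helper, and the origin and job id in the usage comment are placeholders.

```typescript
// Hypothetical helper: builds the manual-resume request without sending it.
function buildResumeRequest(origin: string, jobId: string, resumeSecret: string) {
  return {
    url: `${origin}/api/launcher/batch-jobs/${encodeURIComponent(jobId)}/resume`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${resumeSecret}` },
    },
  };
}

const resume = buildResumeRequest("https://your-space.hf.space", "abc123", "s3cret");

// To send it (needs a runtime with global fetch, e.g. Node 18+ or a browser):
//   const res = await fetch(resume.url, resume.init);
```

Each call advances the job by one chunk, so recovering a long job means repeating the request until the status endpoint reports completion.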
API
- POST /api/launcher/batch-jobs: body { queries: [{ q, tbs? }], options }. Returns { id, status } and enqueues the first chunk.
- GET /api/launcher/batch-jobs/{id}: progress + top hits (sorted, capped by topK).
- POST /api/internal/batch-jobs/chunk: QStash-only (or manual resume via the resume route above); body { jobId }.
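The request shapes above can be sketched in TypeScript. `BatchQuery`, `buildStartRequest`, and `statusUrl` are hypothetical names; `options` is left as an open record because its fields are not listed here, and rejecting an empty query list is an assumption rather than documented server behavior.

```typescript
// Query shape from the API description: q is required, tbs is optional.
interface BatchQuery {
  q: string;
  tbs?: string;
}

// Hypothetical helper: builds the POST /api/launcher/batch-jobs request.
function buildStartRequest(
  origin: string,
  queries: BatchQuery[],
  options: Record<string, unknown> = {},
) {
  if (queries.length === 0) throw new Error("at least one query is required");
  return {
    url: `${origin}/api/launcher/batch-jobs`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ queries, options }),
    },
  };
}

// Hypothetical helper: URL to poll for progress and top hits.
function statusUrl(origin: string, id: string): string {
  return `${origin}/api/launcher/batch-jobs/${encodeURIComponent(id)}`;
}

const start = buildStartRequest("https://your-space.hf.space", [
  { q: "remote typescript engineer", tbs: "qdr:w" }, // example values only
]);
const poll = statusUrl("https://your-space.hf.space", "abc123");
```

The POST response carries { id, status }; passing that id to `statusUrl` gives the address the status page polls.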
UI
In Job search launcher → Background batch job, set the precision checkbox and the caps, then choose Start background batch. The launcher opens /job-launcher/batch/{id}, which polls the job for progress.
Recent jobs and JSON export (browser)
- Recent batch jobs on the launcher is stored in localStorage only (this browser / profile). Clearing site data removes the list; it is not synced to a server.
- On the batch status page, Download JSON saves an archival snapshot of the current job payload (plus exportedAt metadata). Server-side jobs still expire from Redis after about 7 days as before; export if you need a long-term copy.
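The export step amounts to stamping the current payload with a timestamp. The sketch below assumes `exportedAt` is an ISO 8601 string added at the top level of the payload; the actual export layout is not specified here, and `buildExportSnapshot` is a hypothetical name.

```typescript
// Hypothetical export shape: the job payload plus an exportedAt timestamp.
function buildExportSnapshot<T extends object>(
  job: T,
  now: Date = new Date(),
): T & { exportedAt: string } {
  return { ...job, exportedAt: now.toISOString() };
}

const snapshot = buildExportSnapshot({ id: "abc123", status: "done" }, new Date(0));
const fileContents = JSON.stringify(snapshot, null, 2); // what a download would contain
```

Recording the export time in the file makes it clear how stale a snapshot is once the server-side job has expired.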