Local Setup - GapGuide Backend
Prerequisites: Windows 10/11, PostgreSQL 17 installed, Python 3.13 (already set up at backend/venv/), Node 20+ for the frontend.
1. Install pgvector extension on PostgreSQL 17
Module 8 Layer 4 uses pgvector for SBERT cosine similarity. CREATE EXTENSION vector fails until the extension files are installed into the PostgreSQL server directory. Two options:
Option A - Precompiled binary (recommended, ~2 minutes)
- Go to https://github.com/andreiramani/pgvector_pgsql_windows/releases
- Download the release tagged 0.8.2_17.6 (built against PostgreSQL 17.6; works on 17.3+, while the older 17.0-17.2 hit a linker bug).
- Extract. You should see vector.control, vector--*.sql, and vector.dll.
- Copy files:
  - vector.control and vector--*.sql → C:\Program Files\PostgreSQL\17\share\extension\
  - vector.dll → C:\Program Files\PostgreSQL\17\lib\
  - Administrator rights required (Program Files is protected).
- Verify in psql (the gapguide database must already exist; if it does not, run the createdb command from step 3 first):
  psql -U postgres -d gapguide -c "CREATE EXTENSION IF NOT EXISTS vector; SELECT '[1,2,3]'::vector;"
  Output should include [1,2,3].
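Before running the psql check, it can help to confirm that the three files landed in the right directories. The following is a minimal sketch, not part of the GapGuide codebase: `missing_pgvector_files` is a hypothetical helper, and `DEFAULT_PGROOT` assumes the default install path used in this guide.

```python
from pathlib import Path

# Default install root used in this guide; adjust if PostgreSQL lives elsewhere.
DEFAULT_PGROOT = Path(r"C:\Program Files\PostgreSQL\17")


def missing_pgvector_files(pgroot: Path) -> list[str]:
    """Return the expected pgvector file paths that are absent under pgroot."""
    ext_dir = pgroot / "share" / "extension"
    lib_dir = pgroot / "lib"
    missing = []
    if not (ext_dir / "vector.control").is_file():
        missing.append(str(ext_dir / "vector.control"))
    # Any versioned SQL script matching vector--*.sql satisfies this check.
    if not (ext_dir.is_dir() and any(ext_dir.glob("vector--*.sql"))):
        missing.append(str(ext_dir / "vector--*.sql"))
    if not (lib_dir / "vector.dll").is_file():
        missing.append(str(lib_dir / "vector.dll"))
    return missing


if __name__ == "__main__":
    problems = missing_pgvector_files(DEFAULT_PGROOT)
    if problems:
        print("Missing pgvector files:")
        for path in problems:
            print("  " + path)
    else:
        print("pgvector files are in place")
```

An empty result only shows that the files exist. CREATE EXTENSION can still fail if the binary was built for a different PostgreSQL version, so the psql check remains the real test.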
Option B - Build from source (~30 min, no third-party binary to trust)
- Install Visual Studio with "Desktop development with C++".
- Open "x64 Native Tools Command Prompt for VS" as administrator.
- Run:
  set "PGROOT=C:\Program Files\PostgreSQL\17"
  cd %TEMP%
  git clone --branch v0.8.2 https://github.com/pgvector/pgvector.git
  cd pgvector
  nmake /F Makefile.win
  nmake /F Makefile.win install
- Verify as in Option A step 5.
2. Python dependencies
cd backend
.\venv\Scripts\Activate.ps1
pip install -r requirements.txt
python -m spacy download en_core_web_sm
First install is ~5 minutes (torch CPU wheel is the heaviest at ~200 MB).
The NER layers will use en_core_web_sm (12 MB) by default. If you want
slightly better noun-phrase quality on paraphrased CVs, upgrade to
en_core_web_lg (~560 MB); the layers auto-detect and prefer _lg:
python -m spacy download en_core_web_lg
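The "prefer _lg, fall back to _sm" behaviour can be pictured with a small sketch. This is an illustration only, not the actual GapGuide loader: `pick_spacy_model` and `is_installed` are hypothetical names. It relies on the fact that spaCy models install as ordinary Python packages.

```python
import importlib.util
from typing import Callable, Optional, Sequence

# Preference order described in this guide: large model first, then small.
PREFERRED_MODELS = ("en_core_web_lg", "en_core_web_sm")


def is_installed(package_name: str) -> bool:
    """Check for a top-level package without importing it."""
    return importlib.util.find_spec(package_name) is not None


def pick_spacy_model(
    candidates: Sequence[str] = PREFERRED_MODELS,
    installed: Callable[[str], bool] = is_installed,
) -> Optional[str]:
    """Return the first installed candidate, or None if none is available."""
    for name in candidates:
        if installed(name):
            return name
    return None


if __name__ == "__main__":
    print("Selected spaCy model:", pick_spacy_model())
```

A caller would pass the returned name to `spacy.load(...)` and treat `None` as the "spaCy model not found" case.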
HuggingFace models download lazily on first parse into %USERPROFILE%\.cache\huggingface (~1 GB total across Nucha BERT, JobBERT, SBERT). Subsequent parses load the models from this cache instead of downloading them again.
3. Database setup
# create the DB (if fresh)
createdb -U postgres gapguide
# migrate (requires pgvector from step 1)
python manage.py migrate
# seed the catalog + 5 demo users
python manage.py seed_initial_skills
python manage.py seed_initial_roles
python manage.py seed_initial_resources
python manage.py seed_demo_users
# build SBERT embeddings for the Skill catalog (Module 8 Layer 4)
python scripts/build_skill_embeddings.py
Demo user login: demo.partial@gapguide.test / DemoPass123!.
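For orientation, the cosine similarity that the skill embeddings are compared with can be sketched in plain Python. The vectors and skill names below are toy values, not real SBERT embeddings, and `rank_skills` is a hypothetical helper rather than project code. In pgvector, the `<=>` operator returns cosine distance, which is 1 minus this similarity.

```python
import math
from typing import Mapping, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance, defined as 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(a, b)


def rank_skills(
    query: Sequence[float], catalog: Mapping[str, Sequence[float]]
) -> list[tuple[str, float]]:
    """Score every catalog entry against the query, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in catalog.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings" for illustration only.
    toy_catalog = {"python": [1.0, 0.0], "sql": [0.0, 1.0], "django": [0.8, 0.6]}
    for name, score in rank_skills([1.0, 0.1], toy_catalog):
        print(f"{name}: {score:.3f}")
```

Real SBERT vectors have hundreds of dimensions, and the database performs this comparison itself, so a loop like this would only be useful for spot-checking a few rows.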
4. Run locally
# terminal 1 - backend
cd backend
.\venv\Scripts\Activate.ps1
python manage.py runserver
# terminal 2 - frontend
cd frontend
npm install
npm run dev
Open http://localhost:8080.
5. Running tests
cd backend
# CI baseline (no ML, seconds)
$env:GAPGUIDE_PARSE_LAYERS = "lexical"
pytest -q
# ML smoke test (downloads models on first run, ~15 min the first time)
Remove-Item Env:\GAPGUIDE_PARSE_LAYERS
$env:GAPGUIDE_ML_SMOKE = "1"
pytest apps/accounts/tests/test_resume_parser_integration.py -q
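An opt-in flag such as GAPGUIDE_ML_SMOKE is typically read once and used to skip the expensive tests when it is absent. The sketch below shows one way that check might look. `ml_smoke_enabled` is a hypothetical helper, and treating only the value "1" as enabled is an assumption based on the command above, not a confirmed detail of the test suite.

```python
import os
from typing import Mapping


def ml_smoke_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when GAPGUIDE_ML_SMOKE is set to "1" (assumed convention)."""
    return env.get("GAPGUIDE_ML_SMOKE", "").strip() == "1"


if __name__ == "__main__":
    state = "enabled" if ml_smoke_enabled() else "skipped"
    print(f"ML smoke tests would be {state}")
```

In a pytest module, a helper like this could feed a `pytest.mark.skipif(not ml_smoke_enabled(), reason="set GAPGUIDE_ML_SMOKE=1 to run")` marker, so that the default `pytest -q` run never triggers model downloads.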
Troubleshooting
- extension "vector" is not available - pgvector not installed on your Postgres. Redo step 1. The vector.control file must exist in PostgreSQL's share\extension\ directory.
- ModuleNotFoundError: pgvector - pip install -r requirements.txt hasn't run (or ran in a different venv). Activate the venv first.
- spaCy model not found - run python -m spacy download en_core_web_lg.
- Slow first resume parse - normal. The NER chain downloads ~1 GB of models on first call. Subsequent parses use the cache.
- GAPGUIDE_PARSE_LAYERS=lexical - lets you run the backend without any ML deps loaded. Useful during dev when you don't want the first-call model downloads.
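To make the effect of GAPGUIDE_PARSE_LAYERS concrete, here is a minimal sketch of how such a setting could be parsed. It rests on several assumptions: the comma-separated format, the "unset means all layers" default, and every layer name other than "lexical" are invented for illustration and may not match the real parser.

```python
import os
from typing import Mapping, Optional

# Hypothetical layer names; only "lexical" appears in this guide.
ALL_LAYERS = ("lexical", "ner", "embedding")


def active_layers(env: Optional[Mapping[str, str]] = None) -> tuple[str, ...]:
    """Parse GAPGUIDE_PARSE_LAYERS; an unset or empty value enables every layer."""
    env = os.environ if env is None else env
    raw = env.get("GAPGUIDE_PARSE_LAYERS", "").strip()
    if not raw:
        return ALL_LAYERS
    requested = [part.strip().lower() for part in raw.split(",") if part.strip()]
    unknown = [name for name in requested if name not in ALL_LAYERS]
    if unknown:
        raise ValueError(f"unknown parse layers: {', '.join(unknown)}")
    return tuple(requested)


def needs_ml(layers: tuple[str, ...]) -> bool:
    """Any layer other than "lexical" is assumed to require ML models."""
    return any(name != "lexical" for name in layers)


if __name__ == "__main__":
    layers = active_layers()
    print("Active layers:", ", ".join(layers))
    print("ML models required:", needs_ml(layers))
```

Under these assumptions, setting the variable to lexical yields a single active layer and `needs_ml` returns False, which corresponds to the no-download behaviour described above.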