---
title: Automated CV Parser
emoji: 🧩
colorFrom: indigo
colorTo: blue
sdk: docker
app_port: 7860
pinned: false
short_description: Resume NER - extracts Job Titles, Skills & Education
---
# CV Parser Dashboard
A Streamlit app on top of the WQF7007 resume-NER model. Deployed as a Hugging
Face Space; the model is loaded from the Hub so it can be updated without a
redeploy.
- **🔎 Live Parser**: paste or upload one CV and watch it get tokenized and
classified: sub-word token chips coloured by predicted label, the original text
with highlighted entities, and a structured summary.
- **📊 Analytics**: batch-upload CVs (PDF / DOCX / TXT) for a skills word cloud
and top-entity charts across the set.
- **πŸ” Manage Model** β€” password-gated; teammates upload a new exported model and
it's pushed to the Hub repo the app reads from.
## Models: how swapping works
The app loads its model **from a Hugging Face Hub repo** (`PRIMARY_MODEL_ID` in
`config.py`, default `Zeqhx/cv-parser-ner`). A Space's own disk is wiped on
restart, so the Hub repo is the durable store. To update the live model:
- **Easiest:** open **Manage Model**, enter the page password, upload a `.zip` of
your exported model folder (the `exported_models/…` folder the training notebooks
produce). It's validated against the 7-tag scheme and pushed to the Hub repo.
- **Or:** push files straight to the Hub model repo (web UI / CLI).
- Then click **🔄 Reload model** in the sidebar to pick up the new weights.
The sidebar picker also has a **Custom HF model ID** box to load any repo live.
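
The tag-scheme check that Manage Model runs on an upload could look roughly like the sketch below. It is not the app's actual code: the tag names in `EXPECTED_TAGS` are an assumption (a BIO scheme over the three entity types this README names, which gives 7 tags), and `validate_label_map` is a hypothetical helper. It relies only on the fact that Hugging Face Transformers stores the label mapping under `id2label` in `config.json`.

```python
import io
import json
import zipfile
from pathlib import PurePosixPath
from typing import List

# ASSUMPTION: illustrative tag names; the real 7-tag scheme may be spelled differently.
EXPECTED_TAGS = {"O", "B-JOB", "I-JOB", "B-SKILL", "I-SKILL", "B-EDU", "I-EDU"}


def validate_label_map(zip_bytes: bytes) -> List[str]:
    """Return a list of problems found in the archive; an empty list means it passed."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        # Match on the file name only, so tokenizer_config.json is not picked up.
        configs = [n for n in zf.namelist() if PurePosixPath(n).name == "config.json"]
        if len(configs) != 1:
            return [f"expected exactly one config.json, found {len(configs)}"]
        config = json.loads(zf.read(configs[0]))

    labels = set(config.get("id2label", {}).values())
    problems = []
    missing = EXPECTED_TAGS - labels
    extra = labels - EXPECTED_TAGS
    if missing:
        problems.append(f"missing tags: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected tags: {sorted(extra)}")
    return problems
```

Returning a list of problems rather than raising lets an upload page show every issue at once before anything is pushed to the Hub.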
## Secrets (Space → Settings → Variables and secrets)
| Name | Purpose |
|------|---------|
| `HF_TOKEN` | A Hugging Face **write** token, so Manage Model can push to the Hub. |
| `MANAGE_PASSWORD` | Password protecting the Manage Model page (page is disabled if unset). |
| `DASHBOARD_MODEL_ID` | *(optional)* Override the model repo the app loads. |
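
One way these settings might be read at startup is sketched below. The variable names come from the table above and the default repo from the Models section; the `Settings` dataclass and `load_settings` function are hypothetical and only stand in for whatever `config.py` actually does.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PRIMARY_MODEL_ID = "Zeqhx/cv-parser-ner"  # default repo named in this README


@dataclass(frozen=True)
class Settings:  # hypothetical container, not part of the app
    model_id: str
    hf_token: Optional[str]
    manage_password: Optional[str]

    @property
    def manage_enabled(self) -> bool:
        # The Manage Model page is disabled when no password is set.
        return bool(self.manage_password)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Resolve settings from environment variables, falling back to defaults."""
    return Settings(
        # An empty or missing override falls back to the primary repo.
        model_id=env.get("DASHBOARD_MODEL_ID") or PRIMARY_MODEL_ID,
        hf_token=env.get("HF_TOKEN") or None,
        manage_password=env.get("MANAGE_PASSWORD") or None,
    )
```

Passing the environment in as a mapping keeps the function easy to exercise locally, for example `load_settings({"DASHBOARD_MODEL_ID": "someone/other-model"})`.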
## Run locally
```bash
pip install -r requirements.txt
streamlit run app.py
```
Local exported models in `../exported_models/…` are auto-offered in the picker
when present (see `config.MODEL_REGISTRY`).
## Layout
```
app.py                  # landing page + model-status banner
config.py               # model registry/resolution, labels, colours, sample CV
pages/
  1_Live_Parser.py
  2_Analytics.py
  3_Manage_Model.py     # password-gated model uploader -> Hub
lib/
  model.py              # load (local/Hub/fallback) + sliding-window inference
  extract.py            # PDF / DOCX / TXT -> text (adaptive spacing fix)
  ui.py                 # shared sidebar model picker
  viz.py                # entity HTML, token chips, word cloud, bar charts
```
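
On the sliding-window inference noted for `lib/model.py`: a long CV can exceed a transformer's maximum sequence length, so the token sequence is typically split into overlapping windows whose predictions are merged afterwards. The sketch below covers the windowing step only. It is not taken from the repository; `sliding_windows` is a hypothetical name, and the `max_len` and `stride` defaults are assumed values.

```python
from typing import Iterator, Sequence, Tuple


def sliding_windows(
    tokens: Sequence, max_len: int = 512, stride: int = 128
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) index pairs that cover `tokens` with overlapping windows.

    `stride` is the number of tokens shared by consecutive windows.
    """
    if not 0 <= stride < max_len:
        raise ValueError("stride must satisfy 0 <= stride < max_len")
    if len(tokens) == 0:
        return
    step = max_len - stride  # how far each new window advances
    start = 0
    while True:
        end = min(start + max_len, len(tokens))
        yield start, end
        if end == len(tokens):
            break  # the last window reached the end of the sequence
        start += step
```

For example, `list(sliding_windows(range(10), max_len=4, stride=1))` gives `[(0, 4), (3, 7), (6, 10)]`: every token is covered, and each boundary token appears in two windows so an entity that straddles a boundary is seen whole at least once when it is shorter than the overlap.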