metadata
title: Automated CV Parser
emoji: 🧩
colorFrom: indigo
colorTo: blue
sdk: docker
app_port: 7860
pinned: false
short_description: Resume NER - extracts Job Titles, Skills & Education

CV Parser Dashboard

A Streamlit app on top of the WQF7007 resume-NER model. Deployed as a Hugging Face Space; the model is loaded from the Hub so it can be updated without a redeploy.

  • 🔎 Live Parser: paste or upload one CV and watch it get tokenized and classified. The page shows sub-word token chips coloured by predicted label, the original text with highlighted entities, and a structured summary.
  • 📊 Analytics: batch-upload CVs (PDF / DOCX / TXT) for a skills word cloud and top-entity charts across the set.
  • 🔐 Manage Model: password-gated; teammates upload a new exported model and it is pushed to the Hub repo the app reads from.
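
To illustrate how sub-word predictions can become the structured summary the Live Parser shows, here is a minimal sketch. It assumes BIO-style labels (for example B-SKILL / I-SKILL) and WordPiece-style continuation pieces prefixed with "##"; both are assumptions, and group_entities is a hypothetical helper, not a function from this repo.

```python
def group_entities(tokens, labels):
    """Merge sub-word tokens and BIO labels into (entity_type, text) pairs."""
    entities = []
    current_type, current_words = None, []

    def flush():
        # Close the entity being built, if any, and reset the buffer.
        nonlocal current_type, current_words
        if current_type is not None and current_words:
            entities.append((current_type, " ".join(current_words)))
        current_type, current_words = None, []

    for token, label in zip(tokens, labels):
        if token.startswith("##"):
            # Continuation piece: glue it onto the previous word when inside an entity.
            if current_words:
                current_words[-1] += token[2:]
            continue
        if label == "O":
            flush()
            continue
        prefix, _, entity_type = label.partition("-")
        if prefix == "B" or entity_type != current_type:
            # A B- tag or a change of type starts a new entity.
            flush()
            current_type = entity_type
        current_words.append(token)
    flush()
    return entities


# Made-up tokens and labels, for illustration only.
tokens = ["Senior", "Data", "Engineer", "with", "Py", "##thon", "and", "SQL"]
labels = ["B-JOB_TITLE", "I-JOB_TITLE", "I-JOB_TITLE", "O",
          "B-SKILL", "I-SKILL", "O", "B-SKILL"]
entities = group_entities(tokens, labels)
```

The label of a continuation piece is ignored here, so each word takes the label of its first piece; other aggregation rules (majority vote, highest score) are equally plausible.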

Models: how swapping works

The app loads its model from a Hugging Face Hub repo (PRIMARY_MODEL_ID in config.py, default Zeqhx/cv-parser-ner). A Space's own disk is wiped on restart, so the Hub repo is the durable store. To update the live model:

  • Easiest: open Manage Model, enter the page password, upload a .zip of your exported model folder (the exported_models/… folder the training notebooks produce). It's validated against the 7-tag scheme and pushed to the Hub repo.
  • Or: push files straight to the Hub model repo (web UI / CLI).
  • Then click 🔄 Reload model in the sidebar to pick up the new weights.
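
The validation step on upload could look roughly like the sketch below. It assumes the 7-tag scheme is "O" plus B-/I- tags for job titles, skills, and education, and that the exported folder contains a Transformers-style config.json with an id2label mapping. The tag names and the validate_model_zip helper are hypothetical; the real label list lives in config.py.

```python
import io
import json
import zipfile

# Assumed tag set (7 tags). The names used by the actual project may differ.
EXPECTED_TAGS = {
    "O",
    "B-JOB_TITLE", "I-JOB_TITLE",
    "B-SKILL", "I-SKILL",
    "B-EDUCATION", "I-EDUCATION",
}


def validate_model_zip(data: bytes) -> list[str]:
    """Return a list of problems; an empty list means the archive looks usable."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return ["not a valid .zip archive"]
    with archive:
        # Accept config.json at the archive root or inside the exported folder.
        configs = [n for n in archive.namelist() if n.split("/")[-1] == "config.json"]
        if not configs:
            return ["config.json not found"]
        config = json.loads(archive.read(configs[0]))
    labels = set(config.get("id2label", {}).values())
    if labels != EXPECTED_TAGS:
        missing = sorted(EXPECTED_TAGS - labels)
        extra = sorted(labels - EXPECTED_TAGS)
        return [f"label mismatch: missing={missing}, unexpected={extra}"]
    return []
```

Checking the label set before pushing keeps a model trained on a different tag scheme from replacing the live one, since the dashboard's colours and summaries depend on those label names.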

The sidebar picker also has a Custom HF model ID box to load any repo live.

Secrets (Space → Settings → Variables and secrets)

| Name | Purpose |
| --- | --- |
| HF_TOKEN | A Hugging Face write token, so Manage Model can push to the Hub. |
| MANAGE_PASSWORD | Password protecting the Manage Model page (page is disabled if unset). |
| DASHBOARD_MODEL_ID | (optional) Override the model repo the app loads. |
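
Spaces expose secrets and variables to the app as environment variables. The sketch below shows one way the settings above could be read; the function names are hypothetical and the app's actual logic in config.py may differ.

```python
import hmac
import os

DEFAULT_MODEL_ID = "Zeqhx/cv-parser-ner"  # default repo named in this README


def resolve_model_id(env=os.environ) -> str:
    """Prefer the optional DASHBOARD_MODEL_ID override, else the default repo."""
    return env.get("DASHBOARD_MODEL_ID", "").strip() or DEFAULT_MODEL_ID


def manage_page_enabled(env=os.environ) -> bool:
    """The Manage Model page stays disabled when no password is configured."""
    return bool(env.get("MANAGE_PASSWORD"))


def password_ok(attempt: str, env=os.environ) -> bool:
    """Constant-time comparison of the entered password with the secret."""
    expected = env.get("MANAGE_PASSWORD", "")
    if not expected:
        return False
    return hmac.compare_digest(attempt.encode(), expected.encode())
```

Passing the environment as a parameter keeps these helpers easy to test without touching real secrets.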

Run locally

pip install -r requirements.txt
streamlit run app.py

Local exported models in ../exported_models/… are auto-offered in the picker when present (see config.MODEL_REGISTRY).
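
As a rough idea of how local models might be discovered for the picker, the sketch below lists every sub-folder of a root directory that contains a config.json. The find_local_models helper and the "has a config.json" rule are assumptions; config.MODEL_REGISTRY may be built differently.

```python
from pathlib import Path


def find_local_models(root: Path) -> dict[str, Path]:
    """Map folder name -> path for each sub-folder of root holding a config.json."""
    if not root.is_dir():
        # No local export directory: the picker would fall back to Hub models only.
        return {}
    return {
        child.name: child
        for child in sorted(root.iterdir())
        if child.is_dir() and (child / "config.json").is_file()
    }
```

Calling find_local_models(Path("../exported_models")) from the app directory would return an empty dict on a Space, where that folder does not exist.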

Layout

app.py                  # landing page + model-status banner
config.py               # model registry/resolution, labels, colours, sample CV
pages/
  1_Live_Parser.py
  2_Analytics.py
  3_Manage_Model.py     # password-gated model uploader -> Hub
lib/
  model.py              # load (local/Hub/fallback) + sliding-window inference
  extract.py            # PDF / DOCX / TXT -> text (adaptive spacing fix)
  ui.py                 # shared sidebar model picker
  viz.py                # entity HTML, token chips, word cloud, bar charts
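
The sliding-window inference mentioned for lib/model.py can be sketched as follows: split a long token sequence into overlapping windows, predict each window, then merge the per-window labels. The window and stride values and both function names are illustrative assumptions, not the values used by the app.

```python
def sliding_windows(n_tokens: int, window: int = 512, stride: int = 256):
    """Return (start, end) spans that cover range(n_tokens) with overlap."""
    if window <= 0 or not 0 < stride <= window:
        raise ValueError("need window > 0 and 0 < stride <= window")
    if n_tokens <= 0:
        return []
    spans, start = [], 0
    while True:
        end = min(start + window, n_tokens)
        spans.append((start, end))
        if end >= n_tokens:
            break
        start += stride
    return spans


def merge_window_labels(spans, window_labels, n_tokens):
    """For each token, keep the label from the window where it sits farthest
    from the window edge, i.e. where it had the most context on both sides."""
    merged = ["O"] * n_tokens
    best_margin = [-1] * n_tokens
    for (start, end), labels in zip(spans, window_labels):
        for offset, label in enumerate(labels):
            pos = start + offset
            margin = min(pos - start, end - 1 - pos)
            if margin > best_margin[pos]:
                best_margin[pos] = margin
                merged[pos] = label
    return merged
```

Preferring the window with the larger margin is one possible merge rule; averaging logits across overlapping windows is a common alternative.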