---
title: PrivacyShield
emoji: 🛡️
colorFrom: red
colorTo: gray
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
tags:
  - build-small-hackathon
  - privacy
  - pii
  - security
  - llm-guardrails
  - ner
  - track:backyard
  - sponsor:openbmb
  - sponsor:nvidia
  - achievement:offgrid
  - achievement:welltuned
short_description: Local PII & secret firewall for LLMs
---

# 🛡️ PrivacyShield: a local firewall for LLMs

**Strip PII and leaked secrets out of text *before* it ever reaches an LLM API, then put the real values back into the response. Nothing sensitive leaves your machine.**

Every week another company leaks customer data or an API key into an LLM prompt. The answer isn't "stop using AI"; it's a guardrail that runs **locally**, in front of the model.

> **Demo it in two clicks:** open the app → **Try the leaked-secret example → Sanitize.** Watch the
> AWS key, JWT, emails, Aadhaar (checksum-validated), names and address get masked: **"N blocked ·
> 0 leaked."** Then hit **Simulate the LLM round-trip**: the model only ever sees placeholders, and
> the real values are restored on your machine.

## Why this matters

- **The privacy requirement makes a small local model the *correct* design, not a compromise.** You literally cannot send PII to a cloud API to have it redacted. PrivacyShield runs entirely on-device.
- **It catches what regex can't.** Structured data (Aadhaar, PAN, cards, AWS keys, JWTs) is caught by high-precision, **checksum-validated** detectors. Context-dependent data (**names, addresses, orgs**) is caught by a **fine-tuned model**; regex is blind to these.
- **The round-trip keeps the LLM useful.** Mask → call the LLM with safe text → restore the originals into the answer locally. You get a real answer; the data never left.
- **Compliance-ready.** Aligns with privacy regimes (India's DPDP Act, GDPR) that require minimizing exposure of personal data to third parties.

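
The mask → call → restore round-trip and the checksum-validated detection described above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the regex patterns, the `mask`/`restore`/`luhn_valid` names, and the `CARD`/`EMAIL` labels are all assumptions made for the example, and the fine-tuned NER layer is left out entirely.

```python
import re

# Hypothetical patterns for illustration only; the project's real detectors are not shown here.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) >= 13 and total % 10 == 0


def mask(text: str) -> tuple[str, dict[str, str]]:
    """Swap detected values for placeholders; return the masked text and an in-memory vault."""
    vault: dict[str, str] = {}
    counters: dict[str, int] = {}

    def make_replacer(label, validator=None):
        def replace(match):
            value = match.group(0)
            if validator is not None and not validator(value):
                return value  # Checksum failed, so this is left unmasked.
            counters[label] = counters.get(label, 0) + 1
            placeholder = f"[{label}_{counters[label]}]"
            vault[placeholder] = value
            return placeholder
        return replace

    text = CARD_RE.sub(make_replacer("CARD", luhn_valid), text)
    text = EMAIL_RE.sub(make_replacer("EMAIL"), text)
    return text, vault


def restore(text: str, vault: dict[str, str]) -> str:
    """Put the original values back into a response that contains placeholders."""
    for placeholder, value in vault.items():
        text = text.replace(placeholder, value)
    return text


prompt = "Refund card 4111 1111 1111 1111 for jane@example.com, order 1234 5678 9012 3456."
safe_prompt, vault = mask(prompt)
print(safe_prompt)
# Refund card [CARD_1] for [EMAIL_1], order 1234 5678 9012 3456.

# Stand-in for the LLM call: the model only ever sees placeholders.
llm_reply = "I will confirm the refund on [CARD_1] by writing to [EMAIL_1]."
print(restore(llm_reply, vault))
# I will confirm the refund on 4111 1111 1111 1111 by writing to jane@example.com.
```

The order number has the right shape for a card but fails the Luhn check, so it is left alone; that is the sense in which a checksum keeps the deterministic layer high-precision. The vault is a plain dictionary that exists only for the duration of the call.
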
## How it works

```
your text
  → DETECT : checksum-validated regex (structured PII + secrets) ∪ fine-tuned NER (names/addresses)
  → MASK   : each finding → reversible placeholder, e.g. [PERSON_NAME_1], [SECRET_1];
             originals kept only in an in-memory vault (never logged, never sent)
  → CALL   : send the sanitized text to the LLM (the LLM only ever sees placeholders)
  → RESTORE: swap placeholders back to the real values in the response, locally
```

## What it detects

| Layer | Examples | How |
|---|---|---|
| **Structured PII** | email, phone, **Aadhaar** (Verhoeff checksum), **PAN**, IFSC, **card** (Luhn), UPI, IP | deterministic, high precision |
| **Secrets** | AWS keys (`AKIA…`), JWTs, GitHub tokens, private-key blocks, high-entropy strings | regex + Shannon entropy |
| **Contextual PII** | **person names, addresses, organizations** | **fine-tuned XLM-RoBERTa** |

## The model: real data, real evaluation (not vibes)

- **Base:** `FacebookAI/xlm-roberta-base` (~270M params; runs on CPU, no GPU needed at inference).
- **Fine-tuned on:** [`ai4privacy/pii-masking-200k`](https://huggingface.co/datasets/ai4privacy/pii-masking-200k) (real, span-labeled) **+ synthetic Indian PII** (valid-format Aadhaar/PAN/IFSC/UPI, Indian names & addresses) so it handles Indian documents, which generic tools miss.
- **Model:** [`perceptron01/privacyshield-ner`](https://huggingface.co/perceptron01/privacyshield-ner)
- **Param math:** 0.27B ≪ 32B cap ✅

**Evaluation (held-out mix of ai4privacy + synthetic Indian PII):**

| Method | PERSON recall | ADDRESS recall | structured PII / secrets |
|---|---|---|---|
| regex-only baseline | ~0.00 | ~0.00 | high (checksum-validated) |
| **PrivacyShield (regex + fine-tuned model)** | **~0.97** | **~0.97** | high |

**Recall is the metric we optimize:** a missed secret or PII item is a leak, so a false negative is far worse than over-masking. Overall fine-tuned F1 ≈ **0.97** (precision 0.97 / recall 0.97).

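
For readers who want to see how precision, recall and F1 relate, here is a small sketch of the standard definitions. The function name and the counts are invented for illustration and are not the project's evaluation data; they are simply chosen so that precision and recall both come out at 0.97.

```python
def precision_recall_f1(true_positives: int, false_positives: int, false_negatives: int):
    """Compute precision, recall and F1 from entity-level counts."""
    predicted = true_positives + false_positives  # everything the detector masked
    actual = true_positives + false_negatives     # everything that should have been masked
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Illustrative counts only (hypothetical): 97 correct masks, 3 over-masks, 3 misses.
precision, recall, f1 = precision_recall_f1(97, 3, 3)
print(round(precision, 2), round(recall, 2), round(f1, 2))
# 0.97 0.97 0.97
```

In this framing a false negative is a value that reached the LLM unmasked, while a false positive is only an unnecessary placeholder, which is why recall is the number to watch.
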
*Honest limitations:* the synthetic portion of the test set is formulaic and inflates absolute scores; the model occasionally labels an organization as ADDRESS (the value is still masked, so nothing leaks); free-text address boundaries are imperfect. The structured/secret layer is the high-precision backbone.

## Privacy by design

No database, no auth, no persistence. The detected values live only in an in-memory vault for the duration of a request; the downloadable **audit log contains placeholders only, never raw values.**

## Run locally

```bash
pip install -r requirements.txt
python app.py
```

## Tech

Gradio · Hugging Face Transformers · a fine-tuned XLM-RoBERTa token classifier · deterministic detectors with Verhoeff (Aadhaar) and Luhn (card) checksum validation + Shannon-entropy secret detection.

## Submission

- 🤗 **Live Space:** https://huggingface.co/spaces/build-small-hackathon/PrivacyShield
- 🎥 **Demo video:** https://drive.google.com/file/d/1TERBTamfhW87jlLip9EX8Sx9KYqgMAL4/view?usp=sharing
- 📣 **Social post:** https://www.linkedin.com/posts/aman-maurya-2a394924b_privacyshield-a-hugging-face-space-by-build-small-hackathon-share-7472416023334367234-oE6J/?utm_source=share&utm_medium=member_desktop&rcm=ACoAAD3l3lsBvHlGmHXJP3WiWP5GwQFJQ2g9QZI