---
title: PrivacyShield
emoji: 🛡️
colorFrom: red
colorTo: gray
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
tags:
- build-small-hackathon
- privacy
- pii
- security
- llm-guardrails
- ner
- track:backyard
- sponsor:openbmb
- sponsor:nvidia
- achievement:offgrid
- achievement:welltuned
short_description: Local PII & secret firewall for LLMs
---
# 🛡️ PrivacyShield: a local firewall for LLMs
**Strip PII and leaked secrets out of text *before* it ever reaches an LLM API, then put the
real values back into the response. Nothing sensitive leaves your machine.**
Every week another company leaks customer data or an API key into an LLM prompt. The answer
isn't "stop using AI"; it's a guardrail that runs **locally**, in front of the model.
> **Demo it in two clicks:** open the app → **Try the leaked-secret example → Sanitize.** Watch the
> AWS key, JWT, emails, Aadhaar (checksum-validated), names and address get masked: **"N blocked ·
> 0 leaked."** Then hit **Simulate the LLM round-trip**: the model only ever sees placeholders, and
> the real values are restored on your machine.
## Why this matters
- **The privacy requirement makes a small local model the *correct* design, not a compromise.** You
literally cannot send PII to a cloud API to have it redacted. PrivacyShield runs entirely on-device.
- **It catches what regex can't.** Structured data (Aadhaar, PAN, cards, AWS keys, JWTs) is caught by
high-precision, **checksum-validated** detectors. Context-dependent data (**names, addresses,
orgs**) is caught by a **fine-tuned model**; regex is blind to these.
- **The round-trip keeps the LLM useful.** Mask → call the LLM with safe text → restore the originals
into the answer locally. You get a real answer; the data never left.
- **Compliance-ready.** Aligns with privacy regimes (India's DPDP Act, GDPR) that require minimizing
exposure of personal data to third parties.
## How it works
```
your text
  → DETECT : checksum-validated regex (structured PII + secrets) ∪ fine-tuned NER (names/addresses)
  → MASK   : each finding → reversible placeholder, e.g. [PERSON_NAME_1], [SECRET_1]; originals kept
             only in an in-memory vault (never logged, never sent)
  → CALL   : send the sanitized text to the LLM (the LLM only ever sees placeholders)
  → RESTORE: swap placeholders back to the real values in the response, locally
```
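The four stages above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `app.py`: the two regex patterns, the `mask` / `restore` / `round_trip` names and the `call_llm` callback are all hypothetical stand-ins for the real detection layer and LLM client.

```python
import re
from typing import Callable

# Toy detectors standing in for the real detection layer (illustrative patterns only).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SECRET": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}
PLACEHOLDER = re.compile(r"\[[A-Z_]+_\d+\]")


def mask(text: str) -> tuple[str, dict[str, str]]:
    """Swap each finding for a numbered placeholder; return masked text and the vault."""
    findings = sorted(
        (m.start(), m.end(), label)
        for label, pattern in PATTERNS.items()
        for m in pattern.finditer(text)
    )
    vault: dict[str, str] = {}     # placeholder -> original value (memory only)
    assigned: dict[str, str] = {}  # original value -> placeholder, so repeats reuse one
    counters: dict[str, int] = {}
    parts, cursor = [], 0
    for start, end, label in findings:
        if start < cursor:         # overlaps an earlier finding that is already masked
            continue
        value = text[start:end]
        if value not in assigned:
            counters[label] = counters.get(label, 0) + 1
            assigned[value] = f"[{label}_{counters[label]}]"
            vault[assigned[value]] = value
        parts += [text[cursor:start], assigned[value]]
        cursor = end
    parts.append(text[cursor:])
    return "".join(parts), vault


def restore(text: str, vault: dict[str, str]) -> str:
    """Put the original values back; unknown placeholders are left untouched."""
    return PLACEHOLDER.sub(lambda m: vault.get(m.group(0), m.group(0)), text)


def round_trip(text: str, call_llm: Callable[[str], str]) -> str:
    """DETECT + MASK, CALL the model with safe text, then RESTORE locally."""
    masked, vault = mask(text)
    return restore(call_llm(masked), vault)
```

Because the vault is a local variable inside `round_trip`, it disappears when the call returns, which matches the "in-memory only" behaviour described above.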
## What it detects
| Layer | Examples | How |
|---|---|---|
| **Structured PII** | email, phone, **Aadhaar** (Verhoeff checksum), **PAN**, IFSC, **card** (Luhn), UPI, IP | deterministic, high precision |
| **Secrets** | AWS keys (`AKIA…`), JWTs, GitHub tokens, private-key blocks, high-entropy strings | regex + Shannon entropy |
| **Contextual PII** | **person names, addresses, organizations** | **fine-tuned XLM-RoBERTa** |
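
Checksum validation is what keeps the structured layer precise: a random 12-digit or 16-digit number rarely passes. The sketch below shows the two checksums named in the table, using the standard published Verhoeff tables and the standard Luhn rule. The function names are illustrative, and the real detectors may apply extra format rules that are not shown here.

```python
import re

# Standard Verhoeff tables: D is the dihedral-group multiplication table,
# P the position-dependent permutation, INV the inverse of each element.
D = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
    [2, 3, 4, 0, 1, 7, 8, 9, 5, 6], [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
    [4, 0, 1, 2, 3, 9, 5, 6, 7, 8], [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
    [6, 5, 9, 8, 7, 1, 0, 4, 3, 2], [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
    [8, 7, 6, 5, 9, 3, 2, 1, 0, 4], [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
]
P = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 5, 7, 6, 2, 8, 3, 0, 9, 4],
    [5, 8, 0, 3, 7, 9, 6, 1, 4, 2], [8, 9, 1, 6, 0, 4, 3, 5, 2, 7],
    [9, 4, 5, 3, 1, 2, 6, 8, 7, 0], [4, 2, 8, 6, 5, 7, 3, 9, 0, 1],
    [2, 7, 9, 3, 8, 0, 6, 4, 1, 5], [7, 0, 4, 6, 9, 1, 3, 2, 5, 8],
]
INV = [0, 4, 3, 2, 1, 5, 6, 7, 8, 9]


def verhoeff_valid(number: str) -> bool:
    """True if a digit string (check digit last) passes the Verhoeff checksum."""
    c = 0
    for i, ch in enumerate(reversed(number)):
        c = D[c][P[i % 8][int(ch)]]
    return c == 0


def verhoeff_check_digit(payload: str) -> int:
    """Compute the check digit to append to a digit string."""
    c = 0
    for i, ch in enumerate(reversed(payload)):
        c = D[c][P[(i + 1) % 8][int(ch)]]
    return INV[c]


def aadhaar_checksum_ok(candidate: str) -> bool:
    """12 digits (spaces allowed) whose last digit is a valid Verhoeff check digit."""
    digits = candidate.replace(" ", "")
    return bool(re.fullmatch(r"[0-9]{12}", digits)) and verhoeff_valid(digits)


def luhn_valid(candidate: str) -> bool:
    """True if a card-like number (spaces/hyphens allowed) passes the Luhn checksum."""
    digits = candidate.replace(" ", "").replace("-", "")
    if not re.fullmatch(r"[0-9]{2,}", digits):
        return False
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:              # double every second digit from the right
            d = d * 2 - 9 if d > 4 else d * 2
        total += d
    return total % 10 == 0
```

A regex match that fails its checksum can simply be dropped, so an order ID or timestamp that happens to be 12 digits long is far less likely to be masked as an Aadhaar number.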
## The model: real data, real evaluation (not vibes)
- **Base:** `FacebookAI/xlm-roberta-base` (~270M params; runs on CPU, no GPU needed at inference).
- **Fine-tuned on:** [`ai4privacy/pii-masking-200k`](https://huggingface.co/datasets/ai4privacy/pii-masking-200k)
(real, span-labeled) **+ synthetic Indian PII** (valid-format Aadhaar/PAN/IFSC/UPI, Indian names &
addresses) so it handles Indian documents, which generic tools miss.
- **Model:** [`perceptron01/privacyshield-ner`](https://huggingface.co/perceptron01/privacyshield-ner)
- **Param math:** 0.27B ≪ 32B cap ✅
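
For readers who want to call the model directly, here is a hedged sketch using the Transformers `pipeline` API with `aggregation_strategy="simple"`, which merges word pieces into whole entity spans. The helper names, the `min_score` threshold and the example label strings are assumptions; check the model card for the actual label set.

```python
def load_ner(model_id: str = "perceptron01/privacyshield-ner"):
    """Build a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the span helper below works without transformers installed.
    from transformers import pipeline

    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # group sub-word tokens into entity spans
    )


def to_spans(entities: list[dict], min_score: float = 0.5) -> list[tuple[int, int, str]]:
    """Convert pipeline output into sorted (start, end, label) character spans.

    min_score is an assumed threshold; lowering it trades precision for recall.
    """
    return sorted(
        (e["start"], e["end"], e["entity_group"])
        for e in entities
        if e["score"] >= min_score
    )


# Usage (needs network access and the transformers package):
#   ner = load_ner()
#   spans = to_spans(ner("Asha Rao lives at 12 MG Road, Pune."))
```

The resulting spans use character offsets, so they can be merged with the regex findings before masking.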
**Evaluation (held-out mix of ai4privacy + synthetic Indian PII):**
| Method | PERSON recall | ADDRESS recall | structured PII / secrets |
|---|---|---|---|
| regex-only baseline | ~0.00 | ~0.00 | high (checksum-validated) |
| **PrivacyShield (regex + fine-tuned model)** | **~0.97** | **~0.97** | high |
**Recall is the metric we optimize**: a missed secret or PII item is a leak, so a false negative is
far worse than over-masking. Overall fine-tuned F1 ≈ **0.97** (precision 0.97 / recall 0.97).
*Honest limitations:* the synthetic portion of the test set is formulaic and inflates absolute scores;
the model occasionally labels an organization as ADDRESS (the value is still masked, so nothing leaks);
free-text address boundaries are imperfect. The structured/secret layer is the high-precision backbone.
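
To make the metrics concrete, the sketch below computes precision, recall and F1 over sets of `(start, end, label)` spans. It assumes exact-match span scoring, which this README does not specify, so treat it as one plausible way to read the numbers and not as the evaluation script itself.

```python
def span_metrics(gold: set, predicted: set) -> dict[str, float]:
    """Exact-match span scoring: a prediction counts only if start, end and label agree."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy example: two of three gold spans found, plus one spurious prediction.
gold = {(0, 8, "PERSON"), (20, 30, "ADDRESS"), (40, 52, "AADHAAR")}
predicted = {(0, 8, "PERSON"), (20, 30, "ADDRESS"), (60, 65, "PERSON")}
scores = span_metrics(gold, predicted)
```

In this toy case the missed Aadhaar span lowers recall (a leak), while the spurious `PERSON` span lowers only precision (over-masking), which is why recall is the number to watch.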
## Privacy by design
No database, no auth, no persistence. The detected values live only in an in-memory vault for the
duration of a request; the downloadable **audit log contains placeholders only, never raw values.**
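
One way to guarantee a placeholder-only log is to build each entry from the vault's keys and never read its values. The sketch below is an assumption about how that could look; the entry fields (`placeholder`, `type`) and function names are hypothetical, not the app's actual log schema.

```python
import json


def audit_entries(vault: dict[str, str]) -> list[dict[str, str]]:
    """Describe each masked finding using only its placeholder, never its value."""
    entries = []
    for placeholder in vault:  # iterate keys only; raw values are never touched
        label = placeholder.strip("[]").rsplit("_", 1)[0]  # "[SECRET_1]" -> "SECRET"
        entries.append({"placeholder": placeholder, "type": label})
    return entries


def export_audit_log(vault: dict[str, str]) -> str:
    """Serialize the log and fail loudly if any raw value slipped into it."""
    log = json.dumps(audit_entries(vault), indent=2)
    for value in vault.values():
        if value in log:
            raise ValueError("raw value found in audit log")
    return log
```

The final scan in `export_audit_log` is a cheap safety net: even if a later change adds a field by mistake, the export refuses to write a log that contains a vaulted value.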
## Run locally
```bash
pip install -r requirements.txt
python app.py
```
## Tech
Gradio · Hugging Face Transformers · a fine-tuned XLM-RoBERTa token classifier · deterministic
detectors with Verhoeff (Aadhaar) and Luhn (card) checksum validation + Shannon-entropy secret detection.
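
Shannon entropy catches secrets that have no fixed prefix: random tokens use many different characters roughly equally often, so their bits-per-character score is high. A minimal sketch follows; the `min_length` and `min_entropy` thresholds are illustrative assumptions, not the values the app uses.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Entropy in bits per character of the string's character distribution."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_like_secret(token: str, min_length: int = 20, min_entropy: float = 4.0) -> bool:
    """Flag long tokens whose characters look random (assumed thresholds)."""
    return len(token) >= min_length and shannon_entropy(token) >= min_entropy
```

Ordinary words and repeated characters score low, while long random tokens score high, so the length check and the entropy check together keep normal prose from being flagged.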
## Submission
- 🤗 **Live Space:** https://huggingface.co/spaces/build-small-hackathon/PrivacyShield
- 🎥 **Demo video:** https://drive.google.com/file/d/1TERBTamfhW87jlLip9EX8Sx9KYqgMAL4/view?usp=sharing
- 📣 **Social post:** https://www.linkedin.com/posts/aman-maurya-2a394924b_privacyshield-a-hugging-face-space-by-build-small-hackathon-share-7472416023334367234-oE6J/?utm_source=share&utm_medium=member_desktop&rcm=ACoAAD3l3lsBvHlGmHXJP3WiWP5GwQFJQ2g9QZI