# DomainShield

**Privacy protection for LLM pipelines**

DomainShield is a research project focused on preventing sensitive data leakage when using external large language model APIs.

## Overview

The system acts as a middleware firewall:

- Masks sensitive information before sending data to external LLMs
- Handles both general PII and domain-specific sensitive entities
- Reconstructs the original content after receiving the response
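
The mask, send, reconstruct flow above can be illustrated with a minimal sketch. This is not the project's implementation: the function names `mask`, `unmask`, and `shielded_call`, the `[EMAIL_n]` placeholder format, and the deliberately simple email regex are all assumptions made for illustration, and `llm` stands in for any external API call.

```python
import re
from typing import Callable, Dict, Tuple

# Hypothetical pattern: a deliberately simple email regex, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mask(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each distinct email with a placeholder; return text and mapping."""
    value_to_placeholder: Dict[str, str] = {}
    placeholder_to_value: Dict[str, str] = {}

    def _substitute(match: "re.Match[str]") -> str:
        value = match.group(0)
        if value not in value_to_placeholder:
            # Repeated values reuse the same placeholder.
            placeholder = f"[EMAIL_{len(value_to_placeholder) + 1}]"
            value_to_placeholder[value] = placeholder
            placeholder_to_value[placeholder] = value
        return value_to_placeholder[value]

    return EMAIL_RE.sub(_substitute, text), placeholder_to_value


def unmask(text: str, mapping: Dict[str, str]) -> str:
    """Restore the original values wherever their placeholders appear."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


def shielded_call(text: str, llm: Callable[[str], str]) -> str:
    """Mask, call the (assumed) external model, then reconstruct the response."""
    masked, mapping = mask(text)
    return unmask(llm(masked), mapping)
```

In this sketch the mapping never leaves the caller's process, so the external model only sees placeholders. Reconstruction relies on the model echoing placeholders unchanged, which a real system would need to verify.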

## Key Focus

- PII masking (names, emails, identifiers)
- Domain-specific entity protection (internal terms, codes, private vocabularies)
- Multilingual robustness under noisy conditions
- Comparison of adaptation methods (prompting, RAG, fine-tuning, NER)

## Approach

We evaluate multiple strategies for detecting and masking sensitive data:

- Prompt-based methods
- Retrieval-augmented approaches (RAG)
- Supervised fine-tuning (LoRA)
- Token classification (NER)
- Hybrid and ensemble methods
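
As one possible reading of the hybrid and ensemble item, the sketch below combines several detectors by taking the union of their predicted spans and merging overlaps. It is an assumption-laden illustration, not the project's method. The two detectors here (a regex and a lexicon lookup) are simple stand-ins for the prompt-based, RAG, LoRA, or NER detectors listed above, and all names (`Span`, `regex_detector`, `make_lexicon_detector`, `merge_spans`, `ensemble_detect`) are hypothetical.

```python
import re
from typing import Callable, Iterable, List, Tuple

Span = Tuple[int, int, str]  # (start, end, label); end is exclusive
Detector = Callable[[str], List[Span]]


def regex_detector(text: str) -> List[Span]:
    """Stand-in general-PII detector: finds emails with a simple regex."""
    pattern = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"
    return [(m.start(), m.end(), "EMAIL") for m in re.finditer(pattern, text)]


def make_lexicon_detector(terms: Iterable[str], label: str = "DOMAIN") -> Detector:
    """Stand-in domain detector: exact matches against a private vocabulary."""
    term_list = list(terms)

    def detect(text: str) -> List[Span]:
        spans: List[Span] = []
        for term in term_list:
            for m in re.finditer(re.escape(term), text):
                spans.append((m.start(), m.end(), label))
        return spans

    return detect


def merge_spans(spans: Iterable[Span]) -> List[Span]:
    """Merge overlapping spans; the earliest-starting span keeps its label."""
    merged: List[Span] = []
    for start, end, label in sorted(spans):
        if merged and start < merged[-1][1]:
            prev_start, prev_end, prev_label = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end), prev_label)
        else:
            merged.append((start, end, label))
    return merged


def ensemble_detect(text: str, detectors: Iterable[Detector]) -> List[Span]:
    """Union ensemble: a character is sensitive if any detector flags it."""
    all_spans = [span for detect in detectors for span in detect(text)]
    return merge_spans(all_spans)
```

A union favors recall, which suits a leakage-prevention setting where a missed entity costs more than an over-masked one. Voting or confidence-weighted schemes are alternative designs.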

## Status

Active research project. Models, benchmarks, and demos coming soon.