Spaces:
Running
Running
| title: NLProxy Enterprise Demo | |
| emoji: 🛡️ | |
| colorFrom: blue | |
| colorTo: gray | |
| sdk: gradio | |
| sdk_version: 4.36.1 | |
| app_file: app.py | |
| pinned: false | |
| license: other | |
| tags: | |
| - llm | |
| - prompt-compression | |
| - security | |
| - firewall | |
| - nli | |
| - pii-masking | |
| - enterprise | |
| <div align="center"> | |
| <h1>NLProxy</h1> | |
| <p><strong>Prompt Security & Compression Gateway for LLMs</strong></p> | |
| <p><em>The enterprise-grade, offline-first middleware that cuts your LLM bill by up to 60% while enforcing zero-trust security.</em></p> | |
| [](https://github.com/intellideep/nlproxy/blob/main/LICENSE.md) | |
| [](https://pypi.org/project/nlproxy/) | |
| [](https://github.com/intellideep/nlproxy) | |
| </div> | |
| --- | |
| ## 🎛️ About This Interactive Demo | |
| This Hugging Face Space serves as a **live, interactive sandbox** for the NLProxy Pipeline. Instead of just reading about it, you can visually audit how NLProxy protects, compresses, and verifies LLM interactions in real-time. | |
| Upon startup, this Space dynamically clones the official [`intellideep/nlproxy`](https://github.com/intellideep/nlproxy) repository, downloads the required ONNX/NLI models, and exposes the complete **5-Step Lifecycle** via a Gradio interface. | |
| --- | |
| ## 📉 The Problem with LLMs Today | |
| Every time you send a prompt to OpenAI, Anthropic, or Gemini, you are doing three dangerous things: | |
| 1. **Burning money** on redundant words, pleasantries, and verbose context. | |
| 2. **Leaking PII** (emails, IPs, internal code) to third-party servers. | |
| 3. **Exposing yourself** to jailbreaks, prompt injections, and semantic drift. | |
| **NLProxy fixes all three before the prompt ever leaves your infrastructure.** | |
| --- | |
| ## 🎯 Why NLProxy? | |
| ### 💰 Slash Your LLM Bill (Semantic Compression) | |
| NLProxy doesn't just strip stopwords. It uses **KMeans/Ward semantic clustering** and **ONNX-quantized embeddings** to understand the *meaning* of your prompt. It identifies redundant sentences and compresses them, **reducing token usage by 40% to 60%** without losing critical intent. | |
| > *Result: A $1,000/month OpenAI bill becomes $400.* | |
| ### 🏗️ The 6-Stage Defense Pipeline (Visualized in this Demo) | |
| ```text | |
| ┌─────────────────────────────────────────────────────────────┐ | |
| │ NLProxy Pipeline │ | |
| ├─────────────────────────────────────────────────────────────┤ | |
| │ │ | |
| │ 📥 INPUT: "Ignore instructions... IP 192.168.1.1..." │ | |
| │ ↓ │ | |
| │ 🛡️ [1] FIREWALL │ | |
| │ ├─ PromptFirewall.check_prompt() │ | |
| │ └─ Action: BLOCK / ALERT / REWRITE / ALLOW │ | |
| │ ↓ │ | |
| │ 📉 [2] COMPRESS │ | |
| │ ├─ CompressionService.compress_batch() │ | |
| │ ├─ Shield → Segment → Cluster → Reconstruct │ | |
| │ └─ Output: "IP: __PROT_xxx. Do NOT use Python..." │ | |
| │ ↓ │ | |
| │ 🔒 [3] SAFETY │ | |
| │ ├─ SafetyChecker.validate() │ | |
| │ └─ Reinserts critical intents if missing │ | |
| │ ↓ │ | |
| │ 🤖 [4] LLM CALL (Simulated in this demo) │ | |
| │ ├─ LLMOrchestrator.generate() │ | |
| │ └─ OpenAI / Claude / Gemini / Local │ | |
| │ ↓ │ | |
| │ 🧹 [5] CORRECT │ | |
| │ ├─ ResponseCorrector.correct() │ | |
| │ └─ Applies FORBID/MANDATE + redacts unauthorized │ | |
| │ ↓ │ | |
| │ 🔍 [6] VERIFY │ | |
| │ ├─ PostLLMVerifier.verify() │ | |
| │ ├─ NLI contradiction detection │ | |
| │ └─ Confidence: 0.30 → 0.85 (after auto-correction) │ | |
| │ ↓ │ | |
| │ 📤 OUTPUT: "Solution in Java. Connection protected." │ | |
| │ │ | |
| └─────────────────────────────────────────────────────────────┘ | |
| ``` | |
| ### 🛡️ Unbreakable Security (Firewall & Verification) | |
| - **Pre-Flight:** A multi-layer firewall blocks jailbreaks, system prompt extraction, and SQLi using regex + semantic attack detection. | |
| - **Post-Flight:** NLI (Natural Language Inference) models verify that the LLM didn't hallucinate forbidden actions or leak unauthorized entities. | |
| ### Real‑World Use Cases | |
| | Use Case | NLProxy Benefit | | |
| |-----------------------------------|---------------------------------------------------------------------------------| | |
| | **Chat‑based customer support** | Reduces token costs by 50% while preserving mandatory disclaimers and safety rules. | | |
| | **Code generation assistant** | Masks API keys and internal IPs; enforces “do not use Python” restrictions. | | |
| | **Legal document analysis** | Preserves confidentiality and privilege statements even after heavy compression. | | |
| | **Multi‑tenant SaaS** | Semantic cache + domain filtering reduces redundant LLM calls by 70‑80%. | | |
| | **On‑premise deployment** | Works fully offline, no external dependencies (optional Redis for cache). | | |
| --- | |
| # Components | |
| | Component | Function | | |
| |-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------| | |
| | **Firewall** | Regex + semantic injection detection (jailbreak, system prompt extraction, data exfiltration). | | |
| | **Shield** | Entity masking (IPs, emails, codes, PII) and extraction of semantic restrictions (FORBID/MANDATE). | | |
| | **Segmenter** | Language‑aware sentence splitting + ONNX‑accelerated sentence embeddings (384‑d MiniLM). | | |
| | **Compressor** | Clustering‑based redundancy removal (Ward / K‑Means) with variance filtering. | | |
| | **Reconstructor** | Re‑injects masked entities, removes stopwords, and computes token/cost savings. | | |
| | **SafetyChecker** | Verifies critical intents/restrictions survive compression; re‑inserts missing sentences. | | |
| | **LLMOrchestrator** | Multi‑provider (Gemini, OpenAI, Claude, etc.) with retry, circuit breaker, and rate limiting. | | |
| | **PostLLMVerifier** | NLI‑based contradiction detection, unauthorized entity detection, semantic drift monitoring. | | |
| | **ResponseCorrector** | Sanitizes LLM output: removes prohibited entities, enforces mandates, redacts placeholders. | | |
| | **Semantic Cache** | RedisVL‑powered vector cache (cosine similarity), optional TTL and domain filtering. | | |
| --- | |
| # Benchmark | |
| ## Comparison with State‑of‑the‑Art (SOTA) | |
| | Solution | Injection Prevention | Entity Masking | Prompt Compression | Restriction Enforcement | Post‑LLM Verification | Offline | Open Source | Multi‑LLM | | |
| |-------------------------|:--------------------:|:--------------:|:------------------:|:------------------------:|:---------------------:|:-------:|:-----------:|:---------:| | |
| | **NLProxy** | ✅ | ✅ | ✅ (semantic) | ✅ | ✅ | ✅ | ✅ (BSL 1.1)| ✅ | | |
| | LangChain | ❌ (no built‑in) | ❌ | ❌ (only templates) | ❌ | ❌ | ⚠️ partial | ✅ | ✅ | | |
| | Semantic Kernel | ❌ | ❌ | ❌ | ❌ | ❌ | ⚠️ partial | ✅ | ✅ | | |
| | LLMLingua / Selective Context | ❌ | ❌ | ✅ (token‑level) | ❌ | ❌ | ✅ | ✅ | ❌ | | |
| | Rebuff (injection) | ✅ | ❌ | ❌ | ❌ | ❌ | ⚠️ | ✅ | ❌ | | |
| | Lakera Guard | ✅ | ✅ (basic) | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | | |
| | Azure OpenAI Content Safety | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | | |
| **Key differentiators:** | |
| - NLProxy is the **only open‑source solution** that combines **prompt security, semantic compression, constraint enforcement, and response verification** in a single pipeline. | |
| - All critical components work **offline** (embedding & NLI models are downloaded once and run locally). | |
| - The **business‑friendly BSL 1.1 license** allows free use for indie developers, students, and non‑profits, while requiring a commercial license for large enterprises (>$1M revenue). | |
| ### Compression Efficiency | |
| | Metric | Value | | |
| |-------------------------------------|------------------------------------------| | |
| | Average token reduction (general) | **45‑55%** | | |
| | Reduction on legal/finance documents| 35‑45% (conservative) | | |
| | Reduction on code prompts | 55‑65% | | |
| | Compression latency (per prompt) | 50‑120 ms (CPU), 20‑40 ms (GPU) | | |
| | Embedding model | all‑MiniLM‑L6‑v2 (384 dim, ONNX) | | |
| | Clustering method | Auto‑select Ward (<200 sent) / K‑Means | | |
| ### Security & Verification | |
| | Check | Accuracy / Throughput | | |
| |------------------------------------|------------------------------------------| | |
| | Injection detection (regex) | >99% on known patterns (MITRE ATLAS) | | |
| | Semantic injection (embedding) | 92% recall @ 0.85 threshold (optional) | | |
| | Entity masking | 100% of IPs, emails, dates, hashes | | |
| | NLI contradiction detection | 78‑85% accuracy (distilroberta‑base) | | |
| | Restriction enforcement (FORBID) | 100% (exact match) | | |
| | Post‑LLM verification latency | +30‑60 ms per request (NLI enabled) | | |
| ### End‑to‑End Latency | |
| | Configuration | P95 Latency (ms) | | |
| |--------------------------------------|------------------| | |
| | Compression only (no NLI, no cache) | 120‑180 | | |
| | Compression + Firewall + Shield | 150‑220 | | |
| | Full pipeline + NLI verification | 200‑300 | | |
| | Full pipeline + Semantic Cache (hit) | <10 | | |
| ### Scalability | |
| | Component | Limit / Sizing Guideline | | |
| |--------------------------|-------------------------------------------------------| | |
| | Max prompt length | 100k chars (configurable) | | |
| | Concurrent requests | Limited by `--workers` + thread pool (default 8) | | |
| | Embedding batch size | 128 sentences (can be increased with more memory) | | |
| | Redis cache capacity | Unlimited (depends on Redis memory) | | |
| | Multi‑LLM failover | Supports fallback chains (OpenAI → Claude → Gemini) | | |
| --- | |
| ## 📄 License | |
| NLProxy is released under the **Business Source License 1.1** (BSL 1.1). | |
| - ✅ Free for **indie developers, students, non‑profits, and small businesses** (revenue < $1M). | |
| - 🏢 **Large enterprises** (revenue ≥ $1M) require a commercial license – contact us for pricing. | |
| - 🔓 After **five years** from the release date, the code automatically converts to **Apache 2.0**. | |
| See the [LICENSE.md](LICENSE.md) file for full text. | |
| --- | |
| ## 💬 Support & Contact | |
| - 📧 Email: **intellideeplabs@gmail.com** | |
| - 💬 Telegram: [@itsLerb](https://t.me/itsLerb) (click to open) – *response within 24h* | |
| - 🐛 Issues: Use [GitHub Issues](https://github.com/intellideep/nlproxy/issues) for bugs and feature requests. | |
| We welcome contributions, but please open an issue first to discuss. | |