---
title: NLProxy Enterprise Demo
emoji: 🛡️
colorFrom: blue
colorTo: gray
sdk: gradio
sdk_version: 4.36.1
app_file: app.py
pinned: false
license: other
tags:
- llm
- prompt-compression
- security
- firewall
- nli
- pii-masking
- enterprise
---
<div align="center">
<h1>NLProxy</h1>
<p><strong>Prompt Security & Compression Gateway for LLMs</strong></p>
<p><em>The enterprise-grade, offline-first middleware that cuts your LLM bill by up to 60% while enforcing zero-trust security.</em></p>
[License](https://github.com/intellideep/nlproxy/blob/main/LICENSE.md)
[PyPI](https://pypi.org/project/nlproxy/)
[GitHub](https://github.com/intellideep/nlproxy)
</div>
---
## About This Interactive Demo

This Hugging Face Space is a **live, interactive sandbox** for the NLProxy pipeline. Rather than just reading about it, you can visually audit how NLProxy protects, compresses, and verifies LLM interactions in real time.

On startup, the Space clones the official [`intellideep/nlproxy`](https://github.com/intellideep/nlproxy) repository, downloads the required ONNX/NLI models, and exposes the complete **5-Step Lifecycle** via a Gradio interface.

---
## The Problem with LLMs Today

Every time you send a prompt to OpenAI, Anthropic, or Gemini, you are doing three dangerous things:

1. **Burning money** on redundant words, pleasantries, and verbose context.
2. **Leaking PII** (emails, IPs, internal code) to third-party servers.
3. **Exposing yourself** to jailbreaks, prompt injections, and semantic drift.

**NLProxy fixes all three before the prompt ever leaves your infrastructure.**

---
## 🎯 Why NLProxy?

### 💰 Slash Your LLM Bill (Semantic Compression)

NLProxy doesn't just strip stopwords. It uses **K-Means/Ward semantic clustering** and **ONNX-quantized embeddings** to capture the *meaning* of your prompt, identifies redundant sentences, and compresses them, **reducing token usage by 40% to 60%** without losing critical intent.

> *Result: at the top of that range, a $1,000/month OpenAI bill drops to about $400, assuming spend is dominated by prompt tokens.*
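
To make the idea of dropping semantically redundant sentences concrete, here is a minimal sketch. It deliberately swaps in a bag-of-words cosine similarity and greedy grouping as a stand-in for the ONNX embeddings and K-Means/Ward clustering described above. The function names and the 0.6 threshold are assumptions made for this illustration, not part of the NLProxy API.

```python
import math
import re
from collections import Counter


def bow_vector(sentence: str) -> Counter:
    """Lower-cased bag-of-words counts (a crude stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def drop_redundant(sentences: list[str], threshold: float = 0.6) -> list[str]:
    """Keep the first sentence of each near-duplicate group, preserving order."""
    kept: list[str] = []
    kept_vecs: list[Counter] = []
    for sentence in sentences:
        vec = bow_vector(sentence)
        # A sentence survives only if it is dissimilar to everything kept so far.
        if all(cosine(vec, other) < threshold for other in kept_vecs):
            kept.append(sentence)
            kept_vecs.append(vec)
    return kept


prompt = [
    "Please write a function that parses the log file.",
    "Can you please write a function that parses the log file for me?",
    "Do not use Python.",
]
compressed = drop_redundant(prompt)
```

In this toy input the second sentence restates the first and is dropped, while the restriction sentence shares no vocabulary with the others and is kept.
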
### The 6-Stage Defense Pipeline (Visualized in this Demo)

```text
┌─────────────────────────────────────────────────────────────┐
│                      NLProxy Pipeline                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  📥 INPUT: "Ignore instructions... IP 192.168.1.1..."       │
│      ↓                                                      │
│  🛡️ [1] FIREWALL                                            │
│     ├─ PromptFirewall.check_prompt()                        │
│     └─ Action: BLOCK / ALERT / REWRITE / ALLOW              │
│      ↓                                                      │
│     [2] COMPRESS                                            │
│     ├─ CompressionService.compress_batch()                  │
│     ├─ Shield → Segment → Cluster → Reconstruct             │
│     └─ Output: "IP: __PROT_xxx. Do NOT use Python..."       │
│      ↓                                                      │
│     [3] SAFETY                                              │
│     ├─ SafetyChecker.validate()                             │
│     └─ Reinserts critical intents if missing                │
│      ↓                                                      │
│  🤖 [4] LLM CALL (Simulated in this demo)                   │
│     ├─ LLMOrchestrator.generate()                           │
│     └─ OpenAI / Claude / Gemini / Local                     │
│      ↓                                                      │
│  🧹 [5] CORRECT                                             │
│     ├─ ResponseCorrector.correct()                          │
│     └─ Applies FORBID/MANDATE + redacts unauthorized        │
│      ↓                                                      │
│     [6] VERIFY                                              │
│     ├─ PostLLMVerifier.verify()                             │
│     ├─ NLI contradiction detection                          │
│     └─ Confidence: 0.30 → 0.85 (after auto-correction)      │
│      ↓                                                      │
│  📤 OUTPUT: "Solution in Java. Connection protected."       │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
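
The six stages in the diagram can be read as a chain of functions that pass a shared context along and stop early when the firewall blocks a prompt. The sketch below only illustrates that control flow: the `Context` class, the stage functions, and their toy logic are hypothetical stand-ins and do not reproduce the signatures of `PromptFirewall`, `CompressionService`, or the other NLProxy classes.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """State carried through the pipeline (hypothetical structure)."""
    prompt: str
    response: str = ""
    blocked: bool = False
    trace: list[str] = field(default_factory=list)


def firewall(ctx: Context) -> Context:
    # Toy rule: block one obvious injection phrase.
    ctx.blocked = "ignore previous instructions" in ctx.prompt.lower()
    return ctx


def compress(ctx: Context) -> Context:
    # Toy compression: collapse repeated whitespace only.
    ctx.prompt = " ".join(ctx.prompt.split())
    return ctx


def safety(ctx: Context) -> Context:
    # A real check would re-insert critical sentences lost during compression.
    return ctx


def llm_call(ctx: Context) -> Context:
    # Simulated model call, as in the demo.
    ctx.response = f"(simulated answer to: {ctx.prompt})"
    return ctx


def correct(ctx: Context) -> Context:
    # A real corrector would apply FORBID/MANDATE rules to ctx.response.
    return ctx


def verify(ctx: Context) -> Context:
    # A real verifier would run NLI contradiction detection on ctx.response.
    return ctx


STAGES = [firewall, compress, safety, llm_call, correct, verify]


def run_pipeline(prompt: str) -> Context:
    """Run each stage in order, recording it, and stop as soon as one blocks."""
    ctx = Context(prompt)
    for stage in STAGES:
        ctx = stage(ctx)
        ctx.trace.append(stage.__name__)
        if ctx.blocked:
            break
    return ctx
```

Calling `run_pipeline("Summarise   this   text")` walks through all six stages, whereas a prompt containing the blocked phrase stops after `firewall` and never reaches the simulated LLM call.
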
### 🛡️ Unbreakable Security (Firewall & Verification)
- **Pre-Flight:** A multi-layer firewall blocks jailbreaks, system prompt extraction, and SQLi using regex + semantic attack detection.
- **Post-Flight:** NLI (Natural Language Inference) models verify that the LLM didn't hallucinate forbidden actions or leak unauthorized entities.
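
The regex layer of the pre-flight check can be pictured as a small rule table that maps patterns to actions. The sketch below is a simplified illustration: the patterns, the `Action` enum, and the `check` function are assumptions written for this example (the REWRITE action and the semantic layer are left out), and a real rule set would be much larger.

```python
import re
from enum import Enum


class Action(Enum):
    ALLOW = "ALLOW"
    ALERT = "ALERT"
    BLOCK = "BLOCK"


# Illustrative patterns only (hypothetical, not NLProxy's actual rules).
RULES = [
    (re.compile(r"ignore\s+(all\s+|any\s+)?(previous\s+|prior\s+)?instructions", re.I), Action.BLOCK),
    (re.compile(r"(reveal|print|show)\s+(your\s+)?system\s+prompt", re.I), Action.BLOCK),
    (re.compile(r"\bunion\s+select\b|;\s*drop\s+table\b", re.I), Action.BLOCK),
    (re.compile(r"\bbase64\b", re.I), Action.ALERT),
]

SEVERITY = {Action.ALLOW: 0, Action.ALERT: 1, Action.BLOCK: 2}


def check(prompt: str) -> Action:
    """Return the most severe action triggered by any matching rule."""
    result = Action.ALLOW
    for pattern, action in RULES:
        if pattern.search(prompt) and SEVERITY[action] > SEVERITY[result]:
            result = action
    return result
```

Returning the most severe match, rather than the first, keeps the outcome independent of rule order.
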
### Real-World Use Cases

| Use Case                        | NLProxy Benefit                                                                      |
|---------------------------------|--------------------------------------------------------------------------------------|
| **Chat-based customer support** | Reduces token costs by 50% while preserving mandatory disclaimers and safety rules.  |
| **Code generation assistant**   | Masks API keys and internal IPs; enforces "do not use Python" restrictions.          |
| **Legal document analysis**     | Preserves confidentiality and privilege statements even after heavy compression.     |
| **Multi-tenant SaaS**           | Semantic cache + domain filtering reduces redundant LLM calls by 70–80%.             |
| **On-premise deployment**       | Works fully offline, no external dependencies (optional Redis for cache).            |
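
The "semantic cache + domain filtering" row can be illustrated with a small in-memory lookup: a cached response is reused only when the new prompt's embedding is close enough to a stored one and both belong to the same domain. This is a sketch under stated assumptions. It uses plain Python lists instead of Redis, and the `SemanticCache` class, its method names, and the 0.9 threshold are hypothetical.

```python
import math
from typing import Optional


class SemanticCache:
    """Toy in-memory vector cache with cosine similarity and domain filtering."""

    def __init__(self, threshold: float = 0.9) -> None:
        self.threshold = threshold
        # Each entry is (domain, embedding, cached response).
        self._entries: list[tuple[str, list[float], str]] = []

    @staticmethod
    def _cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def put(self, domain: str, embedding: list[float], response: str) -> None:
        self._entries.append((domain, embedding, response))

    def get(self, domain: str, embedding: list[float]) -> Optional[str]:
        """Return the best match within the same domain, or None on a miss."""
        best_score, best_response = 0.0, None
        for entry_domain, entry_embedding, response in self._entries:
            if entry_domain != domain:
                continue  # domain filtering: never share answers across tenants
            score = self._cosine(embedding, entry_embedding)
            if score >= self.threshold and score > best_score:
                best_score, best_response = score, response
        return best_response


cache = SemanticCache(threshold=0.9)
cache.put("billing", [1.0, 0.0, 0.2], "Refunds take 5 days.")
hit = cache.get("billing", [0.9, 0.1, 0.2])
miss_other_domain = cache.get("support", [1.0, 0.0, 0.2])
miss_dissimilar = cache.get("billing", [0.0, 1.0, 0.0])
```

The three-dimensional vectors are made up for the example; in practice they would come from the sentence-embedding model.
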
---
## Components

| Component             | Function                                                                                            |
|-----------------------|-----------------------------------------------------------------------------------------------------|
| **Firewall**          | Regex + semantic injection detection (jailbreak, system prompt extraction, data exfiltration).      |
| **Shield**            | Entity masking (IPs, emails, codes, PII) and extraction of semantic restrictions (FORBID/MANDATE).  |
| **Segmenter**         | Language-aware sentence splitting + ONNX-accelerated sentence embeddings (384-d MiniLM).            |
| **Compressor**        | Clustering-based redundancy removal (Ward / K-Means) with variance filtering.                       |
| **Reconstructor**     | Re-injects masked entities, removes stopwords, and computes token/cost savings.                     |
| **SafetyChecker**     | Verifies critical intents/restrictions survive compression; re-inserts missing sentences.           |
| **LLMOrchestrator**   | Multi-provider (Gemini, OpenAI, Claude, etc.) with retry, circuit breaker, and rate limiting.       |
| **PostLLMVerifier**   | NLI-based contradiction detection, unauthorized entity detection, semantic drift monitoring.        |
| **ResponseCorrector** | Sanitizes LLM output: removes prohibited entities, enforces mandates, redacts placeholders.         |
| **Semantic Cache**    | RedisVL-powered vector cache (cosine similarity), optional TTL and domain filtering.                |
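
The Shield and Reconstructor rows describe a mask-then-restore round trip: sensitive entities are replaced by placeholders before the prompt leaves your infrastructure and re-injected afterwards. A minimal sketch of that round trip follows. The two regexes, the `__PROT_NNN__` placeholder format, and the `mask`/`unmask` helpers are assumptions made for illustration and cover only IPv4 addresses and emails.

```python
import re

# Simplified patterns (assumed for this example, not exhaustive).
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def mask(text: str) -> tuple[str, dict[str, str]]:
    """Replace emails and IPv4 addresses with placeholders; return text and mapping."""
    mapping: dict[str, str] = {}

    def _substitute(match: re.Match) -> str:
        placeholder = f"__PROT_{len(mapping):03d}__"
        mapping[placeholder] = match.group(0)
        return placeholder

    masked = EMAIL_RE.sub(_substitute, text)
    masked = IP_RE.sub(_substitute, masked)
    return masked, mapping


def unmask(text: str, mapping: dict[str, str]) -> str:
    """Re-inject the original entities in place of their placeholders."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


original = "Contact admin@example.com from 192.168.1.1."
masked, mapping = mask(original)
restored = unmask(masked, mapping)
```

Keeping the mapping local means the third-party model only ever sees the placeholders.
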
---
## Benchmark

### Comparison with State-of-the-Art (SOTA)

| Solution                      | Injection Prevention | Entity Masking | Prompt Compression | Restriction Enforcement | Post-LLM Verification | Offline    | Open Source  | Multi-LLM |
|-------------------------------|:--------------------:|:--------------:|:------------------:|:-----------------------:|:---------------------:|:----------:|:------------:|:---------:|
| **NLProxy**                   | ✅                   | ✅             | ✅ (semantic)      | ✅                      | ✅                    | ✅         | ✅ (BSL 1.1) | ✅        |
| LangChain                     | ❌ (no built-in)     | ❌             | ❌ (only templates)| ❌                      | ❌                    | ⚠️ partial | ✅           | ✅        |
| Semantic Kernel               | ❌                   | ❌             | ❌                 | ❌                      | ❌                    | ⚠️ partial | ✅           | ✅        |
| LLMLingua / Selective Context | ❌                   | ❌             | ✅ (token-level)   | ❌                      | ❌                    | ✅         | ✅           | ❌        |
| Rebuff (injection)            | ✅                   | ❌             | ❌                 | ❌                      | ❌                    | ⚠️         | ✅           | ❌        |
| Lakera Guard                  | ✅                   | ✅ (basic)     | ❌                 | ❌                      | ❌                    | ❌         | ❌           | ❌        |
| Azure OpenAI Content Safety   | ✅                   | ❌             | ❌                 | ❌                      | ❌                    | ❌         | ❌           | ✅        |

**Key differentiators:**
- Among the solutions compared above, NLProxy is the **only source-available one** that combines **prompt security, semantic compression, constraint enforcement, and response verification** in a single pipeline.
- All critical components work **offline** (embedding & NLI models are downloaded once and run locally).
- The **business-friendly BSL 1.1 license** allows free use for indie developers, students, non-profits, and small businesses, while requiring a commercial license for large enterprises (revenue ≥ $1M).
### Compression Efficiency

| Metric                               | Value                                          |
|--------------------------------------|------------------------------------------------|
| Average token reduction (general)    | **45–55%**                                     |
| Reduction on legal/finance documents | 35–45% (conservative)                          |
| Reduction on code prompts            | 55–65%                                         |
| Compression latency (per prompt)     | 50–120 ms (CPU), 20–40 ms (GPU)                |
| Embedding model                      | all-MiniLM-L6-v2 (384 dim, ONNX)               |
| Clustering method                    | Auto-select Ward (<200 sentences) / K-Means    |
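
To translate a token-reduction percentage into money, note that compression shrinks only the prompt (input) side of the bill; completion tokens are unaffected. The helper below is a back-of-the-envelope sketch. The function name, the token volumes, and the per-1k prices are made-up assumptions, not NLProxy output or real provider pricing.

```python
def monthly_cost(
    prompt_tokens: float,
    completion_tokens: float,
    price_in_per_1k: float,
    price_out_per_1k: float,
    reduction: float = 0.0,
) -> float:
    """Estimated spend when prompt tokens are cut by `reduction` (a fraction in [0, 1])."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    kept_prompt_tokens = prompt_tokens * (1.0 - reduction)
    input_cost = kept_prompt_tokens / 1000 * price_in_per_1k
    output_cost = completion_tokens / 1000 * price_out_per_1k  # unchanged by compression
    return input_cost + output_cost


# Hypothetical workload: 80M prompt tokens and 10M completion tokens per month.
baseline = monthly_cost(80_000_000, 10_000_000, 0.01, 0.03)
compressed = monthly_cost(80_000_000, 10_000_000, 0.01, 0.03, reduction=0.5)
savings_pct = (baseline - compressed) / baseline * 100
```

With these invented numbers, a 50% prompt reduction lowers the total bill by roughly 36%, because the completion side is untouched. The overall saving approaches the token-reduction figure only when prompt tokens dominate spend.
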
### Security & Verification

| Check                            | Accuracy / Latency                       |
|----------------------------------|------------------------------------------|
| Injection detection (regex)      | >99% on known patterns (MITRE ATLAS)     |
| Semantic injection (embedding)   | 92% recall @ 0.85 threshold (optional)   |
| Entity masking                   | 100% of IPs, emails, dates, hashes       |
| NLI contradiction detection      | 78–85% accuracy (distilroberta-base)     |
| Restriction enforcement (FORBID) | 100% (exact match)                       |
| Post-LLM verification latency    | +30–60 ms per request (NLI enabled)      |
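
Exact-match restriction enforcement is deterministic, which is why the table can quote 100% for it. The sketch below shows one way such a post-LLM correction step could look: forbidden terms are redacted by whole-word, case-insensitive matching and missing mandated sentences are appended. The `enforce` function, the `[REDACTED]` marker, and the change-log format are hypothetical and not the `ResponseCorrector` API.

```python
import re


def enforce(response: str, forbid: list[str], mandate: list[str]) -> tuple[str, list[str]]:
    """Redact forbidden terms and append missing mandated sentences.

    Returns the corrected response and a list describing each change made.
    """
    changes: list[str] = []
    for term in forbid:
        # Whole-word, case-insensitive exact match on the literal term.
        pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        if pattern.search(response):
            response = pattern.sub("[REDACTED]", response)
            changes.append(f"redacted:{term}")
    for sentence in mandate:
        if sentence.lower() not in response.lower():
            response = f"{response.rstrip()} {sentence}"
            changes.append(f"appended:{sentence}")
    return response, changes


fixed, log = enforce(
    "Here is a Python script for you.",
    forbid=["Python"],
    mandate=["This is not legal advice."],
)
```

Exact matching cannot catch paraphrases of a forbidden term, which is where the NLI-based verification step comes in.
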
### End-to-End Latency

| Configuration                        | P95 Latency (ms) |
|--------------------------------------|------------------|
| Compression only (no NLI, no cache)  | 120–180          |
| Compression + Firewall + Shield      | 150–220          |
| Full pipeline + NLI verification     | 200–300          |
| Full pipeline + Semantic Cache (hit) | <10              |
### Scalability

| Component            | Limit / Sizing Guideline                              |
|----------------------|-------------------------------------------------------|
| Max prompt length    | 100k chars (configurable)                             |
| Concurrent requests  | Limited by `--workers` + thread pool (default 8)      |
| Embedding batch size | 128 sentences (can be increased with more memory)     |
| Redis cache capacity | Unlimited (depends on Redis memory)                   |
| Multi-LLM failover   | Supports fallback chains (OpenAI → Claude → Gemini)   |
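
A fallback chain such as OpenAI → Claude → Gemini amounts to trying providers in order and moving on when one keeps failing. The sketch below illustrates that pattern with fake provider callables. `generate_with_fallback`, `AllProvidersFailed`, and the retry count are assumptions for this example, and circuit breaking and rate limiting are left out.

```python
from typing import Callable, Sequence

Provider = tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has exhausted its attempts."""


def generate_with_fallback(
    prompt: str,
    providers: Sequence[Provider],
    retries_per_provider: int = 1,
) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, response) from the first success."""
    errors: list[str] = []
    for name, call in providers:
        for attempt in range(1 + retries_per_provider):
            try:
                return name, call(prompt)
            except Exception as exc:  # a real client would catch narrower error types
                errors.append(f"{name}#{attempt}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


def flaky(_prompt: str) -> str:
    # Stand-in for a provider that is currently down.
    raise TimeoutError("timed out")


def healthy(prompt: str) -> str:
    # Stand-in for a provider that answers normally.
    return f"echo: {prompt}"


used, text = generate_with_fallback(
    "hi", [("openai", flaky), ("claude", healthy), ("gemini", healthy)]
)
```

Here the first provider fails both attempts, so the call is served by the second one in the chain.
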
---
## License

NLProxy is released under the **Business Source License 1.1** (BSL 1.1).

- ✅ Free for **indie developers, students, non-profits, and small businesses** (revenue < $1M).
- 🏢 **Large enterprises** (revenue ≥ $1M) require a commercial license; contact us for pricing.
- After **five years** from the release date, the code automatically converts to **Apache 2.0**.

See the [LICENSE.md](LICENSE.md) file for full text.

---
## Support & Contact

- 📧 Email: **intellideeplabs@gmail.com**
- 💬 Telegram: [@itsLerb](https://t.me/itsLerb) (click to open), *response within 24h*
- Issues: Use [GitHub Issues](https://github.com/intellideep/nlproxy/issues) for bugs and feature requests.

We welcome contributions, but please open an issue first to discuss.