# NLProxy Core Module Reference
This document surveys the core library modules in `core/`.
## Purpose
Core modules implement the prompt compression pipeline, constraint enforcement, and verification logic used by NLProxy.
## Files and Responsibilities
### `core/compressor.py`
#### Purpose
Performs clustering-based sentence compression to reduce prompt size while preserving core meaning.
#### Primary Class
- `SemanticCompressor`
#### Key Concepts
- Uses sentence embeddings to identify semantic redundancy.
- Employs `KMeans` clustering with `n_init="auto"` and `max_iter=300`.
- Returns compressed sentence subsets and compression metrics.
#### Performance
- Complexity: O(n · k · d) for clustering, where n = sentence count, k = cluster count, d = embedding dimension.
- Best suited to prompts with fewer than 100 sentences; when k grows with n, clustering cost approaches O(n² · d).
#### Edge Cases
- Handles empty sentence lists as a no-op.
- Falls back to conservative compression when cluster quality is low.
### `core/corrector.py`
#### Purpose
Sanitizes and post-corrects LLM outputs to align with safety constraints extracted during prompt shielding.
#### Primary Class
- `ResponseCorrector`
#### Behavior
- Applies heuristic cleanup to generated text.
- Mends broken placeholders and ensures protected entity tokens are restored correctly.
#### Edge Cases
- Responds defensively when correction data is missing.
- Avoids overcorrection by preserving valid syntax.
### `core/model_manager.py`
#### Purpose
Coordinates local model verification, SHA256 checksum validation, and on-demand download triggers.
#### Primary Classes
- `ModelConfig`
- `ModelManager`
#### Features
- Thread-safe singleton initialization.
- `verify_zip_checksum(zip_path, expected_hash)` validates download integrity.
- `ensure_ready()` is async-safe and idempotent.
- Supports synchronous initialization via `sync_ensure_ready()`.
#### Edge Cases
- Raises `RuntimeError` if required models remain missing after download.
- Uses atomic file moves to avoid partial ZIP writes.
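As a rough sketch of how checksum validation and an atomic move might fit together, the example below streams a file through SHA-256 and only moves it into place once the digest matches. The body of `verify_zip_checksum` is an assumption based on its signature, and `publish_if_valid`, the `.part` suffix, and the file names are hypothetical.

```python
import hashlib
import os
import tempfile


def verify_zip_checksum(zip_path: str, expected_hash: str) -> bool:
    """Stream the file through SHA-256 and compare hex digests."""
    digest = hashlib.sha256()
    with open(zip_path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):  # 1 MiB chunks
            digest.update(chunk)
    return digest.hexdigest() == expected_hash.lower()


def publish_if_valid(tmp_path: str, final_path: str, expected_hash: str) -> None:
    """Hypothetical helper: move a verified download into place atomically."""
    if not verify_zip_checksum(tmp_path, expected_hash):
        os.remove(tmp_path)
        raise RuntimeError(f"checksum mismatch for {final_path}")
    # os.replace is atomic when both paths are on the same filesystem,
    # so readers never observe a partially written archive.
    os.replace(tmp_path, final_path)


# Demo with a throwaway file standing in for a downloaded archive.
workdir = tempfile.mkdtemp()
tmp = os.path.join(workdir, "model.zip.part")
final = os.path.join(workdir, "model.zip")
payload = b"example archive bytes"
with open(tmp, "wb") as fh:
    fh.write(payload)
publish_if_valid(tmp, final, hashlib.sha256(payload).hexdigest())
```

Writing to a temporary name first and renaming afterwards means an interrupted download leaves only the `.part` file behind.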
### `core/reconstructor.py`
#### Purpose
Reconstructs compressed prompts by reinserting protected entities, formatting, and token optimization artifacts.
#### Primary Classes
- `ReconstructionResult`
- `PromptReconstructor`
#### Behavior
- Rebuilds final prompt text from compressed sentences and placeholder maps.
- Produces token counts and compression metrics.
#### Complexity
- Reconstruction is O(n) in the number of sentences and placeholder replacements.
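A linear-time re-injection pass might look like the sketch below. The `[[KIND_hex]]` placeholder syntax and the `reconstruct` function are assumptions for illustration; the real `PromptReconstructor` may use a different format and also track token counts.

```python
import re

# Assumed placeholder syntax, e.g. [[EMAIL_3fa2]]; the real format may differ.
PLACEHOLDER = re.compile(r"\[\[[A-Z]+_[0-9a-f]+\]\]")


def reconstruct(sentences: list[str], placeholder_map: dict[str, str]) -> str:
    """Join compressed sentences and restore protected spans in one pass."""
    text = " ".join(sentences)
    # Each placeholder is looked up once; unknown placeholders are left as-is.
    return PLACEHOLDER.sub(
        lambda m: placeholder_map.get(m.group(0), m.group(0)), text
    )


compressed = ["Email [[EMAIL_3fa2]] about the outage.", "Keep it brief."]
restored = reconstruct(compressed, {"[[EMAIL_3fa2]]": "ops@example.com"})
```

A single regex substitution with a dictionary lookup avoids rescanning the text once per placeholder, which keeps the cost proportional to the text length.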
### `core/restriction.py`
#### Purpose
Represents extracted restrictions, blocklists, and prompt constraints during shielding.
#### Primary Classes
- `Restriction`
- `RestrictionGraph`
#### Behavior
- Encodes restriction rules as immutable dataclasses.
- Builds a directed graph of dependents for conflict detection.
#### Edge Cases
- Handles conflicting restrictions with explicit priority logic.
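To make the data model concrete, here is a minimal sketch of an immutable restriction record and a dependents graph. The field names (`name`, `rule`, `priority`), the method names, and the "higher priority wins" rule are all assumptions; the source states only that restrictions are immutable dataclasses and that conflicts are resolved by explicit priority.

```python
from collections import defaultdict
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Restriction:
    name: str
    rule: str
    priority: int = 0  # assumed: larger value wins a conflict


class RestrictionGraph:
    def __init__(self) -> None:
        self._nodes: dict[str, Restriction] = {}
        self._dependents: dict[str, set[str]] = defaultdict(set)

    def add(self, restriction: Restriction, depends_on=()) -> None:
        self._nodes[restriction.name] = restriction
        for parent in depends_on:
            self._dependents[parent].add(restriction.name)  # edge parent -> child

    def dependents(self, name: str) -> set[str]:
        """All restrictions reachable from `name` (iterative depth-first search)."""
        seen, stack = set(), [name]
        while stack:
            for child in self._dependents.get(stack.pop(), ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen

    def resolve(self, a: str, b: str) -> Restriction:
        """Pick the winner of a conflict: higher priority, ties go to `a`."""
        ra, rb = self._nodes[a], self._nodes[b]
        return ra if ra.priority >= rb.priority else rb


graph = RestrictionGraph()
graph.add(Restriction("no_pii", "never reveal personal data", priority=10))
graph.add(Restriction("no_email", "never output email addresses", priority=5),
          depends_on=["no_pii"])
graph.add(Restriction("cite_contact", "include a contact email", priority=1),
          depends_on=["no_email"])
winner = graph.resolve("no_email", "cite_contact")
```

Freezing the dataclass makes each restriction hashable and safe to share between graph nodes without defensive copies.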
### `core/safety.py`
#### Purpose
Validates compressed prompts and generated responses against safety policies.
#### Primary Classes
- `SafetyReport`
- `SafetyChecker`
#### Features
- Enforces semantic drift thresholds.
- Optional perplexity-based checks.
- Supports multiple safety modes.
#### Performance
- Safety validation is lightweight compared to embedding inference.
- Perplexity checks are conditional and only activate when enabled.
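One way a semantic drift threshold could work is sketched below, assuming drift is measured as one minus the cosine similarity between the original and compressed prompt embeddings. The metric, the `max_drift` default of 0.15, and the fields on this stand-in `SafetyReport` are illustrative assumptions, not values taken from the module.

```python
import math
from dataclasses import dataclass


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class SafetyReport:  # stand-in with assumed fields
    drift: float
    passed: bool


def check_drift(original_vec, compressed_vec, max_drift: float = 0.15) -> SafetyReport:
    """Flag the compressed prompt when it drifts too far from the original."""
    drift = 1.0 - cosine_similarity(original_vec, compressed_vec)
    return SafetyReport(drift=drift, passed=drift <= max_drift)


close = check_drift([1.0, 0.0], [0.9, 0.1])    # nearly the same direction
far = check_drift([1.0, 0.0], [0.0, 1.0])      # orthogonal vectors
```

Because this check is a few arithmetic operations on vectors that already exist, it is cheap relative to producing the embeddings in the first place.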
### `core/segmenter.py`
#### Purpose
Segments text into sentences and generates dense embeddings.
#### Primary Classes
- `EmbeddingBackend`
- `SegmentationConfig`
- `SemanticSegmenter`
#### Features
- Supports ONNX and PyTorch backends.
- Uses local model artifacts from `nlproxy/models/`.
- Selects CPU INT8 ONNX models when `onnx_int8=True`.
- Exposes async inference via `segment_and_encode_async()`.
#### Performance
- Sentence segmentation is O(n) in prompt length.
- Embedding inference latency depends on backend and hardware.
- Recommended batch sizes: `32-128` for CPU production.
#### Edge Cases
- Fails fast when local model files are missing.
- Uses `NLPROXY_MODELS_DIR` environment variable for model path override.
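The path override and fail-fast behavior could be combined as in this sketch. Only the `NLPROXY_MODELS_DIR` variable and the `nlproxy/models/` default come from the description above; `resolve_models_dir` and the `model.onnx` file name are hypothetical.

```python
import os
import tempfile
from pathlib import Path

DEFAULT_MODELS_DIR = Path("nlproxy/models")


def resolve_models_dir(required=("model.onnx",)) -> Path:
    """Resolve the model directory and fail fast if any artifact is missing."""
    root = Path(os.environ.get("NLPROXY_MODELS_DIR", DEFAULT_MODELS_DIR))
    missing = [name for name in required if not (root / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing model files in {root}: {', '.join(missing)}")
    return root


# Demo: point the override at a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    os.environ["NLPROXY_MODELS_DIR"] = tmp
    try:
        resolve_models_dir()
        failed_fast = False
    except FileNotFoundError:
        failed_fast = True  # directory exists but the model file does not
    (Path(tmp) / "model.onnx").write_bytes(b"")
    resolved = resolve_models_dir()
os.environ.pop("NLPROXY_MODELS_DIR", None)
```

Checking for files at startup surfaces a misconfigured path immediately rather than at the first inference call.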
### `core/shield.py`
#### Purpose
Protects sensitive content, extracts protected entities, and applies domain-specific shielding.
#### Primary Classes
- `DomainMode`
- `ProtectedBlock`
- `ProtectedEntity`
- `ShieldResult`
- `PromptShield`
#### Behavior
- Detects password-like tokens, email addresses, and other protected spans.
- Replaces protected text with deterministic placeholders.
- Produces `ShieldResult` with `placeholder_map` and `shielded_text`.
#### Edge Cases
- Ensures placeholder collision resistance using secrets and hashing.
- Maintains alignment for re-injection during reconstruction.
### `core/verifier.py`
#### Purpose
Performs post-LLM verification and hallucination detection.
#### Primary Classes
- `VerificationResult`
- `PostLLMVerifier`
#### Behavior
- Checks final response confidence scores.
- Reports violations and corrective action recommendations.
- Integrates with automatic correction loops.
#### Edge Cases
- Gracefully disables NLI verification if models are unavailable.
- Allows response acceptance only when `confidence_score` exceeds configured thresholds.
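A threshold gate with a bounded correction loop might be sketched as follows. Only `confidence_score` and the "exceeds the threshold" rule come from the description above; the `violations` field, the `accept_or_correct` function, the 0.8 threshold, and the toy verifier and corrector are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VerificationResult:  # stand-in; fields other than confidence_score are assumed
    confidence_score: float
    violations: list[str] = field(default_factory=list)


def accept_or_correct(response, verify, correct, threshold=0.8, max_rounds=2):
    """Return (response, result) once accepted, or (None, result) after max_rounds."""
    for attempt in range(max_rounds + 1):
        result = verify(response)
        if result.confidence_score > threshold:  # strictly exceeds the threshold
            return response, result
        if attempt < max_rounds:
            response = correct(response, result)
    return None, result


# Toy verifier and corrector: hedged wording counts as low confidence.
def toy_verify(text: str) -> VerificationResult:
    hedged = "probably" in text
    return VerificationResult(0.4 if hedged else 0.95,
                              ["unsupported claim"] if hedged else [])


def toy_correct(text: str, result: VerificationResult) -> str:
    return text.replace("probably ", "")


final, report = accept_or_correct("The API probably returns JSON.",
                                  toy_verify, toy_correct)
```

Capping the number of rounds keeps a response that never reaches the threshold from looping indefinitely.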