Nomearod Claude Opus 4.6 (1M context) committed on
Commit 77e1875 · 1 Parent(s): 148a231

docs: add decisions for monitor mode, SSE events, vanilla JS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Files changed (1)
  1. DECISIONS.md +32 -0
DECISIONS.md CHANGED
@@ -321,3 +321,35 @@ The HF Spaces demo is public by design — the `curl` examples in the README wor
  The security pipeline protects *content* (injection detection, PII redaction, output validation), not *access*. This is a deliberate scope boundary: application-layer guardrails ensure the system behaves safely regardless of who calls it, rather than assuming trusted callers. Rate limiting (10 RPM per IP) provides basic abuse protection.

  A production deployment would add authentication (API keys or OAuth) at the infrastructure layer: reverse proxy, API gateway, or middleware. The security pipeline's `getattr(..., None)` pattern means auth can be layered on without modifying the existing security components.
+
+ ## Why monitor mode for output validation, not gating?
+
+ Output validation runs post-stream as a monitoring layer. The answer
+ streams to the client, then validation runs and emits its verdict. Gating
+ (buffer-then-validate) would add 4-5 seconds of dead air while the full
+ answer generates, which is unacceptable streaming UX for a documentation
+ Q&A bot. Trade-off: a hallucinated URL or PII fragment could reach the
+ client before validation catches it. For this use case (FastAPI docs, no
+ real PII in corpus), the risk is near-zero. The dashboard labels this
+ "monitored" (not "gated") to be explicit about the posture.
+
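The difference between the two postures can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: `generate_answer`, `validate`, and the `"validation"` event name are hypothetical stand-ins, and the validator here only flags URLs.

```python
import asyncio
from typing import AsyncIterator, List, Tuple

Event = Tuple[str, str]  # (event type, payload)

async def generate_answer() -> AsyncIterator[str]:
    # Hypothetical stand-in for the LLM token stream.
    for token in ["FastAPI ", "uses ", "Starlette."]:
        await asyncio.sleep(0)
        yield token

def validate(answer: str) -> str:
    # Hypothetical validator: flags any answer that contains a URL.
    return "flagged" if "http://" in answer or "https://" in answer else "pass"

async def monitored_stream() -> AsyncIterator[Event]:
    # Monitor mode: forward each token at once, validate after the stream ends.
    parts: List[str] = []
    async for token in generate_answer():
        parts.append(token)
        yield ("chunk", token)
    yield ("validation", validate("".join(parts)))

async def gated_stream() -> AsyncIterator[Event]:
    # Gating: buffer the whole answer, validate, and only then release it.
    parts: List[str] = []
    async for token in generate_answer():
        parts.append(token)
    answer = "".join(parts)
    verdict = validate(answer)
    yield ("validation", verdict)
    if verdict == "pass":
        yield ("chunk", answer)

async def collect(stream: AsyncIterator[Event]) -> List[Event]:
    # Drain a stream into a list so the event order can be inspected.
    return [event async for event in stream]
```

In the monitored variant the first event is a `chunk`, so the client starts rendering immediately. In the gated variant nothing reaches the client until the full answer has been generated and checked, which is where the dead air comes from.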
+ ## Why additive SSE stage events?
+
+ The enhanced `/ask/stream` adds `meta` and `stage` event types alongside
+ the existing `sources`, `chunk`, and `done` events. Existing consumers
+ that only handle the three legacy types are unaffected: they simply
+ ignore events with unknown types. This avoids versioning the endpoint
+ or breaking the non-streaming `/ask` contract. The `meta` event fires
+ first (before any stages) so the frontend can display provider/model
+ info immediately.
+
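The backward-compatibility claim can be illustrated with a small Python sketch of a legacy consumer reading the enhanced stream. The event names come from the section above. The payload fields (`provider`, `model`, `name`, `text`, and so on), the helper names, and the position of `stage` relative to `sources` are assumptions made for illustration only.

```python
import json
from typing import Dict, Iterator, Tuple

def sse(event: str, data: Dict) -> str:
    # Serialize one Server-Sent Events frame: event line, data line, blank line.
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

def enhanced_stream() -> str:
    # Hypothetical payloads; only the event type names come from the document.
    return "".join([
        sse("meta", {"provider": "example-provider", "model": "example-model"}),
        sse("stage", {"name": "retrieval", "status": "done"}),
        sse("sources", {"items": ["docs/tutorial.md"]}),
        sse("chunk", {"text": "Hello"}),
        sse("chunk", {"text": " world"}),
        sse("done", {}),
    ])

def parse(raw: str) -> Iterator[Tuple[str, Dict]]:
    # Minimal parser for the single-line frames produced by sse() above.
    for frame in raw.strip().split("\n\n"):
        fields = dict(line.split(": ", 1) for line in frame.split("\n"))
        yield fields["event"], json.loads(fields["data"])

LEGACY_TYPES = {"sources", "chunk", "done"}

def legacy_consumer(raw: str) -> str:
    # A consumer written before `meta` and `stage` existed: unknown types are skipped.
    answer = []
    for event, data in parse(raw):
        if event not in LEGACY_TYPES:
            continue
        if event == "chunk":
            answer.append(data["text"])
    return "".join(answer)
```

Calling `legacy_consumer(enhanced_stream())` assembles the same answer text it would have produced from the legacy stream, because the two new event types never reach its handlers.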
+ ## Why vanilla JS for the frontend, not Alpine or React?
+
+ The showcase dashboard has ~5 pieces of reactive state (pipeline stages,
+ retrieval results, security badges, stats, chat messages). The SSE
+ handler is inherently imperative: receive event, querySelector the
+ target node, update classList and textContent. Wrapping this in a
+ reactive framework adds a dependency, interview questions about
+ "why is there a framework for 5 state variables", and indirection
+ that fights the imperative SSE pattern. One `state` object + a few
+ `render()` functions handles it in ~150 lines.