# Observability and Metrics

This module instruments the NL2SQL Copilot pipeline with Prometheus metrics, recording rules, and alerts.

## 📊 Metrics exposed

| Metric | Type | Labels | Description |
|--------|------|---------|--------------|
| `stage_duration_ms` | histogram | `stage` | Duration per stage (detector, planner, generator, safety, executor, verifier) |
| `pipeline_runs_total` | counter | `status` | Pipeline runs by outcome (`ok`, `error`, `ambiguous`) |
| `safety_checks_total`, `safety_blocks_total` | counter | `reason` | Number of safety checks and blocked queries |
| `verifier_checks_total`, `verifier_failures_total` | counter | `reason` | Number of verification passes and failures |

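As a rough illustration of how `stage_duration_ms` and `pipeline_runs_total` could be fed, the sketch below times each stage in milliseconds and counts the run outcome. It uses plain in-memory containers as stand-ins for real metric objects, and the names `timed_stage` and `run_pipeline` are hypothetical, not taken from this module.

```python
import time
from collections import Counter, defaultdict
from contextlib import contextmanager

# Assumption: in-memory stand-ins for the real metric objects.
stage_duration_ms = defaultdict(list)  # stage label -> observed durations (ms)
pipeline_runs_total = Counter()        # status label -> number of runs


@contextmanager
def timed_stage(stage):
    """Record the wall-clock duration of one stage, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even when the stage raises, so failed stages are still timed.
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        stage_duration_ms[stage].append(elapsed_ms)


def run_pipeline(stages):
    """Run (name, callable) pairs in order and count the outcome by status."""
    try:
        for name, fn in stages:
            with timed_stage(name):
                fn()
    except Exception:
        pipeline_runs_total["error"] += 1
        return "error"
    pipeline_runs_total["ok"] += 1
    return "ok"
```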
---

## βš™οΈ Recording & Alerting Rules

Defined in `prometheus/rules.yml`:

- **`nl2sql:stage_p95_ms`** – 95th percentile latency per stage
- **`nl2sql:pipeline_success_ratio`** – 5-minute success ratio
- Alerts:
  - `PipelineLowSuccessRatio` (<90% for 10m)
  - `GeneratorLatencyHigh` (>1500 ms for 5m)
  - `SafetyBlocksSpike` (>0.5/min)

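The actual rule expressions live in `prometheus/rules.yml` and are not reproduced here. To make the semantics of the success ratio and the `PipelineLowSuccessRatio` hold time concrete, here is a minimal Python sketch; the function names and the treatment of empty windows are assumptions for illustration only.

```python
def success_ratio(ok_increase, total_increase):
    """Share of runs with status `ok` in a window; None if there were no runs."""
    if total_increase == 0:
        return None
    return ok_increase / total_increase


def low_success_alert_fires(ratios, threshold=0.90):
    """Mimic a hold time: fire only if every sample in the window breaches.

    `ratios` holds one success ratio per rule evaluation over the last
    10 minutes. Assumption: a None sample (no traffic) counts as not
    breaching, so it resets the alert.
    """
    return bool(ratios) and all(
        r is not None and r < threshold for r in ratios
    )


# 45 successful runs out of 50 is exactly 90%, which does not breach "<90%".
print(success_ratio(45, 50))
# One healthy sample inside the window keeps the alert from firing.
print(low_success_alert_fires([0.80, 0.95, 0.70]))
```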
---

## 🧪 Local Testing

1. Start Prometheus
   ```bash
   make prom-up