# TRACKING_POLLING: Frontend Error Tracking + Backend Log Correlation
**Scope:** Design document. NO CODE WAS CHANGED and NO SDK WAS INSTALLED in this wave. This file is the
**proposed** schema, code examples, and event taxonomy for opening up
user-facing error observability in the polling and multi-photo upload flows.
**Problem statement.** In polling timeout / 5xx / network-blip scenarios the user
sees a "something is broken" message, but:
- The frontend is not linked to the backend log line (the frontend has no request_id).
- How many times a given `inspection_id` was attempted, and why it failed,
  is not collected anywhere client-side (no Sentry, no GA4).
- The backend `access_log` already emits `request_id`, `user_id`, `duration_ms`
  (`services/backend/middleware.py` `AccessLogMiddleware`), but when the frontend
  reaches `console.error` it neither shows that request_id on screen nor copies it
  to the clipboard, so support cannot say "send us this id" instead of
  "send us a screenshot".
This doc describes what has to be done to close all three gaps at once.
---
## 0. Quick Inventory of the Current State
| Piece | Status |
|---|---|
| Backend `X-Request-ID` middleware | **PRESENT**: `services/backend/middleware.py:121` `RequestIDMiddleware` accepts the inbound header; if it is absent, it mints `uuid4().hex`. |
| `X-Request-ID` listed in CORS `expose_headers` | **PRESENT**: `middleware.py:271` `expose_headers=["X-Request-ID", ...]`, so browser JS can read it via `response.headers.get("x-request-id")`. |
| Backend `access_log` JSON line | **PRESENT**: `event=http.access` carries `request_id`, `user_id`, `status`, `duration_ms`, `path`. |
| Frontend axios response interceptor reading request_id | **MISSING**: the interceptor at `apps/web/lib/api.ts:91` only handles the 401 refresh flow; the header is not read. |
| Frontend polling instrumentation | **MISSING**: `apps/web/lib/use-inspection-polling.ts` counts `attempts` but does not publish them anywhere. |
| Multi-photo upload instrumentation | **MISSING**: the `inspections.create` `onUploadProgress` callback is used only for the UI progress bar. |
| GA4 / Sentry / Posthog SDK | **MISSING** (there is no KVKK consent banner either). |
| `inspection_unassigned_ratio` backend output | **PRESENT**: `services/backend/output_formatter.py:411` `summary.unassigned_damage_count` and the overall `damages` count can be turned into a ratio. The frontend does not read it yet. |
---
## 1. X-Request-ID Propagation (CODE EXAMPLE: not applied, proposal only)
### 1.1 Frontend: axios response interceptor
The addition **below** can be placed in the response interceptor inside
`buildClient()` in `apps/web/lib/api.ts`. It only reads the header and logs it; no other side effects.
```ts
// apps/web/lib/api.ts: inside buildClient(),
// in the SUCCESS branch of the response interceptor (replacing the current `(r) => r`):
instance.interceptors.response.use(
  (r) => {
    const rid = r.headers?.['x-request-id'];
    if (rid) {
      // Successful response: log only at debug verbosity so the console
      // is not polluted in production builds. The tag prefix lets support
      // engineers grep the user-shared screenshots.
      if (process.env.NODE_ENV !== 'production') {
        console.debug(`[api] ${r.config.method?.toUpperCase()} ${r.config.url} rid=${rid}`);
      }
      // Stash on the response so call sites (polling, upload) can grab it
      // for event payloads.
      (r as { requestId?: string }).requestId = rid;
    }
    return r;
  },
  async (error: AxiosError) => {
    // The existing 401 / refresh flow is unchanged. In addition:
    const rid =
      (error.response?.headers as Record<string, string> | undefined)?.['x-request-id'];
    if (rid) {
      // ALWAYS log failures: this is critical for matching support tickets.
      console.error(
        `[api] FAIL ${error.config?.method?.toUpperCase()} ${error.config?.url} ` +
          `status=${error.response?.status ?? 'network'} rid=${rid}`,
      );
      // Also attach it for ApiErrorInfo so the UI can copy it to the clipboard.
      (error as { requestId?: string }).requestId = rid;
    }
    // ... the existing refresh / unauthorized logic stays exactly as it is.
  },
);
```
### 1.2 Adding `requestId` to the `classifyApiError` output
The `ApiErrorInfo` interface currently contains `kind | status | detail | fieldErrors`
(`apps/web/lib/api.ts:513`). The following one-field extension is enough:
```ts
export interface ApiErrorInfo {
  // ... existing fields
  /** Backend X-Request-ID, for support diagnostics. */
  requestId?: string;
}
// At the end of the classifyApiError body:
const rid = (err as { requestId?: string })?.requestId
  ?? (axios.isAxiosError(err)
    ? (err.response?.headers as Record<string, string> | undefined)?.['x-request-id']
    : undefined);
return { ...base, requestId: rid };
```
### 1.3 Example UI display (copying into a support message)
The error card for a polling failure or an upload failure:
```tsx
{info.requestId && (
  <p className="text-xs text-zinc-500 mt-2">
    Support reference: <code>{info.requestId}</code>
    <button onClick={() => navigator.clipboard.writeText(info.requestId!)}>
      copy
    </button>
  </p>
)}
```
**Once Sentry is installed**, this will link directly to the backend access log
via `Sentry.captureException(err, { tags: { request_id } })`. Even without Sentry,
as soon as the user reports "rid=ab12cd34...", the backend team can find that exact
request within a second using `journalctl | grep ab12cd34`.
---
## 2. Event Schema
### 2.1 General rules
- `event_name`: `snake_case`, ending in a verb (`_started`, `_completed`, `_failed`).
- `properties`: flat and JSON-serializable (string / number / bool / null).
- **No PII**: e-mail, JWT, file contents, license plate. `inspection_id` is safe
  because it is a UUID; `user_id` goes into the GA4 user-id slot, not into properties.
- Timestamps: `*_ms` (number; prefer a duration over an epoch).
- The failure cause uses the `reason` enum: `timeout | network | unauthorized | forbidden | not_found | server | rate_limited | aborted | unknown`.
  (It must match `classifyApiError.kind` one-to-one; that rule is already in place.)
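
The rules above can be checked mechanically before an event leaves the client. The following is a minimal sketch, not part of the codebase: `validateEvent` and the `FORBIDDEN_KEYS` list are hypothetical names, and the key list is only an illustration of the PII rule. The verb-suffix convention is deliberately not enforced, since some events in this document (for example `upload_progress`) do not follow it.

```ts
// Hypothetical helper: checks an event against the general rules in 2.1.
// Property keys that should never appear (illustrative list, an assumption).
const FORBIDDEN_KEYS: string[] = ["email", "jwt", "token", "plate", "user_id"];

// snake_case: lowercase words joined by single underscores.
const EVENT_NAME_RE = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;

function isFlatJsonValue(value: unknown): boolean {
  if (value === null) return true;
  if (typeof value === "string" || typeof value === "boolean") return true;
  // NaN and Infinity are not JSON-serializable, so reject them.
  return typeof value === "number" && isFinite(value);
}

// Returns a list of human-readable problems; an empty list means the event is valid.
function validateEvent(name: string, props: Record<string, unknown>): string[] {
  const problems: string[] = [];
  if (!EVENT_NAME_RE.test(name)) {
    problems.push(`event_name "${name}" is not snake_case`);
  }
  for (const key of Object.keys(props)) {
    if (FORBIDDEN_KEYS.indexOf(key) !== -1) {
      problems.push(`property "${key}" is forbidden (PII rule)`);
    }
    if (!isFlatJsonValue(props[key])) {
      problems.push(`property "${key}" is not a flat JSON value`);
    }
  }
  return problems;
}
```

A check like this could run only in development builds, so that schema drift is caught early without adding work to the production path.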
### 2.2 Polling events (to be emitted inside `use-inspection-polling.ts`)
| event_name | trigger | properties |
|---|---|---|
| `polling_started` | when `tick` is scheduled for the first time | `inspection_id` (str), `mode` (`async`\|`sync`), `interval_ms` (number), `max_duration_ms` (number) |
| `polling_attempt` | at the START of every `tick` call | `inspection_id`, `attempt_n` (1..N), `elapsed_ms`, `current_interval_ms` |
| `polling_failed` | when fatal/exhausted/timedOut/paused is set inside `tick` | `inspection_id`, `reason` (`timeout`\|`network`\|`unauthorized`\|`forbidden`\|`not_found`\|`server`\|`rate_limited`\|`exhausted`\|`paused`), `attempt_n`, `consecutive_errors`, `elapsed_ms`, `request_id` (if available) |
| `polling_completed` | when `status === 'completed'` is returned | `inspection_id`, `attempt_n`, `duration_ms` (elapsed since polling_started), `final_status` (`completed`\|`failed`), `request_id` |
| `polling_retried` | when the user calls `retry()` | `inspection_id`, `previous_attempts`, `was_timed_out` (bool), `was_paused` (bool) |
### 2.3 Multi-photo upload events (around the `inspections.create` call)
| event_name | trigger | properties |
|---|---|---|
| `upload_started` | when `inspections.create()` is called | `file_count` (number), `total_size_mb` (number, 2 decimals), `mode` (`sync`\|`async`) |
| `upload_progress` | when `onUploadProgress` fires, **throttled** (emitted only at the 10/25/50/75/90/100 thresholds) | `percent` (0..100), `loaded_mb`, `total_mb` |
| `upload_completed` | when `inspections.create()` returns 2xx | `file_count`, `total_size_mb`, `mode`, `duration_ms`, `inspection_id`, `request_id` |
| `upload_failed` | when `inspections.create()` rejects | `file_count`, `total_size_mb`, `mode`, `reason` (`classifyApiError.kind`), `partial_count` (number of files that failed in a multi-photo upload; because it is a single POST, this is currently always `file_count` or `0`; groundwork for a future chunked upload), `duration_ms`, `request_id` |
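
The threshold-based throttling of `upload_progress` can be sketched as follows. This is an illustration under assumptions: `createProgressReporter` and `PROGRESS_THRESHOLDS` are hypothetical names, and the caller is expected to add `loaded_mb` / `total_mb` to the emitted event. When one callback jumps over several thresholds, only the highest crossed threshold is emitted.

```ts
// Thresholds from the upload_progress row; each is reported at most once.
const PROGRESS_THRESHOLDS: number[] = [10, 25, 50, 75, 90, 100];

// Hypothetical factory: returns a function suitable for calling from
// onUploadProgress with (loadedBytes, totalBytes).
function createProgressReporter(
  emit: (percent: number) => void,
): (loaded: number, total: number) => void {
  let nextIndex = 0; // index of the next threshold that has not fired yet

  return (loaded: number, total: number): void => {
    if (total <= 0) return; // total size unknown, nothing to report
    const percent = Math.min(100, Math.floor((loaded / total) * 100));

    let crossed: number | undefined;
    while (
      nextIndex < PROGRESS_THRESHOLDS.length &&
      percent >= PROGRESS_THRESHOLDS[nextIndex]
    ) {
      crossed = PROGRESS_THRESHOLDS[nextIndex];
      nextIndex += 1;
    }
    if (crossed !== undefined) emit(crossed);
  };
}
```

One reporter should be created per upload, since the closure remembers which thresholds have already fired.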
### 2.4 Detection quality event
| event_name | trigger | properties |
|---|---|---|
| `inspection_quality_observed` | right after `polling_completed`, once the response has arrived | `inspection_id`, `total_damage_count` (number), `unassigned_damage_count` (number; the backend already returns `summary.unassigned_damage_count`: `services/backend/output_formatter.py:411`), `unassigned_ratio` (float, 0..1), `total_part_count`, `processing_time_ms` (from the backend), `image_count` |
`unassigned_ratio = unassigned_damage_count / total_damage_count`. Once this
metric reaches GA4, an anomaly such as the 10/16 = 0.625 example given in the brief
makes it possible to set a threshold alarm on the dashboard (a signal for model calibration).
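
A minimal sketch of the ratio computation follows. The `InspectionResult` shape is an assumption that models only the two backend fields mentioned in this document (with `damages` taken to be an array), and both function names are hypothetical. The zero-damage case returns `null` rather than dividing by zero, which stays within the flat-property rule.

```ts
// Hypothetical response shape: only the fields this document mentions.
interface InspectionResult {
  summary: { unassigned_damage_count: number };
  damages: unknown[];
}

// Returns null when there are no damages, since the ratio is undefined then.
function computeUnassignedRatio(unassigned: number, total: number): number | null {
  if (total <= 0) return null;
  return unassigned / total;
}

// Builds a subset of the inspection_quality_observed properties.
function qualityProps(inspectionId: string, result: InspectionResult) {
  const total = result.damages.length;
  const unassigned = result.summary.unassigned_damage_count;
  return {
    inspection_id: inspectionId,
    total_damage_count: total,
    unassigned_damage_count: unassigned,
    unassigned_ratio: computeUnassignedRatio(unassigned, total),
  };
}
```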
### 2.5 Standard envelope for failure contexts
Every `*_failed` event **must** carry the following subset (carrying just this
set is enough to answer "why did this failure happen?" within 30 seconds):
```jsonc
{
  "event_name": "polling_failed",
  "properties": {
    "inspection_id": "5f6e...",
    "reason": "server",
    "http_status": 502,
    "request_id": "ab12cd34ef56...",   // JOIN key against the backend access_log
    "attempt_n": 4,
    "elapsed_ms": 9421,
    "consecutive_errors": 3,
    "user_agent_brand": "Chrome",      // navigator.userAgentData?.brands[0]?.brand
    "connection_type": "4g",           // navigator.connection?.effectiveType
    "session_id": "...",               // GA4 / posthog auto
    "build_sha": "abc1234"             // NEXT_PUBLIC_BUILD_SHA, if set
  }
}
```
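
One way to keep every `*_failed` event consistent with this envelope is to build it in a single place. The sketch below is an assumption, not existing code: `buildFailureEvent`, `FailureContext`, and `ClientEnv` are hypothetical names. Browser-derived values are passed in rather than read from `navigator`, so the function stays pure and testable, and `session_id` is left out because the envelope marks it as filled in automatically by the analytics SDK.

```ts
// Hypothetical inputs; field names mirror the envelope above.
interface FailureContext {
  inspectionId: string;
  reason: string;
  httpStatus?: number;
  requestId?: string;
  attemptN: number;
  elapsedMs: number;
  consecutiveErrors: number;
}

// Values the caller reads from the browser / build environment, when available.
interface ClientEnv {
  userAgentBrand?: string;
  connectionType?: string;
  buildSha?: string;
}

type FlatProps = Record<string, string | number | boolean | null>;

function buildFailureEvent(
  eventName: string,
  ctx: FailureContext,
  env: ClientEnv,
): { event_name: string; properties: FlatProps } {
  return {
    event_name: eventName,
    properties: {
      inspection_id: ctx.inspectionId,
      reason: ctx.reason,
      // Missing values become null so the payload keeps a stable set of keys.
      http_status: ctx.httpStatus ?? null,
      request_id: ctx.requestId ?? null,
      attempt_n: ctx.attemptN,
      elapsed_ms: ctx.elapsedMs,
      consecutive_errors: ctx.consecutiveErrors,
      user_agent_brand: env.userAgentBrand ?? null,
      connection_type: env.connectionType ?? null,
      build_sha: env.buildSha ?? null,
    },
  };
}
```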
### 2.6 KVKK consent gate
No analytics event **may be sent** unless the user has set
`cookie_consent.analytics === true`. Proposed architecture:
1. `apps/web/lib/analytics.ts` (does not exist yet): the `track(event, props)` function
   reads the `localStorage['cookie_consent']` JSON; if `analytics` is not `true`, it
   writes the event to a buffer (max 200 events) and **never** touches the network,
   then flushes the buffer once consent is granted.
2. The GA4 / Posthog SDK is **not** initialized until the banner is accepted (stub only).
3. The backend `request_id` keeps being written to the console; it is operational data, not PII.
No SDK or banner was installed in this wave; this schema is ready to work on the
day the banner arrives.
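
The consent-gated wrapper described in this section could look roughly like the sketch below. It is an assumption, not the planned `analytics.ts`: `createTracker` and the injected `storage` / `send` parameters are hypothetical, chosen so the logic can run outside a browser (in the app, `storage` would be `localStorage` and `send` the SDK call). Dropping the oldest event when the buffer is full is also an assumption; the document only fixes the 200-event cap.

```ts
type Props = Record<string, string | number | boolean | null>;
interface QueuedEvent { name: string; props: Props; }
// Minimal subset of the Storage API that the gate needs.
interface ConsentStorage { getItem(key: string): string | null; }

const MAX_BUFFERED = 200;

function hasAnalyticsConsent(storage: ConsentStorage): boolean {
  try {
    const raw = storage.getItem("cookie_consent");
    if (raw === null) return false;
    const parsed: unknown = JSON.parse(raw);
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as { analytics?: unknown }).analytics === true
    );
  } catch {
    return false; // malformed JSON counts as "no consent"
  }
}

function createTracker(storage: ConsentStorage, send: (e: QueuedEvent) => void) {
  const buffer: QueuedEvent[] = [];

  // Sends buffered events in order, but only once consent is present.
  function flush(): void {
    if (!hasAnalyticsConsent(storage)) return;
    while (buffer.length > 0) send(buffer.shift() as QueuedEvent);
  }

  function track(name: string, props: Props): void {
    if (hasAnalyticsConsent(storage)) {
      flush(); // preserve ordering: older buffered events go first
      send({ name, props });
      return;
    }
    // No consent: keep the event in memory only, never call send().
    if (buffer.length >= MAX_BUFFERED) buffer.shift(); // assumption: drop oldest
    buffer.push({ name, props });
  }

  return { track, flush, pending: (): number => buffer.length };
}
```

The banner's accept handler would call `flush()` right after writing the consent value, so events buffered before acceptance are not lost.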
---
## 3. Top 5 Must-Have Events
Priority order (from the highest impact/effort ratio to the lowest):
| # | Event | Why it is top tier |
|---|---|---|
| 1 | `polling_failed` | The **source** of every user complaint. With `reason` + `request_id`, 80% of support tickets reduce to a single-line log lookup. Even before an SDK arrives, writing it to **console.error** is already useful. |
| 2 | `upload_failed` | In a multi-photo upload the user does not know **at which step** it failed (network, 413, or 422). The distinction between `reason=tooLarge` and `reason=network` changes the tone of the UI message. |
| 3 | `inspection_quality_observed` | `unassigned_ratio` is a **business** metric for model calibration. The 10/16 = 0.625 example in the brief points exactly here; without this measurement the question "when did the model degrade?" stays unanswered. |
| 4 | `polling_completed` | Success-path measurement: p50/p95 `duration_ms` is the main metric of the SLO dashboard. The failure rate is only meaningful relative to the success rate. |
| 5 | `upload_started` | The top of the funnel. Without `started`, the `failed` rate cannot be interpreted (it is the denominator). The `file_count` distribution lets UI design (1-photo vs 20-photo flows) be validated against real data. |
---
## 4. Correlation Flow with the Backend (operations side)
Suppose the frontend emits a `polling_failed` event with `request_id=ab12cd34`.
On the backend side:
```bash
# /var/log/.../backend.access.log or journald
# No closing quote in the pattern: ab12cd34 is only the prefix of the full id.
grep '"request_id":"ab12cd34' backend.access.log | jq .
```
returns:
```jsonc
{
  "event": "http.access",
  "method": "GET",
  "path": "/api/v1/inspect/5f6e...",
  "status": 502,
  "duration_ms": 12.4,
  "user_id": "user-uuid",
  "ip": "...",
  "request_id": "ab12cd34..."
}
```
This JOIN is **not** possible today because the frontend does not read `request_id`.
As soon as the Section 1.1 patch is applied, even **without any other code change**,
the MTTR (mean time to resolve) of the support process drops from minutes to seconds.
---
## 5. NOT DONE in This Wave (reminder)
- The `apps/web/lib/api.ts` interceptor patch was **not coded** (example only).
- `track(...)` calls were **not added** to `apps/web/lib/use-inspection-polling.ts`.
- The `apps/web/lib/analytics.ts` consent-gated wrapper was **not created**.
- GA4 / Posthog / Sentry SDK setup was **not done**.
- The KVKK cookie consent banner UI was **not added**.
This file exists so that later waves can find the implementation spec in one
place with `git grep TRACKING_POLLING.md`.