# ICC Interac Manager — Complete Technical Flow
**Every technology, every AI model, every data transformation — fully detailed**
**Last updated:** February 2026
---
## 1. System Architecture — Full Technology Map
```
┌─────────────────────────────────────────────────────────────────────────┐
│ USER'S BROWSER │
│ │
│ ┌────────────────────────────────────────────────────────────────────┐ │
│ │ VITE 6 + REACT 19 SPA │ │
│ │ TypeScript 5 · Tailwind CSS 4 · shadcn/ui │ │
│ │ │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────┐ │ │
│ │ │ Login │ │Dashboard │ │ Scan │ │ Settings │ │Reports │ │ │
│ │ │ Page │ │ Page │ │ Page │ │ Page │ │ Page │ │ │
│ │ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └───┬────┘ │ │
│ │ │ │ │ │ │ │ │
│ │ ┌────┴─────────────┴────────────┴─────────────┴────────────┴────┐│ │
│ │ │ State Management Layer ││ │
│ │ │ Zustand (global) · TanStack Query v5 (server/cache) ││ │
│ │ │ React Hook Form + Zod (forms) · Socket.io-client (realtime) ││ │
│ │ └───────────────────────────┬────────────────────────────────────┘│ │
│ └──────────────────────────────┼─────────────────────────────────────┘ │
│ │ HTTP + WebSocket │
│ ┌──────────────────┼──────────────────┐ │
│ │ Vite Dev Proxy (port 5173) │ │
│ │ /api/* → localhost:3001 │ │
│ │ /ws → ws://localhost:3001 │ │
│ └──────────────────┼──────────────────┘ │
└─────────────────────────────────┼───────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ EXPRESS.JS 5 BACKEND (port 3001) │
│ Node.js 20 LTS · TypeScript 5 │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ MIDDLEWARE LAYER │ │
│ │ JWT Auth · CORS · express-rate-limit · Helmet CSP · Pino logger │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ Auth │ │ Scan │ │ Txns │ │ Receipts │ │ Settings │ │
│ │ Routes │ │ Routes │ │ Routes │ │ Routes │ │ Routes │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │
│ │ │ │ │ │ │
│ ┌────┴─────────────┴────────────┴─────────────┴────────────┴────┐ │
│ │ SERVICE LAYER │ │
│ │ │ │
│ │ ┌─────────────┐ ┌──────────────┐ ┌───────────────────────┐│ │
│ │ │ Gmail │ │ Scan Engine │ │ AI Provider Pool ││ │
│ │ │ Service │ │ (pipeline) │ │ (auto-switcher) ││ │
│ │ │ │ │ │ │ ││ │
│ │ │ googleapis │ │ Fetch→Parse │ │ Groq ←→ Mistral ││ │
│ │ │ OR imapflow │ │ →Route→Save │ │ (9 free model slots) ││ │
│ │ └─────────────┘ └──────────────┘ └───────────────────────┘│ │
│ │ │ │
│ │ ┌─────────────┐ ┌──────────────┐ ┌───────────────────────┐│ │
│ │ │ Routing │ │ PDF Service │ │ Export Service ││ │
│ │ │ Service │ │ PDFKit / │ │ SheetJS (xlsx) ││ │
│ │ │ branch map │ │ react-pdf │ │ CSV (built-in) ││ │
│ │ └─────────────┘ └──────────────┘ └───────────────────────┘│ │
│ └────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ DATA LAYER │ │
│ │ Drizzle ORM · PostgreSQL 16 · Redis + BullMQ (job queue) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ WEBSOCKET LAYER (Socket.io) │ │
│ │ scan:progress · scan:completed · transaction:new · ai:switcher │ │
│ └──────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────┘
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────────────┐
│ PostgreSQL 16│ │ Redis 7 │ │ External AI APIs │
│ │ │ │ │ │
│ users │ │ BullMQ jobs │ │ api.groq.com │
│ transactions │ │ scan queue │ │ api.mistral.ai │
│ branch_config│ │ rate limiter │ │ (free tier, $0) │
│ scan_logs │ │ │ │ │
│ ai_settings │ │ │ │ [Optional paid:] │
│ ai_switcher │ │ │ │ api.anthropic.com │
│ _logs │ │ │ │ api.openai.com │
└──────────────┘ └──────────────┘ └──────────────────────┘
```
---
## 2. Complete Scan Pipeline — Step by Step
This is the exact sequence of events when a user starts a scan, shown here for the **"Scanner aujourd'hui"** (`today`) preset.
### Phase 0: User Initiates Scan
```
USER ACTION TECHNOLOGY INVOLVED
─────────────────────────────────── ──────────────────────────────────
1. User clicks "Scanner aujourd'hui" React onClick handler
2. Frontend sends POST /api/scan/start Axios / fetch (TanStack Query mutation)
Body: { preset: "today" }
3. Backend receives request Express.js route handler
4. JWT middleware validates token jsonwebtoken (JWT verify)
5. Creates BullMQ job in Redis BullMQ + Redis
6. Returns { jobId: "scan_abc123" } Express response
7. Frontend opens WebSocket Socket.io-client
Listens for scan:progress events
```
### Phase 1: Email Discovery (Gmail API)
```
STEP ACTION TECHNOLOGY TIME
─────── ─────────────────────────────── ───────────────────── ──────
1.1 Resolve date range: resolveScanDates() <1ms
"today" → midnight→now (shared util)
1.2 Build Gmail search query: buildGmailQuery() <1ms
"from:notify@payments. (server util)
interac.ca after:2026/2/22
before:2026/2/24"
1.3 Call Gmail API: list messages googleapis npm 200ms
GET gmail/v1/users/me/messages (google-auth-library
?q={query}&maxResults=500 + googleapis)
OAuth token from DB: Drizzle ORM → PG
users.access_token (AES-256 crypto.decipher
encrypted) → decrypt
If token expired: google-auth-library
auto-refresh with OAuth2Client
users.refresh_token .refreshAccessToken()
1.4 Receive message ID list: Gmail API response —
["msg_001", "msg_002", ...]
(just IDs, not full emails)
1.5 Deduplication check: Drizzle ORM → PG 50ms
SELECT email_id FROM SQL query
transactions WHERE (parameterized,
email_id IN (...) indexed)
1.6 Filter: skip existing IDs JavaScript Set <1ms
Result: newIds[] (emails
to process)
1.7 WebSocket emit: Socket.io <1ms
scan:started {
jobId, totalEmails,
newEmails, skipped,
dateRange
}
PHASE 1 TOTAL: ~300ms for discovery
```
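Steps 1.1, 1.2 and 1.6 are pure functions, so they can be sketched without any Gmail or database dependency. The block below is a minimal illustration, not the project's actual code: `resolveScanDates` and `buildGmailQuery` are named in the table above, but their signatures, the UTC-midnight simplification, the one-day padding on each side of the range, and the `filterNewIds` helper are all assumptions.

```typescript
const INTERAC_SENDER = "notify@payments.interac.ca";
const DAY_MS = 24 * 60 * 60 * 1000;

interface ScanDateRange {
  from: Date;
  to: Date;
}

// Step 1.1: "today" resolves to midnight → now.
// Assumption: midnight is taken in UTC to keep the sketch deterministic.
function resolveScanDates(preset: string, now: Date): ScanDateRange {
  if (preset !== "today") {
    throw new Error(`Preset not covered by this sketch: ${preset}`);
  }
  const midnight = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate()),
  );
  return { from: midnight, to: now };
}

// Gmail search operators take dates as YYYY/M/D.
function formatGmailDate(d: Date): string {
  return `${d.getUTCFullYear()}/${d.getUTCMonth() + 1}/${d.getUTCDate()}`;
}

// Step 1.2: pad the range by one day on each side (assumption), which
// reproduces the after:2026/2/22 before:2026/2/24 example for 23 Feb.
function buildGmailQuery(range: ScanDateRange): string {
  const after = new Date(range.from.getTime() - DAY_MS);
  const before = new Date(range.to.getTime() + DAY_MS);
  return `from:${INTERAC_SENDER} after:${formatGmailDate(after)} before:${formatGmailDate(before)}`;
}

// Step 1.6: drop message IDs that already exist in the transactions table.
function filterNewIds(listed: string[], existing: Iterable<string>): string[] {
  const seen = new Set(existing);
  return listed.filter((id) => !seen.has(id));
}
```

Passing `now` in explicitly, rather than calling `new Date()` inside the helper, keeps the date logic easy to unit-test.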
### Phase 2: Parallel Fetch (Gmail API — 10 concurrent)
```
STEP ACTION TECHNOLOGY TIME
─────── ─────────────────────────────── ───────────────────── ──────
2.1 Create concurrency limiter: p-limit npm <1ms
gmailLimit = pLimit(10) (10 concurrent)
2.2 For each newId, launch Promise + p-limit —
concurrent fetch:
┌──── gmailLimit(async () => {
│ 2.3 GET gmail/v1/users/me/ googleapis 100ms
│ messages/{id} (full MIME format) each
│ ?format=full
│ 2.4 Extract email body: Custom MIME parser <1ms
│ - Find text/html part (base64 decode)
│ - Base64 decode
│ - Strip HTML tags
│ - Extract plain text
│ 2.5 Extract email metadata: MIME header parse <1ms
│ - Date header
│ - From header
│ - Subject header
└──── return { emailId, body, metadata } })
With 10 concurrent: 50 emails fetched in ~500ms
200 emails in ~2 seconds
1000 emails in ~10 seconds
PHASE 2 TOTAL: ~100ms per email, 10 concurrent = ~10ms effective per email
```
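The concurrency cap in step 2.1 is what turns ~100ms per fetch into ~10ms effective per email. The project uses the `p-limit` package for this; the sketch below is a dependency-free stand-in that shows the same idea, with `createLimiter` being a hypothetical name rather than anything from the codebase.

```typescript
// Returns a function that runs async tasks with at most `maxConcurrent`
// of them in flight at any moment (the role p-limit plays in step 2.1).
function createLimiter(maxConcurrent: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  const acquire = (): Promise<void> => {
    if (active < maxConcurrent) {
      active++;
      return Promise.resolve();
    }
    // No free slot: park the caller until release() hands one over.
    return new Promise<void>((resolve) => waiting.push(resolve));
  };

  const release = (): void => {
    const next = waiting.shift();
    if (next) {
      next(); // pass the slot straight to the next waiter; `active` is unchanged
    } else {
      active--;
    }
  };

  return async function limit<T>(task: () => Promise<T>): Promise<T> {
    await acquire();
    try {
      return await task();
    } finally {
      release();
    }
  };
}
```

With `const gmailLimit = createLimiter(10)`, each fetch would be wrapped as `gmailLimit(() => fetchOne(id))` and the whole set awaited with `Promise.all`, where `fetchOne` stands for whatever function performs steps 2.3 to 2.5.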
### Phase 3: AI Parsing (Auto-Switcher — up to 15 concurrent on Groq)
This is where the AI models do the work. Every email body goes through the `AIProviderPool`.
```
STEP ACTION TECHNOLOGY TIME
─────── ─────────────────────────────── ───────────────────── ──────
3.1 AIProviderPool receives aiProviderPool.ts <1ms
email body text
3.2 Select best available slot: getNextAvailableSlot <1ms
Check priority order: ()
1. groq:gpt-oss-20b
2. groq:llama4-scout
3. groq:llama31-8b
...
7. mistral:small-3.2
...
3.3 Enforce rate delay: enforceRateDelay() 0-2000ms
Groq: 60s/30RPM = 2s delay (per slot)
Mistral: 60s/2RPM = 30s delay
3.4 Build AI request: <1ms
┌────────────────────────────────────────────────┐
│ { │
│ model: "openai/gpt-oss-20b", │
│ messages: [ │
│ { role: "system", │
│ content: EXTRACTION_SYSTEM_PROMPT }, │
│ { role: "user", │
│ content: "Extract transaction │
│ details:\n\n{EMAIL_BODY}" } │
│ ], │
│ temperature: 0.0, │
│ max_tokens: 500, │
│ response_format: { type: "json_object" } │
│ } │
└────────────────────────────────────────────────┘
3.5 Send to AI provider: fetch() or openai 150-800ms
npm package
┌─── IF GROQ: ───────────────────────────────────┐
│ POST https://api.groq.com/openai/v1/ │
│ chat/completions │
│ Headers: │
│ Authorization: Bearer gsk_xxxxx │
│ Content-Type: application/json │
│ │
│ Speed: 500-1000 tokens/sec │
│ Typical response: 150-300ms │
└──────────────────────────────────────────────────┘
┌─── IF MISTRAL: ────────────────────────────────┐
│ POST https://api.mistral.ai/v1/ │
│ chat/completions │
│ Headers: │
│ Authorization: Bearer xxxxx │
│ Content-Type: application/json │
│ │
│ Speed: 100-300 tokens/sec │
│ Typical response: 300-800ms │
└──────────────────────────────────────────────────┘
3.6 IF 429 RATE LIMIT: Auto-switcher <1ms
- Mark current slot as logic in pool
rate_limited
- Set cooldownUntil from
retry-after header
- WebSocket emit ai:switcher
- Jump to next priority slot
- RETRY from step 3.2
3.7 Receive AI response: JSON.parse() <1ms
{
"sender": "Jean Tremblay",
"amount": 150.00,
"currency": "CAD",
"reference": "CA1b2c3d4e5f",
"message": "Dîme mars 2025",
"recipient_email": "montreal.finances@iccameriques.org",
"date": "2026-02-23T14:30:00Z",
"status": "deposited"
}
3.8 Validate with Zod schema: zod npm <1ms
InteracTransactionSchema
.parse(parsed)
If validation fails:
- Log warning
- Retry with different model
- Or mark as needs_review
PHASE 3 TOTAL: ~200-800ms per email depending on provider
With Groq 15 concurrent: ~15ms effective per email (latency only; each slot's RPM limit caps sustained throughput)
```
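Steps 3.2, 3.3 and 3.6 boil down to a small amount of bookkeeping per slot. The following is a minimal sketch under stated assumptions: `getNextAvailableSlot` is named in the table above, but its signature, the `ProviderSlot` fields, and the helpers `createSlot`, `rateDelayMs` and `markRateLimited` are hypothetical and only illustrate the described behaviour.

```typescript
interface ProviderSlot {
  id: string;            // e.g. "groq:gpt-oss-20b"
  rpm: number;           // free-tier requests per minute
  cooldownUntil: number; // epoch ms; 0 means the slot is available
  lastCallAt: number;    // epoch ms of the previous request on this slot
}

function createSlot(id: string, rpm: number): ProviderSlot {
  return { id, rpm, cooldownUntil: 0, lastCallAt: 0 };
}

// Step 3.2: slots are kept in priority order; the first one that is not
// cooling down is selected.
function getNextAvailableSlot(
  slots: ProviderSlot[],
  now: number,
): ProviderSlot | undefined {
  return slots.find((slot) => slot.cooldownUntil <= now);
}

// Step 3.3: minimum spacing between calls on one slot is 60s / RPM,
// i.e. 2s at 30 RPM and 30s at 2 RPM.
function rateDelayMs(slot: ProviderSlot, now: number): number {
  const minIntervalMs = 60000 / slot.rpm;
  return Math.max(0, slot.lastCallAt + minIntervalMs - now);
}

// Step 3.6: on HTTP 429, park the slot until the retry-after window ends.
function markRateLimited(
  slot: ProviderSlot,
  retryAfterSeconds: number,
  now: number,
): void {
  slot.cooldownUntil = now + retryAfterSeconds * 1000;
}
```

Taking `now` as a parameter instead of reading the clock inside each helper makes the switching logic testable without timers.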
### Phase 4: Branch Routing (No AI — Pure Logic)
```
STEP ACTION TECHNOLOGY TIME
─────── ─────────────────────────────── ───────────────────── ──────
4.1 Look up recipient_email in BRANCH_MAPPING <1ms
branch mapping: (shared constant)
"montreal.finances JavaScript object
@iccameriques.org" lookup, O(1)
→ "ICC Montréal"
4.2 If no match found: Fallback logic <1ms
→ "Non classifié"
4.3 Attach branch to transaction: Object assign <1ms
transaction.branch =
"ICC Montréal"
PHASE 4 TOTAL: <1ms (no network, no AI, pure in-memory lookup)
```
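The routing step is small enough to show in full. This is a sketch, not the shared package's code: it uses a `Map` in place of the plain object the table describes (same O(1) lookup, no prototype keys), only the one pair shown in this document is filled in, and the `resolveBranch` name and the trim/lowercase normalization are assumptions.

```typescript
const UNCLASSIFIED_BRANCH = "Non classifié";

// Only the pair that appears in this document; the real table is said to
// hold 48 entries.
const BRANCH_MAPPING = new Map<string, string>([
  ["montreal.finances@iccameriques.org", "ICC Montréal"],
]);

// Steps 4.1 and 4.2: look up the recipient address, fall back when unknown.
function resolveBranch(recipientEmail: string | null | undefined): string {
  if (!recipientEmail) return UNCLASSIFIED_BRANCH;
  // Assumption: addresses are matched case-insensitively.
  const key = recipientEmail.trim().toLowerCase();
  return BRANCH_MAPPING.get(key) ?? UNCLASSIFIED_BRANCH;
}
```

For example, `resolveBranch("Montreal.Finances@iccameriques.org")` resolves to `"ICC Montréal"`, while an unknown or missing address yields `"Non classifié"`.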
### Phase 5: Database Save (Batch INSERT)
```
STEP ACTION TECHNOLOGY TIME
─────── ─────────────────────────────── ───────────────────── ──────
5.1 Buffer parsed transaction Array.push <1ms
into saveBuffer[]
5.2 When buffer reaches 25: Drizzle ORM 5-15ms
Batch INSERT INTO → PostgreSQL
transactions ( (parameterized)
email_id, user_id, date,
sender, amount, currency,
reference, message,
recipient_email, branch,
status, raw_email,
parsed_at, reviewed
) VALUES (...), (...), ...
★ Single INSERT for 25 rows
is 10-20x faster than 25
individual INSERTs
5.3 WebSocket emit per row: Socket.io <1ms
transaction:new { (per transaction)
transaction: {...}
}
Dashboard receives and
auto-adds to TanStack Table
5.4 Update scan_logs: Drizzle ORM → PG 2ms
UPDATE scan_logs SET
emails_parsed = emails_parsed + 25
WHERE id = {scanLogId}
PHASE 5 TOTAL: ~15ms per batch of 25 transactions
```
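The buffering in steps 5.1 and 5.2, together with the final flush from step 6.1, can be sketched as a small generic class. Everything here is illustrative: `BatchBuffer` is a hypothetical name, and `writeBatch` is a synchronous callback standing in for the Drizzle batch INSERT, which would be asynchronous in practice.

```typescript
// Collects rows and hands them to `writeBatch` in groups of `batchSize`.
class BatchBuffer<T> {
  private items: T[] = [];
  private readonly batchSize: number;
  private readonly writeBatch: (rows: T[]) => void;

  constructor(batchSize: number, writeBatch: (rows: T[]) => void) {
    this.batchSize = batchSize;
    this.writeBatch = writeBatch;
  }

  // Step 5.1: buffer one parsed transaction; step 5.2: flush when full.
  add(item: T): void {
    this.items.push(item);
    if (this.items.length >= this.batchSize) this.flush();
  }

  // Step 6.1: also called once at the end to write the remaining rows.
  flush(): void {
    if (this.items.length === 0) return;
    const rows = this.items;
    this.items = []; // swap first so a failing write cannot double-send rows
    this.writeBatch(rows);
  }
}
```

A scan would create `new BatchBuffer(25, insertRows)`, call `add()` after each routed transaction, and call `flush()` once when the last email has been parsed; `insertRows` is a placeholder for the actual insert function.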
### Phase 6: Completion
```
STEP ACTION TECHNOLOGY TIME
─────── ─────────────────────────────── ───────────────────── ──────
6.1 Flush remaining buffer Drizzle → PG 5ms
(< 25 transactions)
6.2 Write final scan log: Drizzle → PG 2ms
UPDATE scan_logs SET
finished_at = NOW(),
emails_found = {n},
emails_parsed = {n},
errors = {n}
6.3 Write switcher logs: Drizzle → PG 2ms
INSERT INTO ai_switcher_logs
(batch of all switch events)
6.4 WebSocket emit: Socket.io <1ms
scan:completed {
jobId,
summary: { found, parsed,
skipped, errors },
dateRange,
duration: "45s"
}
6.5 Frontend shows toast: Sonner (toast lib) —
"47 nouveaux virements
importés en 45 secondes"
6.6 Dashboard auto-refreshes TanStack Query —
transaction list invalidateQueries()
(already live via WebSocket, (backup full refresh)
but also invalidates cache)
```
---
## 3. AI Model Connection Map
Exactly which AI model does what, and where in the code it connects.
```
┌──────────────────────────────────────────────────────────────────────────┐
│ │
│ WHAT AI DOES IN THIS PROJECT: │
│ │
│ AI has ONE job: Parse raw Interac email text → structured JSON │
│ │
│ AI does NOT do: │
│ ✗ Branch routing (pure lookup table, no AI) │
│ ✗ PDF generation (PDFKit / @react-pdf/renderer, no AI) │
│ ✗ Export CSV/Excel (SheetJS, no AI) │
│ ✗ Dashboard charts (Recharts, no AI) │
│ ✗ Authentication (Google OAuth, no AI) │
│ ✗ Email fetching (Gmail API / IMAP, no AI) │
│ ✗ Database queries (Drizzle ORM / SQL, no AI) │
│ ✗ Real-time updates (WebSocket, no AI) │
│ │
└──────────────────────────────────────────────────────────────────────────┘
```
### Where Each Model Connects
```
┌─────────────────────────────────────────────────────────────────────────┐
│ FILE: packages/server/src/services/aiProviderPool.ts │
│ CLASS: AIProviderPool │
│ METHOD: parse(emailBody: string) → InteracTransaction │
│ │
│ This is the ONLY place AI is called in the entire application. │
│ Everything below is managed by the auto-switcher inside this method. │
│ │
│ SLOT 1 ─── groq:gpt-oss-20b ──────────────────────────────────────────│
│ │ Provider: Groq │
│ │ Model: openai/gpt-oss-20b │
│ │ API: POST https://api.groq.com/openai/v1/chat/completions │
│ │ SDK: openai npm (baseURL override) │
│ │ File: packages/server/src/providers/groq.ts │
│ │ Auth: GROQ_API_KEY env var (free, no credit card) │
│ │ Speed: 1,000 tokens/sec · response in ~150ms │
│ │ Free: 30 RPM · 7,000 req/day · 500K tokens/day │
│ │ Input: ~800 tokens (system prompt + email body) │
│ │ Output: ~200 tokens (JSON transaction object) │
│ │ Quality: ★★★★☆ (good at structured extraction) │
│ │ French: ★★★★☆ │
│ │ │
│ SLOT 2 ─── groq:llama4-scout ─────────────────────────────────────────│
│ │ Provider: Groq │
│ │ Model: meta-llama/llama-4-scout-17b-16e-instruct │
│ │ API: POST https://api.groq.com/openai/v1/chat/completions │
│ │ SDK: openai npm (same Groq adapter) │
│ │ Speed: 594 tokens/sec │
│ │ Free: 30 RPM · 1,000 req/day · 500K tokens/day │
│ │ Quality: ★★★★☆ │
│ │ French: ★★★½☆ │
│ │ │
│ SLOT 3 ─── groq:llama31-8b ──────────────────────────────────────────│
│ │ Provider: Groq │
│ │ Model: llama-3.1-8b-instant │
│ │ API: POST https://api.groq.com/openai/v1/chat/completions │
│ │ Speed: 840 tokens/sec · response in ~100ms │
│ │ Free: 30 RPM · 14,400 req/day · 500K tokens/day │
│ │ Quality: ★★★☆☆ (simpler extraction, may miss edge cases) │
│ │ French: ★★★☆☆ │
│ │ │
│ SLOT 4 ─── groq:qwen3-32b ───────────────────────────────────────────│
│ │ Provider: Groq │
│ │ Model: qwen/qwen3-32b │
│ │ Speed: 662 tokens/sec │
│ │ Free: 30 RPM · 1,000 req/day · 500K tokens/day │
│ │ Quality: ★★★★☆ │
│ │ French: ★★★★☆ (strong multilingual) │
│ │ │
│ SLOT 5 ─── groq:llama4-maverick ──────────────────────────────────────│
│ │ Provider: Groq │
│ │ Model: meta-llama/llama-4-maverick-17b-128e-instruct │
│ │ Speed: 562 tokens/sec │
│ │ Free: 30 RPM · 1,000 req/day · 500K tokens/day │
│ │ Quality: ★★★★☆ │
│ │ French: ★★★★☆ │
│ │ │
│ SLOT 6 ─── groq:gpt-oss-120b ────────────────────────────────────────│
│ │ Provider: Groq │
│ │ Model: openai/gpt-oss-120b │
│ │ Speed: 500 tokens/sec │
│ │ Free: 30 RPM · 1,000 req/day · 100K tokens/day │
│ │ Quality: ★★★★★ (best reasoning on Groq) │
│ │ French: ★★★★☆ │
│ │ │
│ ────── PROVIDER BOUNDARY: Groq → Mistral ──────────────────────────── │
│ │
│ SLOT 7 ─── mistral:small-3.2 ────────────────────────────────────────│
│ │ Provider: Mistral AI │
│ │ Model: mistral-small-3.2-24b-instruct │
│ │ API: POST https://api.mistral.ai/v1/chat/completions │
│ │ SDK: fetch() (OpenAI-compatible) │
│ │ File: packages/server/src/providers/mistral.ts │
│ │ Auth: MISTRAL_API_KEY env var (free, no credit card) │
│ │ Speed: ~200 tokens/sec · response in ~400ms │
│ │ Free: 2 RPM · ~2,880 req/day · 1B tokens/month │
│ │ Quality: ★★★★☆ (excellent structured extraction) │
│ │ French: ★★★★★ (built by French company, native French) │
│ │ │
│ SLOT 8 ─── mistral:medium-3.1 ───────────────────────────────────────│
│ │ Provider: Mistral AI │
│ │ Model: mistral-medium-3.1 │
│ │ Speed: ~150 tokens/sec │
│ │ Free: 2 RPM · ~2,880 req/day · 1B tokens/month │
│ │ Quality: ★★★★★ (highest accuracy) │
│ │ French: ★★★★★ │
│ │ │
│ SLOT 9 ─── mistral:ministral-8b ─────────────────────────────────────│
│ │ Provider: Mistral AI │
│ │ Model: ministral-8b-2512 │
│ │ Speed: ~300 tokens/sec │
│ │ Free: 2 RPM · ~2,880 req/day · 1B tokens/month │
│ │ Quality: ★★★☆☆ │
│ │ French: ★★★★☆ │
│ │ │
└─────────────────────────────────────────────────────────────────────────┘
```
---
## 4. Technology → Purpose Map (Every Library)
### Frontend (packages/web)
| Technology | Version | What It Does in This Project | Touches AI? |
|-----------|---------|------------------------------|-------------|
| **Vite** | 6 | Dev server, HMR, production bundler | No |
| **React** | 19 | Component rendering, UI framework | No |
| **TypeScript** | 5 | Type safety across all code | No |
| **Tailwind CSS** | 4 | Utility-first styling for all components | No |
| **shadcn/ui** | latest | Pre-built components: Button, Dialog, DatePicker, Table, Toggle, Toast | No |
| **React Router** | v7 | Page routing: /login, /dashboard, /scan, /settings, /reports | No |
| **Zustand** | latest | Global state: auth user, scan status, provider pool status | No |
| **TanStack Query** | v5 | Server state: transactions list, scan history, settings (auto-cache, refetch) | No |
| **TanStack Table** | v8 | Dashboard transaction table: sort, filter, paginate, select, expand | No |
| **React Hook Form** | latest | Settings forms: AI config, branch config, scan date range | No |
| **Zod** | latest | Frontend validation: date ranges, settings input | No |
| **Socket.io-client** | latest | Real-time: scan:progress, transaction:new, ai:switcher events | No |
| **Recharts** | latest | Reports page: bar charts, line charts, pie charts | No |
| **@react-pdf/renderer** | latest | Client-side PDF receipt generation (single transaction) | No |
| **SheetJS** | latest | Client-side Excel/CSV export | No |
| **date-fns** | latest | Date formatting with fr-CA locale | No |
| **Lucide React** | latest | Icons throughout the UI | No |
| **Sonner** | latest | Toast notifications ("47 virements importés") | No |
| **react-i18next** | latest | French UI translations | No |
| **p-limit** | latest | Used in scan progress UI for throttled updates | No |
### Backend (packages/server)
| Technology | Version | What It Does in This Project | Touches AI? |
|-----------|---------|------------------------------|-------------|
| **Express.js** | 5 | HTTP API server, route handling | No |
| **TypeScript** | 5 | Type safety across all server code | No |
| **googleapis** | latest | Gmail API: fetch message IDs, fetch full MIME emails | No |
| **google-auth-library** | latest | OAuth 2.0: token exchange, refresh, consent URL | No |
| **imapflow** | latest | IMAP fallback: connect to Gmail via IMAP if API unavailable | No |
| **openai** (npm) | latest | **Groq adapter**: same SDK, different baseURL → api.groq.com | **YES** |
| **fetch** (built-in) | — | **Mistral adapter**: direct HTTP to api.mistral.ai | **YES** |
| **@anthropic-ai/sdk** | latest | **Claude adapter** (optional paid): api.anthropic.com | **YES** |
| **Drizzle ORM** | latest | Database queries: all CRUD, batch inserts, migrations | No |
| **PostgreSQL driver** (pg) | latest | Database connection pooling | No |
| **Redis** (ioredis) | latest | BullMQ backend, rate limit counters, session cache | No |
| **BullMQ** | latest | Background job queue: scan jobs run async, not blocking API | No |
| **Socket.io** | latest | WebSocket server: push scan progress + provider switches to UI | No |
| **jsonwebtoken** | latest | JWT: issue access tokens (15 min) + refresh tokens (7 days) | No |
| **PDFKit** | latest | Server-side batch PDF receipt generation | No |
| **Pino** | latest | Structured JSON logging | No |
| **Helmet** | latest | Security headers (CSP, HSTS, etc.) | No |
| **cors** | latest | CORS policy: restrict to known frontend origins | No |
| **express-rate-limit** | latest | API rate limiting (prevent abuse of scan endpoints) | No |
| **p-limit** | latest | Concurrency control: 10 Gmail fetches, 15 Groq parses | No |
| **Zod** | latest | Server-side validation: AI output JSON, API request bodies | No |
| **crypto** (built-in) | — | AES-256 encryption: OAuth tokens at rest in PostgreSQL | No |
### Shared (packages/shared)
| Technology | What It Does |
|-----------|-------------|
| **TypeScript types** | InteracTransaction, ScanDateRange, ProviderSlot, SwitcherEvent |
| **BRANCH_MAPPING** | 48 email→branch lookup pairs (no AI needed) |
| **resolveScanDates()** | Preset→date range helper |
| **Zod schemas** | InteracTransactionSchema for validating AI output |
---
## 5. AI Data Flow — Input → Output
### What Goes INTO the AI
```
╔══════════════════════════════════════════════════════════════════════╗
║ SYSTEM PROMPT (~350 tokens, same for every email, cached) ║
║ ║
║ "You are a financial data extraction assistant. Given the raw ║
║ text/HTML of an Interac e-Transfer notification email from ║
║ notify@payments.interac.ca, extract the following fields into ║
║ a JSON object: ║
║ ║
║ - sender: The name of the person who sent the money ║
║ - amount: The dollar amount (numeric, no $ sign) ║
║ - currency: Always "CAD" ║
║ - reference: The Interac reference number ║
║ - message: The personal message/memo (null if none) ║
║ - recipient_email: The email the transfer was sent TO ║
║ - date: The date/time in ISO 8601 format ║
║ - status: One of deposited, pending, expired, cancelled ║
║ ║
║ Return ONLY valid JSON, no markdown, no explanation." ║
╠══════════════════════════════════════════════════════════════════════╣
║ USER MESSAGE (~450 tokens, unique per email) ║
║ ║
║ "Extract transaction details: ║
║ ║
║ INTERAC e-Transfer ║
║ Bonjour, vous avez reçu un virement Interac de Jean Tremblay. ║
║ Montant : 150,00 $ (CAD) ║
║ Numéro de référence: CA1b2c3d4e5f ║
║ Message de l'expéditeur : Dîme mars 2025 ║
║ Ce virement a été automatiquement déposé dans le compte de: ║
║ montreal.finances@iccameriques.org ║
║ Date: 23 février 2026 14:30 ║
║ ..." ║
╚══════════════════════════════════════════════════════════════════════╝
Total input: ~800 tokens
```
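The two boxes above map directly onto an OpenAI-compatible chat-completions request body, as shown in step 3.4. A minimal sketch follows; `buildExtractionRequest` is a hypothetical helper name, and the prompt constant is abbreviated here rather than reproduced in full.

```typescript
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ExtractionRequest {
  model: string;
  messages: ChatMessage[];
  temperature: number;
  max_tokens: number;
  response_format: { type: "json_object" };
}

// Abbreviated placeholder: the full text is the system prompt shown above.
const EXTRACTION_SYSTEM_PROMPT =
  "You are a financial data extraction assistant. (full prompt text goes here)";

// Builds the request body from step 3.4 for any slot's model ID.
function buildExtractionRequest(model: string, emailBody: string): ExtractionRequest {
  return {
    model,
    messages: [
      { role: "system", content: EXTRACTION_SYSTEM_PROMPT },
      { role: "user", content: `Extract transaction details:\n\n${emailBody}` },
    ],
    temperature: 0, // deterministic output for extraction
    max_tokens: 500,
    response_format: { type: "json_object" },
  };
}
```

Because Groq and Mistral both accept this shape at their `/chat/completions` endpoints, switching slots only changes the `model` string, the base URL, and the API key.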
### What Comes OUT of the AI
```
╔══════════════════════════════════════════════════════════════════════╗
║ AI RESPONSE (~200 tokens) ║
║ ║
║ { ║
║ "sender": "Jean Tremblay", ║
║ "amount": 150.00, ║
║ "currency": "CAD", ║
║ "reference": "CA1b2c3d4e5f", ║
║ "message": "Dîme mars 2025", ║
║ "recipient_email": "montreal.finances@iccameriques.org", ║
║ "date": "2026-02-23T14:30:00Z", ║
║ "status": "deposited" ║
║ } ║
╚══════════════════════════════════════════════════════════════════════╝
Total output: ~200 tokens
Total per email: ~1,000 tokens
```
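The model's reply is untrusted text until it has been validated. The project does this with a Zod schema (`InteracTransactionSchema`); the block below is a hand-written stand-in that checks the same eight fields without any dependency. The exact rules, such as requiring a positive amount or an `@` in the address, are assumptions, as is the `parseTransaction` name.

```typescript
type TransferStatus = "deposited" | "pending" | "expired" | "cancelled";

interface InteracTransaction {
  sender: string;
  amount: number;
  currency: "CAD";
  reference: string;
  message: string | null;
  recipient_email: string;
  date: string; // ISO 8601
  status: TransferStatus;
}

const STATUSES: readonly string[] = ["deposited", "pending", "expired", "cancelled"];

// Returns the typed transaction, or null when the reply is not usable
// (the caller can then retry with another model or flag needs_review).
function parseTransaction(raw: string): InteracTransaction | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { sender, amount, currency, reference, message, recipient_email, date, status } =
    data as Record<string, unknown>;

  if (typeof sender !== "string" || sender.length === 0) return null;
  if (typeof amount !== "number" || !Number.isFinite(amount) || amount <= 0) return null;
  if (currency !== "CAD") return null;
  if (typeof reference !== "string") return null;
  if (typeof recipient_email !== "string" || recipient_email.indexOf("@") === -1) return null;
  if (typeof date !== "string" || Number.isNaN(Date.parse(date))) return null;
  if (typeof status !== "string" || STATUSES.indexOf(status) === -1) return null;

  let memo: string | null;
  if (message === null || message === undefined) memo = null;
  else if (typeof message === "string") memo = message;
  else return null;

  return {
    sender, amount, currency: "CAD", reference, message: memo,
    recipient_email, date, status: status as TransferStatus,
  };
}
```

Returning `null` instead of throwing keeps the retry decision in the caller, which matches the "retry with different model, or mark as needs_review" behaviour described in step 3.8.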
### What Happens AFTER AI (No AI Involved)
```
AI Output (JSON)
  │
  ▼
Zod Validation ──── FAIL? → log warning, retry with next model, or flag as needs_review
  │ ✅ PASS
  ▼
Branch Routing ──── BRANCH_MAPPING["montreal.finances@iccameriques.org"] → "ICC Montréal"
  │                 (pure JavaScript lookup, no AI)
  ▼
PostgreSQL INSERT ── Drizzle ORM → INSERT INTO transactions (...)
  │
  ▼
WebSocket Push ──── Socket.io → transaction:new { ... }
  │
  ▼
React Dashboard ─── TanStack Table auto-adds row to table
  │
  ▼
User Sees New Row ── "Jean Tremblay | 150,00 $ | ICC Montréal | Déposé"
```
---
## 6. Speed Optimization — Why It's Fast
### Old Architecture (Sequential)
```
Email 1: [Fetch 100ms] → [AI Parse 300ms] → [Save 5ms] = 405ms
Email 2: [Fetch] → [Parse] → [Save]
Email 3: [Fetch] → ...
Total for 50 emails: 50 × 405ms = 20,250ms = ~20 seconds
```
### New Architecture (Parallel Pipeline)
```
Time → 0ms 100ms 200ms 300ms 400ms 500ms
┌──────┬──────┬──────┬──────┬──────┬──────┐
Fetch: │E1-E10│E11-20│E21-30│E31-40│E41-50│ │ (10 concurrent)
Parse: │ │P1-P15│P16-30│P31-45│P46-50│ │ (15 concurrent)
Save: │ │ │S1-S25│ │S26-50│ │ (batch 25)
└──────┴──────┴──────┴──────┴──────┴──────┘
Total for 50 emails: ~500ms for fetches + ~700ms for parses = ~1.2 seconds
(stages overlap, so total < sum; latency only, before free-tier RPM limits are applied)
```
### Concurrency Budget Per Free Tier
```
GROQ MISTRAL
RPM (requests/minute): 30 2
Safe concurrency: 15 concurrent calls 1 sequential call
Time per AI call: ~150ms (Groq is fast) ~400ms
Emails parsed per minute: ~30 (RPM-limited) ~2 (RPM-limited)
Emails parsed per hour: ~1,800 ~120
WITH AUTO-SWITCHER (all 6 Groq + 3 Mistral):
Effective RPM: 30 + 30 + 30 + 30 + 30 + 30 + 2 + 2 + 2 = 186 RPM*
(*until individual daily limits hit)
First hour throughput: ~180 emails/min = ~10,800 emails/hour
```
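The combined-RPM figure is plain addition over the nine slots, which the short sketch below reproduces. The slot list mirrors the per-model RPM values given earlier in this document; `effectiveRpm` and `minutesToParse` are hypothetical helper names, and daily request and token caps are deliberately ignored.

```typescript
interface SlotBudget {
  id: string;
  rpm: number;
}

// Six Groq slots at 30 RPM and three Mistral slots at 2 RPM.
const FREE_SLOTS: SlotBudget[] = [
  { id: "groq:gpt-oss-20b", rpm: 30 },
  { id: "groq:llama4-scout", rpm: 30 },
  { id: "groq:llama31-8b", rpm: 30 },
  { id: "groq:qwen3-32b", rpm: 30 },
  { id: "groq:llama4-maverick", rpm: 30 },
  { id: "groq:gpt-oss-120b", rpm: 30 },
  { id: "mistral:small-3.2", rpm: 2 },
  { id: "mistral:medium-3.1", rpm: 2 },
  { id: "mistral:ministral-8b", rpm: 2 },
];

// Upper bound on requests per minute if every slot is kept saturated.
function effectiveRpm(slots: SlotBudget[]): number {
  return slots.reduce((total, slot) => total + slot.rpm, 0);
}

// Best-case minutes needed for a scan at a given sustained RPM.
function minutesToParse(emailCount: number, rpm: number): number {
  return emailCount / rpm;
}
```

`effectiveRpm(FREE_SLOTS)` gives 186, or 11,160 requests per hour, which the budget above rounds down to ~180 per minute and ~10,800 per hour. This is a ceiling, since it assumes one request per email and no slot hitting its daily limit.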
### Speed Benchmarks (Revised with Pipeline)
| Scan Type | Emails | Time (Pipeline + Auto-Switcher) |
|-----------|--------|-------------------------------|
| Today | 50 | **~12 seconds** |
| 7 Days | 200 | **~35 seconds** |
| Custom (1 month) | 1,000 | **~3 minutes** |
| Custom (6 months) | 5,000 | **~15 minutes** |
| Custom (1 year) | 10,000 | **~30 minutes** |
---
## 7. Non-AI Technology Flows
### PDF Receipt Generation (No AI)
```
User clicks "Générer reçu" on a transaction row
CLIENT-SIDE (single receipt):
@react-pdf/renderer builds PDF in browser
Template: ICC logo + transaction details + branch info + date
Downloads immediately as "recu_CA1b2c3d4e5f.pdf"
SERVER-SIDE (batch receipts):
POST /api/receipts/batch { transactionIds: [...] }
PDFKit generates PDF per transaction
Merges into single ZIP file
Returns ZIP for download
```
### CSV/Excel Export (No AI)
```
User clicks "Exporter" → selects CSV or Excel
CLIENT-SIDE:
SheetJS (xlsx npm) builds file from TanStack Table data
Columns: Date | Expéditeur | Montant | Référence | Succursale | Statut
fr-CA formatting: "1 500,00 $" not "$1,500.00"
Downloads immediately
```
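The fr-CA amount format can be produced with the built-in `Intl.NumberFormat`, independent of the export library. The sketch below does not use SheetJS; it only illustrates the formatting and CSV quoting that the export relies on. `formatAmount`, `toCsvCell` and `toCsvRow` are hypothetical names, and replacing the non-breaking spaces that `Intl` emits with plain spaces is an assumption made for spreadsheet friendliness.

```typescript
const cadFormatter = new Intl.NumberFormat("fr-CA", {
  style: "currency",
  currency: "CAD",
});

// Formats an amount the fr-CA way: space as thousands separator, comma as
// decimal separator, currency sign after the number.
function formatAmount(amount: number): string {
  // Intl uses non-breaking spaces; normalize them to ordinary spaces.
  return cadFormatter.format(amount).replace(/\s/g, " ");
}

// Quotes a cell when it contains a comma, quote, or line break.
function toCsvCell(value: string): string {
  return /[",\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

function toCsvRow(cells: string[]): string {
  return cells.map(toCsvCell).join(",");
}
```

Since fr-CA amounts contain a decimal comma, every amount cell ends up quoted in a comma-separated file; using a semicolon delimiter instead is a common alternative for French-locale spreadsheets.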
### Reports & Charts (No AI)
```
User navigates to /reports
GET /api/transactions/stats?from=2026-01-01&to=2026-02-23
PostgreSQL aggregation queries:
- SUM(amount) GROUP BY month → bar chart (Recharts)
- COUNT(*) GROUP BY status → pie chart (Recharts)
- COUNT(*) GROUP BY branch → top branches table
- SUM(amount) over time → trend line chart (Recharts)
React renders charts with Recharts
No AI involved — pure SQL aggregation + charting library
```
### Real-Time Dashboard Updates (No AI)
```
Server saves new transaction to PostgreSQL
Socket.io server emits:
transaction:new { transaction: {...} }
Socket.io client receives event
Zustand store updates transaction list
TanStack Table re-renders with new row
(animated row insertion at top of table)
SummaryBar recalculates totals
(no page refresh, no API call needed)
```
---
## 8. Security Flow (No AI)
```
1. User clicks "Se connecter avec Google"
2. Frontend redirects to Google OAuth consent URL
(scopes: gmail.readonly, userinfo.email, userinfo.profile)
3. User approves → Google redirects to /api/auth/google/callback?code=xxx
4. Backend exchanges auth code for access_token + refresh_token
(google-auth-library)
5. Backend encrypts tokens with AES-256 (crypto module)
6. Backend stores encrypted tokens in PostgreSQL (users table)
7. Backend issues JWT: { userId, email, role, exp: 15min }
(jsonwebtoken)
8. Backend sets the JWT in an httpOnly, Secure cookie
9. Browser sends that cookie automatically with every API request
10. Backend middleware verifies JWT on every request
(jsonwebtoken.verify)
11. If JWT expired → frontend calls POST /api/auth/refresh
with refresh token → new JWT issued
```
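Steps 5 and 6 say only that tokens are encrypted with AES-256 using Node's `crypto` module. The sketch below shows one way that could look; the GCM mode, the 12-byte random IV, the `iv.tag.ciphertext` base64 layout, and the function names are all assumptions, not details confirmed by this document.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Encrypts a token with AES-256-GCM. `key` must be exactly 32 bytes.
function encryptToken(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // fresh IV for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  // Store all three parts together so decryption is self-contained.
  return [iv, tag, ciphertext].map((part) => part.toString("base64")).join(".");
}

// Reverses encryptToken; throws if the payload was tampered with.
function decryptToken(payload: string, key: Buffer): string {
  const [iv, tag, ciphertext] = payload
    .split(".")
    .map((part) => Buffer.from(part, "base64"));
  if (iv === undefined || tag === undefined || ciphertext === undefined) {
    throw new Error("Malformed encrypted token");
  }
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

An authenticated mode such as GCM detects modified ciphertext at decryption time, which is useful for values like refresh tokens that are read back and sent to Google. The key itself would come from an environment variable or secret store rather than the database.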
---
*This document is a companion to the ICC Interac Manager Build Prompt and the FREE AI Models Guide. Together, the three documents provide everything needed to build the complete system.*