Space: safetrack/edtech
CognxSafeTrack Claude Sonnet 4.6 committed 21 days ago

Commit 5f0a436

1 Parent(s): a9bacbe

fix(bot): eliminate double feedback, fix gemini model, fix multilingual prompts

1. Double feedback (critical):
Bridge was enqueuing both download-media AND handle-inbound for every audio/image
message. Both paths independently transcribed the audio and called
WhatsAppLogic.handleIncomingMessage → two generate-feedback jobs → user received
two Coach responses. Fix: remove handle-inbound and send-message-direct from bridge
for media. MediaHandler (download-media) already handles full pipeline.
MediaHandler now sends a localized spinner after user lookup (sendSpinner flag).

2. gemini-1.5-pro → 404:
gemini-1.5-pro is deprecated via Google API v1beta. Updated GeminiProvider to use
gemini-2.0-flash for both flash and complex (pro) requests.

3. Multilingual deep-dive invitation (EN/ES/PT):
action-feedback-standard.md only had FR and WO variants for the APPROFONDIR/SUITE
invitation at end of feedback. Added EN, ES, PT — AI was mixing languages for
English/Spanish/Portuguese users.

4. InboundHandler [object Object] bug:
aiService.transcribeAudio returns { text, confidence }, not a string. Template
literal was stringifying the object. Fixed to use result.text.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Files changed (7)

apps/whatsapp-worker/src/handlers/InboundHandler.ts +4 -4
apps/whatsapp-worker/src/handlers/MediaHandler.ts +13 -0
apps/whatsapp-worker/src/handlers/types.ts +1 -0
apps/whatsapp-worker/src/index.ts +6 -21
docs/organisation-onboarding/configuration-reference-2026-05-12.md +202 -0
packages/ai-sdk/src/gemini-provider.ts +3 -4
packages/prompts/src/templates/action-feedback-standard.md +3 -0

apps/whatsapp-worker/src/handlers/InboundHandler.ts CHANGED

@@ -51,10 +51,10 @@ export class InboundHandler implements JobHandler {
                             baseURL: '' // The URL is absolute
                         });
                         const audioBuffer = audioRes.data;
-                        const transcribedText = await aiService.transcribeAudio(Buffer.from(audioBuffer), `msg_${mediaId}.ogg`);
-                        if (transcribedText) {
-                            text = `🎤 Transcription : ${transcribedText}`;
+                        const transcribeResult = await aiService.transcribeAudio(Buffer.from(audioBuffer), `msg_${mediaId}.ogg`);
+                        if (transcribeResult?.text) {
+                            text = transcribeResult.text;
                         }
                     }
                 }
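
The `[object Object]` symptom fixed in this hunk can be reproduced in isolation. The sketch below is a minimal illustration, not the project's code: the `TranscribeResult` field types and the `formatBuggy` / `extractText` helpers are hypothetical names, and only the `{ text, confidence }` shape comes from the commit message.

```typescript
// Shape taken from the commit message; the field types are an assumption.
interface TranscribeResult {
  text: string;
  confidence: number;
}

// Old behavior: interpolating the whole result object stringifies it
// through Object.prototype.toString.
function formatBuggy(result: TranscribeResult): string {
  return `🎤 Transcription : ${result}`;
}

// New behavior: read the text field and fall back when it is missing or empty.
function extractText(result: TranscribeResult | undefined, fallback: string): string {
  return result?.text ? result.text : fallback;
}

const sample: TranscribeResult = { text: 'hello coach', confidence: 0.92 };
console.log(formatBuggy(sample));     // 🎤 Transcription : [object Object]
console.log(extractText(sample, '')); // hello coach
```

TypeScript accepts the interpolation in `formatBuggy` without a compile error, which is probably why the bug went unnoticed; a lint rule that restricts template expressions to strings would likely flag it.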

apps/whatsapp-worker/src/handlers/MediaHandler.ts CHANGED

@@ -69,6 +69,19 @@ export class MediaHandler implements JobHandler {
                         organizationId: organizationId || user.organizationId
                     }
                 });
+                // Send localized spinner (bridge no longer sends send-message-direct for media)
+                if (job.data.sendSpinner) {
+                    const isImage = mimeType?.startsWith('image/');
+                    const spinnerMsg = isImage ? {
+                        FR: "⏳ J'analyse ton image...", WOLOF: "⏳ Defar ak sa nataal...",
+                        EN: "⏳ Analysing your image...", ES: "⏳ Analizando tu imagen...", PT: "⏳ Analisando sua imagem..."
+                    }[user.language] ?? "⏳ Analysing..." : {
+                        FR: "⏳ J'analyse ton audio...", WOLOF: "⏳ Defar ak sa kàddu...",
+                        EN: "⏳ Analysing your audio...", ES: "⏳ Analizando tu audio...", PT: "⏳ Analisando seu áudio..."
+                    }[user.language] ?? "⏳ Analysing...";
+                    await sendTextMessage(phone, spinnerMsg, tenantConfig);
+                }
             }
             if (mimeType && mimeType.startsWith('audio/')) {
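
The inline ternary in this hunk packs two lookup tables and two fallbacks into one expression. As a reading aid, here is one way the same selection could be written as a standalone function. This is a sketch under assumptions: `pickSpinner`, `SPINNERS`, and `DEFAULT_SPINNER` are hypothetical names; only the message strings and the language keys are copied from the diff.

```typescript
type MediaKind = 'image' | 'audio';

// Strings copied from the diff; the table layout itself is hypothetical.
const SPINNERS: Record<MediaKind, Partial<Record<string, string>>> = {
  image: {
    FR: "⏳ J'analyse ton image...", WOLOF: "⏳ Defar ak sa nataal...",
    EN: "⏳ Analysing your image...", ES: "⏳ Analizando tu imagen...", PT: "⏳ Analisando sua imagem...",
  },
  audio: {
    FR: "⏳ J'analyse ton audio...", WOLOF: "⏳ Defar ak sa kàddu...",
    EN: "⏳ Analysing your audio...", ES: "⏳ Analizando tu audio...", PT: "⏳ Analisando seu áudio...",
  },
};

const DEFAULT_SPINNER = "⏳ Analysing...";

// Anything that is not an image is treated as audio, as in the diff.
function pickSpinner(language: string | undefined, mimeType: string | undefined): string {
  const kind: MediaKind = mimeType?.startsWith('image/') ? 'image' : 'audio';
  return SPINNERS[kind][language ?? ''] ?? DEFAULT_SPINNER;
}

console.log(pickSpinner('ES', 'image/jpeg')); // ⏳ Analizando tu imagen...
console.log(pickSpinner('DE', 'audio/ogg'));  // ⏳ Analysing...
```

Keeping the table separate from the selection logic makes it easier to add a language in one place and to unit-test the fallback.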

apps/whatsapp-worker/src/handlers/types.ts CHANGED

@@ -88,6 +88,7 @@ export interface JobData {
     adminId?: string;
     accessToken?: string;
     overrideAudioUrl?: string;
+    sendSpinner?: boolean;
     // Interactive components
     buttons?: Array<{ id: string; title: string }>;

apps/whatsapp-worker/src/index.ts CHANGED

@@ -135,33 +135,18 @@ server.post('/v1/internal/whatsapp/inbound', async (req: FastifyRequest, reply:
             const isImage = msg.mediaType === 'image';
             logger.info(`[BRIDGE] Enqueuing inbound media (${msg.mediaType}) from ${msg.phone}`);
-            // 1. Traditional Media Processing (for training/pedagogy)
+            // Single media job: MediaHandler handles transcription + DB logging + pedagogy.
+            // A second handle-inbound job was previously enqueued here, causing double
+            // transcription and double generate-feedback jobs for every audio message.
             await whatsappQueue.add('download-media', {
                 mediaId: msg.mediaId,
                 mimeType: isImage ? 'image/jpeg' : 'audio/ogg',
                 phone: msg.phone,
                 organizationId,
-                caption: msg.caption
+                caption: msg.caption,
+                sendSpinner: true  // MediaHandler will send a localized spinner after user lookup
             }, { priority: 1, attempts: 3, backoff: { type: 'exponential', delay: 2000 } });
-            // 2. New CRM Inbox flow
-            await whatsappQueue.add('handle-inbound', {
-                phone: msg.phone,
-                text: msg.caption || '', // Use caption as text if available
-                audioUrl: !isImage ? msg.mediaId : undefined,
-                imageUrl: isImage ? msg.mediaId : undefined,
-                organizationId
-            }, {
-                attempts: 3,
-                backoff: { type: 'exponential', delay: 1000 }
-            });
-            await whatsappQueue.add('send-message-direct', {
-                phone: msg.phone,
-                text: isImage ? "⏳ J'analyse ton image..." : "⏳ J'analyse ton audio...",
-                organizationId
-            });
         }
     }
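
A regression check along the following lines could guard against the double enqueue returning. It is only a sketch: `FakeQueue`, `InboundMedia`, and `enqueueInboundMedia` are hypothetical stand-ins for the BullMQ queue and the bridge route, and the job payload mirrors the fields visible in the hunk above.

```typescript
// Hypothetical minimal shapes; the real bridge uses a BullMQ queue and a richer message type.
interface InboundMedia {
  mediaId: string;
  mediaType: 'image' | 'audio';
  phone: string;
  caption?: string;
}

interface RecordedJob {
  name: string;
  data: Record<string, unknown>;
}

// In-memory stand-in that records every job it is asked to enqueue.
class FakeQueue {
  readonly jobs: RecordedJob[] = [];
  async add(name: string, data: Record<string, unknown>): Promise<void> {
    this.jobs.push({ name, data });
  }
}

// Mirrors the new bridge behavior: exactly one download-media job per media message.
async function enqueueInboundMedia(queue: FakeQueue, msg: InboundMedia, organizationId: string): Promise<void> {
  const isImage = msg.mediaType === 'image';
  await queue.add('download-media', {
    mediaId: msg.mediaId,
    mimeType: isImage ? 'image/jpeg' : 'audio/ogg',
    phone: msg.phone,
    organizationId,
    caption: msg.caption,
    sendSpinner: true,
  });
}

const queue = new FakeQueue();
enqueueInboundMedia(queue, { mediaId: 'm1', mediaType: 'audio', phone: '+10000000000' }, 'default-org-id')
  .then(() => console.log(queue.jobs.map((job) => job.name))); // [ 'download-media' ]
```

Asserting on the list of job names, rather than on the presence of `download-media` alone, is what would catch a stray `handle-inbound` or `send-message-direct` being added back.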

docs/organisation-onboarding/configuration-reference-2026-05-12.md ADDED

	@@ -0,0 +1,202 @@

+# Reference Configuration: Working XAMLÉ Bot
+**Validation date**: 12 May 2026
+**Status**: ✅ Tested in production (full EN flow validated)
+---
+## Deployment Architecture
+```
+Meta WhatsApp Cloud API
+        │
+        ▼ webhook POST
+┌─────────────────────────────┐
+│  HuggingFace Space          │  safetrack/edtech
+│  CPU Basic (free)           │  port 7860 (PM2)
+│                             │
+│  ├─ apps/api (Fastify)      │  ← receives the Meta webhook
+│  │   └─ /v1/whatsapp/webhook│  → validates, responds in 200 ms
+│  │   └─ /v1/ai/tts          │  ← audio TTS
+│  │   └─ /v1/ai/transcribe   │  ← audio transcription
+│  │   └─ /v1/ai/store-audio  │  ← R2 storage
+│  │                          │
+│  └─ apps/whatsapp-worker    │  DISABLE_WORKER_CONSUMER=true
+│      └─ Bridge :8082        │  → forwards to Railway
+│         /v1/internal/       │
+│         whatsapp/inbound    │
+└─────────────────────────────┘
+        │ HTTP POST (ADMIN_API_KEY)
+        ▼
+┌─────────────────────────────┐
+│  Railway (US West)          │  whatsapp-worker service
+│  whatsapp-worker            │
+│                             │
+│  ├─ BullMQ Worker           │  ← consumes whatsapp-queue
+│  ├─ Bridge :8082            │  ← receives from HuggingFace
+│  └─ Redis (shared)          │  ← Upstash Redis
+└─────────────────────────────┘
+        │
+        ▼
+┌─────────────────────────────┐
+│  Neon PostgreSQL            │  ep-divine-boat-a8pfifri
+│  (Production database)      │
+└─────────────────────────────┘
+        │
+        ▼
+┌─────────────────────────────┐
+│  Cloudflare R2              │  pub-e770286d...r2.dev
+│  (Audio/image storage)      │
+└─────────────────────────────┘
+```
+---
+## Critical Environment Variables
+### HuggingFace (api + worker in bridge mode)
+| Variable | Role |
+|----------|------|
+| `DISABLE_WORKER_CONSUMER` | **`true`**: prevents HF from consuming the BullMQ queue (only Railway consumes it) |
+| `DATABASE_URL` | Neon PostgreSQL |
+| `REDIS_URL` | Upstash Redis (same instance as Railway) |
+| `ADMIN_API_KEY` | Internal auth for the HF → Railway bridge |
+| `RAILWAY_INTERNAL_URL` | Railway bridge URL (`:8082`) |
+| `WHATSAPP_VERIFY_TOKEN` | Meta webhook verification |
+| `WHATSAPP_ACCESS_TOKEN` | Global Meta token (fallback when the org has no token) |
+| `WHATSAPP_PHONE_NUMBER_ID` | Meta phone number ID |
+| `OPENAI_API_KEY` | Whisper STT + TTS + GPT-4o |
+| `GOOGLE_AI_API_KEY` | Gemini (with OpenAI fallback when the quota is exhausted) |
+| `ENCRYPTION_SECRET` | **Must be identical to Railway's**: key used to decrypt org tokens |
+| `JWT_SECRET` | Admin dashboard auth |
+| `R2_*` | Cloudflare R2 (store-audio, images) |
+### Railway (worker, BullMQ consumer)
+| Variable | Critical value | Role |
+|----------|----------------|------|
+| `ENCRYPTION_SECRET` | **Same value as HuggingFace** | Decrypts `systemUserToken` in the DB |
+| `DATABASE_URL` | Neon PostgreSQL (same instance) | |
+| `REDIS_URL` | Upstash Redis (same instance as HF) | |
+| `ADMIN_API_KEY` | Min 32 chars (same value as HF) | Bridge auth |
+| `API_URL` | Internal HF URL (port 7860) | TTS, transcription, and store-audio calls |
+| `DISABLE_WORKER_CONSUMER` | **absent or `false`** | Railway MUST consume the queue |
+| `WORKER_CONCURRENCY` | `5` | Parallel jobs |
+> **⚠️ ENCRYPTION_SECRET rule**: The value in the DB is encrypted with the key that was active when `apps/api/scratch/encrypt_existing_secrets.ts` was run. If that key changes, every decryption attempt returns the `enc:...` ciphertext, which results in an Authentication Error from Meta. Never change this key without running `apps/api/scratch/rotate_encryption_key.ts`.
+---
+## Validated User Flow (12 May 2026)
+```
+User → "INSCRIPTION"
+    → OnboardingHandler: Hard Reset (delete enrollments, activity=null)
+    → Language list sent
+User → "English 🇬🇧" (LANG_EN)
+    → OnboardingHandler: user.language = EN
+    → Sector list sent
+User → "Tech / Digital" (SEC_TECH)
+    → OnboardingHandler: user.activity = "Tech / Digital"
+      [FIX 2026-05-12: guard !user.activity instead of !activeEnrollment]
+    → send-message "Sector noted: Tech / Digital"
+    → enroll-user → EnrollHandler → enrollment created (token decrypted via getCachedOrganization)
+    → send-content (Day 1)
+ContentHandler → Day 1
+    → sendLessonDay() → Gemini 429 → fallback OpenAI GPT-4o ✅
+    → Content generated + TTS audio sent
+    → Video + texts sent
+User → [Voice message]
+    → HF bridge: download-media job enqueued
+    → MediaHandler: downloads, stores on R2, transcribes (OpenAI Whisper)
+    → WhatsAppLogic.handleIncomingMessage (transcribed text)
+    → ExerciseHandler: enqueues generate-feedback
+    → FeedbackHandler: Gemini 429 → fallback OpenAI ✅
+    → XAMLÉ Coach feedback sent (once only)
+```
+---
+## Organizations in the Database
+| ID | Name | systemUserToken |
+|----|-----|----------------|
+| `default-org-id` | XAMLÉ Global | ✅ Encrypted with the local ENCRYPTION_SECRET |
+| `136f72d9-...` | test | empty |
+| `ba012b65-...` | testcrm | empty |
+The active organization is `default-org-id`. The decrypted Meta token starts with `EAAURe...` (199 chars).
+---
+## AI Models Used
+| Task | Provider | Model | Fallback |
+|-------|----------|--------|---------|
+| Lesson content generation | Gemini (priority 100) | `gemini-2.0-flash` | OpenAI `gpt-4o` |
+| Feedback generation | Gemini (priority 100) | `gemini-2.0-flash` | OpenAI `gpt-4o` |
+| Audio transcription (STT) | OpenAI (only provider) | `whisper-1` | none |
+| Audio TTS | OpenAI (only provider) | `tts-1` | none |
+> **Gemini note**: The free tier quota (`generativelanguage.googleapis.com/generate_content_free_tier_requests`) is regularly exhausted. The OpenAI fallback is automatic and transparent. Add a paid Gemini key in `Organization.googleAiApiKey` to stop depending on the global quota.
+---
+## Handler Chain (whatsapp-logic.ts)
+Execution order; the first handler whose `canHandle()` returns `true` wins:
+1. **AIAgentHandler**: if `organization.mode === 'AI_AGENT'`
+2. **OnboardingHandler**: INSCRIPTION, LANG_*, SEC_* (if `!user.activity`), free-text activity
+3. **CommandHandler**: SEED, MENU_HISTORIQUE, DAY{n}_{ACTION}, ADMIN_*
+4. **NavigationHandler**: MENU, SUITE, CONTINUER, DAY{n}, language
+5. **ExerciseHandler**: if `activeEnrollment` exists (catch-all for pedagogical answers)
+---
+## BullMQ Jobs: Naming
+| Job | Handler | Trigger |
+|-----|---------|------------|
+| `handle-inbound` | InboundHandler | Incoming text |
+| `download-media` | MediaHandler | Incoming audio/image |
+| `enroll-user` | EnrollHandler | Sector selection |
+| `send-content` | ContentHandler | After enrollment, after a validated exercise |
+| `generate-feedback` | FeedbackHandler | After an exercise answer |
+| `send-message` | MessageHandler | Any text output |
+| `send-interactive-list` | MessageHandler | Interactive lists |
+| `send-nudge` | NudgeHandler | Scheduled reminders |
+| `send-broadcast` | BroadcastHandler | Campaigns |
+---
+## Known Limitations (to monitor)
+| Limitation | Impact | Mitigation |
+|-----------|--------|-----------|
+| Gemini free tier quota exhausted | +2-3 s latency (OpenAI fallback) | Add a paid key to the org |
+| `SERPER_API_KEY` missing | Mock search → market data is not real | Add a Serper key |
+| HuggingFace sleeps after 48 h of inactivity | Slow first message (~30 s cold start) | Send a daily ping |
+| `gemini-1.5-pro` deprecated via v1beta | 404 error → OpenAI fallback | Migrated to gemini-2.0-flash |
+---
+## Diagnostic Commands
+```bash
+# Check whether ENCRYPTION_SECRET can decrypt the secrets stored in the DB
+ENCRYPTION_SECRET="..." DATABASE_URL="..." \
+  npx tsx apps/api/scratch/check_encryption.ts
+# Run the org encoding diagnostic
+ENCRYPTION_SECRET="..." DATABASE_URL="..." \
+  npx tsx apps/api/scratch/audit_db.ts
+# Key rotation (if ENCRYPTION_SECRET has changed)
+OLD_ENCRYPTION_SECRET="..." NEW_ENCRYPTION_SECRET="..." DATABASE_URL="..." \
+  npx tsx apps/api/scratch/rotate_encryption_key.ts
+```
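
The handler chain described in the added configuration reference ("first `canHandle()` that returns `true` wins") amounts to a first-match dispatch over an ordered list. A minimal sketch follows, assuming a much-simplified context: the `Ctx` fields, the `route` function, and the exact matching conditions are hypothetical, and only the handler names, their order, and the keywords come from the document.

```typescript
// Hypothetical, simplified context; the real handlers inspect richer state.
interface Ctx {
  orgMode?: string;
  text: string;
  hasActivity: boolean;
  hasActiveEnrollment: boolean;
}

interface Handler {
  name: string;
  canHandle(ctx: Ctx): boolean;
}

// Order matters: earlier handlers shadow later ones.
const chain: Handler[] = [
  { name: 'AIAgentHandler', canHandle: (c) => c.orgMode === 'AI_AGENT' },
  {
    name: 'OnboardingHandler',
    canHandle: (c) =>
      c.text === 'INSCRIPTION' || c.text.startsWith('LANG_') || (c.text.startsWith('SEC_') && !c.hasActivity),
  },
  {
    name: 'CommandHandler',
    canHandle: (c) => c.text === 'SEED' || c.text === 'MENU_HISTORIQUE' || c.text.startsWith('ADMIN_'),
  },
  { name: 'NavigationHandler', canHandle: (c) => ['MENU', 'SUITE', 'CONTINUER'].includes(c.text) },
  { name: 'ExerciseHandler', canHandle: (c) => c.hasActiveEnrollment },
];

// Returns the name of the first handler that accepts the message, if any.
function route(ctx: Ctx): string | undefined {
  return chain.find((h) => h.canHandle(ctx))?.name;
}

console.log(route({ text: 'SUITE', hasActivity: true, hasActiveEnrollment: true }));       // NavigationHandler
console.log(route({ text: 'My first sale', hasActivity: true, hasActiveEnrollment: true })); // ExerciseHandler
```

Because `ExerciseHandler` is a catch-all for enrolled users, it has to stay last; placing it earlier would swallow navigation keywords such as `SUITE`.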

packages/ai-sdk/src/gemini-provider.ts CHANGED

@@ -11,10 +11,9 @@ export class GeminiProvider implements LLMProvider {
     constructor(apiKey: string) {
         logger.info('[GEMINI] Initializing SDK...');
         this.genAI = new GoogleGenerativeAI(apiKey);
-        // Standard model for normal requests
         this.flashModel = this.genAI.getGenerativeModel({ model: 'gemini-2.0-flash' });
-        // Pro model for long context & complex doc generation
-        this.proModel = this.genAI.getGenerativeModel({ model: 'gemini-1.5-pro' });
+        // gemini-1.5-pro is deprecated via v1beta; use gemini-2.0-flash for all requests
+        this.proModel = this.genAI.getGenerativeModel({ model: 'gemini-2.0-flash' });
     }
     async generateStructuredData<T>(prompt: string, _schema: z.ZodSchema<T>, temperature?: number, imageUrl?: string): Promise<T> {
@@ -22,7 +21,7 @@ export class GeminiProvider implements LLMProvider {
         // Use Pro for complex docs (OnePager/PitchDeck) - detected by prompt length or keyword
         const isComplex = prompt.includes('PITCH_DECK') || prompt.includes('ONE_PAGER') || prompt.length > 2000;
         const model = isComplex ? this.proModel : this.flashModel;
-        const modelName = isComplex ? 'gemini-1.5-pro' : 'gemini-2.0-flash';
+        const modelName = isComplex ? 'gemini-2.0-flash (complex)' : 'gemini-2.0-flash';
         logger.info(`[GEMINI] Generating structured data with ${modelName}... (Vision: ${!!imageUrl})`);
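
After this change both branches resolve to the same model, so the "complex" classification only affects the log label. The sketch below isolates that classification so it can be checked without the SDK. `isComplexPrompt` and `describeModel` are hypothetical helper names; the keywords, the 2000-character threshold, and the labels are taken from the diff.

```typescript
const FLASH_MODEL = 'gemini-2.0-flash';
const COMPLEX_PROMPT_LENGTH = 2000; // threshold taken from the diff

// A prompt counts as complex if it names a long document type or exceeds the length threshold.
function isComplexPrompt(prompt: string): boolean {
  return prompt.includes('PITCH_DECK') || prompt.includes('ONE_PAGER') || prompt.length > COMPLEX_PROMPT_LENGTH;
}

// Both branches use the same model; only the label used in logs differs.
function describeModel(prompt: string): { model: string; label: string } {
  const complex = isComplexPrompt(prompt);
  return { model: FLASH_MODEL, label: complex ? `${FLASH_MODEL} (complex)` : FLASH_MODEL };
}

console.log(describeModel('Generate a PITCH_DECK for a bakery').label); // gemini-2.0-flash (complex)
console.log(describeModel('Short feedback prompt').label);              // gemini-2.0-flash
```

If a distinct model for long documents is reintroduced later, `describeModel` would be the single place to change, and the `proModel` field in the provider would regain a purpose.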

packages/prompts/src/templates/action-feedback-standard.md CHANGED

@@ -32,5 +32,8 @@ INVITATION DEEP-DIVE OBLIGATOIRE :
 Termine EXACTEMENT ton pilier 3 par cette phrase selon la langue :
 (FR): "Si tu veux affiner ce point avec une donnée de ton propre terrain, tape 1️⃣ APPROFONDIR, sinon tape 2️⃣ SUITE."
 (WO): "Su nga bëggee yokk leneen ci li nga xam, bindal 1️⃣ APPROFONDIR, wala nga bind 2️⃣ SUITE."
+(EN): "To refine this point with data from your own field, type 1️⃣ DEEP DIVE, otherwise type 2️⃣ CONTINUE."
+(ES): "Para profundizar este punto con datos de tu propio campo, escribe 1️⃣ PROFUNDIZAR, de lo contrario escribe 2️⃣ CONTINUAR."
+(PT): "Para aprofundar este ponto com dados do seu próprio campo, escreva 1️⃣ APROFUNDAR, caso contrário escreva 2️⃣ CONTINUAR."
 {{buttonBypassBlock}}