# Phase 2 Voice Implementation - AI Assistant Prompts

## Overview

This file contains 6 carefully crafted prompts to guide AI assistants (like Claude, ChatGPT, etc.) through implementing Phase 2 voice features step by step.

**How to Use:**
1. Copy each prompt one at a time
2. Paste into your AI assistant
3. Review the generated code
4. Test before moving to the next prompt
5. Track progress in `PHASE_2_CHECKLIST.md`

**Context Files to Attach:**
- `PHASE_2_VOICE_IMPLEMENTATION_PLAN.md` (master plan)
- `PHASE_2_ARCHITECTURE.md` (architecture)
- `app/agent/honeypot.py` (existing honeypot)
- `app/config.py` (existing config)

---

## 📋 PROMPT 1: ASR Module (Whisper Transcription)

**Estimated Time:** 2 hours  
**Dependencies:** `pip install openai-whisper torchaudio`  
**Output:** `app/voice/asr.py`

### Prompt

```
I'm implementing Phase 2 voice features for my ScamShield AI honeypot project. I need to create an ASR (Automatic Speech Recognition) module using Whisper.

CONTEXT:
- This is Phase 2, which wraps around an existing Phase 1 text honeypot
- Phase 1 must remain unchanged
- The ASR module will transcribe audio to text, which then feeds into Phase 1

REQUIREMENTS:

1. Create file: app/voice/asr.py

2. Implement ASREngine class with:
   - __init__(model_size: str = "base") - Initialize Whisper model
   - _load_model() - Load Whisper model (tiny/base/small/medium/large)
   - transcribe(audio_path: str, language: Optional[str] = None) -> Dict
     Returns: {"text": str, "language": str, "confidence": float}
   - _calculate_confidence(result: Dict) -> float - Calculate confidence from Whisper output

3. Features:
   - Support multiple Whisper model sizes (configurable)
   - Auto-detect language or accept language hint
   - GPU support if available (cuda), else CPU
   - Return transcription with confidence score
   - Handle errors gracefully (return empty text with 0.0 confidence)

4. Singleton pattern:
   - get_asr_engine() -> ASREngine (global instance)

5. Code quality:
   - Type hints for all functions
   - Docstrings (Google style)
   - Logging using app.utils.logger.get_logger(__name__)
   - Error handling with try/except

6. Configuration from settings:
   - settings.WHISPER_MODEL (default: "base")
   - Auto-detect device (cuda/cpu)

REFERENCE IMPLEMENTATION (from PHASE_2_VOICE_IMPLEMENTATION_PLAN.md):
[See Step 2.1 in the plan]

ACCEPTANCE CRITERIA:
- [ ] ASREngine class created
- [ ] Whisper model loads successfully
- [ ] transcribe() returns correct format
- [ ] Language detection works
- [ ] Confidence calculation implemented
- [ ] Singleton pattern works
- [ ] Error handling present
- [ ] Type hints and docstrings complete
- [ ] Logging added

Please generate the complete app/voice/asr.py file with production-ready code.
```
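
If you want a feel for what the assistant should hand back, here is a minimal sketch of the module's shape. It is an illustration, not the deliverable: logging and the `settings.WHISPER_MODEL` wiring are omitted, and the mapping from Whisper's per-segment `avg_logprob` to a 0-1 confidence is an assumption you can ask the assistant to refine.

```python
# Minimal sketch of app/voice/asr.py (assumes openai-whisper and torch are installed;
# logging and settings wiring from the prompt are omitted for brevity).
from typing import Dict, Optional

import torch
import whisper


class ASREngine:
    """Wraps a Whisper model and returns text, language, and a rough confidence."""

    def __init__(self, model_size: str = "base") -> None:
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.model = whisper.load_model(model_size, device=self.device)

    def transcribe(self, audio_path: str, language: Optional[str] = None) -> Dict:
        try:
            result = self.model.transcribe(audio_path, language=language)
            return {
                "text": result["text"].strip(),
                "language": result.get("language", language or "unknown"),
                "confidence": self._calculate_confidence(result),
            }
        except Exception:
            # Fail soft: callers receive an empty transcription instead of an exception.
            return {"text": "", "language": language or "unknown", "confidence": 0.0}

    @staticmethod
    def _calculate_confidence(result: Dict) -> float:
        # Whisper reports avg_logprob per segment (roughly -1..0); this 0-1 mapping
        # is an illustrative assumption, not part of the Whisper API.
        segments = result.get("segments", [])
        if not segments:
            return 0.0
        avg = sum(s.get("avg_logprob", -1.0) for s in segments) / len(segments)
        return max(0.0, min(1.0, 1.0 + avg))


_asr_engine: Optional[ASREngine] = None


def get_asr_engine() -> ASREngine:
    """Singleton accessor so the model is only loaded once per process."""
    global _asr_engine
    if _asr_engine is None:
        _asr_engine = ASREngine()
    return _asr_engine
```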

---

## 📋 PROMPT 2: TTS Module (Text-to-Speech)

**Estimated Time:** 2 hours  
**Dependencies:** `pip install gTTS`  
**Output:** `app/voice/tts.py`

### Prompt

```
I'm implementing Phase 2 voice features for my ScamShield AI honeypot. I need to create a TTS (Text-to-Speech) module using gTTS.

CONTEXT:
- This is Phase 2, which wraps around an existing Phase 1 text honeypot
- Phase 1 generates text replies, TTS converts them to speech
- The TTS module will convert AI text replies to audio files

REQUIREMENTS:

1. Create file: app/voice/tts.py

2. Implement TTSEngine class with:
   - __init__() - Initialize TTS engine
   - synthesize(text: str, language: str = "en", output_path: Optional[str] = None) -> str
     Returns: Path to generated audio file
   - Language mapping for Indic languages (en, hi, gu, ta, te, bn, mr)

3. Features:
   - Support multiple languages (English, Hindi, Gujarati, Tamil, Telugu, Bengali, Marathi)
   - Auto-generate output path if not provided (use tempfile)
   - Return path to generated .mp3 file
   - Handle errors gracefully (raise exception with clear message)

4. Singleton pattern:
   - get_tts_engine() -> TTSEngine (global instance)

5. Code quality:
   - Type hints for all functions
   - Docstrings (Google style)
   - Logging using app.utils.logger.get_logger(__name__)
   - Error handling with try/except

6. Configuration from settings:
   - settings.TTS_ENGINE (default: "gtts")

REFERENCE IMPLEMENTATION (from PHASE_2_VOICE_IMPLEMENTATION_PLAN.md):
[See Step 2.2 in the plan]

LANGUAGE MAPPING:
- "english" -> "en"
- "hindi" -> "hi"
- "gujarati" -> "gu"
- "tamil" -> "ta"
- "telugu" -> "te"
- "bengali" -> "bn"
- "marathi" -> "mr"

ACCEPTANCE CRITERIA:
- [ ] TTSEngine class created
- [ ] synthesize() generates audio files
- [ ] Language mapping works for Indic languages
- [ ] Temp file generation works
- [ ] Singleton pattern works
- [ ] Error handling present
- [ ] Type hints and docstrings complete
- [ ] Logging added

Please generate the complete app/voice/tts.py file with production-ready code.
```
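
For comparison, the core of this module can be as small as the sketch below; gTTS does the heavy lifting. The language map mirrors the table above, and logging plus the `settings.TTS_ENGINE` switch are left out.

```python
# Minimal sketch of app/voice/tts.py using gTTS (pip install gTTS).
import tempfile
from typing import Optional

from gtts import gTTS

# Mirrors the language mapping listed in the prompt above.
LANGUAGE_MAP = {
    "english": "en", "hindi": "hi", "gujarati": "gu", "tamil": "ta",
    "telugu": "te", "bengali": "bn", "marathi": "mr",
}


class TTSEngine:
    """Converts reply text to an .mp3 file and returns its path."""

    def synthesize(self, text: str, language: str = "en",
                   output_path: Optional[str] = None) -> str:
        if not text.strip():
            raise ValueError("Cannot synthesize empty text")
        lang = LANGUAGE_MAP.get(language.lower(), language)
        if output_path is None:
            # Auto-generate a temp file path when the caller does not supply one.
            tmp = tempfile.NamedTemporaryFile(suffix=".mp3", delete=False)
            tmp.close()
            output_path = tmp.name
        gTTS(text=text, lang=lang).save(output_path)
        return output_path


_tts_engine: Optional[TTSEngine] = None


def get_tts_engine() -> TTSEngine:
    """Singleton accessor, matching the ASR module."""
    global _tts_engine
    if _tts_engine is None:
        _tts_engine = TTSEngine()
    return _tts_engine
```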

---

## 📋 PROMPT 3: Voice API Endpoints

**Estimated Time:** 3 hours  
**Dependencies:** FastAPI (already installed)  
**Output:** `app/api/voice_endpoints.py`, `app/api/voice_schemas.py`

### Prompt

```
I'm implementing Phase 2 voice features for my ScamShield AI honeypot. I need to create voice API endpoints that integrate with the existing Phase 1 text honeypot.

CONTEXT:
- Phase 1 has /api/v1/honeypot/engage (text endpoint) - DO NOT MODIFY
- Phase 2 needs /api/v1/voice/engage (voice endpoint) - NEW
- Voice endpoint: Audio in → ASR → Phase 1 pipeline → TTS → Audio out
- Must reuse existing Phase 1 logic (detector, honeypot, extractor)

REQUIREMENTS:

1. Create file: app/api/voice_schemas.py

Implement Pydantic schemas:
- TranscriptionMetadata (text, language, confidence)
- VoiceFraudMetadata (is_synthetic, confidence, risk_level) - Optional
- VoiceEngageResponse (session_id, scam_detected, scam_confidence, scam_type, turn_count, ai_reply_text, ai_reply_audio_url, transcription, voice_fraud, extracted_intelligence, processing_time_ms)

2. Create file: app/api/voice_endpoints.py

Implement endpoints:

A. POST /api/v1/voice/engage
   - Accept: multipart/form-data (audio_file, session_id, language)
   - Flow:
     1. Save uploaded audio temporarily
     2. Transcribe with ASR (app.voice.asr.get_asr_engine())
     3. Process through Phase 1 (REUSE existing code):
        - app.models.detector.get_detector().detect()
        - app.agent.honeypot.HoneypotAgent().engage()
        - app.models.extractor.extract_intelligence()
     4. Convert reply to speech with TTS (app.voice.tts.get_tts_engine())
     5. Return VoiceEngageResponse with audio URL
   - Auth: x-api-key header (use existing verify_api_key)
   - Error handling: HTTPException with clear messages

B. GET /api/v1/voice/audio/{filename}
   - Serve generated audio files from temp directory
   - Return FileResponse with audio/mpeg media type
   - 404 if file not found

C. GET /api/v1/voice/health
   - Check ASR and TTS engine status
   - Return health info (model, device, engine type)

3. Router setup:
   - APIRouter with prefix="/api/v1/voice", tags=["voice"]
   - Export router for inclusion in main app

4. Code quality:
   - Type hints for all functions
   - Docstrings (Google style)
   - Logging using app.utils.logger.get_logger(__name__)
   - Error handling with try/except
   - Clean up temp files after processing

CRITICAL: DO NOT MODIFY PHASE 1 CODE
- Import and reuse: app.models.detector, app.agent.honeypot, app.models.extractor
- Import and reuse: app.database.redis_client (session state)
- Import and reuse: app.api.auth.verify_api_key

REFERENCE IMPLEMENTATION (from PHASE_2_VOICE_IMPLEMENTATION_PLAN.md):
[See Step 3.1 and 3.2 in the plan]

ACCEPTANCE CRITERIA:
- [ ] voice_schemas.py created with all schemas
- [ ] voice_endpoints.py created with all endpoints
- [ ] POST /voice/engage works end-to-end
- [ ] Audio upload handling works
- [ ] ASR integration works
- [ ] Phase 1 integration works (no modifications to Phase 1)
- [ ] TTS integration works
- [ ] GET /voice/audio/{filename} serves files
- [ ] GET /voice/health returns status
- [ ] Error handling present
- [ ] Type hints and docstrings complete
- [ ] Logging added
- [ ] Auth (x-api-key) works

Please generate both files (voice_schemas.py and voice_endpoints.py) with production-ready code.
```
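
The essential chain inside the generated `voice_endpoints.py` is upload → ASR → Phase 1 → TTS. The condensed sketch below shows only that chain; the Phase 1 calls are left as comments because their exact signatures live in your existing code and must be reused verbatim, and the response omits several `VoiceEngageResponse` fields for brevity.

```python
# Condensed sketch of the /engage route. Phase 1 call signatures are assumptions
# and are therefore left as comments; reuse the real functions unchanged.
import os
import shutil
import tempfile

from fastapi import APIRouter, Depends, File, Form, UploadFile

from app.api.auth import verify_api_key
from app.voice.asr import get_asr_engine
from app.voice.tts import get_tts_engine

router = APIRouter(prefix="/api/v1/voice", tags=["voice"])


@router.post("/engage")
async def voice_engage(
    audio_file: UploadFile = File(...),
    session_id: str = Form(...),
    language: str = Form("en"),
    _key=Depends(verify_api_key),
):
    # 1. Persist the upload so Whisper can read it from disk.
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as tmp:
        shutil.copyfileobj(audio_file.file, tmp)
        audio_path = tmp.name

    # 2. Audio -> text.
    transcription = get_asr_engine().transcribe(audio_path, language=language)

    # 3. Text -> Phase 1 pipeline (detector, honeypot, extractor) -- reuse as-is:
    #    detection = get_detector().detect(transcription["text"])
    #    reply_text = HoneypotAgent().engage(session_id, transcription["text"])
    reply_text = "placeholder reply produced by the Phase 1 honeypot"

    # 4. Text -> audio; the file is later served by GET /voice/audio/{filename}.
    audio_out = get_tts_engine().synthesize(reply_text, language=language)
    return {
        "session_id": session_id,
        "transcription": transcription,
        "ai_reply_text": reply_text,
        "ai_reply_audio_url": f"/api/v1/voice/audio/{os.path.basename(audio_out)}",
    }
```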

---

## 📋 PROMPT 4: Voice UI (HTML + JavaScript + CSS)

**Estimated Time:** 4 hours  
**Dependencies:** None (vanilla JS)  
**Output:** `ui/voice.html`, `ui/voice.js`, `ui/voice.css`

### Prompt

```
I'm implementing Phase 2 voice features for my ScamShield AI honeypot. I need to create a voice UI that allows users to record audio, send it to the API, and hear AI voice replies.

CONTEXT:
- Phase 1 has ui/index.html (text chat) - DO NOT MODIFY
- Phase 2 needs ui/voice.html (voice chat) - NEW, SEPARATE
- Voice UI: Record → Send to /api/v1/voice/engage → Display transcription + Play AI audio

REQUIREMENTS:

1. Create file: ui/voice.html

Features:
- Header: "🎀 ScamShield AI - Voice Honeypot (Phase 2)"
- Recording controls:
  - Status indicator (Ready/Recording/Processing)
  - "Start Recording" button
  - "Stop Recording" button
  - "Upload Audio File" button
  - Session ID display (read-only)
- Conversation area:
  - Display user messages (transcription)
  - Display AI messages (text + audio player)
  - System messages (status updates)
- Metadata section:
  - Transcription (text, language, confidence)
  - Detection (scam_detected, confidence, type)
  - Voice fraud (optional, if enabled)
- Intelligence section:
  - Display extracted UPI, bank accounts, phone numbers, URLs

2. Create file: ui/voice.js

Features:
- startRecording(): Use MediaRecorder API to capture audio
- stopRecording(): Stop recording and send to API
- uploadAudio(): Allow file upload
- sendAudioToAPI(): POST to /api/v1/voice/engage with FormData
- handleAPIResponse(): Update UI with response
- addMessage(): Add user/ai/system messages
- updateMetadata(): Update transcription, detection, fraud info
- updateIntelligence(): Display extracted intelligence
- Audio playback: <audio controls> for AI replies

3. Create file: ui/voice.css

Features:
- Dark theme (consistent with Phase 1)
- Recording status indicator (colors: ready=white, recording=red, processing=yellow)
- Button styles (primary, secondary, tertiary)
- Message bubbles (user=right, ai=left, system=center)
- Metadata cards with labels and values
- Responsive design

4. Code quality:
- Vanilla JavaScript (no frameworks)
- Clean, readable code
- Error handling (microphone access, API errors)
- Console logging for debugging

API INTEGRATION:
- Endpoint: POST /api/v1/voice/engage
- Headers: x-api-key: "dev-key-12345"
- FormData: audio_file (blob), session_id (string), language (string)
- Response: VoiceEngageResponse (see voice_schemas.py)

REFERENCE IMPLEMENTATION (from PHASE_2_VOICE_IMPLEMENTATION_PLAN.md):
[See Step 4.1, 4.2, 4.3 in the plan]

ACCEPTANCE CRITERIA:
- [ ] voice.html created with all sections
- [ ] voice.js created with all functions
- [ ] voice.css created with all styles
- [ ] Recording works (MediaRecorder API)
- [ ] File upload works
- [ ] API integration works
- [ ] Transcription displays correctly
- [ ] AI audio plays correctly
- [ ] Metadata updates correctly
- [ ] Intelligence displays correctly
- [ ] Error handling present
- [ ] UI looks professional (dark theme)
- [ ] Responsive design works

Please generate all three files (voice.html, voice.js, voice.css) with production-ready code.
```
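
Before writing `voice.js`, it can help to confirm the request/response contract the UI will depend on. The snippet below makes the same multipart call that `sendAudioToAPI()` will later make from the browser; it assumes the server is running locally on port 8000 and uses the sample fixture that PROMPT 6 asks you to create.

```python
# Smoke-test the contract the UI depends on, from Python (requests library).
import requests

API_URL = "http://localhost:8000/api/v1/voice/engage"

with open("tests/fixtures/audio/sample_scam_en.wav", "rb") as f:
    resp = requests.post(
        API_URL,
        headers={"x-api-key": "dev-key-12345"},
        files={"audio_file": ("sample_scam_en.wav", f, "audio/wav")},
        data={"session_id": "ui-contract-check", "language": "en"},
    )

resp.raise_for_status()
body = resp.json()
print("Transcribed:", body["transcription"]["text"])
print("AI reply:   ", body["ai_reply_text"])
print("Audio URL:  ", body["ai_reply_audio_url"])  # voice.js feeds this to <audio>
```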

---

## 📋 PROMPT 5: Integration & Configuration

**Estimated Time:** 3 hours  
**Dependencies:** None  
**Output:** Updated `app/main.py`, `app/config.py`, `.env.example`

### Prompt

````
I'm implementing Phase 2 voice features for my ScamShield AI honeypot. I need to integrate the voice module into the main app without breaking Phase 1.

CONTEXT:
- Phase 1 is working perfectly - MUST NOT BREAK
- Phase 2 voice module is ready (ASR, TTS, endpoints, UI)
- Need to conditionally load Phase 2 only if PHASE_2_ENABLED=true
- If Phase 2 fails to load, Phase 1 should still work

REQUIREMENTS:

1. Update file: app/config.py

Add Phase 2 settings to Settings class:
- PHASE_2_ENABLED: bool = Field(default=False, description="Enable Phase 2 voice features")
- WHISPER_MODEL: str = Field(default="base", description="Whisper model size (tiny, base, small, medium, large)")
- TTS_ENGINE: str = Field(default="gtts", description="TTS engine (gtts, indic_tts)")
- VOICE_FRAUD_DETECTION: bool = Field(default=False, description="Enable voice fraud detection")
- AUDIO_SAMPLE_RATE: int = Field(default=16000, description="Audio sample rate in Hz")
- AUDIO_CHUNK_DURATION: int = Field(default=5, description="Audio chunk duration in seconds")

2. Update file: app/main.py

Add conditional Phase 2 router inclusion:
```python
# After existing router inclusions
if getattr(settings, "PHASE_2_ENABLED", False):
    try:
        from app.api.voice_endpoints import router as voice_router
        app.include_router(voice_router)
        logger.info("Phase 2 voice endpoints enabled")
    except ImportError as e:
        logger.warning(f"Phase 2 voice endpoints unavailable: {e}")
    except Exception as e:
        logger.error(f"Failed to load Phase 2: {e}")
```

3. Update file: .env.example

Add Phase 2 configuration section:
```bash
# ========================================
# PHASE 2: VOICE FEATURES (OPTIONAL)
# ========================================
# Enable Phase 2 voice features (default: false)
PHASE_2_ENABLED=false

# Whisper ASR Configuration
WHISPER_MODEL=base
# Options: tiny, base, small, medium, large
# Larger models = better accuracy but slower

# TTS Configuration
TTS_ENGINE=gtts
# Options: gtts (Google TTS - free)

# Voice Fraud Detection (Optional)
VOICE_FRAUD_DETECTION=false
# Set to true to enable synthetic voice detection

# Audio Settings
AUDIO_SAMPLE_RATE=16000
AUDIO_CHUNK_DURATION=5
```

4. Code quality:
- Minimal changes to existing code
- Graceful degradation (Phase 1 works if Phase 2 fails)
- Clear logging messages
- No breaking changes

CRITICAL REQUIREMENTS:
- DO NOT modify any Phase 1 code beyond adding the router
- Phase 2 must be opt-in (default: disabled)
- If Phase 2 fails to load, log warning but continue
- Phase 1 must work even if Phase 2 dependencies are missing

REFERENCE IMPLEMENTATION (from PHASE_2_VOICE_IMPLEMENTATION_PLAN.md):
[See Step 5.1 and 5.2 in the plan]

ACCEPTANCE CRITERIA:
- [ ] app/config.py updated with Phase 2 settings
- [ ] app/main.py updated with conditional router inclusion
- [ ] .env.example updated with Phase 2 config
- [ ] Phase 1 still works with PHASE_2_ENABLED=false
- [ ] Phase 2 loads with PHASE_2_ENABLED=true
- [ ] Graceful degradation if Phase 2 fails
- [ ] Logging messages clear
- [ ] No breaking changes to Phase 1

Please provide the exact changes needed for each file (show before/after or provide complete updated sections).
````
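
The one change this prompt does not show as code is `app/config.py`. Assuming the existing `Settings` class is a pydantic `BaseSettings`, the addition looks roughly like this (on pydantic v1, import `BaseSettings` from `pydantic` instead of `pydantic_settings`):

```python
# Sketch of the Phase 2 additions to the existing Settings class.
from pydantic import Field
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    # ... existing Phase 1 settings stay exactly as they are ...

    # Phase 2: voice features (opt-in, disabled by default)
    PHASE_2_ENABLED: bool = Field(default=False, description="Enable Phase 2 voice features")
    WHISPER_MODEL: str = Field(default="base", description="Whisper model size (tiny/base/small/medium/large)")
    TTS_ENGINE: str = Field(default="gtts", description="TTS engine (gtts, indic_tts)")
    VOICE_FRAUD_DETECTION: bool = Field(default=False, description="Enable voice fraud detection")
    AUDIO_SAMPLE_RATE: int = Field(default=16000, description="Audio sample rate in Hz")
    AUDIO_CHUNK_DURATION: int = Field(default=5, description="Audio chunk duration in seconds")
```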

---

## 📋 PROMPT 6: Testing & Validation

**Estimated Time:** 3 hours  
**Dependencies:** pytest (already installed)  
**Output:** `tests/unit/test_voice_asr.py`, `tests/unit/test_voice_tts.py`, `tests/integration/test_voice_api.py`

### Prompt

```
I'm implementing Phase 2 voice features for my ScamShield AI honeypot. I need to create comprehensive tests to ensure everything works correctly.

CONTEXT:
- Phase 2 is implemented (ASR, TTS, endpoints, UI, integration)
- Need unit tests for ASR and TTS modules
- Need integration tests for voice API endpoints
- Need to verify Phase 1 is not affected

REQUIREMENTS:

1. Create file: tests/unit/test_voice_asr.py

Test ASREngine:
- test_asr_engine_initialization() - Verify model loads
- test_asr_transcribe_english() - Test English transcription
- test_asr_transcribe_hindi() - Test Hindi transcription (if sample available)
- test_asr_confidence_calculation() - Test confidence scoring
- test_asr_error_handling() - Test with invalid audio
- test_asr_singleton() - Verify singleton pattern

2. Create file: tests/unit/test_voice_tts.py

Test TTSEngine:
- test_tts_engine_initialization() - Verify engine initializes
- test_tts_synthesize_english() - Test English synthesis
- test_tts_synthesize_hindi() - Test Hindi synthesis
- test_tts_language_mapping() - Test language code mapping
- test_tts_temp_file_generation() - Test auto file path
- test_tts_error_handling() - Test with invalid input
- test_tts_singleton() - Verify singleton pattern

3. Create file: tests/integration/test_voice_api.py

Test Voice API:
- test_voice_engage_endpoint() - Test full voice flow
  - Upload sample audio
  - Verify transcription in response
  - Verify AI reply text in response
  - Verify audio URL in response
  - Verify metadata (scam_detected, confidence, etc.)
- test_voice_audio_download() - Test audio file serving
- test_voice_health_endpoint() - Test health check
- test_voice_auth_required() - Test x-api-key authentication
- test_voice_invalid_audio() - Test error handling
- test_phase_1_unaffected() - Verify Phase 1 endpoints still work

4. Test fixtures:
- Create sample audio files (tests/fixtures/audio/):
  - sample_scam_en.wav (English scam message)
  - sample_scam_hi.wav (Hindi scam message, if available)
  - invalid_audio.txt (non-audio file for error testing)

5. Code quality:
- Use pytest fixtures
- Mock external dependencies where appropriate
- Clear test names and docstrings
- Assertions with descriptive messages
- Test both success and failure cases

CRITICAL: Test Phase 1 Isolation
- Run all existing Phase 1 tests
- Verify they still pass
- Verify Phase 1 endpoints work with PHASE_2_ENABLED=false

REFERENCE IMPLEMENTATION (from PHASE_2_VOICE_IMPLEMENTATION_PLAN.md):
[See Testing Plan section]

ACCEPTANCE CRITERIA:
- [ ] test_voice_asr.py created with all tests
- [ ] test_voice_tts.py created with all tests
- [ ] test_voice_api.py created with all tests
- [ ] All ASR tests pass
- [ ] All TTS tests pass
- [ ] All voice API tests pass
- [ ] Phase 1 tests still pass
- [ ] Test fixtures created
- [ ] Code coverage >80%
- [ ] Clear test documentation

Please generate all three test files with production-ready test code. Include instructions for creating sample audio fixtures.
```
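
Two representative tests in the style the prompt asks for: a TTS unit test and a Phase 1 isolation check. The health route in the second test is a placeholder assumption; point it at whatever Phase 1 route your app actually exposes. Note that the gTTS test needs network access.

```python
# Example tests (assumes FastAPI's TestClient; the gTTS test requires network access).
import os

from fastapi.testclient import TestClient

from app.main import app
from app.voice.tts import get_tts_engine

client = TestClient(app)


def test_tts_synthesize_english(tmp_path):
    """synthesize() should produce a non-empty .mp3 for English text."""
    out = get_tts_engine().synthesize(
        "Hello, this is a test.", language="en",
        output_path=str(tmp_path / "reply.mp3"),
    )
    assert os.path.getsize(out) > 0, "generated audio file is empty"


def test_phase_1_unaffected():
    """Phase 1 must keep working with the voice module present."""
    # Replace with the actual Phase 1 health/engage route used in your app.
    resp = client.get("/api/v1/health")
    assert resp.status_code == 200, "Phase 1 endpoint broken after Phase 2 changes"
```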

---

## 🎯 Implementation Workflow

### Step-by-Step Process

```
┌──────────────────────────────────────────────────────────────┐
│  PROMPT 1: ASR Module                                        │
│  ├─ Generate app/voice/asr.py                                │
│  ├─ Test: python -c "from app.voice.asr import get_asr_engine; print('OK')"
│  └─ ✓ Checkpoint: ASR module works                           │
└──────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────┐
│  PROMPT 2: TTS Module                                        │
│  ├─ Generate app/voice/tts.py                                │
│  ├─ Test: python -c "from app.voice.tts import get_tts_engine; print('OK')"
│  └─ ✓ Checkpoint: TTS module works                           │
└──────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────┐
│  PROMPT 3: Voice API                                         │
│  ├─ Generate app/api/voice_schemas.py                        │
│  ├─ Generate app/api/voice_endpoints.py                      │
│  ├─ Test: Check imports work                                 │
│  └─ ✓ Checkpoint: API code ready (not integrated yet)        │
└──────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────┐
│  PROMPT 4: Voice UI                                          │
│  ├─ Generate ui/voice.html                                   │
│  ├─ Generate ui/voice.js                                     │
│  ├─ Generate ui/voice.css                                    │
│  ├─ Test: Open voice.html in browser                         │
│  └─ ✓ Checkpoint: UI renders (API not connected yet)         │
└──────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────┐
│  PROMPT 5: Integration                                       │
│  ├─ Update app/config.py                                     │
│  ├─ Update app/main.py                                       │
│  ├─ Update .env.example                                      │
│  ├─ Set PHASE_2_ENABLED=true in .env                         │
│  ├─ Test: Start server, check logs                           │
│  └─ ✓ Checkpoint: Phase 2 integrated, server starts          │
└──────────────────────────────────────────────────────────────┘
                            │
                            ▼
┌──────────────────────────────────────────────────────────────┐
│  PROMPT 6: Testing                                           │
│  ├─ Generate tests/unit/test_voice_asr.py                    │
│  ├─ Generate tests/unit/test_voice_tts.py                    │
│  ├─ Generate tests/integration/test_voice_api.py             │
│  ├─ Run: pytest tests/unit/test_voice_*.py                   │
│  ├─ Run: pytest tests/integration/test_voice_api.py          │
│  ├─ Run: pytest tests/ (all tests, including Phase 1)        │
│  └─ ✓ Checkpoint: All tests pass                             │
└──────────────────────────────────────────────────────────────┘
                            │
                            ▼
                    ✅ PHASE 2 COMPLETE!
```

---

## 📊 Progress Tracking

### Checklist

Use this to track your progress:

- [ ] **PROMPT 1 COMPLETE** - ASR Module
  - [ ] app/voice/asr.py created
  - [ ] ASR module imports successfully
  - [ ] Basic transcription test works

- [ ] **PROMPT 2 COMPLETE** - TTS Module
  - [ ] app/voice/tts.py created
  - [ ] TTS module imports successfully
  - [ ] Basic synthesis test works

- [ ] **PROMPT 3 COMPLETE** - Voice API
  - [ ] app/api/voice_schemas.py created
  - [ ] app/api/voice_endpoints.py created
  - [ ] Imports work (no integration yet)

- [ ] **PROMPT 4 COMPLETE** - Voice UI
  - [ ] ui/voice.html created
  - [ ] ui/voice.js created
  - [ ] ui/voice.css created
  - [ ] UI renders in browser

- [ ] **PROMPT 5 COMPLETE** - Integration
  - [ ] app/config.py updated
  - [ ] app/main.py updated
  - [ ] .env.example updated
  - [ ] Server starts with Phase 2 enabled
  - [ ] Voice endpoints accessible

- [ ] **PROMPT 6 COMPLETE** - Testing
  - [ ] Unit tests created
  - [ ] Integration tests created
  - [ ] All tests pass
  - [ ] Phase 1 tests still pass

---

## 🚨 Important Notes

### Before Starting

1. **Backup your code:**
   ```bash
   git add .
   git commit -m "Backup before Phase 2 implementation"
   ```

2. **Install dependencies:**
   ```bash
   pip install -r requirements-phase2.txt
   ```

3. **Read the plan:**
   - Review `PHASE_2_VOICE_IMPLEMENTATION_PLAN.md`
   - Understand the architecture in `PHASE_2_ARCHITECTURE.md`

### During Implementation

1. **Test after each prompt:**
   - Don't move to the next prompt until the current one works
   - Run basic tests to verify functionality
   - Check logs for errors

2. **Track progress:**
   - Update `PHASE_2_CHECKLIST.md` as you complete tasks
   - Mark prompts complete in this file

3. **Ask for help:**
   - If a prompt doesn't work, ask the AI to debug
   - Provide error messages and logs
   - Reference the implementation plan

### After Completion

1. **Full testing:**
   ```bash
   # Test Phase 2
   pytest tests/unit/test_voice_*.py
   pytest tests/integration/test_voice_api.py
   
   # Test Phase 1 (verify no breaking changes)
   pytest tests/
   ```

2. **Manual testing:**
   - Open `http://localhost:8000/ui/voice.html`
   - Record a voice message
   - Verify AI responds with voice

3. **Documentation:**
   - Update main README.md with Phase 2 info
   - Document any issues or deviations from plan

---

## 🎓 Tips for Success

### Working with AI Assistants

1. **Provide context:**
   - Attach relevant files (config, existing code)
   - Mention you're following a specific plan
   - Reference the implementation plan sections

2. **Be specific:**
   - If code doesn't work, provide exact error messages
   - Ask for specific fixes, not rewrites
   - Request explanations for unclear parts

3. **Iterate:**
   - Review generated code before using it
   - Test incrementally
   - Ask for improvements if needed

### Common Issues

| Issue | Solution | Prompt to Use |
|-------|----------|---------------|
| Import errors | Check dependencies installed | "I'm getting ImportError: [error]. How do I fix this?" |
| Whisper slow | Use smaller model | "Change WHISPER_MODEL to 'tiny' in the code" |
| Audio not playing | Check file path | "Debug audio file serving in voice_endpoints.py" |
| Phase 1 broken | Revert changes | "Show me how to make Phase 2 truly optional" |

---

## 📞 Support

### If You Get Stuck

1. **Check the plan:**
   - `PHASE_2_VOICE_IMPLEMENTATION_PLAN.md` has detailed explanations
   - `PHASE_2_ARCHITECTURE.md` shows how components fit together

2. **Check logs:**
   ```bash
   tail -f logs/app.log
   ```

3. **Ask the AI:**
   - "I'm stuck on [step]. Here's my error: [error]. How do I fix it?"
   - Provide context from the implementation plan

### Getting Help from AI

**Good prompt:**
```
I'm implementing PROMPT 3 (Voice API) from PHASE_2_IMPLEMENTATION_PROMPTS.md.

I'm getting this error:
[paste error]

Here's my current code:
[paste relevant code]

How do I fix this? Reference the implementation plan if needed.
```

**Bad prompt:**
```
It doesn't work. Fix it.
```

---

## ✅ Success Criteria

Phase 2 is complete when:

- [ ] All 6 prompts executed successfully
- [ ] All generated code works
- [ ] Server starts with PHASE_2_ENABLED=true
- [ ] Voice UI accessible at /ui/voice.html
- [ ] Can record voice and get AI voice reply
- [ ] All tests pass (Phase 1 + Phase 2)
- [ ] No breaking changes to Phase 1
- [ ] Documentation updated

---

## 🎉 You're Ready!

**Next Steps:**

1. **Start with PROMPT 1** (ASR Module)
2. **Copy the prompt** to your AI assistant
3. **Review the generated code**
4. **Test it works**
5. **Move to PROMPT 2**

**Estimated Total Time:** 17-21 hours

**You've got this!** 🚀

---

*Created: 2026-02-10*

*For: ScamShield AI - Phase 2 Voice Implementation*

*Start with: PROMPT 1 (ASR Module)*