rodunia committed on
Commit 58e2ca7 · verified · 1 Parent(s): 0fec979

Update app.py

Files changed (1)
  1. app.py +1532 -7
app.py CHANGED
@@ -1,13 +1,1538 @@
-from gradio_test import Test
-import gradio as gr
-example = Test().example_inputs()
-with gr.Blocks() as demo:
-    Test(value=example, interactive=True)
-    Test(value=example, interactive=False)
-demo.launch()
+import os
+import gradio as gr
+import json
+from datetime import datetime
+from typing import List, Dict, Tuple
+from dotenv import load_dotenv
+import shutil
+import tempfile
+import google.generativeai as genai
+import traceback
+import numpy as np
+import scipy.io.wavfile as wavfile
+
+# Load environment variables
+load_dotenv()
+
+# Import OpenAI for Whisper transcription
+from openai import OpenAI
+
+# Initialize OpenAI client
+openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
+
+# Configure Gemini for analysis
+gemini_api_key = os.getenv("GEMINI_API_KEY")
+if gemini_api_key:
+    genai.configure(api_key=gemini_api_key)
+    # Try to use the best available Gemini model
+    try:
+        # List available models
+        available_models = genai.list_models()
+        print("📋 Available Gemini models:")
+        gemini_models = []
+        for model in available_models:
+            if 'generateContent' in model.supported_generation_methods:
+                print(f" - {model.name}")
+                gemini_models.append(model.name)
+
+        # Priority order: Try the best models first
+        model_priority = [
+            'models/gemini-1.5-pro-latest',  # Latest 1.5 Pro
+            'models/gemini-1.5-pro',  # Stable 1.5 Pro
+            'models/gemini-1.5-pro-002',  # Specific version
+            'models/gemini-1.5-flash',  # Faster but still good
+            'models/gemini-pro'  # Original Pro
+        ]
+
+        gemini_model = None
+        for model_name in model_priority:
+            if model_name in gemini_models:
+                try:
+                    gemini_model = genai.GenerativeModel(
+                        model_name.replace('models/', ''),
+                        generation_config={
+                            'temperature': 0.7,  # Balance creativity and consistency
+                            'top_p': 0.95,
+                            'top_k': 40,
+                            'max_output_tokens': 8192,  # Increased for detailed analysis
+                        }
+                    )
+                    print(f"✅ Using {model_name} - Best available model!")
+                    break
+                except Exception as e:
+                    print(f" Could not initialize {model_name}: {e}")
+
+        # Fallback if none of the preferred models work
+        if not gemini_model and gemini_models:
+            model_name = gemini_models[0].replace('models/', '')
+            gemini_model = genai.GenerativeModel(model_name)
+            print(f"✅ Using {model_name}")
+
+        if not gemini_model:
+            print("❌ No suitable Gemini models found!")
+
+    except Exception as e:
+        print(f"⚠️ Error listing Gemini models: {e}")
+        # Try direct initialization with best model
+        try:
+            gemini_model = genai.GenerativeModel(
+                'gemini-1.5-pro',
+                generation_config={
+                    'temperature': 0.7,
+                    'top_p': 0.95,
+                    'top_k': 40,
+                    'max_output_tokens': 8192,
+                }
+            )
+            print("✅ Gemini 1.5 Pro initialized (direct)")
+        except Exception:
+            try:
+                gemini_model = genai.GenerativeModel('gemini-pro')
+                print("✅ Gemini Pro initialized (fallback)")
+            except Exception:
+                print("❌ Could not initialize any Gemini model!")
+                gemini_model = None
+else:
+    print("⚠️ No Gemini API key found!")
+    gemini_model = None
+
+
+class InterviewCoPilot:
+    def __init__(self):
+        self.transcript_history = []
+        self.research_questions = []
+        self.interview_protocol = []
+        self.detected_codes = []
+        self.coverage_status = {
+            "rq_covered": [],
+            "protocol_covered": []
+        }
+        # Add file tracking
+        self.processed_files = []
+        self.current_file_info = {}
+        self.current_audio_path = None  # Store the current audio path
+
+        # Enhanced framework support - Initialize all attributes
+        self.theoretical_framework = ""
+        self.predefined_codes = {}  # {category: [codes]}
+        self.analysis_focus = []
+        self.is_continuation = False  # Initialize here
+        self.segment_number = 1  # Initialize here
+
+        # Session memory for Phase 1
+        self.session_segments = []  # List of processed segments
+        self.session_name = f"Interview_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
+        self.framework_loaded = False
+
+        # Create a persistent temp directory for this session
+        self.temp_dir = tempfile.mkdtemp(prefix="interview_copilot_")
+        print(f"📁 Created temp directory: {self.temp_dir}")
+
+        # Multi-view analysis support
+        self.segment_analyses = {}  # Store individual segment analyses
+
+    def __del__(self):
+        """Cleanup temp directory on exit"""
+        if hasattr(self, 'temp_dir') and os.path.exists(self.temp_dir):
+            try:
+                shutil.rmtree(self.temp_dir)
+                print(f"🧹 Cleaned up temp directory: {self.temp_dir}")
+            except Exception:
+                pass
+
+    def setup_research_context(self, research_questions: str, interview_protocol: str,
+                               theoretical_framework: str = "", predefined_codes: str = "",
+                               analysis_focus: str = ""):
+        """Setup the research context before starting interviews"""
+        if not research_questions.strip():
+            return "❌ Please provide at least research questions"
+
+        # Parse research questions
+        self.research_questions = [q.strip() for q in research_questions.split('\n') if q.strip()]
+
+        # Parse interview protocol
+        self.interview_protocol = [q.strip() for q in interview_protocol.split('\n') if q.strip()]
+
+        # Store theoretical framework
+        self.theoretical_framework = theoretical_framework.strip()
+
+        # Parse predefined codes (format: "Category: code1, code2, code3")
+        self.predefined_codes = {}
+        if predefined_codes.strip():
+            for line in predefined_codes.split('\n'):
+                if ':' in line:
+                    category, codes = line.split(':', 1)
+                    self.predefined_codes[category.strip()] = [
+                        code.strip() for code in codes.split(',') if code.strip()
+                    ]
+
+        # Parse analysis focus areas
+        self.analysis_focus = [f.strip() for f in analysis_focus.split('\n') if f.strip()]
+
+        # Initialize coverage tracking
+        self.coverage_status = {
+            "rq_covered": [False] * len(self.research_questions),
+            "protocol_covered": [False] * len(self.interview_protocol)
+        }
+
+        # Build status message
+        status_parts = [
+            f"✅ Setup complete!",
+            f"📋 Research Questions: {len(self.research_questions)}",
+            f"📝 Protocol Questions: {len(self.interview_protocol)}"
+        ]
+
+        if self.theoretical_framework:
+            status_parts.append(f"📚 Theoretical Framework: Yes")
+
+        if self.predefined_codes:
+            total_codes = sum(len(codes) for codes in self.predefined_codes.values())
+            status_parts.append(f"🏷️ Predefined Codes: {total_codes} codes in {len(self.predefined_codes)} categories")
+
+        if self.analysis_focus:
+            status_parts.append(f"🎯 Analysis Focus Areas: {len(self.analysis_focus)}")
+
+        # Mark framework as loaded
+        self.framework_loaded = True
+
+        return "\n".join(status_parts)
+
+    def add_segment_to_session(self, file_name, duration, transcript_length):
+        """Add a processed segment to the current session"""
+        segment_info = {
+            "number": len(self.session_segments) + 1,
+            "file_name": file_name,
+            "duration": duration,
+            "transcript_length": transcript_length,
+            "timestamp": datetime.now().strftime("%H:%M:%S"),
+            "codes_found": len(self.detected_codes)
+        }
+        self.session_segments.append(segment_info)
+        return segment_info
+
+    def get_session_summary(self):
+        """Get a summary of the current session"""
+        if not self.session_segments:
+            return "No segments processed yet"
+
+        total_duration = sum(seg.get("duration", 0) for seg in self.session_segments)
+        total_transcript = sum(seg.get("transcript_length", 0) for seg in self.session_segments)
+
+        summary = f"""### 📊 Current Session: {self.session_name}
+
+**Segments Processed:** {len(self.session_segments)}
+**Total Duration:** {total_duration:.1f} minutes
+**Total Transcript:** {total_transcript:,} characters
+**Unique Codes Found:** {len(set(self.detected_codes))}
+
+**Processed Files:**
+"""
+        for seg in self.session_segments:
+            summary += f"\n✓ Segment {seg['number']} - {seg['file_name']} ({seg['timestamp']})"
+
+        return summary
+
+    def reset_session(self, keep_framework=True):
+        """Reset the session but optionally keep the framework"""
+        self.session_segments = []
+        self.transcript_history = []
+        self.detected_codes = []
+        self.processed_files = []
+        self.segment_number = 1
+        self.is_continuation = False
+        self.segment_analyses = {}  # Reset segment analyses
+
+        if not keep_framework:
+            self.research_questions = []
+            self.interview_protocol = []
+            self.theoretical_framework = ""
+            self.predefined_codes = {}
+            self.analysis_focus = []
+            self.framework_loaded = False
+            self.coverage_status = {
+                "rq_covered": [],
+                "protocol_covered": []
+            }
+        else:
+            # Reset only coverage status
+            self.coverage_status = {
+                "rq_covered": [False] * len(self.research_questions),
+                "protocol_covered": [False] * len(self.interview_protocol)
+            }
+
+        self.session_name = f"Interview_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
+        return "✅ Session reset. " + ("Framework kept." if keep_framework else "Everything cleared.")
+
+    def save_uploaded_file(self, audio_path):
+        """Save uploaded file to our temp directory to ensure it persists"""
+        if not audio_path or not os.path.exists(audio_path):
+            return None
+
+        try:
+            # Copy file to our temp directory
+            file_name = os.path.basename(audio_path)
+            saved_path = os.path.join(self.temp_dir, file_name)
+
+            # If file already exists, add timestamp to make unique
+            if os.path.exists(saved_path):
+                name, ext = os.path.splitext(file_name)
+                timestamp = datetime.now().strftime("%H%M%S")
+                file_name = f"{name}_{timestamp}{ext}"
+                saved_path = os.path.join(self.temp_dir, file_name)
+
+            shutil.copy2(audio_path, saved_path)
+            print(f"💾 Saved file to: {saved_path}")
+            return saved_path
+
+        except Exception as e:
+            print(f"❌ Error saving file: {str(e)}")
+            return None
+
+    def check_audio_file(self, audio_path):
+        """Pre-check audio file before processing"""
+        if not audio_path:
+            return None, "No file selected", None
+
+        try:
+            # Save the file to our temp directory
+            saved_path = self.save_uploaded_file(audio_path)
+            if not saved_path:
+                return None, "❌ Error saving uploaded file", None
+
+            file_size = os.path.getsize(saved_path)
+            file_size_mb = file_size / (1024 * 1024)
+            file_name = os.path.basename(saved_path)
+
+            # Store file info
+            self.current_file_info = {
+                "name": file_name,
+                "size_mb": file_size_mb,
+                "path": saved_path,
+                "original_path": audio_path
+            }
+
+            # Debug info
+            print(f"📊 File check:")
+            print(f" - Original path: {audio_path}")
+            print(f" - Saved path: {saved_path}")
+            print(f" - Size: {file_size_mb:.2f} MB")
+            print(f" - Exists: {os.path.exists(saved_path)}")
+
+            # Check file size
+            if file_size_mb > 25:
+                status = f"""⚠️ **File too large for direct processing**
+- File: {file_name}
+- Size: {file_size_mb:.1f} MB
+- Maximum: 25 MB
+
+**Options:**
+1. Compress the file using the compression tool below
+2. Split into smaller segments
+3. Use a different recording with lower quality settings"""
+                return None, status, saved_path
+
+            # Good to go
+            status = f"""✅ **File ready for processing**
+- File: {file_name}
+- Size: {file_size_mb:.1f} MB
+- Status: Within limits
+- Saved to: {os.path.basename(self.temp_dir)}/"""
+
+            return saved_path, status, saved_path
+
+        except Exception as e:
+            print(f"❌ Error in check_audio_file: {traceback.format_exc()}")
+            return None, f"❌ Error checking file: {str(e)}", None
+
+    def compress_audio(self, audio_path, quality="medium"):
+        """Compress audio file with different quality settings"""
+        # Handle different input types
+        actual_path = None
+
+        # If it's a tuple (sample_rate, audio_data), save it first
+        if isinstance(audio_path, tuple) and len(audio_path) == 2:
+            sample_rate, audio_data = audio_path
+            # Save to temporary file
+            temp_path = os.path.join(self.temp_dir, f"temp_audio_{datetime.now().strftime('%H%M%S')}.wav")
+            wavfile.write(temp_path, sample_rate, audio_data)
+            actual_path = temp_path
+        elif isinstance(audio_path, str):
+            actual_path = audio_path
+        else:
+            return None, "No valid audio file to compress"
+
+        if not actual_path or not os.path.exists(actual_path):
+            return None, "No file to compress or file not found"
+
+        try:
+            import subprocess
+
+            # Quality presets
+            quality_settings = {
+                "high": {"bitrate": "128k", "sample_rate": "44100"},
+                "medium": {"bitrate": "64k", "sample_rate": "22050"},
+                "low": {"bitrate": "32k", "sample_rate": "16000"}
+            }
+
+            settings = quality_settings.get(quality, quality_settings["medium"])
+
+            # Create output filename in our temp directory
+            input_name = os.path.basename(actual_path)
+            name, ext = os.path.splitext(input_name)
+            output_path = os.path.join(self.temp_dir, f"{name}_compressed{ext}")
+
+            # Compress
+            cmd = [
+                'ffmpeg', '-i', actual_path,
+                '-b:a', settings["bitrate"],
+                '-ar', settings["sample_rate"],
+                '-ac', '1',  # Mono
+                '-y', output_path
+            ]
+
+            result = subprocess.run(cmd, capture_output=True, text=True)
+
+            if result.returncode == 0:
+                # Check new size
+                new_size = os.path.getsize(output_path) / (1024 * 1024)
+                old_size = os.path.getsize(actual_path) / (1024 * 1024)
+
+                # Update file info
+                self.current_file_info["path"] = output_path
+                self.current_file_info["size_mb"] = new_size
+
+                return output_path, f"""✅ **Compression successful!**
+- Original size: {old_size:.1f} MB
+- Compressed size: {new_size:.1f} MB
+- Reduction: {((old_size - new_size) / old_size * 100):.0f}%
+- Quality setting: {quality}
+- Saved to: {os.path.basename(output_path)}"""
+            else:
+                return None, f"❌ Compression failed: {result.stderr}"
+
+        except subprocess.SubprocessError as e:
+            return None, f"❌ FFmpeg error: {str(e)}\n\nMake sure ffmpeg is installed."
+        except Exception as e:
+            return None, f"❌ Error: {str(e)}"
+
+    def transcribe_audio(self, audio_path: str, progress_callback=None) -> str:
+        """Transcribe audio using Whisper API with progress updates"""
+        if not audio_path:
+            return "Error: No audio file provided"
+
+        if not os.path.exists(audio_path):
+            return f"Error: Audio file not found at path: {audio_path}"
+
+        if not openai_client.api_key:
+            return "Error: OpenAI API key not found (needed for transcription)"
+
+        try:
+            file_size = os.path.getsize(audio_path)
+            file_size_mb = file_size / (1024 * 1024)
+            print(f"📊 Transcribing file: {audio_path}")
+            print(f"📊 File size: {file_size_mb:.2f} MB ({file_size} bytes)")
+
+            # Check if it's actually over 25MB (OpenAI's limit)
+            if file_size_mb > 25:
+                return f"Error: Audio file too large. File size: {file_size_mb:.1f} MB (limit: 25 MB)"
+
+            # Update progress if callback provided
+            if progress_callback:
+                progress_callback(f"🎵 Transcribing {file_size_mb:.1f} MB file with OpenAI Whisper...")
+
+            with open(audio_path, "rb") as audio_file:
+                print("📊 Sending to OpenAI Whisper API...")
+                # New OpenAI v1.x syntax
+                transcript = openai_client.audio.transcriptions.create(
+                    model="whisper-1",
+                    file=audio_file,
+                    response_format="text"
+                )
+
+            # In the new API, the response is directly the text
+            text = transcript if isinstance(transcript, str) else str(transcript)
+
+            # Add file info to transcript
+            file_name = self.current_file_info.get("name", "unknown")
+            if file_name not in self.processed_files:
+                self.processed_files.append(file_name)
+
+            print(f"✅ Transcription successful! Length: {len(text)} characters")
+            return text
+
+        except Exception as e:
+            error_msg = str(e)
+            print(f"❌ OpenAI API error: {error_msg}")
+
+            # Check for specific error types
+            if "Invalid file format" in error_msg:
+                return "Error: Invalid audio file format. Supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm"
+            elif "too large" in error_msg.lower():
+                return "Error: Audio file too large. Please use files under 25MB."
+            elif "Incorrect API key" in error_msg or "Authentication" in error_msg:
+                return "Error: Invalid OpenAI API key. Please check your .env file."
+            elif "Rate limit" in error_msg:
+                return "Error: OpenAI rate limit reached. Please wait a moment and try again."
+            else:
+                return f"Error: {error_msg}"
+
+    def analyze_transcript_with_gemini(self, text: str) -> Dict:
+        """Analyze transcript using Gemini with advanced prompt"""
+        # Use the enhanced version by default
+        return self.analyze_transcript_with_gemini_enhanced(text, segment_num=self.segment_number)
+
+    def analyze_transcript_with_gemini_enhanced(self, text: str, segment_num: int = None) -> Dict:
+        """Enhanced analysis that tracks individual segments and can combine them"""
+
+        if not text or len(text.strip()) < 10:
+            return {"error": "Text too short to analyze"}
+
+        if not self.research_questions:
+            return {"error": "Please set up research questions first"}
+
+        if not gemini_model:
+            return {"error": "Gemini API not configured"}
+
+        # Determine if this is a specific segment or combined analysis
+        is_combined = segment_num is None
+        current_segment = segment_num if segment_num else self.segment_number
+
+        # Build context section
+        context_parts = []
+
+        if is_combined:
+            context_parts.append("This is a COMBINED ANALYSIS of all segments.")
+            context_parts.append(f"Total segments: {len(self.session_segments)}")
+        else:
+            context_parts.append(f"This is Segment {current_segment} of the interview.")
+            if current_segment > 1:
+                context_parts.append("Previous segments have covered:")
+                covered_rqs = [f"RQ{i + 1}" for i, covered in enumerate(self.coverage_status["rq_covered"]) if covered]
+                if covered_rqs:
+                    context_parts.append(f"- Research Questions: {', '.join(covered_rqs)}")
+
+        context_section = "\n".join(context_parts)
+
+        # Build framework section
+        framework_section = ""
+        if self.theoretical_framework:
+            framework_section += f"\nTHEORETICAL FRAMEWORK:\n{self.theoretical_framework}\n"
+
+        if self.predefined_codes:
+            framework_section += "\nPREDEFINED CODES:\n"
+            for category, codes in self.predefined_codes.items():
+                framework_section += f"- {category}: {', '.join(codes)}\n"
+
+        if self.analysis_focus:
+            framework_section += "\nANALYSIS FOCUS:\n"
+            framework_section += "\n".join([f"- {focus}" for focus in self.analysis_focus])
+
+        # Modified prompt for combined vs individual analysis
+        analysis_type = "COMBINED TRANSCRIPT" if is_combined else f"SEGMENT {current_segment}"
+
+        prompt = f"""You are a Qualitative Research Analysis Assistant.
+
+{context_section}
+
+{analysis_type}: "{text}"
+
+RESEARCH FRAMEWORK:
+- Research Questions:
+{chr(10).join([f"  RQ{i + 1}: {q}" for i, q in enumerate(self.research_questions)])}
+
+- Interview Protocol:
+{chr(10).join([f"  Q{i + 1}: {q}" for i, q in enumerate(self.interview_protocol)])}
+
+{framework_section}
+
+ANALYSIS TASKS:
+1. Apply predefined codes where relevant
+2. Identify emergent codes not in the framework
+3. Track research question coverage
+4. Note theoretical alignments or challenges
+5. Consider the analysis focus areas
+{"6. Identify patterns across segments" if is_combined else ""}
+{"7. Note evolution of themes" if is_combined else ""}
+
+PROVIDE YOUR ANALYSIS IN THIS EXACT JSON FORMAT:
+{{
+    "segment_number": {current_segment if not is_combined else '"combined"'},
+    "analysis_type": "{"combined" if is_combined else "individual"}",
+    "alerts": [
+        {{"type": "supports", "code": "Code Name", "text": "✅ Supports [Theory/Concept]: ..."}},
+        {{"type": "challenges", "text": "⚠️ Challenges [Framework]: ..."}},
+        {{"type": "missing", "text": "🔍 Missing [Dimension]: ..."}},
+        {{"type": "emergent", "code": "New Code", "text": "✳️ Emergent theme: ..."}},
+        {{"type": "noteworthy", "text": "📌 Noteworthy: ..."}}
+    ],
+    "rq_addressed": [1, 2],
+    "codes_applied": ["Code 1", "Code 2"],
+    "emergent_codes": ["New Theme 1"],
+    "coverage": {{
+        "protocol_covered": [1, 3, 5],
+        "completion_percent": 40,
+        "missing_topics": ["Topic A", "Topic B"]
+    }},
+    "follow_ups": [
+        "🧭 To explore [concept], ask: 'Question?'",
+        "🧭 RQ3 needs data on [topic]"
+    ],
+    "insights": [
+        "Key pattern or finding",
+        "Theoretical implication"
+    ],
+    "segment_summary": "Brief summary of {"all segments combined" if is_combined else "this segment's contribution"}"{', "cross_segment_patterns": ["Pattern 1", "Pattern 2"],' if is_combined else ""}{'"theme_evolution": "Description of how themes evolved across segments"' if is_combined else ""}
+}}
+
+Return ONLY the JSON."""
+
+        try:
+            print(f"🤖 Analyzing {analysis_type} with Gemini...")
+            response = gemini_model.generate_content(prompt)
+            content = response.text.strip()
+
+            # Parse JSON response
+            try:
+                start = content.find('{')
+                end = content.rfind('}') + 1
+                if start >= 0 and end > start:
+                    json_str = content[start:end]
+                    analysis = json.loads(json_str)
+                else:
+                    analysis = json.loads(content)
+
+            except json.JSONDecodeError:
+                print(f"JSON parsing error. Raw response: {content[:200]}...")
+                # Return a default structure
+                analysis = {
+                    "segment_number": current_segment if not is_combined else "combined",
+                    "analysis_type": "combined" if is_combined else "individual",
+                    "alerts": [],
+                    "rq_addressed": [],
+                    "codes_applied": [],
+                    "emergent_codes": [],
+                    "coverage": {
+                        "protocol_covered": [],
+                        "completion_percent": 0,
+                        "missing_topics": []
+                    },
+                    "follow_ups": ["Please try again"],
+                    "insights": ["Unable to parse response"],
+                    "segment_summary": "Analysis failed"
+                }
+
+            # Store individual segment analysis
+            if not is_combined:
+                self.segment_analyses[current_segment] = analysis
+
+            # Update coverage tracking
+            for rq_num in analysis.get("rq_addressed", []):
+                if isinstance(rq_num, int) and 0 < rq_num <= len(self.research_questions):
+                    self.coverage_status["rq_covered"][rq_num - 1] = True
+
+            for pq_num in analysis.get("coverage", {}).get("protocol_covered", []):
+                if isinstance(pq_num, int) and 0 < pq_num <= len(self.interview_protocol):
+                    self.coverage_status["protocol_covered"][pq_num - 1] = True
+
+            # Add codes to master list
+            self.detected_codes.extend(analysis.get("codes_applied", []))
+            self.detected_codes.extend(analysis.get("emergent_codes", []))
+
+            return analysis
+
+        except Exception as e:
+            print(f"❌ Gemini error: {type(e).__name__}: {str(e)}")
+            return {"error": f"Analysis error: {str(e)}"}
+
+    def format_analysis_output(self, analysis: Dict, show_segment_info: bool = True) -> str:
+        """Format analysis output with segment information"""
+
+        if "error" in analysis:
+            return f"❌ {analysis['error']}"
+
+        # Determine analysis type
+        is_combined = analysis.get("analysis_type") == "combined"
+        segment_num = analysis.get("segment_number", "Unknown")
+
+        # Format alerts section
+        alerts_text = ""
+        if "alerts" in analysis:
+            alerts_text = "### 📢 Analysis Alerts:\n"
+            for alert in analysis.get("alerts", []):
+                alerts_text += f"{alert.get('text', '')}\n"
+
+        # Format codes section
+        codes_section = ""
+        applied_codes = analysis.get("codes_applied", [])
+        emergent_codes = analysis.get("emergent_codes", [])
+
+        if applied_codes:
+            codes_section += f"**Applied Codes:** {', '.join(applied_codes)}\n"
+        if emergent_codes:
+            codes_section += f"**✳️ Emergent Codes:** {', '.join(emergent_codes)}\n"
+
+        # Build header based on type
+        if is_combined:
+            header = "### 📊 Combined Analysis Results (All Segments)"
+            segment_info = f"**Total Segments Analyzed:** {len(self.session_segments)}\n"
+        else:
+            header = f"### 📊 Analysis Results - Segment {segment_num}"
+            segment_info = f"**📝 Segment {segment_num} Summary:** {analysis.get('segment_summary', 'Analysis of this segment')}\n"
+
+        # Get file name for current segment
+        file_info = ""
+        if not is_combined and segment_num != "Unknown" and isinstance(segment_num, int):
+            if segment_num <= len(self.session_segments):
+                file_info = f"**File:** {self.session_segments[segment_num - 1].get('file_name', 'unknown')}\n"
+
+        # Build main analysis text
+        analysis_text = f"""{header}
+
+{segment_info if show_segment_info else ""}{file_info}**Research Questions Addressed:** {', '.join([f"RQ{n}" for n in analysis.get('rq_addressed', [])])}
+
+{alerts_text}
+
+**Codes/Themes:**
+{codes_section}
+
+**Protocol Coverage:** {', '.join([f"Q{n}" for n in analysis.get('coverage', {}).get('protocol_covered', [])])}
+**Completion:** {analysis.get('coverage', {}).get('completion_percent', 0)}% of protocol addressed
+
+**Key Insights:**
+{chr(10).join(['• ' + insight for insight in analysis.get('insights', [])])}"""
+
+        # Add combined-specific sections
+        if is_combined:
+            if "cross_segment_patterns" in analysis:
+                analysis_text += "\n\n**Cross-Segment Patterns:**\n"
+                analysis_text += chr(10).join(
+                    ['• ' + pattern for pattern in analysis.get('cross_segment_patterns', [])])
+
+            if "theme_evolution" in analysis:
+                analysis_text += f"\n\n**Theme Evolution:**\n{analysis.get('theme_evolution', '')}"
+
+        missing_topics = analysis.get('coverage', {}).get('missing_topics', [])
+        if missing_topics:
+            analysis_text += f"\n\n**Missing Topics:**\n{chr(10).join(['• ' + topic for topic in missing_topics])}"
+
+        return analysis_text
+
+    def generate_multi_view_analysis(self):
+        """Generate both individual segment analyses and combined analysis"""
+
+        if not hasattr(self, 'segment_analyses') or not self.segment_analyses:
+            return "No segments analyzed yet", "", ""
+
+        # Format individual segment analyses
+        individual_analyses = "## 📑 Individual Segment Analyses\n\n"
+
+        for seg_num in sorted(self.segment_analyses.keys()):
+            analysis = self.segment_analyses[seg_num]
+            formatted = self.format_analysis_output(analysis, show_segment_info=True)
+            individual_analyses += f"{formatted}\n\n{'=' * 50}\n\n"
+
+        # Generate combined analysis if multiple segments
+        combined_analysis = ""
+        if len(self.segment_analyses) > 1:
+            # Combine all transcripts
+            all_transcripts = "\n\n".join(self.transcript_history)
+
+            # Run combined analysis
+            combined_result = self.analyze_transcript_with_gemini_enhanced(all_transcripts, segment_num=None)
+            combined_analysis = "## 🔗 Combined Analysis (All Segments Together)\n\n"
+            combined_analysis += self.format_analysis_output(combined_result, show_segment_info=True)
+        else:
+            combined_analysis = "Combined analysis requires at least 2 segments"
+
+        # Generate comparison view
+        comparison_view = self.generate_comparison_view()
+
+        return individual_analyses, combined_analysis, comparison_view
+
+    def generate_comparison_view(self):
+        """Generate a comparison view of segments"""
+
+        if not hasattr(self, 'segment_analyses') or not self.segment_analyses:
+            return "No segments to compare"
+
+        comparison = "## 📊 Segment Comparison\n\n"
+
+        # Create comparison table
+        comparison += "| Segment | RQs Addressed | Codes Applied | Emergent Codes | Completion % |\n"
+        comparison += "|---------|---------------|---------------|----------------|-------------|\n"
+
+        for seg_num in sorted(self.segment_analyses.keys()):
+            analysis = self.segment_analyses[seg_num]
+            rqs = ', '.join([f"RQ{n}" for n in analysis.get('rq_addressed', [])])
+            applied = len(analysis.get('codes_applied', []))
+            emergent = len(analysis.get('emergent_codes', []))
+            completion = analysis.get('coverage', {}).get('completion_percent', 0)
+
+            comparison += f"| {seg_num} | {rqs} | {applied} | {emergent} | {completion}% |\n"
+
+        # Add theme tracking
+        comparison += "\n### 📈 Theme Frequency Across Segments\n\n"
+
+        # Track code frequency by segment
+        code_by_segment = {}
+        for seg_num, analysis in self.segment_analyses.items():
+            all_codes = analysis.get('codes_applied', []) + analysis.get('emergent_codes', [])
+            for code in all_codes:
+                if code not in code_by_segment:
+                    code_by_segment[code] = {}
+                code_by_segment[code][seg_num] = code_by_segment[code].get(seg_num, 0) + 1
+
+        # Display theme tracking
+        for code, segments in sorted(code_by_segment.items()):
+            seg_info = ', '.join([f"Seg{s}: {count}x" for s, count in sorted(segments.items())])
+            comparison += f"- **{code}**: {seg_info}\n"
+
+        return comparison
+
792
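The nested-dict tally in `generate_comparison_view` can also be expressed with `collections.Counter`; a sketch using illustrative sample data shaped like `self.segment_analyses` (not the app's real state):

```python
from collections import Counter

# Illustrative sample shaped like self.segment_analyses
segment_analyses = {
    1: {"codes_applied": ["Scaffolding"], "emergent_codes": ["Trust"]},
    2: {"codes_applied": ["Scaffolding", "Privacy Concerns"], "emergent_codes": []},
}

# code -> Counter mapping segment number to occurrence count
code_by_segment = {}
for seg_num, analysis in segment_analyses.items():
    for code in analysis.get("codes_applied", []) + analysis.get("emergent_codes", []):
        code_by_segment.setdefault(code, Counter())[seg_num] += 1

for code, segments in sorted(code_by_segment.items()):
    print(code, dict(segments))
```

`setdefault` plus `Counter` replaces the two-level "if code not in dict / .get(seg, 0) + 1" bookkeeping with a single line.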
+    def process_interview_segment(self, audio_path, progress_callback=None):
+        """Process an audio segment and return transcript and analysis"""
+        print(f"\n🎯 Starting process_interview_segment")
+        print(f"   Audio path provided: {audio_path}")
+        print(f"   Type of audio_path: {type(audio_path)}")
+
+        # Handle different types of audio input
+        actual_audio_path = None
+
+        # Case 1: audio_path is a tuple (sample_rate, audio_data) from recording
+        if isinstance(audio_path, tuple) and len(audio_path) == 2:
+            print("   Detected audio data tuple (recording)")
+            sample_rate, audio_data = audio_path
+            # Save the audio data to a temporary file
+            temp_path = os.path.join(self.temp_dir, f"recorded_{datetime.now().strftime('%H%M%S')}.wav")
+            wavfile.write(temp_path, sample_rate, audio_data)
+            actual_audio_path = temp_path
+            print(f"   Saved recording to: {temp_path}")
+
+        # Case 2: audio_path is a string (file path)
+        elif isinstance(audio_path, str):
+            actual_audio_path = audio_path
+
+        # Case 3: audio_path is None, check if we have a saved file
+        elif audio_path is None and self.current_file_info:
+            actual_audio_path = self.current_file_info.get("path")
+            print(f"   Using saved path: {actual_audio_path}")
+
+        # Validate we have a valid path
+        if not actual_audio_path or not os.path.exists(actual_audio_path):
+            return "", "❌ No audio file found. Please upload a file or record audio first.", "", "", "No file to process"
+
+        # Get file info
+        if isinstance(audio_path, tuple):
+            file_name = f"recorded_{datetime.now().strftime('%H%M%S')}.wav"
+            file_size = os.path.getsize(actual_audio_path) / (1024 * 1024)
+            # Update current file info for recording
+            self.current_file_info = {
+                "name": file_name,
+                "size_mb": file_size,
+                "path": actual_audio_path
+            }
+        else:
+            file_name = self.current_file_info.get("name", os.path.basename(actual_audio_path))
+            file_size = self.current_file_info.get("size_mb", os.path.getsize(actual_audio_path) / (1024 * 1024))
+
+        # Progress update
+        progress = f"""πŸ”„ Processing: {file_name} ({file_size:.1f} MB)
+
+πŸ“Š Current Step: Transcribing audio with Whisper...
+⏱️ Estimated time: {int(file_size * 0.5)}-{int(file_size * 1)} minutes for transcription
+
+πŸ’‘ Tip: Larger files take longer. A 10MB file typically takes 5-10 minutes."""
+
+        # Update progress callback if provided
+        if progress_callback:
+            progress_callback(progress)
+
+        # Transcribe with Whisper
+        print(f"🎡 Starting transcription of {file_size:.1f} MB file...")
+        start_time = datetime.now()
+        transcript = self.transcribe_audio(actual_audio_path, progress_callback)
+        transcription_time = (datetime.now() - start_time).total_seconds()
+        print(f"βœ… Transcription completed in {transcription_time:.1f} seconds")
+
+        if transcript.startswith("Error:"):
+            return transcript, "❌ Transcription failed", "", "", progress + "\n\n❌ Transcription failed"
+
+        # Add to history with file info
+        timestamp = datetime.now().strftime("%H:%M:%S")
+
+        # Safely check for continuation attributes
+        is_continuation = getattr(self, 'is_continuation', False)
+        segment_number = getattr(self, 'segment_number', 1)
+
+        segment_label = f"Segment {segment_number}" if is_continuation else "Segment 1"
+        self.transcript_history.append(f"[{timestamp}] [{file_name}] [{segment_label}] {transcript}")
+
+        # Check if research context is set up
+        if not self.research_questions:
+            full_transcript = "\n\n".join(self.transcript_history)
+            return full_transcript, "⚠️ Please set up research questions first", "", "", progress
+
+        # Update progress for analysis phase
+        progress = f"""βœ… Transcription complete! ({transcription_time:.1f} seconds)
+
+πŸ“Š Current Step: Analyzing with Gemini 1.5 Pro...
+πŸ” Analyzing {segment_label}
+⏱️ This usually takes 10-30 seconds..."""
+
+        if progress_callback:
+            progress_callback(progress)
+
+        # Analyze with Gemini
+        print(f"πŸ€– Starting Gemini analysis...")
+        analysis_start = datetime.now()
+        analysis = self.analyze_transcript_with_gemini(transcript)
+        analysis_time = (datetime.now() - analysis_start).total_seconds()
+        print(f"βœ… Analysis completed in {analysis_time:.1f} seconds")
+
+        # Format outputs
+        full_transcript = "\n\n".join(self.transcript_history)
+
+        if "error" not in analysis:
+            # Format analysis output
+            analysis_text = self.format_analysis_output(analysis)
+
+            follow_ups = "### πŸ’‘ Suggested Follow-ups:\n" + \
+                         '\n'.join(analysis.get('follow_ups', []))
+
+            rq_coverage = sum(self.coverage_status["rq_covered"]) / len(
+                self.research_questions) * 100 if self.research_questions else 0
+            protocol_coverage = sum(self.coverage_status["protocol_covered"]) / len(
+                self.interview_protocol) * 100 if self.interview_protocol else 0
+
+            # Track unique codes
+            all_codes = list(set(self.detected_codes))
+            applied_unique = list(set(analysis.get("codes_applied", [])))
+            emergent_unique = list(set(analysis.get("emergent_codes", [])))
+
+            coverage = f"""### πŸ“ˆ Overall Progress:
+- **Research Questions:** {rq_coverage:.0f}% ({sum(self.coverage_status["rq_covered"])}/{len(self.research_questions)})
+- **Protocol Questions:** {protocol_coverage:.0f}% ({sum(self.coverage_status["protocol_covered"])}/{len(self.interview_protocol)})
+- **Total Unique Codes:** {len(all_codes)}
+  - Framework Codes: {len(applied_unique)}
+  - Emergent Codes: {len(emergent_unique)}
+- **Segments Processed:** {len(self.processed_files)}"""
+
+            progress = f"βœ… Completed: {file_name} ({segment_label})"
+        else:
+            analysis_text = f"❌ {analysis['error']}"
+            follow_ups = "Unable to generate follow-ups"
+            coverage = "Unable to calculate coverage"
+            progress = f"❌ Failed: {file_name}"
+
+        return full_transcript, analysis_text, follow_ups, coverage, progress
+
+
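The three-way branch at the top of `process_interview_segment` mirrors what Gradio's `Audio` component can emit: a filepath string (`type="filepath"`), a `(sample_rate, samples)` tuple (`type="numpy"`), or `None` when nothing is loaded. A dependency-free sketch of the same dispatch (the helper name is illustrative, not part of the app):

```python
def classify_audio_input(audio, fallback_path=None):
    """Illustrative helper: tag the kind of value a Gradio Audio component handed us."""
    if isinstance(audio, tuple) and len(audio) == 2:
        return "recording"   # (sample_rate, samples) -> must be written to disk first
    if isinstance(audio, str):
        return "filepath"    # already a path on disk
    if audio is None and fallback_path:
        return "fallback"    # reuse the previously saved file
    return "missing"

print(classify_audio_input((16000, [0, 1, 0])))       # recording
print(classify_audio_input("/tmp/clip.wav"))          # filepath
print(classify_audio_input(None, "/tmp/last.wav"))    # fallback
print(classify_audio_input(None))                     # missing
```

Only the "recording" case needs the `wavfile.write` step; the other two already point at a file (or at nothing).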
+# Initialize
+copilot = InterviewCoPilot()
+
+# Create improved interface
+with gr.Blocks(title="Research Interview Co-Pilot", theme=gr.themes.Soft(), css="""
+    .file-info { background-color: #f0f0f0; padding: 10px; border-radius: 5px; margin: 10px 0; }
+    .success { color: #28a745; }
+    .warning { color: #ffc107; }
+    .error { color: #dc3545; }
+    h1 { text-align: center; }
+    .contain { max-width: 1200px; margin: auto; }
+""") as app:
+    gr.Markdown("""
+    # πŸŽ™οΈ Research Interview Co-Pilot - Enhanced with Multi-View Analysis
+
+    **Transcription:** OpenAI Whisper | **Analysis:** Google Gemini Pro
+
+    Now with individual segment analysis, combined analysis, and segment comparison!
+    """)
+
+    with gr.Tab("πŸ“‹ Setup"):
+        gr.Markdown("### Set up your research context")
+
+        with gr.Row():
+            with gr.Column():
+                rq_input = gr.Textbox(
+                    label="Research Questions (one per line) *",
+                    placeholder="What pedagogical strategies are evident in AI educators?\nHow do AI tools emphasize practical applications?\nWhat are the differences between various AI approaches?",
+                    lines=6
+                )
+
+                protocol_input = gr.Textbox(
+                    label="Interview Protocol Questions (one per line)",
+                    placeholder="Tell me about your experience with AI\nHow do you use AI tools?\nWhat challenges have you faced?",
+                    lines=6
+                )
+
+            with gr.Column():
+                framework_input = gr.Textbox(
+                    label="Theoretical Framework (optional)",
+                    placeholder="e.g., Technology Acceptance Model (TAM)\nGrounded Theory approach\nActivity Theory lens",
+                    lines=3
+                )
+
+                codes_input = gr.Textbox(
+                    label="Predefined Codes (optional - format: 'Category: code1, code2')",
+                    placeholder="Pedagogical: Scaffolding, Direct Instruction, Guided Practice\nPractical: Application, Implementation, Real-world Use\nEthical: Privacy Concerns, Bias Awareness, Transparency",
+                    lines=6
+                )
+
+                focus_input = gr.Textbox(
+                    label="Analysis Focus Areas (optional - one per line)",
+                    placeholder="Look for emotional responses\nPay attention to metaphors used\nNote any resistance or enthusiasm",
+                    lines=3
+                )
+
+        # Segment continuation option
+        with gr.Row():
+            continue_interview = gr.Checkbox(
+                label="This is a continuation of a previous interview segment",
+                value=False
+            )
+            segment_info = gr.Textbox(
+                label="Segment Info",
+                value="Segment 1",
+                interactive=False
+            )
+
+        setup_btn = gr.Button("Setup Research Context", variant="primary", size="lg")
+        setup_output = gr.Textbox(label="Setup Status", interactive=False, lines=6)
+
+        # Save/Load framework buttons
+        with gr.Row():
+            save_framework_btn = gr.Button("πŸ’Ύ Save Framework", size="sm")
+            load_framework_btn = gr.Button("πŸ“‚ Load Framework", size="sm")
+            framework_file = gr.File(label="Framework File", visible=False, file_types=[".json"])
+
+        def update_segment_info(is_continuation):
+            if is_continuation:
+                copilot.is_continuation = True
+                copilot.segment_number += 1
+                return f"Segment {copilot.segment_number} (Continuing from previous)"
+            else:
+                copilot.is_continuation = False
+                copilot.segment_number = 1
+                return "Segment 1"
+
+        def save_framework(rq, protocol, framework, codes, focus):
+            """Save current framework to JSON file"""
+            framework_data = {
+                "research_questions": rq,
+                "interview_protocol": protocol,
+                "theoretical_framework": framework,
+                "predefined_codes": codes,
+                "analysis_focus": focus,
+                "saved_date": datetime.now().isoformat()
+            }
+
+            filename = f"framework_{datetime.now().strftime('%Y%m%d_%H%M%S')}.json"
+            filepath = os.path.join(copilot.temp_dir, filename)
+
+            with open(filepath, 'w') as f:
+                json.dump(framework_data, f, indent=2)
+
+            return gr.update(visible=True, value=filepath)
+
+        def load_framework(file):
+            """Load framework from JSON file"""
+            if not file:
+                return "", "", "", "", "", "No file selected"
+
+            try:
+                with open(file.name, 'r') as f:
+                    data = json.load(f)
+
+                return (
+                    data.get("research_questions", ""),
+                    data.get("interview_protocol", ""),
+                    data.get("theoretical_framework", ""),
+                    data.get("predefined_codes", ""),
+                    data.get("analysis_focus", ""),
+                    f"βœ… Loaded framework from {os.path.basename(file.name)}"
+                )
+            except Exception as e:
+                return "", "", "", "", "", f"❌ Error loading file: {str(e)}"
+
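`save_framework` and `load_framework` are a plain JSON round-trip with `.get()` defaults; a self-contained sketch of the same pattern (the field names match the app; the file path here is a throwaway temp file):

```python
import json
import os
import tempfile
from datetime import datetime

framework_data = {
    "research_questions": "How do AI tools emphasize practical applications?",
    "predefined_codes": "Ethical: Privacy Concerns, Transparency",
    "saved_date": datetime.now().isoformat(),
}

path = os.path.join(tempfile.gettempdir(), "framework_demo.json")
with open(path, "w") as f:
    json.dump(framework_data, f, indent=2)

with open(path) as f:
    loaded = json.load(f)

# Keys absent from the file fall back to "", exactly as load_framework's .get() calls do
print(loaded.get("interview_protocol", ""))   # ""
print(loaded["predefined_codes"])
```

Because every value is a plain string, the round-trip is lossless; only `saved_date` would need parsing if it were ever read back as a `datetime`.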
+        continue_interview.change(
+            update_segment_info,
+            inputs=[continue_interview],
+            outputs=[segment_info]
+        )
+
+        setup_btn.click(
+            fn=copilot.setup_research_context,
+            inputs=[rq_input, protocol_input, framework_input, codes_input, focus_input],
+            outputs=setup_output
+        )
+
+        save_framework_btn.click(
+            save_framework,
+            inputs=[rq_input, protocol_input, framework_input, codes_input, focus_input],
+            outputs=[framework_file]
+        )
+
+        framework_file.change(
+            lambda x: gr.update(visible=False),
+            inputs=[framework_file],
+            outputs=[framework_file]
+        )
+
+        load_framework_btn.click(
+            lambda: gr.update(visible=True),
+            outputs=[framework_file]
+        ).then(
+            load_framework,
+            inputs=[framework_file],
+            outputs=[rq_input, protocol_input, framework_input, codes_input, focus_input, setup_output]
+        )
+
+    with gr.Tab("🎀 Interview Processing"):
+        gr.Markdown("### Process interview audio with multi-view analysis")
+
+        # Session info at the top
+        with gr.Row():
+            session_info = gr.Markdown(copilot.get_session_summary())
+
+        with gr.Row():
+            # Session control buttons
+            new_file_btn = gr.Button("πŸ“ New File, Keep Setup", variant="secondary")
+            reset_session_btn = gr.Button("πŸ”„ Reset Session", variant="secondary")
+            reset_all_btn = gr.Button("πŸ—‘οΈ Reset Everything", variant="stop")
+
+        with gr.Row():
+            with gr.Column(scale=1):
+                # File upload with preview
+                audio_input = gr.Audio(
+                    sources=["upload", "microphone"],
+                    type="filepath",
+                    label="πŸ“ Upload Audio File or 🎀 Record",
+                    interactive=True
+                )
+
+                file_status = gr.Markdown("*Upload a file to see its status*")
+
+                # Compression tool
+                with gr.Accordion("πŸ”§ Audio Compression Tool", open=False):
+                    gr.Markdown("Compress large audio files")
+
+                    quality_select = gr.Radio(
+                        choices=["high", "medium", "low"],
+                        value="medium",
+                        label="Compression Quality"
+                    )
+
+                    compress_btn = gr.Button("Compress Audio", variant="secondary")
+                    compress_output = gr.Markdown()
+                    compressed_audio = gr.Audio(
+                        label="Compressed Audio",
+                        visible=False
+                    )
+
+                process_btn = gr.Button("πŸ” Process & Analyze", variant="primary", size="lg")
+
+                # Add visual processing indicator
+                processing_status = gr.Markdown(
+                    value="",
+                    visible=True
+                )
+
+                # Add progress bar
+                with gr.Row():
+                    progress_bar = gr.Progress()
+                    progress_status = gr.Textbox(
+                        label="Progress",
+                        interactive=False,
+                        lines=4,
+                        value="Ready to process audio..."
+                    )
+
+                # Add multi-view analysis button AFTER progress status
+                generate_multiview_btn = gr.Button(
+                    "πŸ“Š Generate Multi-View Analysis",
+                    variant="secondary",
+                    size="lg",
+                    visible=True  # Always visible for now
+                )
+
+            with gr.Column(scale=2):
+                # Results area with enhanced tabs
+                with gr.Tabs():
+                    with gr.Tab("πŸ“ Transcript"):
+                        transcript_output = gr.Textbox(
+                            label="Full Transcript",
+                            lines=15,
+                            max_lines=25,
+                            interactive=False
+                        )
+
+                    with gr.Tab("πŸ” Current Segment"):
+                        current_analysis_output = gr.Markdown(
+                            value="*Process a segment to see analysis*"
+                        )
+
+                    with gr.Tab("πŸ“‘ All Segments"):
+                        all_segments_output = gr.Markdown(
+                            value="*Individual analyses will appear here*"
+                        )
+
+                    with gr.Tab("πŸ”— Combined Analysis"):
+                        combined_analysis_output = gr.Markdown(
+                            value="*Combined analysis will appear here after 2+ segments*"
+                        )
+
+                    with gr.Tab("πŸ“Š Comparison"):
+                        comparison_output = gr.Markdown(
+                            value="*Segment comparison will appear here*"
+                        )
+
+                    with gr.Tab("πŸ’‘ Follow-ups"):
+                        followup_output = gr.Markdown()
+
+                    with gr.Tab("πŸ“ˆ Coverage"):
+                        coverage_output = gr.Markdown()
+
+        # Hidden state to store file path
+        audio_state = gr.State()
+
+        # Session management functions
+        def new_file_keep_setup():
+            """Clear audio input but keep framework"""
+            copilot.is_continuation = True
+            copilot.segment_number = len(copilot.session_segments) + 1
+            return (
+                None,  # Clear audio input
+                "*Upload a new file to continue the interview*",
+                f"Ready for Segment {copilot.segment_number}",
+                copilot.get_session_summary()
+            )
+
+        def reset_session():
+            """Reset session but keep framework"""
+            result = copilot.reset_session(keep_framework=True)
+            return (
+                None,  # Clear audio
+                "*Session reset. Framework kept.*",
+                "Ready to process audio...",
+                copilot.get_session_summary(),
+                ""  # Clear transcript
+            )
+
+        def reset_everything():
+            """Reset everything including framework"""
+            result = copilot.reset_session(keep_framework=False)
+            return (
+                None,  # Clear audio
+                "*Everything reset. Please set up framework again.*",
+                "Ready to process audio...",
+                copilot.get_session_summary(),
+                "",  # Clear transcript
+                "❌ Framework cleared. Please go to Setup tab."
+            )
+
+        # File status update - store the path in state
+        audio_input.change(
+            fn=copilot.check_audio_file,
+            inputs=[audio_input],
+            outputs=[audio_input, file_status, audio_state]
+        )
+
+        # Compression - update state with compressed file
+        compress_btn.click(
+            fn=copilot.compress_audio,
+            inputs=[audio_state, quality_select],
+            outputs=[compressed_audio, compress_output]
+        ).then(
+            fn=lambda x, msg: (gr.update(visible=True), x) if x else (gr.update(visible=False), None),
+            inputs=[compressed_audio, compress_output],
+            outputs=[compressed_audio, audio_state]
+        )
+
+        # Modified process function to handle multi-view
+        def process_and_update_session_multiview(audio_path, progress=gr.Progress()):
+            """Process audio and update session info with multi-view support"""
+
+            # Create a progress callback function
+            def update_progress(message):
+                progress(0.5, desc=message)
+                return message
+
+            # Initialize progress
+            progress(0, desc="Starting audio processing...")
+
+            # First, process the current segment with progress callback
+            results = copilot.process_interview_segment(audio_path, progress_callback=update_progress)
+
+            # Update progress to complete
+            progress(1.0, desc="Processing complete!")
+
+            # Add to session if successful
+            if results[4].startswith("βœ…"):
+                file_name = copilot.current_file_info.get("name", "unknown")
+                duration = copilot.current_file_info.get("size_mb", 0) * 0.5  # Rough estimate
+                transcript_length = len(results[0])
+                copilot.add_segment_to_session(file_name, duration, transcript_length)
+
+            # Get current segment analysis
+            current_segment_analysis = results[1]
+
+            # Check if we should show multi-view button (only after 2+ segments for meaningful comparison)
+            show_multiview = len(copilot.session_segments) >= 2
+
+            # Return results plus updated session info
+            return (
+                results[0],  # transcript
+                current_segment_analysis,  # current segment analysis
+                results[2],  # follow-ups
+                results[3],  # coverage
+                results[4],  # progress
+                copilot.get_session_summary(),  # session info
+                gr.update(visible=show_multiview)  # multi-view button visibility
+            )
+
+        # Multi-view generation function
+        def generate_all_views():
+            """Generate all analysis views"""
+            individual, combined, comparison = copilot.generate_multi_view_analysis()
+            return individual, combined, comparison
+
+        # Connect the process button with loading state
+        process_btn.click(
+            fn=lambda: gr.update(
+                value="πŸ”„ **Processing in progress...** Please wait, this may take several minutes for large files."),
+            outputs=[processing_status]
+        ).then(
+            fn=process_and_update_session_multiview,
+            inputs=[audio_state],
+            outputs=[
+                transcript_output,
+                current_analysis_output,
+                followup_output,
+                coverage_output,
+                progress_status,
+                session_info,
+                generate_multiview_btn
+            ]
+        ).then(
+            fn=lambda: gr.update(value=""),
+            outputs=[processing_status]
+        )
+
+        # Connect the multi-view button
+        generate_multiview_btn.click(
+            fn=generate_all_views,
+            outputs=[
+                all_segments_output,
+                combined_analysis_output,
+                comparison_output
+            ]
+        )
+
+        # Session control buttons
+        new_file_btn.click(
+            fn=new_file_keep_setup,
+            outputs=[audio_input, file_status, progress_status, session_info]
+        )
+
+        reset_session_btn.click(
+            fn=reset_session,
+            outputs=[audio_input, file_status, progress_status, session_info, transcript_output]
+        )
+
+        reset_all_btn.click(
+            fn=reset_everything,
+            outputs=[audio_input, file_status, progress_status, session_info, transcript_output,
+                     current_analysis_output]
+        )
+
+    with gr.Tab("πŸ“Š Summary & Export"):
+        gr.Markdown("### Generate comprehensive summary with multi-view analysis")
+
+        def generate_enhanced_summary():
+            if not copilot.transcript_history:
+                return "No interview data yet.", "", ""
+
+            unique_codes = list(set(copilot.detected_codes))
+
+            # Generate different formats
+            markdown_summary = f"""# Interview Summary Report
+
+**Generated:** {datetime.now().strftime("%Y-%m-%d %H:%M")}
+**Analysis Engine:** Google Gemini Pro
+**Files Processed:** {', '.join(copilot.processed_files)}
+**Total Segments:** {len(copilot.session_segments)}
+
+## Research Question Coverage
+{chr(10).join([f"- {'βœ…' if covered else '❌'} {q}" for q, covered in zip(copilot.research_questions, copilot.coverage_status["rq_covered"])])}
+
+## Detected Codes/Themes ({len(unique_codes)} unique)
+{chr(10).join(['- ' + code for code in unique_codes])}
+
+## Segment-by-Segment Analysis
+{"Included in multi-view analysis - see Interview Processing tab" if copilot.segment_analyses else "No individual analyses yet"}
+
+## Full Transcript
+{chr(10).join(copilot.transcript_history)}"""
+
+            # CSV format for codes
+            csv_codes = "Code,Frequency\n"
+            code_freq = {}
+            for code in copilot.detected_codes:
+                code_freq[code] = code_freq.get(code, 0) + 1
+            for code, freq in sorted(code_freq.items(), key=lambda x: x[1], reverse=True):
+                csv_codes += f'"{code}",{freq}\n'
+
+            # JSON format with segment analyses
+            json_export = json.dumps({
+                "metadata": {
+                    "date": datetime.now().isoformat(),
+                    "files": copilot.processed_files,
+                    "total_segments": len(copilot.transcript_history),
+                    "analysis_engine": "Gemini Pro"
+                },
+                "research_questions": {
+                    "questions": copilot.research_questions,
+                    "coverage": copilot.coverage_status["rq_covered"]
+                },
+                "codes": unique_codes,
+                "transcripts": copilot.transcript_history,
+                "segment_analyses": ({str(k): v for k, v in copilot.segment_analyses.items()}
+                                     if hasattr(copilot, 'segment_analyses') else {})
+            }, indent=2)
+
+            return markdown_summary, csv_codes, json_export
+
+        with gr.Row():
+            summary_btn = gr.Button("Generate All Formats", variant="primary", size="lg")
+
+        with gr.Row():
+            with gr.Column():
+                summary_display = gr.Markdown(label="Summary Preview")
+
+            with gr.Column():
+                with gr.Accordion("πŸ“₯ Export Options", open=True):
+                    csv_export = gr.Textbox(
+                        label="CSV Export (Codes)",
+                        lines=10,
+                        interactive=True
+                    )
+
+                    json_export = gr.Textbox(
+                        label="JSON Export (Complete Data)",
+                        lines=10,
+                        interactive=True
+                    )
+
+        summary_btn.click(
+            fn=generate_enhanced_summary,
+            outputs=[summary_display, csv_export, json_export]
+        )
+
+    with gr.Tab("ℹ️ Help"):
+        gr.Markdown(f"""
+### System Information
+
+**Temp Directory:** {copilot.temp_dir}
+
+**Transcription Engine:** OpenAI Whisper
+- Requires: OPENAI_API_KEY in .env file
+- Max file size: 25 MB
+- Supported formats: MP3, WAV, M4A, OGG, WEBM, MP4, MPEG, MPGA
+
+**Analysis Engine:** Google Gemini Pro
+- Requires: GEMINI_API_KEY in .env file
+- Free tier: 60 requests per minute
+- No file size limits (only processes text)
+
+### Multi-View Analysis Features
+
+**Current Segment View:** Shows analysis of the just-processed segment
+**All Segments View:** Shows individual analyses for each segment
+**Combined Analysis:** Analyzes all segments together to find patterns
+**Comparison View:** Side-by-side comparison of all segments
+
+### File Handling Tips
+
+**To reduce file size:**
+1. Use the built-in compression tool
+2. Record at lower quality (16kHz, mono)
+3. Split long recordings into segments
+
+**Best practices:**
+- Process 3-5 minute segments for optimal results
+- Use clear file names for easy tracking
+- Check file size before processing
+
+### Troubleshooting
+
+**If recording doesn't work:**
+- Check browser permissions for microphone
+- Try a different browser (Chrome/Edge work best)
+- Use upload instead of recording
+
+**If processing fails:**
+- Check the console for detailed error messages
+- Verify your API keys are correct
+- Ensure the audio file format is supported
+
+### Required API Keys
+
+Add to your `.env` file:
+```
+OPENAI_API_KEY=sk-your-openai-key
+GEMINI_API_KEY=your-gemini-key
+```
+""")
+
+# Launch
+if __name__ == "__main__":
+    print("\n" + "=" * 50)
+    print("πŸš€ Starting Enhanced Research Interview Co-Pilot with Multi-View Analysis")
+    print("=" * 50)
+
+    # Check temp directory
+    print(f"πŸ“ Temp directory: {copilot.temp_dir}")
+    print(f"   - Free space: {shutil.disk_usage(tempfile.gettempdir()).free / (1024 ** 3):.1f} GB")
+
+    # Check dependencies
+    if shutil.which('ffmpeg'):
+        print("βœ… FFmpeg found - compression available")
+    else:
+        print("⚠️ FFmpeg not found - compression unavailable")

+    # Check API keys
+    if not os.getenv("OPENAI_API_KEY"):
+        print("❌ No OpenAI API key found (required for transcription)")
+    else:
+        print("βœ… OpenAI API key loaded (Whisper transcription)")
+        # Test OpenAI client initialization
+        try:
+            test_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
+            print("βœ… OpenAI client initialized successfully")
+        except Exception as e:
+            print(f"❌ Error initializing OpenAI client: {e}")

+    if not os.getenv("GEMINI_API_KEY"):
+        print("❌ No Gemini API key found (required for analysis)")
+    else:
+        print("βœ… Gemini API key loaded (analysis)")

+    if not os.getenv("OPENAI_API_KEY") or not os.getenv("GEMINI_API_KEY"):
+        print("\n⚠️ Please add missing API keys to your .env file")
+    else:
+        print("\nβœ… All systems ready!")

+    print("\nπŸ“Œ Launching application...")
+    app.queue().launch()