dyryu1208 committed
Commit 920dfd0 · 1 Parent(s): 24f0286
Files changed (10)
  1. .DS_Store +0 -0
  2. README.md +74 -4
  3. analyze_claude.py +40 -0
  4. app.py +439 -0
  5. google_search.py +82 -0
  6. prompts.py +201 -0
  7. realtime_video_analysis.py +146 -0
  8. requirements.txt +12 -0
  9. run_backend.py +184 -0
  10. transcribe_texts +0 -0
.DS_Store ADDED
Binary file (6.15 kB)
 
README.md CHANGED
@@ -1,14 +1,84 @@
  ---
  title: Real Time AI Video Summarization Service
- emoji: 📈
- colorFrom: indigo
- colorTo: green
+ emoji: 📺
+ colorFrom: purple
+ colorTo: indigo
  sdk: gradio
  sdk_version: 5.33.0
  app_file: app.py
  pinned: false
  license: mit
  short_description: Multi-agent performs STT and summarizes real-time video
+ tags:
+ - agent-demo-track
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Real-time AI Video Summarization Service: Multi-Agent Workflow Implementation
+ 
+ ## 💡 Service Overview
+ 
+ This application is a real-time analysis and summarization service for video content, powered by an AI agent workflow. Multiple specialized AI agents work together, each performing its distinct role to deliver comprehensive analytical results.
+ 
+ ## 🤖 AI Agent Workflow
+ 
+ The application comprises three specialized AI agents working in collaboration:
+ 
+ 1. **Speech Recognition Agent**: Built on Amazon Transcribe, this agent converts video speech to text and specializes in distinguishing between multiple speakers.
+ 
+ 2. **Summarization Agent**: Leveraging the Claude 3.5 Haiku model, this agent analyzes the transcribed text and extracts key content. It excels at understanding context and identifying crucial concepts.
+ 
+ 3. **Knowledge Retrieval Agent**: Powered by Google Gemini, this agent extracts key keywords from the transcribed text, performs a Google Search for each keyword, and summarizes the additional information it finds. This provides valuable context and background knowledge related to the video content.
+ 
+ These three agents operate asynchronously, processing data sequentially and sharing results under the coordination of a mediator (backend controller). They perform their tasks autonomously, without user intervention, and update in real time.
+ 
+ ## 🛠 Key Features
+ 
+ - **Autonomous Agent Collaboration**: Each agent works independently in its specialized domain and shares results
+ - **Real-time Speech Recognition**: The speech recognition agent converts video audio to text
+ - **Intelligent Content Summarization**: The summarization agent understands context and extracts essential content
+ - **Automatic Background Knowledge**: The knowledge retrieval agent provides relevant information from web searches
+ - **Multiple Speaker Identification**: Identification and distinction of various speakers in conversational content
+ - **Real-time Updates**: The entire agent workflow's results refresh at 10-second intervals
+ 
+ ## 📋 Supported Content
+ 
+ Currently, the agent analysis system supports the following three AWS-related videos:
+ 
+ 1. **Agents for Amazon Bedrock**: Technical lecture about Amazon Bedrock agents
+ 2. **Bundesliga Fan Experience**: Case study on how the Bundesliga uses AI to enhance fan experiences
+ 3. **Discover New AWS Services with AWS Heroes**: Introduction to new AWS services in 2024
+ 
+ ## 🚀 How to Use
+ 
+ 1. Wait until the thumbnail images for each video fully appear.
+ 2. Select the video title located just below the thumbnail image, then click the video play button. (You can select any video, but we recommend choosing "Data, AI & Soccer How Bundesliga is transforming the fan experience" due to language considerations.)
+ 3. When you press the Auto Update button at the bottom, the Real-Time Script, AI Summary Result, and Keyword Search Result will be updated every 10 seconds in real time according to the agent workflow.
+ * The Real-Time Script is the output of the Speech Recognition Agent, which converts video content to text using AWS Transcribe.
+ * The AI Summary Result is the output of the Summarization Agent.
+ * The Keyword Search Result is the output of the Knowledge Retrieval Agent.
+ 
+ 4. By pressing the Refresh button, you can immediately check the results accumulated up to that point.
+ 
+ ## 🔧 Technology Stack
+ 
+ - **User Interface**: Gradio 5.33.0
+ - **Agent Technologies**:
+   - Speech Recognition: Amazon Transcribe
+   - Content Summarization: AWS Bedrock (Claude 3.5 Haiku)
+   - Knowledge Retrieval: Google Gemini 2.0 Flash
+ 
+ ## 📌 Notes
+ 
+ - Initial results take approximately 30 seconds to appear after the agent workflow starts.
+ - Automatic updates occur at 10-second intervals.
+ - Each agent's analysis results are accumulated and stored as history.
+ 
+ ## 🔗 Related Links
+ 
+ - [AWS Bedrock](https://aws.amazon.com/bedrock/)
+ - [Amazon Transcribe](https://aws.amazon.com/transcribe/)
+ - [Google Gemini AI](https://ai.google.dev/)
+ 
+ ## 📜 License
+ 
+ This project is released under the MIT License.
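The sequential hand-off between the three agents, coordinated by the backend mediator, can be sketched as a minimal pipeline. The stub functions below merely stand in for Amazon Transcribe, Claude 3.5 Haiku, and Gemini; all names and return values here are illustrative, not code from this repository:

```python
# Minimal sketch of the mediator-coordinated three-agent pipeline.
# Each stub stands in for a real agent (Transcribe / Claude / Gemini).

def speech_recognition_agent(audio_chunk: str) -> str:
    # Real agent: stream audio to Amazon Transcribe, return speaker-labeled text.
    return f"transcript of {audio_chunk}"

def summarization_agent(transcript: str) -> str:
    # Real agent: send the accumulated transcript to Claude via Bedrock.
    return f"summary of ({transcript})"

def knowledge_retrieval_agent(transcript: str) -> str:
    # Real agent: extract keywords with Gemini and ground them via Google Search.
    return f"keywords for ({transcript})"

def mediator(audio_chunks):
    """Run the agents in sequence for each chunk, accumulating history."""
    history = []
    for chunk in audio_chunks:
        transcript = speech_recognition_agent(chunk)
        history.append({
            "transcript": transcript,
            "summary": summarization_agent(transcript),
            "search": knowledge_retrieval_agent(transcript),
        })
    return history

history = mediator(["chunk-0", "chunk-1"])
print(history[0]["summary"])  # summary of (transcript of chunk-0)
```

In the real app, the mediator role is played by `run_backend.py`, and the per-chunk results accumulate into the `analysis_results` and `search_results` histories that the UI polls.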
analyze_claude.py ADDED
@@ -0,0 +1,42 @@
+ import boto3
+ import json
+ from prompts import *
+ 
+ bedrock = boto3.client(service_name="bedrock-runtime", region_name="us-east-1")
+ CLAUDE_MODEL_ID = "us.anthropic.claude-3-5-haiku-20241022-v1:0"
+ 
+ def analyze_with_claude(stt_data, content_type="Agents for Amazon Bedrock"):
+     """
+     Use Claude to summarize the STT data.
+     Select the appropriate prompt according to content_type.
+     """
+ 
+     if content_type == "Agents for Amazon Bedrock":
+         prompt_template = BEDROCK_CLAUDE_PROMPT
+     elif content_type == "Bundesliga Fan Experience":
+         prompt_template = BUNDESLIGA_CLAUDE_PROMPT
+     elif content_type == "AWS_2024_recap":
+         prompt_template = AWS_CLAUDE_PROMPT
+     else:
+         raise ValueError(f"Unsupported content_type: {content_type}")
+ 
+     formatted_prompt = prompt_template.format(stt_data=stt_data)
+ 
+     body = json.dumps({
+         "anthropic_version": "bedrock-2023-05-31",
+         "max_tokens": 1000,
+         "messages": [
+             {
+                 "role": "user",
+                 "content": formatted_prompt
+             }
+         ]
+     })
+ 
+     response = bedrock.invoke_model(
+         modelId=CLAUDE_MODEL_ID,
+         body=body
+     )
+ 
+     response_body = json.loads(response.get('body').read())
+     return response_body['content'][0]['text']
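The request body and response parsing above follow the Anthropic Messages format used by Bedrock's `invoke_model`; the round-trip shape can be sanity-checked offline with a mocked response (the response payload below is an illustrative example of the API's shape, not real model output):

```python
import json

# Same request-body shape analyze_with_claude() sends to invoke_model.
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1000,
    "messages": [{"role": "user", "content": "Summarize: hello"}],
})
assert json.loads(body)["messages"][0]["role"] == "user"

# Mocked response bytes in the shape the Messages API returns via Bedrock.
mock_body = json.dumps({
    "content": [{"type": "text", "text": "A short greeting."}],
    "stop_reason": "end_turn",
}).encode("utf-8")

# Same parsing as the last two lines of analyze_with_claude().
response_body = json.loads(mock_body)
print(response_body["content"][0]["text"])  # A short greeting.
```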
app.py ADDED
@@ -0,0 +1,439 @@
+ import gradio as gr
+ from PIL import Image
+ import threading
+ import time
+ import os
+ import shutil
+ from pathlib import Path
+ from huggingface_hub import hf_hub_download
+ from run_backend import main as run_backend_main, analysis_results, search_results
+ 
+ # File prerequisites
+ def prepare_dataset_files():
+     # Create local directory
+     os.makedirs("data", exist_ok=True)
+ 
+     # Check current directory
+     print(f"Current working directory: {os.getcwd()}")
+ 
+     # Define list of required files
+     files_to_download = [
+         "aws.mp4", "aws.png", "aws.wav",
+         "aws_bundesliga.mp4", "aws_bundesliga.png", "aws_bundesliga.wav",
+         "summit_sungwoo.mp4", "summit_sungwoo.png", "summit_sungwoo.wav"
+     ]
+ 
+     repo_id = "cloudplayer/hackathon_data"
+     repo_type = "dataset"
+ 
+     try:
+         for file_name in files_to_download:
+             local_path = os.path.join("data", file_name)
+ 
+             # Skip if file already exists
+             if os.path.exists(local_path):
+                 print(f"File already exists, skipping download: {local_path}")
+                 continue
+ 
+             # Download file from Hub
+             downloaded_path = hf_hub_download(
+                 repo_id=repo_id,
+                 filename=file_name,
+                 repo_type=repo_type,
+                 local_dir="data",
+                 local_dir_use_symlinks=False  # Download actual file
+             )
+ 
+             print(f"Downloaded file: {downloaded_path}")
+ 
+         # Check downloaded files
+         print(f"Files in data directory: {os.listdir('data')}")
+         return True
+     except Exception as e:
+         print(f"Error downloading files: {e}")
+         import traceback
+         traceback.print_exc()
+         return False
+ 
+ prepare_dataset_files()
+ 
+ # Set the directory path based on the current file
+ BASE_DIR = "."
+ DATA_DIR = "./data"
+ TRANSCRIPT_FILE = "./transcribe_texts"
+ 
+ # Analysis status management
+ analysis_running = False
+ analysis_thread = None
+ last_update_time = 0
+ 
+ def read_transcript():
+     """Read the transcript file"""
+     try:
+         with open(TRANSCRIPT_FILE, "r", encoding="utf-8") as f:
+             return f.read()
+     except Exception as e:
+         return "Loading Script..."
+ 
+ def get_current_content():
+     """Get current content"""
+     global last_update_time
+ 
+     if not analysis_running:
+         return "", "", "", "", ""
+ 
+     try:
+         current_time = time.time()
+         if current_time - last_update_time < 1.0:  # If less than 1 second has passed, do not update
+             return None
+ 
+         last_update_time = current_time
+         transcript = read_transcript()
+         current_analysis = analysis_results[-1] if analysis_results else ""
+         current_search = search_results[-1] if search_results else ""
+ 
+         # Update previous results
+         if len(analysis_results) > 1:
+             prev_analysis_text = "\n\n".join([
+                 f"#### Summary #{i+1}\n{result}"
+                 for i, result in enumerate(analysis_results[:-1])
+             ])
+         else:
+             prev_analysis_text = "No previous analysis results."
+ 
+         if len(search_results) > 1:
+             prev_search_text = "\n\n".join([
+                 f"#### Search Result #{i+1}\n{result}"
+                 for i, result in enumerate(search_results[:-1])
+             ])
+         else:
+             prev_search_text = "No previous search results."
+ 
+         return transcript, current_analysis, current_search, prev_analysis_text, prev_search_text
+     except Exception as e:
+         print(f"Error occurred while updating content: {e}")
+         return None
+ 
+ def start_analysis(party, video_path):
+     """Start analysis"""
+     global analysis_running, analysis_thread, last_update_time
+ 
+     if not analysis_running:
+         analysis_running = True
+         last_update_time = time.time()
+         # Initialize the transcript file
+         try:
+             with open(TRANSCRIPT_FILE, "w", encoding="utf-8") as f:
+                 f.write("")
+         except Exception as e:
+             pass
+ 
+         # Start the analysis thread
+         analysis_thread = threading.Thread(target=run_backend_main, args=(party,))
+         analysis_thread.daemon = True
+         analysis_thread.start()
+ 
+     return gr.Markdown(f"# {party} Analysis"), gr.update(value=video_path)
+ 
+ def create_ui():
+     """Create the UI"""
+     with gr.Blocks(title="Real-Time AI Video Summarization Service", theme=gr.themes.Soft()) as demo:
+         # State variables
+         party = gr.State("")
+         container_visible = gr.State(True)
+         selection_visible = gr.State(True)
+         auto_update = gr.State(False)  # Auto-update state
+         update_trigger = gr.State(0)  # Update trigger
+ 
+         # Add timer component (10-second interval)
+         timer = gr.Timer(10.0, active=False)
+ 
+         # Add the user guide at the top
+         gr.Markdown("""
+ ## How to Use:
+ 
+ 1. Wait until the thumbnail images for each video fully appear.
+ 2. Select the video title located just below the thumbnail image, then click the video play button in "Sample Video".
+ (You can select any video, but we recommend choosing "Data, AI & Soccer How Bundesliga is transforming the fan experience" due to language considerations.)
+ 3. When you press the Auto Update button at the bottom, the Real-Time Script, AI Summary Result, and Keyword Search Result will be updated every 10 seconds in real time according to the agent workflow.
+ * The Real-Time Script is the output of the Speech Recognition Agent, which converts video content to text using AWS Transcribe.
+ * The AI Summary Result is the output of the Summarization Agent.
+ * The Keyword Search Result is the output of the Knowledge Retrieval Agent.
+ 
+ 4. By pressing the Refresh button, you can immediately check the results up to that point.
+ """)
+ 
+         with gr.Column(visible=lambda: container_visible.value) as aws_container:
+             gr.Markdown("### AWS Lecture - Select the video to perform AI summarization")
+ 
+             with gr.Row(equal_height=True):
+                 with gr.Column(scale=1, min_width=400):
+                     aws_image_2 = gr.Image(
+                         value=str(DATA_DIR + "/aws_bundesliga.png"),
+                         label="aws_bundesliga",
+                         show_label=True,
+                         height=300,  # Fixed image height
+                         width=400,  # Fixed image width
+                         elem_id="aws_bundesliga"
+                     )
+                     aws_button_2 = gr.Button(
+                         "Data, AI & Soccer How Bundesliga is transforming the fan experience",
+                         variant="primary",
+                         size="lg",
+                         elem_id="aws_button_2"
+                     )
+ 
+                 with gr.Column(scale=1, min_width=400):
+                     aws_image_1 = gr.Image(
+                         value=str(DATA_DIR + "/summit_sungwoo.png"),
+                         label="summit_sungwoo",
+                         show_label=True,
+                         height=300,  # Fixed image height
+                         width=400,  # Fixed image width
+                         elem_id="summit_sungwoo"
+                     )
+                     aws_button_1 = gr.Button(
+                         "The Future of AI is Here! Agents for Amazon Bedrock",
+                         variant="primary",
+                         size="lg",
+                         elem_id="aws_button_1"
+                     )
+ 
+                 with gr.Column(scale=1, min_width=400):
+                     aws_image_3 = gr.Image(
+                         value=str(DATA_DIR + "/aws.png"),
+                         label="aws",
+                         show_label=True,
+                         height=300,
+                         width=400,
+                         elem_id="aws"
+                     )
+                     aws_button_3 = gr.Button(
+                         "Discover the New AWS Services with AWS Heroes in 2024",
+                         variant="primary",
+                         size="lg",
+                         elem_id="aws_button_3"
+                     )
+ 
+         # Add CSS styles
+         gr.Markdown("""
+ <style>
+ #summit_sungwoo, #aws_bundesliga, #aws{
+     object-fit: contain !important;
+     background-color: #f8f9fa;
+     border-radius: 10px;
+     padding: 10px;
+     box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+ }
+ #aws_button_1, #aws_button_2, #aws_button_3 {
+     margin-top: 20px;
+     width: 100%;
+     height: 50px;
+     font-size: 1.2em;
+     font-weight: bold;
+     border-radius: 8px;
+     transition: all 0.3s ease;
+ }
+ #aws_button_1:hover, #aws_button_2:hover, #aws_button_3:hover {
+     transform: translateY(-2px);
+     box-shadow: 0 4px 8px rgba(0,0,0,0.2);
+ }
+ </style>
+ """)
+ 
+         # Analysis container (initially hidden)
+         with gr.Column(visible=lambda: selection_visible.value) as analysis_container:
+             title = gr.Markdown("# Video Analysis")
+ 
+             with gr.Row():
+                 # Left: video
+                 with gr.Column(scale=3):
+                     video = gr.Video(
+                         label="Sample Video",
+                         show_label=True,
+                         interactive=False,
+                         value=str(DATA_DIR + "/summit_sungwoo.mp4"),  # Default value
+                         elem_id="debate_video"
+                     )
+ 
+                 # Right: analysis result tabs
+                 with gr.Column(scale=2):
+                     with gr.Tabs() as tabs:
+                         with gr.TabItem("Real-Time Script"):
+                             transcript = gr.Textbox(
+                                 label="Real-Time Script",
+                                 show_label=True,
+                                 lines=20,
+                                 interactive=False,
+                                 value="Loading Script...",  # Initial value
+                                 elem_id="transcript_box"
+                             )
+ 
+                         with gr.TabItem("AI Summary Result"):
+                             analysis = gr.Markdown(
+                                 value="Loading Analysis Result...",  # Initial value
+                                 elem_id="analysis_result"
+                             )
+                             with gr.Accordion("View Previous Analysis Results", open=False):
+                                 prev_analysis = gr.Markdown(
+                                     value="No previous analysis results.",  # Initial value
+                                     elem_id="prev_analysis"
+                                 )
+ 
+                         with gr.TabItem("Keyword Search Result"):
+                             search = gr.Markdown(
+                                 value="Loading Search Result...",  # Initial value
+                                 elem_id="search_result"
+                             )
+                             with gr.Accordion("View Previous Keyword Search Results", open=False):
+                                 prev_search = gr.Markdown(
+                                     value="No previous search results.",  # Initial value
+                                     elem_id="prev_search"
+                                 )
+ 
+             # Show status
+             status = gr.Markdown("Analysis is in progress...")
+ 
+             # Update buttons
+             with gr.Row():
+                 update_button = gr.Button(
+                     "Refresh",
+                     variant="secondary",
+                     size="lg",
+                     elem_id="update_button"
+                 )
+                 auto_update_button = gr.Button(
+                     "Auto Update",
+                     variant="secondary",
+                     size="lg",
+                     elem_id="auto_update_button"
+                 )
+ 
+         # Add CSS styles for the analysis page
+         gr.Markdown("""
+ <style>
+ #debate_video {
+     border-radius: 10px;
+     box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+ }
+ #transcript_box {
+     font-family: 'Noto Sans KR', sans-serif;
+     line-height: 1.6;
+ }
+ #analysis_result, #search_result, #prev_analysis, #prev_search {
+     font-family: 'Noto Sans KR', sans-serif;
+     line-height: 1.8;
+     padding: 15px;
+     background-color: #f8f9fa;
+     border-radius: 8px;
+ }
+ #update_button, #auto_update_button {
+     margin: 10px;
+     transition: all 0.3s ease;
+ }
+ #update_button:hover, #auto_update_button:hover {
+     transform: translateY(-2px);
+     box-shadow: 0 4px 8px rgba(0,0,0,0.2);
+ }
+ </style>
+ """)
+ 
+         def on_aws_select(content_name, video_file):
+             """Handle AWS lecture selection"""
+             party.value = content_name
+             video_path = str(DATA_DIR + f"/{video_file}")
+             container_visible.value = False
+             selection_visible.value = True
+             return start_analysis(content_name, video_path)
+ 
+         def trigger_update():
+             """Increment the update trigger"""
+             update_trigger.value += 1
+             return update_trigger.value
+ 
+         def update_content(trigger):
+             """Update content"""
+             if not analysis_running:
+                 return (
+                     "Analysis has not started.",
+                     "No analysis results.",
+                     "No search results.",
+                     "No previous analysis results.",
+                     "No previous search results.",
+                     trigger
+                 )
+ 
+             result = get_current_content()
+             if result is None:
+                 return (
+                     transcript.value,
+                     analysis.value,
+                     search.value,
+                     prev_analysis.value,
+                     prev_search.value,
+                     trigger
+                 )
+             return (*result, trigger)
+ 
+         def toggle_auto_update():
+             """Toggle auto update"""
+             auto_update.value = not auto_update.value
+             if auto_update.value:
+                 # Start auto update - increment the trigger
+                 trigger_update()
+                 return "Auto Update has started. It will be updated every 10 seconds.", gr.Timer(active=True)
+             else:
+                 return "Auto Update has stopped.", gr.Timer(active=False)
+ 
+         aws_button_1.click(
+             fn=lambda: on_aws_select("Agents for Amazon Bedrock", "summit_sungwoo.mp4"),
+             outputs=[title, video]
+         )
+ 
+         aws_button_2.click(
+             fn=lambda: on_aws_select("Bundesliga Fan Experience", "aws_bundesliga.mp4"),
+             outputs=[title, video]
+         )
+ 
+         aws_button_3.click(
+             fn=lambda: on_aws_select("AWS_2024_recap", "aws.mp4"),
+             outputs=[title, video]
+         )
+ 
+         # Update button click event
+         update_button.click(
+             fn=trigger_update,
+             outputs=[update_trigger]
+         )
+ 
+         # Auto Update button click event
+         auto_update_button.click(
+             fn=toggle_auto_update,
+             outputs=[status, timer]
+         )
+ 
+         # Timer tick event
+         timer.tick(
+             fn=lambda: trigger_update() if auto_update.value else None,
+             outputs=[update_trigger]
+         )
+ 
+         # Update trigger change event
+         update_trigger.change(
+             fn=update_content,
+             inputs=[update_trigger],
+             outputs=[transcript, analysis, search, prev_analysis, prev_search, update_trigger]
+         )
+ 
+         # Initial load - set the update trigger
+         demo.load(
+             fn=lambda: (update_trigger.value + 1),
+             outputs=[update_trigger]
+         )
+ 
+     return demo
+ 
+ if __name__ == "__main__":
+     demo = create_ui()
+     demo.queue()  # Enable the queue
+     demo.launch(share=True)
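The "previous results" accordion text is built by the join inside `get_current_content`: all entries except the latest are numbered and concatenated. Extracted as a pure helper (a hypothetical refactor with the same logic, for illustration only), its behavior is easy to verify:

```python
# Hypothetical pure-function extraction of the accordion-history formatting
# used by get_current_content() in app.py (same join logic).

def format_previous_summaries(analysis_results):
    """Number and join every result except the most recent one."""
    if len(analysis_results) > 1:
        return "\n\n".join(
            f"#### Summary #{i+1}\n{result}"
            for i, result in enumerate(analysis_results[:-1])
        )
    return "No previous analysis results."

print(format_previous_summaries(["latest only"]))
# No previous analysis results.
print(format_previous_summaries(["first", "latest"]))
# #### Summary #1
# first
```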
google_search.py ADDED
@@ -0,0 +1,83 @@
+ import os
+ import json
+ from google import genai
+ from google.genai import types
+ from prompts import *
+ 
+ try:
+     from dotenv import load_dotenv
+     load_dotenv()
+ except ImportError:
+     pass
+ 
+ def format_search_results(results):
+     """Format search results"""
+     formatted_output = ""
+ 
+     for i in range(1, 4):
+         formatted_output += f"### {i}. {results[f'keyword{i}']}\n"
+         formatted_output += f"{results[f'summary{i}']}"
+         formatted_output += "\n"
+ 
+     return formatted_output
+ 
+ def grounding_with_google_search(stt_data, content_type="Agents for Amazon Bedrock"):
+     """
+     Extract keywords and perform a Google Search
+     """
+     # Create a client for Google GenAI
+     client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
+ 
+     # Define a sample schema for search results
+     sample_schema = {
+         "keyword1": "Core Keyword 1",
+         "summary1": "Summarize Search Result about Core Keyword 1",
+         "keyword2": "Core Keyword 2",
+         "summary2": "Summarize Search Result about Core Keyword 2",
+         "keyword3": "Core Keyword 3",
+         "summary3": "Summarize Search Result about Core Keyword 3"
+     }
+ 
+     # Select the appropriate system prompt based on content_type
+     if content_type == "Agents for Amazon Bedrock":
+         system_prompt = BEDROCK_SEARCH_PROMPT
+     elif content_type == "Bundesliga Fan Experience":
+         system_prompt = BUNDESLIGA_SEARCH_PROMPT
+     elif content_type == "AWS_2024_recap":
+         system_prompt = AWS_SEARCH_PROMPT
+     else:
+         raise ValueError(f"Unsupported content_type: {content_type}")
+ 
+     # Format the system prompt with the sample schema
+     system_prompt = system_prompt.format(sample_schema=sample_schema)
+ 
+     # Prepare the human message with the input script
+     human_message = f"""
+     ## Input Script
+     {stt_data}
+     """
+ 
+     # Generate content using the Google GenAI client
+     response = client.models.generate_content(
+         model="gemini-2.0-flash-001",
+         contents=human_message,
+         config=types.GenerateContentConfig(
+             system_instruction=system_prompt,
+             response_mime_type="application/json",
+             tools=[
+                 types.Tool(
+                     google_search=types.GoogleSearchRetrieval(
+                         dynamic_retrieval_config=types.DynamicRetrievalConfig(
+                             mode=types.DynamicRetrievalConfigMode.MODE_UNSPECIFIED,
+                             dynamic_threshold=0.0
+                         )
+                     )
+                 )
+             ]
+         )
+     )
+ 
+     # Parse the response text and format the search results
+     text = response.text
+     results = json.loads(text)
+     return format_search_results(results)
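Because `format_search_results` indexes `keyword1..keyword3` and `summary1..summary3` directly, the Gemini response must match `sample_schema` exactly or a `KeyError` follows. The formatting can be exercised offline with a mocked JSON payload (the keyword/summary values below are illustrative, not real search output):

```python
import json

def format_search_results(results):
    """Same formatting logic as in google_search.py."""
    formatted_output = ""
    for i in range(1, 4):
        formatted_output += f"### {i}. {results[f'keyword{i}']}\n"
        formatted_output += f"{results[f'summary{i}']}"
        formatted_output += "\n"
    return formatted_output

# Mocked model output matching the sample_schema keys.
mock_json = json.dumps({
    "keyword1": "Bedrock", "summary1": "Managed foundation-model service.",
    "keyword2": "Agents", "summary2": "Bedrock's task-orchestration feature.",
    "keyword3": "Transcribe", "summary3": "Speech-to-text service.",
})

out = format_search_results(json.loads(mock_json))
print(out.splitlines()[0])  # ### 1. Bedrock
```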
prompts.py ADDED
@@ -0,0 +1,201 @@
+ # AWS SUMMIT AI Summary
+ BEDROCK_CLAUDE_PROMPT = """
+ You are a real-time analyst for a technical lecture on AWS Bedrock and generative AI.
+ The presenter is explaining their company, the technologies they use, and how these technologies are implemented.
+ Analyze the lecture content below and summarize the key points concisely.
+ Read the IMPORTANT section carefully, and ensure all summaries are in English.
+ 
+ * IMPORTANT:
+ 1. The input data is a script converted from real-time speech, so typos may occur.
+ - Correct typos related to technical terms to the correct terms
+ - Correct misnamed AWS services or company names to their accurate forms
+ - Exclude content that is difficult to understand in context
+ 
+ 2. Focus points for summary:
+ - The main business and role of the presenter's company/organization
+ - Key technologies explained in the presentation (AWS Bedrock, generative AI, agents, etc.)
+ - Main steps of the technology implementation process
+ - Examples or use cases of technology application mentioned in the presentation
+ 
+ Here is the lecture content:
+ 
+ {stt_data}
+ 
+ 1. Describe the company the presenter is affiliated with.
+ 2. What technology is the presenter explaining?
+ 3. Describe the process or implementation method of the technology.
+ """
+ 
+ # AWS SUMMIT AI Summary Google Search Keywords
+ BEDROCK_SEARCH_PROMPT = """
+ You are a Focused Search and Analysis Assistant for AWS technical presentations.
+ 
+ Your tasks:
+ 1. Read the Input script which was extracted from real-time voice records of an AWS Bedrock technical presentation.
+ 
+ 2. Extract exactly 3 most significant elements from the script, focusing specifically on:
+ - The presenter's company/organization and its business
+ - AWS Bedrock and generative AI technologies mentioned
+ - Implementation methods and processes described
+ 
+ 3. For each extracted element:
+ - The data you enter is scripted data transcribed from real-time speech and may contain typos. Process typos that make sense in context.
+ - Correct any misnamed AWS services or company names to their accurate forms
+ - Exclude any words that do not make sense in context
+ - Search for relevant information that provides clear context about the element
+ - Provide comprehensive summaries in English. All summaries MUST be provided in English only.
+ - Each element should be one word and short.
+ 
+ 4. Priority should be given to:
+ - Company/organization name of the presenter and its core business
+ - Specific AWS Bedrock features and generative AI technologies mentioned
+ - Technical implementation steps or processes described
+ - Any examples or use cases mentioned in the presentation
+ 
+ Output Format:
+ {sample_schema}
+ 
+ * keyword1 should relate to the presenter's company or organization
+ * keyword2 should relate to the core technology discussed (AWS Bedrock/generative AI)
+ * keyword3 should relate to implementation methods or processes
+ * Summary 1, 2, 3 are the searches and answers for each keyword. Include a detailed description of at least 2-3 sentences that would help understand the context of the presentation.
+ """
+ 
+ BUNDESLIGA_CLAUDE_PROMPT = """
+ You are a real-time analyst for a podcast discussing how the Bundesliga uses data and AI to innovate fan experiences.
+ The podcast features a dialogue format with two speakers (Questioner 1, Responder 1) discussing how the Bundesliga is using data and AI.
+ Analyze the conversation below and summarize the main discussion points and Q&A.
+ Read the IMPORTANT section carefully, and ensure all summaries are in English.
+ 
+ * IMPORTANT:
+ 1. The input data is a script converted from real-time speech, so typos may occur.
+ - Correct typos related to football terms, technical terms, and Bundesliga-related terms
+ - Consider the context of the dialogue between the questioner and responder
+ - Exclude content that is difficult to understand in context
+ 
+ 2. Focus points for summary:
+ - The core of the current discussion topic
+ - The main points of the questions posed by the questioner
+ - The key answers and information provided by the responder
+ - Important examples of data/AI usage in the Bundesliga discussed in the conversation
+ 
+ 3. Conversation structure analysis:
+ - Clearly distinguish and identify question-answer pairs
+ - Identify the interests of the questioner and the expertise of the responder
+ - Consider the flow and logical development of the conversation
+ 
+ Here is the podcast conversation content:
+ {stt_data}
+ 
+ 1. What is the current topic of discussion in the podcast?
+ 2. What are the main questions from the questioner and the main answers from the responder?
+ """
+ 
+ BUNDESLIGA_SEARCH_PROMPT = """
+ You are a Focused Search and Analysis Assistant for sports podcast interviews.
+ 
+ Your tasks:
+ 1. Read the Input script which was extracted from real-time voice records of a podcast interview between an interviewer (questioner) and an interviewee (responder) discussing how the Bundesliga uses data and AI to innovate the fan experience. Note that the podcast content is in English.
+ 
+ 2. Extract exactly 3 most significant elements from the script, focusing specifically on:
+ - The main discussion topic being addressed in the conversation
+ - Key questions posed by the interviewer
+ - Important answers and insights provided by the responder
+ 
+ 3. For each extracted element:
+ - The data you enter is transcribed from an English podcast interview
+ - First understand the question-answer exchange structure correctly
+ - Process any sports terminology, team names, or technical terms that may contain typos
+ - Exclude unclear statements or tangential discussions
+ - Search for relevant information that provides context to the discussion topics
+ - Provide comprehensive summaries in English. All summaries MUST be provided in English only.
+ - Each element should be one word and short.
+ 
+ 4. Priority should be given to:
+ - Main topics of discussion in the interview
+ - Specific questions asked by the interviewer about data/AI in the Bundesliga
+ - Key insights, examples, or explanations provided by the responder
+ - Discussion points that reveal how the Bundesliga is using technology
+ 
+ 5. Language handling:
+ - Extract keywords in English and provide all summaries in English
+ - Translate any technical terms appropriately into English
+ - Ensure the English summaries are natural and fluent
+ 
+ Output Format:
+ {sample_schema}
+ 
+ * keyword1 should relate to the main discussion topic
+ * keyword2 should relate to a key question from the interviewer
+ * keyword3 should relate to an important answer/insight from the responder
+ * Summary 1, 2, 3 are the English searches and answers for each keyword. Include a detailed description of at least 2-3 sentences that helps understand the context of the podcast discussion.
+ """
+ 
+ AWS_CLAUDE_PROMPT = """
+ You are a real-time analyst for a YouTube video covering major cloud services introduced at the 2024 AWS re:Invent event.
+ The video features a host (Speaker 0) and AWS Heroes (Speakers 1, 2, 3).
+ Identify the ongoing topics in the conversation and summarize the statements made by each AWS Hero.
+ Read the IMPORTANT section carefully, and ensure all summaries are in English.
+ 
+ * IMPORTANT:
+ 1. The input data is a script converted from real-time speech, so typos may occur.
+ - Interpret typos that make sense in context with the correct meaning
+ - Exclude content that doesn't make sense
+ 
+ 2. Speaker information may not be accurate, so:
+ - Determine the actual speaker based on the context and flow of the conversation
+ - Check continuity with previous statements
+ - Use distinctive speech patterns of the host and heroes
+ 
+ 3. Focus points for summary:
+ - Clearly identify the changing topics in real time
+ - Summarize the key technologies of AWS services mentioned by each hero
+ - If a hero consistently mentions a specific service, output only that hero's statements
+ - Use the following format for each hero:
+ - • Hero Name (Company Name, Job Title)
+ - Understand the intent of statements even from inaccurate text
+ 
+ Here is the video conversation content:
+ 
+ {stt_data}
+ 
+ 1. What is the current topic of discussion?
+ 2. Summarize the main statements about AWS services made by each hero.
+ """
+ 
+ AWS_SEARCH_PROMPT = """
+ You are a Specialized AWS Cloud Services Analysis Assistant.
+ 
+ Your tasks:
+ 1. Read the Input script which was extracted from 2024 AWS re:Invent event videos.
+ 
+ 2. Extract exactly 3 most significant elements from the script, including:
+ - AWS cloud services and product names (e.g., EC2, S3, Lambda)
+ - Cloud computing technologies and concepts
175
+ - New features or service announcements
176
+ - AWS Heroes or presenters' names
177
+ - Cloud architecture patterns or best practices
178
+ - Security or cost optimization strategies
179
+
180
+ 3. For each extracted element:
181
+ - The data you enter is scripted data transcribed from real-time speech and may contain typos. Process typos that make sense in context (e.g., "lambda" might be "Lambda").
182
+ - Correct technical terminology when transcription errors occur due to English-Korean pronunciation differences
183
+ - Search for relevant technical background information
184
+ - Provide comprehensive summaries in English. All summaries MUST be provided in English only.
185
+ - Focus on technical context and cloud computing significance
186
+ - Each element should be one word or short phrase, preferably the official AWS service name or technical term.
187
+
188
+ 4. Priority should be given to:
189
+ - Newly announced AWS services or features
190
+ - Frequently mentioned cloud architectures or services
191
+ - Technical terms or cloud concepts that need explanation
192
+ - Key AWS Heroes or AWS leadership mentioned
193
+ - Case studies or demonstrations highlighted in the content
194
+ - Differentiated AWS technologies or approaches
195
+
196
+ Output Format:
197
+ {sample_schema}
198
+
199
+ * keyword1, 2, 3 are the main AWS-related keywords pulled from the script data.
200
+ * Summary 1, 2, 3 are the searches and answers for each keyword. Include a detailed technical description of at least 2-3 sentences in English, explaining the service functionality and cloud computing context.
201
+ """
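Each of the prompts above ends with an Output Format section built from a `{sample_schema}` placeholder (and `AWS_CLAUDE_PROMPT` uses `{stt_data}`). A minimal sketch of how such a template might be filled at call time — the template and JSON schema below are illustrative stand-ins, not the repo's actual values:

```python
# Sketch: filling a {sample_schema} placeholder with str.format.
# PROMPT_TEMPLATE and the schema are hypothetical examples for illustration.
import json

PROMPT_TEMPLATE = """
Extract exactly 3 key elements from the script.

Output Format:
{sample_schema}
"""

# A hypothetical keyword/summary schema matching the prompts' description.
sample_schema = json.dumps(
    {
        "keyword1": "...", "summary1": "...",
        "keyword2": "...", "summary2": "...",
        "keyword3": "...", "summary3": "...",
    },
    indent=2,
)

prompt = PROMPT_TEMPLATE.format(sample_schema=sample_schema)
print(prompt)
```

Note that `str.format` only substitutes braces in the template itself, so braces inside the substituted JSON string are passed through untouched.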
realtime_video_analysis.py ADDED
@@ -0,0 +1,146 @@
+ import nest_asyncio
+ import asyncio
+ import aiofiles
+ from amazon_transcribe.client import TranscribeStreamingClient
+ from amazon_transcribe.handlers import TranscriptResultStreamHandler
+ from amazon_transcribe.model import TranscriptEvent
+ import logging
+
+ # Enable support for nested asyncio event loops
+ nest_asyncio.apply()
+
+ # Set up logging
+ logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
+ logger = logging.getLogger('transcription_system')
+
+ class TranscriptHandler(TranscriptResultStreamHandler):
+     """Handler class for processing Amazon Transcribe events"""
+
+     def __init__(self, output_stream, output_file_path="./transcribe_texts"):
+         super().__init__(output_stream)
+         self.results = []
+         self.processing_complete = False
+         self.output_file_path = output_file_path
+
+         # Initialize the output file at the start
+         with open(self.output_file_path, 'w', encoding="utf-8") as f:
+             f.write("")
+
+     async def handle_transcript_event(self, transcript_event: TranscriptEvent):
+         results = transcript_event.transcript.results
+         for result in results:
+             if not result.is_partial and result.channel_id == 'ch_1':
+                 for alt in result.alternatives:
+                     start_time = None
+                     end_time = None
+                     current_speaker = None
+                     utterance = []
+
+                     for item in alt.items:
+                         if start_time is None:
+                             start_time = item.start_time
+
+                         if item.speaker is not None:
+                             if current_speaker is None:
+                                 current_speaker = item.speaker
+
+                             # Flush the buffered utterance when the speaker changes
+                             if current_speaker != item.speaker:
+                                 transcript = f"Speaker {current_speaker}: {''.join(utterance).strip()}"
+                                 print(f"\n{transcript}")
+                                 self.results.append(transcript)
+
+                                 self._append_to_file(transcript)
+
+                                 current_speaker = item.speaker
+                                 start_time = item.start_time
+                                 utterance = []
+
+                         if item.item_type == 'pronunciation' and utterance:
+                             utterance.append(' ')
+                         utterance.append(item.content)
+                         end_time = item.end_time
+
+                     # Output the last utterance
+                     if utterance:
+                         transcript = f"Speaker {current_speaker}: {''.join(utterance).strip()}"
+                         print(f"\n{transcript}")
+                         self.results.append(transcript)
+                         self._append_to_file(transcript)
+
+     def _append_to_file(self, transcript):
+         """Append the STT script to a file"""
+         try:
+             with open(self.output_file_path, 'a', encoding='utf-8') as f:
+                 f.write(transcript + "\n")
+         except Exception as e:
+             logger.error(f"Error occurred while writing to file: {str(e)}")
+
+     def set_complete(self):
+         """Indicate that transcription processing is complete"""
+         self.processing_complete = True
+
+         try:
+             with open(self.output_file_path, 'a', encoding="utf-8") as f:
+                 f.write("\n----STT work complete---\n")
+         except Exception as e:
+             logger.error(f"Error occurred while writing completion marker: {str(e)}")
+
+ async def process_audio_file(file_path, region="ap-northeast-2", sample_rate=32000, language=None, content_type=None):
+     """Asynchronous function to process audio files and generate transcripts"""
+     logger.info(f"Starting transcription for file '{file_path}'")
+
+     if language is None:
+         if content_type == "Bundesliga Fan Experience" or "bundesliga" in (file_path or "").lower():
+             language = "en-US"
+             logger.info("English content detected: changing language setting to 'en-US'")
+         else:
+             language = "ko-KR"
+             logger.info("Default language setting: 'ko-KR'")
+
+     client = TranscribeStreamingClient(region=region)
+
+     stream = await client.start_stream_transcription(
+         language_code=language,
+         media_sample_rate_hz=sample_rate,
+         media_encoding="pcm",
+         enable_partial_results_stabilization=True,
+         partial_results_stability="high",
+         show_speaker_label=True,
+         enable_channel_identification=True,
+         number_of_channels=2
+     )
+
+     handler = TranscriptHandler(stream.output_stream)
+
+     async def write_chunks():
+         try:
+             async with aiofiles.open(file_path, 'rb') as afp:
+                 # Skip the WAV header
+                 await afp.seek(44)
+
+                 while True:
+                     chunk = await afp.read(1024*16)
+                     if not chunk:
+                         break
+                     await stream.input_stream.send_audio_event(audio_chunk=chunk)
+                     await asyncio.sleep(0.125)
+
+                 await stream.input_stream.end_stream()
+         except Exception as e:
+             logger.error(f"Error occurred while writing chunks: {str(e)}")
+
+     await asyncio.gather(write_chunks(), handler.handle_events())
+
+     handler.set_complete()
+     logger.info(f"Transcription complete: {len(handler.results)} utterance segments processed")
+     return handler
+
+ def run_transcription(file_path, content_type=None):
+     """Synchronous wrapper function to run in a ThreadPoolExecutor"""
+     loop = asyncio.new_event_loop()
+     asyncio.set_event_loop(loop)
+     try:
+         handler = loop.run_until_complete(process_audio_file(file_path, content_type=content_type))
+         return handler  # Return the handler object itself
+     finally:
+         loop.close()
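`write_chunks` skips a fixed 44-byte WAV header before streaming raw PCM, which holds only for canonical headers; WAV files carrying extra RIFF chunks (e.g. `LIST`/`INFO` metadata) have longer headers, and a fixed seek would feed header bytes into the transcription stream. A hedged sketch, not from the repo, of locating the `data` chunk by walking the RIFF structure instead:

```python
# Sketch: find the byte offset of the PCM samples in a RIFF/WAVE file
# instead of assuming a 44-byte header.
import io
import struct

def pcm_data_offset(fp) -> int:
    """Return the byte offset of the first PCM sample in a RIFF/WAVE stream."""
    fp.seek(0)
    riff, _size, wave_id = struct.unpack("<4sI4s", fp.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    while True:
        header = fp.read(8)
        if len(header) < 8:
            raise ValueError("no 'data' chunk found")
        chunk_id, chunk_size = struct.unpack("<4sI", header)
        if chunk_id == b"data":
            return fp.tell()
        # Chunks are word-aligned: skip the payload plus any pad byte.
        fp.seek(chunk_size + (chunk_size % 2), 1)

# A canonical 44-byte-header file: the samples start at offset 44.
buf = io.BytesIO(
    b"RIFF" + struct.pack("<I", 36) + b"WAVE"
    + b"fmt " + struct.pack("<I", 16) + bytes(16)
    + b"data" + struct.pack("<I", 0)
)
print(pcm_data_offset(buf))  # 44
```

The returned offset could replace the hard-coded `await afp.seek(44)` when header layouts vary.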
requirements.txt ADDED
@@ -0,0 +1,11 @@
+ boto3==1.38.1
+ botocore==1.38.1
+ amazon-transcribe
+ aiofiles
+ nest-asyncio
+ streamlit
+ pillow
+ gradio
+ python-dotenv
+ google-genai
+ huggingface-hub
run_backend.py ADDED
@@ -0,0 +1,184 @@
+ import concurrent.futures
+ import os
+ import time
+ import logging
+ import threading
+ from realtime_video_analysis import run_transcription
+ from analyze_claude import analyze_with_claude
+ from google_search import grounding_with_google_search
+
+ # Set up logging
+ logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
+ logger = logging.getLogger('backend_system')
+
+ analysis_results = []
+ search_results = []
+
+ def periodic_claude_analysis(party, output_file_path="./transcribe_texts"):
+     """
+     Function to read and analyze transcripts from a file
+     - Wait 30 seconds before the first analysis
+     - Analyze at 10-second intervals thereafter
+     """
+     logger.info("Starting Claude analysis task")
+     analysis_count = 0
+
+     while not os.path.exists(output_file_path):
+         logger.info("Waiting for transcription file to be created...")
+         time.sleep(2)
+
+     logger.info("Waiting 30 seconds for initial transcription collection...")
+     time.sleep(30)
+     logger.info("Wait complete, starting analysis")
+
+     while True:
+         try:
+             if not os.path.exists(output_file_path):
+                 logger.warning("Transcription file is missing. Waiting...")
+                 time.sleep(5)
+                 continue
+
+             with open(output_file_path, "r", encoding="utf-8") as f:
+                 current_content = f.read()
+
+             if current_content.strip():
+                 analysis_count += 1
+                 logger.info(f"Starting analysis #{analysis_count}: Read content from file")
+
+                 try:
+                     analysis_result = analyze_with_claude(current_content, party)
+                     print("\n" + "="*50)
+                     print(f"Analysis result #{analysis_count} - {time.strftime('%Y-%m-%d %H:%M:%S')}")
+                     print("="*50)
+                     print(analysis_result)
+                     print("="*50 + "\n")
+                     analysis_results.append(analysis_result)
+
+                 except Exception as e:
+                     logger.error(f"Error occurred during Claude summarization: {str(e)}")
+
+             else:
+                 logger.info("No content in file. Waiting...")
+
+             if "----STT work complete---" in current_content:
+                 break
+
+         except Exception as e:
+             logger.error(f"Error occurred while reading file: {str(e)}")
+
+         time.sleep(10)
+
+     logger.info("Claude analysis task complete")
+
+ def periodic_google_search(party, output_file_path="./transcribe_texts"):
+     """
+     Function to read the transcript from a file and perform keyword extraction and search with Gemini
+     - Wait 30 seconds before the first search
+     - Search at 10-second intervals thereafter
+     """
+     logger.info("Starting Google search task")
+     search_count = 0
+
+     # Wait until the file is created
+     while not os.path.exists(output_file_path):
+         logger.info("Waiting for transcription file...")
+         time.sleep(2)
+
+     # Initial 30-second wait
+     logger.info("Waiting 30 seconds for initial transcription collection...")
+     time.sleep(30)
+     logger.info("Wait complete, starting Google search")
+
+     # Once the file exists, read and search periodically
+     while True:
+         try:
+             # Check if the file exists
+             if not os.path.exists(output_file_path):
+                 logger.warning("Transcription file is missing. Waiting...")
+                 time.sleep(5)
+                 continue
+
+             # Read the entire file content
+             with open(output_file_path, 'r', encoding='utf-8') as f:
+                 content = f.read()
+                 all_lines = content.splitlines()
+
+             # Use only the last 5 lines of the file content for google_search
+             last_lines = all_lines[-5:] if len(all_lines) >= 5 else all_lines
+             current_content = "\n".join(last_lines).strip()
+
+             # Log content (for debugging)
+             logger.debug("Read last 5 lines of STT file for search context")
+
+             # If there is content, perform the search
+             if current_content:
+                 search_count += 1
+                 logger.info(f"Google Search #{search_count} Start: Analyzing last 5 lines in STT file")
+
+                 try:
+                     # Keyword extraction and search with Gemini
+                     search_result = grounding_with_google_search(current_content, party)
+
+                     # Output search results
+                     print("\n" + "="*50)
+                     print(f"Google Search Result #{search_count} - {time.strftime('%Y-%m-%d %H:%M:%S')}")
+                     print("="*50)
+                     print(search_result)
+                     print("="*50 + "\n")
+                     search_results.append(search_result)
+
+                 except Exception as e:
+                     logger.error(f"Error occurred during Google search: {str(e)}")
+             else:
+                 logger.info("No content in file. Waiting...")
+
+             # Check if the completion marker is present
+             if "----STT work complete---" in current_content:
+                 logger.info("Completion marker detected. Google search complete.")
+                 break
+
+         except Exception as e:
+             logger.error(f"Error occurred while reading file: {str(e)}")
+
+         # Wait 10 seconds
+         time.sleep(10)
+
+     logger.info("Google search task complete")
+
+ def main(party=None):
+     """Main function - run parallel tasks"""
+
+     # Select the audio file based on the button clicked
+     if party == "더불어민주당":
+         audio_file = None
+     elif party == "Agents for Amazon Bedrock":
+         audio_file = './data/summit_sungwoo.wav'
+     elif party == "Bundesliga Fan Experience":
+         audio_file = './data/aws_bundesliga.wav'
+     elif party == "AWS_2024_recap":
+         audio_file = './data/aws.wav'
+     else:  # Default or "국민의힘"
+         audio_file = None
+         party = "국민의힘"
+
+     output_file_path = './transcribe_texts'
+
+     logger.info("Backend system started")
+
+     # Run parallel tasks using ThreadPoolExecutor
+     with concurrent.futures.ThreadPoolExecutor() as executor:
+         # Submit three tasks simultaneously
+         task1 = executor.submit(run_transcription, audio_file, party)
+         task2 = executor.submit(periodic_claude_analysis, party, output_file_path)
+         task3 = executor.submit(periodic_google_search, party, output_file_path)
+
+         # Wait for all three tasks to complete
+         task1.result()
+         task2.result()
+         task3.result()
+
+     logger.info("All tasks complete")
+     return "Analysis complete"
+
+ if __name__ == "__main__":
+     results = main()
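`periodic_google_search` re-reads the transcript file on every cycle and feeds only its tail to the search agent. A standalone sketch of that tail-reading step (file name hypothetical), joining the lines with newlines so speaker turns stay separated:

```python
# Sketch: read the last N lines of a growing transcript file.
# The demo file name below is hypothetical, used only for illustration.
from pathlib import Path

def tail_lines(path: str, n: int = 5) -> str:
    """Return the last n lines of a transcript file, newline-joined."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return "\n".join(lines[-n:]).strip()

# Simulate a transcript produced by the STT handler.
p = Path("demo_transcribe_texts")
p.write_text(
    "Speaker 0: hi\nSpeaker 1: hello\nSpeaker 0: topic A\n"
    "Speaker 1: detail B\nSpeaker 0: detail C\nSpeaker 1: wrap up\n",
    encoding="utf-8",
)
print(tail_lines(str(p)))
```

Re-reading the whole file each cycle is simple and robust for short sessions; for long-running streams an incremental read from a remembered offset would avoid quadratic I/O.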
transcribe_texts ADDED
File without changes