| 2025-08-21 00:57:48 - INFO - Loading model: openbmb/MiniCPM-V-4 |
| 2025-08-21 00:57:49 - INFO - vision_config is None, using default vision config |
| 2025-08-21 00:58:53 - INFO - Model loaded in 64.86 seconds |
| 2025-08-21 00:58:53 - INFO - GPU Memory Usage after model load: 7802.99 MB |
| 2025-08-21 01:00:40 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_001.mp4' |
| 2025-08-21 01:00:40 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] Video saved to temporary file: temp_videos/f1b33371-19eb-4445-b227-d46a97ed0050.mp4 |
| 2025-08-21 01:00:40 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:00:45 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:00:45 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] 30 frames saved to temp_videos/f1b33371-19eb-4445-b227-d46a97ed0050 |
| 2025-08-21 01:01:03 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:01:14 - INFO - Tokens per second: 5.106639419779758, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:01:14 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] Inference time: 33.66 seconds, CPU usage: 16.3%, CPU core utilization: [16.0, 13.4, 20.9, 14.9] |
| 2025-08-21 01:01:14 - INFO - [f1b33371-19eb-4445-b227-d46a97ed0050] Cleaned up temporary frame directory: temp_videos/f1b33371-19eb-4445-b227-d46a97ed0050 |
| 2025-08-21 01:01:14 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_002.mp4' |
| 2025-08-21 01:01:14 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] Video saved to temporary file: temp_videos/a95d5cef-7c7e-42ee-8944-0bfe41e9beed.mp4 |
| 2025-08-21 01:01:14 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:01:19 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:01:19 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] 30 frames saved to temp_videos/a95d5cef-7c7e-42ee-8944-0bfe41e9beed |
| 2025-08-21 01:01:32 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:01:43 - INFO - Tokens per second: 5.976602351659928, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:01:43 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] Inference time: 29.21 seconds, CPU usage: 36.9%, CPU core utilization: [52.9, 47.3, 16.7, 30.9] |
| 2025-08-21 01:01:43 - INFO - [a95d5cef-7c7e-42ee-8944-0bfe41e9beed] Cleaned up temporary frame directory: temp_videos/a95d5cef-7c7e-42ee-8944-0bfe41e9beed |
| 2025-08-21 01:01:43 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_003.mp4' |
| 2025-08-21 01:01:43 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] Video saved to temporary file: temp_videos/3074ddc0-8709-448c-b863-d209d175a408.mp4 |
| 2025-08-21 01:01:43 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:01:48 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:01:48 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] 30 frames saved to temp_videos/3074ddc0-8709-448c-b863-d209d175a408 |
| 2025-08-21 01:02:01 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:02:11 - INFO - Tokens per second: 5.11842779774044, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:02:11 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] Inference time: 28.20 seconds, CPU usage: 37.6%, CPU core utilization: [51.0, 23.2, 57.8, 18.2] |
| 2025-08-21 01:02:11 - INFO - [3074ddc0-8709-448c-b863-d209d175a408] Cleaned up temporary frame directory: temp_videos/3074ddc0-8709-448c-b863-d209d175a408 |
| 2025-08-21 01:02:11 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_004.mp4' |
| 2025-08-21 01:02:11 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] Video saved to temporary file: temp_videos/07ccf239-d23c-4776-8404-e885a43e8515.mp4 |
| 2025-08-21 01:02:11 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:02:16 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:02:16 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] 30 frames saved to temp_videos/07ccf239-d23c-4776-8404-e885a43e8515 |
| 2025-08-21 01:02:29 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:02:43 - INFO - Tokens per second: 7.094428465980155, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:02:43 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] Inference time: 31.26 seconds, CPU usage: 36.1%, CPU core utilization: [16.6, 46.3, 15.5, 66.2] |
| 2025-08-21 01:02:43 - INFO - [07ccf239-d23c-4776-8404-e885a43e8515] Cleaned up temporary frame directory: temp_videos/07ccf239-d23c-4776-8404-e885a43e8515 |
| 2025-08-21 01:02:43 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_005.mp4' |
| 2025-08-21 01:02:43 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] Video saved to temporary file: temp_videos/2bbdb8ec-7655-434d-834f-cb58caa1d778.mp4 |
| 2025-08-21 01:02:43 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:02:48 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:02:48 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] 30 frames saved to temp_videos/2bbdb8ec-7655-434d-834f-cb58caa1d778 |
| 2025-08-21 01:03:00 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:03:11 - INFO - Tokens per second: 5.149733378245225, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:03:11 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] Inference time: 28.50 seconds, CPU usage: 37.4%, CPU core utilization: [19.1, 93.3, 16.5, 20.5] |
| 2025-08-21 01:03:11 - INFO - [2bbdb8ec-7655-434d-834f-cb58caa1d778] Cleaned up temporary frame directory: temp_videos/2bbdb8ec-7655-434d-834f-cb58caa1d778 |
| 2025-08-21 01:03:11 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_006.mp4' |
| 2025-08-21 01:03:11 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] Video saved to temporary file: temp_videos/2e2162d1-d7a0-4b02-940b-60b27a47d77e.mp4 |
| 2025-08-21 01:03:11 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:03:16 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:03:16 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] 30 frames saved to temp_videos/2e2162d1-d7a0-4b02-940b-60b27a47d77e |
| 2025-08-21 01:03:29 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:03:37 - INFO - Tokens per second: 1.8942455900034438, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:03:37 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] Inference time: 25.68 seconds, CPU usage: 37.9%, CPU core utilization: [27.3, 31.8, 57.5, 35.1] |
| 2025-08-21 01:03:37 - INFO - [2e2162d1-d7a0-4b02-940b-60b27a47d77e] Cleaned up temporary frame directory: temp_videos/2e2162d1-d7a0-4b02-940b-60b27a47d77e |
| 2025-08-21 01:03:37 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_007.mp4' |
| 2025-08-21 01:03:37 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] Video saved to temporary file: temp_videos/f08503e2-3694-4dd9-b73f-a2cebaac51af.mp4 |
| 2025-08-21 01:03:37 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:03:42 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:03:42 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] 30 frames saved to temp_videos/f08503e2-3694-4dd9-b73f-a2cebaac51af |
| 2025-08-21 01:03:55 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:04:03 - INFO - Tokens per second: 2.3168022945047397, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:04:03 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] Inference time: 25.99 seconds, CPU usage: 37.8%, CPU core utilization: [22.5, 61.1, 17.4, 50.2] |
| 2025-08-21 01:04:03 - INFO - [f08503e2-3694-4dd9-b73f-a2cebaac51af] Cleaned up temporary frame directory: temp_videos/f08503e2-3694-4dd9-b73f-a2cebaac51af |
| 2025-08-21 01:04:03 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_008.mp4' |
| 2025-08-21 01:04:03 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] Video saved to temporary file: temp_videos/63cd6cc8-794f-4d25-9168-10dbba530d1d.mp4 |
| 2025-08-21 01:04:03 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:04:08 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:04:08 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] 30 frames saved to temp_videos/63cd6cc8-794f-4d25-9168-10dbba530d1d |
| 2025-08-21 01:04:21 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:04:30 - INFO - Tokens per second: 4.072920119323832, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:04:30 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] Inference time: 27.38 seconds, CPU usage: 37.7%, CPU core utilization: [26.2, 41.3, 45.1, 38.4] |
| 2025-08-21 01:04:30 - INFO - [63cd6cc8-794f-4d25-9168-10dbba530d1d] Cleaned up temporary frame directory: temp_videos/63cd6cc8-794f-4d25-9168-10dbba530d1d |
| 2025-08-21 01:04:30 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_009.mp4' |
| 2025-08-21 01:04:30 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] Video saved to temporary file: temp_videos/9c39529e-632b-4ee3-8076-2ecd9890f71e.mp4 |
| 2025-08-21 01:04:30 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:04:35 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:04:35 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] 30 frames saved to temp_videos/9c39529e-632b-4ee3-8076-2ecd9890f71e |
| 2025-08-21 01:04:48 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:04:56 - INFO - Tokens per second: 2.3167189804012382, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:04:56 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] Inference time: 26.01 seconds, CPU usage: 38.1%, CPU core utilization: [21.9, 49.8, 55.8, 25.0] |
| 2025-08-21 01:04:56 - INFO - [9c39529e-632b-4ee3-8076-2ecd9890f71e] Cleaned up temporary frame directory: temp_videos/9c39529e-632b-4ee3-8076-2ecd9890f71e |
| 2025-08-21 01:04:56 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_010.mp4' |
| 2025-08-21 01:04:56 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] Video saved to temporary file: temp_videos/81880b7c-ee68-4a30-849f-1d4ec94e298d.mp4 |
| 2025-08-21 01:04:56 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:05:01 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:05:01 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] 30 frames saved to temp_videos/81880b7c-ee68-4a30-849f-1d4ec94e298d |
| 2025-08-21 01:05:14 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:05:24 - INFO - Tokens per second: 3.995493400475914, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:05:24 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] Inference time: 27.32 seconds, CPU usage: 37.8%, CPU core utilization: [24.0, 30.3, 46.3, 50.6] |
| 2025-08-21 01:05:24 - INFO - [81880b7c-ee68-4a30-849f-1d4ec94e298d] Cleaned up temporary frame directory: temp_videos/81880b7c-ee68-4a30-849f-1d4ec94e298d |
| 2025-08-21 01:05:24 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_011.mp4' |
| 2025-08-21 01:05:24 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] Video saved to temporary file: temp_videos/17dbdf76-a781-446b-92e4-20f0c09ea7fb.mp4 |
| 2025-08-21 01:05:24 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:05:28 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:05:28 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] 30 frames saved to temp_videos/17dbdf76-a781-446b-92e4-20f0c09ea7fb |
| 2025-08-21 01:05:41 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:05:53 - INFO - Tokens per second: 6.202907021110835, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:05:53 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] Inference time: 29.86 seconds, CPU usage: 36.8%, CPU core utilization: [55.8, 29.8, 44.5, 17.1] |
| 2025-08-21 01:05:53 - INFO - [17dbdf76-a781-446b-92e4-20f0c09ea7fb] Cleaned up temporary frame directory: temp_videos/17dbdf76-a781-446b-92e4-20f0c09ea7fb |
| 2025-08-21 01:05:53 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_012.mp4' |
| 2025-08-21 01:05:53 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] Video saved to temporary file: temp_videos/cb51ff1b-2681-489a-944e-810fb0878d91.mp4 |
| 2025-08-21 01:05:53 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:05:58 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:05:58 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] 30 frames saved to temp_videos/cb51ff1b-2681-489a-944e-810fb0878d91 |
| 2025-08-21 01:06:11 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:06:24 - INFO - Tokens per second: 6.5597487765640565, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:06:24 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] Inference time: 30.37 seconds, CPU usage: 36.6%, CPU core utilization: [25.1, 23.9, 51.7, 45.5] |
| 2025-08-21 01:06:24 - INFO - [cb51ff1b-2681-489a-944e-810fb0878d91] Cleaned up temporary frame directory: temp_videos/cb51ff1b-2681-489a-944e-810fb0878d91 |
| 2025-08-21 01:06:24 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_013.mp4' |
| 2025-08-21 01:06:24 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] Video saved to temporary file: temp_videos/05904247-99c1-419b-974f-352384eb4d6f.mp4 |
| 2025-08-21 01:06:24 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:06:29 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:06:29 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] 30 frames saved to temp_videos/05904247-99c1-419b-974f-352384eb4d6f |
| 2025-08-21 01:06:42 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:06:54 - INFO - Tokens per second: 6.599854347514273, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:06:54 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] Inference time: 30.52 seconds, CPU usage: 36.9%, CPU core utilization: [68.2, 18.6, 42.7, 18.1] |
| 2025-08-21 01:06:54 - INFO - [05904247-99c1-419b-974f-352384eb4d6f] Cleaned up temporary frame directory: temp_videos/05904247-99c1-419b-974f-352384eb4d6f |
| 2025-08-21 01:06:54 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_014.mp4' |
| 2025-08-21 01:06:54 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] Video saved to temporary file: temp_videos/1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c.mp4 |
| 2025-08-21 01:06:54 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:06:59 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:06:59 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] 30 frames saved to temp_videos/1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c |
| 2025-08-21 01:07:12 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:07:24 - INFO - Tokens per second: 6.004141023516783, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:07:24 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] Inference time: 29.61 seconds, CPU usage: 36.9%, CPU core utilization: [54.0, 34.3, 33.5, 25.6] |
| 2025-08-21 01:07:24 - INFO - [1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c] Cleaned up temporary frame directory: temp_videos/1a8e1c72-a4c4-4ac2-ad36-2bb2e3c2218c |
| 2025-08-21 01:07:24 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_015.mp4' |
| 2025-08-21 01:07:24 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] Video saved to temporary file: temp_videos/4180adcc-5e38-40df-9153-9dc175d55b7d.mp4 |
| 2025-08-21 01:07:24 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:07:29 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:07:29 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] 30 frames saved to temp_videos/4180adcc-5e38-40df-9153-9dc175d55b7d |
| 2025-08-21 01:07:42 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:07:57 - INFO - Tokens per second: 7.644809074843396, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:07:57 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] Inference time: 32.59 seconds, CPU usage: 35.7%, CPU core utilization: [26.1, 37.3, 48.7, 30.7] |
| 2025-08-21 01:07:57 - INFO - [4180adcc-5e38-40df-9153-9dc175d55b7d] Cleaned up temporary frame directory: temp_videos/4180adcc-5e38-40df-9153-9dc175d55b7d |
| 2025-08-21 01:07:57 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_016.mp4' |
| 2025-08-21 01:07:57 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] Video saved to temporary file: temp_videos/31151bfe-e639-4354-8d86-1446365b7a6d.mp4 |
| 2025-08-21 01:07:57 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:08:01 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:08:02 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] 30 frames saved to temp_videos/31151bfe-e639-4354-8d86-1446365b7a6d |
| 2025-08-21 01:08:14 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:08:24 - INFO - Tokens per second: 3.917112300612553, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:08:24 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] Inference time: 27.22 seconds, CPU usage: 37.9%, CPU core utilization: [26.2, 35.0, 54.6, 35.9] |
| 2025-08-21 01:08:24 - INFO - [31151bfe-e639-4354-8d86-1446365b7a6d] Cleaned up temporary frame directory: temp_videos/31151bfe-e639-4354-8d86-1446365b7a6d |
| 2025-08-21 01:08:24 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_017.mp4' |
| 2025-08-21 01:08:24 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] Video saved to temporary file: temp_videos/bea72135-e169-4ffa-8a4d-711a63d29de7.mp4 |
| 2025-08-21 01:08:24 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:08:29 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:08:29 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] 30 frames saved to temp_videos/bea72135-e169-4ffa-8a4d-711a63d29de7 |
| 2025-08-21 01:08:42 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:08:52 - INFO - Tokens per second: 5.140461789786328, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:08:52 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] Inference time: 28.39 seconds, CPU usage: 36.8%, CPU core utilization: [30.2, 35.9, 49.2, 32.1] |
| 2025-08-21 01:08:52 - INFO - [bea72135-e169-4ffa-8a4d-711a63d29de7] Cleaned up temporary frame directory: temp_videos/bea72135-e169-4ffa-8a4d-711a63d29de7 |
| 2025-08-21 01:08:52 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_018.mp4' |
| 2025-08-21 01:08:52 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] Video saved to temporary file: temp_videos/0225d6ec-246f-4a57-8cd4-937fc512a6da.mp4 |
| 2025-08-21 01:08:52 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:08:57 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:08:57 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] 30 frames saved to temp_videos/0225d6ec-246f-4a57-8cd4-937fc512a6da |
| 2025-08-21 01:09:10 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:09:25 - INFO - Tokens per second: 7.796685733470381, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:09:25 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] Inference time: 32.85 seconds, CPU usage: 36.0%, CPU core utilization: [47.5, 17.9, 27.3, 51.2] |
| 2025-08-21 01:09:25 - INFO - [0225d6ec-246f-4a57-8cd4-937fc512a6da] Cleaned up temporary frame directory: temp_videos/0225d6ec-246f-4a57-8cd4-937fc512a6da |
| 2025-08-21 01:09:25 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_019.mp4' |
| 2025-08-21 01:09:25 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] Video saved to temporary file: temp_videos/a4d435b1-e6a4-4c53-883e-4df02b912d3a.mp4 |
| 2025-08-21 01:09:25 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:09:30 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:09:30 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] 30 frames saved to temp_videos/a4d435b1-e6a4-4c53-883e-4df02b912d3a |
| 2025-08-21 01:09:43 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:09:59 - INFO - Tokens per second: 8.04906132671884, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:09:59 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] Inference time: 33.57 seconds, CPU usage: 35.7%, CPU core utilization: [28.8, 35.7, 39.0, 39.3] |
| 2025-08-21 01:09:59 - INFO - [a4d435b1-e6a4-4c53-883e-4df02b912d3a] Cleaned up temporary frame directory: temp_videos/a4d435b1-e6a4-4c53-883e-4df02b912d3a |
| 2025-08-21 01:09:59 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_020.mp4' |
| 2025-08-21 01:09:59 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] Video saved to temporary file: temp_videos/cb7f092b-c376-4f47-80ca-804b08b972a4.mp4 |
| 2025-08-21 01:09:59 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:10:04 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:10:04 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] 30 frames saved to temp_videos/cb7f092b-c376-4f47-80ca-804b08b972a4 |
| 2025-08-21 01:10:16 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:10:29 - INFO - Tokens per second: 6.551982232700224, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:10:29 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] Inference time: 30.41 seconds, CPU usage: 36.7%, CPU core utilization: [35.2, 27.6, 62.1, 22.0] |
| 2025-08-21 01:10:29 - INFO - [cb7f092b-c376-4f47-80ca-804b08b972a4] Cleaned up temporary frame directory: temp_videos/cb7f092b-c376-4f47-80ca-804b08b972a4 |
| 2025-08-21 01:10:29 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_021.mp4' |
| 2025-08-21 01:10:29 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] Video saved to temporary file: temp_videos/2e42703b-a52c-49f8-a396-ae02062d1c39.mp4 |
| 2025-08-21 01:10:29 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:10:34 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:10:34 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] 30 frames saved to temp_videos/2e42703b-a52c-49f8-a396-ae02062d1c39 |
| 2025-08-21 01:10:47 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:10:58 - INFO - Tokens per second: 5.068557863754886, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:10:58 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] Inference time: 28.44 seconds, CPU usage: 37.4%, CPU core utilization: [24.8, 47.0, 16.6, 61.0] |
| 2025-08-21 01:10:58 - INFO - [2e42703b-a52c-49f8-a396-ae02062d1c39] Cleaned up temporary frame directory: temp_videos/2e42703b-a52c-49f8-a396-ae02062d1c39 |
| 2025-08-21 01:10:58 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_022.mp4' |
| 2025-08-21 01:10:58 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] Video saved to temporary file: temp_videos/b5420571-277f-43c0-ba2e-141c5b252721.mp4 |
| 2025-08-21 01:10:58 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:11:02 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:11:02 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] 30 frames saved to temp_videos/b5420571-277f-43c0-ba2e-141c5b252721 |
| 2025-08-21 01:11:15 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:11:26 - INFO - Tokens per second: 4.765763711580632, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:11:26 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] Inference time: 27.98 seconds, CPU usage: 37.6%, CPU core utilization: [43.1, 34.8, 55.9, 16.7] |
| 2025-08-21 01:11:26 - INFO - [b5420571-277f-43c0-ba2e-141c5b252721] Cleaned up temporary frame directory: temp_videos/b5420571-277f-43c0-ba2e-141c5b252721 |
| 2025-08-21 01:11:26 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_023.mp4' |
| 2025-08-21 01:11:26 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] Video saved to temporary file: temp_videos/0df499a8-8a08-4b6b-a8bd-63fd84cde688.mp4 |
| 2025-08-21 01:11:26 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:11:30 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:11:31 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] 30 frames saved to temp_videos/0df499a8-8a08-4b6b-a8bd-63fd84cde688 |
| 2025-08-21 01:11:43 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:11:58 - INFO - Tokens per second: 7.483257212457778, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:11:58 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] Inference time: 32.23 seconds, CPU usage: 36.1%, CPU core utilization: [26.1, 27.6, 47.6, 43.3] |
| 2025-08-21 01:11:58 - INFO - [0df499a8-8a08-4b6b-a8bd-63fd84cde688] Cleaned up temporary frame directory: temp_videos/0df499a8-8a08-4b6b-a8bd-63fd84cde688 |
| 2025-08-21 01:11:58 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_024.mp4' |
| 2025-08-21 01:11:58 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] Video saved to temporary file: temp_videos/1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a.mp4 |
| 2025-08-21 01:11:58 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:12:03 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:12:03 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] 30 frames saved to temp_videos/1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a |
| 2025-08-21 01:12:16 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:12:25 - INFO - Tokens per second: 3.676425320411625, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:12:25 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] Inference time: 27.02 seconds, CPU usage: 38.5%, CPU core utilization: [40.1, 28.5, 57.3, 28.2] |
| 2025-08-21 01:12:25 - INFO - [1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a] Cleaned up temporary frame directory: temp_videos/1e06a7b7-be5d-449e-b3a5-84f5a7f53e2a |
| 2025-08-21 01:12:25 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_025.mp4' |
| 2025-08-21 01:12:25 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] Video saved to temporary file: temp_videos/5794ceef-3e1b-4291-b263-2b236146168a.mp4 |
| 2025-08-21 01:12:25 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:12:30 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:12:30 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] 30 frames saved to temp_videos/5794ceef-3e1b-4291-b263-2b236146168a |
| 2025-08-21 01:12:43 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:12:53 - INFO - Tokens per second: 5.01668206513031, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:12:53 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] Inference time: 28.36 seconds, CPU usage: 37.5%, CPU core utilization: [29.8, 41.1, 41.8, 37.3] |
| 2025-08-21 01:12:53 - INFO - [5794ceef-3e1b-4291-b263-2b236146168a] Cleaned up temporary frame directory: temp_videos/5794ceef-3e1b-4291-b263-2b236146168a |
| 2025-08-21 01:12:53 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_026.mp4' |
| 2025-08-21 01:12:53 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] Video saved to temporary file: temp_videos/686109f9-7315-4a90-8563-98c12607d0a8.mp4 |
| 2025-08-21 01:12:53 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:12:58 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:12:58 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] 30 frames saved to temp_videos/686109f9-7315-4a90-8563-98c12607d0a8 |
| 2025-08-21 01:13:11 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:13:23 - INFO - Tokens per second: 6.198040985676184, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:13:23 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] Inference time: 29.83 seconds, CPU usage: 37.0%, CPU core utilization: [26.8, 32.5, 30.5, 58.1] |
| 2025-08-21 01:13:23 - INFO - [686109f9-7315-4a90-8563-98c12607d0a8] Cleaned up temporary frame directory: temp_videos/686109f9-7315-4a90-8563-98c12607d0a8 |
| 2025-08-21 01:13:23 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_027.mp4' |
| 2025-08-21 01:13:23 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] Video saved to temporary file: temp_videos/3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c.mp4 |
| 2025-08-21 01:13:23 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:13:28 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:13:28 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] 30 frames saved to temp_videos/3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c |
| 2025-08-21 01:13:41 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:13:54 - INFO - Tokens per second: 6.7633887422454855, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:13:54 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] Inference time: 30.77 seconds, CPU usage: 36.5%, CPU core utilization: [64.3, 25.4, 39.5, 16.6] |
| 2025-08-21 01:13:54 - INFO - [3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c] Cleaned up temporary frame directory: temp_videos/3b5a6bbe-89d5-4405-82ff-ea34ddf2f53c |
| 2025-08-21 01:13:54 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_028.mp4' |
| 2025-08-21 01:13:54 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] Video saved to temporary file: temp_videos/c0db1c86-c4f5-4f7a-92ec-0b0631bde80c.mp4 |
| 2025-08-21 01:13:54 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:13:59 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:13:59 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] 30 frames saved to temp_videos/c0db1c86-c4f5-4f7a-92ec-0b0631bde80c |
| 2025-08-21 01:14:12 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:14:24 - INFO - Tokens per second: 6.241116142632914, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:14:24 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] Inference time: 29.90 seconds, CPU usage: 36.9%, CPU core utilization: [52.1, 24.6, 27.8, 43.1] |
| 2025-08-21 01:14:24 - INFO - [c0db1c86-c4f5-4f7a-92ec-0b0631bde80c] Cleaned up temporary frame directory: temp_videos/c0db1c86-c4f5-4f7a-92ec-0b0631bde80c |
| 2025-08-21 01:14:24 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_029.mp4' |
| 2025-08-21 01:14:24 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] Video saved to temporary file: temp_videos/bb36dcfa-9c43-4bf7-8c8e-becac106dbbb.mp4 |
| 2025-08-21 01:14:24 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:14:29 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:14:29 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] 30 frames saved to temp_videos/bb36dcfa-9c43-4bf7-8c8e-becac106dbbb |
| 2025-08-21 01:14:42 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:14:53 - INFO - Tokens per second: 5.427872993526753, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:14:53 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] Inference time: 28.90 seconds, CPU usage: 36.8%, CPU core utilization: [58.1, 25.7, 27.4, 36.1] |
| 2025-08-21 01:14:53 - INFO - [bb36dcfa-9c43-4bf7-8c8e-becac106dbbb] Cleaned up temporary frame directory: temp_videos/bb36dcfa-9c43-4bf7-8c8e-becac106dbbb |
| 2025-08-21 01:14:53 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_030.mp4' |
| 2025-08-21 01:14:53 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] Video saved to temporary file: temp_videos/78a73f26-9f35-4eff-bd02-6c440580ce76.mp4 |
| 2025-08-21 01:14:53 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:14:58 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:14:58 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] 30 frames saved to temp_videos/78a73f26-9f35-4eff-bd02-6c440580ce76 |
| 2025-08-21 01:15:11 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:15:24 - INFO - Tokens per second: 6.759735939024082, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:15:24 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] Inference time: 30.85 seconds, CPU usage: 36.6%, CPU core utilization: [41.3, 20.6, 31.9, 52.4] |
| 2025-08-21 01:15:24 - INFO - [78a73f26-9f35-4eff-bd02-6c440580ce76] Cleaned up temporary frame directory: temp_videos/78a73f26-9f35-4eff-bd02-6c440580ce76 |
| 2025-08-21 01:15:24 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_031.mp4' |
| 2025-08-21 01:15:24 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] Video saved to temporary file: temp_videos/59928551-1922-42e9-b7ef-b8f27f8d44a7.mp4 |
| 2025-08-21 01:15:24 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:15:28 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:15:28 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] 30 frames saved to temp_videos/59928551-1922-42e9-b7ef-b8f27f8d44a7 |
| 2025-08-21 01:15:41 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:15:52 - INFO - Tokens per second: 5.539399356957513, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:15:52 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] Inference time: 28.92 seconds, CPU usage: 37.0%, CPU core utilization: [36.2, 29.1, 66.1, 16.6] |
| 2025-08-21 01:15:53 - INFO - [59928551-1922-42e9-b7ef-b8f27f8d44a7] Cleaned up temporary frame directory: temp_videos/59928551-1922-42e9-b7ef-b8f27f8d44a7 |
| 2025-08-21 01:15:53 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_032.mp4' |
| 2025-08-21 01:15:53 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] Video saved to temporary file: temp_videos/824a7ab4-629c-45f5-9a3d-4a4db67f847f.mp4 |
| 2025-08-21 01:15:53 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:15:57 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:15:57 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] 30 frames saved to temp_videos/824a7ab4-629c-45f5-9a3d-4a4db67f847f |
| 2025-08-21 01:16:10 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:16:19 - INFO - Tokens per second: 3.2576537369296608, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:16:19 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] Inference time: 26.58 seconds, CPU usage: 38.1%, CPU core utilization: [48.7, 36.2, 31.6, 36.0] |
| 2025-08-21 01:16:19 - INFO - [824a7ab4-629c-45f5-9a3d-4a4db67f847f] Cleaned up temporary frame directory: temp_videos/824a7ab4-629c-45f5-9a3d-4a4db67f847f |
| 2025-08-21 01:16:19 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_033.mp4' |
| 2025-08-21 01:16:19 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] Video saved to temporary file: temp_videos/9483b45f-591e-4e30-b51b-94a81dec9839.mp4 |
| 2025-08-21 01:16:19 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:16:24 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:16:24 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] 30 frames saved to temp_videos/9483b45f-591e-4e30-b51b-94a81dec9839 |
| 2025-08-21 01:16:37 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:16:49 - INFO - Tokens per second: 6.511908523166135, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:16:49 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] Inference time: 30.29 seconds, CPU usage: 36.5%, CPU core utilization: [55.4, 25.4, 48.9, 16.2] |
| 2025-08-21 01:16:49 - INFO - [9483b45f-591e-4e30-b51b-94a81dec9839] Cleaned up temporary frame directory: temp_videos/9483b45f-591e-4e30-b51b-94a81dec9839 |
| 2025-08-21 01:16:49 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_034.mp4' |
| 2025-08-21 01:16:49 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] Video saved to temporary file: temp_videos/8229188c-0cc1-4717-8e19-43ece6e413b5.mp4 |
| 2025-08-21 01:16:49 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:16:54 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:16:54 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] 30 frames saved to temp_videos/8229188c-0cc1-4717-8e19-43ece6e413b5 |
| 2025-08-21 01:17:07 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:17:18 - INFO - Tokens per second: 5.322139736625115, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:17:18 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] Inference time: 28.59 seconds, CPU usage: 37.4%, CPU core utilization: [41.1, 43.3, 45.5, 19.4] |
| 2025-08-21 01:17:18 - INFO - [8229188c-0cc1-4717-8e19-43ece6e413b5] Cleaned up temporary frame directory: temp_videos/8229188c-0cc1-4717-8e19-43ece6e413b5 |
| 2025-08-21 01:17:18 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_035.mp4' |
| 2025-08-21 01:17:18 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] Video saved to temporary file: temp_videos/00df263f-0a55-4646-a5f1-7392cbd3a66e.mp4 |
| 2025-08-21 01:17:18 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:17:23 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:17:23 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] 30 frames saved to temp_videos/00df263f-0a55-4646-a5f1-7392cbd3a66e |
| 2025-08-21 01:17:36 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:17:44 - INFO - Tokens per second: 2.001110574040108, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:17:44 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] Inference time: 25.77 seconds, CPU usage: 38.4%, CPU core utilization: [35.6, 36.8, 32.8, 48.4] |
| 2025-08-21 01:17:44 - INFO - [00df263f-0a55-4646-a5f1-7392cbd3a66e] Cleaned up temporary frame directory: temp_videos/00df263f-0a55-4646-a5f1-7392cbd3a66e |
| 2025-08-21 01:17:44 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_036.mp4' |
| 2025-08-21 01:17:44 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] Video saved to temporary file: temp_videos/56354734-256f-4dd4-a7ca-33f9f37b588a.mp4 |
| 2025-08-21 01:17:44 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:17:49 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:17:49 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] 30 frames saved to temp_videos/56354734-256f-4dd4-a7ca-33f9f37b588a |
| 2025-08-21 01:18:02 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:18:11 - INFO - Tokens per second: 4.287825767821965, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:18:11 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] Inference time: 27.47 seconds, CPU usage: 37.6%, CPU core utilization: [18.0, 46.2, 68.9, 17.6] |
| 2025-08-21 01:18:11 - INFO - [56354734-256f-4dd4-a7ca-33f9f37b588a] Cleaned up temporary frame directory: temp_videos/56354734-256f-4dd4-a7ca-33f9f37b588a |
| 2025-08-21 01:18:11 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_037.mp4' |
| 2025-08-21 01:18:11 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] Video saved to temporary file: temp_videos/43fc9b52-3741-493c-b317-62cd85256985.mp4 |
| 2025-08-21 01:18:11 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:18:16 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:18:16 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] 30 frames saved to temp_videos/43fc9b52-3741-493c-b317-62cd85256985 |
| 2025-08-21 01:18:29 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:18:40 - INFO - Tokens per second: 5.201636019010958, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:18:40 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] Inference time: 28.50 seconds, CPU usage: 37.4%, CPU core utilization: [36.0, 34.1, 41.3, 38.1] |
| 2025-08-21 01:18:40 - INFO - [43fc9b52-3741-493c-b317-62cd85256985] Cleaned up temporary frame directory: temp_videos/43fc9b52-3741-493c-b317-62cd85256985 |
| 2025-08-21 01:18:40 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_038.mp4' |
| 2025-08-21 01:18:40 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] Video saved to temporary file: temp_videos/bb21c0a9-84dd-4b05-8543-a9d4b52958ba.mp4 |
| 2025-08-21 01:18:40 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:18:45 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:18:45 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] 30 frames saved to temp_videos/bb21c0a9-84dd-4b05-8543-a9d4b52958ba |
| 2025-08-21 01:18:58 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:19:09 - INFO - Tokens per second: 5.959308976482591, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:19:09 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] Inference time: 29.47 seconds, CPU usage: 36.6%, CPU core utilization: [47.9, 26.7, 42.6, 29.1] |
| 2025-08-21 01:19:09 - INFO - [bb21c0a9-84dd-4b05-8543-a9d4b52958ba] Cleaned up temporary frame directory: temp_videos/bb21c0a9-84dd-4b05-8543-a9d4b52958ba |
| 2025-08-21 01:19:09 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_039.mp4' |
| 2025-08-21 01:19:09 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] Video saved to temporary file: temp_videos/c1c756cf-8d88-40f1-99d6-35014bca5417.mp4 |
| 2025-08-21 01:19:09 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:19:14 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:19:14 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] 30 frames saved to temp_videos/c1c756cf-8d88-40f1-99d6-35014bca5417 |
| 2025-08-21 01:19:27 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:19:36 - INFO - Tokens per second: 3.3460227628996955, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:19:36 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] Inference time: 26.83 seconds, CPU usage: 38.0%, CPU core utilization: [50.8, 48.6, 31.4, 21.4] |
| 2025-08-21 01:19:36 - INFO - [c1c756cf-8d88-40f1-99d6-35014bca5417] Cleaned up temporary frame directory: temp_videos/c1c756cf-8d88-40f1-99d6-35014bca5417 |
| 2025-08-21 01:19:36 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_040.mp4' |
| 2025-08-21 01:19:36 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] Video saved to temporary file: temp_videos/e408569f-7a5d-4851-961e-1b0408acf6fd.mp4 |
| 2025-08-21 01:19:36 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:19:41 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:19:41 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] 30 frames saved to temp_videos/e408569f-7a5d-4851-961e-1b0408acf6fd |
| 2025-08-21 01:19:54 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:20:03 - INFO - Tokens per second: 3.758188071762749, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:20:03 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] Inference time: 27.11 seconds, CPU usage: 37.6%, CPU core utilization: [32.3, 62.0, 32.1, 23.9] |
| 2025-08-21 01:20:03 - INFO - [e408569f-7a5d-4851-961e-1b0408acf6fd] Cleaned up temporary frame directory: temp_videos/e408569f-7a5d-4851-961e-1b0408acf6fd |
| 2025-08-21 01:20:03 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_041.mp4' |
| 2025-08-21 01:20:03 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] Video saved to temporary file: temp_videos/5dac0a79-3673-4ab1-b2ca-cefc83712b60.mp4 |
| 2025-08-21 01:20:03 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:20:08 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:20:08 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] 30 frames saved to temp_videos/5dac0a79-3673-4ab1-b2ca-cefc83712b60 |
| 2025-08-21 01:20:21 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:20:30 - INFO - Tokens per second: 3.839752750295925, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:20:30 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] Inference time: 27.12 seconds, CPU usage: 37.8%, CPU core utilization: [28.4, 42.0, 34.0, 46.9] |
| 2025-08-21 01:20:30 - INFO - [5dac0a79-3673-4ab1-b2ca-cefc83712b60] Cleaned up temporary frame directory: temp_videos/5dac0a79-3673-4ab1-b2ca-cefc83712b60 |
| 2025-08-21 01:20:30 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_042.mp4' |
| 2025-08-21 01:20:30 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] Video saved to temporary file: temp_videos/92c0e821-54b8-4c96-801a-be04166c4502.mp4 |
| 2025-08-21 01:20:30 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:20:35 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:20:35 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] 30 frames saved to temp_videos/92c0e821-54b8-4c96-801a-be04166c4502 |
| 2025-08-21 01:20:48 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:21:01 - INFO - Tokens per second: 6.5598252890181605, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:21:01 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] Inference time: 30.37 seconds, CPU usage: 36.1%, CPU core utilization: [20.5, 62.7, 16.5, 44.7] |
| 2025-08-21 01:21:01 - INFO - [92c0e821-54b8-4c96-801a-be04166c4502] Cleaned up temporary frame directory: temp_videos/92c0e821-54b8-4c96-801a-be04166c4502 |
| 2025-08-21 01:21:01 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_043.mp4' |
| 2025-08-21 01:21:01 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] Video saved to temporary file: temp_videos/f542ced9-0803-492e-a5c3-1a8cf04f1129.mp4 |
| 2025-08-21 01:21:01 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:21:06 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:21:06 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] 30 frames saved to temp_videos/f542ced9-0803-492e-a5c3-1a8cf04f1129 |
| 2025-08-21 01:21:19 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:21:31 - INFO - Tokens per second: 6.379927791197094, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:21:31 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] Inference time: 30.11 seconds, CPU usage: 36.8%, CPU core utilization: [27.4, 48.1, 37.8, 33.9] |
| 2025-08-21 01:21:31 - INFO - [f542ced9-0803-492e-a5c3-1a8cf04f1129] Cleaned up temporary frame directory: temp_videos/f542ced9-0803-492e-a5c3-1a8cf04f1129 |
| 2025-08-21 01:21:31 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_044.mp4' |
| 2025-08-21 01:21:31 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] Video saved to temporary file: temp_videos/01cce6ea-c917-49ae-b644-91a34b7204c5.mp4 |
| 2025-08-21 01:21:31 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:21:36 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:21:36 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] 30 frames saved to temp_videos/01cce6ea-c917-49ae-b644-91a34b7204c5 |
| 2025-08-21 01:21:49 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:21:58 - INFO - Tokens per second: 3.7502463323747084, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:21:58 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] Inference time: 27.08 seconds, CPU usage: 48.6%, CPU core utilization: [82.7, 37.0, 40.9, 33.6] |
| 2025-08-21 01:21:58 - INFO - [01cce6ea-c917-49ae-b644-91a34b7204c5] Cleaned up temporary frame directory: temp_videos/01cce6ea-c917-49ae-b644-91a34b7204c5 |
| 2025-08-21 01:21:58 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_045.mp4' |
| 2025-08-21 01:21:58 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] Video saved to temporary file: temp_videos/9738fd4f-d14a-4967-bdb7-b1c9156add2a.mp4 |
| 2025-08-21 01:21:58 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:22:06 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:22:06 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] 30 frames saved to temp_videos/9738fd4f-d14a-4967-bdb7-b1c9156add2a |
| 2025-08-21 01:22:19 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:22:26 - INFO - Tokens per second: 1.8954757210352398, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:22:26 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] Inference time: 28.46 seconds, CPU usage: 50.5%, CPU core utilization: [45.6, 51.9, 34.0, 70.4] |
| 2025-08-21 01:22:26 - INFO - [9738fd4f-d14a-4967-bdb7-b1c9156add2a] Cleaned up temporary frame directory: temp_videos/9738fd4f-d14a-4967-bdb7-b1c9156add2a |
| 2025-08-21 01:22:26 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_046.mp4' |
| 2025-08-21 01:22:26 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] Video saved to temporary file: temp_videos/e5c0b243-2f04-4afd-a044-8e574b65e7fe.mp4 |
| 2025-08-21 01:22:26 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:22:31 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:22:31 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] 30 frames saved to temp_videos/e5c0b243-2f04-4afd-a044-8e574b65e7fe |
| 2025-08-21 01:22:44 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:22:56 - INFO - Tokens per second: 5.655760454979962, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:22:56 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] Inference time: 29.07 seconds, CPU usage: 36.7%, CPU core utilization: [19.4, 41.5, 59.0, 27.0] |
| 2025-08-21 01:22:56 - INFO - [e5c0b243-2f04-4afd-a044-8e574b65e7fe] Cleaned up temporary frame directory: temp_videos/e5c0b243-2f04-4afd-a044-8e574b65e7fe |
| 2025-08-21 01:22:56 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_047.mp4' |
| 2025-08-21 01:22:56 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] Video saved to temporary file: temp_videos/2c19cb12-ab5a-476b-8773-34b5991b8716.mp4 |
| 2025-08-21 01:22:56 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:23:00 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:23:00 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] 30 frames saved to temp_videos/2c19cb12-ab5a-476b-8773-34b5991b8716 |
| 2025-08-21 01:23:13 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:23:23 - INFO - Tokens per second: 3.9947178064700855, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:23:23 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] Inference time: 27.28 seconds, CPU usage: 37.9%, CPU core utilization: [27.5, 21.8, 42.6, 59.5] |
| 2025-08-21 01:23:23 - INFO - [2c19cb12-ab5a-476b-8773-34b5991b8716] Cleaned up temporary frame directory: temp_videos/2c19cb12-ab5a-476b-8773-34b5991b8716 |
| 2025-08-21 01:23:23 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_048.mp4' |
| 2025-08-21 01:23:23 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] Video saved to temporary file: temp_videos/1f521869-b6a6-4ffc-b7f9-3ba6424abdc9.mp4 |
| 2025-08-21 01:23:23 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:23:28 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:23:28 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] 30 frames saved to temp_videos/1f521869-b6a6-4ffc-b7f9-3ba6424abdc9 |
| 2025-08-21 01:23:41 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:23:52 - INFO - Tokens per second: 5.431023554612727, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:23:52 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] Inference time: 28.81 seconds, CPU usage: 36.9%, CPU core utilization: [35.2, 33.7, 57.7, 21.0] |
| 2025-08-21 01:23:52 - INFO - [1f521869-b6a6-4ffc-b7f9-3ba6424abdc9] Cleaned up temporary frame directory: temp_videos/1f521869-b6a6-4ffc-b7f9-3ba6424abdc9 |
| 2025-08-21 01:23:52 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_049.mp4' |
| 2025-08-21 01:23:52 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] Video saved to temporary file: temp_videos/bb6768fb-6e99-4507-b665-27c2c1ae0b50.mp4 |
| 2025-08-21 01:23:52 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:23:57 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:23:57 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] 30 frames saved to temp_videos/bb6768fb-6e99-4507-b665-27c2c1ae0b50 |
| 2025-08-21 01:24:09 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:24:20 - INFO - Tokens per second: 5.315589822361752, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:24:20 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] Inference time: 28.63 seconds, CPU usage: 44.5%, CPU core utilization: [27.1, 55.7, 35.4, 59.8] |
| 2025-08-21 01:24:20 - INFO - [bb6768fb-6e99-4507-b665-27c2c1ae0b50] Cleaned up temporary frame directory: temp_videos/bb6768fb-6e99-4507-b665-27c2c1ae0b50 |
| 2025-08-21 01:24:20 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_050.mp4' |
| 2025-08-21 01:24:20 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] Video saved to temporary file: temp_videos/7c0a1e36-be63-4b18-9096-cf0faa8f5ca7.mp4 |
| 2025-08-21 01:24:20 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:24:26 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:24:26 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] 30 frames saved to temp_videos/7c0a1e36-be63-4b18-9096-cf0faa8f5ca7 |
| 2025-08-21 01:24:39 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:24:48 - INFO - Tokens per second: 4.2201806034921, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:24:48 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] Inference time: 28.00 seconds, CPU usage: 37.8%, CPU core utilization: [45.9, 36.6, 32.4, 36.4] |
| 2025-08-21 01:24:48 - INFO - [7c0a1e36-be63-4b18-9096-cf0faa8f5ca7] Cleaned up temporary frame directory: temp_videos/7c0a1e36-be63-4b18-9096-cf0faa8f5ca7 |
| 2025-08-21 01:24:48 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_051.mp4' |
| 2025-08-21 01:24:48 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] Video saved to temporary file: temp_videos/3a2ca98e-47b8-45aa-8834-9dc0e398936a.mp4 |
| 2025-08-21 01:24:48 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:24:53 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:24:53 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] 30 frames saved to temp_videos/3a2ca98e-47b8-45aa-8834-9dc0e398936a |
| 2025-08-21 01:25:06 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:25:16 - INFO - Tokens per second: 4.498009394398241, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:25:16 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] Inference time: 27.74 seconds, CPU usage: 37.8%, CPU core utilization: [34.1, 28.6, 47.5, 40.9] |
| 2025-08-21 01:25:16 - INFO - [3a2ca98e-47b8-45aa-8834-9dc0e398936a] Cleaned up temporary frame directory: temp_videos/3a2ca98e-47b8-45aa-8834-9dc0e398936a |
| 2025-08-21 01:25:16 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] Received new video inference request. Prompt: 'Summarize the key observable events in this 1-minute convenience store video clip. Focus strictly on the physical actions and interactions of the people. Describe only what you can see', Video: '/mnt/data/xiuying/Code/local_deploy/video/Clips_60s/sample_part_052.mp4' |
| 2025-08-21 01:25:16 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] Video saved to temporary file: temp_videos/acb71d01-022e-4d9d-a98c-903f36965977.mp4 |
| 2025-08-21 01:25:16 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] Extracting frames using method: uniform, rate/threshold: 30 |
| 2025-08-21 01:25:22 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] Extracted 30 frames successfully. Saving to temporary files... |
| 2025-08-21 01:25:22 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] 30 frames saved to temp_videos/acb71d01-022e-4d9d-a98c-903f36965977 |
| 2025-08-21 01:25:35 - INFO - vision_config is None, using default vision config |
| 2025-08-21 01:25:47 - INFO - Tokens per second: 6.6824774787667796, Peak GPU memory MB: 11824.375 |
| 2025-08-21 01:25:47 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] Inference time: 31.31 seconds, CPU usage: 44.4%, CPU core utilization: [35.8, 44.5, 56.2, 41.3] |
| 2025-08-21 01:25:47 - INFO - [acb71d01-022e-4d9d-a98c-903f36965977] Cleaned up temporary frame directory: temp_videos/acb71d01-022e-4d9d-a98c-903f36965977 |