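#!/usr/bin/env python3
"""
Summary of the key improvements in the optimized version:

1. Parallel processing (the largest performance gain)
- ProcessPoolExecutor: processes multiple take_ids in parallel
- ThreadPoolExecutor: processes multiple cameras in parallel within each take
- Can give a significant speedup, especially on multi-core CPUs

2. Image saving
- Processes frames in batches rather than one at a time
- Uses cv2.INTER_AREA for more efficient downsampling
- Tuned JPEG compression parameters
- Fewer repeated size checks

3. VideoReader
- Automatically tries GPU acceleration and falls back to the CPU on failure
- More efficient batched frame reading
- Fewer repeated VideoReader instantiations

4. File I/O
- Checks up front whether the output directory already exists
- Fewer repeated os.path.exists calls
- Uses pathlib.Path for path operations

5. Memory and array operations
- More efficient BGR-to-RGB conversion
- Fewer unnecessary array copies
- Streamlined error handling to avoid exception overhead

Expected performance gains:
- Possibly a 2-8x speedup on multi-core systems
- GPU acceleration can improve performance further
- Less time spent waiting on I/O
- Better resource utilization

Usage:
python only_extract_frames_optimized.py \\
    --takepath /path/to/takes \\
    --annotationpath /path/to/annotations.json \\
    --split_path /path/to/split.json \\
    --split train \\
    --outputpath /path/to/output \\
    --max_workers 4 \\
    --camera_workers 8
"""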

# Key performance comparison test code
import time
import os
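
# The docstring above describes two levels of parallelism: a ProcessPoolExecutor
# across take_ids and a ThreadPoolExecutor across the cameras of each take.
# A minimal sketch of that layout follows. It is not the code from
# only_extract_frames_optimized.py: the function names, the (take_id, cameras)
# input shape, and the placeholder per-camera work are assumptions made for
# illustration only.
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from functools import partial


def extract_camera_sketch(take_id, camera):
    """Hypothetical stand-in for decoding and saving one camera's frames."""
    return f"{take_id}/{camera}"


def run_cameras_in_threads(take, camera_workers=8):
    """Fan the cameras of one take out over a thread pool."""
    take_id, cameras = take
    with ThreadPoolExecutor(max_workers=camera_workers) as pool:
        # map() returns results in the same order as `cameras`.
        return list(pool.map(partial(extract_camera_sketch, take_id), cameras))


def run_takes_in_processes(takes, max_workers=4, camera_workers=8):
    """Fan takes out over worker processes; each process threads over cameras.

    Call this from under `if __name__ == "__main__":` so that worker
    processes can import this module safely on spawn-based platforms.
    """
    worker = partial(run_cameras_in_threads, camera_workers=camera_workers)
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, takes))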

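
# The docstring lists "checks up front whether the output directory already
# exists" among the file I/O optimizations. One way to do that is sketched
# below. The function name and the <outputpath>/<take_id>/<camera> directory
# layout are assumptions, not taken from only_extract_frames_optimized.py.
from pathlib import Path


def pending_cameras(output_root, take_id, cameras):
    """Return the cameras whose output directory is missing or empty."""
    take_dir = Path(output_root) / take_id
    if not take_dir.is_dir():
        # Nothing has been written for this take, so skip per-camera checks.
        return list(cameras)
    # One directory listing replaces a separate existence check per camera.
    non_empty = {
        p.name for p in take_dir.iterdir() if p.is_dir() and any(p.iterdir())
    }
    return [cam for cam in cameras if cam not in non_empty]

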
def benchmark_comparison():
    """A simple performance comparison test."""
    print("Performance optimization comparison:")
    print("=" * 50)

    # Simulate the original method
    start = time.time()
    # Simulate serial processing
    for i in range(10):
        time.sleep(0.1)  # simulate an I/O operation
    original_time = time.time() - start

    # Simulate the optimized method
    from concurrent.futures import ThreadPoolExecutor
    start = time.time()

    def mock_process(x):
        time.sleep(0.1)
        return x

    with ThreadPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(mock_process, range(10)))

    optimized_time = time.time() - start

    print(f"Original serial method time: {original_time:.2f} s")
    print(f"Optimized parallel method time: {optimized_time:.2f} s")
    print(f"Speedup: {original_time/optimized_time:.1f}x")
    print("=" * 50)
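

# For the mock workload in benchmark_comparison() (10 tasks of 0.1 s each on
# 4 threads), the tasks run in ceil(10 / 4) = 3 waves, so the best-case
# parallel time is about 0.3 s against about 1.0 s serially. The hypothetical
# helper below computes that upper bound. It assumes equal-length tasks and
# ignores scheduling overhead, so measured speedups will be somewhat lower.
import math


def ideal_speedup(num_tasks, workers):
    """Upper bound on speedup for equal-length tasks run in waves of `workers`."""
    waves = math.ceil(num_tasks / workers)
    return num_tasks / waves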

if __name__ == "__main__":
    benchmark_comparison()