faster-whisper-base-ct2 (float16)
CTranslate2 conversion of OpenAI whisper-base (float16 quantization)
This repository contains openai/whisper-base converted to the CTranslate2 format with ct2-transformers-converter, using float16 quantization.
The converted model is built for fast inference and can be used directly with faster-whisper (strongly recommended) or with CTranslate2 itself. Inference is roughly 4-10x faster than the original Transformers version, with a much smaller memory footprint and near-lossless accuracy.
Conversion info
Conversion command:
```shell
ct2-transformers-converter --model D:/openai-whisper-base \
    --output_dir D:/faster-whisper-base-ct2 \
    --copy_files tokenizer.json preprocessor_config.json \
    --quantization float16
```
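Before running the conversion or the usage example, the required packages need to be installed. A minimal setup, assuming the converter comes from the `ctranslate2` pip package (which also needs `transformers` with PyTorch to load the source model):

```shell
# ct2-transformers-converter ships with the ctranslate2 package;
# transformers + torch are needed to read the original HF checkpoint.
pip install ctranslate2 "transformers[torch]"
# For inference with the converted model:
pip install faster-whisper
```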
Usage
```python
from faster_whisper import WhisperModel
import gc
import torch

model_path = "D:/faster-whisper-base-ct2"

model = WhisperModel(
    model_path,
    device="cuda" if torch.cuda.is_available() else "cpu",
    # This checkpoint is float16; fall back to "int8_float16" or "int8"
    # if you run out of GPU memory.
    compute_type="float16",
    local_files_only=True,
)

audio_path = "D:/distil-whisper/output.mp3"

print("Transcribing (long-audio optimizations enabled)...")
segments, info = model.transcribe(
    audio_path,
    language="zh",  # set to None to auto-detect the language
    beam_size=5,
    vad_filter=True,
    vad_parameters=dict(min_silence_duration_ms=500),
    word_timestamps=False,
)

print(f"Detected language: {info.language} (probability: {info.language_probability:.2f})")
print(f"Total duration: {info.duration:.2f} s")
print("Transcript:")

full_text = ""
for segment in segments:
    print(f"'{segment.text}',")
    full_text += segment.text.strip() + ", "
full_text = full_text.rstrip(", ").strip()

print("\nFull text:")
print(full_text)

# Free the model and release GPU memory.
del model
gc.collect()
if torch.cuda.is_available():
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
print("GPU memory released!")
```
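For long audio files, faster-whisper also provides a `BatchedInferencePipeline` that wraps a `WhisperModel` and transcribes segments in batches for higher throughput. A minimal sketch; the paths, `batch_size`, and the `join_segments` helper are illustrative assumptions, not part of this repository:

```python
def join_segments(texts):
    """Join per-segment texts into one comma-separated string."""
    return ", ".join(t.strip() for t in texts if t.strip())

def transcribe_batched(model_path, audio_path, batch_size=8):
    # Imported inside the function so the sketch only requires
    # faster-whisper at call time.
    from faster_whisper import WhisperModel, BatchedInferencePipeline

    model = WhisperModel(model_path, device="cuda", compute_type="float16")
    batched = BatchedInferencePipeline(model=model)
    # batch_size=8 is an assumed starting point; tune it to your GPU memory.
    segments, info = batched.transcribe(audio_path, batch_size=batch_size)
    return join_segments(s.text for s in segments)
```

Batching mainly pays off on audio longer than a few minutes; for short clips the plain `model.transcribe` call above is sufficient.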