faster-whisper-base-ct2 (float16)

A CTranslate2 conversion of OpenAI whisper-base (float16 quantization)

This repository provides openai/whisper-base converted to the CTranslate2 format with ct2-transformers-converter, quantized to float16.

The converted model is built for fast inference and can be used directly with faster-whisper (strongly recommended) or with native CTranslate2. Inference is roughly 4–10x faster than the vanilla Transformers implementation, VRAM usage drops substantially, and accuracy is nearly unchanged.

Conversion details

Conversion command:

ct2-transformers-converter --model D:/openai-whisper-base \
    --output_dir D:/faster-whisper-base-ct2 \
    --copy_files tokenizer.json preprocessor_config.json \
    --quantization float16
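float16 stores each weight in 2 bytes instead of float32's 4, which is why the converted model is about half the size; each value keeps roughly 3 significant decimal digits. A quick standard-library illustration of that trade-off (not part of the conversion itself):

```python
import struct

def round_trip_f16(x: float) -> float:
    """Round-trip a value through IEEE 754 half precision ('e' format)."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

# 2 bytes per float16 value vs 4 bytes per float32 value
print(struct.calcsize('<e'), struct.calcsize('<f'))

# the stored value is slightly rounded relative to the original
print(round_trip_f16(0.1234567))
```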


Usage example (faster-whisper):

from faster_whisper import WhisperModel
import gc
import torch

model_path = "D:/faster-whisper-base-ct2"

# compute_type="float16" matches this model; switch to "int8_float16" or "int8" if VRAM is limited
model = WhisperModel(
    model_path,
    device="cuda" if torch.cuda.is_available() else "cpu",
    compute_type="float16",
    local_files_only=True
)

audio_path = "D:/distil-whisper/output.mp3"

print("Transcribing (optimized for long audio)...")
segments, info = model.transcribe(
    audio_path,
    language="zh",  # set to None to auto-detect the language
    beam_size=5,
    vad_filter=True,
    vad_parameters=dict(min_silence_duration_ms=500),
    word_timestamps=False
)
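Note that `transcribe()` returns almost immediately: in faster-whisper, `segments` is a generator, and the actual decoding happens only as you iterate over it, so an unconsumed result does no work. A model-free sketch of the same lazy pattern:

```python
def lazy_transcribe(chunks):
    """Yield results one at a time; the work happens only during iteration."""
    for chunk in chunks:
        # a real pipeline would decode audio here, one chunk at a time
        yield chunk.upper()

segments = lazy_transcribe(["hello", "world"])  # returns instantly, no work yet
results = list(segments)                        # iterating drives the decoding
print(results)
```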

print(f"Detected language: {info.language} (probability: {info.language_probability:.2f})")
print(f"Total duration: {info.duration:.2f} s")
print("Transcription:")

full_text = ""
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
    full_text += segment.text.strip() + " "

full_text = full_text.strip()

print("\nFull text:")
print(full_text)

del model
gc.collect()
if torch.cuda.is_available():
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()

print("VRAM released.")
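The cleanup above is the standard pattern: drop the last reference to the model, run a garbage-collection pass, then clear PyTorch's CUDA cache (the two `torch.cuda` calls only matter on GPU). A model-free sketch, using `weakref` to confirm the object really is released:

```python
import gc
import weakref

class FakeModel:
    """Stand-in for a large model object."""

m = FakeModel()
ref = weakref.ref(m)   # watch the object without keeping it alive
del m                  # drop the last strong reference
gc.collect()           # force a collection pass (also frees reference cycles)
assert ref() is None   # the object has been freed
print("released")
```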