# Whisper-medium - Ternary Quantized (tritplane3)

A ternary-quantized version of [openai/whisper-medium](https://huggingface.co/openai/whisper-medium).

## Specifications

| Property | Value |
|---|---|
| Base model | openai/whisper-medium |
| Parameters | 769M |
| Quantization | tritplane3 (240 decoder layers) |
| Audio encoder | FP16 (preserved) |
| Stored size | 453 MB |
| FP16 size | ~3.1 GB |
| Compression | 1.30× |
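For background, ternary quantization constrains each weight to {-1, 0, +1} times a per-tensor scale. The sketch below is a generic threshold-based scheme in the style of Ternary Weight Networks, not the actual tritplane3 algorithm (which is not documented here); the function name and `delta_frac` parameter are illustrative assumptions:

```python
import numpy as np

def ternarize(w: np.ndarray, delta_frac: float = 0.7):
    """TWN-style ternary quantization sketch (NOT the tritplane3 scheme).

    Weights with |w| <= delta are zeroed; the rest become +/-1,
    scaled by the mean magnitude of the surviving weights.
    """
    delta = delta_frac * np.abs(w).mean()        # pruning threshold
    mask = np.abs(w) > delta
    t = (np.sign(w) * mask).astype(np.int8)      # values in {-1, 0, +1}
    alpha = float(np.abs(w[mask]).mean())        # per-tensor scale
    return t, alpha

w = np.random.randn(4, 4).astype(np.float32)
t, alpha = ternarize(w)
w_hat = alpha * t  # dequantized approximation of w
```

Storing `t` as 2-bit codes plus one scale per tensor is what drives the size reduction relative to FP16.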

## Usage

```python
import numpy as np
import soundfile as sf
import torch

from ternary_quant.inference import load_ternary_model

model, proc = load_ternary_model(
    "AsadIsmail/whisper-medium-ternary", runtime_mode="cached", device="cpu"
)
model = model.float()  # required for encoder compatibility

# Transcribe an audio file
audio, sr = sf.read("audio.flac")
inputs = proc(audio.astype(np.float32), sampling_rate=sr, return_tensors="pt")
inputs = {k: v.float() for k, v in inputs.items()}
with torch.no_grad():
    ids = model.generate(**inputs, max_new_tokens=100)
print(proc.batch_decode(ids, skip_special_tokens=True)[0])
```
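Whisper's feature extractor expects 16 kHz audio, so files recorded at other rates should be resampled before calling the processor. In practice you would use `librosa.resample` or `torchaudio`; the helper below is a minimal linear-interpolation sketch (the function name is illustrative):

```python
import numpy as np

def resample_linear(audio: np.ndarray, sr: int, target_sr: int = 16_000) -> np.ndarray:
    """Naive linear-interpolation resampler; prefer librosa/torchaudio in practice."""
    if sr == target_sr:
        return audio
    n_out = int(round(len(audio) * target_sr / sr))
    x_old = np.linspace(0.0, 1.0, num=len(audio), endpoint=False)
    x_new = np.linspace(0.0, 1.0, num=n_out, endpoint=False)
    return np.interp(x_new, x_old, audio).astype(np.float32)

# One second of 44.1 kHz audio becomes one second at 16 kHz
audio_16k = resample_linear(np.zeros(44_100, dtype=np.float32), 44_100)
```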

## Collection

Part of the [ternary-models](https://huggingface.co/collections/AsadIsmail/ternary-models) collection.
