Whisper distil-large-v3 model for CTranslate2

This repository contains the conversion of distil-whisper/distil-large-v3 to the CTranslate2 model format.

This model can be used in CTranslate2 or in projects based on CTranslate2, such as faster-whisper.

Example

import base64

import requests

ENDPOINT_URL = "https://endpoints.huggingface.cloud"  # replace with your endpoint URL
HF_TOKEN     = "hf_token"                             # replace with your HF access token
AUDIO_FILE   = "audio.mp3"                            # path to your local audio file

vad_params = {
    "min_silence_duration_ms": 100,
    "speech_pad_ms": 30,
    "min_speech_duration_ms": 40,
    "neg_threshold": 0.2,
}

headers = {"Authorization": f"Bearer {HF_TOKEN}"}

def trans_fast(audiofile, params):
    """
    audiofile:  path to the audio file
    params:     dict containing
                    - 'parameters': dict of options passed to the transcription model
                    - 'batched' (optional): defaults to True, enabling faster_whisper batched inference
    """
    with open(audiofile, "rb") as f:
        data = f.read()  # read the raw audio bytes
    encoded_audio = base64.b64encode(data).decode("utf-8")  # base64-encode for JSON transport
    # Send the encoded audio and parameters to the endpoint
    payload = {"inputs": encoded_audio, **params}
    response = requests.post(ENDPOINT_URL, headers=headers, json=payload)
    response.raise_for_status()
    return response.json()


params = {
    # parameters passed to the faster_whisper transcription call
    "parameters": {"language": "en", "vad_parameters": vad_params},
    # whether to use batched mode (defaults to True)
    # "batched": True,
}
# Example usage
transcript = trans_fast(AUDIO_FILE, params)
print(transcript)

Conversion details

The original model was converted with the following command:

ct2-transformers-converter --model distil-whisper/distil-large-v3 --output_dir faster-distil-whisper-large-v3 \
    --copy_files tokenizer.json preprocessor_config.json --quantization float16

Note that the model weights are saved in FP16. This type can be changed at load time with the compute_type option in CTranslate2.

More information

For more information about the original model, see the distil-whisper/distil-large-v3 model card.
