metadata
library_name: transformers.js
tags:
  - transformers.js
  - onnx
  - whisper
pipeline_tag: automatic-speech-recognition

Whisper Base ONNX

This is an ONNX conversion of OpenAI's whisper-base model, optimized for use with Transformers.js.

Model Details

  • Model Type: Whisper (Encoder-Decoder)
  • Task: Automatic Speech Recognition
  • Format: ONNX (INT8 Quantized)
  • Size: ~75MB (INT8), down from ~300MB for the FP32 original

Usage

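import { pipeline } from '@huggingface/transformers';

// Load the ONNX model from the Hugging Face Hub (downloaded and cached on first use).
const transcriber = await pipeline('automatic-speech-recognition', 'markusingvarsson/whisper-test');

// Transcribe an audio file; the string is treated as a URL or path to the audio.
const result = await transcriber('audio.wav');
console.log(result.text);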
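
The call above passes a string, which Transformers.js loads and decodes for you in the browser. In environments without the Web Audio API (typically Node.js), the audio usually has to be supplied as a `Float32Array` of mono samples at 16 kHz, the rate Whisper expects. A minimal sketch of that conversion from interleaved 16-bit PCM, assuming the input is already sampled at 16 kHz; the helper name `pcm16ToMonoFloat32` is hypothetical and not part of Transformers.js:

```javascript
// Hypothetical helper (not part of Transformers.js): convert interleaved
// 16-bit PCM samples into a mono Float32Array in the range [-1, 1).
// Assumes the audio is already sampled at 16 kHz; resampling is out of scope.
function pcm16ToMonoFloat32(samples, numChannels = 1) {
  if (!Number.isInteger(numChannels) || numChannels < 1) {
    throw new RangeError('numChannels must be a positive integer');
  }
  const frames = Math.floor(samples.length / numChannels);
  const mono = new Float32Array(frames);
  for (let i = 0; i < frames; i++) {
    let sum = 0;
    for (let c = 0; c < numChannels; c++) {
      sum += samples[i * numChannels + c];
    }
    // Average the channels, then scale from [-32768, 32767] to [-1, 1).
    mono[i] = sum / numChannels / 32768;
  }
  return mono;
}

// Two stereo frames: (L=16384, R=16384) and (L=-32768, R=0).
const demo = pcm16ToMonoFloat32(new Int16Array([16384, 16384, -32768, 0]), 2);
console.log(Array.from(demo)); // [ 0.5, -0.5 ]
```

The resulting array can be passed in place of the file name, for example `await transcriber(mono)`.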

Conversion Details

This model was converted using a custom conversion pipeline that:

  1. Downloads the original model from the Hugging Face Hub
  2. Exports it to ONNX format with KV caching
  3. Applies INT8 quantization to reduce file size
  4. Adds Whisper-specific alignment heads for timestamp support
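
Timestamp support shows up at inference time: when the Transformers.js pipeline is called with the `return_timestamps` option (`true` for segments, `'word'` for word-level), the result carries a `chunks` array of `{ text, timestamp: [start, end] }` entries alongside `text`. A small sketch of turning such a result into readable lines; the helper names and the sample object are hypothetical, written by hand for illustration rather than taken from this model's output:

```javascript
// Hypothetical helper: format a time in seconds as M:SS.mmm.
// A missing (null/undefined) timestamp is rendered as '?'.
function formatSeconds(seconds) {
  if (seconds == null) return '?';
  const totalMs = Math.round(seconds * 1000);
  const minutes = Math.floor(totalMs / 60000);
  const secs = Math.floor((totalMs % 60000) / 1000);
  const ms = totalMs % 1000;
  return `${minutes}:${String(secs).padStart(2, '0')}.${String(ms).padStart(3, '0')}`;
}

// Hypothetical helper: one line per chunk, "[start -> end] text".
function chunksToLines(result) {
  return (result.chunks ?? []).map(({ text, timestamp: [start, end] }) =>
    `[${formatSeconds(start)} -> ${formatSeconds(end)}] ${text.trim()}`);
}

// Hand-written sample in the shape described above (not real model output).
const sample = {
  text: ' Hello world. How are you?',
  chunks: [
    { text: ' Hello world.', timestamp: [0, 1.5] },
    { text: ' How are you?', timestamp: [1.5, 63.25] },
  ],
};
console.log(chunksToLines(sample).join('\n'));
// [0:00.000 -> 0:01.500] Hello world.
// [0:01.500 -> 1:03.250] How are you?
```

In real use, `sample` would be replaced by the value returned from something like `await transcriber(audio, { return_timestamps: true })`.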

The quantized models are approximately 4x smaller than the FP32 originals (~75MB versus ~300MB) while keeping transcription accuracy close to that of the original.
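
The 4x figure follows from storage width: an FP32 weight takes 4 bytes and an INT8 weight takes 1. The following simplified sketch of symmetric per-tensor INT8 quantization makes both the size ratio and the rounding error visible. It illustrates the general idea only; the exact quantization scheme used by the conversion pipeline is not specified here, and the function names are hypothetical:

```javascript
// Symmetric per-tensor quantization: map [-maxAbs, maxAbs] onto [-127, 127].
function quantizeInt8(weights) {
  let maxAbs = 0;
  for (const w of weights) maxAbs = Math.max(maxAbs, Math.abs(w));
  // Fall back to a scale of 1 for an all-zero tensor to avoid dividing by zero.
  const scale = maxAbs > 0 ? maxAbs / 127 : 1;
  const q = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    q[i] = Math.round(weights[i] / scale);
  }
  return { q, scale };
}

// Recover approximate FP32 values from the INT8 codes and the stored scale.
function dequantizeInt8({ q, scale }) {
  return Float32Array.from(q, (v) => v * scale);
}

const weights = new Float32Array([0.3, -1.2, 0.7, 0.0]);
const packed = quantizeInt8(weights);
const restored = dequantizeInt8(packed);

// 4 bytes per FP32 weight versus 1 byte per INT8 weight.
console.log(weights.byteLength / packed.q.byteLength); // 4

// Per-weight rounding error; each is bounded by roughly scale / 2.
const errors = Array.from(weights, (w, i) => Math.abs(w - restored[i]));
console.log(Math.max(...errors), packed.scale / 2);
```

Each weight is off by at most about half a quantization step, which is why a quantized model can stay close to the original in accuracy, though the actual impact depends on the model and the data.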