---
library_name: transformers.js
tags:
  - transformers.js
  - onnx
  - whisper
pipeline_tag: automatic-speech-recognition
---

# Whisper Base ONNX

This is an ONNX conversion of OpenAI's [whisper-base](https://huggingface.co/openai/whisper-base) model, optimized for use with [Transformers.js](https://huggingface.co/docs/transformers.js).

## Model Details

- **Model Type:** Whisper (Encoder-Decoder)
- **Task:** Automatic Speech Recognition
- **Format:** ONNX (INT8 Quantized)
- **Size:** ~75MB (quantized from ~300MB)

## Usage

```javascript
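import { pipeline } from '@huggingface/transformers';

// Load the ASR pipeline; the model files are downloaded and cached on first use.
const transcriber = await pipeline('automatic-speech-recognition', 'markusingvarsson/whisper-test');

// Transcribe an audio file (a URL or path that the runtime can fetch and decode).
const result = await transcriber('audio.wav');
console.log(result.text);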
```
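
For timestamps or longer recordings, the Transformers.js speech recognition pipeline accepts options such as `return_timestamps`, `chunk_length_s`, and `stride_length_s`. The following is a minimal sketch, assuming `transcriber` is the pipeline created above; `formatChunks` and `transcribeWithTimestamps` are hypothetical helper names, and the 30-second window and 5-second stride are illustrative values rather than settings tested with this model.

```javascript
// Formats the `chunks` array returned when `return_timestamps` is enabled.
// Each chunk is expected to look like { timestamp: [start, end], text }.
function formatChunks(chunks) {
  return chunks.map(({ timestamp: [start, end], text }) => {
    // The end time of the last chunk may be missing, so handle null defensively.
    const endLabel = end == null ? '?' : end.toFixed(2);
    return `[${start.toFixed(2)}s -> ${endLabel}s] ${text.trim()}`;
  });
}

// Hypothetical helper: `transcriber` is the pipeline from the Usage example.
async function transcribeWithTimestamps(transcriber, audio) {
  const result = await transcriber(audio, {
    return_timestamps: true, // use 'word' to request word-level timestamps
    chunk_length_s: 30,      // process long audio in 30-second windows (assumed value)
    stride_length_s: 5,      // overlap between neighbouring windows (assumed value)
  });
  return formatChunks(result.chunks ?? []);
}
```

Calling `await transcribeWithTimestamps(transcriber, 'audio.wav')` should then yield one formatted line per segment.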

## Conversion Details

This model was converted using a custom conversion pipeline that:
1. Downloads the original HuggingFace model
2. Exports to ONNX format with KV caching
3. Applies INT8 quantization for smaller size
4. Adds Whisper-specific alignment heads for timestamp support

The quantized model is approximately 4x smaller than the original FP32 export (~75MB versus ~300MB) while largely maintaining transcription accuracy.
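
To illustrate where the 4x figure comes from, here is a simplified sketch of symmetric per-tensor INT8 quantization. It is not the conversion pipeline used for this model (which is not shown here, and real exporters may use zero points or per-channel scales); `quantizeInt8` and `dequantize` are hypothetical names, and the weights are made-up values.

```javascript
// Symmetric per-tensor INT8 quantization (simplified illustration).
function quantizeInt8(weights) {
  let maxAbs = 0;
  for (const w of weights) maxAbs = Math.max(maxAbs, Math.abs(w));
  // Guard against an all-zero tensor, where any scale would do.
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;
  const q = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    // Round to the nearest integer step and clamp to the INT8 range.
    q[i] = Math.max(-127, Math.min(127, Math.round(weights[i] / scale)));
  }
  return { q, scale };
}

// Maps the stored integers back to approximate floating-point values.
function dequantize({ q, scale }) {
  return Float32Array.from(q, (v) => v * scale);
}

const weights = new Float32Array([0.5, -1.0, 0.25, 0.0]); // made-up example values
const quantized = quantizeInt8(weights);
const restored = dequantize(quantized);

// FP32 stores 4 bytes per value, INT8 stores 1 byte per value.
const sizeRatio = weights.byteLength / quantized.q.byteLength;
console.log(sizeRatio); // 4
```

Each restored value differs from the original by at most about half a quantization step (`scale / 2`), which is why accuracy usually degrades only slightly. For whisper-base, which has roughly 74M parameters, the same arithmetic gives about 74M × 4 bytes ≈ 296MB in FP32 versus about 74MB in INT8, in line with the sizes listed under Model Details.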