---
library_name: transformers.js
tags:
- transformers.js
- onnx
- whisper
pipeline_tag: automatic-speech-recognition
---

# Whisper Base ONNX

This is an ONNX conversion of OpenAI's [whisper-base](https://huggingface.co/openai/whisper-base) model, optimized for use with [Transformers.js](https://huggingface.co/docs/transformers.js).

## Model Details

- Model Type: Whisper (Encoder-Decoder)
- Task: Automatic Speech Recognition
- Format: ONNX (INT8 Quantized)
- Size: ~75 MB (quantized from ~300 MB)
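
The size figures above follow from simple arithmetic on parameter count and bytes per weight. The sketch below is a rough estimate only. It assumes the roughly 74 million parameters OpenAI lists for whisper-base, and it ignores file overhead and any tensors left unquantized. `estimateSizeMB` is a hypothetical helper, not part of Transformers.js.

```javascript
// Rough on-disk size estimate: parameter count times bytes per weight.
// Assumption: ~74M parameters, the figure OpenAI lists for whisper-base.
const WHISPER_BASE_PARAMS = 74e6;

// Hypothetical helper, for illustration only.
function estimateSizeMB(paramCount, bytesPerWeight) {
  return (paramCount * bytesPerWeight) / 1e6;
}

const fp32SizeMB = estimateSizeMB(WHISPER_BASE_PARAMS, 4); // 32-bit floats
const int8SizeMB = estimateSizeMB(WHISPER_BASE_PARAMS, 1); // 8-bit integers

console.log(`FP32: ~${fp32SizeMB} MB, INT8: ~${int8SizeMB} MB`);
// prints "FP32: ~296 MB, INT8: ~74 MB"
```

Both estimates land close to the ~300 MB and ~75 MB figures listed above.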

## Usage

```javascript
import { pipeline } from '@huggingface/transformers';

// Load the speech recognition pipeline with this model's weights.
const transcriber = await pipeline('automatic-speech-recognition', 'markusingvarsson/whisper-test');

// Transcribe an audio file and print the recognized text.
const result = await transcriber('audio.wav');
console.log(result.text);
```
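
The pipeline can also be given raw samples as a `Float32Array` instead of a file path or URL, which helps when the runtime cannot decode the audio itself. Whisper expects 16 kHz mono input. The sketch below shows one way to prepare such an array from interleaved 16-bit PCM. The helper names `pcm16ToFloat32` and `mixToMono` and the sample values are made up for illustration, and resampling to 16 kHz is not shown.

```javascript
// Convert signed 16-bit PCM samples to floats in [-1, 1).
function pcm16ToFloat32(pcm) {
  const out = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    out[i] = pcm[i] / 32768;
  }
  return out;
}

// Average interleaved channels down to a single mono channel.
function mixToMono(interleaved, channels) {
  const frames = Math.floor(interleaved.length / channels);
  const mono = new Float32Array(frames);
  for (let f = 0; f < frames; f++) {
    let sum = 0;
    for (let c = 0; c < channels; c++) {
      sum += interleaved[f * channels + c];
    }
    mono[f] = sum / channels;
  }
  return mono;
}

// Two stereo frames of made-up sample data (left, right, left, right).
const stereoPcm = new Int16Array([16384, -16384, 32767, 32767]);
const audio = mixToMono(pcm16ToFloat32(stereoPcm), 2);

console.log(audio.length); // 2
```

If the samples are already at 16 kHz, `audio` could then be passed to the pipeline in place of the file name, as in `await transcriber(audio)`.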

## Conversion Details

This model was converted using a custom conversion pipeline that:
1. Downloads the original Hugging Face model
2. Exports to ONNX format with KV caching
3. Applies INT8 quantization for smaller size
4. Adds Whisper-specific alignment heads for timestamp support

The quantized models are approximately 4x smaller than the original (~75 MB versus ~300 MB) while maintaining accuracy.
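
To illustrate where the roughly 4x reduction comes from, here is a minimal sketch of symmetric per-tensor INT8 quantization. It is a generic textbook scheme, not necessarily the one used by the conversion pipeline described above. The function names and weight values are made up for illustration.

```javascript
// Symmetric per-tensor INT8 quantization (illustrative only).
// Each float is stored as one signed byte plus a shared scale factor.
function quantizeInt8(weights) {
  let maxAbs = 0;
  for (const w of weights) {
    maxAbs = Math.max(maxAbs, Math.abs(w));
  }
  // Map the largest magnitude to 127; guard against an all-zero tensor.
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;
  const q = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    const rounded = Math.round(weights[i] / scale);
    q[i] = Math.max(-127, Math.min(127, rounded));
  }
  return { q, scale };
}

// Recover approximate float values from the quantized bytes.
function dequantizeInt8({ q, scale }) {
  return Float32Array.from(q, (v) => v * scale);
}

const weights = new Float32Array([0.5, -1.0, 0.25, 0.0]); // made-up values
const packed = quantizeInt8(weights);
const restored = dequantizeInt8(packed);

// Storage drops from 4 bytes per weight to 1 byte per weight.
console.log(packed.q.byteLength, weights.byteLength); // 4 16
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the rounding error this kind of scheme trades for the smaller size.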