Upload folder using huggingface_hub

Files changed (2)

README.md ADDED

+---
+library_name: transformers.js
+tags:
+  - transformers.js
+  - onnx
+  - whisper
+pipeline_tag: automatic-speech-recognition
+---
+# Whisper Base ONNX
+This is an ONNX conversion of OpenAI's [whisper-base](https://huggingface.co/openai/whisper-base) model, optimized for use with [Transformers.js](https://huggingface.co/docs/transformers.js).
+## Model Details
+- **Model Type:** Whisper (Encoder-Decoder)
+- **Task:** Automatic Speech Recognition
+- **Format:** ONNX (INT8 Quantized)
+- **Size:** ~75 MB (quantized from ~300 MB)
+## Usage
+```javascript
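+import { pipeline } from '@xenova/transformers';
+// Load the speech recognition pipeline; model files are fetched from the Hub on first use.
+const transcriber = await pipeline('automatic-speech-recognition', 'markusingvarsson/whisper-test');
+// The input can be a URL to an audio file or a Float32Array of 16 kHz mono samples.
+const result = await transcriber('audio.wav');
+console.log(result.text);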
+```
+## Conversion Details
+This model was converted using a custom conversion pipeline that:
+1. Downloads the original model from the Hugging Face Hub
+2. Exports to ONNX format with KV caching
+3. Applies INT8 quantization for smaller size
+4. Adds Whisper-specific alignment heads for timestamp support
+The quantized models are approximately 4x smaller than the original FP32 model (~75 MB vs. ~300 MB), with transcription accuracy intended to stay close to the original.
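
The roughly 4x size reduction stated in the README follows from weight storage alone: FP32 uses 4 bytes per weight and INT8 uses 1. A minimal sketch of that arithmetic, assuming about 74 million parameters for whisper-base (the figure from OpenAI's model card, not from this README) and ignoring overhead such as quantization scales and ONNX graph metadata. `estimateSizeMB` is a hypothetical helper, not part of any library:

```javascript
// Hypothetical helper: approximate on-disk size of a model's weights in MB.
// Ignores quantization scales/zero points and ONNX graph metadata.
function estimateSizeMB(paramCount, bytesPerWeight) {
  return (paramCount * bytesPerWeight) / 1e6;
}

// Assumption: whisper-base has roughly 74 million parameters.
const PARAMS = 74e6;

const fp32MB = estimateSizeMB(PARAMS, 4); // 32-bit floats: 4 bytes per weight
const int8MB = estimateSizeMB(PARAMS, 1); // 8-bit integers: 1 byte per weight

console.log(`FP32: ~${fp32MB} MB, INT8: ~${int8MB} MB, ratio: ${fp32MB / int8MB}x`);
```

Under these assumptions the estimates land close to the ~300 MB and ~75 MB figures in the README; real files will differ somewhat because not every tensor is necessarily quantized.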

onnx/decoder_model_merged.onnx CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fae23737d0ba6f3bd3d27658f3b6de17169014d11149cd033274e23c75a7e224
+oid sha256:9295b9fb09be07b70ac6c54a7cfc7bf172d88a40f735035b30fc5a0f374fa435
 size 79336381
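
The two pointer versions in this hunk differ only in `oid`; `size` stays at 79336381 bytes, so the file contents were replaced by different data of the same length. A small sketch of how such pointers could be compared programmatically, assuming the standard "key value" line layout of Git LFS pointer files. `parseLfsPointer` is a hypothetical helper written for illustration:

```javascript
// Hypothetical helper: parse a Git LFS pointer file ("key value" per line).
function parseLfsPointer(text) {
  const fields = {};
  for (const rawLine of text.trim().split('\n')) {
    const line = rawLine.trim();
    const space = line.indexOf(' ');
    if (space === -1) continue; // skip lines without a "key value" pair
    fields[line.slice(0, space)] = line.slice(space + 1).trim();
  }
  return { version: fields.version, oid: fields.oid, size: Number(fields.size) };
}

// Pointer contents before and after this commit, taken from the hunk.
const before = parseLfsPointer(`version https://git-lfs.github.com/spec/v1
oid sha256:fae23737d0ba6f3bd3d27658f3b6de17169014d11149cd033274e23c75a7e224
size 79336381`);

const after = parseLfsPointer(`version https://git-lfs.github.com/spec/v1
oid sha256:9295b9fb09be07b70ac6c54a7cfc7bf172d88a40f735035b30fc5a0f374fa435
size 79336381`);

const oidChanged = before.oid !== after.oid;
const sizeChanged = before.size !== after.size;
console.log({ oidChanged, sizeChanged });
```

An unchanged size with a changed hash is what one might expect from re-exporting the same architecture with the same quantization settings, though the diff alone does not say what changed inside the file.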