markusingvarsson committed on
Commit f073fd7 · verified · 1 Parent(s): 3639147

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +39 -0
  2. onnx/decoder_model_merged.onnx +1 -1
README.md ADDED
@@ -0,0 +1,39 @@
+ ---
+ library_name: transformers.js
+ tags:
+ - transformers.js
+ - onnx
+ - whisper
+ pipeline_tag: automatic-speech-recognition
+ ---
+
+ # Whisper Base ONNX
+
+ This is an ONNX conversion of OpenAI's [whisper-base](https://huggingface.co/openai/whisper-base) model, optimized for use with [Transformers.js](https://huggingface.co/docs/transformers.js).
+
+ ## Model Details
+
+ - **Model Type:** Whisper (Encoder-Decoder)
+ - **Task:** Automatic Speech Recognition
+ - **Format:** ONNX (INT8 Quantized)
+ - **Size:** ~75MB (quantized from ~300MB)
+
+ ## Usage
+
+ ```javascript
+ import { pipeline } from '@xenova/transformers';
+
+ const transcriber = await pipeline('automatic-speech-recognition', 'markusingvarsson/whisper-test');
+ const result = await transcriber('audio.wav');
+ console.log(result.text);
+ ```
+
+ ## Conversion Details
+
+ This model was converted using a custom conversion pipeline that:
+ 1. Downloads the original HuggingFace model
+ 2. Exports to ONNX format with KV caching
+ 3. Applies INT8 quantization for smaller size
+ 4. Adds Whisper-specific alignment heads for timestamp support
+
+ The quantized models are approximately 4x smaller than the original while maintaining accuracy.
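
The "approximately 4x smaller" figure in the README is consistent with storage width alone: a float32 weight occupies 4 bytes and an int8 weight occupies 1. The sketch below illustrates this with symmetric per-tensor quantization. That scheme is an assumption chosen for illustration, since the README does not say which scheme the conversion pipeline applies, and the helper names `quantizeInt8` and `dequantizeInt8` are hypothetical.

```javascript
// Illustrative only: symmetric per-tensor INT8 quantization of a weight array.
// The scheme actually used by the conversion pipeline is not stated in the README.
function quantizeInt8(weights) {
  // The largest magnitude maps to 127; guard against an all-zero tensor.
  let maxAbs = 0;
  for (const w of weights) maxAbs = Math.max(maxAbs, Math.abs(w));
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;

  const q = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    // Round to the nearest step and clamp to the symmetric int8 range.
    q[i] = Math.max(-127, Math.min(127, Math.round(weights[i] / scale)));
  }
  return { q, scale };
}

// Recover approximate float values from the int8 codes and the shared scale.
function dequantizeInt8({ q, scale }) {
  return Float32Array.from(q, (v) => v * scale);
}

const weights = new Float32Array([0.5, -1.0, 0.25, 0.0]);
const quantized = quantizeInt8(weights);
const restored = dequantizeInt8(quantized);

// Bytes before versus bytes after: 4 bytes per float32, 1 byte per int8.
const ratio = weights.byteLength / quantized.q.byteLength;
console.log(ratio, quantized.scale, Array.from(restored));
```

Each restored value differs from the original by at most about half a quantization step (`scale / 2`), which is the kind of error the "maintaining accuracy" claim depends on staying small; the README reports no measurements for it.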
onnx/decoder_model_merged.onnx CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:fae23737d0ba6f3bd3d27658f3b6de17169014d11149cd033274e23c75a7e224
+ oid sha256:9295b9fb09be07b70ac6c54a7cfc7bf172d88a40f735035b30fc5a0f374fa435
  size 79336381
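
Only the `oid` line changes in this pointer; `size` stays at 79336381 bytes, so the file was replaced by different content of identical length. A Git LFS `oid` is the SHA-256 of the file's bytes, so a downloaded copy can be checked against the new pointer. The following is a minimal Node.js sketch of that check; the helper names `parseLfsPointer` and `matchesPointer` are hypothetical and not part of any library.

```javascript
const { createHash } = require('node:crypto');

// Parse the "key value" lines of a Git LFS pointer file.
function parseLfsPointer(text) {
  const fields = {};
  for (const line of text.trim().split('\n')) {
    const idx = line.indexOf(' ');
    fields[line.slice(0, idx)] = line.slice(idx + 1).trim();
  }
  return {
    version: fields.version,
    oid: fields.oid.replace(/^sha256:/, ''), // keep only the hex digest
    size: Number(fields.size),
  };
}

// Check a file's bytes against the pointer's size and SHA-256 digest.
function matchesPointer(bytes, pointer) {
  const digest = createHash('sha256').update(bytes).digest('hex');
  return bytes.length === pointer.size && digest === pointer.oid;
}

// The new pointer from this commit.
const pointer = parseLfsPointer(`version https://git-lfs.github.com/spec/v1
oid sha256:9295b9fb09be07b70ac6c54a7cfc7bf172d88a40f735035b30fc5a0f374fa435
size 79336381`);
console.log(pointer.oid, pointer.size);
```

With a local copy of the model, `matchesPointer(fs.readFileSync('onnx/decoder_model_merged.onnx'), pointer)` should return `true` if the download corresponds to this commit, assuming the file was fetched through LFS and not as the pointer text itself.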