SimFonX/whisper-onnx-optimized

Automatic Speech Recognition

SimFonX committed on May 23, 2025

Commit 54d0c7f (verified) · 1 Parent(s): ef601af

Update README.md

Files changed (1)

README.md +7 -50

@@ -19,28 +19,20 @@ Optimized Whisper ONNX models packaged for easy deployment. Each zip contains al
 | Model | Language | Size | Target Use | Download |
 |-------|----------|------|------------|----------|
-| **Medium English** | English-only | ~486MB | High quality English transcription | [whisper-medium-en-onnx.zip](medium-en/whisper-medium-en-onnx.zip) |
-| **Small English** | English-only | ~85MB | Fast English transcription | [whisper-small-en-onnx.zip](small-en/whisper-small-en-onnx.zip) |
-| **Small Multilingual** | 99 languages | ~110MB | Fast multilingual transcription | [whisper-small-multilingual-onnx.zip](small-multilingual/whisper-small-multilingual-onnx.zip) |
-| **Medium Multilingual** | 99 languages | ~295MB | High quality multilingual | [whisper-medium-multilingual-onnx.zip](medium-multilingual/whisper-medium-multilingual-onnx.zip) |
-| **Large v3 Turbo** | 99 languages | ~530MB | Best quality, fastest large model | [whisper-large-v3-turbo-onnx.zip](large-v3-turbo/whisper-large-v3-turbo-onnx.zip) |
-## Size Comparison vs GGML Q5_0
-All models are **smaller** than equivalent GGML Q5_0 models:
-- Medium English: 486MB vs 515MB GGML ✅ (-29MB)
-- Small models: ~85-110MB vs 182MB GGML ✅ (-70-97MB)
-- Large v3 Turbo: 530MB vs 574MB GGML ✅ (-44MB)
 ## Contents of Each Zip
-Each zip file contains 7 files needed for inference:
 ### ONNX Model Files
 - `encoder_model_quantized.onnx` - Audio encoder (processes mel spectrograms)
-- `decoder_model_merged_quantized.onnx` - Text decoder (generates transcription)
-- `decoder_with_past_model_quantized.onnx` - Optimized decoder with KV caching
 ### Configuration Files
 - `config.json` - Model configuration
@@ -48,41 +40,6 @@ Each zip file contains 7 files needed for inference:
 - `preprocessor_config.json` - Audio preprocessing settings
 - `tokenizer.json` - Tokenizer vocabulary
-## Usage
-### C# with ONNX Runtime
-```csharp
-// Download and extract zip
-var modelPath = "path/to/extracted/model/";
-// Initialize with DirectML support
-var sessionOptions = new SessionOptions();
-sessionOptions.AppendExecutionProvider_DML(0);
-var encoderSession = new InferenceSession(
-    Path.Combine(modelPath, "encoder_model_quantized.onnx"), sessionOptions);
-var decoderSession = new InferenceSession(
-    Path.Combine(modelPath, "decoder_with_past_model_quantized.onnx"), sessionOptions);
-```
-### Python with ONNX Runtime
-```python
-import onnxruntime as ort
-# Load with DirectML/CUDA support
-providers = ['DmlExecutionProvider', 'CPUExecutionProvider']
-encoder_session = ort.InferenceSession('encoder_model_quantized.onnx', providers=providers)
-decoder_session = ort.InferenceSession('decoder_with_past_model_quantized.onnx', providers=providers)
-```
-## Features
-✅ **DirectML Support** - Works with any DirectX 12 GPU (AMD, Intel, NVIDIA)
-✅ **CUDA Support** - Accelerated inference on NVIDIA GPUs
-✅ **CPU Fallback** - Automatic fallback to CPU if GPU unavailable
-✅ **Quantized** - INT8/INT4 quantization for smaller size and faster inference
-✅ **Complete** - All files needed for inference included
 ## Model Sources
 These models are repackaged from:

 | Model | Language | Size | Target Use | Download |
 |-------|----------|------|------------|----------|
+| **Small English** | English-only | 107MB | Fast English transcription | [whisper-small-en-onnx.zip](small-en/whisper-small-en-onnx.zip) |
+| **Small Multilingual** | 99 languages | 245MB | Fast multilingual transcription | [whisper-small-multilingual-onnx.zip](small-multilingual/whisper-small-multilingual-onnx.zip) |
+| **Medium English** | English-only | 247MB | High quality English transcription | [whisper-medium-en-onnx.zip](medium-en/whisper-medium-en-onnx.zip) |
+| **Medium Multilingual** | 99 languages | 602MB | High quality multilingual | [whisper-medium-multilingual-onnx.zip](medium-multilingual/whisper-medium-multilingual-onnx.zip) |
+| **Large v3 Turbo** | 99 languages | 646MB | Best quality, fastest large model | [whisper-large-v3-turbo-onnx.zip](large-v3-turbo/whisper-large-v3-turbo-onnx.zip) |
 ## Contents of Each Zip
+Each zip file contains 6 files needed for inference:
 ### ONNX Model Files
 - `encoder_model_quantized.onnx` - Audio encoder (processes mel spectrograms)
+- `decoder_with_past_model_quantized.onnx` - Text decoder (generates transcription), optimized decoder with KV caching
 ### Configuration Files
 - `config.json` - Model configuration
 - `preprocessor_config.json` - Audio preprocessing settings
 - `tokenizer.json` - Tokenizer vocabulary
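
Since this commit reduces the per-zip file count from 7 to 6 and drops `decoder_model_merged_quantized.onnx`, a quick check of a downloaded archive may be useful. The following is a minimal sketch using only the Python standard library. The helper name `missing_files` and the `EXPECTED_FILES` tuple are hypothetical, and the tuple lists only the five file names visible in this diff; the sixth file counted by the README falls outside the shown hunk and is deliberately left out.

```python
import zipfile

# File names visible in the README diff. The README counts six files per zip,
# but the sixth is not shown in this diff, so it is intentionally not guessed.
EXPECTED_FILES = (
    "encoder_model_quantized.onnx",
    "decoder_with_past_model_quantized.onnx",
    "config.json",
    "preprocessor_config.json",
    "tokenizer.json",
)


def missing_files(zip_path, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Compare base names so archives with a top-level folder also pass.
        present = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return [name for name in expected if name not in present]
```

Calling `missing_files("whisper-small-en-onnx.zip")` on a downloaded archive should return an empty list if all five named files are present; any names it returns point to an incomplete or older package.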
 ## Model Sources
 These models are repackaged from: