v0.33.0

See https://github.com/quic/ai-hub-models/releases/v0.33.0 for changelog.

Files changed (5)

.gitattributes +1 -0
Whisper-Medium-En_WhisperDecoderInf.onnx → DEPLOYMENT_MODEL_LICENSE.pdf +2 -2
LICENSE +2 -0
README.md +4 -5
Whisper-Medium-En_WhisperEncoderInf.onnx +0 -3

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+DEPLOYMENT_MODEL_LICENSE.pdf filter=lfs diff=lfs merge=lfs -text

Whisper-Medium-En_WhisperDecoderInf.onnx → DEPLOYMENT_MODEL_LICENSE.pdf RENAMED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2bc4614f1f973a7f2b9ed0c4171d032761eac0cba67fc25d421f78cd8fa3275a
-size 1838005917
+oid sha256:4409f93b0e82531303b3e10f52f1fdfb56467a25f05b7441c6bbd8bb8a64b42c
+size 109629
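
Both sides of this hunk are Git LFS pointer files: three `key value` lines giving the spec version, the content hash, and the payload size in bytes. As a minimal sketch (the helper name `parse_lfs_pointer` is hypothetical and not part of any Git LFS tooling), the two pointers can be parsed and compared like this:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file (illustrative helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the removed and added sides of the hunk above.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:2bc4614f1f973a7f2b9ed0c4171d032761eac0cba67fc25d421f78cd8fa3275a
size 1838005917
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:4409f93b0e82531303b3e10f52f1fdfb56467a25f05b7441c6bbd8bb8a64b42c
size 109629
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
print(f"old payload: {old['size']} bytes, new payload: {new['size']} bytes")
```

The size drop from roughly 1.8 GB to roughly 110 KB suggests the "rename" reflects two unrelated payloads (an ONNX model and a PDF) rather than a moved file; that reading is an inference from the sizes, not something the commit states.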

LICENSE ADDED

@@ -0,0 +1,2 @@
+The license of the original trained model can be found at https://github.com/openai/whisper/blob/main/LICENSE.
+The license for the deployable model files (.tflite, .onnx, .dlc, .bin, etc.) can be found in DEPLOYMENT_MODEL_LICENSE.pdf.

README.md CHANGED

@@ -31,20 +31,19 @@ More details on model performance across various devices, can be found
   - Model checkpoint: medium.en
   - Input resolution: 80x3000 (30 seconds audio)
   - Mean decoded sequence length: 224 tokens
-  - Number of parameters: 769 M
-  - Model size (WhisperEncoder): 769 MB
-  - Model size (WhisperDecoder): 726 MB
+  - Number of parameters (WhisperEncoderInf): 358M
+  - Model size (WhisperEncoderInf) (float): 1.33 GB
+  - Number of parameters (WhisperDecoderInf): 406M
+  - Model size (WhisperDecoderInf) (float): 1.51 GB
 | Model | Precision | Device | Chipset | Target Runtime | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit | Target Model
 |---|---|---|---|---|---|---|---|---|
 | WhisperEncoderInf | float | SA8295P ADP | Qualcomm® SA8295P | TFLITE | 1969.856 ms | 201 - 251 MB | GPU | [Whisper-Medium-En.tflite](https://huggingface.co/qualcomm/Whisper-Medium-En/blob/main/Whisper-Medium-En.tflite) |
 | WhisperEncoderInf | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | TFLITE | 1720.841 ms | 60 - 308 MB | GPU | [Whisper-Medium-En.tflite](https://huggingface.co/qualcomm/Whisper-Medium-En/blob/main/Whisper-Medium-En.tflite) |
 | WhisperEncoderInf | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | TFLITE | 1509.053 ms | 229 - 275 MB | GPU | [Whisper-Medium-En.tflite](https://huggingface.co/qualcomm/Whisper-Medium-En/blob/main/Whisper-Medium-En.tflite) |
-| WhisperEncoderInf | float | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 1545.124 ms | 953 - 953 MB | NPU | [Whisper-Medium-En.onnx](https://huggingface.co/qualcomm/Whisper-Medium-En/blob/main/Whisper-Medium-En.onnx) |
 | WhisperDecoderInf | float | SA8295P ADP | Qualcomm® SA8295P | TFLITE | 92.152 ms | 42 - 1250 MB | NPU | [Whisper-Medium-En.tflite](https://huggingface.co/qualcomm/Whisper-Medium-En/blob/main/Whisper-Medium-En.tflite) |
 | WhisperDecoderInf | float | Samsung Galaxy S24 | Snapdragon® 8 Gen 3 Mobile | TFLITE | 91.218 ms | 42 - 1597 MB | NPU | [Whisper-Medium-En.tflite](https://huggingface.co/qualcomm/Whisper-Medium-En/blob/main/Whisper-Medium-En.tflite) |
 | WhisperDecoderInf | float | Snapdragon 8 Elite QRD | Snapdragon® 8 Elite Mobile | TFLITE | 80.416 ms | 43 - 1382 MB | NPU | [Whisper-Medium-En.tflite](https://huggingface.co/qualcomm/Whisper-Medium-En/blob/main/Whisper-Medium-En.tflite) |
-| WhisperDecoderInf | float | Snapdragon X Elite CRD | Snapdragon® X Elite | ONNX | 66.789 ms | 566 - 566 MB | NPU | [Whisper-Medium-En.onnx](https://huggingface.co/qualcomm/Whisper-Medium-En/blob/main/Whisper-Medium-En.onnx) |
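
The new per-component figures in the README hunk can be sanity-checked with simple arithmetic. The sketch below assumes, without confirmation from the source, that "float" means 4-byte float32 weights and that "GB" is reported in units of 2^30 bytes; the function name is illustrative only.

```python
BYTES_PER_FLOAT32 = 4   # assumption: "float" precision means 32-bit weights
BYTES_PER_GIB = 2**30   # assumption: sizes are reported in binary gigabytes


def float32_size_gib(num_params: int) -> float:
    """Estimate the weight storage for num_params float32 values, in GiB."""
    return num_params * BYTES_PER_FLOAT32 / BYTES_PER_GIB


# Parameter counts taken from the added README lines (358M and 406M).
encoder_gib = float32_size_gib(358_000_000)
decoder_gib = float32_size_gib(406_000_000)

print(f"WhisperEncoderInf: {encoder_gib:.2f} GiB")
print(f"WhisperDecoderInf: {decoder_gib:.2f} GiB")
```

Under those assumptions the estimates round to the 1.33 GB and 1.51 GB values listed in the updated README.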

Whisper-Medium-En_WhisperEncoderInf.onnx DELETED

@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:b86b70e4b238462b25d7c48f68d401ed6e32d6dc72d849019b3b9e09dbfcf2b8
-size 1430779879