publish whisper-medium OpenASR packs

Files changed (5)

.gitattributes +1 -0
README.md +113 -0
whisper-medium-fp16.oasr +3 -0
whisper-medium-q4_k.oasr +3 -0
whisper-medium-q8_0.oasr +3 -0

.gitattributes ADDED

	@@ -0,0 +1 @@


+*.oasr filter=lfs diff=lfs merge=lfs -text
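
The single rule in this hunk routes every `*.oasr` file through Git LFS. As a quick sanity check in a clone, a sketch like the following asks git which `filter` attribute applies to a path. `is_lfs_tracked` is a hypothetical helper name; `git check-attr` is the only real command used, and it has to run inside a git working tree.

```bash
#!/usr/bin/env bash
# Sketch only: is_lfs_tracked is a hypothetical helper, not part of git or OpenASR.
# It succeeds when the .gitattributes rules assign filter=lfs to the given path.
is_lfs_tracked() {
  local path="$1"
  # `git check-attr filter -- <path>` prints "<path>: filter: <value>";
  # the last ": "-separated field is the attribute value.
  [ "$(git check-attr filter -- "$path" | awk -F': ' '{ print $NF }')" = "lfs" ]
}
```

For example, `is_lfs_tracked whisper-medium-q8_0.oasr && echo tracked` should print `tracked` in a checkout that contains this `.gitattributes`, while a path such as `README.md` should fail the check.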

README.md ADDED

	@@ -0,0 +1,113 @@

+---
+license: apache-2.0
+base_model: openai/whisper-medium
+pipeline_tag: automatic-speech-recognition
+library_name: openasr
+tags:
+  - automatic-speech-recognition
+  - speech-to-text
+  - openasr
+  - oasr
+  - whisper-medium
+---
+<div align="center">
+# Whisper Medium · OpenASR
+**High-accuracy multilingual Whisper at 769M parameters**
+[![License](https://img.shields.io/badge/license-Apache--2.0-2563eb.svg)](https://huggingface.co/openai/whisper-medium/blob/main/README.md)
+[![Format](https://img.shields.io/badge/format-.oasr-7c3aed.svg)](https://github.com/QuintinShaw/openasr)
+[![Runtime](https://img.shields.io/badge/runtime-OpenASR-111827.svg)](https://openasr.org)
+[![Base model](https://img.shields.io/badge/base-whisper--medium-f59e0b.svg)](https://huggingface.co/openai/whisper-medium)
+Native speech-to-text in the **[OpenASR](https://github.com/QuintinShaw/openasr)** runtime —
+engineered for peak performance on CPU & GPU, **no Python at inference time**.
+</div>
+---
+## ✨ Highlights
+- 🎧 **Multilingual ASR** — transcribes many languages and can translate speech to English
+- 🎯 **769M parameters** — near-large accuracy with a more manageable footprint
+- 🌐 **Weak-supervision scale** — trained with Whisper's 680k-hour labelled speech corpus
+- 🦀 **Native in OpenASR** — `.oasr` packs run with no Python at inference, engineered for peak performance on CPU & GPU
+## 🚀 Quickstart
+```bash
+# 1. Install the OpenASR CLI  ·  https://openasr.org
+# 2. Pull a build (pick a quant — see the table below)
+openasr pull whisper-medium:q8
+# 3. Transcribe
+openasr transcribe audio.wav --model whisper-medium
+```
+All builds for this model:
+```bash
+openasr pull whisper-medium:fp16
+openasr pull whisper-medium:q8
+openasr pull whisper-medium:q4
+```
+## 📦 Available builds
+| Quant | File (`.oasr`) | Size | RAM peak | RTF · M1 CPU | RTF · M1 GPU | JFK ΔWER vs fp16 |
+|:------|:---------------|-----:|---------:|-------------:|-------------:|-----------------:|
+| fp16 | `whisper-medium-fp16.oasr` | 1.53 GB | 4.03 GB | 0.62× | 0.61× | 0.0% |
+| q8_0 | `whisper-medium-q8_0.oasr` | 874 MB | 2.17 GB | 0.46× | 0.41× | 0.0% |
+| q4_k | `whisper-medium-q4_k.oasr` | 522 MB | 1.54 GB | 0.51× | 0.39× | 0.0% |
+<sub>RTF = real-time factor (processing time divided by audio duration) on the fixed 11 s JFK clip
+(**lower is faster**); RAM peak is measured per pack in an isolated subprocess. JFK ΔWER compares each
+quantized build's JFK transcript to this model's fp16 JFK transcript, so it measures quantization
+drift rather than absolute recognition accuracy. **q8_0** is the recommended default: near-reference
+quality at roughly 57% of the fp16 file size and 54% of its peak RAM.</sub>
+## 🧠 About Whisper Medium
+Whisper Medium is OpenAI's 769M-parameter multilingual Whisper checkpoint. It uses the standard
+Whisper encoder-decoder architecture for automatic speech recognition and speech translation,
+trained with large-scale weak supervision on 680k hours of labelled speech. Medium delivers
+much of the large model's accuracy at a smaller footprint, which makes it a strong choice when
+quality matters but the largest checkpoint is too heavy. This OpenASR repo repackages the original
+`openai/whisper-medium` weights as `.oasr` packs that run natively in the OpenASR runtime with
+no Python at inference time. For most users the q8_0 build is the recommended default; q4_k is
+for tighter memory budgets and fp16 is for verification or maximum fidelity.
+## ⚙️ How these packs were made
+Converted from [openai/whisper-medium](https://huggingface.co/openai/whisper-medium) with the OpenASR importer:
+```bash
+# {fp16,q8-0,q4-k} lists the choices: run once per build with a single value.
+openasr model-pack import-whisper-local <src> <out>.oasr \
+  --package-id whisper-medium --quantization {fp16,q8-0,q4-k}
+```
+The `.oasr` container is GGUF-backed; packs use zero-copy mmap weight binding and graph
+buffer reuse to keep peak memory low.
+## ⚖️ License
+These packs **inherit the upstream model's license: Apache-2.0**
+([source](https://huggingface.co/openai/whisper-medium/blob/main/README.md)). OpenASR packaging retains the upstream copyright and
+NOTICE; the only modifications are format conversion and quantization.
+## 🙏 Acknowledgements
+These packs are a redistribution of **Whisper Medium**, released by **OpenAI**
+([openai/whisper-medium](https://huggingface.co/openai/whisper-medium)).
+All credit for the original model, training recipe, and weights belongs to OpenAI. The
+upstream Hugging Face model card declares Apache-2.0 licensing; OpenASR only converts the
+weights into `.oasr` packages and adds quantized builds for local runtime use.
+## 🔗 Links
+- 🦀 **OpenASR** — <https://github.com/QuintinShaw/openasr>
+- 🌐 **Website** — <https://openasr.org>
+- 🤗 **Upstream model** — [openai/whisper-medium](https://huggingface.co/openai/whisper-medium)
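
Assuming RTF in the README's builds table is the usual ratio of processing time to audio duration (the "lower is faster" note is consistent with that reading), the listed values convert to wall-clock seconds for the 11 s JFK clip. The sketch below does that arithmetic with `awk`; `rtf_to_seconds` is an illustrative helper name, not an OpenASR command.

```bash
#!/usr/bin/env bash
# Hypothetical helper: estimated processing time = RTF * clip duration (seconds).
rtf_to_seconds() {
  awk -v rtf="$1" -v dur="$2" 'BEGIN { printf "%.2f\n", rtf * dur }'
}

# RTF values copied from the README table (M1 CPU and M1 GPU columns).
while read -r quant cpu gpu; do
  printf '%-5s cpu=%ss gpu=%ss\n' "$quant" \
    "$(rtf_to_seconds "$cpu" 11)" "$(rtf_to_seconds "$gpu" 11)"
done <<'EOF'
fp16 0.62 0.61
q8_0 0.46 0.41
q4_k 0.51 0.39
EOF
```

On these figures q8_0 needs about 5.06 s of CPU time for the 11 s clip. q4_k is the smallest file but comes out slower than q8_0 on CPU (about 5.61 s), which matches the README's choice of q8_0 as the default.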

whisper-medium-fp16.oasr ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:08a80860d71f72728ad9676f3f3e7ef45d460c9d38a4eb11199607ed374da200
+size 1534887520
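
Each pointer records the SHA-256 digest and byte size of the real pack, so a downloaded `.oasr` file can be checked against it. The following is a minimal sketch, assuming the pointer has been saved as a plain file (without the diff's leading `+`) and that GNU `sha256sum` is available (on macOS, `shasum -a 256` would be the substitute). `verify_lfs_object` is a hypothetical helper, not a git-lfs command.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compare a file against a Git LFS pointer.
# Usage: verify_lfs_object <pointer-file> <downloaded-file>
verify_lfs_object() {
  local pointer="$1" file="$2"
  local want_oid want_size got_oid got_size
  # Pointer lines look like "oid sha256:<hex>" and "size <bytes>".
  want_oid="$(awk '$1 == "oid" { sub(/^sha256:/, "", $2); print $2 }' "$pointer")"
  want_size="$(awk '$1 == "size" { print $2 }' "$pointer")"
  got_oid="$(sha256sum "$file" | awk '{ print $1 }')"
  got_size="$(wc -c < "$file" | tr -d ' ')"
  # Succeed only when both the digest and the byte count match.
  [ "$want_oid" = "$got_oid" ] && [ "$want_size" = "$got_size" ]
}
```

A call such as `verify_lfs_object fp16.pointer whisper-medium-fp16.oasr && echo ok` (file names here are placeholders) returns success only when both the digest and the size agree with the pointer.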

whisper-medium-q4_k.oasr ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:180a31a5e134b7d7e1cfc21c8859000f644f14e8b41d52142ff757fe72e63390
+size 521963104

whisper-medium-q8_0.oasr ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5e663d322bcaa5743c3e4b3dac680f0b6c79f87edb9d7f1b9147a09329278c37
+size 874284640
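
The `size` fields in the three pointers give exact byte counts, which makes it easy to see how much each quantized pack shrinks relative to fp16. A small sketch of that arithmetic follows; `pct_of` is an illustrative helper name, and the byte counts are copied from the pointers above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print part as a percentage of whole, to one decimal place.
pct_of() {
  awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.1f\n", 100 * part / whole }'
}

# Byte sizes taken from the Git LFS pointer files in this commit.
fp16=1534887520
q8_0=874284640
q4_k=521963104

echo "q8_0: $(pct_of "$q8_0" "$fp16")% of fp16"
echo "q4_k: $(pct_of "$q4_k" "$fp16")% of fp16"
```

This works out to about 57.0% for q8_0 and 34.0% for q4_k, consistent with the 874 MB and 522 MB figures in the README table.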