publish whisper-large-v3 OpenASR packs


Files changed (5)

.gitattributes +1 -0
README.md +114 -0
whisper-large-v3-fp16.oasr +3 -0
whisper-large-v3-q4_k.oasr +3 -0
whisper-large-v3-q8_0.oasr +3 -0

.gitattributes ADDED

	@@ -0,0 +1 @@


+*.oasr filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,114 @@

+---
+license: apache-2.0
+base_model: openai/whisper-large-v3
+pipeline_tag: automatic-speech-recognition
+library_name: openasr
+tags:
+  - automatic-speech-recognition
+  - speech-to-text
+  - openasr
+  - oasr
+  - whisper-large-v3
+---
+<div align="center">
+# Whisper Large v3 · OpenASR
+**OpenAI's most accurate Whisper, the v3 large checkpoint**
+[![License](https://img.shields.io/badge/license-Apache--2.0-2563eb.svg)](https://huggingface.co/openai/whisper-large-v3/blob/main/README.md)
+[![Format](https://img.shields.io/badge/format-.oasr-7c3aed.svg)](https://github.com/QuintinShaw/openasr)
+[![Runtime](https://img.shields.io/badge/runtime-OpenASR-111827.svg)](https://openasr.org)
+[![Base model](https://img.shields.io/badge/base-whisper--large--v3-f59e0b.svg)](https://huggingface.co/openai/whisper-large-v3)
+Native speech-to-text in the **[OpenASR](https://github.com/QuintinShaw/openasr)** runtime —
+engineered for peak performance on CPU & GPU, **no Python at inference time**.
+</div>
+---
+## ✨ Highlights
+- 🎧 **Multilingual ASR** — transcribes a wide range of languages and can translate speech to English
+- 🏆 **1.55B parameters** — the full-size Whisper, OpenAI's highest-accuracy checkpoint
+- 🔁 **v3 improvements** — trained on a larger, more diverse corpus and uses 128 mel bins (up from 80) for better robustness
+- 🦀 **Native in OpenASR** — `.oasr` packs run with no Python at inference, engineered for peak performance on CPU & GPU
+## 🚀 Quickstart
+```bash
+# 1. Install the OpenASR CLI  ·  https://openasr.org
+# 2. Pull a build (pick a quant — see the table below)
+openasr pull whisper-large-v3:q8
+# 3. Transcribe
+openasr transcribe audio.wav --model whisper-large-v3
+```
+All builds for this model:
+```bash
+openasr pull whisper-large-v3:fp16
+openasr pull whisper-large-v3:q8
+openasr pull whisper-large-v3:q4
+```
+## 📦 Available builds
+| Quant | File (`.oasr`) | Size | RAM peak | RTF · M1 CPU | RTF · M1 GPU | JFK ΔWER vs fp16 |
+|:------|:---------------|-----:|---------:|-------------:|-------------:|-----------------:|
+| fp16 | `whisper-large-v3-fp16.oasr` | 3.09 GB | 4.70 GB | 1.17× | 1.13× | 0.0% |
+| q8_0 | `whisper-large-v3-q8_0.oasr` | 1.71 GB | 4.05 GB | 0.65× | 0.46× | 0.0% |
+| q4_k | `whisper-large-v3-q4_k.oasr` | 978 MB | 2.46 GB | 0.61× | 0.49× | 0.0% |
+<sub>RTF = real-time factor (processing time ÷ audio duration) on the fixed 11 s JFK clip (**lower is faster**); RAM peak measured per pack
+in an isolated subprocess. JFK ΔWER compares each quantized build's JFK transcript to this model's
+fp16 JFK transcript, so it measures quantization drift rather than absolute recognition accuracy.
+**q8_0** is the recommended default — near-reference quality at a fraction of the
+footprint.</sub>
+## 🧠 About Whisper Large v3
+Whisper Large v3 is OpenAI's 1.55B-parameter multilingual Whisper checkpoint, the most accurate
+member of the family. It uses the standard Whisper encoder-decoder architecture for automatic
+speech recognition and speech translation; v3 was trained on a larger and more diverse labelled
+corpus and uses 128 mel-frequency bins, improving robustness across languages and conditions
+over earlier large checkpoints. This OpenASR repo repackages the original
+`openai/whisper-large-v3` weights as `.oasr` packs that run natively in the OpenASR runtime with
+no Python at inference time. For most users the q8_0 build is the recommended default; q4_k is
+for tighter memory budgets and fp16 is for verification or maximum fidelity. For a faster
+large-grade option, see `whisper-large-v3-turbo`, a pruned and fine-tuned variant with a much smaller decoder.
+## ⚙️ How these packs were made
+Converted from [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) with the OpenASR importer:
+```bash
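+# <src> and <out> are placeholders; the braces list the quantization values used for the three packs.
+openasr model-pack import-whisper-local <src> <out>.oasr \
+  --package-id whisper-large-v3 --quantization {fp16,q8-0,q4-k}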
+```
+The `.oasr` container is GGUF-backed; packs use zero-copy mmap weight binding and graph
+buffer reuse to keep peak memory low.
+## ⚖️ License
+These packs **inherit the upstream model's license: Apache-2.0**
+([source](https://huggingface.co/openai/whisper-large-v3/blob/main/README.md)). OpenASR packaging retains the upstream copyright and
+NOTICE; the only modifications are format conversion and quantization.
+## 🙏 Acknowledgements
+These packs are a redistribution of **Whisper Large v3**, released by **OpenAI**
+([openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)).
+All credit for the original model, training recipe, and weights belongs to OpenAI. The
+upstream Hugging Face model card declares Apache-2.0 licensing; OpenASR only converts the
+weights into `.oasr` packages and adds quantized builds for local runtime use.
+## 🔗 Links
+- 🦀 **OpenASR** — <https://github.com/QuintinShaw/openasr>
+- 🌐 **Website** — <https://openasr.org>
+- 🤗 **Upstream model** — [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3)
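The RTF columns in the README's build table can be read with a small calculation. The sketch below assumes the conventional definition (processing time divided by audio duration); the `rtf` helper name and the sample timing are hypothetical and are not part of the OpenASR CLI or of the published measurements.

```shell
# rtf <elapsed-seconds> <audio-seconds>
# Prints the real-time factor to two decimals; values below 1.00 mean the
# run finished faster than the audio plays back.
# Hypothetical helper for illustration only.
rtf() {
  awk -v e="$1" -v a="$2" 'BEGIN {
    if (a <= 0) exit 1          # refuse a zero or negative clip length
    printf "%.2f\n", e / a
  }'
}

# Example: a hypothetical 7.15 s run over an 11 s clip.
rtf 7.15 11
```

Under this definition, the fp16 build's 1.17× on the M1 CPU would mean transcription takes slightly longer than the clip itself, while the quantized builds run faster than real time.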

whisper-large-v3-fp16.oasr ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8a0052daa836ecc10cd5e71f3dffce21af001b47a624501e5ef08d536a732598
+size 3088750656

whisper-large-v3-q4_k.oasr ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2585fb2c9f5506266a88963a80eddb07a759d334416fac851b61a40b0a18df71
+size 978491456

whisper-large-v3-q8_0.oasr ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2ea5a9cf974b0524ee570feb50c96d2b754b1108312a880f7778bf55caf49327
+size 1712494656
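Each pointer above records the SHA-256 digest (`oid`) and byte count (`size`) of the real pack, so a downloaded `.oasr` file can be checked against them. The following is a minimal sketch of that check; `verify_lfs_pointer` is a hypothetical helper, not a Git LFS or OpenASR command, and it expects a pointer file as stored in the repository (without the leading `+` shown in this diff view).

```shell
# verify_lfs_pointer <pointer-file> <blob-file>
# Returns 0 when the blob's SHA-256 and size match the pointer, 1 otherwise.
# Hypothetical helper for illustration only.
verify_lfs_pointer() {
  local pointer="$1" blob="$2"
  local want_oid want_size got_oid got_size

  # Pointer lines look like "oid sha256:<hex>" and "size <bytes>".
  want_oid=$(sed -n 's/^oid sha256://p' "$pointer")
  want_size=$(sed -n 's/^size //p' "$pointer")

  # sha256sum ships with GNU coreutils; fall back to shasum where it is missing.
  if command -v sha256sum >/dev/null 2>&1; then
    got_oid=$(sha256sum "$blob" | cut -d' ' -f1)
  else
    got_oid=$(shasum -a 256 "$blob" | cut -d' ' -f1)
  fi

  # wc pads its output on some platforms, so strip the spaces.
  got_size=$(wc -c < "$blob" | tr -d ' ')

  [ "$got_oid" = "$want_oid" ] && [ "$got_size" = "$want_size" ]
}
```

A call such as `verify_lfs_pointer q8_0.pointer whisper-large-v3-q8_0.oasr && echo ok` would then confirm the download, where `q8_0.pointer` is an assumed local copy of the pointer text.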