---
language:
- en
- zh
- de
- es
- ru
- ko
- fr
- ja
- pt
- tr
- pl
- ca
- nl
- ar
- sv
- it
- id
- hi
- fi
- vi
- he
- uk
- el
- ms
- cs
- ro
- da
- hu
- ta
- 'no'
- th
- ur
- hr
- bg
- lt
- la
- mi
- ml
- cy
- sk
- te
- fa
- lv
- bn
- sr
- az
- sl
- kn
- et
- mk
- br
- eu
- is
- hy
- ne
- mn
- bs
- kk
- sq
- sw
- gl
- mr
- pa
- si
- km
- sn
- yo
- so
- af
- oc
- ka
- be
- tg
- sd
- gu
- am
- yi
- lo
- uz
- fo
- ht
- ps
- tk
- nn
- mt
- sa
- lb
- my
- bo
- tl
- mg
- as
- tt
- haw
- ln
- ha
- ba
- jw
- su
tags:
- audio
- automatic-speech-recognition
- hf-asr-leaderboard
- open4bits
widget:
- example_title: Librispeech sample 1
  src: https://cdn-media.huggingface.co/speech_samples/sample1.flac
- example_title: Librispeech sample 2
  src: https://cdn-media.huggingface.co/speech_samples/sample2.flac
model-index:
- name: whisper-base
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: LibriSpeech (clean)
      type: librispeech_asr
      config: clean
      split: test
      args:
        language: en
    metrics:
    - name: Test WER
      type: wer
      value: 5.008769117619326
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: LibriSpeech (other)
      type: librispeech_asr
      config: other
      split: test
      args:
        language: en
    metrics:
    - name: Test WER
      type: wer
      value: 12.84936273212057
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: Common Voice 11.0
      type: mozilla-foundation/common_voice_11_0
      config: hi
      split: test
      args:
        language: hi
    metrics:
    - name: Test WER
      type: wer
      value: 131
pipeline_tag: automatic-speech-recognition
license: apache-2.0
base_model:
- openai/whisper-base
---

# Open4bits / Whisper Base FP16

This repository provides the **Whisper Base model converted to FP16 (float16) precision**, published by Open4bits to reduce memory use at inference time while keeping transcription quality close to that of the original weights.

The underlying Whisper model and architecture are **owned by OpenAI**. This repository contains only a precision-converted version of the original model weights.

The model is designed for multilingual speech-to-text tasks and can be used in research, experimentation, and production ASR pipelines.

---

## Model Overview

Whisper is a sequence-to-sequence transformer model developed by OpenAI for automatic speech recognition and speech translation.  
This release uses the **Base** variant and preserves the original architecture while reducing memory usage through FP16 precision.

---

## Model Details

- **Architecture:** Whisper Base  
- **Parameters:** ~74 million  
- **Precision:** float16 (FP16)  
- **Task:** Automatic Speech Recognition (ASR)  
- **Languages:** Multilingual  
- **Weight tying:** Preserved  
- **Compatibility:** Hugging Face Transformers, PyTorch  

Storing the weights in FP16 halves their memory footprint relative to FP32 and can speed up inference on GPUs with native half-precision support, which makes the model practical to deploy on both consumer and server-grade GPUs.
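
As a rough illustration of the memory saving, the sketch below estimates the size of the weights alone from the approximate parameter count listed above. It ignores activations, decoding caches, and framework overhead, so real VRAM use will be higher; the helper name `weight_megabytes` is introduced here only for illustration.

```python
# Rough weight-memory estimate; activations and framework overhead are ignored.
PARAM_COUNT = 74_000_000  # approximate figure from the model details above

# Storage cost per parameter for each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_megabytes(param_count: int, dtype: str) -> float:
    """Return the approximate size of the weights in megabytes (10^6 bytes)."""
    return param_count * BYTES_PER_PARAM[dtype] / 1e6


fp32_mb = weight_megabytes(PARAM_COUNT, "float32")
fp16_mb = weight_megabytes(PARAM_COUNT, "float16")
print(f"FP32 weights: ~{fp32_mb:.0f} MB")
print(f"FP16 weights: ~{fp16_mb:.0f} MB")
```

With roughly 74 million parameters this gives about 296 MB in FP32 and about 148 MB in FP16.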

---

## Intended Use

This model is intended for:
- Speech-to-text transcription
- Multilingual ASR applications
- Research and benchmarking
- Efficient inference in low-memory environments
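
A minimal transcription sketch using the Hugging Face Transformers `pipeline` API is shown below. The repository id `Open4bits/whisper-base-fp16` is a placeholder assumed for illustration and should be replaced with this repository's actual id. The helper `select_runtime` is also hypothetical; it falls back to FP32 on CPU on the assumption that half precision is mainly beneficial on GPU.

```python
# Placeholder repo id (assumption); replace with this repository's actual id.
MODEL_ID = "Open4bits/whisper-base-fp16"


def select_runtime(cuda_available: bool) -> tuple[str, str]:
    """Pick a device and dtype name: FP16 on GPU, FP32 fallback on CPU."""
    return ("cuda:0", "float16") if cuda_available else ("cpu", "float32")


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with the Transformers ASR pipeline."""
    # Imported lazily so select_runtime can be used without torch installed.
    import torch
    from transformers import pipeline

    device, dtype_name = select_runtime(torch.cuda.is_available())
    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        torch_dtype=getattr(torch, dtype_name),
        device=device,
    )
    # chunk_length_s lets the pipeline process long recordings in 30-second chunks.
    result = asr(audio_path, chunk_length_s=30)
    return result["text"]


# Example call (requires torch, transformers, and a local audio file):
# print(transcribe("sample1.flac"))
```

Passing a file path to the pipeline relies on `ffmpeg` being available for audio decoding. Recent Transformers releases may prefer a `dtype` argument over `torch_dtype`, so adjust the keyword to match the installed version.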

---

## Limitations

- Performance depends on audio quality, language, and accent
- Inherits known limitations of the Whisper Base architecture
- Not fine-tuned for domain-specific or highly noisy audio
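
The card metadata reports word error rate (WER) per dataset, including a value above 100 for Common Voice 11.0 Hindi. WER can exceed 100 because inserted words count as errors against the reference length. The sketch below is a plain word-level edit-distance implementation for illustration only; it applies no text normalization, so it is not necessarily how the reported figures were computed.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length, in percent."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the processed ref prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


print(word_error_rate("the cat sat", "the cat sat"))        # 0.0
print(word_error_rate("hello", "hello hello hello there"))  # 300.0
```

The second call shows a one-word reference with three spurious insertions, giving a WER of 300.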

---

## License

This model is released under the **Apache License 2.0**.
The original Whisper model and associated intellectual property are owned by OpenAI.

---

## Support

If you find this model useful, please consider supporting the project by giving this repository a heart.
Your support helps us continue releasing and maintaining high-quality open models.