Jacaranda-Health/ASR-STT-8bit

Automatic Speech Recognition

8-bit precision


eolang committed on Aug 7, 2025

Commit 3e22be3 · verified · 1 Parent(s): 7ba8c82

Upload README (1).md

Files changed (1)

README (1).md ADDED +80 -0

	@@ -0,0 +1,80 @@

+---
+license: apache-2.0
+base_model: Jacaranda-Health/ASR-STT
+tags:
+- speech-to-text
+- automatic-speech-recognition
+- quantized
+- 8bit
+language:
+- en
+pipeline_tag: automatic-speech-recognition
+---
+# ASR-STT 8BIT Quantized
+This is an 8-bit quantized version of [Jacaranda-Health/ASR-STT](https://huggingface.co/Jacaranda-Health/ASR-STT).
+## Model Details
+- **Base Model**: Jacaranda-Health/ASR-STT
+- **Quantization**: 8bit
+- **Size Reduction**: 73.1% smaller than original
+- **Original Size**: 2913.89 MB
+- **Quantized Size**: 784.94 MB
+## Usage
+```python
+from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, BitsAndBytesConfig
+import torch
+import librosa
+# Load processor
+processor = AutoProcessor.from_pretrained("eolang/ASR-STT-8bit")
+# Configure quantization
+quantization_config = BitsAndBytesConfig(
+    load_in_8bit=True,
+    llm_int8_threshold=6.0,
+    llm_int8_has_fp16_weight=False
+)
+# Load quantized model
+model = AutoModelForSpeechSeq2Seq.from_pretrained(
+    "eolang/ASR-STT-8bit",
+    quantization_config=quantization_config,
+    device_map="auto"
+)
+# Transcription function
+def transcribe(filepath):
+    audio, sr = librosa.load(filepath, sr=16000)
+    inputs = processor(audio, sampling_rate=sr, return_tensors="pt")
+    # Move inputs to the model's device; cast only floating-point tensors to half precision
+    inputs = {
+        k: (v.to(model.device).half() if torch.is_floating_point(v) else v.to(model.device))
+        for k, v in inputs.items()
+    }
+    with torch.no_grad():
+        generated_ids = model.generate(inputs["input_features"])
+    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
+# Example usage
+transcription = transcribe("path/to/audio.wav")
+print(transcription)
+```
+## Performance
+- Faster inference due to reduced precision
+- Lower memory usage (weights shrink from 2913.89 MB to 784.94 MB)
+- Maintained transcription quality
+## Requirements
+- transformers
+- torch
+- bitsandbytes
+- librosa
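
Before running the usage snippet in the README above, it can help to confirm that the four listed requirements are importable. The following is a minimal sketch, not part of the committed README: the helper name `missing_packages` is hypothetical, and it assumes each package's import name matches the name given in the Requirements list.

```python
import importlib.util

# Import names taken from the Requirements list in the README above
# (assumed to match the installable package names).
REQUIRED = ["transformers", "torch", "bitsandbytes", "librosa"]


def missing_packages(names):
    """Return the top-level module names that cannot be located."""
    # find_spec returns None when a top-level module is not installed;
    # it does not import the module itself.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that the modules can be found; it does not verify versions or that a CUDA-capable GPU is available for `bitsandbytes`.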