---
base_model: ogulcanakca/whisper-small-tr
base_model_relation: quantized
language:
- tr
license: cc0-1.0
tags:
- ctranslate2
- faster-whisper
- int8
- automatic-speech-recognition
- robust-speech
- quantization
- whisper
spaces:
- ogulcanakca/turkish-asr-demo
widget:
- src: >-
    https://drive.google.com/file/d/1fgb1yOwXSba17HtLMIbcLh79j9nMtQzH/view?usp=sharing
  example_title: Sample Speech (Common Voice)
- text: Use the Space demo above to test the model.
  example_title: Information
model-index:
- name: faster-whisper-small-tr
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    metrics:
    - name: Real Time Factor (Speedup)
      type: rtf
      value: 19.21
---
# Faster Whisper Small Turkish (INT8 Quantized)
This model is an optimized CTranslate2 (INT8) conversion of the robust Turkish ASR model [ogulcanakca/whisper-small-tr](https://huggingface.co/ogulcanakca/whisper-small-tr), built for use with [faster-whisper](https://github.com/SYSTRAN/faster-whisper).
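The conversion can be reproduced with the converter CLI that ships with CTranslate2; the output directory name and the `--copy_files` list below are illustrative assumptions, not the exact command used:

```bash
pip install ctranslate2 transformers

# Convert the fine-tuned Transformers checkpoint to CTranslate2 with INT8 weights.
ct2-transformers-converter \
  --model ogulcanakca/whisper-small-tr \
  --output_dir faster-whisper-small-tr \
  --quantization int8 \
  --copy_files tokenizer.json preprocessor_config.json
```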
## Performance Benchmarks
The model was benchmarked against the original Hugging Face Transformers implementation on an NVIDIA A100 GPU.
| Model Format | Precision | Inference Time (Avg) | Speedup Factor |
|---|---|---|---|
| Original PyTorch | FP16 | 10.35 sec | 1x (Baseline) |
| Faster-Whisper | INT8 | 0.54 sec | 19.2x Faster |
*Note: Benchmarks were conducted on a standard Common Voice audio sample.*
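The table's averages can be reproduced with a simple timing loop along these lines; the file name, run count, and warm-up policy are assumptions rather than the exact benchmark script:

```python
import time

from faster_whisper import WhisperModel

model = WhisperModel("ogulcanakca/faster-whisper-small-tr", device="cuda", compute_type="int8")

# Warm-up run so CUDA/model initialization is not counted.
list(model.transcribe("sample.mp3", language="tr")[0])

runs = 5
start = time.perf_counter()
for _ in range(runs):
    segments, _ = model.transcribe("sample.mp3", language="tr")
    list(segments)  # transcription is lazy; consume the generator to do the work
print(f"Average inference time: {(time.perf_counter() - start) / runs:.2f} s")
```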
## Usage
To use this model, install the [faster-whisper](https://github.com/SYSTRAN/faster-whisper) library:

```bash
pip install faster-whisper
```
```python
from faster_whisper import WhisperModel

model_id = "ogulcanakca/faster-whisper-small-tr"

# Run on GPU with INT8
model = WhisperModel(model_id, device="cuda", compute_type="int8")

# or run on CPU with INT8 (high performance on CPU too!)
# model = WhisperModel(model_id, device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", beam_size=5, language="tr")

print(f"Detected language '{info.language}' with probability {info.language_probability}")

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```
## Model Details
- **Base Model:** [ogulcanakca/whisper-small-tr](https://huggingface.co/ogulcanakca/whisper-small-tr), fine-tuned on Common Voice 23.0 with JIT augmentation
- **Quantization:** 8-bit integer (INT8)
- **Backend:** CTranslate2
- **Objective:** Low-latency real-time streaming and high-throughput batch processing (see the batched pipeline sketch below).
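For the batch-processing objective, recent faster-whisper releases (1.1 and later) provide a `BatchedInferencePipeline` that decodes several audio chunks in parallel. A minimal sketch; the batch size is an arbitrary starting point, not a recommendation from the model authors:

```python
from faster_whisper import BatchedInferencePipeline, WhisperModel

model = WhisperModel("ogulcanakca/faster-whisper-small-tr", device="cuda", compute_type="int8")
batched_model = BatchedInferencePipeline(model=model)

# batch_size trades GPU memory for throughput; tune it for your hardware.
segments, info = batched_model.transcribe("audio.mp3", language="tr", batch_size=16)

for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```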