r7_onnx_int8

INT8 quantized ONNX export of hetchyy/r7.

  • Format: ONNX int8 dynamic quantization
  • Size: 2 MB
  • Output: IPA phoneme tokens (Quranic Arabic)

Usage

import numpy as np
import onnxruntime as ort
from transformers import AutoProcessor
from huggingface_hub import hf_hub_download

processor = AutoProcessor.from_pretrained("hetchyy/r7_onnx_int8")
onnx_path = hf_hub_download("hetchyy/r7_onnx_int8", "model_quantized.onnx")
sess = ort.InferenceSession(onnx_path)

# audio_array: 1-D float waveform, mono, sampled at 16 kHz
inputs = processor(audio_array, sampling_rate=16000, return_tensors="np", padding=True)
logits = sess.run(["logits"], {"input_values": inputs["input_values"]})[0]
predicted_ids = np.argmax(logits, axis=-1)
transcription = processor.batch_decode(predicted_ids)
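Under the hood, `processor.batch_decode` applies greedy CTC decoding to the argmax ids: repeated tokens are collapsed and blank tokens dropped. A minimal sketch of that step (the token ids, vocabulary size, and blank id here are illustrative, not the model's actual phoneme vocabulary):

```python
import numpy as np

def ctc_greedy_decode(logits, blank_id=0):
    """Greedy CTC decode: argmax per frame, collapse repeats, drop blanks."""
    ids = np.argmax(logits, axis=-1)  # best token id per time frame
    decoded, prev = [], None
    for i in ids:
        if i != prev and i != blank_id:
            decoded.append(int(i))
        prev = i
    return decoded

# Dummy (time=6, vocab=4) one-hot logits for the frame ids 0,2,2,0,3,3:
# the two 2s collapse to one token, the blanks (id 0) are dropped.
logits = np.eye(4)[[0, 2, 2, 0, 3, 3]]
print(ctc_greedy_decode(logits))  # -> [2, 3]
```

This is why CTC models emit one prediction per audio frame but still produce a phoneme sequence much shorter than the frame count.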
Model tree for hetchyy/r7_onnx_int8

  • Base model: hetchyy/r7 (this model is its int8 quantization)