---
license: apache-2.0
base_model:
- lj1995/VoiceConversionWebUI
- facebook/hubert-base-ls960
pipeline_tag: feature-extraction
library_name: fairseq
tags:
- rvc
- audio
---

# Hubert Base ONNX Model for Voice Conversion

This is the **ONNX-exported version of the Hubert Base model**, fine-tuned for voice conversion and compatible with modern inference pipelines. It enables fast, efficient audio processing in ONNX Runtime environments.

It builds upon the following models:

- [lj1995/VoiceConversionWebUI](https://huggingface.co/lj1995/VoiceConversionWebUI)
- [facebook/hubert-base-ls960](https://huggingface.co/facebook/hubert-base-ls960)

---

## Features

- Converts audio into high-quality embeddings for voice conversion tasks.
- Fully ONNX-compatible for optimized inference on CPUs and GPUs.
- Lightweight and easy to integrate into custom voice processing pipelines.
- No extra requirements needed: just **numpy** and **onnxruntime**.

## ONNX Model Report

**Model:** `hubert_base.onnx`
**Producer:** pytorch 2.0.0
**IR Version:** 8
**Opsets:** ai.onnx:18
**Parameters:** 94,370,816

---

### 🟦 Inputs

- **source** | type: `float32` | shape: `[batch_size, sequence_length]`
  - *32-bit float PCM waveform, 16,000 Hz sample rate, mono*
- **padding_mask** | type: `bool` | shape: `[batch_size, sequence_length]`
  - Marks padded samples. For un-padded audio it is an all-`False` array with the same shape as the waveform: `padding_mask = np.zeros(waveform.shape, dtype=np.bool_)`

### 🟩 Outputs

- **features** | type: `float32` | shape: `[batch_size, sequence_length, 768]`

---

## Usage

```python
import numpy as np
import onnxruntime as ort


class OnnxHubert:
    """
    Loads and runs the ONNX-exported Hubert model.

    Attributes:
        session (ort.InferenceSession): The ONNX Runtime session.
        input_name (str): The name of the input node.
        output_name (str): The name of the output node.

    Methods:
        extract_features(source, padding_mask):
            Run the ONNX model and extract features from a batch of inputs.
    """

    def __init__(self, model_path: str, thread_num: int = None):
        """
        Initialize the OnnxHubert object.

        Parameters:
            model_path (str): The path to the ONNX model file.
            thread_num (int, optional): The number of intra-op threads to use
                for inference. Defaults to None (ONNX Runtime chooses).
        """
        options = ort.SessionOptions()
        if thread_num is not None:
            options.intra_op_num_threads = thread_num
        self.session = ort.InferenceSession(model_path, sess_options=options)
        self.input_name = self.session.get_inputs()[0].name
        self.output_name = self.session.get_outputs()[0].name

    def extract_features(
        self, source: np.ndarray, padding_mask: np.ndarray
    ) -> np.ndarray:
        """
        Extract features from a batch using the ONNX model.

        Parameters:
            source: ndarray of shape (batch_size, sequence_length), float32
            padding_mask: ndarray of shape (batch_size, sequence_length), bool

        Returns:
            ndarray of shape (batch_size, sequence_length, 768) with the
            extracted features
        """
        result = self.session.run(None, {
            "source": source,
            "padding_mask": padding_mask
        })
        return result[0]
```

## Installation

You can install the required libraries with:

```bash
pip install onnxruntime numpy
```
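## Example: preparing inputs

A minimal sketch of shaping the two inputs the model expects. The waveform here is synthetic noise standing in for a real 16 kHz mono clip (in practice, load and resample audio with a library such as soundfile or librosa), and the commented-out inference call assumes the model file is saved locally as `hubert_base.onnx`:

```python
import numpy as np

# Hypothetical 1-second mono clip at 16 kHz; replace with real audio
# resampled to 16,000 Hz and converted to float32.
SAMPLE_RATE = 16_000
waveform = np.random.randn(SAMPLE_RATE).astype(np.float32)

# The model expects a leading batch dimension: (batch_size, sequence_length).
source = waveform[np.newaxis, :]

# The padding mask flags padded samples; for a single un-padded clip it is
# simply an all-False boolean array with the same shape as the source.
padding_mask = np.zeros(source.shape, dtype=np.bool_)

# With hubert_base.onnx on disk, inference would then be:
#   model = OnnxHubert("hubert_base.onnx")
#   features = model.extract_features(source, padding_mask)

print(source.shape, source.dtype, padding_mask.shape, padding_mask.dtype)
```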