MidFord327 committed on
Commit 94d4ef6 · verified · 1 Parent(s): 63cae26

Update README.md

Files changed (1): README.md +117 -3
README.md CHANGED
---
license: apache-2.0
base_model:
- lj1995/VoiceConversionWebUI
- facebook/hubert-base-ls960
pipeline_tag: audio-classification
library_name: fairseq
tags:
- rvc
- audio
---

# Hubert Base ONNX Model for Voice Conversion

This is the **ONNX-exported version of the Hubert Base model**, fine-tuned for voice conversion and compatible with modern inference pipelines. It enables fast, efficient audio processing in ONNX Runtime environments.

It builds upon the following models:
- [lj1995/VoiceConversionWebUI](https://huggingface.co/lj1995/VoiceConversionWebUI)
- [facebook/hubert-base-ls960](https://huggingface.co/facebook/hubert-base-ls960)

---

## Features

- Converts audio into high-quality embeddings for voice conversion tasks.
- Fully ONNX-compatible for optimized inference on CPUs and GPUs.
- Lightweight and easy to integrate into custom voice-processing pipelines.
- No extra dependencies: only **numpy** and **onnxruntime**.

## ONNX Model Report

**Model:** `hubert_base.onnx`  
**Producer:** PyTorch 2.0.0  
**IR Version:** 8  
**Opsets:** ai.onnx:18  
**Parameters:** 94,370,816

---

### 🟦 Inputs

- **source** | type: `float32` | shape: `[batch_size, sequence_length]`
  - *PCM float32 waveform, sample rate 16,000 Hz*
- **padding_mask** | type: `bool` | shape: `[batch_size, sequence_length]`
  - Usually an all-false array with the same shape as the waveform: `padding_mask = np.zeros(waveform.shape, dtype=np.bool_)`

### 🟩 Outputs

- **features** | type: `float32` | shape: `[batch_size, sequence_length, 768]`

---
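As a sketch of how these inputs can be prepared (assuming 16 kHz int16 PCM audio; the variable names are illustrative, not part of the model):

```python
import numpy as np

# Hypothetical one-second clip of int16 PCM audio at 16,000 Hz.
pcm_int16 = np.zeros(16000, dtype=np.int16)

# Scale to float32 in [-1, 1] and add a batch dimension -> (1, sequence_length).
source = (pcm_int16.astype(np.float32) / 32768.0)[np.newaxis, :]

# A single, unpadded clip needs an all-false mask of the same shape.
padding_mask = np.zeros(source.shape, dtype=np.bool_)

print(source.dtype, source.shape, padding_mask.any())
```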

## Usage

```python
import numpy as np
import onnxruntime as ort


class OnnxHubert:
    """
    Loads and runs the ONNX model exported from Hubert.

    Attributes:
        session (ort.InferenceSession): The ONNX Runtime session.
        input_name (str): The name of the first input node.
        output_name (str): The name of the first output node.

    Methods:
        extract_features(source, padding_mask): Run the ONNX model and
            extract features from a batch of waveforms.
    """

    def __init__(self, model_path: str, thread_num: int = None):
        """
        Initialize the OnnxHubert object.

        Parameters:
            model_path (str): Path to the ONNX model file.
            thread_num (int, optional): Number of intra-op threads to use
                for inference. Defaults to None (let ONNX Runtime decide).
        """
        options = ort.SessionOptions()
        if thread_num is not None:
            options.intra_op_num_threads = thread_num
        self.session = ort.InferenceSession(model_path, sess_options=options)

        self.input_name = self.session.get_inputs()[0].name
        self.output_name = self.session.get_outputs()[0].name

    def extract_features(
        self,
        source: np.ndarray,
        padding_mask: np.ndarray
    ) -> np.ndarray:
        """
        Run the ONNX model and extract features from a batch.

        Inputs:
            source: float32 ndarray of shape (batch_size, sequence_length)
            padding_mask: bool ndarray of shape (batch_size, sequence_length)

        Returns:
            float32 ndarray of shape (batch_size, num_frames, 768)
            with the extracted features.
        """
        result = self.session.run(None, {
            "source": source,
            "padding_mask": padding_mask
        })
        return result[0]
```
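Since running the class above requires the `hubert_base.onnx` file, here is a small model-free helper for estimating how many feature frames to expect per clip. The constants assume the standard hubert-base convolutional front end (overall stride of 320 samples, receptive field of 400 samples); they are not read from this ONNX graph, so treat them as an assumption:

```python
def expected_num_frames(num_samples: int, kernel: int = 400, stride: int = 320) -> int:
    """Rough frame count for a hubert-base-style feature extractor.

    kernel and stride are assumed values for the standard architecture.
    """
    if num_samples < kernel:
        return 0
    return (num_samples - kernel) // stride + 1

# One second of 16 kHz audio yields roughly 49 frames of 768-dim features,
# i.e. an output of shape (1, 49, 768) under these assumptions.
print(expected_num_frames(16000))
```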

## Installation

You can install the required libraries with:

```bash
pip install onnxruntime numpy
```