Update README.md
README.md
CHANGED
|
@@ -102,15 +102,18 @@ The **Simba** family consists of state-of-the-art models fine-tuned using SimbaB
|
|
| 102 |
- **Simba-M** (MMS-1b-all)
|
| 103 |
- **Simba-H** (AfriHuBERT)
|
| 104 |
|
| 105 |
-
|
| 106 |
-
|
| 107 |
-
|
|
| 108 |
-
|
| 109 |
-
| 🔥**Simba-
|
| 110 |
-
| 🔥**Simba-
|
| 111 |
-
| 🔥**Simba-
|
| 112 |
|
| 113 |
-
* **Simba-S** (based on SeamlessM4T-v2-MT) emerged as the best-performing ASR model overall.
|
| 114 |
|
| 115 |
**🧩 Usage Example**
|
| 116 |
|
|
@@ -125,7 +128,9 @@ asr_pipeline = pipeline(
|
|
| 125 |
model="UBC-NLP/Simba-S" # Simba models: `UBC-NLP/Simba-S`, `UBC-NLP/Simba-W`, `UBC-NLP/Simba-X`, `UBC-NLP/Simba-H`, `UBC-NLP/Simba-M`
|
| 126 |
)
|
| 127 |
|
|
|
|
| 128 |
asr_pipeline.model.load_adapter("multilingual_african") # Only for `UBC-NLP/Simba-M`
|
|
|
|
| 129 |
|
| 130 |
# Transcribe audio from file
|
| 131 |
result = asr_pipeline("https://africa.dlnlp.ai/simba/audio/afr_Lwazi_afr_test_idx3889.wav")
|
|
@@ -140,6 +145,36 @@ result = asr_pipeline({
|
|
| 140 |
print(result["text"])
|
| 141 |
|
| 142 |
```
|
| 143 |
Get started with Simba models in minutes using our interactive Colab notebook: [simba_models.ipynb](https://github.com/UBC-NLP/simba/edit/main/simba_models.ipynb)
|
| 144 |
|
| 145 |
|
|
|
|
| 102 |
- **Simba-M** (MMS-1b-all)
|
| 103 |
- **Simba-H** (AfriHuBERT)
|
| 104 |
|
| 105 |
+
Explore the Frontier
|
| 106 |
+
|
| 107 |
+
| **ASR Models** | **Architecture** | **#Parameters** | **🤗 Hugging Face Model Card** | **Status** |
|
| 108 |
+
|---------|:------------------:| :------------------:| :------------------:|:------------------:|
|
| 109 |
+
| 🔥**Simba-S**🔥| SeamlessM4T-v2 | 2.3B | 🤗 [https://huggingface.co/UBC-NLP/Simba-S](https://huggingface.co/UBC-NLP/Simba-S) | ✅ Released |
|
| 110 |
+
| 🔥**Simba-W**🔥| Whisper | 1.5B | 🤗 [https://huggingface.co/UBC-NLP/Simba-W](https://huggingface.co/UBC-NLP/Simba-W) | ✅ Released |
|
| 111 |
+
| 🔥**Simba-X**🔥| Wav2Vec2 | 1B | 🤗 [https://huggingface.co/UBC-NLP/Simba-X](https://huggingface.co/UBC-NLP/Simba-X) | ✅ Released |
|
| 112 |
+
| 🔥**Simba-M**🔥| MMS | 1B | 🤗 [https://huggingface.co/UBC-NLP/Simba-M](https://huggingface.co/UBC-NLP/Simba-M) | ✅ Released |
|
| 113 |
+
| 🔥**Simba-H**🔥| HuBERT | 94M | 🤗 [https://huggingface.co/UBC-NLP/Simba-H](https://huggingface.co/UBC-NLP/Simba-H) | ✅ Released |
|
| 114 |
+
|
| 115 |
+
* **Simba-S** emerged as the best-performing ASR model overall.
|
| 116 |
|
|
|
|
| 117 |
|
| 118 |
**🧩 Usage Example**
|
| 119 |
|
|
|
|
| 128 |
model="UBC-NLP/Simba-S" # Simba models: `UBC-NLP/Simba-S`, `UBC-NLP/Simba-W`, `UBC-NLP/Simba-X`, `UBC-NLP/Simba-H`, `UBC-NLP/Simba-M`
|
| 129 |
)
|
| 130 |
|
| 131 |
+
##### Load the multilingual African adapter (Only for `UBC-NLP/Simba-M`)
|
| 132 |
asr_pipeline.model.load_adapter("multilingual_african") # Only for `UBC-NLP/Simba-M`
|
| 133 |
+
###########################
|
| 134 |
|
| 135 |
# Transcribe audio from file
|
| 136 |
result = asr_pipeline("https://africa.dlnlp.ai/simba/audio/afr_Lwazi_afr_test_idx3889.wav")
|
|
|
|
| 145 |
print(result["text"])
|
| 146 |
|
| 147 |
```
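
The snippet above marks the `load_adapter("multilingual_african")` call as applying only to `UBC-NLP/Simba-M`. One way to make that explicit when switching between models is sketched below. The names `ADAPTERS`, `adapter_for`, and `build_asr_pipeline` are hypothetical helpers, not part of the README, and the task string `"automatic-speech-recognition"` is an assumption because the diff hunk does not show the first arguments of `pipeline(...)`.

```python
# Hypothetical mapping: only Simba-M is documented as needing an adapter.
ADAPTERS = {"UBC-NLP/Simba-M": "multilingual_african"}


def adapter_for(model_id):
    """Return the adapter name for a model id, or None if none is needed."""
    return ADAPTERS.get(model_id)


def build_asr_pipeline(model_id):
    """Build an ASR pipeline and load the adapter only when one is listed."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import pipeline

    # Assumption: the README's pipeline uses the standard ASR task name.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    adapter = adapter_for(model_id)
    if adapter is not None:
        asr.model.load_adapter(adapter)
    return asr
```

Calling `build_asr_pipeline("UBC-NLP/Simba-S")` would then skip the adapter step, while `build_asr_pipeline("UBC-NLP/Simba-M")` would load it; both calls download model weights.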
|
| 148 |
+
|
| 149 |
+
#### Example Outputs
|
| 150 |
+
|
| 151 |
+
Using the same audio file with different Simba models:
|
| 152 |
+
|
| 153 |
+
```python
|
| 154 |
+
# Simba-S
|
| 155 |
+
{'text': 'watter verontwaardiging sou daar, in ons binneste gewees het.'}
|
| 156 |
+
```
|
| 157 |
+
|
| 158 |
+
```python
|
| 159 |
+
# Simba-W
|
| 160 |
+
{'text': 'watter veronwaardigingsel daar, in ons binneste gewees het.'}
|
| 161 |
+
```
|
| 162 |
+
|
| 163 |
+
```python
|
| 164 |
+
# Simba-X
|
| 165 |
+
{'text': 'fator fr on ar taamsodr is'}
|
| 166 |
+
```
|
| 167 |
+
|
| 168 |
+
```python
|
| 169 |
+
# Simba-M
|
| 170 |
+
{'text': 'watter veronwaardiging sodaar in ons binniste gewees het'}
|
| 171 |
+
```
|
| 172 |
+
|
| 173 |
+
```python
|
| 174 |
+
# Simba-H
|
| 175 |
+
{'text': 'watter vironwaardiging so daar in ons binneste geweeshet'}
|
| 176 |
+
```
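
To make the differences between these transcripts easier to see, the sketch below counts word-level edits between each output and the Simba-S output. This is only an illustration: no reference transcript is given here, so Simba-S is used as a pivot rather than as ground truth, and the function names `normalize` and `word_edit_distance` are hypothetical.

```python
import string


def normalize(text):
    """Lowercase, strip ASCII punctuation, and split on whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def word_edit_distance(ref, hyp):
    """Levenshtein distance over word lists (substitute, insert, delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Outputs copied from the examples above.
outputs = {
    "Simba-S": "watter verontwaardiging sou daar, in ons binneste gewees het.",
    "Simba-W": "watter veronwaardigingsel daar, in ons binneste gewees het.",
    "Simba-X": "fator fr on ar taamsodr is",
    "Simba-M": "watter veronwaardiging sodaar in ons binniste gewees het",
    "Simba-H": "watter vironwaardiging so daar in ons binneste geweeshet",
}

# Pivot on Simba-S (an assumption for illustration, not a ground-truth reference).
pivot = normalize(outputs["Simba-S"])
for name, text in outputs.items():
    dist = word_edit_distance(pivot, normalize(text))
    print(f"{name}: {dist} word edits relative to Simba-S")
```

With a real reference transcript, dividing the edit count by the number of reference words would give a word error rate.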
|
| 177 |
+
|
| 178 |
Get started with Simba models in minutes using our interactive Colab notebook: [simba_models.ipynb](https://github.com/UBC-NLP/simba/edit/main/simba_models.ipynb)
|
| 179 |
|
| 180 |
|