Commit d2e445b by EYEDOL · verified · 1 Parent(s): 61996b9

Update README.md

Files changed (1)
  1. README.md +0 -10
README.md CHANGED
@@ -23,7 +23,6 @@ pipeline_tag: automatic-speech-recognition
 **Base Model:** `openai/whisper-small` (fine-tuned for Swahili)
 
 ---
-
 ## 🌍 Overview
 
 **SALAMA-STT** (Speech-to-Text) is the **first module** of the **SALAMA Framework**, a modular end-to-end **speech-to-speech AI system** built for African languages.
@@ -50,7 +49,6 @@ The model was fine-tuned on the **Mozilla Common Voice 17.0 Swahili** dataset, e
 | Languages | Swahili (`sw`), English (`en`) |
 
 ---
-
 ## 📚 Dataset
 
 | Dataset | Description | Purpose |
@@ -60,7 +58,6 @@ The model was fine-tuned on the **Mozilla Common Voice 17.0 Swahili** dataset, e
 | Common Voice validation split | 2.3 hours | Evaluation |
 
 ---
-
 ## 🧠 Model Capabilities
 
 - Speech-to-text transcription in **Swahili**
@@ -70,7 +67,6 @@ The model was fine-tuned on the **Mozilla Common Voice 17.0 Swahili** dataset, e
 - Provides timestamped segment transcriptions
 
 ---
-
 ## 📊 Evaluation Metrics
 
 | Metric | Baseline (Whisper-small) | Fine-tuned (SALAMA-STT) | Improvement |
@@ -82,7 +78,6 @@ The model was fine-tuned on the **Mozilla Common Voice 17.0 Swahili** dataset, e
 > Evaluation conducted on a 2-hour held-out Swahili validation set from Common Voice.
 
 ---
-
 ## ⚙️ Usage (Python Example)
 
 Below is a quick example for Swahili speech transcription using this model:
@@ -113,7 +108,6 @@ print(result["text"])
 > *“Karibu kwenye mfumo wa SALAMA unaosaidia kutambua na kuelewa sauti ya Kiswahili kwa usahihi mkubwa.”*
 
 ---
-
 ## 🔍 Model Performance Summary
 
 | Dataset | Metric | Score |
@@ -123,7 +117,6 @@ print(result["text"])
 | Local Swahili Test Set | Accuracy | **95.4%** |
 
 ---
-
 ## ⚡ Key Features
 
 - 🎙️ **Accurate Swahili ASR** trained on diverse voices
@@ -133,7 +126,6 @@ print(result["text"])
 - 🚀 **Fast inference optimized with FP16 precision**
 
 ---
-
 ## 🚫 Limitations
 
 - May misinterpret **code-mixed (Swahili-English)** speech
@@ -142,11 +134,9 @@ print(result["text"])
 - Performance may decline on **non-native Swahili speakers**
 
 ---
-
 ## 🔗 Related Models
 
 | Model | Description |
 |--------|-------------|
 | [`EYEDOL/salama-llm`](https://huggingface.co/EYEDOL/salama-llm) | Swahili instruction-tuned LLM for reasoning and dialogue |
 | [`EYEDOL/salama-tts`](https://huggingface.co/EYEDOL/salama-tts) | Swahili text-to-speech (VITS) model for natural speech synthesis |
-
 
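
The Evaluation Metrics table in this README compares the Whisper-small baseline with the fine-tuned model, but the diff context hides the metric rows. Assuming word error rate (WER) is among them, which this diff does not confirm, the following is a minimal sketch of how such a score could be computed. The function name `word_error_rate` and the sample strings are illustrative and not taken from the README.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical reference/hypothesis pair with one dropped word.
    reference = "karibu kwenye mfumo wa salama"
    hypothesis = "karibu kwenye mfumo salama"
    print(f"WER: {word_error_rate(reference, hypothesis):.2f}")
```

This sketch normalizes only by lowercasing and whitespace splitting; a real evaluation would likely also strip punctuation and may use a different normalization than the one behind the reported scores.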