Pengwin30/whisper-medium-fine-tuned

@@ -1,85 +1,88 @@
----
-license: mit
-base_model:
-- openai/whisper-medium
-language: en
-tags:
-- automatic-speech-recognition
-- whisper
-- fine-tuning
-- speech
-model-index:
-  - name: whisper-medium-finetuned-custom
-    results:
-      - task:
-          type: automatic-speech-recognition
-        dataset:
-          name: Custom Audio Dataset
-          type: audio
-        metrics:
-          - name: Word Error Rate
-            type: wer
-            value: 0.XX  # Replace with actual WER
----
-# Whisper Medium Fine-Tuned on Custom English Dataset
-This model is a fine-tuned version of OpenAI's [`whisper-medium`](https://huggingface.co/openai/whisper-medium), optimized for transcribing English speech from a custom dataset.
-## 🛠️ Model Details
-- **Base Model:** openai/whisper-medium
-- **Fine-tuned by:** Winardi (Research by Ms. Tong Rong)
-- **Language:** English (monolingual)
-- **Framework:** PyTorch, Hugging Face Transformers
-## 📚 Training Data
-The model was fine-tuned on a proprietary/custom audio dataset using `metadata(clean1).csv`. Corrupted or low-quality audio files were excluded. The data was split as follows:
-- **Training:** 80%
-- **Validation:** 10%
-- **Testing:** 10% (used only for evaluation, not during training)
-## 🎯 Intended Use
-This model is intended for **automatic speech recognition (ASR)** in English, especially for environments similar to the training dataset (e.g., single-speaker, clean audio).
-## 📉 Performance
-- **Metric:** Word Error Rate (WER)
-- **WER:** `2.07%`
-- **WER with Limited Vocalubary:** `3.23%`
-## 🚫 Limitations
-- Not robust to heavy background noise or overlapping speech
-- May not perform well on dialects or accents not represented in training data
-- Only supports English input
-## 💬 How to Use
-```python
-from transformers import pipeline
-asr = pipeline("automatic-speech-recognition", model="your-username/whisper-medium-finetuned-custom")
-result = asr("path/to/audio.wav")
-print(result["text"])
-```
-## 📜 License
-This model is licensed under the **MIT License**.
-## 🙏 Citation
-If you use this model in your work, please cite:
-```
-@misc{whisper-finetuned-custom,
-  author = {Tong Rong, Winardi},
-  title = {Whisper Medium Fine-Tuned on Custom Dataset},
-  year = {2025},
-  url = {https://huggingface.co/your-username/whisper-medium-finetuned-custom}
-}
 ```

+---
+license: mit
+base_model:
+- openai/whisper-medium
+language: en
+tags:
+- automatic-speech-recognition
+- whisper
+- fine-tuning
+- speech
+model-index:
+  - name: Pengwin30/whisper-medium-fine-tuned
+    results:
+      - task:
+          type: automatic-speech-recognition
+        dataset:
+          name: Custom Audio Dataset
+          type: audio
+        metrics:
+          - name: Word Error Rate
+            type: wer
+            value: 2.07
+          - name: Word Error Rate With Limited Vocabulary
+            type: wer
+            value: 3.23
+---
+# Whisper Medium Fine-Tuned on Custom English Dataset
+This model is a fine-tuned version of OpenAI's [`whisper-medium`](https://huggingface.co/openai/whisper-medium), optimized for transcribing English speech from a custom dataset.
+## 🛠️ Model Details
+- **Base Model:** openai/whisper-medium
+- **Fine-tuned by:** Winardi (Research by Ms. Tong Rong)
+- **Language:** English (monolingual)
+- **Framework:** PyTorch, Hugging Face Transformers
+## 📚 Training Data
+The model was fine-tuned on a proprietary/custom audio dataset using `metadata(clean1).csv`. Corrupted or low-quality audio files were excluded. The data was split as follows:
+- **Training:** 80%
+- **Validation:** 10%
+- **Testing:** 10% (used only for evaluation, not during training)
+## 🎯 Intended Use
+This model is intended for **automatic speech recognition (ASR)** in English, especially for environments similar to the training dataset (e.g., single-speaker, clean audio).
+## 📉 Performance
+- **Metric:** Word Error Rate (WER)
+- **WER:** `2.07%`
+- **WER with Limited Vocabulary:** `3.23%`
+## 🚫 Limitations
+- Not robust to heavy background noise or overlapping speech
+- May not perform well on dialects or accents not represented in training data
+- Only supports English input
+## 💬 How to Use
+```python
+from transformers import pipeline
+asr = pipeline("automatic-speech-recognition", model="Pengwin30/whisper-medium-fine-tuned")
+result = asr("path/to/audio.wav")
+print(result["text"])
+```
+## 📜 License
+This model is licensed under the **MIT License**.
+## 🙏 Citation
+If you use this model in your work, please cite:
+```
+@misc{Pengwin30/whisper-medium-fine-tuned,
+  author = {Tong Rong and Winardi},
+  title = {Whisper Medium Fine-Tuned on Custom Dataset},
+  year = {2025},
+  url = {https://huggingface.co/Pengwin30/whisper-medium-fine-tuned}
+}
 ```
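
The model card reports an 80/10/10 train/validation/test split and Word Error Rate (WER) figures, but does not show how either was produced. The sketch below is one plausible way to reproduce both steps, not the authors' actual pipeline. The function names `split_80_10_10` and `word_error_rate`, the fixed shuffle seed, and the whitespace tokenization are all assumptions made for illustration; the reported WER values may have been computed with a different text normalization.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle items and split them into 80% train, 10% validation, 10% test.

    Hypothetical helper: the seed and the shuffling strategy are assumptions.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder, roughly 10%
    return train, val, test


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    train, val, test = split_80_10_10(range(100))
    print(len(train), len(val), len(test))
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

The function returns WER as a fraction, so multiply by 100 to compare against percentages such as `2.07%`. In practice, the test split would be transcribed with the fine-tuned model and each transcript scored against its reference text.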