Add pipeline tag and library name to metadata
Hi! I'm Niels from the community science team at Hugging Face.
I've opened this PR to improve the metadata and documentation of your model card:
- Added `pipeline_tag: automatic-speech-recognition` to make the model discoverable in the ASR category.
- Added `library_name: transformers` to enable the code snippet widget and the "Use in Transformers" button.
- Added a reference to the paper [SloPal: A 60-Million-Word Slovak Parliamentary Corpus with Aligned Speech and Fine-Tuned ASR Models](https://huggingface.co/papers/2509.19270) in the description.
These changes help users find and use your model more easily!
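
As a rough way to confirm that a card carries the two new fields, the sketch below reads top-level scalars from a card's YAML front matter. The function name `read_front_matter_scalars` and the sample `CARD` string are illustrative assumptions, not part of this PR, and the parser is deliberately minimal: it skips lists and nested mappings, so a real check would more likely use a YAML library.

```python
def read_front_matter_scalars(card_text: str) -> dict[str, str]:
    """Collect top-level `key: value` scalars from a model card's YAML front matter.

    Deliberately minimal: list items, nested mappings, and comments are skipped.
    """
    lines = card_text.splitlines()
    # Front matter must start with a `---` delimiter on the first line.
    if not lines or lines[0].strip() != "---":
        return {}
    scalars: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter of the front matter
            break
        # Skip blank lines, indented (nested) lines, list items, and comments.
        if not line or line[0] in " \t-#":
            continue
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value:
            scalars[key.strip()] = value
    return scalars


# Hypothetical sample card, trimmed to the fields relevant here.
CARD = """---
base_model: openai/whisper-large-v3
language:
- sk
license: mit
library_name: transformers
pipeline_tag: automatic-speech-recognition
tags:
- speech
- asr
---

# (card body)
"""

meta = read_front_matter_scalars(CARD)
missing = [key for key in ("pipeline_tag", "library_name") if key not in meta]
print(missing or "both fields present")
```

Keys such as `language` and `tags` hold lists, so they are intentionally absent from the returned dictionary; only the scalar fields this PR adds are checked.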
README.md CHANGED
@@ -1,6 +1,14 @@
 ---
+base_model: openai/whisper-large-v3
+datasets:
+- erikbozik/slovak-plenary-asr-corpus
 language:
 - sk
+license: mit
+metrics:
+- wer
+library_name: transformers
+pipeline_tag: automatic-speech-recognition
 tags:
 - speech
 - asr
@@ -9,11 +17,6 @@ tags:
 - parliament
 - legal
 - politics
-base_model: openai/whisper-large-v3
-datasets:
-- erikbozik/slovak-plenary-asr-corpus
-metrics:
-- wer
 model-index:
 - name: whisper-large-v3-sk
   results:
@@ -24,9 +27,9 @@ model-index:
       name: Common Voice 21 (Slovak test set)
       type: common_voice
     metrics:
-    -
-      type: wer
+    - type: wer
       value: 11.6
+      name: WER
   - task:
       type: automatic-speech-recognition
       name: Automatic Speech Recognition
@@ -34,15 +37,14 @@ model-index:
       name: FLEURS (Slovak test set)
       type: fleurs
     metrics:
-    -
-      type: wer
+    - type: wer
       value: 5.5
-
+      name: WER
 ---
 
 # Whisper Large-v3 — Fine-tuned on Slovak Plenary ASR Corpus
 
-This model is a fine-tuned version of [`openai/whisper-large-v3`](https://huggingface.co/openai/whisper-large-v3).
+This model is a fine-tuned version of [`openai/whisper-large-v3`](https://huggingface.co/openai/whisper-large-v3), presented in the paper [SloPal: A 60-Million-Word Slovak Parliamentary Corpus with Aligned Speech and Fine-Tuned ASR Models](https://huggingface.co/papers/2509.19270).
 It is adapted for **Slovak ASR** using [SloPalSpeech](https://huggingface.co/datasets/erikbozik/slovak-plenary-asr-corpus): **2,806 hours** of aligned, ≤30 s speech–text pairs from official plenary sessions of the **Slovak National Council**.
 
 - **Language:** Slovak