| Afrikaans-tuned Whisper model by DigiPhyte (Pty) Ltd |
| ==================================================== |
|
|
| This model is a fine-tuned derivative work, distributed under the MIT License |
| (see LICENSE). The upstream works below are gratefully acknowledged and their |
| licences are preserved as required. |
|
|
| 1. Base model: OpenAI Whisper |
| ----------------------------- |
| openai/whisper-large-v3 (and large-v3-turbo / medium / small variants) |
| Copyright OpenAI. Licensed under the Apache License, Version 2.0. |
| https://huggingface.co/openai/whisper-large-v3 |
| A copy of the Apache 2.0 licence is included as LICENSE-whisper-apache-2.0.txt. |
| Modifications by DigiPhyte: fine-tuned with a LoRA adapter for South African |
| Afrikaans; the adapter was then merged into the base weights and the merged |
| model converted to the CTranslate2 format for use with faster-whisper. |
|
|
| 2. Training data: afrikaans-30s |
| ------------------------------- |
| andreoosthuizen/afrikaans-30s (approx. 56 hours of Afrikaans speech). |
| Licensed under Creative Commons Attribution 4.0 International (CC BY 4.0). |
| https://huggingface.co/datasets/andreoosthuizen/afrikaans-30s |
|
|
| 3. LoRA adapter (large-v3 model only) |
| ------------------------------------- |
| andreoosthuizen/whisper-large-v3-afrikaans |
| Copyright (c) Andre Oosthuizen. Licensed under the MIT License. |
| https://huggingface.co/andreoosthuizen/whisper-large-v3-afrikaans |
| This attribution applies only to the large-v3 Afrikaans model, which reuses |
| this adapter. The medium, large-v3-turbo, and small Afrikaans models were |
| fine-tuned by DigiPhyte directly on the afrikaans-30s dataset (item 2) and do |
| not include this adapter. |
|
|