---
library_name: peft
license: apache-2.0
base_model: openai/whisper-large-v2
tags:
- generated_from_trainer
datasets:
- mozilla-foundation/common_voice_11_0
- tunis-ai/arabic_speech_corpus
- THCHS-30
model-index:
- name: lowhipa-large-comb
  results: []
pipeline_tag: automatic-speech-recognition
---

# lowhipa-large-comb

This Whisper-for-IPA (WhIPA) model adapter is a PEFT LoRA fine-tuned version of [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) on a subset of:

- the Common Voice 11 dataset (1k samples each from Greek, Finnish, Hungarian, Japanese, Maltese, Polish, and Tamil) with G2P-based IPA transcriptions,
- the Mandarin THCHS-30 database (https://arxiv.org/pdf/1512.01882) with IPA transcriptions by Taubert (2023, https://zenodo.org/records/7528596) (1k samples), and
- the Arabic Speech Corpus (https://en.arabicspeechcorpus.com) with custom IPA transcriptions transliterated from the provided Buckwalter transcriptions (1k samples) (https://doi.org/10.5281/zenodo.17111977).

## Model description

For deployment and a full description, please refer to https://github.com/jshrdt/whipa.

```python
from transformers import WhisperForConditionalGeneration, WhisperTokenizer, WhisperProcessor
from peft import PeftModel

# Add the custom <|ip|> language token used for IPA transcription.
tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-large-v2", task="transcribe")
tokenizer.add_special_tokens({"additional_special_tokens": ["<|ip|>"] + tokenizer.all_special_tokens})

# Register the new token as a language ID and resize the embeddings to match.
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2")
base_model.generation_config.lang_to_id["<|ip|>"] = tokenizer.convert_tokens_to_ids(["<|ip|>"])[0]
base_model.resize_token_embeddings(len(tokenizer))

# Load the LoRA adapter on top of the base model and set generation defaults.
whipa_model = PeftModel.from_pretrained(base_model, "jshrdt/lowhipa-large-comb")
whipa_model.generation_config.language = "<|ip|>"
whipa_model.generation_config.task = "transcribe"

whipa_processor = WhisperProcessor.from_pretrained("openai/whisper-large-v2", task="transcribe")
```

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

More information needed

### Training results

| Training Loss | Epoch   | Validation Loss |
|:-------------:|:-------:|:---------------:|
| 0.7537        | 2.0323  | 0.5797          |
| 0.2638        | 4.0645  | 0.4017          |
| 0.1532        | 6.0968  | 0.4054          |
| 0.0909        | 8.1290  | 0.4511          |
| 0.0535        | 10.1613 | 0.4732          |

### Framework versions

- PEFT 0.15.1
- Transformers 4.48.3
- Pytorch 2.6.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
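
## Inference example

A minimal sketch for transcribing one file to IPA, continuing from the loading snippet under *Model description* (it reuses `whipa_model` and `whipa_processor` from there). The file name `sample.wav` and the use of `librosa` for loading are illustrative assumptions; Whisper expects 16 kHz mono audio.

```python
import librosa
import torch

# Illustrative input file; resample to the 16 kHz mono audio Whisper expects.
audio, _ = librosa.load("sample.wav", sr=16000)

# Compute log-mel input features from the raw waveform.
inputs = whipa_processor(audio, sampling_rate=16000, return_tensors="pt")

# Generate token IDs with the <|ip|> language setting and decode to an IPA string.
with torch.no_grad():
    predicted_ids = whipa_model.generate(input_features=inputs.input_features)
print(whipa_processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```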