---
library_name: peft
license: apache-2.0
base_model: openai/whisper-large-v2
tags:
- generated_from_trainer
datasets:
- mozilla-foundation/common_voice_11_0
- tunis-ai/arabic_speech_corpus
- THCHS-30
model-index:
- name: lowhipa-large-comb
  results: []
pipeline_tag: automatic-speech-recognition
---

# lowhipa-large-comb

This Whisper-for-IPA (WhIPA) model adapter is a PEFT LoRA fine-tuned version of [openai/whisper-large-v2](https://huggingface.co/openai/whisper-large-v2) on a subset of:

- the Common Voice 11 dataset (1k samples each from Greek, Finnish, Hungarian, Japanese, Maltese, Polish, and Tamil) with G2P-based IPA transcriptions,
- the Mandarin THCHS-30 database (https://arxiv.org/pdf/1512.01882) with IPA transcriptions by Taubert (2023, https://zenodo.org/records/7528596) (1k samples), and
- the Arabic Speech Corpus (https://en.arabicspeechcorpus.com) with custom IPA transcriptions transliterated from the provided Buckwalter transcriptions (1k samples) (https://doi.org/10.5281/zenodo.17111977).

## Model description

For deployment and a full description, please refer to https://github.com/jshrdt/whipa.

```python
from transformers import WhisperForConditionalGeneration, WhisperTokenizer, WhisperProcessor
from peft import PeftModel

# Add the custom <|ip|> language token used for IPA transcription.
tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-large-v2", task="transcribe")
tokenizer.add_special_tokens({"additional_special_tokens": ["<|ip|>"] + tokenizer.all_special_tokens})

# Register the new token as a language ID and resize the embeddings to match.
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2")
base_model.generation_config.lang_to_id["<|ip|>"] = tokenizer.convert_tokens_to_ids(["<|ip|>"])[0]
base_model.resize_token_embeddings(len(tokenizer))

# Load the LoRA adapter on top of the base model and set generation defaults.
whipa_model = PeftModel.from_pretrained(base_model, "jshrdt/lowhipa-large-comb")
whipa_model.generation_config.language = "<|ip|>"
whipa_model.generation_config.task = "transcribe"

whipa_processor = WhisperProcessor.from_pretrained("openai/whisper-large-v2", task="transcribe")
```

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

More information needed

### Training results

| Training Loss | Epoch   | Validation Loss |
|:-------------:|:-------:|:---------------:|
| 0.7537        | 2.0323  | 0.5797          |
| 0.2638        | 4.0645  | 0.4017          |
| 0.1532        | 6.0968  | 0.4054          |
| 0.0909        | 8.1290  | 0.4511          |
| 0.0535        | 10.1613 | 0.4732          |

### Framework versions

- PEFT 0.15.1
- Transformers 4.48.3
- Pytorch 2.6.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
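
## Inference example

A minimal sketch for transcribing one file to IPA, continuing from the loading snippet under *Model description* (it reuses `whipa_model` and `whipa_processor` from there). The file name `sample.wav` and the use of `librosa` for loading are illustrative assumptions; Whisper expects 16 kHz mono audio.

```python
import librosa
import torch

# Illustrative input file; resample to the 16 kHz mono audio Whisper expects.
audio, _ = librosa.load("sample.wav", sr=16000)

# Compute log-mel input features from the raw waveform.
inputs = whipa_processor(audio, sampling_rate=16000, return_tensors="pt")

# Generate token IDs with the <|ip|> language setting and decode to an IPA string.
with torch.no_grad():
    predicted_ids = whipa_model.generate(input_features=inputs.input_features)
print(whipa_processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```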