---
base_model: openai/whisper-large-v2
library_name: peft
model-index:
  - name: lowhipa-large-asc
    results: []
datasets:
  - tunis-ai/arabic_speech_corpus
pipeline_tag: automatic-speech-recognition
---

# lowhipa-large-asc

This Whisper-for-IPA (WhIPA) adapter is a PEFT LoRA fine-tuned version of openai/whisper-large-v2, trained on a 1k-sample subset of the Arabic Speech Corpus (https://en.arabicspeechcorpus.com) with custom IPA transcriptions transliterated from the corpus's Buckwalter transcriptions. The resulting ASC-IPA dataset is available at https://doi.org/10.5281/zenodo.17111977.

## Model description

For deployment details and a full model description, please refer to https://github.com/jshrdt/whipa.

```python
from transformers import WhisperForConditionalGeneration, WhisperTokenizer, WhisperProcessor
from peft import PeftModel

# Add the custom <|ip|> language token used for IPA transcription.
tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-large-v2", task="transcribe")
tokenizer.add_special_tokens({"additional_special_tokens": ["<|ip|>"] + tokenizer.all_special_tokens})

# Load the base model the adapter was trained on and register the new token.
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2")
base_model.generation_config.lang_to_id["<|ip|>"] = tokenizer.convert_tokens_to_ids(["<|ip|>"])[0]
base_model.resize_token_embeddings(len(tokenizer))

# Attach the LoRA adapter on top of the resized base model.
whipa_model = PeftModel.from_pretrained(base_model, "jshrdt/lowhipa-large-asc")

whipa_model.generation_config.language = "<|ip|>"
whipa_model.generation_config.task = "transcribe"

whipa_processor = WhisperProcessor.from_pretrained("openai/whisper-large-v2", task="transcribe")
```
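Inference then follows the standard Whisper pipeline: audio is resampled to 16 kHz and converted to log-mel features by the processor before decoding. A minimal input-preparation sketch (the one second of silence is a placeholder for real speech, not part of this model card's pipeline):

```python
import numpy as np
from transformers import WhisperProcessor

# Same processor as loaded above for the adapter.
processor = WhisperProcessor.from_pretrained("openai/whisper-large-v2", task="transcribe")

# Placeholder audio: 1 s of silence at Whisper's expected 16 kHz sampling rate.
audio = np.zeros(16000, dtype=np.float32)

# Whisper pads/trims every input to a 30 s window of 80 log-mel bins.
features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
print(features.shape)  # torch.Size([1, 80, 3000])
```

Transcription would then run via `whipa_model.generate(features)` followed by `whipa_processor.batch_decode(..., skip_special_tokens=True)` to recover the IPA string.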

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

More information needed

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.2402        | 2.0   | 126  | 0.2061          |
| 0.1           | 4.0   | 252  | 0.1705          |
| 0.0411        | 6.0   | 378  | 0.1515          |
| 0.0118        | 8.0   | 504  | 0.1530          |
| 0.0056        | 10.0  | 630  | 0.1585          |

### Framework versions

- PEFT 0.15.1
- Transformers 4.48.3
- PyTorch 2.6.0+cu124
- Datasets 3.2.0