---
language: ar
license: apache-2.0
tags:
- automatic-speech-recognition
- speech
- audio
- transformers
- peft
- lora
- adapter
library_name: transformers
pipeline_tag: automatic-speech-recognition
---

# Bruno7/ksa-whisper-model

## Model Description

A Whisper adapter fine-tuned for Arabic speech recognition in the Saudi dialect.

## Base Model

This adapter is designed to work with: `openai/whisper-large-v3`

## Usage

```python
import torch
from transformers import pipeline
from peft import PeftModel, PeftConfig

# Load the adapter configuration to find the base model
config = PeftConfig.from_pretrained("Bruno7/ksa-whisper-model")

# Build a pipeline around the base model
pipe = pipeline(
    "automatic-speech-recognition",
    model=config.base_model_name_or_path,
    device="cuda" if torch.cuda.is_available() else "cpu",
)

# Load and apply the adapter
pipe.model = PeftModel.from_pretrained(pipe.model, "Bruno7/ksa-whisper-model")

# Transcribe an audio file
result = pipe("path_to_audio.wav")
print(result["text"])
```

### Alternative Usage (Direct Loading)

```python
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
from peft import PeftModel

# Load base model and processor
processor = AutoProcessor.from_pretrained("openai/whisper-large-v3")
model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-large-v3")

# Apply the adapter
model = PeftModel.from_pretrained(model, "Bruno7/ksa-whisper-model")

# Your inference code here
```

## Model Architecture

This is a PEFT (Parameter-Efficient Fine-Tuning) adapter that modifies a base Whisper model for improved performance on a specific domain or language. The adapter uses LoRA (Low-Rank Adaptation) to fine-tune the model efficiently while keeping the number of trainable parameters minimal.

## Inference

This adapter can be applied to the base model for domain-specific speech recognition tasks.

## Limitations

- Requires the base model to be loaded separately
- Performance may vary with different audio qualities and accents
- Requires audio preprocessing for optimal results
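As a minimal sketch of the LoRA idea described under Model Architecture: instead of updating a full weight matrix `W`, LoRA trains two small matrices `A` and `B` whose product forms a low-rank update. The dimensions, rank, and scaling below are hypothetical illustrations, not the actual configuration of this adapter:

```python
import numpy as np

# Illustrative LoRA update: the effective weight is W + (alpha / r) * B @ A,
# where only A and B are trained and W stays frozen.
rng = np.random.default_rng(0)

d_out, d_in, r, alpha = 64, 64, 8, 16    # hypothetical dimensions and rank
W = rng.standard_normal((d_out, d_in))   # frozen base weight
A = rng.standard_normal((r, d_in))       # trainable, low-rank factor
B = np.zeros((d_out, r))                 # trainable, initialized to zero

delta = (alpha / r) * B @ A              # update has rank at most r
W_adapted = W + delta

# Trainable parameters vs. full fine-tuning of this layer:
full = W.size
lora = A.size + B.size
print(f"full: {full}, lora: {lora}, ratio: {lora / full:.2f}")
# → full: 4096, lora: 1024, ratio: 0.25
```

Because `B` starts at zero, the adapted model initially behaves exactly like the base model; training then moves only the small `A`/`B` factors, which is why the adapter checkpoint stays tiny relative to `whisper-large-v3`.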