Instructions to use fjmgAI/whisper-large-v3-ATC with the Transformers library.

How to use fjmgAI/whisper-large-v3-ATC with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="fjmgAI/whisper-large-v3-ATC")

# Or load the model directly (AutoModelForSpeechSeq2Seq keeps the generation
# head needed for transcription; plain AutoModel would drop it)
from transformers import AutoModelForSpeechSeq2Seq

model = AutoModelForSpeechSeq2Seq.from_pretrained("fjmgAI/whisper-large-v3-ATC", dtype="auto")
```
Fine-Tuned Model
fjmgAI/whisper-large-v3-ATC
Base Model
unsloth/whisper-large-v3
Fine-Tuning Method
Fine-tuning was performed using unsloth, an efficient fine-tuning framework optimized for low-resource environments.
Dataset
Description
This dataset contains 14,830 examples, each pairing a transcription with its corresponding audio file, drawn from two main sources: ATCO2 and the UWB-ATCC corpus, both selected for aviation-related communications.
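A practical detail when working with ATC corpora such as ATCO2 and UWB-ATCC is that transcripts conventionally spell numerals as spoken words (with ICAO pronunciations such as "niner" for 9), so reference text and model output should be normalized consistently before scoring. The sketch below is purely illustrative and is not part of the released training pipeline; the function and mapping names are invented for this example:

```python
# Illustrative ATC text normalization (not from the actual training pipeline):
# expand digit strings into spoken digits so references and hypotheses match.
DIGIT_WORDS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "niner",
}

def normalize_atc(text: str) -> str:
    """Lowercase the text and expand digit tokens, e.g. '27' -> 'two seven'."""
    tokens = []
    for tok in text.lower().split():
        if tok.isdigit():
            tokens.extend(DIGIT_WORDS[d] for d in tok)
        else:
            tokens.append(tok)
    return " ".join(tokens)

print(normalize_atc("Runway 27 cleared for takeoff"))
# runway two seven cleared for takeoff
```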
Fine-Tuning Details
- The model was trained using the Seq2SeqTrainer.
- The Word Error Rate (WER) was used as the evaluation metric to track and select the model's performance during fine-tuning, while the model itself was optimized with the standard sequence-to-sequence cross-entropy loss (WER is not differentiable, so it cannot serve directly as a training loss).
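WER is the word-level edit distance between hypothesis and reference, divided by the number of reference words. A fine-tuning run like this one would typically compute it with the `evaluate`/`jiwer` packages installed below; the standalone pure-Python sketch here is only illustrative:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (substitutions + deletions + insertions) / reference words,
    computed as a word-level Levenshtein distance via dynamic programming."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[len(ref)][len(hyp)] / len(ref)

# one substitution against a five-word reference -> 1/5
print(wer("contact tower one two three", "contact ground one two three"))
# 0.2
```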
Usage
Direct Usage (Unsloth)
First install the dependencies:
Colab Version
```python
%%capture
!pip install --no-deps bitsandbytes accelerate xformers==0.0.29.post3 peft trl==0.15.2 triton cut_cross_entropy unsloth_zoo
!pip install sentencepiece protobuf "datasets>=3.4.1" huggingface_hub hf_transfer
!pip install transformers==4.51.3
!pip install --no-deps unsloth
!pip install librosa soundfile evaluate jiwer
```
No Colab Version
```shell
pip install unsloth
pip install librosa soundfile evaluate jiwer
```
Then you can load this model and run inference.
```python
import torch
from unsloth import FastModel
from transformers import pipeline
from transformers import WhisperForConditionalGeneration

# Load the fine-tuned checkpoint through Unsloth's FastModel wrapper
model, tokenizer = FastModel.from_pretrained(
    model_name="fjmgAI/whisper-large-v3-ATC",
    dtype=None,            # auto-select precision for the available hardware
    load_in_4bit=False,
    auto_model=WhisperForConditionalGeneration,
    whisper_language="English",
    whisper_task="transcribe",
)

# Pin generation to English transcription and clear suppression defaults
model.generation_config.language = "<|en|>"
model.generation_config.task = "transcribe"
model.config.suppress_tokens = []
model.generation_config.forced_decoder_ids = None

# FastModel returns the Whisper processor as `tokenizer`; unpack its
# tokenizer and feature extractor for the ASR pipeline
whisper = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=tokenizer.tokenizer,
    feature_extractor=tokenizer.feature_extractor,
    processor=tokenizer,
    return_language=True,
    torch_dtype=torch.float16,
)

audio_file = "audio_example.flac"
transcribed_text = whisper(audio_file)
print(transcribed_text["text"])
```
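Whisper encodes at most about 30 seconds of audio per forward pass. For longer ATC recordings, the Transformers ASR pipeline supports chunked long-form transcription via its `chunk_length_s` argument (e.g. `whisper(audio_file, chunk_length_s=30)`). The sketch below only illustrates the underlying overlapping-window idea in plain Python; the function name and parameter values are illustrative, not taken from the pipeline internals:

```python
def chunk_audio(samples, sr=16_000, chunk_s=30.0, overlap_s=5.0):
    """Split a waveform into fixed-length windows that overlap by `overlap_s`
    seconds, so words cut at one boundary appear whole in the next window."""
    chunk = int(chunk_s * sr)
    step = chunk - int(overlap_s * sr)
    chunks = []
    start = 0
    while True:
        chunks.append(samples[start:start + chunk])
        if start + chunk >= len(samples):
            break
        start += step
    return chunks

# 70 s of dummy samples at a toy 100 Hz rate -> three overlapping 30 s windows
print(len(chunk_audio(list(range(7000)), sr=100)))
# 3
```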
Purpose
This fine-tuned model is designed for Speech-to-Text (STT) applications in Air Traffic Control (ATC) environments, leveraging a specialized ATC dataset to enhance robustness and precision in transcribing ATC recordings. The model aims to deliver accurate and reliable transcription while maintaining efficient performance.
- Developed by: fjmgAI
- License: apache-2.0
