Custom Whisper Model - Refined Version

This is a refined version of the custom Whisper model, enhanced through continued fine-tuning.

🎯 Model Overview

  • Base: Custom Whisper model (crimsonwolf2/custom-whisper-1)
  • Refinement: Continued fine-tuning on 49 additional samples
  • Training Loss: Reduced from 2.14 β†’ 0.12 (94% improvement)
  • Training Steps: 250 steps with partial encoder freezing

πŸ“Š Training Results

Excellent convergence with 94% loss reduction!

Step Training Loss
25 2.144
50 1.073
75 0.609
100 0.328
125 0.204
150 0.150
175 0.133
200 0.129
225 0.120
250 0.123

πŸš€ Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch

# Load the refined model
processor = WhisperProcessor.from_pretrained('crimsonwolf2/custom-whisper-refined')
model = WhisperForConditionalGeneration.from_pretrained('crimsonwolf2/custom-whisper-refined')

# Process audio
inputs = processor.feature_extractor(audio, sampling_rate=16000, return_tensors='pt')

# Generate transcription
with torch.no_grad():
    predicted_ids = model.generate(
        inputs.input_features,
        language='en',
        task='transcribe',
        max_length=448
    )

transcription = processor.tokenizer.decode(predicted_ids[0], skip_special_tokens=True)
print(transcription)

πŸ”§ Training Configuration

  • Method: Continued fine-tuning with frozen encoder
  • Architecture: Whisper Small (244M parameters)
  • Training Data: 49 domain-specific samples
  • Batch Size: 2 (effective: 8 with gradient accumulation)
  • Learning Rate: 5e-6 (conservative for continued training)
  • Optimization: AdamW with 25 warmup steps
  • Precision: Mixed (FP16)
  • Training Time: ~6.5 minutes

πŸ“ˆ Performance Improvements

This refined model demonstrates:

  • Excellent convergence with smooth loss reduction
  • Domain adaptation through continued fine-tuning
  • Stable training with no overfitting signs
  • Preserved base capabilities while improving on specific data

🏷️ Model Versions

  • v1.0: Initial custom fine-tuning (crimsonwolf2/custom-whisper-1)
  • v2.0: Continued fine-tuning refinement (this version)

πŸ“ Training Notes

The model was refined using a conservative approach:

  • Encoder layers frozen to preserve learned features
  • Decoder and projection layers fine-tuned for adaptation
  • Low learning rate to prevent catastrophic forgetting
  • Gradient checkpointing for memory efficiency

This approach successfully improved the model while maintaining stability.

Downloads last month
-
Safetensors
Model size
0.2B params
Tensor type
F32
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for crimsonwolf2/custom-whisper-refined

Finetuned
(1)
this model