Custom Whisper Model - Refined Version
This is a refined version of the custom Whisper model, enhanced through continued fine-tuning.
π― Model Overview
- Base: Custom Whisper model (crimsonwolf2/custom-whisper-1)
- Refinement: Continued fine-tuning on 49 additional samples
- Training Loss: Reduced from 2.14 β 0.12 (94% improvement)
- Training Steps: 250 steps with partial encoder freezing
π Training Results
Excellent convergence with 94% loss reduction!
| Step | Training Loss |
|---|---|
| 25 | 2.144 |
| 50 | 1.073 |
| 75 | 0.609 |
| 100 | 0.328 |
| 125 | 0.204 |
| 150 | 0.150 |
| 175 | 0.133 |
| 200 | 0.129 |
| 225 | 0.120 |
| 250 | 0.123 |
π Usage
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch
# Load the refined model
processor = WhisperProcessor.from_pretrained('crimsonwolf2/custom-whisper-refined')
model = WhisperForConditionalGeneration.from_pretrained('crimsonwolf2/custom-whisper-refined')
# Process audio
inputs = processor.feature_extractor(audio, sampling_rate=16000, return_tensors='pt')
# Generate transcription
with torch.no_grad():
predicted_ids = model.generate(
inputs.input_features,
language='en',
task='transcribe',
max_length=448
)
transcription = processor.tokenizer.decode(predicted_ids[0], skip_special_tokens=True)
print(transcription)
π§ Training Configuration
- Method: Continued fine-tuning with frozen encoder
- Architecture: Whisper Small (244M parameters)
- Training Data: 49 domain-specific samples
- Batch Size: 2 (effective: 8 with gradient accumulation)
- Learning Rate: 5e-6 (conservative for continued training)
- Optimization: AdamW with 25 warmup steps
- Precision: Mixed (FP16)
- Training Time: ~6.5 minutes
π Performance Improvements
This refined model demonstrates:
- Excellent convergence with smooth loss reduction
- Domain adaptation through continued fine-tuning
- Stable training with no overfitting signs
- Preserved base capabilities while improving on specific data
π·οΈ Model Versions
- v1.0: Initial custom fine-tuning (crimsonwolf2/custom-whisper-1)
- v2.0: Continued fine-tuning refinement (this version)
π Training Notes
The model was refined using a conservative approach:
- Encoder layers frozen to preserve learned features
- Decoder and projection layers fine-tuned for adaptation
- Low learning rate to prevent catastrophic forgetting
- Gradient checkpointing for memory efficiency
This approach successfully improved the model while maintaining stability.
- Downloads last month
- -
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
π
Ask for provider support
Model tree for crimsonwolf2/custom-whisper-refined
Base model
crimsonwolf2/custom-whisper-1