Custom Whisper Model - Refined Version

This is a refined version of the custom Whisper model, enhanced through continued fine-tuning.

🎯 Model Overview

Base: Custom Whisper model (crimsonwolf2/custom-whisper-1)
Refinement: Continued fine-tuning on 49 additional samples
Training Loss: Reduced from 2.14 → 0.12 (94% improvement)
Training Steps: 250 steps with partial encoder freezing

📊 Training Results

Excellent convergence with 94% loss reduction!

Step	Training Loss
25	2.144
50	1.073
75	0.609
100	0.328
125	0.204
150	0.150
175	0.133
200	0.129
225	0.120
250	0.123

🚀 Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch

# Load the refined model
processor = WhisperProcessor.from_pretrained('crimsonwolf2/custom-whisper-refined')
model = WhisperForConditionalGeneration.from_pretrained('crimsonwolf2/custom-whisper-refined')

# Process audio
inputs = processor.feature_extractor(audio, sampling_rate=16000, return_tensors='pt')

# Generate transcription
with torch.no_grad():
    predicted_ids = model.generate(
        inputs.input_features,
        language='en',
        task='transcribe',
        max_length=448
    )

transcription = processor.tokenizer.decode(predicted_ids[0], skip_special_tokens=True)
print(transcription)

🔧 Training Configuration

Method: Continued fine-tuning with frozen encoder
Architecture: Whisper Small (244M parameters)
Training Data: 49 domain-specific samples
Batch Size: 2 (effective: 8 with gradient accumulation)
Learning Rate: 5e-6 (conservative for continued training)
Optimization: AdamW with 25 warmup steps
Precision: Mixed (FP16)
Training Time: ~6.5 minutes

📈 Performance Improvements

This refined model demonstrates:

Excellent convergence with smooth loss reduction
Domain adaptation through continued fine-tuning
Stable training with no overfitting signs
Preserved base capabilities while improving on specific data

🏷️ Model Versions

v1.0: Initial custom fine-tuning (crimsonwolf2/custom-whisper-1)
v2.0: Continued fine-tuning refinement (this version)

📝 Training Notes

The model was refined using a conservative approach:

Encoder layers frozen to preserve learned features
Decoder and projection layers fine-tuned for adaptation
Low learning rate to prevent catastrophic forgetting
Gradient checkpointing for memory efficiency

This approach successfully improved the model while maintaining stability.

Downloads last month: -

Safetensors

Model size

0.2B params

Tensor type

F32

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for crimsonwolf2/custom-whisper-refined

Base model

crimsonwolf2/custom-whisper-1

Finetuned

(1)

this model