Whisper-Small-Morse

This model is fine-tuned from openai/whisper-small on 100k synthetic samples of Morse code audio. Each sample pairs the audio with a transcription (the raw decoded text, in capitals) and a translation (an English interpretation). The otherwise unused <|startoflm|> token serves as the language marker.
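To make the role of the language marker concrete, here is a minimal sketch of the decoder prompt this setup implies. It assumes the model keeps the standard Whisper prompt layout (start-of-transcript, language, task, no-timestamps) and simply puts <|startoflm|> in the language slot; the helper name `build_decoder_prompt` is hypothetical, and whether <|notimestamps|> is used is also an assumption.

```python
# Standard Whisper special tokens. Only the use of <|startoflm|> as the
# language marker comes from this model card; the layout is an assumption.
START_OF_TRANSCRIPT = "<|startoftranscript|>"
MORSE_LANGUAGE = "<|startoflm|>"
NO_TIMESTAMPS = "<|notimestamps|>"
TASK_TOKENS = {
    "transcribe": "<|transcribe|>",  # raw capitalized decoded text
    "translate": "<|translate|>",    # English interpretation
}


def build_decoder_prompt(task):
    """Return the assumed decoder prompt tokens for the given task."""
    if task not in TASK_TOKENS:
        raise ValueError(f"unknown task: {task!r}")
    return [START_OF_TRANSCRIPT, MORSE_LANGUAGE, TASK_TOKENS[task], NO_TIMESTAMPS]


if __name__ == "__main__":
    print(build_decoder_prompt("transcribe"))
    print(build_decoder_prompt("translate"))
```

If this layout holds, the token strings could be mapped to ids with the tokenizer's `convert_tokens_to_ids` and passed to generation as forced decoder ids.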

Ethical Considerations

This model was trained on synthetic text generated by Anthropic's Claude. It may inherit biases of that underlying model, especially in the translation task. Much of the generated text relates to the amateur radio domain.

Audio data was generated from the ground-truth transcription text, with randomly sampled tone, noise, and speed (WPM).
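The audio generation described above can be sketched roughly as follows. This is not the author's generator: the sampling ranges for tone, noise, and WPM, the 16 kHz sample rate, and the function names are all assumptions made for illustration. Timing follows the common PARIS convention, in which one dot lasts 1.2 / WPM seconds.

```python
import math
import random

# International Morse code for letters and digits.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..", "0": "-----", "1": ".----", "2": "..---",
    "3": "...--", "4": "....-", "5": ".....", "6": "-....", "7": "--...",
    "8": "---..", "9": "----.",
}


def morse_timeline(text):
    """Return (tone_on, length_in_dot_units) segments for the text."""
    # Encode each word, silently dropping characters without a Morse code.
    words = [[MORSE[c] for c in w if c in MORSE] for w in text.upper().split()]
    words = [w for w in words if w]
    segments = []
    for wi, word in enumerate(words):
        if wi > 0:
            segments.append((False, 7))  # gap between words
        for ci, code in enumerate(word):
            if ci > 0:
                segments.append((False, 3))  # gap between letters
            for si, symbol in enumerate(code):
                if si > 0:
                    segments.append((False, 1))  # gap between dots/dashes
                segments.append((True, 1 if symbol == "." else 3))
    return segments


def synthesize(text, rng, sample_rate=16000):
    """Render text as noisy Morse audio with randomly drawn parameters."""
    # Assumed ranges; the model card does not state the actual ones.
    params = {
        "tone_hz": rng.uniform(400.0, 900.0),
        "wpm": rng.uniform(10.0, 30.0),
        "noise_std": rng.uniform(0.0, 0.1),
    }
    dot_seconds = 1.2 / params["wpm"]  # PARIS timing convention
    samples = []
    for tone_on, units in morse_timeline(text):
        for _ in range(round(units * dot_seconds * sample_rate)):
            t = len(samples) / sample_rate
            tone = 0.5 * math.sin(2 * math.pi * params["tone_hz"] * t) if tone_on else 0.0
            samples.append(tone + rng.gauss(0.0, params["noise_std"]))
    return samples, params


if __name__ == "__main__":
    audio, params = synthesize("CQ CQ DE TEST", random.Random(0))
    print(len(audio), params)
```

Passing a seeded `random.Random` keeps each sample reproducible, so the same text and seed always yield the same audio and parameters.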

Training

This model was fine-tuned on an NVIDIA RTX 4070 Ti Super for approximately 48 hours, over 15 epochs.

Model size: 0.2B parameters, stored as F32 tensors in Safetensors format.
