Transducens/error-preserving-whisper-distilled


Gabi00 committed on Oct 9, 2024

Commit be20b67 (verified) · 1 parent: be7028f

Update README.md

Files changed (1)

README.md +43 -1


@@ -18,6 +18,40 @@ grammatical mistakes, slang, and non-native speaker errors. This model helps imp
 in scenarios where speakers use incorrect or informal English, making it useful in language learning,
 transcription of casual conversations, or analyzing spoken communication from non-native English speakers.
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 1e-05
+- train_batch_size: 28
+- eval_batch_size: 28
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 50
+- training_steps: 100000
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Wer     |
+|:-------------:|:------:|:----:|:---------------:|:-------:|
+| 1.5189        | 0.4444 | 500  | 1.1913          | 25.9108 |
+| 1.1727        | 0.8889 | 1000 | 0.9531          | 24.5396 |
+| 1.1341        | 1.3333 | 1500 | 0.8688          | 22.2761 |
+| 1.0152        | 1.7778 | 2000 | 0.8174          | 20.8792 |
+| 1.0589        | 2.2222 | 2500 | 0.7855          | 20.7595 |
+| 0.9793        | 2.6667 | 3000 | 0.7611          | 22.2846 |
+| 0.9594        | 3.1111 | 3500 | 0.7442          | 20.3860 |
+| 1.0031        | 3.5556 | 4000 | 0.7303          | 18.5045 |
+| 0.9525        | 4.0    | 4500 | 0.7199          | 18.1054 |
+| 0.8729        | 4.4444 | 5000 | 0.7105          | 19.3170 |
+| 1.0031        | 4.8889 | 5500 | 0.7028          | 19.7446 |
+| 0.9273        | 5.3333 | 6000 | 0.6966          | 19.7189 |
+| 0.9174        | 5.7778 | 6500 | 0.6896          | 18.4475 |
+| 0.8842        | 6.2222 | 7000 | 0.6839          | 18.4361 |
 ## Usage Guide
 This project was executed on an Ubuntu 22.04.3 system running Linux kernel 6.8.0-40-generic.
@@ -80,4 +114,12 @@ model.generation_config.task = "transcribe"
 tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-large-v3", task="transcribe")
 feature_extractor = WhisperFeatureExtractor.from_pretrained("openai/whisper-large-v3")
-pipe = pipeline(model=model, tokenizer=tokenizer, feature_extractor=feature_extractor, task="automatic-speech-recognition", device=device)
+pipe = pipeline(model=model, tokenizer=tokenizer, feature_extractor=feature_extractor, task="automatic-speech-recognition", device=device)
+### Framework versions
+- PEFT 0.11.1
+- Transformers 4.42.4
+- Pytorch 2.1.0+cu118
+- Datasets 2.20.0
+- Tokenizers 0.19.1
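
Note on the added hyperparameters: the README lists `learning_rate: 1e-05`, `lr_scheduler_type: linear`, `lr_scheduler_warmup_steps: 50` and `training_steps: 100000`, but does not spell out the resulting learning-rate curve. The sketch below shows what those four values would imply, assuming the schedule follows the usual linear warmup followed by linear decay to zero (the shape implemented by `get_linear_schedule_with_warmup` in Transformers). That assumption is not confirmed by the commit, and the helper name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=1e-5, warmup_steps=50, total_steps=100_000):
    """Learning rate at `step`, assuming linear warmup then linear decay to zero.

    Defaults are taken from the hyperparameter list in the README diff;
    the schedule shape itself is an assumption, not something the commit states.
    """
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to base_lr over `warmup_steps` steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at `total_steps`.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Steps chosen to cover warmup, the logged range of the results table,
    # and the end of the configured schedule.
    for step in (0, 25, 50, 500, 7000, 100_000):
        print(f"step {step:>6}: lr = {linear_lr(step):.3e}")
```

Under this assumption, the rate at step 7000 (the last row of the results table) would still be about 93% of its peak, so the logged rows cover only the early part of the configured 100000-step schedule.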