noflm committed (verified)
Commit 74cc5dd · 1 parent: 326c994

Add model card
Files changed (1):
  1. README.md (+86, −0)

---
language:
- ja
license: mit
tags:
- whisper
- fine-tuning
- jdd-topic1
- speechbrain
- automatic-speech-recognition
base_model: openai/whisper-base
datasets:
- noflm/jdd_topic1_batch2
pipeline_tag: automatic-speech-recognition
---

# Whisper Fine-tuning Experiment: jdd_topic1_batch2-whisper-base

## Model Description

This repository contains a complete Whisper fine-tuning experiment, including:
- Training checkpoints (SpeechBrain format)
- Final model (Transformers format)
- Test results and evaluation metrics
- Training history visualizations

## Model Information

- **Base Model**: openai/whisper-base
- **Framework**: SpeechBrain v1.0.3
- **Training Dataset**: [noflm/jdd_topic1_batch2](https://huggingface.co/datasets/noflm/jdd_topic1_batch2)
- **Language**: Japanese (ja)
- **Task**: Automatic Speech Recognition (ASR)

## Test Results

- **WER**: 12.17%
- **CER**: 9.08%
- **Test Loss**: 0.0814

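Both error rates are standard edit-distance metrics: WER over words, CER over characters (CER is often the more informative of the two for Japanese, which has no whitespace word boundaries). As a point of reference only, here is a minimal sketch of how such scores are typically computed with the `jiwer` library; the transcript strings below are placeholders, not data from this test set:

```python
# pip install jiwer
import jiwer

# Placeholder transcripts (space-separated tokens); a real evaluation
# would iterate over every utterance in the test set. For Japanese,
# WER additionally requires a word segmenter such as MeCab.
reference = "kyou wa ii tenki desu"
hypothesis = "kyou wa ii tenki deshita"

print(f"WER: {jiwer.wer(reference, hypothesis):.2%}")  # word-level edit distance
print(f"CER: {jiwer.cer(reference, hypothesis):.2%}")  # character-level edit distance
```
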
## Contents

```
├── checkpoints/                       # Training checkpoints
│   ├── CKPT+epoch_*/                  # Per-epoch checkpoints
│   ├── CKPT+BEST_WER/                 # Best WER checkpoint
│   └── CKPT+FINAL/                    # Final checkpoint
├── final_model/                       # Transformers-compatible model
│   ├── config.json                    # Model configuration
│   ├── model.safetensors              # Model weights
│   ├── preprocessor_config.json
│   ├── tokenizer_config.json
│   └── ...
├── test_results.json                  # Test metrics
├── detailed_metrics.json              # Detailed training history
├── training_history_speechbrain.png   # Training curves
└── training_report_speechbrain.txt    # Summary report
```

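The two JSON files can be inspected directly. A minimal sketch, assuming the repository has been downloaded locally and that `test_results.json` is a flat JSON object (its exact key names are not documented here):

```python
import json

# Load the test metrics written at the end of training
with open("test_results.json", encoding="utf-8") as f:
    results = json.load(f)

# Print whatever fields the file records (e.g., WER, CER, test loss)
for key, value in results.items():
    print(f"{key}: {value}")
```
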
## Usage

### Load a checkpoint (SpeechBrain format)

```python
import torch

# Load the best-WER checkpoint weights onto the CPU
checkpoint = torch.load("checkpoints/CKPT+BEST_WER/model.ckpt", map_location="cpu")
```

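The loaded object follows SpeechBrain's checkpoint layout rather than the Transformers one; assuming it is a flat dict of tensors, the stored parameters can be listed like this:

```python
# List the parameter names and shapes stored in the checkpoint
for name, tensor in checkpoint.items():
    print(name, tuple(tensor.shape))
```
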
### Load the final model (Transformers format)

```python
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model = WhisperForConditionalGeneration.from_pretrained("./final_model")
processor = WhisperProcessor.from_pretrained("./final_model")
```

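From there, the model can transcribe audio end to end. A minimal sketch, assuming a local audio file `sample.wav` (a placeholder name) and `librosa` for decoding and resampling:

```python
import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model = WhisperForConditionalGeneration.from_pretrained("./final_model")
processor = WhisperProcessor.from_pretrained("./final_model")

# Whisper expects 16 kHz mono audio
audio, _ = librosa.load("sample.wav", sr=16000, mono=True)

# Convert the waveform to log-mel input features
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

# Decode with Japanese transcription forced
predicted_ids = model.generate(inputs.input_features, language="ja", task="transcribe")
print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```
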
## Citation

If you use this experiment's data, please cite the original Whisper paper:

```bibtex
@article{radford2022robust,
  title={Robust speech recognition via large-scale weak supervision},
  author={Radford, Alec and Kim, Jong Wook and Xu, Tao and Brockman, Greg and McLeavey, Christine and Sutskever, Ilya},
  journal={arXiv preprint arXiv:2212.04356},
  year={2022}
}
```