carmi committed on
Commit
33d7b75
·
verified ·
1 Parent(s): db8ae18

Update README.md

Files changed (1)
  1. README.md +10 -3
README.md CHANGED
@@ -28,7 +28,7 @@ This model is a fine-tuned version of [Whisper Medium](https://github.com/openai
 
 The dataset used for training and fine-tuning this model consists of approximately 2,200 hours of transcribed audio, primarily featuring Israeli Levantine Arabic, along with some general Levantine Arabic content. The data sources include:
 
-1. **Self-maintained Collection**: 2,000 hours of audio data curated by the team, covering a wide range of Israeli Levantine Arabic speech.
+1. **Self-maintained Collection**: 1,200 hours of audio data curated by the team, covering a wide range of Israeli Levantine Arabic speech.
 
 - **Total Dataset Size**: ~1,200 hours
 - **Sampling Rate**: 8kHz - upsampled to 16kHz
@@ -39,11 +39,18 @@ The dataset used for training and fine-tuning this model consists of approximate
 The model is compatible with 16kHz audio input. Ensure your files are at the same sample rate for optimal results. You can load the model as follows:
 
 ```python
+pip install faster-whisper
 import faster_whisper
 import librosa
 
+model = faster_whisper.WhisperModel("model.bin")
+audio_file = 'your audio file.wav'
 with torch.no_grad():
 audio_data, sample_rate = librosa.load(audio_file)
 audio_data = librosa.resample(audio_data, orig_sr=sample_rate, target_sr=16000)
-segs, _ = model.transcribe(audio_data, language='ar')
-transcript = ' '.join(s.text for s in segs)
+segments, _ = model.transcribe(audio_data, language='ar')
+for segment in segments:
+for word in segment.words:
+print("[%.2fs -> %.2fs] %s" % (word.start, word.end, word.word))
+
+transcript = ' '.join(s.text for s in segments)
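
As committed, the usage example would likely not run unchanged: `pip install faster-whisper` is a shell command rather than Python, `torch` is never imported (and faster-whisper runs on CTranslate2, so `torch.no_grad()` should not be needed), `segment.words` is only populated when `word_timestamps=True` is passed to `transcribe`, and `segments` is a lazy generator that the `for` loop exhausts before the final `join`. The following is a minimal sketch of one way to address these points, not the author's method. It assumes the packages were installed from a shell beforehand (`pip install faster-whisper librosa`) and that the model is available as a converted CTranslate2 directory; the function names and the `model_dir` parameter are hypothetical and do not appear in the README.

```python
from typing import Iterable, List, Tuple


def collect_transcript(segments: Iterable) -> Tuple[List[str], str]:
    """Materialize the lazy segment generator once, then derive both outputs."""
    segments = list(segments)  # a generator can only be iterated once
    word_lines = []
    for segment in segments:
        # segment.words is None unless word_timestamps=True was requested
        for word in segment.words or []:
            word_lines.append(
                "[%.2fs -> %.2fs] %s" % (word.start, word.end, word.word)
            )
    # Segment texts usually carry leading whitespace, so strip before joining.
    transcript = " ".join(s.text.strip() for s in segments)
    return word_lines, transcript


def transcribe_file(audio_file: str, model_dir: str) -> Tuple[List[str], str]:
    """Hypothetical wrapper: model_dir is an assumed CTranslate2 model path."""
    # Imported lazily so collect_transcript works without these packages.
    import faster_whisper
    import librosa

    model = faster_whisper.WhisperModel(model_dir)
    # Passing sr=16000 makes librosa resample while loading.
    audio_data, _ = librosa.load(audio_file, sr=16000, mono=True)
    segments, _info = model.transcribe(
        audio_data, language="ar", word_timestamps=True
    )
    return collect_transcript(segments)
```

Calling `transcribe_file('your audio file.wav', model_dir)` would then return the per-word timestamp lines and the joined transcript. Converting the generator to a list keeps all segments in memory, which is a simplification that may not suit very long recordings.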