Commit 2ade873 by Jaal047 · verified · 1 Parent(s): 182ece1

Update README.md

  1. README.md +52 -3
README.md CHANGED
@@ -1,3 +1,52 @@
- ---
- license: cc-by-sa-4.0
- ---
+ ---
+ license: cc-by-sa-4.0
+ ---
+ # Detect Profanity in Surabaya Javanese Dialect
+ This model was built for the project
+ [Deteksi Perkataan Vulgar Dalam Bahasa Jawa Dialek Surabaya Pada Konten Video Dengan Speech-To-Text](https://github.com/jaal047/Detect-Profanity-in-Surabaya-Javanese-Dialect) ("Detecting Vulgar Speech in the Surabaya Dialect of Javanese in Video Content with Speech-to-Text").
+
+ It is the [indonesian-nlp/wav2vec2-indonesian-javanese-sundanese](https://huggingface.co/indonesian-nlp/wav2vec2-indonesian-javanese-sundanese)
+ model, fine-tuned on the [Profanity Speech Suroboyoan dataset](https://huggingface.co/datasets/Jaal047/profanity-speech-suroboyoan).
+
+ When using this model, make sure that your speech input is sampled at 16 kHz.
+
+ ## Usage
+ The model can be used directly (without a language model) as follows:
+ ```python
+ import torch
+ import torchaudio
+ from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
+ import noisereduce as nr
+ import librosa
+ import soundfile as sf
+
+ # Load the model and processor
+ processor = Wav2Vec2Processor.from_pretrained("Jaal047/profanity-javanese-sby")
+ model = Wav2Vec2ForCTC.from_pretrained("Jaal047/profanity-javanese-sby")
+
+ # Load the audio at 16 kHz, reduce its noise, and save the result
+ file_audio_path = 'audio.wav'
+ y, sr = librosa.load(file_audio_path, sr=16000)
+ reduced_noise = nr.reduce_noise(y=y, sr=sr)
+ sf.write('audio_reduced_noise1.wav', reduced_noise, sr)
+
+ # Load an audio file and preprocess it for the model
+ def load_and_preprocess_audio(file_path):
+     audio_array, sampling_rate = torchaudio.load(file_path)
+     if sampling_rate != 16000:
+         audio_array = torchaudio.transforms.Resample(orig_freq=sampling_rate, new_freq=16000)(audio_array)
+     audio_array = torchaudio.transforms.Vol(gain=1.0, gain_type='amplitude')(audio_array)
+     return audio_array.squeeze().numpy()
+
+ # Preprocess and run inference
+ audio_array = load_and_preprocess_audio('audio_reduced_noise1.wav')
+ inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt", padding=True)
+ with torch.no_grad():
+     logits = model(inputs.input_values).logits
+
+ # Take the argmax over the vocabulary and decode the prediction
+ predicted_ids = torch.argmax(logits, dim=-1)
+ transcription = processor.batch_decode(predicted_ids)[0]
+
+ print("Transcription:", transcription)
+ ```
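The last two steps of the added usage snippet (`torch.argmax` followed by `processor.batch_decode`) amount to greedy CTC decoding: take the most likely label per frame, collapse consecutive repeats, then drop the blank label. The sketch below illustrates that idea in plain Python. It is not the library's implementation; the toy vocabulary, `BLANK_ID`, and the function name `greedy_ctc_decode` are hypothetical and chosen only for illustration. The real vocabulary and blank token come from the model's processor.

```python
from itertools import groupby

# Hypothetical toy vocabulary (assumption); the real mapping ships with the processor.
BLANK_ID = 0
ID_TO_CHAR = {1: "a", 2: "k", 3: "u", 4: " "}


def greedy_ctc_decode(frame_ids, id_to_char=ID_TO_CHAR, blank_id=BLANK_ID):
    """Collapse consecutive repeated frame labels, then remove blanks."""
    # Step 1: merge runs of identical labels, e.g. [1, 1, 0, 2, 2] -> [1, 0, 2]
    collapsed = [label for label, _ in groupby(frame_ids)]
    # Step 2: drop the blank label and map the remaining ids to characters
    return "".join(id_to_char[i] for i in collapsed if i != blank_id)


# Per-frame argmax ids, as they might look after torch.argmax(logits, dim=-1)
frames = [1, 1, 0, 2, 2, 2, 0, 3, 3]
print(greedy_ctc_decode(frames))  # aku
```

Because blanks are removed only after repeats are collapsed, a blank between two identical labels keeps both characters: `[1, 0, 1]` decodes to `"aa"`, while `[1, 1]` decodes to `"a"`.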