Update README.md
README.md
CHANGED
@@ -1,3 +1,52 @@
---
license: cc-by-sa-4.0
---
# Detect Profanity in Surabaya Javanese Dialect
This model was built for the project
[Deteksi Perkataan Vulgar Dalam Bahasa Jawa Dialek Surabaya Pada Konten Video Dengan Speech-To-Text](https://github.com/jaal047/Detect-Profanity-in-Surabaya-Javanese-Dialect)
("Detecting Vulgar Speech in the Surabaya Dialect of Javanese in Video Content with Speech-to-Text").

It is the [indonesian-nlp/wav2vec2-indonesian-javanese-sundanese](https://huggingface.co/indonesian-nlp/wav2vec2-indonesian-javanese-sundanese)
model fine-tuned on the [Profanity Speech Suroboyoan dataset](https://huggingface.co/datasets/Jaal047/profanity-speech-suroboyoan).

When using this model, make sure that your speech input is sampled at 16 kHz.
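A quick way to check the 16 kHz requirement before running the model is to read the sample rate from the WAV header. The following is a minimal sketch using only Python's standard `wave` module, assuming an uncompressed PCM WAV file; the helper names `wav_sample_rate` and `needs_resampling` are hypothetical and not part of this model's code.

```python
import wave

EXPECTED_RATE = 16000  # sample rate (Hz) this model card asks for


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, expected_rate=EXPECTED_RATE):
    """Return True if the file's sample rate differs from the expected rate."""
    return wav_sample_rate(path) != expected_rate
```

For example, `needs_resampling("audio.wav")` returning `True` would mean the file should be resampled to 16 kHz before inference.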
## Usage
The model can be used directly (without a language model) as follows:
```python
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
import noisereduce as nr
import librosa
import soundfile as sf

# Load the model and processor
processor = Wav2Vec2Processor.from_pretrained("Jaal047/profanity-javanese-sby")
model = Wav2Vec2ForCTC.from_pretrained("Jaal047/profanity-javanese-sby")

# Load the audio (resampled to 16 kHz) and reduce its noise
file_audio_path = 'audio.wav'
y, sr = librosa.load(file_audio_path, sr=16000)
reduced_noise = nr.reduce_noise(y=y, sr=sr)
sf.write('audio_reduced_noise1.wav', reduced_noise, sr)

# Function to load and preprocess the audio
def load_and_preprocess_audio(file_path):
    audio_array, sampling_rate = torchaudio.load(file_path)
    # Resample only if the file is not already at 16 kHz
    if sampling_rate != 16000:
        audio_array = torchaudio.transforms.Resample(orig_freq=sampling_rate, new_freq=16000)(audio_array)
    # Gain of 1.0 leaves the amplitude unchanged; Vol also clamps samples to [-1, 1]
    audio_array = torchaudio.transforms.Vol(gain=1.0, gain_type='amplitude')(audio_array)
    return audio_array.squeeze().numpy()

# Preprocess and run inference
audio_array = load_and_preprocess_audio('audio_reduced_noise1.wav')
inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Take the argmax and decode the prediction
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]

print("Transcription:", transcription)
```
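The final argmax-and-decode step above is greedy CTC decoding: the most likely token is picked for each audio frame, consecutive repeats are merged, and the blank (padding) token is dropped. The sketch below illustrates that idea on a toy example. The vocabulary and frame IDs are made up for illustration and are not the model's actual vocabulary, and `ctc_greedy_collapse` is a hypothetical helper, not a `transformers` function.

```python
from itertools import groupby


def ctc_greedy_collapse(ids, vocab, blank_id=0):
    """Merge consecutive repeated IDs, then drop blanks and map IDs to characters."""
    collapsed = [key for key, _ in groupby(ids)]
    return "".join(vocab[i] for i in collapsed if i != blank_id)


# Toy vocabulary (assumption for illustration only); ID 0 plays the role of the CTC blank
toy_vocab = {0: "<pad>", 1: "a", 2: "k", 3: "u"}

# One predicted ID per audio frame, as torch.argmax(logits, dim=-1) would give
frame_ids = [0, 1, 1, 0, 2, 2, 2, 0, 3, 3]
print(ctc_greedy_collapse(frame_ids, toy_vocab))  # aku
```

A blank between two identical IDs keeps them separate, so `[1, 0, 1]` decodes to `"aa"` rather than `"a"`.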