Upload fine-tuned Bengali speaker diarization model

Files changed (5)

README.md CHANGED

@@ -1,25 +1,32 @@
 ---
-license: mit
 tags:
-  - speaker-diarization
-  - pyannote
-  - audio
 ---
 # diarization_filtered_v1
-## Training config
-- **Pretrained model:** `pyannote/segmentation-3.0`
-- **Embedding model:** `pyannote/wespeaker-voxceleb-resnet34-LM`
-- **Duration:** 10.0s chunks
-- **Max speakers/chunk:** 4
-- **Batch size:** 64
-- **Learning rate:** 2e-05
-- **Max epochs:** 5
-## Usage
-```python
-from huggingface_hub import hf_hub_download
-ckpt = hf_hub_download(repo_id="ishtiakmoin/diarization_filtered_v1", filename="final_model.ckpt")
-config = hf_hub_download(repo_id="ishtiakmoin/diarization_filtered_v1", filename="pipeline_config.json")
-```

 ---
+language:
+- bn
 tags:
+- speaker-diarization
+- pyannote
+- pyannote-audio
+- audio
+- voice
+- speech
+- bengali
+license: mit
+datasets:
+- custom
+metrics:
+- der
+model-index:
+- name: diarization_filtered_v1
+  results:
+  - task:
+      type: speaker-diarization
+      name: Speaker Diarization
+    metrics:
+    - type: der
+      value: Not computed
+      name: Diarization Error Rate
 ---
 # diarization_filtered_v1
+This is a fine-tuned speaker diarization model based on pyannote.audio, specifically trained on Bengali audio data.

USAGE.md ADDED


+# Example Usage: diarization_filtered_v1
+
+This example shows how to use the model for speaker diarization.
+
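The added USAGE.md announces an example but the diff shows no code for it. As a minimal sketch only, the files from this commit could be fetched the same way the removed README snippet did, with `hf_hub_download`. The repo id is taken from that removed snippet, the file names from this commit; the helper name `fetch_model_files` and its `download` parameter are hypothetical and exist only so the function can be exercised without network access. How `pytorch_model.bin` is then loaded into a pyannote pipeline is not shown in this commit and is not assumed here.

```python
# Repo id as it appeared in the removed README usage snippet.
REPO_ID = "ishtiakmoin/diarization_filtered_v1"

# File names as they appear in this commit.
FILES = ("pytorch_model.bin", "pipeline_config.json")


def fetch_model_files(repo_id=REPO_ID, download=None):
    """Return a {filename: local_path} mapping for the model files.

    `download` is a hypothetical injection point; when omitted, the real
    huggingface_hub.hf_hub_download is used (requires network access).
    """
    if download is None:
        from huggingface_hub import hf_hub_download as download
    return {name: download(repo_id=repo_id, filename=name) for name in FILES}
```

Calling `fetch_model_files()` with no arguments downloads both files into the local Hugging Face cache and returns their paths.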

config.yaml ADDED

+# Model configuration for pyannote.audio
+task:
+  name: SpeakerDiarization
+architecture:
+  name: PyanNet
+specifications:
+  duration: 5.0
+  sample_rate: 16000
+training:
+  batch_size: 32
+  learning_rate: 0.0001
+  max_epochs: 20
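The values in the added config.yaml do not match the training config removed from the README in this same commit. The sketch below just lists the differences side by side; both dictionaries are copied from the diff, and the helper name `diff_settings` is hypothetical. The commit does not say which set reflects the actual fine-tuning run.

```python
# Training config removed from README.md in this commit.
removed_readme = {
    "duration": 10.0,
    "batch_size": 64,
    "learning_rate": 2e-05,
    "max_epochs": 5,
}

# Values from the newly added config.yaml.
added_config_yaml = {
    "duration": 5.0,
    "batch_size": 32,
    "learning_rate": 0.0001,
    "max_epochs": 20,
}


def diff_settings(old, new):
    """Return {key: (old_value, new_value)} for shared keys that differ."""
    shared = sorted(old.keys() & new.keys())
    return {k: (old[k], new[k]) for k in shared if old[k] != new[k]}


mismatches = diff_settings(removed_readme, added_config_yaml)
for key, (old, new) in mismatches.items():
    print(f"{key}: {old} -> {new}")
```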

pipeline_config.json CHANGED

@@ -1,28 +1,16 @@
 {
-  "segmentation_model": "work/models/final_model.ckpt",
   "embedding_model": "pyannote/wespeaker-voxceleb-resnet34-LM",
-  "parameters": {
     "segmentation": {
       "threshold": 0.5,
-      "min_duration_off": 0.1,
-      "min_duration_on": 0.5
     },
     "clustering": {
       "method": "centroid",
       "threshold": 0.7,
       "min_cluster_size": 12
     }
-  },
-  "training_config": {
-    "pretrained_model": "pyannote/segmentation-3.0",
-    "duration": 10.0,
-    "max_speakers_per_chunk": 4,
-    "batch_size": 64,
-    "learning_rate": 2e-05,
-    "max_epochs": 5,
-    "warm_up": [
-      0.1,
-      0.1
-    ]
   }
 }

 {
+  "model_type": "speaker-diarization",
+  "pyannote_version": "3.3.2",
   "embedding_model": "pyannote/wespeaker-voxceleb-resnet34-LM",
+  "optimal_parameters": {
     "segmentation": {
       "threshold": 0.5,
+      "min_duration_off": 0.0
     },
     "clustering": {
       "method": "centroid",
       "threshold": 0.7,
       "min_cluster_size": 12
     }
   }
 }
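This change renames the top-level `parameters` block to `optimal_parameters`, drops `min_duration_on`, sets `min_duration_off` to 0.0, and removes the `segmentation_model` and `training_config` entries. Code written against the old layout will therefore fail on the `parameters` lookup. A minimal sketch of a reader that accepts either layout follows; the function name `read_parameters` is hypothetical, and the two JSON strings are abbreviated copies of the old and new files from the diff.

```python
import json

# Abbreviated old layout (before this commit).
OLD_CONFIG = """{
  "embedding_model": "pyannote/wespeaker-voxceleb-resnet34-LM",
  "parameters": {
    "segmentation": {"threshold": 0.5, "min_duration_off": 0.1, "min_duration_on": 0.5},
    "clustering": {"method": "centroid", "threshold": 0.7, "min_cluster_size": 12}
  }
}"""

# Abbreviated new layout (after this commit).
NEW_CONFIG = """{
  "model_type": "speaker-diarization",
  "pyannote_version": "3.3.2",
  "embedding_model": "pyannote/wespeaker-voxceleb-resnet34-LM",
  "optimal_parameters": {
    "segmentation": {"threshold": 0.5, "min_duration_off": 0.0},
    "clustering": {"method": "centroid", "threshold": 0.7, "min_cluster_size": 12}
  }
}"""


def read_parameters(config_text):
    """Return the hyperparameter block from either config layout."""
    config = json.loads(config_text)
    for key in ("optimal_parameters", "parameters"):
        if key in config:
            return config[key]
    raise KeyError("no parameter block found in pipeline config")
```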

pytorch_model.bin ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:9633b666624d820d36d046dbc89ebeae69b4c4c5ec41e42bb5e5dd79d52d9c68
+size 17735492
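The three lines above are a Git LFS pointer, not the weights themselves: they record the SHA-256 digest and byte size of the real file. A downloaded copy can be checked against them. The sketch below uses only the standard library; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local path passed to `matches_pointer` is whatever location the file was downloaded to.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer):
    """Check a local file's size and digest against a parsed pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large checkpoints are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents copied from this commit (diff "+" markers removed).
POINTER = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:9633b666624d820d36d046dbc89ebeae69b4c4c5ec41e42bb5e5dd79d52d9c68\n"
    "size 17735492\n"
)
```

`matches_pointer("pytorch_model.bin", POINTER)` returns `True` only when both the size and the digest agree with the pointer.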