---
tags:
- audio-classification
- sound-event-detection
- wav2vec2
- urban-acoustics
- deep-learning
datasets:
- UrbanSoundscape_EventDetection_Metadata
license: apache-2.0
model-index:
- name: UrbanSound_EventDetection_Wav2Vec2
  results:
  - task:
      name: Audio Classification
      type: audio-classification
    metrics:
    - type: accuracy
      value: 0.945
      name: Event Detection Accuracy
    - type: f1_macro
      value: 0.938
      name: Macro F1 Score
---

# UrbanSound_EventDetection_Wav2Vec2

## 👂 Overview

**UrbanSound_EventDetection_Wav2Vec2** is built on the pre-trained **Wav2Vec2** architecture and fine-tuned to classify momentary and continuous sound events in urban environments. It operates on raw audio waveforms and assigns each clip one of eight high-priority urban sound classes, with a focus on high-impact and potentially anomalous events.

## 🧠 Model Architecture

This model uses the standard Wav2Vec2 pipeline, which operates directly on raw audio without manual feature extraction (such as MFCCs).

* **Base Model:** `facebook/wav2vec2-base`
* **Feature Extractor:** A stack of 1D convolutional layers extracts local features from the raw waveform.
* **Transformer Encoder:** 12 Transformer layers capture long-range dependencies and global context within the audio clip.
* **Classification Head:** A task-specific linear layer on top of the contextualized representations predicts one of the 8 event labels.
* **Target Classes:** `Car_Horn`, `Children_Playing`, `Dog_Barking`, `Machinery_Hum`, `Siren_Emergency`, `Train_Whistle`, `Tire_Screech`, and `Glass_Shattering`.

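As a rough intuition for the feature extractor above: `facebook/wav2vec2-base` uses seven 1D convolutions with kernel sizes (10, 3, 3, 3, 3, 2, 2) and strides (5, 2, 2, 2, 2, 2, 2), so one second of 16 kHz audio is downsampled to 49 frames (~20 ms each) before reaching the Transformer. A minimal sketch of that calculation (the conv configuration is the standard wav2vec2-base one, not anything specific to this checkpoint):

```python
# Downsampling performed by the wav2vec2-base convolutional feature extractor.
# Kernel/stride values are the standard facebook/wav2vec2-base configuration.
CONV_KERNELS = [10, 3, 3, 3, 3, 2, 2]
CONV_STRIDES = [5, 2, 2, 2, 2, 2, 2]

def num_feature_frames(num_samples: int) -> int:
    """Number of frames the Transformer encoder sees for a raw waveform."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        # Standard 1D conv output length (no padding, dilation 1).
        length = (length - kernel) // stride + 1
    return length

print(num_feature_frames(16_000))  # 1 s at 16 kHz -> 49 frames
```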
## 🎯 Intended Use

This model is intended for smart-city, safety, and acoustic-monitoring systems:

1. **Acoustic Surveillance:** Real-time detection of emergency sounds (Siren, Glass Shattering, Tire Screech) for public-safety alerting.
2. **Noise Pollution Monitoring:** Quantifying the occurrence and frequency of specific noise sources (Car Horn, Machinery Hum) across city zones.
3. **Urban Planning:** Analyzing soundscape composition to inform zoning policy and noise-mitigation strategies.

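For the noise-monitoring use case, downstream aggregation can be as simple as counting detections per zone. A minimal sketch (the zone names and detection tuples are illustrative, not produced by the model):

```python
from collections import Counter

# Hypothetical stream of (zone, predicted_event) pairs from the classifier.
detections = [
    ("zone_a", "Car_Horn"),
    ("zone_a", "Car_Horn"),
    ("zone_a", "Machinery_Hum"),
    ("zone_b", "Siren_Emergency"),
]

# Occurrence counts per (zone, event) pair for reporting or dashboards.
counts = Counter(detections)
print(counts[("zone_a", "Car_Horn")])  # 2
```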
## ⚠️ Limitations

1. **Event Overlap:** The model is trained for single-label classification. When multiple sounds occur simultaneously (e.g., Siren + Dog Barking), it outputs only the single most probable event and may miss the others.
2. **Domain Shift:** Performance may degrade in environments whose background-noise profiles differ significantly from the training data (e.g., quiet suburbs vs. dense, loud street markets).
3. **Localization:** This model performs *event detection*; it does not provide *sound localization* (Direction of Arrival, DOA), which would require specialized input (such as ambisonic audio) and a different model head.

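The single-label limitation above stems from the softmax output layer: class probabilities compete, so only the argmax event is reported. A multi-label variant would instead use independent sigmoids with a threshold (and would need to be trained with a binary cross-entropy objective). A minimal sketch of the output-layer difference on hypothetical logits (not the actual model head):

```python
import math

# Hypothetical logits for the 8 classes; two events are loud simultaneously.
labels = ["Car_Horn", "Children_Playing", "Dog_Barking", "Machinery_Hum",
          "Siren_Emergency", "Train_Whistle", "Tire_Screech", "Glass_Shattering"]
logits = [-0.1, -2.0, 3.0, -1.0, 3.5, -2.5, -0.5, -1.5]

# Single-label (current setup): softmax + argmax reports exactly one event.
exps = [math.exp(z) for z in logits]
probs = [e / sum(exps) for e in exps]
top1 = labels[probs.index(max(probs))]
print(top1)  # Siren_Emergency -- the co-occurring Dog_Barking is discarded

# Multi-label alternative: independent sigmoids, each thresholded at 0.5.
sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
detected = [lab for lab, z in zip(labels, logits) if sigmoid(z) > 0.5]
print(detected)  # ['Dog_Barking', 'Siren_Emergency']
```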
---

### MODEL 2: **MedicalChatbot_IntentClassifier_RoBERTa**

A RoBERTa-based model for multi-class classification of user intent in medical dialogue transcripts.

#### config.json

```json
{
  "_name_or_path": "roberta-base",
  "architectures": [
    "RobertaForSequenceClassification"
  ],
  "hidden_size": 768,
  "model_type": "roberta",
  "num_hidden_layers": 12,
  "vocab_size": 50265,
  "id2label": {
    "0": "Symptom_Reporting",
    "1": "Advice_Seeking",
    "2": "Medication_Query",
    "3": "Appointment_Scheduling",
    "4": "Billing_Query",
    "5": "Causal_Query",
    "6": "Record_Retrieval",
    "7": "Urgency_Assessment"
  },
  "label2id": {
    "Symptom_Reporting": 0,
    "Advice_Seeking": 1,
    "Medication_Query": 2,
    "Appointment_Scheduling": 3,
    "Billing_Query": 4,
    "Causal_Query": 5,
    "Record_Retrieval": 6,
    "Urgency_Assessment": 7
  },
  "num_labels": 8,
  "problem_type": "single_label_classification",
  "transformers_version": "4.36.0"
}
```
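A quick sanity check when editing a config like the one above is that `id2label` and `label2id` remain exact inverses and agree with `num_labels` (note that `id2label` keys are strings in the JSON, as Transformers serializes them). A small standalone sketch reusing the mapping from the config (the helper name is ours, not a Transformers API):

```python
import json

# The label mappings from the config.json above.
config = json.loads("""
{
  "id2label": {"0": "Symptom_Reporting", "1": "Advice_Seeking",
               "2": "Medication_Query", "3": "Appointment_Scheduling",
               "4": "Billing_Query", "5": "Causal_Query",
               "6": "Record_Retrieval", "7": "Urgency_Assessment"},
  "label2id": {"Symptom_Reporting": 0, "Advice_Seeking": 1,
               "Medication_Query": 2, "Appointment_Scheduling": 3,
               "Billing_Query": 4, "Causal_Query": 5,
               "Record_Retrieval": 6, "Urgency_Assessment": 7},
  "num_labels": 8
}
""")

def labels_consistent(cfg: dict) -> bool:
    """True if id2label and label2id are inverses and match num_labels."""
    id2label = {int(k): v for k, v in cfg["id2label"].items()}  # JSON keys are strings
    label2id = cfg["label2id"]
    return (
        len(id2label) == cfg["num_labels"]
        and all(label2id.get(label) == idx for idx, label in id2label.items())
    )

print(labels_consistent(config))  # True
```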