---
tags:
- audio-classification
- sound-event-detection
- wav2vec2
- urban-acoustics
- deep-learning
datasets:
- UrbanSoundscape_EventDetection_Metadata
license: apache-2.0
model-index:
- name: UrbanSound_EventDetection_Wav2Vec2
  results:
  - task:
      name: Audio Classification
      type: audio-classification
    metrics:
    - type: accuracy
      value: 0.945
      name: Event Detection Accuracy
    - type: f1_macro
      value: 0.938
      name: Macro F1 Score
---

# UrbanSound_EventDetection_Wav2Vec2

## 👂 Overview

**UrbanSound_EventDetection_Wav2Vec2** is a model based on the pre-trained **Wav2Vec2** architecture, fine-tuned to classify momentary and continuous sound events in urban environments. It processes raw audio waveforms and assigns each clip one of eight high-priority urban sound classes, focusing on high-impact and potentially anomalous events.

## 🧠 Model Architecture

The model uses the standard Wav2Vec2 pipeline, which operates directly on raw audio and needs no manual feature extraction (such as MFCCs).

* **Base Model:** `facebook/wav2vec2-base`
* **Feature Extractor:** A stack of 1D convolutional layers extracts local features from the raw waveform.
* **Transformer Encoder:** 12 Transformer layers capture long-range dependencies and global context within the audio clip.
* **Classification Head:** A task-specific linear layer on top of the contextualized representations predicts one of the 8 event labels.
* **Target Classes:** `Car_Horn`, `Children_Playing`, `Dog_Barking`, `Machinery_Hum`, `Siren_Emergency`, `Train_Whistle`, `Tire_Screech`, and `Glass_Shattering`.
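
To make the feature extractor step concrete, the sketch below estimates how many feature frames the convolutional stack passes to the Transformer encoder for a clip of a given length. The kernel sizes and strides are the default values of the `facebook/wav2vec2-base` configuration, and 16 kHz is that checkpoint's expected input rate. Neither is stated in this card, so treat both as assumptions. `num_feature_frames` is a hypothetical helper name.

```python
# Assumed defaults of the facebook/wav2vec2-base config (not stated in this card).
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)
SAMPLE_RATE = 16_000  # Hz, assumed input sample rate


def num_feature_frames(num_samples: int) -> int:
    """Return the number of frames produced by the unpadded 1D conv stack."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        # Output length of an unpadded 1D convolution.
        length = (length - kernel) // stride + 1
    return length


if __name__ == "__main__":
    for seconds in (1, 4):
        frames = num_feature_frames(seconds * SAMPLE_RATE)
        print(f"{seconds} s clip -> {frames} frames")
```

Under these assumptions the total stride is 320 samples, so each frame advances by roughly 20 ms of audio.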

## 🎯 Intended Use

This model is intended for smart-city, public-safety, and acoustic-monitoring systems:

1. **Acoustic Surveillance:** Real-time detection of emergency sounds (Siren, Glass Shattering, Tire Screech) for public safety alerting.
2. **Noise Pollution Monitoring:** Quantifying how often specific noise sources (Car Horn, Machinery Hum) occur in different city zones.
3. **Urban Planning:** Analyzing soundscape composition to inform zoning policy and noise mitigation strategies.

## ⚠️ Limitations

1. **Event Overlap:** The model is trained for single-label classification. If multiple sounds occur simultaneously (e.g., Siren + Dog Barking), it outputs only the single most probable event and may miss the others.
2. **Domain Shift:** Performance may degrade in environments whose background noise profiles differ significantly from those seen in training (e.g., very quiet suburbs vs. extremely loud Asian markets).
3. **Localization:** The model performs *event detection* but does not provide *sound localization* (direction of arrival, DOA), which would require specialized input features (such as ambisonic audio) and a different model head.
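
The event-overlap limitation can be illustrated with a small sketch of how a single-label head is decoded. The logits are made-up numbers, not model output, and the label order is an assumption, since the card lists the classes but not their indices. `softmax` and `top_k` are hypothetical helpers. Taking only the top entry reports `Siren_Emergency` even though `Dog_Barking` also scores highly. Inspecting the top-k probabilities is one way to surface such secondary candidates, though it does not replace multi-label training.

```python
import math

# Assumed label order; the card does not state the index mapping.
LABELS = (
    "Car_Horn", "Children_Playing", "Dog_Barking", "Machinery_Hum",
    "Siren_Emergency", "Train_Whistle", "Tire_Screech", "Glass_Shattering",
)


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=2):
    """Return the k most probable (label, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(zip(LABELS, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Illustrative logits for a clip containing both a siren and a barking dog.
    fake_logits = [-1.0, -0.5, 2.6, -1.2, 3.0, 0.1, -0.8, -1.5]
    for label, prob in top_k(fake_logits, k=2):
        print(f"{label}: {prob:.2f}")
```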

---

### MODEL 2: **MedicalChatbot_IntentClassifier_RoBERTa**

This is a RoBERTa-based model for multi-class classification of user intent in medical dialogue transcripts.
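
The model's `config.json` follows. As a quick sanity check on such a file, the sketch below parses a copy of its label fields and verifies that `id2label` and `label2id` are inverses of each other and agree with `num_labels`. `check_label_maps` is a hypothetical helper, not part of any library, and only the label-related fields are reproduced here.

```python
import json

# Label fields copied from this section's config.json; other fields omitted.
CONFIG_TEXT = """
{
  "id2label": {
    "0": "Symptom_Reporting", "1": "Advice_Seeking",
    "2": "Medication_Query", "3": "Appointment_Scheduling",
    "4": "Billing_Query", "5": "Causal_Query",
    "6": "Record_Retrieval", "7": "Urgency_Assessment"
  },
  "label2id": {
    "Symptom_Reporting": 0, "Advice_Seeking": 1,
    "Medication_Query": 2, "Appointment_Scheduling": 3,
    "Billing_Query": 4, "Causal_Query": 5,
    "Record_Retrieval": 6, "Urgency_Assessment": 7
  },
  "num_labels": 8
}
"""


def check_label_maps(config: dict) -> list:
    """Return a list of problems in the label maps (empty if consistent)."""
    problems = []
    # JSON object keys are strings, so convert ids back to integers.
    id2label = {int(k): v for k, v in config["id2label"].items()}
    label2id = config["label2id"]
    if len(id2label) != config["num_labels"]:
        problems.append("id2label size differs from num_labels")
    if sorted(id2label) != list(range(len(id2label))):
        problems.append("ids are not contiguous from 0")
    for idx, label in id2label.items():
        if label2id.get(label) != idx:
            problems.append(f"label2id[{label!r}] does not map back to {idx}")
    if set(label2id) != set(id2label.values()):
        problems.append("label2id and id2label cover different labels")
    return problems


if __name__ == "__main__":
    config = json.loads(CONFIG_TEXT)
    print(check_label_maps(config) or "label maps are consistent")
```

An empty result means the two maps and `num_labels` agree. Any mismatch is reported as a readable message rather than raised, so the check can run over several configs in one pass.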

#### config.json

```json
{
  "_name_or_path": "roberta-base",
  "architectures": [
    "RobertaForSequenceClassification"
  ],
  "hidden_size": 768,
  "model_type": "roberta",
  "num_hidden_layers": 12,
  "vocab_size": 50265,
  "id2label": {
    "0": "Symptom_Reporting",
    "1": "Advice_Seeking",
    "2": "Medication_Query",
    "3": "Appointment_Scheduling",
    "4": "Billing_Query",
    "5": "Causal_Query",
    "6": "Record_Retrieval",
    "7": "Urgency_Assessment"
  },
  "label2id": {
    "Symptom_Reporting": 0,
    "Advice_Seeking": 1,
    "Medication_Query": 2,
    "Appointment_Scheduling": 3,
    "Billing_Query": 4,
    "Causal_Query": 5,
    "Record_Retrieval": 6,
    "Urgency_Assessment": 7
  },
  "num_labels": 8,
  "problem_type": "single_label_classification",
  "transformers_version": "4.36.0"
}