jmcmeen committed
Commit 9f0ec05 · verified · 1 Parent(s): f495b4f

Update README.md

Files changed (1)
  1. README.md +39 -3
README.md CHANGED
@@ -1,8 +1,44 @@
  ---
  license: bsd-3-clause
  datasets:
  - ashraq/esc50
- base_model:
- - MIT/ast-finetuned-audioset-10-10-0.4593
  pipeline_tag: audio-classification
- ---

  ---
  license: bsd-3-clause
+ tags:
+ - audio-classification
+ - audio
+ - environmental-sound
  datasets:
  - ashraq/esc50
  pipeline_tag: audio-classification
+ base_model: MIT/ast-finetuned-audioset-10-10-0.4593
+ ---
+
+ # AST Fine-tuned on ESC-50
+
+ An Audio Spectrogram Transformer (AST) model fine-tuned on the ESC-50 dataset for environmental sound classification.
+
+ ## Model Description
+
+ This model is based on the [Audio Spectrogram Transformer](https://arxiv.org/abs/2104.01778) architecture, fine-tuned to classify 50 categories of environmental sounds. The AST applies a pure attention mechanism to audio spectrograms, treating them as sequences of patches similar to Vision Transformers (ViT).
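
To make the "sequences of patches" description concrete, the sketch below counts how many patches an AST-style model would cut from a single spectrogram. It is an illustration only and not part of this commit. The input size (128 mel bins by 1024 frames), the 16x16 patch size, and the stride of 10 on both axes are assumptions taken from the AudioSet configuration in the AST paper, which the `10-10` in the base model name appears to refer to; none of these values are stated in this README, and a model fine-tuned on shorter clips may use a different frame count. `patch_grid` is a hypothetical helper name.

```python
def patch_grid(n_mels=128, n_frames=1024, patch=16, fstride=10, tstride=10):
    """Return (freq_patches, time_patches) for overlapping square patches.

    All defaults are assumed values from the AST paper's AudioSet setup,
    not values confirmed by this README.
    """
    if n_mels < patch or n_frames < patch:
        raise ValueError("spectrogram is smaller than one patch")
    # Standard sliding-window count without padding: floor((size - patch) / stride) + 1
    freq_patches = (n_mels - patch) // fstride + 1
    time_patches = (n_frames - patch) // tstride + 1
    return freq_patches, time_patches


f, t = patch_grid()
num_patches = f * t
print(f, t, num_patches)  # 12 101 1212
```

Under these assumptions the transformer sees a sequence of 1212 patch embeddings per clip. Setting both strides equal to the patch size (no overlap) gives `patch_grid(fstride=16, tstride=16)`, an 8 by 64 grid.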
+
+ ## Training
+
+ - **Base Model**: MIT/ast-finetuned-audioset-10-10-0.4593
+ - **Dataset**: [ESC-50](https://github.com/karolpiczak/ESC-50) (Environmental Sound Classification)
+
+ ## Labels
+
+ The model classifies audio into 50 environmental sound categories:
+
+ **Animals**: cat, chirping_birds, cow, crow, dog, frog, hen, insects, pig, rooster, sheep
+
+ **Natural Sounds**: crackling_fire, crickets, rain, sea_waves, thunderstorm, water_drops, wind
+
+ **Human Sounds**: breathing, brushing_teeth, clapping, coughing, crying_baby, drinking_sipping, footsteps, laughing, sneezing, snoring
+
+ **Domestic Sounds**: clock_alarm, clock_tick, door_wood_creaks, door_wood_knock, glass_breaking, keyboard_typing, mouse_click, toilet_flush, vacuum_cleaner, washing_machine
+
+ **Urban Sounds**: airplane, car_horn, church_bells, engine, fireworks, helicopter, siren, train
+
+ **Mechanical/Tools**: can_opening, chainsaw, hand_saw, pouring_water
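
The grouping listed in the Labels section can be held in a small lookup table, which is one way to collapse a 50-way prediction into a coarse category. The sketch below is illustrative and not part of this commit. It assumes the group names and membership exactly as this README lists them (this is the README's own six-way grouping, not the five official ESC-50 categories). `LABEL_GROUPS`, `LABEL_TO_GROUP`, and `group_of` are hypothetical names, not part of the model's API.

```python
# Groups copied from the README's Labels section (assumed to be complete).
LABEL_GROUPS = {
    "Animals": ["cat", "chirping_birds", "cow", "crow", "dog", "frog",
                "hen", "insects", "pig", "rooster", "sheep"],
    "Natural Sounds": ["crackling_fire", "crickets", "rain", "sea_waves",
                       "thunderstorm", "water_drops", "wind"],
    "Human Sounds": ["breathing", "brushing_teeth", "clapping", "coughing",
                     "crying_baby", "drinking_sipping", "footsteps",
                     "laughing", "sneezing", "snoring"],
    "Domestic Sounds": ["clock_alarm", "clock_tick", "door_wood_creaks",
                        "door_wood_knock", "glass_breaking", "keyboard_typing",
                        "mouse_click", "toilet_flush", "vacuum_cleaner",
                        "washing_machine"],
    "Urban Sounds": ["airplane", "car_horn", "church_bells", "engine",
                     "fireworks", "helicopter", "siren", "train"],
    "Mechanical/Tools": ["can_opening", "chainsaw", "hand_saw", "pouring_water"],
}

# Invert the table so each label maps to its group.
LABEL_TO_GROUP = {
    label: group for group, labels in LABEL_GROUPS.items() for label in labels
}


def group_of(label: str) -> str:
    """Return the README group for a predicted label, or raise ValueError."""
    try:
        return LABEL_TO_GROUP[label]
    except KeyError:
        raise ValueError(f"unknown label: {label!r}") from None


total = sum(len(labels) for labels in LABEL_GROUPS.values())
print(total)              # 50
print(group_of("siren"))  # Urban Sounds
```

The count check confirms that the six groups cover 50 distinct labels, matching the number of classes the README states.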
+
+ ## License
+
+ BSD-3-Clause