---
license: mit
datasets:
- ccmusic-database/erhu_playing_tech
language:
- en
metrics:
- accuracy
pipeline_tag: audio-classification
tags:
- music
- art
---

The Erhu Performance Technique Recognition Model is a deep-learning-based audio analysis tool that automatically distinguishes different techniques in erhu performance. By analyzing the acoustic characteristics of erhu music in depth, the model recognizes 11 basic playing techniques, including split bow, pad bow, overtone, continuous bow, glissando, big glissando, strike bow, pizzicato, throw bow, staccato bow, tremolo, and vibrato. Through time-frequency conversion, feature extraction, and pattern recognition, the model accurately categorizes the complex techniques of erhu performance, providing efficient technical support for music information retrieval, music education, and research on the art of erhu playing. The model not only enriches research in the field of music acoustics but also opens a new path for the preservation and innovation of traditional music.

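The time-frequency conversion step mentioned above can be sketched in plain NumPy (a minimal illustration only, not the model's actual front end; the FFT size and hop length here are arbitrary choices):

```python
import numpy as np

def log_spectrogram(audio, n_fft=1024, hop=256):
    """Time-frequency conversion: frame the signal, window, FFT, log-compress."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(audio) - n_fft) // hop
    frames = np.stack([audio[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, n_fft // 2 + 1)
    return np.log1p(mag).T                     # (freq_bins, n_frames)

# example: a one-second 440 Hz tone at 22.05 kHz
sr = 22050
t = np.arange(sr) / sr
spec = log_spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (513, 83)
```

The resulting 2-D array is the kind of feature map that downstream pattern-recognition stages classify.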
## Demo
<https://huggingface.co/spaces/ccmusic-database/erhu-playing-tech>

## Usage
```python
from modelscope import snapshot_download
model_dir = snapshot_download('ccmusic-database/erhu_playing_tech')
```

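`snapshot_download` returns the local directory of the downloaded snapshot. To locate the weight files inside it programmatically, a small helper like the following works (the helper and its extension list are illustrative, not part of this repo):

```python
from pathlib import Path

def find_weight_files(model_dir):
    """Return the names of typical model-weight files inside a downloaded snapshot."""
    exts = {".bin", ".pt", ".pth", ".safetensors"}
    return sorted(p.name for p in Path(model_dir).rglob("*") if p.suffix in exts)

# e.g. find_weight_files(model_dir) after snapshot_download(...)
```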
## Maintenance
```bash
GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:ccmusic-database/erhu_playing_tech
cd erhu_playing_tech
```

## Results
A demo result of Swin-S fine-tuned on Mel spectrograms:
<style>
  #pianos td {
    vertical-align: middle !important;
    text-align: center;
  }
  #pianos th {
    text-align: center;
  }
</style>
<table id="pianos">
  <tr>
    <th>Loss curve</th>
    <td><img src="./loss.jpg"></td>
  </tr>
  <tr>
    <th>Training and validation accuracy</th>
    <td><img src="./acc.jpg"></td>
  </tr>
  <tr>
    <th>Confusion matrix</th>
    <td><img src="./mat.jpg"></td>
  </tr>
</table>

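Swin-S, like other vision backbones, expects fixed-size image input, so a Mel spectrogram is typically normalized, resized, and replicated to three channels before classification. A rough NumPy sketch of that preprocessing, assuming the standard 224×224 Swin input resolution (an illustration, not the exact pipeline behind these results):

```python
import numpy as np

def spec_to_image(spec, size=224):
    """Min-max normalize a spectrogram and nearest-neighbor resize to (3, size, size)."""
    spec = (spec - spec.min()) / (spec.max() - spec.min() + 1e-8)
    rows = (np.arange(size) * spec.shape[0] / size).astype(int)
    cols = (np.arange(size) * spec.shape[1] / size).astype(int)
    img = spec[np.ix_(rows, cols)]                 # (size, size)
    return np.repeat(img[None, :, :], 3, axis=0)   # (3, size, size), values in [0, 1]

img = spec_to_image(np.random.rand(513, 83))
print(img.shape)  # (3, 224, 224)
```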
## Dataset
<https://huggingface.co/datasets/ccmusic-database/erhu_playing_tech>

## Mirror
<https://www.modelscope.cn/models/ccmusic-database/erhu_playing_tech>

## Evaluation
<https://github.com/monetjoe/ccmusic_eval>

## Cite
```bibtex
@dataset{zhaorui_liu_2021_5676893,
  author    = {Monan Zhou and Shenyang Xu and Zhaorui Liu and Zhaowen Wang and Feng Yu and Wei Li and Baoqiang Han},
  title     = {CCMusic: an Open and Diverse Database for Chinese and General Music Information Retrieval Research},
  month     = {mar},
  year      = {2024},
  publisher = {HuggingFace},
  version   = {1.2},
  url       = {https://huggingface.co/ccmusic-database}
}
```