admin
committed on
Commit
·
80c8b34
1
Parent(s):
20de02d
sync ms model
README.md
CHANGED
|
@@ -9,12 +9,18 @@ tags:
|
|
| 9 |
- art
|
| 10 |
metrics:
|
| 11 |
- accuracy
|
| 12 |
-
pipeline_tag:
|
| 13 |
library_name: https://github.com/monetjoe/Piano-Classification
|
| 14 |
---
|
| 15 |
|
| 16 |
This study, based on deep learning technology, draws inspiration from classical backbone network structures in the computer vision domain to construct an innovative 8-class piano timbre discriminator model through audio data processing. The model focuses on eight brands and types of pianos, including Kawai, Kawai Grand, YOUNG CHANG, HSINGHAI, Steinway Theatre, Steinway Grand, Pearl River, and Yamaha. By transforming audio data into Mel spectrograms and conducting supervised learning in the fine-tuning phase, the model accurately distinguishes different piano timbres and performs well in practical testing. In the training process, a large-scale annotated audio dataset is utilized, and the introduction of deep learning technology provides crucial support for improving the model's performance by progressively learning to extract key features from audio. The piano timbre discriminator model has broad potential applications in music assessment, audio engineering, and other fields, offering an advanced and reliable solution for piano timbre discrimination. This study expands new possibilities for the application of deep learning in the audio domain, providing valuable references for future research and applications in related fields.
|
| 17 |
|
| 18 |
## Maintenance
|
| 19 |
```bash
|
| 20 |
GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:ccmusic-database/pianos
|
|
@@ -23,7 +29,6 @@ cd pianos
|
|
| 23 |
|
| 24 |
## Results
|
| 25 |
A demo result of SqueezeNet fine-tuning:
|
| 26 |
-
|
| 27 |
<style>
|
| 28 |
#pianos td {
|
| 29 |
vertical-align: middle !important;
|
|
@@ -36,15 +41,15 @@ A demo result of SqueezeNet fine-tuning:
|
|
| 36 |
<table id="pianos">
|
| 37 |
<tr>
|
| 38 |
<th>Loss curve</th>
|
| 39 |
-
<td><img src="
|
| 40 |
</tr>
|
| 41 |
<tr>
|
| 42 |
<th>Training and validation accuracy</th>
|
| 43 |
-
<td><img src="
|
| 44 |
</tr>
|
| 45 |
<tr>
|
| 46 |
<th>Confusion matrix</th>
|
| 47 |
-
<td><img src="
|
| 48 |
</tr>
|
| 49 |
</table>
|
| 50 |
|
|
|
|
| 9 |
- art
|
| 10 |
metrics:
|
| 11 |
- accuracy
|
| 12 |
+
pipeline_tag: audio-classification
|
| 13 |
library_name: https://github.com/monetjoe/Piano-Classification
|
| 14 |
---
|
| 15 |
|
| 16 |
This study, based on deep learning technology, draws inspiration from classical backbone network structures in the computer vision domain to construct an innovative 8-class piano timbre discriminator model through audio data processing. The model focuses on eight brands and types of pianos, including Kawai, Kawai Grand, YOUNG CHANG, HSINGHAI, Steinway Theatre, Steinway Grand, Pearl River, and Yamaha. By transforming audio data into Mel spectrograms and conducting supervised learning in the fine-tuning phase, the model accurately distinguishes different piano timbres and performs well in practical testing. In the training process, a large-scale annotated audio dataset is utilized, and the introduction of deep learning technology provides crucial support for improving the model's performance by progressively learning to extract key features from audio. The piano timbre discriminator model has broad potential applications in music assessment, audio engineering, and other fields, offering an advanced and reliable solution for piano timbre discrimination. This study expands new possibilities for the application of deep learning in the audio domain, providing valuable references for future research and applications in related fields.
|
| 17 |
|
| 18 |
+
## Usage
|
| 19 |
+
```python
|
| 20 |
+
from modelscope import snapshot_download
|
| 21 |
+
model_dir = snapshot_download('ccmusic-database/pianos')
|
| 22 |
+
```
|
| 23 |
+
|
| 24 |
## Maintenance
|
| 25 |
```bash
|
| 26 |
GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:ccmusic-database/pianos
|
|
|
|
| 29 |
|
| 30 |
## Results
|
| 31 |
A demo result of SqueezeNet fine-tuning:
|
|
|
|
| 32 |
<style>
|
| 33 |
#pianos td {
|
| 34 |
vertical-align: middle !important;
|
|
|
|
| 41 |
<table id="pianos">
|
| 42 |
<tr>
|
| 43 |
<th>Loss curve</th>
|
| 44 |
+
<td><img src="./loss.jpg"></td>
|
| 45 |
</tr>
|
| 46 |
<tr>
|
| 47 |
<th>Training and validation accuracy</th>
|
| 48 |
+
<td><img src="./acc.jpg"></td>
|
| 49 |
</tr>
|
| 50 |
<tr>
|
| 51 |
<th>Confusion matrix</th>
|
| 52 |
+
<td><img src="./mat.jpg"></td>
|
| 53 |
</tr>
|
| 54 |
</table>
|
| 55 |
|
acc.csv
DELETED
|
@@ -1,41 +0,0 @@
|
|
| 1 |
-
tra_acc_list,val_acc_list,lr_list
|
| 2 |
-
48.13950386769805,46.7982924226254,0.001
|
| 3 |
-
58.64897305948252,57.1504802561366,0.001
|
| 4 |
-
63.230194718591626,62.4332977588047,0.001
|
| 5 |
-
68.17151240330755,67.2358591248666,0.001
|
| 6 |
-
71.53907708722326,69.05016008537886,0.001
|
| 7 |
-
73.25286743131501,69.74386339381003,0.001
|
| 8 |
-
78.9877300613497,77.96157950907151,0.001
|
| 9 |
-
79.93464923979728,78.54855923159018,0.001
|
| 10 |
-
82.78874366497733,82.07043756670224,0.001
|
| 11 |
-
84.15577487329955,82.49733191035219,0.001
|
| 12 |
-
85.96292344625233,84.04482390608324,0.001
|
| 13 |
-
84.76260336089624,82.55069370330843,0.001
|
| 14 |
-
87.34995998933049,84.95197438633937,0.001
|
| 15 |
-
88.38356895172046,86.49946638207044,0.001
|
| 16 |
-
88.2568684982662,85.80576307363927,0.001
|
| 17 |
-
90.86423046145639,89.70117395944503,0.0001
|
| 18 |
-
91.42437983462257,88.58057630736393,0.0001
|
| 19 |
-
92.06455054681248,89.2742796157951,0.0001
|
| 20 |
-
92.11789810616165,90.12806830309499,0.0001
|
| 21 |
-
91.96452387303282,89.64781216648879,0.0001
|
| 22 |
-
92.13123499599894,90.87513340448238,0.0001
|
| 23 |
-
92.71138970392104,90.18143009605123,1e-05
|
| 24 |
-
92.47132568684982,90.71504802561367,1e-05
|
| 25 |
-
92.69805281408375,90.82177161152615,1e-05
|
| 26 |
-
92.68471592424646,90.92849519743864,1e-05
|
| 27 |
-
92.16457722059216,91.03521878335113,1e-05
|
| 28 |
-
92.39797279274472,91.46211312700106,1e-05
|
| 29 |
-
92.53134169111763,90.18143009605123,1e-05
|
| 30 |
-
92.61803147506001,90.07470651013874,1e-05
|
| 31 |
-
92.61136303014138,90.18143009605123,1e-05
|
| 32 |
-
92.47132568684982,90.0213447171825,1e-05
|
| 33 |
-
92.79141104294479,89.70117395944503,1e-05
|
| 34 |
-
92.73806348359562,90.18143009605123,1.0000000000000002e-06
|
| 35 |
-
92.50466791144305,90.2881536819637,1.0000000000000002e-06
|
| 36 |
-
92.61803147506001,90.92849519743864,1.0000000000000002e-06
|
| 37 |
-
92.44465190717524,90.92849519743864,1.0000000000000002e-06
|
| 38 |
-
92.65804214457188,90.66168623265742,1.0000000000000002e-06
|
| 39 |
-
92.4979994665244,90.44823906083245,1.0000000000000002e-06
|
| 40 |
-
92.60469458522273,90.34151547491996,1.0000000000000002e-07
|
| 41 |
-
92.81808482261937,90.98185699039489,1.0000000000000002e-07
|
acc.pdf
DELETED
|
Binary file (26.7 kB)
|
|
|
loss.csv
DELETED
|
The diff for this file is too large to render.
See raw diff
|
|
|
loss.pdf
DELETED
|
Binary file (56.6 kB)
|
|
|
mat.csv
DELETED
|
@@ -1,8 +0,0 @@
|
|
| 1 |
-
3.946666666666666379e-02,5.333333333333333589e-04,3.200000000000000153e-03,0.000000000000000000e+00,1.600000000000000077e-03,0.000000000000000000e+00,0.000000000000000000e+00,1.066666666666666718e-03
|
| 2 |
-
0.000000000000000000e+00,1.946666666666666545e-01,5.333333333333333589e-04,2.666666666666666578e-03,5.333333333333333155e-03,0.000000000000000000e+00,3.200000000000000153e-03,2.666666666666666578e-03
|
| 3 |
-
3.733333333333333295e-03,2.133333333333333436e-03,2.048000000000000098e-01,3.733333333333333295e-03,5.333333333333333589e-04,0.000000000000000000e+00,5.333333333333333589e-04,2.133333333333333436e-03
|
| 4 |
-
1.600000000000000077e-03,2.133333333333333436e-03,3.200000000000000153e-03,9.493333333333332791e-02,1.066666666666666718e-03,0.000000000000000000e+00,6.933333333333333015e-03,2.133333333333333436e-03
|
| 5 |
-
1.600000000000000077e-03,2.666666666666666578e-03,0.000000000000000000e+00,5.333333333333333589e-04,7.040000000000000424e-02,1.066666666666666718e-03,2.666666666666666578e-03,2.666666666666666578e-03
|
| 6 |
-
0.000000000000000000e+00,0.000000000000000000e+00,0.000000000000000000e+00,5.333333333333333589e-04,0.000000000000000000e+00,7.946666666666667156e-02,0.000000000000000000e+00,0.000000000000000000e+00
|
| 7 |
-
0.000000000000000000e+00,1.066666666666666718e-03,1.066666666666666718e-03,4.266666666666666871e-03,1.066666666666666718e-03,0.000000000000000000e+00,1.157333333333333270e-01,5.333333333333333155e-03
|
| 8 |
-
1.066666666666666718e-03,1.600000000000000077e-03,5.333333333333333589e-04,1.600000000000000077e-03,1.600000000000000077e-03,0.000000000000000000e+00,9.066666666666667318e-03,1.098666666666666680e-01
|
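The eight deleted rows of mat.csv read as a confusion matrix whose entries are fractions of the whole test set rather than raw counts. The sketch below is one way to check that reading; it is not part of the commit. It assumes that rows are true classes and that they follow the class order printed in result.log (mat.csv itself carries no labels), and it takes the total of 1875 samples from result.log. The matrix values are copied verbatim from the rows above.

```python
# Rows of the deleted mat.csv, copied verbatim (one matrix row per line).
MAT_CSV = """\
3.946666666666666379e-02,5.333333333333333589e-04,3.200000000000000153e-03,0.000000000000000000e+00,1.600000000000000077e-03,0.000000000000000000e+00,0.000000000000000000e+00,1.066666666666666718e-03
0.000000000000000000e+00,1.946666666666666545e-01,5.333333333333333589e-04,2.666666666666666578e-03,5.333333333333333155e-03,0.000000000000000000e+00,3.200000000000000153e-03,2.666666666666666578e-03
3.733333333333333295e-03,2.133333333333333436e-03,2.048000000000000098e-01,3.733333333333333295e-03,5.333333333333333589e-04,0.000000000000000000e+00,5.333333333333333589e-04,2.133333333333333436e-03
1.600000000000000077e-03,2.133333333333333436e-03,3.200000000000000153e-03,9.493333333333332791e-02,1.066666666666666718e-03,0.000000000000000000e+00,6.933333333333333015e-03,2.133333333333333436e-03
1.600000000000000077e-03,2.666666666666666578e-03,0.000000000000000000e+00,5.333333333333333589e-04,7.040000000000000424e-02,1.066666666666666718e-03,2.666666666666666578e-03,2.666666666666666578e-03
0.000000000000000000e+00,0.000000000000000000e+00,0.000000000000000000e+00,5.333333333333333589e-04,0.000000000000000000e+00,7.946666666666667156e-02,0.000000000000000000e+00,0.000000000000000000e+00
0.000000000000000000e+00,1.066666666666666718e-03,1.066666666666666718e-03,4.266666666666666871e-03,1.066666666666666718e-03,0.000000000000000000e+00,1.157333333333333270e-01,5.333333333333333155e-03
1.066666666666666718e-03,1.600000000000000077e-03,5.333333333333333589e-04,1.600000000000000077e-03,1.600000000000000077e-03,0.000000000000000000e+00,9.066666666666667318e-03,1.098666666666666680e-01
"""

# Assumed class order, borrowed from result.log (mat.csv has no header).
CLASSES = ["PearlRiver", "YoungChang", "Steinway-T", "Hsinghai",
           "Kawai", "Steinway", "Kawai-G", "Yamaha"]
TOTAL = 1875  # total support reported in result.log

matrix = [[float(x) for x in line.split(",")]
          for line in MAT_CSV.strip().splitlines()]

# With every entry divided by the total, the trace is the overall accuracy.
accuracy = sum(matrix[i][i] for i in range(len(matrix)))

# Row sums scaled by the total give per-class sample counts (support).
support = [round(sum(row) * TOTAL) for row in matrix]

# Recall per class: diagonal entry over its row sum.
recall = [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]

for name, n, r in zip(CLASSES, support, recall):
    print(f"{name:<11} support={n:<4} recall={r:.3f}")
print(f"accuracy={accuracy:.3f}")
```

Under these assumptions the recovered supports and recalls can be compared line by line with the classification report in result.log, which is a quick way to confirm the row order.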
mat.jpg
CHANGED
|
|
mat.pdf
DELETED
|
Binary file (29.9 kB)
|
|
|
result.log
DELETED
|
@@ -1,21 +0,0 @@
|
|
| 1 |
-
precision recall f1-score support
|
| 2 |
-
|
| 3 |
-
PearlRiver 0.831 0.860 0.846 86
|
| 4 |
-
YoungChang 0.951 0.931 0.941 392
|
| 5 |
-
Steinway-T 0.960 0.941 0.950 408
|
| 6 |
-
Hsinghai 0.877 0.848 0.862 210
|
| 7 |
-
Kawai 0.863 0.863 0.863 153
|
| 8 |
-
Steinway 0.987 0.993 0.990 150
|
| 9 |
-
Kawai-G 0.838 0.900 0.868 241
|
| 10 |
-
Yamaha 0.873 0.877 0.875 235
|
| 11 |
-
|
| 12 |
-
accuracy 0.909 1875
|
| 13 |
-
macro avg 0.897 0.902 0.899 1875
|
| 14 |
-
weighted avg 0.910 0.909 0.910 1875
|
| 15 |
-
|
| 16 |
-
Backbone : squeezenet1_1
|
| 17 |
-
Start time : 2024-01-08 21:56:02
|
| 18 |
-
Finish time : 2024-01-08 23:20:54
|
| 19 |
-
Time cost : 5092s
|
| 20 |
-
Full finetune: True
|
| 21 |
-
Focal loss : True
|
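The report in result.log lists both a macro average and a weighted average. As a hedged illustration (not code from the repository), the sketch below recomputes both from the per-class rows: the macro average is a plain mean over the eight classes, while the weighted average scales each class by its support. The `REPORT` dictionary and the helper names are made up for this example; the numbers are copied from the deleted log, so results recomputed from these rounded values may differ from the log in the last digit.

```python
# (precision, recall, f1-score, support) per class, copied from result.log.
REPORT = {
    "PearlRiver": (0.831, 0.860, 0.846, 86),
    "YoungChang": (0.951, 0.931, 0.941, 392),
    "Steinway-T": (0.960, 0.941, 0.950, 408),
    "Hsinghai":   (0.877, 0.848, 0.862, 210),
    "Kawai":      (0.863, 0.863, 0.863, 153),
    "Steinway":   (0.987, 0.993, 0.990, 150),
    "Kawai-G":    (0.838, 0.900, 0.868, 241),
    "Yamaha":     (0.873, 0.877, 0.875, 235),
}
METRICS = ("precision", "recall", "f1-score")

total_support = sum(row[3] for row in REPORT.values())


def macro_avg(col: int) -> float:
    """Unweighted mean of one metric column over all classes."""
    return sum(row[col] for row in REPORT.values()) / len(REPORT)


def weighted_avg(col: int) -> float:
    """Mean of one metric column, weighting each class by its support."""
    return sum(row[col] * row[3] for row in REPORT.values()) / total_support


for col, metric in enumerate(METRICS):
    print(f"{metric:<10} macro={macro_avg(col):.3f} "
          f"weighted={weighted_avg(col):.3f}")
print(f"total support={total_support}")
```

The support-weighted recall is the same quantity as overall accuracy, which is why the two agree in the log up to rounding.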
save.pt
DELETED
|
@@ -1,3 +0,0 @@
|
|
| 1 |
-
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:c909b4adf93fc850ac5284e62ea23c64a4fa558ac4242e88b70cfae2be166163
|
| 3 |
-
size 3319671
|