library_name: transformers
license: apache-2.0
base_model: facebook/wav2vec2-base
tags:
  - generated_from_trainer
datasets:
  - marsyas/gtzan
metrics:
  - accuracy
  - precision
  - recall
  - f1
model-index:
  - name: wav2vec2-base-finetuned-gtzan-optimized
    results:
      - task:
          name: Audio Classification
          type: audio-classification
        dataset:
          name: GTZAN
          type: marsyas/gtzan
          config: default
          split: train
          args: default
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.72
          - name: Precision
            type: precision
            value: 0.7270750083250083
          - name: Recall
            type: recall
            value: 0.72
          - name: F1
            type: f1
            value: 0.7156373854245563

wav2vec2-base-finetuned-gtzan-optimized

This model is a fine-tuned version of facebook/wav2vec2-base on the GTZAN dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2450
  • Accuracy: 0.72
  • Precision: 0.7271
  • Recall: 0.72
  • F1: 0.7156
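Recall equals accuracy here and in every logged evaluation row, which is what support-weighted averaging would produce. The card does not state the averaging mode, so the following is a minimal sketch under that assumption. The function name `weighted_metrics` and the toy labels are illustrative and do not come from the training code.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    precision = recall = f1 = 0.0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / actual if actual else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = actual / n  # each class counts in proportion to its support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example with three genres (hypothetical labels, not GTZAN results).
y_true = ["blues", "blues", "jazz", "jazz", "rock", "rock"]
y_pred = ["blues", "jazz", "jazz", "jazz", "rock", "blues"]
scores = weighted_metrics(y_true, y_pred)
```

Under this weighting, recall reduces to (total correct) / (total examples), so it always coincides with accuracy, while precision and F1 can differ from it.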

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 30
  • label_smoothing_factor: 0.1
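The batch-size and scheduler settings above can be worked through with a little arithmetic. The sketch below is not the Trainer's own code: it assumes a single device (consistent with 8 × 4 = 32), a linear warmup over the first 10% of steps, and one cosine cycle, since the card does not state the number of restarts. `total_steps=1000` is a hypothetical value chosen only for illustration.

```python
import math


def effective_batch_size(per_device_batch, accumulation_steps, num_devices=1):
    """Examples contributing to each optimizer update."""
    return per_device_batch * accumulation_steps * num_devices


def lr_at_step(step, total_steps, base_lr=5e-05, warmup_ratio=0.1, num_cycles=1):
    """Linear warmup followed by cosine decay with hard restarts.

    num_cycles is an assumption; with num_cycles=1 the schedule is a
    single cosine decay to zero.
    """
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    if progress >= 1.0:
        return 0.0
    cycle_position = (num_cycles * progress) % 1.0
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_position)))


batch = effective_batch_size(8, 4)           # matches total_train_batch_size
peak = lr_at_step(100, total_steps=1000)     # end of warmup, full learning rate
halfway = lr_at_step(550, total_steps=1000)  # midpoint of the cosine decay
```

With these assumptions the learning rate climbs from 0 to 5e-05 over the first 100 steps and then decays along a cosine curve, reaching half the peak at the midpoint of the decay phase.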

Training results

| Training Loss | Epoch | Step | Accuracy | F1 | Validation Loss | Precision | Recall |
|---|---|---|---|---|---|---|---|
| 2.3003 | 1.0 | 22 | 0.18 | 0.0834 | 2.2929 | 0.0544 | 0.18 |
| 2.2917 | 2.0 | 44 | 0.1333 | 0.0597 | 2.2837 | 0.0408 | 0.1333 |
| 2.2778 | 3.0 | 66 | 0.26 | 0.2083 | 2.2567 | 0.3654 | 0.26 |
| 2.2471 | 4.0 | 88 | 0.34 | 0.2758 | 2.2149 | 0.3997 | 0.34 |
| 2.1629 | 5.0 | 110 | 0.32 | 0.2353 | 2.1427 | 0.3069 | 0.32 |
| 2.08 | 6.0 | 132 | 0.3733 | 0.2776 | 2.0558 | 0.2645 | 0.3733 |
| 2.0188 | 7.0 | 154 | 0.3867 | 0.2997 | 1.9914 | 0.3095 | 0.3867 |
| 1.9483 | 8.0 | 176 | 0.3867 | 0.3167 | 1.9420 | 0.3785 | 0.3867 |
| 1.8804 | 9.0 | 198 | 0.4467 | 0.3905 | 1.8842 | 0.4878 | 0.4467 |
| 1.8063 | 10.0 | 220 | 0.3867 | 0.2975 | 1.8867 | 0.3360 | 0.3867 |
| 1.7808 | 11.0 | 242 | 0.4133 | 0.3619 | 1.8269 | 0.4118 | 0.4133 |
| 1.7031 | 12.0 | 264 | 0.5133 | 0.4759 | 1.7784 | 0.5104 | 0.5133 |
| 1.6752 | 13.0 | 286 | 0.4933 | 0.4502 | 1.7580 | 0.5315 | 0.4933 |
| 1.6843 | 14.0 | 308 | 0.5 | 0.4609 | 1.7113 | 0.5002 | 0.5 |
| 1.6136 | 15.0 | 330 | 0.4667 | 0.4276 | 1.7132 | 0.4710 | 0.4667 |
| 1.6392 | 1.9957 | 349 | 0.4667 | 0.4112 | 1.6793 | 0.4630 | 0.4667 |
| 1.5396 | 3.0 | 524 | 0.5267 | 0.4945 | 1.5783 | 0.5407 | 0.5267 |
| 1.5981 | 4.0 | 699 | 0.5 | 0.4795 | 1.6018 | 0.5358 | 0.5 |
| 1.3127 | 5.0 | 874 | 0.56 | 0.5382 | 1.4972 | 0.5732 | 0.56 |
| 1.5041 | 6.0 | 1049 | 0.5267 | 0.5166 | 1.5921 | 0.5740 | 0.5267 |
| 1.1165 | 7.0 | 1224 | 0.5667 | 0.5296 | 1.4291 | 0.5364 | 0.5667 |
| 1.1177 | 8.0 | 1399 | 0.6267 | 0.5932 | 1.3336 | 0.6217 | 0.6267 |
| 0.8805 | 9.0 | 1574 | 0.5867 | 0.5745 | 1.3987 | 0.6336 | 0.5867 |
| 0.8566 | 10.0 | 1749 | 0.66 | 0.6565 | 1.2999 | 0.6753 | 0.66 |
| 1.0281 | 11.0 | 1924 | 0.66 | 0.6539 | 1.3834 | 0.6770 | 0.66 |
| 0.8522 | 12.0 | 2099 | 0.6933 | 0.6848 | 1.3038 | 0.7138 | 0.6933 |
| 0.8237 | 13.0 | 2274 | 0.6133 | 0.5935 | 1.4544 | 0.6358 | 0.6133 |
| 0.7483 | 14.0 | 2449 | 0.6867 | 0.6835 | 1.3505 | 0.7018 | 0.6867 |
| 0.6935 | 15.0 | 2624 | 0.68 | 0.6805 | 1.2758 | 0.6990 | 0.68 |
| 0.6927 | 16.0 | 2799 | 0.7 | 0.6918 | 1.2943 | 0.7034 | 0.7 |
| 0.5777 | 17.0 | 2974 | 0.6867 | 0.6773 | 1.3557 | 0.6959 | 0.6867 |
| 0.5445 | 18.0 | 3149 | 0.7133 | 0.7078 | 1.3008 | 0.7246 | 0.7133 |
| 0.5349 | 19.0 | 3324 | 0.6933 | 0.6921 | 1.2980 | 0.7111 | 0.6933 |
| 0.5268 | 20.0 | 3499 | 0.72 | 0.7201 | 1.2516 | 0.7325 | 0.72 |
| 0.5458 | 21.0 | 3674 | 0.7067 | 0.7011 | 1.2454 | 0.7028 | 0.7067 |
| 0.5167 | 22.0 | 3849 | 0.6933 | 0.6908 | 1.2321 | 0.7007 | 0.6933 |
| 0.5157 | 23.0 | 4024 | 0.68 | 0.6797 | 1.3093 | 0.6978 | 0.68 |
| 0.51 | 24.0 | 4199 | 0.7067 | 0.7044 | 1.2763 | 0.7198 | 0.7067 |
| 0.5109 | 25.0 | 4374 | 0.6933 | 0.6913 | 1.2671 | 0.7038 | 0.6933 |

The epoch counter restarts after step 330, and the interval between evaluations changes from 22 to 175 steps per epoch, so the log appears to combine two training runs.
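The training loss starts near 2.30 and levels off near 0.51, and both values can be related to the label_smoothing_factor of 0.1. The sketch below assumes the common form of label smoothing, in which the smoothing mass is spread uniformly over all classes, and uses the fact that GTZAN has 10 genres. The helper names are illustrative and are not taken from the training code.

```python
import math


def smoothed_target(num_classes, true_index, smoothing):
    """Target distribution: (1 - smoothing) on the true class plus a uniform share."""
    uniform_share = smoothing / num_classes
    return [
        uniform_share + (1.0 - smoothing if i == true_index else 0.0)
        for i in range(num_classes)
    ]


def cross_entropy(target, predicted):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


num_classes = 10  # GTZAN genres
smoothing = 0.1   # label_smoothing_factor from the hyperparameter list
target = smoothed_target(num_classes, true_index=0, smoothing=smoothing)

# Loss of a model that predicts every genre with equal probability: ln(10).
chance_loss = cross_entropy(target, [1.0 / num_classes] * num_classes)

# Lowest reachable loss: the prediction matches the smoothed target exactly.
floor_loss = cross_entropy(target, target)
```

Under these assumptions the chance-level loss is ln 10 ≈ 2.303 and the floor is about 0.500, so a training loss of roughly 0.51 sits close to the minimum that the smoothed objective allows rather than indicating remaining training error.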

Framework versions

  • Transformers 4.57.0.dev0
  • PyTorch 2.9.0.dev20250716+cu129
  • Datasets 4.0.0
  • Tokenizers 0.22.0