metadata
library_name: transformers
tags:
  - music
license: mit
language:
  - en
base_model:
  - MIT/ast-finetuned-audioset-10-10-0.4593
pipeline_tag: audio-classification

AST Audio Classification Model (Messy Mashup)

Introduction

This model is a fine-tuned Audio Spectrogram Transformer (AST) designed for audio classification tasks on the Messy Mashup dataset. It leverages pretrained audio representations and adapts them to classify audio inputs into predefined categories.

Model Description

  • Developed by: Rudransh Mathur
  • Institution: Indian Institute of Technology, Madras
  • Model type: Transformer-based Audio Classification Model
  • Base model: MIT/ast-finetuned-audioset-10-10-0.4593 (AST fine-tuned on AudioSet)
  • Framework: Transformers (Hugging Face) + PyTorch
  • License: MIT

This model builds upon the pretrained AST architecture and is fine-tuned for improved performance on domain-specific audio data.

Model Sources

Intended Use

  • Audio classification tasks
  • Music/audio tagging
  • Experimental research in audio transformers

Training Details

  • Dataset: Messy Mashup Audio Dataset
    • Genres: blues, classical, country, disco, hiphop, jazz, metal, pop, reggae, rock
    • Stems: drums, vocals, bass, other
  • Epochs: 10
  • Optimizer: AdamW
  • Loss Function: Cross-Entropy Loss
  • Scheduler: Cosine scheduler with warmup steps
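
The scheduler entry above can be illustrated with a minimal sketch of a cosine learning-rate schedule with linear warmup. The base learning rate, warmup length, and total step count are not stated in this card, so the values below are hypothetical placeholders, and `cosine_lr_with_warmup` is an illustrative helper rather than the training code used for this model.

```python
import math

def cosine_lr_with_warmup(step, total_steps, warmup_steps, base_lr):
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: progress runs from 0.0 (end of warmup) to 1.0 (last step)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))

# Hypothetical settings, chosen only to show the shape of the curve
TOTAL_STEPS, WARMUP_STEPS, BASE_LR = 1000, 100, 1e-4
for s in (0, 50, 100, 550, 1000):
    print(s, cosine_lr_with_warmup(s, TOTAL_STEPS, WARMUP_STEPS, BASE_LR))
```

In a PyTorch training loop, the same shape is usually obtained by pairing `torch.optim.AdamW` with `transformers.get_cosine_schedule_with_warmup`; whether this model was trained that way is an assumption.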

Preprocessing

  • Randomly sampled stem files from the same genre and mixed them into a single song, so that training audio resembles the test dataset
  • Added 5-second noise segments from the noise dataset at random positions, 2-3 times per audio file
  • Converted audio inputs to model features with the AST feature extractor
  • Resampled audio to the sampling rate the model expects (16 kHz for the AST base checkpoint)
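
The mixing and noise steps above might look roughly like the following sketch. It is not the author's pipeline: it assumes stems and noise are already loaded as mono float arrays at 16 kHz, that noise is overlaid (added sample-wise) rather than spliced in, and that the mix is peak-normalized. The function names and the `gain` value are hypothetical.

```python
import numpy as np

def mix_stems(stems):
    """Sum stem waveforms (trimmed to the shortest) and peak-normalize the mix."""
    length = min(len(s) for s in stems)
    mix = np.sum([s[:length] for s in stems], axis=0)
    peak = np.max(np.abs(mix))
    return mix / peak if peak > 0 else mix

def add_noise_bursts(audio, noise, sr, rng, burst_seconds=5.0, gain=0.3):
    """Overlay 2 or 3 noise bursts of `burst_seconds` at random offsets."""
    out = audio.copy()
    burst_len = min(int(burst_seconds * sr), len(noise), len(out))
    n_bursts = int(rng.integers(2, 4))  # upper bound is exclusive: 2 or 3
    for _ in range(n_bursts):
        # Pick a random slice of the noise clip and a random position in the song
        n_start = int(rng.integers(0, len(noise) - burst_len + 1))
        a_start = int(rng.integers(0, len(out) - burst_len + 1))
        out[a_start:a_start + burst_len] += gain * noise[n_start:n_start + burst_len]
    return out

# Synthetic stand-ins for the four stems (drums, vocals, bass, other) and a noise clip
rng = np.random.default_rng(0)
sr = 16000
stems = [rng.standard_normal(sr * 10).astype(np.float32) for _ in range(4)]
noise = rng.standard_normal(sr * 8).astype(np.float32)
augmented = add_noise_bursts(mix_stems(stems), noise, sr, rng)
```

The resulting waveform can then be passed to the AST feature extractor together with its sampling rate.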

Performance

  • Best Validation Accuracy: 0.87
  • Best Validation Loss: 0.40373
  • Best Test Accuracy: 0.92
  • Best Test Loss: 0.3458

🚀 Usage

from transformers import AutoModelForAudioClassification, AutoFeatureExtractor

# Load the fine-tuned classifier and its matching feature extractor from the Hub
model = AutoModelForAudioClassification.from_pretrained("rudranshmathur/ASTMessyMashup")
feature_extractor = AutoFeatureExtractor.from_pretrained("rudranshmathur/ASTMessyMashup")
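
Loading the model is only the first step. The sketch below shows one way to run a prediction on a waveform. It assumes a mono 1-D float array sampled at 16 kHz and that the checkpoint's `config.id2label` holds meaningful class names (otherwise generic names such as `LABEL_0` are returned). `top_k_labels` and `classify` are hypothetical helper names, not part of the Transformers API.

```python
import numpy as np

def top_k_labels(logits, id2label, k=3):
    """Turn raw logits for one clip into the k most likely (label, probability) pairs."""
    logits = np.asarray(logits, dtype=np.float64)
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    probs = exp / exp.sum()
    order = np.argsort(probs)[::-1][:k]  # indices sorted by descending probability
    return [(id2label[int(i)], float(probs[i])) for i in order]

def classify(waveform, sampling_rate=16000, repo_id="rudranshmathur/ASTMessyMashup", k=3):
    """Run the model on a mono waveform and return its top-k labels.

    Heavy imports live inside the function so top_k_labels works without them.
    """
    import torch
    from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

    feature_extractor = AutoFeatureExtractor.from_pretrained(repo_id)
    model = AutoModelForAudioClassification.from_pretrained(repo_id)
    model.eval()

    # The feature extractor converts the waveform into the spectrogram input AST expects
    inputs = feature_extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    return top_k_labels(logits.numpy(), model.config.id2label, k=k)
```

Calling `classify(waveform)` downloads the checkpoint on first use. Audio stored at a different sampling rate should be resampled to 16 kHz beforehand.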