yangwang825/mert-base

Audio Classification

feature-extraction

yangwang825 committed on Aug 6, 2023

Commit dbde56d · 1 Parent(s): cd6fd2a

Create README.md

Files changed (1)

README.md +36 -0

README.md ADDED

@@ -0,0 +1,36 @@

+# MERT
+## Usage
+```python
+import numpy as np
+from transformers import AutoFeatureExtractor, AutoModelForAudioClassification
+model_id = 'yangwang825/mert-base'
+batch_size = 4
+num_classes = 10
+max_duration = 1.0
+feature_extractor = AutoFeatureExtractor.from_pretrained(
+    model_id,
+    trust_remote_code=True
+)
+mert = AutoModelForAudioClassification.from_pretrained(
+    model_id,
+    num_labels=num_classes,
+    ignore_mismatched_sizes=True,
+    trust_remote_code=True
+)
+# Simulate a batch of waveforms (random noise as a stand-in for real audio)
+audio_arrays = [np.random.rand(16000) for _ in range(batch_size)]
+inputs = feature_extractor(
+    audio_arrays,  # List of waveforms as NumPy arrays
+    sampling_rate=feature_extractor.sampling_rate,
+    max_length=int(feature_extractor.sampling_rate * max_duration),
+    padding=True,
+    truncation=True,
+    return_tensors='pt'
+)
+logits = mert(**inputs).logits  # Shape: (batch_size, num_classes)
+```
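
The usage snippet stops at the raw logits of shape `(batch_size, num_classes)`. As a minimal, framework-free sketch of one way to turn such logits into a predicted class and probability per clip: the helper names `softmax` and `predict` and the example values are hypothetical and not part of this commit, and nested Python lists stand in for the tensor (for instance, what `.tolist()` would give on it).

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    peak = max(row)  # subtract the max so exp() cannot overflow
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits_rows):
    """Return a (class_index, probability) pair for each row of logits."""
    results = []
    for row in logits_rows:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((best, probs[best]))
    return results


# Hypothetical logits for a batch of 2 clips and 3 classes (illustrative values only).
example_logits = [[0.1, 2.0, -1.0], [1.5, 0.2, 0.3]]
predictions = predict(example_logits)
```

Because the snippet loads the model with a fresh `num_labels` and `ignore_mismatched_sizes=True`, the classification head is probably newly initialized, so predictions like these would only become meaningful after fine-tuning.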