Bee Audio Classifier

7-class audio classifier for bee colony health monitoring. Trained on segmented hive recordings using MFCC-based feature extraction.

Last updated: 2026-05-04 07:31 UTC

Classes

Label              Description
active_colony      -
external_noise     -
missing_queen      -
pest_infestation   -
pre_swarm          -
queenbee_present   -
swarming           -

Model performance

File                         Description            Accuracy  F1 (weighted)
random_forest_model.pkl      Random Forest          0.9444    0.9444
svm_rbf_model.pkl            SVM (RBF)              0.8889    0.8880
xgboost_model.pkl            XGBoost (best)         0.9630    0.9630
gradient_boosting_model.pkl  Gradient Boosting      0.8519    0.8514
bee_cnn_classifier.h5        CNN (Mel Spectrogram)  -         -
best_cnn.h5                  CNN checkpoint         -         -
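
The F1 (weighted) column averages per-class F1 scores, weighting each class by its share of the test samples. A minimal pure-Python sketch of that metric is shown here. The function name and the toy labels are illustrative and are not taken from the project's evaluation code, which may well rely on scikit-learn's `f1_score(..., average="weighted")` instead.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Weight each class by how often it occurs in the true labels.
        score += (n / total) * f1
    return score


# Toy labels for illustration only (not project data).
y_true = ["swarming", "swarming", "pre_swarm", "pre_swarm"]
y_pred = ["swarming", "pre_swarm", "pre_swarm", "pre_swarm"]
print(round(weighted_f1(y_true, y_pred), 4))  # 0.7333
```

When accuracy and weighted F1 are nearly identical, as in the table above, errors are spread fairly evenly across classes relative to their support.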

label_encoder.pkl is required by all classical ML models. cnn_label_encoder.pkl is required by the CNN models.
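
The pairing between model files and encoder files can be captured in a small lookup so the wrong encoder is never loaded by accident. This is only a sketch: `encoder_file_for` is a hypothetical helper, and it assumes the file extension (`.pkl` for the classical models, `.h5` for the CNNs) is enough to tell the two families apart, as it is for the files listed above.

```python
from pathlib import Path


def encoder_file_for(model_file):
    """Return the label-encoder file name that matches a model file.

    Hypothetical helper: relies only on the file extension.
    """
    suffix = Path(model_file).suffix.lower()
    if suffix == ".h5":    # Keras CNN models
        return "cnn_label_encoder.pkl"
    if suffix == ".pkl":   # classical ML models saved with joblib
        return "label_encoder.pkl"
    raise ValueError(f"Unrecognised model file: {model_file}")


print(encoder_file_for("xgboost_model.pkl"))
print(encoder_file_for("best_cnn.h5"))
```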

Feature extraction (171 features per 5-second segment)

  • 40 MFCCs × (mean + std) = 80
  • 40 delta-MFCCs × mean = 40
  • 12 Chroma × (mean + std) = 24
  • Mel spectrogram stats (mean, std, max, min) = 4
  • Spectral centroid (mean + std) = 2
  • Spectral bandwidth (mean + std) = 2
  • Spectral rolloff (mean + std) = 2
  • Spectral contrast (7 bands) × mean = 7
  • Zero crossing rate (mean + std) = 2
  • RMS energy (mean + std) = 2
  • Tonnetz (6 dimensions) × mean = 6
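
As a sanity check, the groups listed above can be tallied and compared with the input width a loaded model expects. The sketch below copies the per-group counts from the list, which tally to 171 values per segment. The names `FEATURE_GROUPS`, `listed_feature_count` and `check_width` are illustrative, and the check assumes the model exposes scikit-learn's `n_features_in_` attribute, falling back to the tally when it does not.

```python
# Per-group counts copied from the feature list above.
FEATURE_GROUPS = {
    "mfcc_mean_std": 40 * 2,
    "delta_mfcc_mean": 40,
    "chroma_mean_std": 12 * 2,
    "mel_stats": 4,
    "spectral_centroid": 2,
    "spectral_bandwidth": 2,
    "spectral_rolloff": 2,
    "spectral_contrast_mean": 7,
    "zcr": 2,
    "rms": 2,
    "tonnetz_mean": 6,
}


def listed_feature_count(groups=FEATURE_GROUPS):
    """Total number of features implied by the listed groups."""
    return sum(groups.values())


def check_width(n_extracted, model=None):
    """Raise ValueError if the extracted width differs from what is expected.

    Uses model.n_features_in_ when available (an assumption about the
    saved estimator), otherwise the tally of the listed groups.
    """
    expected = getattr(model, "n_features_in_", None)
    if expected is None:
        expected = listed_feature_count()
    if n_extracted != expected:
        raise ValueError(f"got {n_extracted} features, expected {expected}")
    return n_extracted


print(listed_feature_count())
```

With a feature array of shape `(1, n)`, a call such as `check_width(feat.shape[1], model)` before `model.predict(feat)` turns a silent mismatch into an explicit error.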

Quick Python usage

import joblib
import librosa
import numpy as np

model = joblib.load("xgboost_model.pkl")
le    = joblib.load("label_encoder.pkl")

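def extract_features(y, sr, n_mfcc=40,
                     hop_length=512, n_fft=2048):
    """Return a (1, n_features) array of summary statistics for one segment.

    Values are returned in dict insertion order, so reordering the
    blocks below changes the column order seen by the model.
    """
    feats = {}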
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc, n_fft=n_fft, hop_length=hop_length)
    for i in range(n_mfcc):
        feats[f"mfcc_{i}_mean"] = np.mean(mfcc[i])
        feats[f"mfcc_{i}_std"]  = np.std(mfcc[i])
    delta = librosa.feature.delta(mfcc)
    for i in range(n_mfcc):
        feats[f"mfcc_delta_{i}_mean"] = np.mean(delta[i])
    chroma = librosa.feature.chroma_stft(y=y, sr=sr, n_fft=n_fft, hop_length=hop_length)
    for i in range(12):
        feats[f"chroma_{i}_mean"] = np.mean(chroma[i])
        feats[f"chroma_{i}_std"]  = np.std(chroma[i])
    mel = librosa.feature.melspectrogram(y=y, sr=sr, hop_length=hop_length)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    feats["mel_mean"] = np.mean(mel_db); feats["mel_std"]  = np.std(mel_db)
    feats["mel_max"]  = np.max(mel_db);  feats["mel_min"]  = np.min(mel_db)
    sc = librosa.feature.spectral_centroid(y=y, sr=sr, hop_length=hop_length)
    feats["spectral_centroid_mean"] = np.mean(sc); feats["spectral_centroid_std"] = np.std(sc)
    sb = librosa.feature.spectral_bandwidth(y=y, sr=sr, hop_length=hop_length)
    feats["spectral_bandwidth_mean"] = np.mean(sb); feats["spectral_bandwidth_std"] = np.std(sb)
    sr_f = librosa.feature.spectral_rolloff(y=y, sr=sr, hop_length=hop_length)
    feats["spectral_rolloff_mean"] = np.mean(sr_f); feats["spectral_rolloff_std"] = np.std(sr_f)
    contrast = librosa.feature.spectral_contrast(y=y, sr=sr, hop_length=hop_length)
    for i in range(contrast.shape[0]):
        feats[f"spectral_contrast_{i}_mean"] = np.mean(contrast[i])
    zcr = librosa.feature.zero_crossing_rate(y, hop_length=hop_length)
    feats["zcr_mean"] = np.mean(zcr); feats["zcr_std"] = np.std(zcr)
    rms = librosa.feature.rms(y=y, hop_length=hop_length)
    feats["rms_mean"] = np.mean(rms); feats["rms_std"] = np.std(rms)
    harmonic = librosa.effects.harmonic(y)
    tonnetz = librosa.feature.tonnetz(y=harmonic, sr=sr)
    for i in range(6):
        feats[f"tonnetz_{i}_mean"] = np.mean(tonnetz[i])
    return np.array(list(feats.values())).reshape(1, -1)

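# Load as mono at 22.05 kHz and classify the first 5 seconds.
y, sr = librosa.load("hive_recording.wav", sr=22050)
seg   = y[:int(5.0 * sr)]
feat  = extract_features(seg, sr)
# Map the predicted class index back to its string label.
pred  = le.inverse_transform(model.predict(feat))[0]
print(pred)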
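
The snippet above classifies only the first 5 seconds of a recording. For longer recordings, one option is to split the signal into consecutive 5-second segments, classify each, and take a majority vote. The sketch below assumes non-overlapping windows with any trailing remainder dropped; how the training data was actually segmented is not stated here, so treat the windowing and the helper names `split_segments` and `majority_label` as assumptions.

```python
from collections import Counter

import numpy as np


def split_segments(y, sr, seconds=5.0):
    """Split a 1-D signal into non-overlapping fixed-length segments.

    A trailing remainder shorter than one segment is dropped.
    """
    seg_len = int(seconds * sr)
    n_segments = len(y) // seg_len
    return [y[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


def majority_label(labels):
    """Return the most frequent label in a non-empty sequence."""
    return Counter(labels).most_common(1)[0][0]


# Synthetic 12-second signal at 22.05 kHz, for illustration only.
sr = 22050
y = np.zeros(12 * sr, dtype=np.float32)
segments = split_segments(y, sr)
print(len(segments), len(segments[0]))

# With the model, encoder, and extract_features from the usage snippet:
# labels = [le.inverse_transform(model.predict(extract_features(s, sr)))[0]
#           for s in segments]
# print(majority_label(labels))
```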