# Chicken Vocalization Classifier
This model classifies chicken vocalizations into categories indicative of health status or environmental noise. It serves as the audio-based component of the "Chicken Health & Behavior Detection" multimodal project, supporting early detection of poultry diseases and the monitoring of farm conditions.
## Model Description
The `chicken-vocalization-classifier` is a Convolutional Neural Network (CNN) built with PyTorch, designed to process Log-Mel Spectrogram representations of audio recordings. It categorizes chicken sounds into three classes: **Healthy**, **Noise**, and **Unhealthy**. This model can help identify abnormal vocalizations (e.g., coughing, distress calls) that might signal health issues, and can distinguish relevant chicken sounds from general farm noise.
## Training Data
This model was trained using the "Poultry Vocalization Signal Dataset for Early Disease Detection".
## Training Procedure

The model was implemented and trained using the PyTorch framework.
- **Model Architecture:** The model, named `ModdifiedModel`, consists of a `features` extractor and a `classifier` head.
  - **Features Extractor (`Sequential`):** Composed of three blocks, each containing a `Conv2d` layer, `BatchNorm2d`, `ReLU` activation, and `MaxPool2d`.
    - Block 1: `Conv2d(1, 32, kernel_size=3)`, `BatchNorm2d(32)`, `ReLU()`, `MaxPool2d(2)`
    - Block 2: `Conv2d(32, 64, kernel_size=3)`, `BatchNorm2d(64)`, `ReLU()`, `MaxPool2d(2)`
    - Block 3: `Conv2d(64, 128, kernel_size=3)`, `BatchNorm2d(128)`, `ReLU()`, `MaxPool2d(2)`
  - **Classifier (`Sequential`):** Contains a `Flatten` layer, two `Linear` layers, `Dropout`, and `ReLU` activation.
    - `Linear(in_features=25088, out_features=256)` (see the shape check after this list for where `25088` comes from)
    - `Dropout(0.5)`
    - `ReLU()`
    - `Linear(in_features=256, out_features=3)` (for 3 classes)
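As a sanity check on the `in_features=25088` figure, the sketch below pushes a dummy input through an equivalent feature extractor. The layer stack is copied from the list above; the 128 × 130 input size is an assumption derived from the preprocessing constants listed next (1.5 s of audio at 22050 Hz with hop length 256 and librosa's centered framing).

```python
import torch
import torch.nn as nn

# Feature extractor as listed above; input is (batch, 1, mel_bands, frames)
features = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3), nn.BatchNorm2d(64), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(64, 128, kernel_size=3), nn.BatchNorm2d(128), nn.ReLU(), nn.MaxPool2d(2),
)

x = torch.randn(1, 1, 128, 130)  # assumed: 128 mel bands x 130 time frames
print(features(x).flatten(1).shape)  # torch.Size([1, 25088]) = 128 * 14 * 14
```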
- **Preprocessing:** Audio files are converted to Log-Mel Spectrograms using `librosa` (a preprocessing sketch follows this list):
  - `SAMPLE_RATE = 22050` Hz
  - Each clip is fixed to approximately 1.5 seconds (`WAV_SIZE = int(1.5 * SAMPLE_RATE)`)
  - `MEL_BANDS = 128`
  - `N_FFT = 2648`
  - `HOP_LENGTH = 256`
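A minimal preprocessing sketch built from these constants is shown below. The `preprocess` helper name, the pad/truncate step via `librosa.util.fix_length`, and the `power_to_db` log scaling are assumptions; the authoritative pipeline is in `CHBD_Vocalization_Analysis.ipynb`.

```python
import librosa
import numpy as np
import torch

SAMPLE_RATE = 22050
WAV_SIZE = int(1.5 * SAMPLE_RATE)  # 33075 samples (~1.5 s)
MEL_BANDS = 128
N_FFT = 2648
HOP_LENGTH = 256

def preprocess(path):
    # Load mono audio at the training sample rate
    wav, _ = librosa.load(path, sr=SAMPLE_RATE, mono=True)
    # Assumed: pad or truncate to the fixed 1.5 s window
    wav = librosa.util.fix_length(wav, size=WAV_SIZE)
    mel = librosa.feature.melspectrogram(
        y=wav, sr=SAMPLE_RATE, n_fft=N_FFT,
        hop_length=HOP_LENGTH, n_mels=MEL_BANDS,
    )
    # Assumed: convert the power spectrogram to log (dB) scale
    log_mel = librosa.power_to_db(mel, ref=np.max)
    # Add batch and channel dims: (1, 1, MEL_BANDS, frames) for the CNN
    return torch.tensor(log_mel, dtype=torch.float32).unsqueeze(0).unsqueeze(0)
```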
- **Data Splitting:** The dataset was split into training and testing sets using `train_test_split` with `test_size=0.2` and `random_state=27`.
- **Loss Function:** `nn.CrossEntropyLoss()`
- **Optimizer:** `torch.optim.Adam` with a learning rate (`lr`) of `0.001`
- **Epochs:** The model was trained for `30` epochs.
- **Batch Size:** Training was performed with a `batch_size` of `32`. (A sketch combining these settings follows this list.)
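Putting these hyperparameters together, a hedged recreation of the training setup might look like the following. `X` and `y` are placeholders for the preprocessed spectrograms and integer labels, and the label ordering is an assumption; only the split parameters, loss, optimizer, epochs, and batch size come from this card.

```python
import torch
import torch.nn as nn
from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader, TensorDataset

# Placeholders: X is an array of log-mel spectrograms with shape
# (N, 1, mel_bands, frames); y holds integer labels. The label order
# (e.g., 0=Healthy, 1=Noise, 2=Unhealthy) is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=27)

train_loader = DataLoader(
    TensorDataset(torch.as_tensor(X_train, dtype=torch.float32),
                  torch.as_tensor(y_train, dtype=torch.long)),
    batch_size=32, shuffle=True)

model = ModdifiedModel(num_classes=3)  # class definition in "How to Use" below
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

model.train()
for epoch in range(30):
    for xb, yb in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
```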
## Performance

The trained `ModdifiedModel` was evaluated on the held-out test split (the 20% portion produced by `train_test_split` above).
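A minimal accuracy computation over this split, continuing the training sketch above (with the placeholder `X_test`/`y_test` arrays), might look like:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

test_loader = DataLoader(
    TensorDataset(torch.as_tensor(X_test, dtype=torch.float32),
                  torch.as_tensor(y_test, dtype=torch.long)),
    batch_size=32)

model.eval()
correct = total = 0
with torch.no_grad():
    for xb, yb in test_loader:
        preds = model(xb).argmax(dim=1)   # predicted class per sample
        correct += (preds == yb).sum().item()
        total += yb.numel()
print(f"Test accuracy: {correct / total:.3f}")
```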
## How to Use

You can load this trained model's weights with PyTorch. For full usage examples, including audio preprocessing steps and inference, please refer to the `CHBD_Vocalization_Analysis.ipynb` notebook provided in this repository.
```python
from huggingface_hub import hf_hub_download
import torch
import torch.nn as nn

# ModdifiedModel as described in "Training Procedure" above. The canonical
# definition lives in CHBD_Vocalization_Analysis.ipynb; prefer that version
# if it differs from this reconstruction.
class ModdifiedModel(nn.Module):
    def __init__(self, num_classes=3):
        super(ModdifiedModel, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3), nn.BatchNorm2d(32), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3), nn.BatchNorm2d(64), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3), nn.BatchNorm2d(128), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features=25088, out_features=256),
            nn.Dropout(0.5),
            nn.ReLU(),
            nn.Linear(in_features=256, out_features=num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x)

# Instantiate the model
model = ModdifiedModel(num_classes=3)

# Hugging Face repository details
repo_id = "IceKhoffi/chicken-vocalization-classifier"
filename = "Chiken_CNN_Disease_Detection_Model.pth"

# Download the checkpoint and load the weights
model_path = hf_hub_download(repo_id=repo_id, filename=filename)
state_dict = torch.load(model_path, map_location='cpu')
model.load_state_dict(state_dict)

# Set model to evaluation mode
model.eval()

# The model is now loaded and ready for inference; see the notebook for
# detailed preprocessing and inference examples.
```
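For a quick end-to-end check, the hypothetical snippet below combines the `preprocess` sketch from the Training Procedure section with the loaded model. The class ordering and the example file name are assumptions.

```python
import torch

# Assumed class order; verify against the notebook's label encoding
CLASSES = ["Healthy", "Noise", "Unhealthy"]

spec = preprocess("example_chicken_clip.wav")  # (1, 1, 128, frames)
with torch.no_grad():
    probs = torch.softmax(model(spec), dim=1).squeeze(0)
print({c: round(float(p), 3) for c, p in zip(CLASSES, probs)})
```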

