Engine Knock Detection - Custom 1D CNN

This model detects engine knock from raw audio waveforms using a custom 1D Convolutional Neural Network.

Model Description

Architecture: Custom 1D CNN with four convolutional blocks followed by a two-layer classifier head
Input: Raw audio waveforms (80,000 samples @ 16 kHz = 5 seconds)
Output: Binary classification (clean vs knocking)
Framework: PyTorch
Parameters: ~5M trainable parameters

Performance Metrics

Evaluated on the held-out test set (15% of the dataset):

Metric	Score
Accuracy	0.7389
Precision	0.7660
Recall	0.7423
F1-Score	0.7539
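
As a quick consistency check on the table, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded precision and recall values; the function name `f1_from` is illustrative and not part of the repository.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values taken from the metrics table
f1 = f1_from(0.7660, 0.7423)
print(f"F1 recomputed from the table: {f1:.4f}")
```

Because the inputs are already rounded to four decimals, the recomputed value can differ from the reported F1-Score in the last digit.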

Usage

import torch
import torchaudio
from huggingface_hub import hf_hub_download

# Load model architecture (you'll need to define Custom1DCNN class)
# See model architecture in the repository
from model import Custom1DCNN

model = Custom1DCNN(num_classes=2)
model_path = hf_hub_download(repo_id="cxlrd/engine-knock-cnn1d", filename="model.pth")
model.load_state_dict(torch.load(model_path, map_location='cpu'))
model.eval()

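# Prepare audio: torchaudio.load returns a (channels, samples) tensor
waveform, sample_rate = torchaudio.load('audio.wav')
# Mix multi-channel recordings down to mono so the shape is (1, samples)
waveform = waveform.mean(dim=0, keepdim=True)
if sample_rate != 16000:
    waveform = torchaudio.transforms.Resample(sample_rate, 16000)(waveform)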

# Pad or truncate to 80000 samples
if waveform.shape[1] > 80000:
    waveform = waveform[:, :80000]
else:
    waveform = torch.nn.functional.pad(waveform, (0, 80000 - waveform.shape[1]))

# Predict (class 0 = clean, class 1 = knocking)
with torch.no_grad():
    output = model(waveform)
    prediction = torch.argmax(output, dim=1).item()
    print('Clean' if prediction == 0 else 'Knocking')
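
The snippet above reports only the winning class. Since the network ends in Linear(128→2), its output is presumably a pair of raw logits, and a softmax would turn them into class probabilities. The following is a minimal pure-Python sketch of that step under that assumption; `CLASS_NAMES` and `example_logits` are hypothetical placeholders, and real values would come from something like `output[0].tolist()`.

```python
import math

# Index 0 = clean, index 1 = knocking, matching the prediction snippet above
CLASS_NAMES = ("Clean", "Knocking")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a single clip (not real model output)
example_logits = [0.3, 1.1]
probs = softmax(example_logits)
label = CLASS_NAMES[probs.index(max(probs))]
print(label, [round(p, 3) for p in probs])
```

With the tensor still in PyTorch, `torch.softmax(output, dim=1)` performs the same computation.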

Training Details

Dataset: Custom engine sound recordings (1199 samples)
Data Split: 70% train, 15% validation, 15% test
Optimizer: Adam (lr=1e-3, weight_decay=1e-4)
Batch Size: 32
Early Stopping: Patience of 5 epochs
No Feature Extraction: Direct raw waveform input (no spectrogram conversion)
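
To make the split and early-stopping settings concrete, here is a minimal pure-Python sketch. It assumes a shuffled index split and that early stopping monitors validation loss; neither is stated above, and `split_indices`, `EarlyStopping`, the seed, and the loss values are all hypothetical.

```python
import random


def split_indices(n, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle range(n) and cut it into train/val/test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


class EarlyStopping:
    """Signal a stop after `patience` epochs without a lower validation loss."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


train_idx, val_idx, test_idx = split_indices(1199)
print(len(train_idx), len(val_idx), len(test_idx))

# Made-up validation losses: two improvements, then five epochs with none
stopper = EarlyStopping(patience=5)
stopped_at = None
for epoch, loss in enumerate([1.0, 0.9, 0.95, 0.96, 0.97, 0.98, 0.99]):
    if stopper.step(loss):
        stopped_at = epoch
        break
print("stopped at epoch", stopped_at)
```

In an actual PyTorch loop, the optimizer line would be `torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)`, with `step` called once per epoch after validation.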

Architecture Details

Conv1D(1→64, k=80, s=4) → BatchNorm → ReLU → MaxPool(4)
Conv1D(64→128, k=3) → BatchNorm → ReLU → MaxPool(4)
Conv1D(128→256, k=3) → BatchNorm → ReLU → MaxPool(4)
Conv1D(256→512, k=3) → BatchNorm → ReLU → AdaptiveAvgPool
Dropout(0.5) → Linear(512→128) → ReLU → Dropout(0.3) → Linear(128→2)
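
The layer listing can be checked by tracing the sequence length and counting parameters by hand. The sketch below does this in plain Python, assuming no padding, stride 1 for the k=3 convolutions, biases on every convolution and linear layer, MaxPool(4) with stride 4, and adaptive pooling down to a single time step. None of these defaults is confirmed by the listing.

```python
def window_out(length, kernel, stride=1):
    """Output length of an unpadded 1D convolution or pooling window."""
    return (length - kernel) // stride + 1


def conv_params(c_in, c_out, kernel):
    """Weights plus biases of a Conv1d layer."""
    return c_in * c_out * kernel + c_out


def bn_params(channels):
    """Learnable scale and shift of a BatchNorm1d layer."""
    return 2 * channels


def linear_params(n_in, n_out):
    """Weights plus biases of a Linear layer."""
    return n_in * n_out + n_out


# (in_channels, out_channels, kernel, stride, pool) per conv block;
# pool=None marks the final block, which uses adaptive average pooling instead
BLOCKS = [
    (1, 64, 80, 4, 4),
    (64, 128, 3, 1, 4),
    (128, 256, 3, 1, 4),
    (256, 512, 3, 1, None),
]

length = 80_000  # 5 seconds at 16 kHz
total = 0
lengths = []
for c_in, c_out, k, s, pool in BLOCKS:
    length = window_out(length, k, s)
    total += conv_params(c_in, c_out, k) + bn_params(c_out)
    if pool is not None:
        length = window_out(length, pool, pool)  # kernel 4, stride 4
    lengths.append(length)

# Classifier head: Linear(512→128) and Linear(128→2); dropout has no parameters
total += linear_params(512, 128) + linear_params(128, 2)
print("time steps after each block:", lengths)
print("trainable parameters:", total)
```

Under these assumptions the last convolution leaves 309 time steps before adaptive pooling, and the count comes to roughly 0.59M parameters. That is well below the ~5M quoted in the Model Description, so the released checkpoint may differ from this layer summary.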

Advantages

Fast Inference: No spectrogram conversion needed
Lightweight: Processes raw audio directly
Real-time Capable: Suitable for edge deployment

Citation

If you use this model, please cite:

@misc{engine-knock-cnn1d,
  author = {cxlrd},
  title = {Engine Knock Detection with 1D CNN},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/cxlrd/engine-knock-cnn1d}}
}
