SpeechGuard KWS: BC-ResNet-8 Keyword Spotter

Part of the SpeechGuard AI system submitted to Samsung EnnovateX AX Hackathon 2026.

Model Description

A BC-ResNet-8 keyword spotter trained on Google Speech Commands v2 with noise augmentation. It uses a PCEN (Per-Channel Energy Normalization) frontend for robustness to background noise.
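To make the PCEN step concrete, here is a minimal NumPy sketch of the standard PCEN formula applied to a matrix of filterbank energies. It is not taken from the SpeechGuard code: the function name `pcen` and the parameter values (alpha, delta, r, s, eps) are assumptions based on commonly cited defaults, and the trained model may use different or learned values.

```python
import numpy as np

def pcen(energy, alpha=0.98, delta=2.0, r=0.5, s=0.025, eps=1e-6):
    """Apply PCEN to a (n_frames, n_channels) array of non-negative energies.

    Parameter defaults are illustrative assumptions, not this model's settings.
    """
    energy = np.asarray(energy, dtype=np.float64)
    smoothed = np.empty_like(energy)
    # First-order IIR smoother per channel, initialised with the first frame.
    smoothed[0] = energy[0]
    for t in range(1, energy.shape[0]):
        smoothed[t] = (1.0 - s) * smoothed[t - 1] + s * energy[t]
    # Adaptive gain control: divide each frame by its smoothed energy.
    gain_normalised = energy / (eps + smoothed) ** alpha
    # Root compression with offset delta, shifted so zero energy maps to zero.
    return (gain_normalised + delta) ** r - delta ** r

# Stand-in input: 100 frames of 80 mel channels filled with random energies.
rng = np.random.default_rng(0)
mel_energy = rng.random((100, 80))
features = pcen(mel_energy)
```

Because each channel is divided by a running estimate of its own energy, a slowly varying background level is largely normalised away, which is the usual motivation for choosing PCEN over log-mel compression in noisy conditions.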

Performance

Metric                     Value
TA Clean                   99.0%
TA Noisy (-5 to +30 dB)    98.5%
Parameters                 2,444
Latency (CPU)              1.1 ms
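The card does not say how the CPU latency was measured. One common approach, sketched below under that caveat, is to time repeated forward passes after a warm-up and report the median. The helper name `measure_latency_ms` and the warm-up and run counts are hypothetical.

```python
import statistics
import time

def measure_latency_ms(fn, warmup=10, runs=100):
    """Return the median wall-clock time of fn() in milliseconds."""
    # Warm-up calls let caches and lazy initialisation settle before timing.
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)

# Stand-in workload; with the real model this would be a forward pass, e.g.
#   with torch.no_grad():
#       latency = measure_latency_ms(lambda: model(x))
# where x is an input tensor of the shape the model expects (not given here).
latency = measure_latency_ms(lambda: sum(range(1000)))
print(f"median latency: {latency:.3f} ms")
```

The median is less sensitive than the mean to occasional scheduler stalls, which matters when the quantity being measured is around a millisecond.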

Usage

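import torch
from huggingface_hub import hf_hub_download

# BCResNet8 lives in the SpeechGuard source tree, which must be importable
from speechguard.kws.bc_resnet import BCResNet8

# Download the checkpoint from the Hugging Face Hub
ckpt_path = hf_hub_download(
    repo_id="MADHAV-SAMDANI/speechguard-kws",
    filename="best_kws.pt",
)

# Load the checkpoint. weights_only=False unpickles arbitrary Python objects,
# so use it only with checkpoints you trust.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)

# Rebuild the model with one output per stored class label, then restore weights
model = BCResNet8(num_classes=len(ckpt["classes"]), n_mels=80)
model.load_state_dict(ckpt["model_state"])
model.eval()  # inference mode: dropout off, batch norm uses stored statistics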
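Once the model is loaded, its output has to be mapped back to a keyword. The sketch below assumes the model returns one logit per class for a clip and that `ckpt["classes"]` is a list of label strings in the same order as the model outputs. Neither is stated explicitly on this card, and `decode_prediction` is a hypothetical helper.

```python
import math

def decode_prediction(logits, classes):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(classes):
        raise ValueError("expected one logit per class")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    best = max(range(len(logits)), key=lambda i: logits[i])
    return classes[best], exps[best] / total

# Stand-in values; with the real model these would be something like
#   logits = model(features)[0].tolist() and classes = ckpt["classes"]
label, prob = decode_prediction([0.1, 2.3, -1.0], ["yes", "no", "up"])
print(label, round(prob, 3))
```

Returning the probability alongside the label makes it easy to reject low-confidence detections with a threshold.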

Training

  • Dataset: Google Speech Commands v2 (2000 samples/class)
  • Epochs: 35
  • Optimizer: AdamW with cosine LR annealing
  • Noise augmentation: ESC-50 + synthetic (white, pink, babble)
  • Hardware: MacBook Air CPU (~70 minutes)
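The noise augmentation listed above can be illustrated with a short NumPy sketch that mixes a noise clip into a speech clip at a chosen signal-to-noise ratio. This is an assumption about the general technique, not the project's actual pipeline: the helper `mix_at_snr`, the uniform sampling of the SNR, and the synthetic signals are all hypothetical. The -5 to +30 dB range is borrowed from the noisy evaluation condition in the Performance table.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the mixture has the requested SNR in dB."""
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Repeat the noise if it is shorter than the speech, then crop to length.
    if len(noise) < len(speech):
        noise = np.tile(noise, int(np.ceil(len(speech) / len(noise))))
    noise = noise[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0 or noise_power == 0:
        return speech.copy()  # nothing meaningful to scale
    # Choose scale so speech_power / (scale^2 * noise_power) = 10^(snr_db / 10).
    scale = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + scale * noise

rng = np.random.default_rng(0)
# Stand-ins for a 1 s, 16 kHz speech clip and a white-noise clip.
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.standard_normal(16000)
snr_db = rng.uniform(-5.0, 30.0)  # sample a target SNR from the assumed range
noisy = mix_at_snr(speech, noise, snr_db)
```

Pink or babble noise, or an ESC-50 recording, would take the place of `noise` without changing the scaling logic.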

Citation

Samsung EnnovateX AX Hackathon 2026, Problem #04
Team: Placecomm Prophets (IIT Kharagpur)
