
---
title: 🧠 VocClassifier — Infant Vocalization Classifier
emoji: 🍼
colorFrom: indigo
colorTo: yellow
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: true
license: mit
---

# 🧠 VocClassifier — Infant Vocalization Classifier

This interactive demo classifies infant vocalizations into five acoustic categories based on their spectral features.

Classes:
🗣️ Non-canonical • 🎶 Other • 👶 Canonical • 😢 Cry • 😂 Laugh


## 🧩 How It Works

- Upload or record an infant vocalization clip (.wav, ≤10 seconds).
- The system converts it into a Mel-spectrogram and classifies it with a ConvNeXt model trained using FastAI + TIMM (see the sketch after this list).
- It outputs predicted probabilities for each of the five classes.
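
For readers who want to reproduce this flow outside the UI, here is a minimal sketch of the preprocessing and inference steps. The export file name (`export.pkl`), the spectrogram parameters, and the sample paths are illustrative assumptions, not the exact code behind this Space:

```python
# Minimal sketch: audio -> Mel-spectrogram image -> FastAI prediction.
# File names and spectrogram parameters are assumptions for illustration.
import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt
from fastai.vision.all import load_learner, PILImage

def wav_to_melspec_png(wav_path: str, out_path: str = "melspec.png") -> str:
    """Render a clip (first 10 s) as a 224x224 Jet-colormap Mel-spectrogram."""
    y, sr = librosa.load(wav_path, duration=10.0)
    mel_db = librosa.power_to_db(
        librosa.feature.melspectrogram(y=y, sr=sr), ref=np.max
    )
    fig = plt.figure(figsize=(2.24, 2.24), dpi=100)  # 2.24 in x 100 dpi = 224 px
    ax = fig.add_axes([0, 0, 1, 1])                  # fill the canvas, no margins
    ax.set_axis_off()
    librosa.display.specshow(mel_db, sr=sr, cmap="jet", ax=ax)
    fig.savefig(out_path)
    plt.close(fig)
    return out_path

learner = load_learner("export.pkl")  # hypothetical exported FastAI learner
label, _, probs = learner.predict(PILImage.create(wav_to_melspec_png("clip.wav")))
print(label, {c: float(p) for c, p in zip(learner.dls.vocab, probs)})
```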

The model was fine-tuned on Mel-spectrograms derived from the BabbleCor corpus (BabbleCor Dataset on OSF).
You can retrain this model on your own infant data using the open-source repository below 👇

📦 Source code & training pipeline:
👉 https://github.com/arunps12/VisionInfantNet


## ⚙️ Supported Classes

| Class | Description |
| --- | --- |
| Non-canonical | Early, immature, non-syllabic vocalizations |
| Other | Unclassified or environmental sounds |
| Canonical | Mature babbling with clear consonant–vowel patterns |
| Cry | Distress or discomfort vocalizations |
| Laugh | Playful or positive expressions |

## 🧠 Model Details

- Architecture: ConvNeXt (Tiny)
- Framework: FastAI (v2.7) + PyTorch (v2.4)
- Features: Mel-spectrogram (224×224 px, Jet colormap)
- Training corpus: BabbleCor (restricted access; BabbleCor Dataset on OSF)
- Fine-tuning: 5-way infant vocalization classification (a training sketch follows this list)
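
As a rough illustration of this setup, the sketch below shows how a ConvNeXt-Tiny backbone from TIMM can be fine-tuned with FastAI on a folder of spectrogram images. The folder layout, epoch count, and export name are assumptions; see the VisionInfantNet repository for the actual training pipeline:

```python
# Hedged sketch of 5-way fine-tuning with FastAI + a TIMM ConvNeXt-Tiny backbone.
# Assumes "spectrograms/" holds one subfolder of PNGs per class
# (non-canonical, other, canonical, cry, laugh); all paths are hypothetical.
from fastai.vision.all import ImageDataLoaders, Resize, accuracy, vision_learner

dls = ImageDataLoaders.from_folder(
    "spectrograms/",            # hypothetical dataset root
    valid_pct=0.2,              # hold out 20% of images for validation
    item_tfms=Resize(224),      # match the 224x224 input size
)
learn = vision_learner(dls, "convnext_tiny", metrics=accuracy)  # TIMM backbone
learn.fine_tune(5)              # epoch count is an arbitrary placeholder
learn.export("export.pkl")      # produces the file loaded in the sketch above
```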

## 🧰 How to Use

  1. Click Upload or record a short infant vocalization.
  2. Press Submit.
  3. View the predicted class and probability distribution.
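
You can also query the Space programmatically with `gradio_client`. The Space ID and endpoint name below are assumptions; check the Space's "Use via API" panel for the exact values:

```python
# Hedged sketch of programmatic access via gradio_client.
# The Space ID and api_name are assumptions; verify them in the
# "Use via API" panel of the Space.
from gradio_client import Client, handle_file

client = Client("arunps/VocClassifier")  # hypothetical Space ID
result = client.predict(
    handle_file("clip.wav"),             # local path to a <=10 s WAV clip
    api_name="/predict",                 # default Gradio endpoint name
)
print(result)                            # predicted class probabilities
```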

## 🧾 Credits

- Author: Arun Singh
- Affiliation: University of Oslo, Norway
- License: MIT
- GitHub: https://github.com/arunps12/VisionInfantNet


## 💬 Notes

This demo showcases how spectral features of infant sounds can be visualized and classified.
The underlying model can be fine-tuned on any infant vocalization dataset by following the VisionInfantNet pipeline.


Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference