---
title: 🧠 VocClassifier — Infant Vocalization Classifier
emoji: 🍼
colorFrom: indigo
colorTo: yellow
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: true
license: mit
---
# 🧠 VocClassifier — Infant Vocalization Classifier
This interactive demo classifies infant vocalizations into five acoustic categories based on their spectral features.
**Classes:** 🗣️ Non-canonical • 🎶 Other • 👶 Canonical • 😢 Cry • 😂 Laugh
## 🧩 How It Works
- Upload or record an infant vocalization clip (`.wav`, ≤10 seconds).
- The system converts it into a Mel-spectrogram and classifies it using a ConvNeXt model trained with FastAI + TIMM.
- It outputs predicted probabilities for each of the five classes.
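The Mel-spectrogram step above can be sketched in plain NumPy. This is a simplified illustration, not the app's actual feature extractor (which renders a 224×224 Jet-colormap image, per the Model Details below); the function name and parameter defaults here are assumptions for the example.

```python
import numpy as np

def mel_spectrogram(wave, sr=16000, n_fft=512, hop=256, n_mels=64):
    """Simplified Mel-spectrogram: windowed STFT power mapped through a
    triangular mel filterbank, returned in dB. Shape: (n_mels, n_frames)."""
    # Frame the signal with a Hann window and take the power spectrum
    frames = np.lib.stride_tricks.sliding_window_view(wave, n_fft)[::hop]
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=1)) ** 2

    # Triangular mel filterbank (simplified construction)
    def hz_to_mel(f):
        return 2595 * np.log10(1 + f / 700)

    def mel_to_hz(m):
        return 700 * (10 ** (m / 2595) - 1)

    mel_pts = mel_to_hz(np.linspace(0, hz_to_mel(sr / 2), n_mels + 2))
    bins = np.floor((n_fft + 1) * mel_pts / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        if c > l:
            fb[i, l:c] = (np.arange(l, c) - l) / (c - l)   # rising slope
        if r > c:
            fb[i, c:r] = (r - np.arange(c, r)) / (r - c)   # falling slope
    return 10 * np.log10(fb @ spec.T + 1e-10)  # dB scale, avoid log(0)

# Example: one second of a 440 Hz tone at 16 kHz
sr = 16000
t = np.linspace(0, 1.0, sr, endpoint=False)
mel = mel_spectrogram(np.sin(2 * np.pi * 440 * t), sr=sr)
print(mel.shape)  # (n_mels, n_frames)
```

In the real pipeline this time–frequency matrix is rendered as a color image and fed to the ConvNeXt classifier like any other picture.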
The model was fine-tuned on Mel-spectrograms derived from the BabbleCor corpus (BabbleCor Dataset on OSF).
You can retrain this model on your own infant data using the open-source repository below 👇
📦 **Source code & training pipeline:**
👉 https://github.com/arunps12/VisionInfantNet
## ⚙️ Supported Classes
| Class | Description |
|---|---|
| Non-canonical | Early, immature, non-syllabic vocalizations |
| Other | Unclassified or environmental sounds |
| Canonical | Mature babbling, clear consonant–vowel patterns |
| Cry | Distress or discomfort vocalizations |
| Laugh | Playful or positive expressions |
## 🧠 Model Details
- Architecture: ConvNeXt (Tiny)
- Framework: FastAI (v2.7) + PyTorch (v2.4)
- Features: Mel-spectrogram (224×224 px, Jet colormap)
- Training corpus: BabbleCor (restricted access; BabbleCor Dataset on OSF)
- Fine-tuning: 5-way infant vocalization classification
## 🧰 How to Use
- Click **Upload** or record a short infant vocalization.
- Press **Submit**.
- View the predicted class and probability distribution.
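Before submitting, a clip's length can be checked against the demo's stated 10-second limit using only the Python standard library. This is an illustrative sketch, not code from the app; the helper name and the generated silent test clip are assumptions.

```python
import io
import wave

MAX_SECONDS = 10  # the demo's stated clip limit

def clip_duration(wav_bytes: bytes) -> float:
    """Return the duration of a PCM .wav payload in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        return wf.getnframes() / wf.getframerate()

# Build a 2-second silent mono clip at 16 kHz purely for illustration
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)        # 16-bit samples
    wf.setframerate(16000)
    wf.writeframes(b"\x00\x00" * 32000)  # 32000 frames = 2 s

dur = clip_duration(buf.getvalue())
print(dur, dur <= MAX_SECONDS)  # 2.0 True
```

A check like this keeps overly long recordings from being sent to the classifier in the first place.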
## 🧾 Credits
**Author:** Arun Singh
**Affiliation:** University of Oslo, Norway
**License:** MIT
**GitHub:** https://github.com/arunps12/VisionInfantNet
## 💬 Notes
This demo showcases how spectral features of infant sounds can be visualized and classified.
The underlying model can be fine-tuned for any infant dataset following the VisionInfantNet pipeline.
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference