How to use TheVortexProject/insectnet with Scikit-learn:

```python
from huggingface_hub import hf_hub_download
import joblib

# only load pickle files from sources you trust
# read more about it here https://skops.readthedocs.io/en/stable/persistence.html
model = joblib.load(
    hf_hub_download("TheVortexProject/insectnet", "sklearn_model.joblib")
)
```
Architecture
How InsectNet integrates with BirdNET-Pi and why it's designed this way.
BirdNET-Pi Model
BirdNET-Pi uses a socket-based client-server architecture for audio analysis:
arecord (15s WAV → StreamData/)
└─ birdnet_analysis.sh (shell loop)
   └─ analyze.py (socket client on port 5050)
      └─ BirdNET-Lite server loads WAV, runs TFLite, returns CSV
         ├─ detection: WAV → Extracted/By_Date/{species}/
         └─ no detection: WAV deleted
Key design patterns InsectNet mirrors:
- Binary WAV lifecycle: every WAV is processed once. Keep or delete, no middle state.
- Detection-only persistence: non-detections produce zero artifacts.
- Shell-based orchestration: each service is an independent systemd unit.
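The keep-or-delete lifecycle described above can be sketched in a few lines. This is a minimal illustration, not BirdNET-Pi's or InsectNet's actual code: the function name `finalize_wav` and its parameters are hypothetical, and `label=None` stands in for "no detection".

```python
import shutil
from pathlib import Path
from typing import Optional


def finalize_wav(wav: Path, label: Optional[str], keep_root: Path) -> Optional[Path]:
    """Keep the WAV under keep_root/<label>/ on a detection, otherwise delete it.

    Hypothetical helper: `label` is None when nothing was detected.
    """
    if label is None:
        # No detection: the WAV is removed and nothing else is written.
        wav.unlink()
        return None
    dest_dir = keep_root / label
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Detection: move (not copy) so the file ends up in exactly one place.
    return Path(shutil.move(str(wav), str(dest_dir / wav.name)))
```

Every call ends in one of two states, a file under `keep_root` or no file at all, which is the "no middle state" property.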
InsectNet's Role
InsectNet is a read-only sidecar. It never modifies BirdNET-Pi's files: it watches StreamData/ via inotify and copies each new WAV to its own directory before BirdNET-Pi deletes it.
StreamData/ (new WAV)
│
├── BirdNET-Lite (port 5050) → CSV → keep/delete
│
└── InsectNet inotify → copy WAV → librosa → TFLite → logits → sklearn → keep/delete
      │
      captures/{class}/{ts}_{cls}_{conf}.wav
      detections.jsonl (append)
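The copy-then-persist half of this flow might look like the sketch below. It covers only the file handling (the inotify watch and the model inference are left out), and several details are assumptions rather than InsectNet's actual behavior: the helper names, the two-decimal confidence format, the timestamp string, and the JSONL field names.

```python
import json
import shutil
from pathlib import Path


def snapshot_wav(src: Path, work_dir: Path) -> Path:
    """Copy a new StreamData WAV into the sidecar's own directory.

    The source file is only read, never moved or modified.
    """
    work_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, work_dir / src.name))


def record_capture(work_wav: Path, cls: str, conf: float, ts: str, root: Path) -> Path:
    """Move a kept copy to captures/{class}/{ts}_{cls}_{conf}.wav and log it."""
    # Assumption: confidence is rendered with two decimals in the file name.
    dest = root / "captures" / cls / f"{ts}_{cls}_{conf:.2f}.wav"
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(work_wav), str(dest))
    # Append-only log: one JSON object per line (field names are hypothetical).
    record = {"ts": ts, "class": cls, "conf": conf, "file": str(dest)}
    with open(root / "detections.jsonl", "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return dest
```

Because the sidecar works on its own copy, its keep/delete decision cannot interfere with the one BirdNET-Pi makes on the original.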
Why BirdNET Logits
InsectNet classifiers train on BirdNET's 6,522-dim logit space, not raw audio. This is possible because BirdNET v2.4 has 31 Orthoptera species in its label set: field crickets, tree crickets, conehead katydids, ground crickets, and meadow katydids. The logit space already encodes insect acoustic structure.
Cicadas are absent from BirdNET's labels, but their acoustic features still produce distinguishable patterns in the logit space (confirmed by field validation with cosine similarity against training centroids).
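For illustration, a cosine-similarity check of a logit vector against per-class training centroids could be written as follows. This is a sketch under assumptions: the function name is hypothetical, and how the centroids were computed for the field validation is not specified here (a per-class mean of training logits is one plausible choice).

```python
import numpy as np


def cosine_to_centroids(logits: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Cosine similarity between one logit vector (D,) and each centroid row (K, D)."""
    eps = 1e-12  # guards against division by zero for all-zero vectors
    v = logits / (np.linalg.norm(logits) + eps)
    c = centroids / (np.linalg.norm(centroids, axis=1, keepdims=True) + eps)
    return c @ v  # shape (K,), values in [-1, 1]


# Toy 3-dim example; real vectors would be 6,522-dim BirdNET logits.
centroids = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
sims = cosine_to_centroids(np.array([2.0, 0.0, 0.0]), centroids)
```

A clip whose logits point in the same direction as a class centroid scores close to 1 for that class regardless of the vector's magnitude.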
Classifier Architecture
All production InsectNet classifiers use:
StandardScaler → OneVsRest(LogisticRegression(C=0.1, class_weight='balanced'))
- StandardScaler normalizes the 6,522-dim logit vectors
- OneVsRest trains one binary classifier per class (sigmoid output)
- LogisticRegression with L2 regularization (C=0.1), balanced class weights
This is the same architecture BirdNET uses internally, minus the softmax: a sigmoid per class allows multi-label predictions (one clip can be both "cicada_drone" and "frog").
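In scikit-learn terms, the pipeline above might be assembled as in this sketch. The data is synthetic and 16-dimensional purely to keep the example fast; `max_iter=1000` and the helper name `build_classifier` are assumptions that are not part of the stated configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_classifier():
    """StandardScaler followed by one binary logistic regression per class."""
    return make_pipeline(
        StandardScaler(),
        OneVsRestClassifier(
            # L2 is LogisticRegression's default penalty; max_iter is an assumption.
            LogisticRegression(C=0.1, class_weight="balanced", max_iter=1000)
        ),
    )


rng = np.random.default_rng(0)
X = rng.normal(size=(60, 16))  # stand-in for (n_clips, 6522) logit vectors
# Two independent binary labels per clip, given as an indicator matrix.
Y = np.column_stack([X[:, 0] > 0, X[:, 1] > 0]).astype(int)

clf = build_classifier().fit(X, Y)
probs = clf.predict_proba(X[:3])  # shape (3, 2), one probability per class
```

With an indicator-matrix target, each column of `probs` comes from its own binary classifier, so a row can have several high values at once and does not need to sum to 1.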
Multi-Label Training
Training data format: clips are labeled with lists of active classes, not a
single category. A clip containing overlapping frog and cricket calls is
labeled ["frog", "cricket_katydid"].
MultiLabelBinarizer converts these label lists to a binary indicator matrix.
During evaluation, decision thresholds are swept from 0.1 to 0.95 and each
class gets its own F1-optimal threshold.
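The label encoding and per-class threshold sweep can be sketched as below. The grid step of 0.05, the helper name `sweep_thresholds`, and the probability values are assumptions chosen for illustration; only the 0.1 to 0.95 range and the F1 criterion come from the description above.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.preprocessing import MultiLabelBinarizer


def sweep_thresholds(y_true: np.ndarray, y_prob: np.ndarray, grid=None) -> np.ndarray:
    """Return one F1-maximizing threshold per class (column)."""
    if grid is None:
        grid = np.arange(0.10, 0.951, 0.05)  # assumed step size
    best = []
    for k in range(y_true.shape[1]):
        scores = [
            f1_score(y_true[:, k], (y_prob[:, k] >= t).astype(int), zero_division=0)
            for t in grid
        ]
        # argmax returns the first (lowest) threshold among ties.
        best.append(float(grid[int(np.argmax(scores))]))
    return np.array(best)


labels = [["frog", "cricket_katydid"], ["frog"], ["cricket_katydid"], []]
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)  # columns follow mlb.classes_ (sorted names)

# Made-up classifier scores, one column per class in mlb.classes_ order.
y_prob = np.array([[0.9, 0.8], [0.2, 0.7], [0.6, 0.3], [0.1, 0.2]])
thresholds = sweep_thresholds(Y, y_prob)
```

At prediction time each column of the probability matrix is compared against its own entry in `thresholds` rather than a shared 0.5 cutoff.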
Background Training Data
Background clips come from two sources:
- BirdNET bird clips: every labeled bird clip is confirmed non-insect audio from the same microphone and environment.
- Public datasets (ESC-50 for environmental noise, iNatSounds for labeled insect audio).
Two-Tier System
InsectNet operates at two levels:
| Layer | Runs On | Backbone | Purpose |
|---|---|---|---|
| Sidecar | BirdNET-Pi (Pi 4) | BirdNET TFLite logits | Real-time capture, keeps WAVs |
| Archive | Workstation | Perch 2.0 embeddings | Offline enrichment, multi-taxa discovery |
The sidecar is the edge capture system. The archive (separate repo) is the analysis layer that pulls captures, embeds them with Perch 2.0, and enables multi-taxa classification. They are complementary.
BirdNET Species Coverage
BirdNET v2.4 has 6,522 species labels. Insect-relevant coverage:
| Group | In BirdNET? | Notes |
|---|---|---|
| 31 Orthoptera (crickets, katydids) | Yes | Field crickets, tree crickets, coneheads, ground crickets, meadow katydids |
| 0 Cicada species | No | Zero cicada labels; relies on general acoustic features |
| 0 Bee species | No | Zero Hymenoptera labels |
| 0 Grasshopper species | No | Though some Acrididae may trigger Orthoptera channels |
This means 31 logit channels carry insect-class information directly; the other 6,491 channels may carry incidental insect structure.
BirdNET-Pi Access
Default credentials:
- Host: 192.168.1.223
- User: birdnetpi / birdnetpi
- Python: /home/birdnetpi/BirdNET-Pi/birdnet/bin/python3
- Model: /home/birdnetpi/BirdNET-Pi/model/BirdNET_GLOBAL_6K_V2.4_Model_FP16.tflite
- StreamData: /home/birdnetpi/BirdSongs/StreamData/
The sidecar expects the TFLite model at DEFAULT_BIRDNET_MODEL and StreamData
at DEFAULT_STREAMDATA (both configurable via CLI args).
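One way such defaults with CLI overrides could be wired up is shown below. The two constant names and their values come from the text above; the flag names `--birdnet-model` and `--streamdata` are hypothetical, so check the sidecar's own `--help` output for the real ones.

```python
import argparse
from pathlib import Path

DEFAULT_BIRDNET_MODEL = Path(
    "/home/birdnetpi/BirdNET-Pi/model/BirdNET_GLOBAL_6K_V2.4_Model_FP16.tflite"
)
DEFAULT_STREAMDATA = Path("/home/birdnetpi/BirdSongs/StreamData/")


def parse_args(argv=None) -> argparse.Namespace:
    """Parse sidecar options, falling back to the BirdNET-Pi default paths."""
    parser = argparse.ArgumentParser(description="InsectNet sidecar (sketch)")
    # Flag names are hypothetical; only the defaults are taken from the text.
    parser.add_argument("--birdnet-model", type=Path, default=DEFAULT_BIRDNET_MODEL)
    parser.add_argument("--streamdata", type=Path, default=DEFAULT_STREAMDATA)
    return parser.parse_args(argv)
```

Calling `parse_args([])` yields the defaults, while `parse_args(["--streamdata", "/tmp/wavs"])` overrides only the watched directory.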