Instructions to use TheVortexProject/insectnet with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Scikit-learn
How to use TheVortexProject/insectnet with Scikit-learn:
from huggingface_hub import hf_hub_download import joblib model = joblib.load( hf_hub_download("TheVortexProject/insectnet", "sklearn_model.joblib") ) # only load pickle files from sources you trust # read more about it here https://skops.readthedocs.io/en/stable/persistence.html - Notebooks
- Google Colab
- Kaggle
| # Architecture | |
| How InsectNet integrates with BirdNET-Pi and why it's designed this way. | |
| ## BirdNET-Pi Model | |
| BirdNET-Pi uses a **socket-based client-server architecture** for audio analysis: | |
| ``` | |
| arecord (15s WAV β StreamData/) | |
| ββ birdnet_analysis.sh (shell loop) | |
| ββ analyze.py (socket client on port 5050) | |
| ββ BirdNET-Lite server loads WAV, runs TFLite, returns CSV | |
| ββ detection: WAV β Extracted/By_Date/{species}/ | |
| ββ no detection: WAV deleted | |
| ``` | |
| Key design patterns InsectNet mirrors: | |
| - **Binary WAV lifecycle** β every WAV is processed once. Keep or delete, no middle state. | |
| - **Detection-only persistence** β non-detections produce zero artifacts. | |
| - **Shell-based orchestration** β each service is an independent systemd unit. | |
| ## InsectNet's Role | |
| InsectNet is a **read-only sidecar**. It never touches BirdNET-Pi's files β it | |
| reads StreamData/ via inotify and copies WAVs to its own directory before | |
| BirdNET-Pi deletes them. | |
| ``` | |
| StreamData/ (new WAV) | |
| β | |
| ββββ BirdNET-Lite (port 5050) β CSV β keep/delete | |
| β | |
| ββββ InsectNet inotify β copy WAV β librosa β TFLite β logits β sklearn β keep/delete | |
| β | |
| captures/{class}/{ts}_{cls}_{conf}.wav | |
| detections.jsonl (append) | |
| ``` | |
| ## Why BirdNET Logits | |
| InsectNet classifiers train on BirdNET's **6,522-dim logit space**, not raw | |
| audio. This is possible because BirdNET v2.4 has 31 Orthoptera species in its | |
| label set β field crickets, tree crickets, conehead katydids, ground crickets, | |
| and meadow katydids. The logit space already encodes insect acoustic structure. | |
| Cicadas are absent from BirdNET's labels, but their acoustic features still | |
| produce distinguishable patterns in the logit space (confirmed by field | |
| validation with cosine similarity against training centroids). | |
| ## Classifier Architecture | |
| All production InsectNet classifiers use: | |
| ``` | |
| StandardScaler β OneVsRest(LogisticRegression(C=0.1, class_weight='balanced')) | |
| ``` | |
| - **StandardScaler** normalizes the 6,522-dim logit vectors | |
| - **OneVsRest** trains one binary classifier per class (sigmoid output) | |
| - **LogisticRegression** with L2 regularization (C=0.1), balanced class weights | |
| This is the same architecture BirdNET uses internally without the softmax β | |
| sigmoid-per-class allows multi-label predictions (one clip can be both | |
| "cicada_drone" and "frog"). | |
| ## Multi-Label Training | |
| Training data format: clips are labeled with lists of active classes, not a | |
| single category. A clip containing overlapping frog and cricket calls is | |
| labeled `["frog", "cricket_katydid"]`. | |
| `MultiLabelBinarizer` converts to an indicator matrix. Per-class | |
| F1-optimized thresholds are swept 0.1-0.95 during evaluation. Each class gets | |
| its own decision threshold. | |
| ## Background Training Data | |
| Background clips come from two sources: | |
| 1. **BirdNET bird clips** β every labeled bird clip is confirmed non-insect | |
| audio from the same microphone and environment. | |
| 2. **Public datasets** (ESC-50 for environmental noise, iNatSounds for | |
| labeled insect audio). | |
| ## Two-Tier System | |
| InsectNet operates at two levels: | |
| | Layer | Runs On | Backbone | Purpose | | |
| |-------|---------|----------|---------| | |
| | **Sidecar** | BirdNET-Pi (Pi 4) | BirdNET TFLite logits | Real-time capture, keeps WAVs | | |
| | **Archive** | Workstation | Perch 2.0 embeddings | Offline enrichment, multi-taxa discovery | | |
| The sidecar is the edge capture system. The archive (separate repo) is the | |
| analysis layer that pulls captures, embeds them with Perch 2.0, and enables | |
| multi-taxa classification. They are complementary. | |
| ## BirdNET Species Coverage | |
| BirdNET v2.4 has 6,522 species labels. Insect-relevant coverage: | |
| | Group | In BirdNET? | Notes | | |
| |-------|-------------|-------| | |
| | 31 Orthoptera (crickets, katydids) | β | Field crickets, tree crickets, coneheads, ground crickets, meadow katydids | | |
| | 0 Cicada species | β | Zero cicada labels β relies on general acoustic features | | |
| | 0 Bee species | β | Zero Hymenoptera labels | | |
| | 0 Grasshopper species | β | Though some Acrididae may trigger Orthoptera channels | | |
| This means 31 logit channels carry insect-class information directly; the | |
| other 6,491 channels may carry incidental insect structure. | |
| ## BirdNET-Pi Access | |
| Default credentials: | |
| - **Host:** 192.168.1.223 | |
| - **User:** birdnetpi / birdnetpi | |
| - **Python:** `/home/birdnetpi/BirdNET-Pi/birdnet/bin/python3` | |
| - **Model:** `/home/birdnetpi/BirdNET-Pi/model/BirdNET_GLOBAL_6K_V2.4_Model_FP16.tflite` | |
| - **StreamData:** `/home/birdnetpi/BirdSongs/StreamData/` | |
| The sidecar expects the TFLite model at `DEFAULT_BIRDNET_MODEL` and StreamData | |
| at `DEFAULT_STREAMDATA` (both configurable via CLI args). | |