TheVortexProject committed · Commit 0e7b80b · verified · 1 Parent(s): 2002d61

Initial upload: 6-class BirdNET-logit classifier, model card, docs

Files changed (6)
  1. README.md +109 -0
  2. architecture.md +122 -0
  3. classifier.joblib +3 -0
  4. deploy.sh +45 -0
  5. thresholds.md +63 -0
  6. validation.md +69 -0
README.md ADDED
@@ -0,0 +1,109 @@
1
+ ---
2
+ license: cc-by-nc-sa-4.0
3
+ library_name: sklearn
4
+ tags:
5
+ - bioacoustics
6
+ - insect-classification
7
+ - birdnet
8
+ - edge-ai
9
+ - raspberry-pi
10
+ - non-commercial
11
+ datasets:
12
+ - InsectSet459
13
+ - iNatSounds
14
+ - ESC-50
15
+ ---
16
+
17
+ # InsectNet
18
+
19
+ A BirdNET-Pi sidecar that classifies insect sounds in real time.
20
+ **Research prototype — active development.**
21
+
22
+ ## What It Is
23
+
24
+ InsectNet is a lightweight sklearn head trained on BirdNET's 6,522-dim logit
25
+ space. It runs alongside BirdNET-Pi on a Raspberry Pi, watches the audio
26
+ stream, and sorts captured WAVs into acoustic classes.
27
+
28
+ The architecture is simple: StandardScaler → OneVsRest(LogisticRegression).
29
+ Nothing novel — the interesting part is that BirdNET's logit space encodes
30
+ insect acoustic structure well enough that a linear probe works for several
31
+ classes.
32
+
33
+ ## What's Validated
34
+
35
+ Field validation at Pine Hollow, Tennessee (35.8565, -83.3744):
36
+
37
+ | Class | Status | Confidence (field) | Notes |
38
+ |-------|--------|-------------------|-------|
39
+ | background | Production | N/A | 0.984 F1, 1,669 public clips + field negatives |
40
+ | cicada_drone | Working | 83-100% | Natural capture at 83%, playback at 99-100%. AC unit false positive at 92%. |
41
+ | frog | Working | 51-99% | Natural chorus confirmed. 440+ captures in one evening, two species identified. |
42
+ | cricket_katydid | Likely working | 99+% | Playback at 100%. Natural summer data pending. |
43
+ | grasshopper | Data-limited | TBD | 183 training clips, 0.701 F1. Not production-ready. |
44
+ | bee | Data-limited | TBD | 43 training clips, 0.608 F1. No real field captures. Known false positives from weed whacker and night noise. |
45
+
46
+ ## What It's Not
47
+
48
+ This is not a finished product. It's a working research prototype that has
49
+ been field-tested enough to know it catches real insects — and also catches
50
+ enough false positives to know it shouldn't be trusted blindly.
51
+
52
+ - The F1 numbers are from cross-validation on public training data, not from
53
+ field deployment. Actual performance varies with environment, mic placement,
54
+ and insect proximity.
55
+ - All threshold tuning was done over one month at a single location.
56
+ - Grasshopper and bee classes need substantially more training data before
57
+ they can be used without human review.
58
+
59
+ ## Known Limitations
60
+
61
+ - **BirdNET dependency.** The classifier requires BirdNET's TFLite model to
62
+ extract logits. Without BirdNET, the classifier can't run.
63
+ - **Mic placement.** The outdoor mic at Pine Hollow is upward-facing for birds.
64
+ Ground-level insect sounds must be loud to reach it.
65
+ - **No cicada species channels.** BirdNET has zero cicada labels. Cicada
66
+ detection relies on general acoustic features in the BirdNET embedding space.
67
+ - **False positives.** AC units → cicada_drone (92%). Weed whackers → bee
68
+ (98%). Night noise → bee (50-70%).
69
+ - **All BirdNET species IDs are approximate.** BirdNET maps to the closest
70
+ species in its 6,522-label set, which may not be the true species.
71
+
72
+ ## How to Use
73
+
74
+ The classifier isn't useful on its own: it needs BirdNET's TFLite
75
+ model to produce logits. The full capture pipeline lives on GitHub:
76
+
77
+ [https://github.com/vortexpjeff/insectnet](https://github.com/vortexpjeff/insectnet)
78
+
79
+ ```python
80
+ # `logits` is the 6,522-dim BirdNET logit vector (NumPy array) extracted beforehand:
81
+ import joblib
82
+ clf = joblib.load("classifier.joblib")
83
+ X = clf["scaler"].transform(logits.reshape(1, -1))
84
+ scores = clf["classifier"].predict_proba(X)[0]
85
+ for i, cls in enumerate(clf["classes"]):
86
+     print(f"{cls}: {scores[i]*100:.1f}%")
87
+ ```
88
+
89
+ ## Training Data
90
+
91
+ | Source | Clips | License | Content |
92
+ |--------|-------|---------|---------|
93
+ | InsectSet459 | ~1,800 | CC BY-NC-SA 4.0 | 459 insect species, primarily Orthoptera |
94
+ | iNatSounds | ~1,041 | CC BY-NC 4.0 | iNaturalist insect observations |
95
+ | ESC-50 | 1,519 | CC BY-NC 4.0 | Environmental noise (background class) |
96
+ | Pine Hollow field | 38 (unreviewed) | CC BY-NC-SA 4.0 | Natural captures from Pi sidecar |
97
+
98
+ All training data and the BirdNET backbone are non-commercial. Derivative
99
+ classifiers must use a compatible license.
100
+
101
+ ## Project Status
102
+
103
+ Actively developed. Summer 2026 is the primary field data collection window
104
+ for improving grasshopper, bee, and cricket classes. New captures are being
105
+ accumulated continuously via the BirdNET-Pi sidecar.
106
+
107
+ ## License
108
+
109
+ CC BY-NC-SA 4.0 — See LICENSE file.
architecture.md ADDED
@@ -0,0 +1,122 @@
1
+ # Architecture
2
+
3
+ How InsectNet integrates with BirdNET-Pi and why it's designed this way.
4
+
5
+ ## BirdNET-Pi Model
6
+
7
+ BirdNET-Pi uses a **socket-based client-server architecture** for audio analysis:
8
+
9
+ ```
10
+ arecord (15s WAV → StreamData/)
11
+ └→ birdnet_analysis.sh (shell loop)
12
+ └→ analyze.py (socket client on port 5050)
13
+ └→ BirdNET-Lite server loads WAV, runs TFLite, returns CSV
14
+ └→ detection: WAV → Extracted/By_Date/{species}/
15
+ └→ no detection: WAV deleted
16
+ ```
17
+
18
+ Key design patterns InsectNet mirrors:
19
+ - **Binary WAV lifecycle** — every WAV is processed once. Keep or delete, no middle state.
20
+ - **Detection-only persistence** — non-detections produce zero artifacts.
21
+ - **Shell-based orchestration** — each service is an independent systemd unit.
22
+
23
+ ## InsectNet's Role
24
+
25
+ InsectNet is a **read-only sidecar**. It never modifies BirdNET-Pi's files; it
26
+ reads StreamData/ via inotify and copies WAVs to its own directory before
27
+ BirdNET-Pi deletes them.
28
+
29
+ ```
30
+ StreamData/ (new WAV)
31
+
32
+ ├──→ BirdNET-Lite (port 5050) → CSV → keep/delete
33
+
34
+ └──→ InsectNet inotify → copy WAV → librosa → TFLite → logits → sklearn → keep/delete
35
+
36
+ captures/{class}/{ts}_{cls}_{conf}.wav
37
+ detections.jsonl (append)
38
+ ```
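
A minimal sketch of the keep/delete step at the end of the InsectNet branch in the diagram. Only the `captures/{class}/{ts}_{cls}_{conf}.wav` layout and the append-only `detections.jsonl` come from the diagram. The function names, the timestamp format, the percent-rounded confidence in the filename, the JSON field names, and the "top class must not be background" rule are assumptions for illustration, not the sidecar's actual code.

```python
import json
from datetime import datetime
from pathlib import Path


def decide(scores, threshold=0.3):
    """Return (class, confidence) to keep, or None to delete the copied WAV.

    Assumed rule: keep only if the top-scoring class is not "background"
    and its score reaches the capture threshold.
    """
    cls, conf = max(scores.items(), key=lambda kv: kv[1])
    if cls == "background" or conf < threshold:
        return None
    return cls, conf


def capture_path(root, cls, conf, ts):
    """Build captures/{class}/{ts}_{cls}_{conf}.wav (formats are assumed)."""
    stamp = ts.strftime("%Y%m%d_%H%M%S")
    return Path(root) / cls / f"{stamp}_{cls}_{round(conf * 100)}.wav"


def record_detection(log_path, wav_name, cls, conf):
    """Append one JSON object per line to detections.jsonl."""
    entry = {"wav": wav_name, "class": cls, "confidence": conf}
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
```

Because non-detections return `None` before any path is built, they leave no artifacts, which matches the detection-only persistence pattern described earlier.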
39
+
40
+ ## Why BirdNET Logits
41
+
42
+ InsectNet classifiers train on BirdNET's **6,522-dim logit space**, not raw
43
+ audio. This is possible because BirdNET v2.4 has 31 Orthoptera species in its
44
+ label set — field crickets, tree crickets, conehead katydids, ground crickets,
45
+ and meadow katydids. The logit space already encodes insect acoustic structure.
46
+
47
+ Cicadas are absent from BirdNET's labels, but their acoustic features still
48
+ produce distinguishable patterns in the logit space (confirmed by field
49
+ validation with cosine similarity against training centroids).
50
+
51
+ ## Classifier Architecture
52
+
53
+ All production InsectNet classifiers use:
54
+
55
+ ```
56
+ StandardScaler → OneVsRest(LogisticRegression(C=0.1, class_weight='balanced'))
57
+ ```
58
+
59
+ - **StandardScaler** normalizes the 6,522-dim logit vectors
60
+ - **OneVsRest** trains one binary classifier per class (sigmoid output)
61
+ - **LogisticRegression** with L2 regularization (C=0.1), balanced class weights
62
+
63
+ This is the same architecture BirdNET uses internally without the softmax —
64
+ sigmoid-per-class allows multi-label predictions (one clip can be both
65
+ "cicada_drone" and "frog").
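
The pipeline described in this section can be sketched with scikit-learn as follows. The data here is synthetic (random vectors standing in for BirdNET logits), the three-class subset and `max_iter` value are assumptions, and the dictionary layout simply mirrors the `scaler` / `classifier` / `classes` keys used in the README snippet. It is an illustration, not the project's training script.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
N_DIM = 6522                                       # BirdNET logit dimensionality
CLASSES = ["cicada_drone", "frog", "cricket_katydid"]  # subset, for illustration

# Multi-label indicator matrix: one primary class per clip, some clips with two.
Y = np.zeros((90, len(CLASSES)), dtype=int)
Y[np.arange(90), np.arange(90) % 3] = 1
Y[::9, 2] = 1

# Synthetic "logits": noise plus a class-dependent offset so there is signal.
X = rng.normal(size=(90, N_DIM)) + Y @ rng.normal(size=(len(CLASSES), N_DIM))

scaler = StandardScaler().fit(X)
clf = OneVsRestClassifier(
    LogisticRegression(C=0.1, class_weight="balanced", max_iter=1000)
)
clf.fit(scaler.transform(X), Y)

bundle = {"scaler": scaler, "classifier": clf, "classes": CLASSES}

# Per-class sigmoid scores: independent probabilities that need not sum to 1.
scores = bundle["classifier"].predict_proba(bundle["scaler"].transform(X[:1]))[0]
```

With an indicator-matrix target, `OneVsRestClassifier.predict_proba` returns one independent probability per class, which is what allows a single clip to exceed the threshold for more than one class.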
66
+
67
+ ## Multi-Label Training
68
+
69
+ Training data format: clips are labeled with lists of active classes, not a
70
+ single category. A clip containing overlapping frog and cricket calls is
71
+ labeled `["frog", "cricket_katydid"]`.
72
+
73
+ `MultiLabelBinarizer` converts to an indicator matrix. Per-class
74
+ F1-optimized thresholds are swept 0.1-0.95 during evaluation. Each class gets
75
+ its own decision threshold.
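
As an illustration of the per-class sweep, here is a dependency-free sketch. The 0.05 step size, the tie-breaking rule (lowest threshold wins), and all function names are assumptions; the source only states that thresholds are swept over 0.1-0.95 and chosen by F1. The `binarize` helper stands in for what `MultiLabelBinarizer` produces.

```python
def binarize(label_lists, classes):
    """Indicator matrix: one row per clip, one 0/1 column per class."""
    return [[int(c in labels) for c in classes] for labels in label_lists]


def f1_binary(y_true, y_pred):
    """F1 for one class; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def sweep_thresholds(y_true, scores, classes, lo=0.10, hi=0.95, step=0.05):
    """Pick, for each class, the threshold in [lo, hi] that maximizes F1."""
    n_steps = round((hi - lo) / step)
    grid = [round(lo + k * step, 2) for k in range(n_steps + 1)]
    best = {}
    for j, cls in enumerate(classes):
        truth = [row[j] for row in y_true]
        col = [row[j] for row in scores]
        # max() keeps the first maximum, so ties go to the lowest threshold.
        best[cls] = max(
            grid, key=lambda t: f1_binary(truth, [int(s >= t) for s in col])
        )
    return best


classes = ["frog", "cricket_katydid"]
labels = [["frog", "cricket_katydid"], ["frog"], [], ["cricket_katydid"]]
scores = [[0.90, 0.70], [0.55, 0.20], [0.30, 0.10], [0.40, 0.65]]
best = sweep_thresholds(binarize(labels, classes), scores, classes)
```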
76
+
77
+ ## Background Training Data
78
+
79
+ Background clips come from two sources:
80
+ 1. **BirdNET bird clips** — every labeled bird clip is confirmed non-insect
81
+ audio from the same microphone and environment.
82
+ 2. **Public datasets** (ESC-50 for environmental noise, iNatSounds for
83
+ labeled insect audio).
84
+
85
+ ## Two-Tier System
86
+
87
+ InsectNet operates at two levels:
88
+
89
+ | Layer | Runs On | Backbone | Purpose |
90
+ |-------|---------|----------|---------|
91
+ | **Sidecar** | BirdNET-Pi (Pi 4) | BirdNET TFLite logits | Real-time capture, keeps WAVs |
92
+ | **Archive** | Workstation | Perch 2.0 embeddings | Offline enrichment, multi-taxa discovery |
93
+
94
+ The sidecar is the edge capture system. The archive (separate repo) is the
95
+ analysis layer that pulls captures, embeds them with Perch 2.0, and enables
96
+ multi-taxa classification. They are complementary.
97
+
98
+ ## BirdNET Species Coverage
99
+
100
+ BirdNET v2.4 has 6,522 species labels. Insect-relevant coverage:
101
+
102
+ | Group | In BirdNET? | Notes |
103
+ |-------|-------------|-------|
104
+ | 31 Orthoptera (crickets, katydids) | ✅ | Field crickets, tree crickets, coneheads, ground crickets, meadow katydids |
105
+ | 0 Cicada species | ❌ | Zero cicada labels — relies on general acoustic features |
106
+ | 0 Bee species | ❌ | Zero Hymenoptera labels |
107
+ | 0 Grasshopper species | ❌ | Though some Acrididae may trigger Orthoptera channels |
108
+
109
+ This means 31 logit channels carry insect-class information directly; the
110
+ other 6,491 channels may carry incidental insect structure.
111
+
112
+ ## BirdNET-Pi Access
113
+
114
+ Default credentials:
115
+ - **Host:** 192.168.1.223
116
+ - **User:** birdnetpi / birdnetpi
117
+ - **Python:** `/home/birdnetpi/BirdNET-Pi/birdnet/bin/python3`
118
+ - **Model:** `/home/birdnetpi/BirdNET-Pi/model/BirdNET_GLOBAL_6K_V2.4_Model_FP16.tflite`
119
+ - **StreamData:** `/home/birdnetpi/BirdSongs/StreamData/`
120
+
121
+ The sidecar expects the TFLite model at `DEFAULT_BIRDNET_MODEL` and StreamData
122
+ at `DEFAULT_STREAMDATA` (both configurable via CLI args).
classifier.joblib ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5e6ecfc68d78a2cf2e9e9e47da5cb58d696e8de354fd620cfcccc5db9da48702
3
+ size 474892
deploy.sh ADDED
@@ -0,0 +1,45 @@
1
+ #!/usr/bin/env bash
2
+ # deploy.sh — Deploy InsectNet sidecar + classifier to a BirdNET-Pi
3
+ #
4
+ # Usage:
5
+ # ./scripts/deploy.sh # Deploy to default Pi (192.168.1.223)
6
+ # ./scripts/deploy.sh pi@192.168.1.50 # Deploy to a different BirdNET-Pi
7
+ # ./scripts/deploy.sh birdnetpi@192.168.1.223 models/3class.joblib # Deploy a different model (second positional argument)
8
+ #
9
+ # This copies:
10
+ # src/insectnet/capture.py → ~/insectnet_capture/insectnet_capture.py
11
+ # src/insectnet/birdnet.py → ~/insectnet_capture/birdnet.py
12
+ # models/*.joblib → ~/insectnet_capture/classifier.joblib
13
+
14
+ set -euo pipefail
15
+
16
+ PI_HOST="${1:-birdnetpi@192.168.1.223}"
17
+ MODEL_SRC="${2:-models/6class.joblib}"
18
+ CAPTURE_DIR="~/insectnet_capture"
19
+
20
+ SCRIPT_DIR="$(cd "$(dirname "$0")/.." && pwd)"
21
+
22
+ echo "=== Deploying InsectNet to ${PI_HOST} ==="
23
+ echo " Model: ${MODEL_SRC}"
24
+ echo " Target: ${CAPTURE_DIR}"
25
+ echo ""
26
+
27
+ # Create remote dir
28
+ ssh "${PI_HOST}" "mkdir -p ${CAPTURE_DIR}"
29
+
30
+ # Deploy capture and birdnet modules
31
+ scp "${SCRIPT_DIR}/src/insectnet/capture.py" "${PI_HOST}:${CAPTURE_DIR}/insectnet_capture.py"
32
+ scp "${SCRIPT_DIR}/src/insectnet/birdnet.py" "${PI_HOST}:${CAPTURE_DIR}/birdnet.py"
33
+ echo " ✓ capture.py and birdnet.py deployed"
34
+
35
+ # Deploy classifier
36
+ scp "${SCRIPT_DIR}/${MODEL_SRC}" "${PI_HOST}:${CAPTURE_DIR}/classifier.joblib"
37
+ echo " ✓ classifier deployed ($(basename "${MODEL_SRC}"))"
38
+
39
+ echo ""
40
+ echo "=== Deploy complete ==="
41
+ echo ""
42
+ echo "Start the sidecar on the Pi:"
43
+ echo " ssh ${PI_HOST}"
44
+ echo " cd ${CAPTURE_DIR}"
45
+ echo " python3 insectnet_capture.py --threshold 0.3 --show"
thresholds.md ADDED
@@ -0,0 +1,63 @@
1
+ # Confidence Thresholds
2
+
3
+ Per-class confidence guidance for interpreting InsectNet predictions.
4
+ **Thresholds are class-specific** — a universal cutoff produces both
5
+ false positives and false negatives.
6
+
7
+ ## Current Guidance
8
+
9
+ | Class | Recommended Threshold | Status | Notes |
10
+ |-------|----------------------|--------|-------|
11
+ | cicada_drone | 0.50 | Confirmed | Natural capture at 83%, AC false positive at 92% — use RMS to disambiguate |
12
+ | frog | 0.50 | Confirmed | Validated at 51%; chorus peaks at 80-99%. RMS >0.004 increases confidence |
13
+ | cricket_katydid | 0.50 | Tentative | Summer chorus data needed for natural threshold |
14
+ | grasshopper | 0.80 | Data-limited | Only 183 training clips; most captures likely noise |
15
+ | bee | 0.80 | Data-limited | Only 43 training clips; night detections are false positives |
16
+
17
+ ## How Thresholds Were Determined
18
+
19
+ ### Cicada (83% natural confirmed)
20
+
21
+ The first natural cicada capture scored 83% with RMS 0.009. Playback tests
22
+ hit 99-100%. However, an AC window unit also scores 92% on this class with
23
+ background <2%. The 80%+ range includes both real cicadas and false positives.
24
+
25
+ **Disambiguation:** RMS. Real cicada at 83% had RMS 0.009. AC at 92% had
26
+ RMS 0.02+. A high-confidence cicada detection with high RMS may be mechanical.
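
The disambiguation heuristic could be expressed roughly as below. The cutoffs come from the two observations above (one real cicada, one AC unit), so they are illustrative defaults rather than calibrated values; the function name and return labels are hypothetical.

```python
def cicada_flag(confidence, rms, conf_min=0.80, rms_mechanical=0.02):
    """Flag high-confidence cicada_drone detections that are suspiciously loud.

    conf_min and rms_mechanical are assumptions drawn from two data points:
    a real cicada (83%, RMS 0.009) and an AC unit (92%, RMS 0.02+).
    """
    if confidence >= conf_min and rms >= rms_mechanical:
        return "possible_mechanical"
    return "plausible_cicada"
```

A flag like this is only a hint for manual review, since it rests on a single natural capture and a single false-positive source.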
27
+
28
+ ### Frog (51% natural confirmed)
29
+
30
+ A frog detected at only 51% was confirmed real by the user. The frog chorus
31
+ ranged from 55% (early evening, quiet) to 99.97% (peak chorus, loud).
32
+
33
+ **Guidance:** Any capture above 50% frog with RMS >0.004 is worth review.
34
+ Loud captures (RMS >0.015) at 80-99% are highly likely real.
35
+
36
+ ### Bee (no natural data)
37
+
38
+ The bee class has no known real detections. All captures to date are false
39
+ positives: weed whacker at 98%, night ambient noise at 50-70%. The true bee
40
+ threshold cannot be established without natural captures.
41
+
42
+ **Guidance:** Discard bee detections at night. Treat day detections below
43
+ 80% as noise.
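
The per-class guidance in this file can be sketched as a small filter. The threshold values are those from the Current Guidance table; the definition of "night" (21:00 to 05:59 local time) and the function name are assumptions made for this example.

```python
# Recommended thresholds from the Current Guidance table.
THRESHOLDS = {
    "cicada_drone": 0.50,
    "frog": 0.50,
    "cricket_katydid": 0.50,
    "grasshopper": 0.80,
    "bee": 0.80,
}

# Assumed definition of "night" for the bee rule: 21:00-05:59 local time.
NIGHT_HOURS = set(range(21, 24)) | set(range(0, 6))


def accepted_classes(scores, hour):
    """Return the classes whose score clears the class-specific threshold."""
    keep = []
    for cls, threshold in THRESHOLDS.items():
        if scores.get(cls, 0.0) < threshold:
            continue
        if cls == "bee" and hour in NIGHT_HOURS:
            continue  # guidance: discard bee detections at night
        keep.append(cls)
    return keep
```

For example, `accepted_classes({"frog": 0.51, "bee": 0.65}, hour=22)` keeps only `frog`: the bee score is below 0.80 and the hour counts as night.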
44
+
45
+ ## Production vs Testing Threshold
46
+
47
+ | Context | Threshold | Rationale |
48
+ |---------|-----------|-----------|
49
+ | **Production capture** | 0.30 default | Favor recall — a few false positives are cheaper than missed detections |
50
+ | **Automated decision** | 0.80 minimum | Only act on high-confidence predictions without human review |
51
+ | **Research / scanning** | 0.30 | Cast a wide net; review all uncertain captures manually |
52
+
53
+ The 0.30 default is deliberately low to favor recall. It produces some uncertain
54
+ clips but the cost of missed insects is higher than the cost of reviewing
55
+ a few extra WAVs.
56
+
57
+ ## RMS Noise Floor
58
+
59
+ The sidecar skips inference for WAVs below the RMS noise floor (default
60
+ 0.002). This was calibrated from Pine Hollow ambient measurements and
61
+ corresponds to quiet background with no detectable acoustic activity.
62
+
63
+ Playback sessions may need a lower floor (0.001) for quiet phone speakers.
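
A minimal sketch of the noise-floor gate, assuming the waveform has already been decoded to floating-point samples in the range [-1, 1]. The function names are hypothetical; only the default floor of 0.002 and the lower playback floor of 0.001 come from the text.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_run_inference(samples, noise_floor=0.002):
    """Skip inference when the clip is quieter than the noise floor."""
    return rms(samples) >= noise_floor
```

A playback session would pass `noise_floor=0.001` so that quiet phone-speaker clips are not skipped.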
validation.md ADDED
@@ -0,0 +1,69 @@
1
+ # Field Validation
2
+
3
+ InsectNet field validation results from Pine Hollow, Tennessee (35.8565, -83.3744).
4
+ All detections are human-confirmed unless marked as playback.
5
+
6
+ ## Confirmed Natural Detections
7
+
8
+ | Date | Class | Conf. | Species | Method |
9
+ |------|-------|-------|---------|--------|
10
+ | May 30 | cicada_drone | 83% | Neotibicen/Megatibicen | User confirmed by ear |
11
+ | May 30 | frog | 51% | Cope's Gray Treefrog | User heard outside + WAV confirmed |
12
+ | May 30 | frog | 80-99.97% | Cope's Gray Treefrog + Eastern Narrow-mouthed Toad | User confirmed chorus, two species |
13
+ | May 31 | cricket_katydid | 99% | Field cricket | User confirmed by ear |
14
+
15
+ ## Confirmed Playback Detections
16
+
17
+ | Date | Class | Conf. | Species | Notes |
18
+ |------|-------|-------|---------|-------|
19
+ | May 29 | cicada_drone | 100% | Neotibicen lyricen | Phone playback at mic |
20
+ | May 29 | frog | 64.5% | American Toad | Phone playback, BirdNET cross-validated |
21
+ | May 29 | frog | 99.2% | Gray Treefrog (Dryophytes spp.) | Phone playback |
22
+ | May 29 | cricket_katydid | 100% | Gryllus campestris | Phone playback; BirdNET misidentified as G. fultoni |
23
+
24
+ ## Known False Positives
25
+
26
+ | Source | Class Triggered | Confidence | Notes |
27
+ |--------|----------------|------------|-------|
28
+ | AC window unit | cicada_drone | Up to 92.3% | Background <2% — very confident wrong answer |
29
+ | Weed whacker | bee | 98.1% | User confirmed |
30
+ | Night ambient noise | bee | 50-70% | Temporal filter needed — bees don't fly at night |
31
+
32
+ ## Key Validations
33
+
34
+ ### Frog Chorus (May 30)
35
+
36
+ The first sustained natural capture event. ~440 frog detections over 2.5 hours
37
+ (21:00-23:30). Two clear phases:
38
+
39
+ 1. **Early evening (17:25-18:35):** Individual frogs at 55-65% confidence,
40
+ RMS 0.003-0.008
41
+ 2. **Chorus peak (21:00-23:00):** Sustained 80-99.97% confidence,
42
+ RMS 0.01-0.08
43
+
44
+ Time-resolved BirdNET logit analysis identified two species calling
45
+ simultaneously: Cope's Gray Treefrog (dominant, throughout) and Eastern
46
+ Narrow-mouthed Toad (secondary, second half).
47
+
48
+ ### First Natural Cicada (May 30, 06:59)
49
+
50
+ After ~14 hours running unattended through overnight rain, the sidecar
51
+ captured a cicada at 83% confidence (RMS 0.009). The user confirmed it
52
+ sounded like a genuine cicada. Cosine similarity against training centroids
53
+ matched Neotibicen/Megatibicen (0.986).
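
The centroid comparison mentioned here can be sketched as follows. How the project computes its centroids is not described, so this assumes each centroid is simply a mean logit vector per taxon; the function names and the toy vectors in the usage note are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_centroid(vector, centroids):
    """Return (name, similarity) of the most similar centroid."""
    return max(
        ((name, cosine_similarity(vector, c)) for name, c in centroids.items()),
        key=lambda item: item[1],
    )
```

With toy two-dimensional centroids `{"a": [1.0, 0.0], "b": [0.0, 1.0]}`, the vector `[0.9, 0.1]` is matched to `"a"`; in practice the vectors would be 6,522-dim BirdNET logits.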
54
+
55
+ ### First Low-Confidence Frog (May 30, 20:08)
56
+
57
+ A frog detected at only 51% confidence (RMS 0.004) was confirmed real —
58
+ the user heard the frog near the mic and confirmed the WAV. This invalidated
59
+ the hypothesis that real detections always clear 80% and established
60
+ class-dependent thresholds.
61
+
62
+ ## Validation vs Testing
63
+
64
+ - **Testing** confirms the pipeline works: inotify fires, WAVs get processed,
65
+ files end up in the right places. Confidence scores are not evidence.
66
+ - **Validation** requires human listening and explicit species identification.
67
+ Only validated captures qualify as training data.
68
+
69
+ All detections listed above are validated.