Commit d016ed1 by tphakala · verified · 1 parent: 5cd8232

Add BirdNET v2.4 ONNX variants (int8_arm, fp32), labels, model card, license, checksums

Files changed (6)
  1. BirdNET_v2.4_fp32.onnx +3 -0
  2. BirdNET_v2.4_int8_arm.onnx +3 -0
  3. LICENSE +25 -0
  4. README.md +105 -0
  5. SHA256SUMS +3 -0
  6. labels.txt +0 -0
BirdNET_v2.4_fp32.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f49b686838f62fc8c5bcb4364cd514c64882f6fd666c204aa5d4eb80a7795264
+ size 62269581
BirdNET_v2.4_int8_arm.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6a2323fe8f99c2af4bb325cf8f11b52748a86aa1f2c9e706c1ae7f907289521a
+ size 46886045
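Note: the two `.onnx` entries above are Git LFS pointer files (three `key value` lines: `version`, `oid`, `size`), not the model binaries themselves. As a minimal sketch, a pointer can be parsed and a downloaded file's size compared against it as below. The helper names `parse_lfs_pointer` and `size_matches` are hypothetical and not part of this commit; only the pointer text is taken from the diff.

```python
import os

# Pointer text copied from the diff above (BirdNET_v2.4_int8_arm.onnx).
POINTER_INT8_ARM = """\
version https://git-lfs.github.com/spec/v1
oid sha256:6a2323fe8f99c2af4bb325cf8f11b52748a86aa1f2c9e706c1ae7f907289521a
size 46886045
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def size_matches(path, pointer):
    """Cheap first check: does the file on disk have the size the pointer records?"""
    return os.path.getsize(path) == pointer["size"]
```

A size mismatch usually means the pointer file was fetched instead of the real binary (a pointer is only about 130 bytes). The `oid` digests here are the same values listed in `SHA256SUMS`, so a full hash check can reuse that file.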
LICENSE ADDED
@@ -0,0 +1,25 @@
+ BirdNET v2.4 (GLOBAL 6K) - License
+ ==================================
+
+ These model files are derived from BirdNET v2.4, developed by the K. Lisa Yang Center
+ for Conservation Bioacoustics (Cornell Lab of Ornithology) and Chemnitz University of
+ Technology.
+
+ Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0
+ International License (CC BY-NC-SA 4.0).
+
+ Full license text: https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode
+ Summary: https://creativecommons.org/licenses/by-nc-sa/4.0/
+
+ You are free to share and adapt the material under these terms:
+
+ - Attribution - You must give appropriate credit to BirdNET and indicate if changes
+ were made. Required credit, displayed wherever the model is used:
+
+ Powered by BirdNET (https://birdnet.cornell.edu/)
+
+ - NonCommercial - You may not use the material for commercial purposes.
+ - ShareAlike - If you remix, transform, or build upon the material, you must
+ distribute your contributions under the same license.
+
+ Attribution to BirdNET is a hard requirement of this license and must not be removed.
README.md ADDED
@@ -0,0 +1,105 @@
+ ---
+ license: cc-by-nc-sa-4.0
+ tags:
+ - audio
+ - audio-classification
+ - bioacoustics
+ - birds
+ - birdnet
+ - onnx
+ library_name: onnx
+ pipeline_tag: audio-classification
+ ---
+
+ # BirdNET v2.4 (GLOBAL 6K) - ONNX variants
+
+ ONNX builds of the **BirdNET GLOBAL 6K V2.4** bird sound classifier, optimized for
+ edge deployment in [BirdNET-Go](https://github.com/tphakala/birdnet-go). This repo holds
+ the precision/backend variants; the stock upstream TFLite model is unchanged and not
+ re-hosted here.
+
+ > **Powered by BirdNET (https://birdnet.cornell.edu/)**
+ >
+ > BirdNET is developed by the K. Lisa Yang Center for Conservation Bioacoustics at the
+ > Cornell Lab of Ornithology and Chemnitz University of Technology. These ONNX files are
+ > derived from the upstream BirdNET v2.4 model. Attribution to BirdNET is a hard license
+ > requirement: do not strip it.
+
+ ## Model summary
+
+ - **Classes:** 6,522 species (scientific + common name, see `labels.txt`)
+ - **Sample rate:** 48 kHz
+ - **Clip length:** 3 s (raw PCM waveform)
+ - **Input tensor:** `input`, `float32`, shape `[batch, 144000]` (3 s x 48 kHz)
+ - **Output tensor:** `output`, `float32`, shape `[batch, 6522]` (per-class logits; apply
+ sigmoid for confidence scores in `[0, 1]`)
+
+ The two variants share an identical input/output interface, so they are drop-in
+ replacements for one another.
+
+ ## Variants
+
+ | File | Precision | Size | Backend / target | Notes |
+ | --- | --- | --- | --- | --- |
+ | `BirdNET_v2.4_int8_arm.onnx` | INT8 (MatMul-only) + FP32 conv | ~47 MB | ONNX Runtime on ARM / low-RAM CPU | Dynamic INT8 applied only to the 1024x6522 classification head; the CNN backbone stays FP32. ~98% top-1 agreement vs FP32. The recommended low-RAM CPU build. |
+ | `BirdNET_v2.4_fp32.onnx` | FP32 | ~62 MB | OpenVINO (and full-precision reference) | Canonical full-precision master. Under OpenVINO it runs at f16 or f32 via `INFERENCE_PRECISION_HINT`. |
+
+ ### Precision notes
+
+ - **CPU / ARM:** use `int8_arm`. Full all-ops INT8 (ConvInteger) is *not* shipped: it
+ breaks accuracy (~34% top-1) and has no fast ARM kernel. Only MatMul-only quantization
+ of the head is accuracy-safe.
+ - **OpenVINO:** use `fp32`. The empty `INFERENCE_PRECISION_HINT` resolves to f16 on
+ fp16-capable hardware (A76 NEON, AVX512-FP16) and to f32 elsewhere. **Force
+ `INFERENCE_PRECISION_HINT=FP32` on GPU**, where f16 miscompiles.
+ - f16 is intentionally not provided as a separate file: OpenVINO derives it from the FP32
+ master via the precision hint, and on CPU f16 uses *more* RAM than fp32 (the runtime
+ up-converts f16 weights to f32 at load).
+
+ > Note: this is the **bird classifier**. The BirdNET v2.4 backbone is also used as an
+ > embedding extractor for bat detection; that embedding model lives separately at
+ > [`tphakala/BattyBirdNET-onnx`](https://huggingface.co/tphakala/BattyBirdNET-onnx) and
+ > must stay FP32 (its raw embedding output overflows at f16).
+
+ ## Labels
+
+ `labels.txt` has 6,522 lines, one per class, in BirdNET order. Format is
+ `Scientific name_Common name`, for example:
+
+ ```
+ Abroscopus albogularis_Rufous-faced Warbler
+ ```
+
+ Output index `i` corresponds to line `i` of `labels.txt`.
+
+ ## Usage (ONNX Runtime, Python)
+
+ ```python
+ import numpy as np, onnxruntime as ort
+
+ sess = ort.InferenceSession("BirdNET_v2.4_int8_arm.onnx")
+
+ # 3 s of 48 kHz mono PCM as float32, shape [1, 144000]
+ audio = np.zeros((1, 144000), dtype=np.float32)
+
+ logits = sess.run(["output"], {"input": audio})[0] # [1, 6522]
+ conf = 1.0 / (1.0 + np.exp(-logits)) # sigmoid -> [0, 1]
+ labels = open("labels.txt").read().splitlines()
+ top = conf[0].argmax()
+ print(labels[top], float(conf[0, top]))
+ ```
+
+ ## Checksums
+
+ See `SHA256SUMS`.
+
+ ## License
+
+ BirdNET v2.4 is distributed under **CC BY-NC-SA 4.0** (non-commercial, share-alike,
+ attribution required). See `LICENSE` and keep the BirdNET attribution above with any use
+ or redistribution.
+
+ ## Source
+
+ - Upstream: [birdnet-team/BirdNET-Analyzer](https://github.com/birdnet-team/BirdNET-Analyzer)
+ - ONNX conversion + quantization recipes: [tphakala/birdnet-go](https://github.com/tphakala/birdnet-go)
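Note: the README above states that the output row holds per-class logits, that a sigmoid maps them to confidences in `[0, 1]`, and that output index `i` corresponds to line `i` of `labels.txt` in `Scientific name_Common name` form. The sketch below shows one way to turn a single output row into the top few named detections using only the standard library. The helpers `sigmoid`, `split_label`, and `top_k`, and the `min_conf` threshold, are hypothetical names for illustration; they are not part of this commit or of BirdNET-Go.

```python
import heapq
import math


def sigmoid(x):
    """Numerically stable logistic function for one logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # x < 0, so this cannot overflow
    return e / (1.0 + e)


def split_label(line):
    """Split 'Scientific name_Common name' on the first underscore."""
    scientific, _, common = line.partition("_")
    return scientific, common


def top_k(logits, labels, k=3, min_conf=0.0):
    """Return up to k (scientific, common, confidence) tuples, best first.

    `logits` is one output row; index i must line up with labels[i].
    """
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    # Sigmoid is monotonic, so ranking by logit equals ranking by confidence.
    best = heapq.nlargest(k, range(len(logits)), key=lambda i: logits[i])
    results = []
    for i in best:
        conf = sigmoid(float(logits[i]))
        if conf >= min_conf:
            scientific, common = split_label(labels[i])
            results.append((scientific, common, conf))
    return results
```

With the README's usage snippet, this would be called as `top_k(logits[0], labels)`, where `logits[0]` is the first row of the `[1, 6522]` output. Unlike the one-line `1.0 / (1.0 + np.exp(-logits))`, the branch on the sign of `x` avoids overflow for large negative logits.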
SHA256SUMS ADDED
@@ -0,0 +1,3 @@
+ 6a2323fe8f99c2af4bb325cf8f11b52748a86aa1f2c9e706c1ae7f907289521a BirdNET_v2.4_int8_arm.onnx
+ f49b686838f62fc8c5bcb4364cd514c64882f6fd666c204aa5d4eb80a7795264 BirdNET_v2.4_fp32.onnx
+ 487937b6ad132b8506215523209a87b86adc9dce8e5ed3048ce9268189dddd3d labels.txt
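Note: each `SHA256SUMS` line pairs a hex digest with a file name, in the layout that `sha256sum -c` reads. Where that tool is unavailable, the same check can be sketched in Python as below. The function names `sha256_of` and `verify_sums` are hypothetical, and the sketch assumes the listed files sit in the same directory as the sums file unless `base_dir` says otherwise.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files are not loaded into RAM at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sums(sums_file, base_dir=None):
    """Return {file name: True/False}; a missing file counts as False."""
    sums_path = Path(sums_file)
    base = Path(base_dir) if base_dir is not None else sums_path.parent
    results = {}
    for line in sums_path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, name = line.split(None, 1)
        # sha256sum marks binary mode with a leading '*' on the name.
        name = name.strip().lstrip("*")
        target = base / name
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results
```

For example, `verify_sums("SHA256SUMS")` run in a checkout of this repo would report one boolean per listed file; any `False` indicates a missing, truncated, or altered download (or an unresolved Git LFS pointer).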
labels.txt ADDED
The diff for this file is too large to render. See raw diff