---
license: cc-by-nc-sa-4.0
tags:
- audio
- audio-classification
- bioacoustics
- birds
- birdnet
- onnx
library_name: onnx
pipeline_tag: audio-classification
---

# BirdNET v2.4 (GLOBAL 6K) - ONNX variants

ONNX builds of the **BirdNET GLOBAL 6K V2.4** bird sound classifier, optimized for
edge deployment in [BirdNET-Go](https://github.com/tphakala/birdnet-go). This repo holds
the precision/backend variants; the stock upstream TFLite model is unchanged and not
re-hosted here.


> **Powered by BirdNET (https://birdnet.cornell.edu/)**
>
> BirdNET is developed by the K. Lisa Yang Center for Conservation Bioacoustics at the
> Cornell Lab of Ornithology and Chemnitz University of Technology. These ONNX files are
> derived from the upstream BirdNET v2.4 model. Attribution to BirdNET is a hard license
> requirement: do not strip it.

## Model summary

- **Classes:** 6,522 species (scientific + common name, see `labels.txt`)
- **Sample rate:** 48 kHz
- **Clip length:** 3 s (raw PCM waveform)
- **Input tensor:** `input`, `float32`, shape `[batch, 144000]` (3 s × 48,000 samples/s = 144,000 samples per clip)
- **Output tensor:** `output`, `float32`, shape `[batch, 6522]` (per-class logits; apply
  sigmoid for confidence scores in `[0, 1]`)

The two variants share an identical input/output interface, so they are drop-in
replacements for one another.

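
As an illustration of the input contract above, here is a minimal sketch of turning a longer mono recording into model-ready clips. It assumes the audio is already 48 kHz mono. The helper name `to_clips`, the non-overlapping windows, and the zero-padding of the final partial clip are choices made for this sketch, not documented BirdNET behaviour; resampling and amplitude scaling are out of scope.

```python
import numpy as np

SAMPLE_RATE = 48_000             # Hz, from the model summary
CLIP_SAMPLES = 3 * SAMPLE_RATE   # 144000 samples per 3 s clip


def to_clips(waveform) -> np.ndarray:
    """Split a mono 48 kHz waveform into a [batch, 144000] float32 array.

    Assumption of this sketch: windows do not overlap and the last
    partial clip is zero-padded to full length.
    """
    samples = np.asarray(waveform, dtype=np.float32).reshape(-1)
    n_clips = max(1, -(-samples.size // CLIP_SAMPLES))  # ceiling division
    padded = np.zeros(n_clips * CLIP_SAMPLES, dtype=np.float32)
    padded[: samples.size] = samples
    return padded.reshape(n_clips, CLIP_SAMPLES)


batch = to_clips(np.zeros(5 * SAMPLE_RATE))  # 5 s of silence
print(batch.shape, batch.dtype)              # (2, 144000) float32
```

The resulting array can be fed to the `input` tensor as a single batch, since the first dimension is the batch size.
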
## Variants

| File | Precision | Size | Backend / target | Notes |
| --- | --- | --- | --- | --- |
| `BirdNET_v2.4_int8_arm.onnx` | INT8 (MatMul-only) + FP32 conv | ~47 MB | ONNX Runtime on ARM / low-RAM CPU | Dynamic INT8 applied only to the 1024x6522 classification head; the CNN backbone stays FP32. ~98% top-1 agreement vs FP32. The recommended low-RAM CPU build. |
| `BirdNET_v2.4_fp32.onnx` | FP32 | ~62 MB | OpenVINO (and full-precision reference) | Canonical full-precision master. Under OpenVINO it runs at f16 or f32 via `INFERENCE_PRECISION_HINT`. |

### Precision notes

- **CPU / ARM:** use `int8_arm`. All-ops INT8 (ConvInteger) is *not* shipped: it
  breaks accuracy (~34% top-1) and has no fast ARM kernel. Only MatMul-only quantization
  of the head is accuracy-safe.
- **OpenVINO:** use `fp32`. An empty `INFERENCE_PRECISION_HINT` resolves to f16 on
  fp16-capable hardware (A76 NEON, AVX512-FP16) and to f32 elsewhere. **Force
  `INFERENCE_PRECISION_HINT=FP32` on GPU**, where f16 miscompiles.
- f16 is intentionally not provided as a separate file: OpenVINO derives it from the FP32
  master via the precision hint, and on CPU f16 uses *more* RAM than fp32 (the runtime
  up-converts f16 weights to f32 at load).

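
The guidance above can be condensed into a small lookup. This is a sketch only: `pick_variant` and its `backend` / `device` arguments are hypothetical, and passing the hint as the OpenVINO property string `"f32"` is an assumption about how your runtime spells the `FP32` value mentioned above, so check it against your OpenVINO version.

```python
def pick_variant(backend: str, device: str = "CPU") -> tuple[str, dict[str, str]]:
    """Return (model file, runtime properties) for a backend/device pair.

    Hypothetical helper: the file names come from the variants table,
    the property spelling is an assumption.
    """
    if backend == "onnxruntime":
        # Low-RAM CPU / ARM build: INT8 head, FP32 backbone.
        return "BirdNET_v2.4_int8_arm.onnx", {}
    if backend == "openvino":
        if device.upper().startswith("GPU"):
            # f16 is reported to miscompile on GPU, so pin full precision.
            return "BirdNET_v2.4_fp32.onnx", {"INFERENCE_PRECISION_HINT": "f32"}
        # Leave the hint unset so the runtime picks f16 or f32 itself.
        return "BirdNET_v2.4_fp32.onnx", {}
    raise ValueError(f"unknown backend: {backend!r}")


model_file, props = pick_variant("openvino", "GPU")
print(model_file, props)

# With OpenVINO installed, the result could be used roughly like this:
#   import openvino as ov
#   compiled = ov.Core().compile_model(model_file, "GPU", props)
```
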

> Note: this is the **bird classifier**. The BirdNET v2.4 backbone is also used as an
> embedding extractor for bat detection; that embedding model lives separately at
> [`tphakala/BattyBirdNET-onnx`](https://huggingface.co/tphakala/BattyBirdNET-onnx) and
> must stay FP32 (its raw embedding output overflows at f16).

## Labels

`labels.txt` has 6,522 lines, one per class, in BirdNET order. Format is
`Scientific name_Common name`, for example:

```
Abroscopus albogularis_Rufous-faced Warbler
```

Output index `i` (zero-based) corresponds to entry `i` of `labels.txt`, also counted
from zero, so output index 0 is the first line of the file.

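
A minimal sketch of parsing that format into separate names follows. The `Species` tuple and both helper names are made up for illustration, and splitting at the first underscore assumes scientific names never contain one.

```python
from typing import NamedTuple


class Species(NamedTuple):
    scientific: str
    common: str


def parse_label(line: str) -> Species:
    """Split 'Scientific name_Common name' at the first underscore."""
    scientific, _, common = line.strip().partition("_")
    return Species(scientific, common)


def load_labels(path: str = "labels.txt") -> list[Species]:
    """Read every line in file order so list index i matches output index i."""
    with open(path, encoding="utf-8") as fh:
        return [parse_label(line) for line in fh.read().splitlines()]


print(parse_label("Abroscopus albogularis_Rufous-faced Warbler"))
# Species(scientific='Abroscopus albogularis', common='Rufous-faced Warbler')
```

No lines are skipped while loading, so the list stays index-aligned with the model output.
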
## Usage (ONNX Runtime, Python)

```python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("BirdNET_v2.4_int8_arm.onnx")

# 3 s of 48 kHz mono PCM as float32, shape [1, 144000] (silence as a placeholder)
audio = np.zeros((1, 144000), dtype=np.float32)

logits = sess.run(["output"], {"input": audio})[0]  # shape [1, 6522]
conf = 1.0 / (1.0 + np.exp(-logits))                # sigmoid -> [0, 1]

# labels.txt is index-aligned with the output: entry i names class i
with open("labels.txt", encoding="utf-8") as fh:
    labels = fh.read().splitlines()

top = int(conf[0].argmax())
print(labels[top], float(conf[0, top]))
```

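
Because the scores are independent per-class sigmoids rather than a softmax, several species can score highly in the same clip, and a single arg-max hides that. The sketch below ranks the top few classes for one clip. The names `sigmoid` and `top_k`, the logit clipping range, and the default `min_conf=0.1` threshold are illustrative assumptions, not recommended BirdNET settings.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    """Logistic function; clipping avoids overflow warnings on extreme logits."""
    x = np.clip(np.asarray(x, dtype=np.float64), -60.0, 60.0)
    return 1.0 / (1.0 + np.exp(-x))


def top_k(logits, labels, k: int = 3, min_conf: float = 0.1):
    """Return up to k (label, confidence) pairs for ONE clip, best first.

    `logits` is a 1-D sequence of per-class logits (one row of the model
    output); pairs below `min_conf` are dropped.
    """
    conf = sigmoid(logits)
    order = np.argsort(conf)[::-1][:k]  # indices of the k largest scores
    return [(labels[i], float(conf[i])) for i in order if conf[i] >= min_conf]


# Toy example with four made-up classes instead of the real 6,522.
for name, score in top_k([-4.0, 2.0, 0.0, 5.0], ["a", "b", "c", "d"], k=2):
    print(f"{name}: {score:.3f}")
```

With the real model, `top_k(logits[0], labels)` would be applied to each row of the `[batch, 6522]` output.
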
## Checksums

See `SHA256SUMS`.

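
A minimal verification sketch in Python follows. It assumes `SHA256SUMS` uses the usual `sha256sum` layout (hex digest, whitespace, file name, one entry per line) and that the listed files sit next to it; neither detail is stated here, and the helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large models are not loaded into RAM at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(sums_file="SHA256SUMS") -> dict[str, bool]:
    """Map each listed file name to True if its digest matches.

    Missing files are reported as False. Assumes 'digest  name' lines,
    with an optional '*' before the name (binary-mode marker).
    """
    sums_path = Path(sums_file)
    results = {}
    for line in sums_path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")
        target = sums_path.parent / name
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results

# Typical use, run from the directory holding the downloaded files:
#   for name, ok in verify().items():
#       print("OK  " if ok else "FAIL", name)
```
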
## License

BirdNET v2.4 is distributed under **CC BY-NC-SA 4.0** (non-commercial, share-alike,
attribution required). See `LICENSE` and keep the BirdNET attribution above with any use
or redistribution.

## Source

- Upstream: [birdnet-team/BirdNET-Analyzer](https://github.com/birdnet-team/BirdNET-Analyzer)
- ONNX conversion + quantization recipes: [tphakala/birdnet-go](https://github.com/tphakala/birdnet-go)