# BattyBirdNET ONNX Models

ONNX-converted regional bat species classifiers from BattyBirdNET-Analyzer, optimized for inference with ONNX Runtime.

These are lightweight classification heads trained on BirdNET v2.4 embeddings via transfer learning. They identify bat species from ultrasonic echolocation recordings.
## How It Works

BattyBirdNET uses a two-stage inference pipeline:

1. BirdNET v2.4 extracts 1024-dimensional embeddings from the audio.
2. A regional bat classifier (this repo) maps the embeddings to bat species.
The "slow-down trick": bat recordings at 256 kHz are fed directly to BirdNET without resampling. BirdNET's spectrogram pipeline (trained on 48 kHz bird audio) treats the samples as 48 kHz, shifting ultrasonic bat calls (20-120 kHz) into the audible frequency range where its learned features work.
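The arithmetic behind this trick is simple enough to sketch. Interpreting 256 kHz samples as 48 kHz stretches time by 256/48 ≈ 5.33×, which divides every frequency by the same factor (plain Python; the rates and band edges are the ones stated above):

```python
# Effective frequency under the "slow-down trick": samples recorded at
# 256 kHz are interpreted by BirdNET's pipeline as if they were 48 kHz.
RECORD_RATE = 256_000   # bat detector sample rate
BIRDNET_RATE = 48_000   # rate BirdNET's spectrogram pipeline assumes

def apparent_freq(true_freq_hz: float) -> float:
    """Frequency BirdNET 'hears' for a given true ultrasonic frequency."""
    return true_freq_hz * BIRDNET_RATE / RECORD_RATE

# Bat calls at 20-120 kHz land inside BirdNET's audible working range
# (and below the 24 kHz Nyquist limit of 48 kHz audio):
low = apparent_freq(20_000)    # 3750.0 Hz
high = apparent_freq(120_000)  # 22500.0 Hz
print(f"{low:.0f} Hz - {high:.0f} Hz")
```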
## Model Variants

Three precision formats are provided for each regional classifier:

| Format | Directory | Use Case |
|---|---|---|
| FP32 | `fp32/` | Standard precision for GPU and desktop CPU |
| FP16 | `fp16/` | Half precision for Raspberry Pi 5, modern GPUs |
| INT8 | `int8/` | Quantized for low-power devices, embedded ARM |
## Available Regions

| Region | Model | Species | Coverage |
|---|---|---|---|
| Bavaria | BattyBirdNET-Bavaria-256kHz | 32 | Germany, Central Europe |
| Bavaria (high) | BattyBirdNET-Bavaria-256kHz-high | 24 | Germany, stricter thresholds |
| EU | BattyBirdNET-EU-256kHz | 30 | Broad European coverage |
| Scotland | BattyBirdNET-Scotland-256kHz | 11 | Scotland |
| South Wales | BattyBirdNET-SouthWales-256kHz | 29 | South Wales |
| Sweden | BattyBirdNET-Sweden-256kHz | 23 | Sweden, Nordic |
| UK | BattyBirdNET-UK-256kHz | 20 | United Kingdom |
| USA | BattyBirdNET-USA-256kHz | 38 | United States (full) |
| USA East | BattyBirdNET-USA-EAST-256kHz | 23 | Eastern United States |
| USA East (high) | BattyBirdNET-USA-EAST-256kHz-high | 17 | Eastern US, stricter thresholds |
| USA West | BattyBirdNET-USA-WEST-256kHz | 28 | Western United States |
## Model Architecture

- Input: `[batch, 1024]` float32 (BirdNET v2.4 embedding vectors)
- Output: `[batch, N_species]` float32 (raw logits; apply sigmoid for probabilities)
- Activation: Sigmoid (same as BirdNET)
- Labels: one species per line in the corresponding `_Labels.txt` file
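A minimal sketch of this input/output contract, using a NumPy stand-in instead of a real ONNX session (the random projection is hypothetical; the 32-species count matches the Bavaria model):

```python
import numpy as np

N_SPECIES = 32  # e.g. the Bavaria classifier

def classify(embeddings: np.ndarray) -> np.ndarray:
    """Stand-in for the ONNX head: [batch, 1024] float32 -> [batch, N_species] scores."""
    assert embeddings.shape[1:] == (1024,) and embeddings.dtype == np.float32
    # A real session returns raw logits; here we fake them with a fixed
    # random projection, then apply the sigmoid activation the models use.
    rng = np.random.default_rng(0)
    w = rng.standard_normal((1024, N_SPECIES)).astype(np.float32)
    logits = embeddings @ w
    return 1.0 / (1.0 + np.exp(-logits))  # per-species scores in (0, 1)

emb = np.zeros((2, 1024), dtype=np.float32)
scores = classify(emb)
print(scores.shape)  # (2, 32)
```

Because the activation is a sigmoid rather than a softmax, each species score is independent and the row does not sum to 1; multiple species can exceed a detection threshold in the same chunk.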
## Usage with Birda (Rust CLI)

```bash
# Install models
mkdir -p ~/.local/share/birda/models/bat/
cp fp32/BattyBirdNET-Bavaria-256kHz_fp32.onnx ~/.local/share/birda/models/bat/
cp labels/BattyBirdNET-Bavaria-256kHz_Labels.txt ~/.local/share/birda/models/bat/

# Run bat detection
birda analyze -m birdnet-v24-embeddings --bat bavaria bat_recording.wav
```

See birda for full documentation.
Usage with Python (ONNX Runtime)
import numpy as np
import onnxruntime as ort
# Load BirdNET v2.4 (with embeddings exposed) and bat classifier
birdnet = ort.InferenceSession("birdnet-v24-embeddings.onnx")
bat_model = ort.InferenceSession("fp32/BattyBirdNET-Bavaria-256kHz_fp32.onnx")
# Load 256kHz bat audio (144,000 samples = 0.5625s)
audio = np.random.randn(1, 144000).astype(np.float32) # replace with real audio
# Stage 1: Extract embeddings
outputs = birdnet.run(None, {"input": audio})
embeddings = outputs[1] # [1, 1024]
# Stage 2: Classify bat species
logits = bat_model.run(None, {"input": embeddings})[0]
scores = 1 / (1 + np.exp(-logits)) # sigmoid
# Load labels
with open("labels/BattyBirdNET-Bavaria-256kHz_Labels.txt") as f:
labels = [line.strip() for line in f if line.strip()]
for i, (label, score) in enumerate(zip(labels, scores[0])):
if score > 0.1:
print(f"{label}: {score:.1%}")
## Conversion Details

Converted from BattyBirdNET-Analyzer v1.0 TFLite models using birdnet-onnx-converter:

- TFLite to ONNX via tf2onnx (opset 17)
- Optimized with onnxslim (graph simplification, dynamic batching)
- FP16 and INT8 quantization via onnxconverter-common and onnxruntime
## Audio Requirements

- Sample rate: 256 kHz (standard for AudioMoth and similar bat detectors)
- Chunk size: 144,000 samples (0.5625 seconds at 256 kHz)
- Overlap: 25% recommended (36,000 samples)
- Format: WAV recommended; FLAC and MP3 also supported
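The chunking above can be sketched directly: with a 144,000-sample window and 36,000 samples of overlap, consecutive windows start 108,000 samples apart (plain Python; the constants are the ones listed above):

```python
# Start offsets for 0.5625 s analysis windows with 25 % overlap.
CHUNK = 144_000        # samples per window at 256 kHz
OVERLAP = 36_000       # 25 % of the chunk
HOP = CHUNK - OVERLAP  # 108,000 samples between window starts

def chunk_starts(n_samples: int) -> list[int]:
    """Start indices of every full window covering an n-sample recording."""
    if n_samples < CHUNK:
        return []  # recording too short for even one window
    return list(range(0, n_samples - CHUNK + 1, HOP))

# A 1-second recording at 256 kHz yields two overlapping windows:
print(chunk_starts(256_000))  # [0, 108000]
```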
## Training Data Sources

The original BattyBirdNET models were trained on data from:

- NABat Machine Learning (North America)
- xeno-canto (bird and bat recordings)
- ChiroVox (public bat call library)
- Animal Sound Archive Berlin (Humboldt University)
## License

CC-BY-NC-SA-4.0 (non-commercial use only), following the original BattyBirdNET-Analyzer license.

## Citation

If you use these models, please cite:

- BattyBirdNET-Analyzer
- BirdNET-Analyzer (Kahl et al.)
## Acknowledgments

- rdz-oss for the original BattyBirdNET-Analyzer and trained models
- The BirdNET team at the Cornell Lab of Ornithology for the BirdNET v2.4 backbone
- birdnet-onnx-converter for TFLite-to-ONNX conversion and optimization