CatchtheCatcher
Author: Quanxiao Liu, Stockholm University
License: CC-BY-NC-SA 4.0
Overview
CatchtheCatcher is a two-model pipeline for automated detection and population-level classification of pied flycatcher (Ficedula hypoleuca) vocalisations from field recordings. The pipeline accepts audio recordings as input, converts them to spectrograms, and applies the two models sequentially: first to detect vocalisation events, then to assign each detected song to one of eight European breeding populations.
Models
1. CatchtheCatcher.pt – Vocalisation Detection
A YOLOv8-based object detection model fine-tuned to localise pied flycatcher vocalisation events in spectrogram images.
| Property | Detail |
|---|---|
| Architecture | YOLOv8 (fine-tuned from Ultralytics/YOLOv8) |
| Input | 640 × 640 px spectrogram images (PNG) |
| Output | Bounding boxes over vocalisation events |
| Training data | Field recordings from Tovetorp, Sweden |
| Training epochs | 100 |
Performance (validation set):
| Metric | Value |
|---|---|
| mAP50 | 94.0% |
| Precision | 88.8% |
| Recall | 88.4% |
All losses (box, cls, dfl) converge smoothly by epoch 100 with no signs of overfitting.
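Precision and recall can be combined into a single F1 score, their harmonic mean. The snippet below is a small illustrative calculation from the two validation values listed above. The function name `f1_score` is chosen here for illustration, and the result is derived arithmetic rather than a separately reported metric.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Validation precision and recall from the table, expressed as fractions.
f1 = f1_score(0.888, 0.884)
print(f"F1 = {f1:.3f}")  # F1 = 0.886
```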
2. piedflycatcher_population-identification.pt – Population Classification
A classification model that assigns detected pied flycatcher songs to one of eight European breeding populations based on spectrogram features.
| Property | Detail |
|---|---|
| Architecture | YOLOv8 (fine-tuned from Ultralytics/YOLOv8) |
| Input | Spectrogram of a detected song segment (640 × 640 px) |
| Output | Population label (one of 8 classes) |
| Training data | 200–500 songs per population (see below) |
Populations:
| # | Population / Site | Country |
|---|---|---|
| 1 | Dartmoor | United Kingdom |
| 2 | De Hoge Veluwe | Netherlands |
| 3 | Drenthe | Netherlands |
| 4 | Finland | Finland |
| 5 | La Hiruela | Spain |
| 6 | Lund | Sweden |
| 7 | Tovetorp | Sweden |
| 8 | Valsain | Spain |
Performance: The classifier was evaluated with a confusion matrix on held-out test songs. The confusion matrix is available on request.
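For readers who obtain the confusion matrix, the sketch below shows one way to summarise it as per-population recall and overall accuracy. It assumes rows are true populations and columns are predicted populations; the orientation of the actual matrix is not stated here. The function names, the labels `site_A` to `site_C`, and the counts are all hypothetical placeholders, not results from this model.

```python
def per_class_recall(matrix, labels):
    """Return {label: recall}, where matrix[i][j] counts songs of true
    class i that were predicted as class j."""
    recalls = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        recalls[label] = matrix[i][i] / row_total if row_total else float("nan")
    return recalls


def overall_accuracy(matrix):
    """Fraction of all songs that lie on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else float("nan")


# Toy 3-class example with made-up counts (NOT results from this model).
toy_labels = ["site_A", "site_B", "site_C"]
toy_matrix = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
print(per_class_recall(toy_matrix, toy_labels))
print(overall_accuracy(toy_matrix))
```

The same functions apply unchanged to an 8 × 8 matrix with the eight population names as labels.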
Intended Use
This pipeline is intended for researchers studying pied flycatcher biogeography, population structure, and vocal dialects. It may also be used for passive acoustic monitoring of pied flycatcher breeding populations across Europe.
Not intended for: species identification (the detector assumes that recordings contain pied flycatcher vocalisations), real-time deployment, or use on populations outside the eight training sites without prior validation.
How to Use
from ultralytics import YOLO
# Step 1: detect vocalisations in a spectrogram
detector = YOLO("CatchtheCatcher.pt")
results = detector("your_spectrogram.png")
# Step 2: classify detected crops into populations
classifier = YOLO("piedflycatcher_population-identification.pt")
# Pass cropped detections from step 1 to the classifier
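Step 2 can be sketched as follows. This is a minimal illustration, not the author's exact procedure: it assumes the classifier is an Ultralytics classification model (so that results expose `probs`), that crops are taken directly from the detector's bounding boxes, and that each crop is stretched to 640 × 640 px. The helper names `clamp_box` and `classify_detections` are hypothetical.

```python
def clamp_box(box, width, height):
    """Round an (x1, y1, x2, y2) box to integers and clip it to the image."""
    x1, y1, x2, y2 = box
    left = max(0, min(int(round(x1)), width))
    top = max(0, min(int(round(y1)), height))
    right = max(0, min(int(round(x2)), width))
    bottom = max(0, min(int(round(y2)), height))
    return left, top, right, bottom


def classify_detections(
    spectrogram_path,
    detector_path="CatchtheCatcher.pt",
    classifier_path="piedflycatcher_population-identification.pt",
):
    """Detect vocalisations, crop each box, and classify each crop."""
    # Imported lazily so clamp_box can be used without these packages.
    from PIL import Image
    from ultralytics import YOLO

    detector = YOLO(detector_path)
    classifier = YOLO(classifier_path)

    image = Image.open(spectrogram_path).convert("RGB")
    detections = detector(spectrogram_path)[0]

    predictions = []
    for box in detections.boxes.xyxy.tolist():
        crop_box = clamp_box(box, image.width, image.height)
        left, top, right, bottom = crop_box
        if right <= left or bottom <= top:
            continue  # skip degenerate boxes
        # Assumption: crops are stretched to 640 x 640 to match the stated input size.
        crop = image.crop(crop_box).resize((640, 640))
        result = classifier(crop)[0]
        label = result.names[result.probs.top1]
        predictions.append((crop_box, label, float(result.probs.top1conf)))
    return predictions
```

Calling `classify_detections("your_spectrogram.png")` would return one `(box, population, confidence)` tuple per detected vocalisation, provided both weight files are available locally.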
Spectrograms should be generated at 640 × 640 px with the following parameters:
n_fft = 1024 # STFT window length
hop_length = 512 # hop length
window_type = 'hann'
fmin = 100 # Hz
fmax = 10000 # Hz
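One way to produce an image with these parameters is sketched below using NumPy only. Several choices are assumptions not specified above: the sample rate (which should be at least twice `fmax` to cover the full band), a linear frequency axis, decibel scaling, greyscale output, and nearest-neighbour resampling to 640 × 640. The function name `spectrogram_image` is hypothetical, and the result may differ from the spectrograms used in training.

```python
import numpy as np

N_FFT = 1024        # STFT window length
HOP_LENGTH = 512    # hop length
FMIN = 100.0        # Hz
FMAX = 10000.0      # Hz
IMAGE_SIZE = 640    # output width and height in pixels


def spectrogram_image(signal, sample_rate):
    """Return a 640 x 640 uint8 array with low frequencies at the bottom."""
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < N_FFT:
        signal = np.pad(signal, (0, N_FFT - len(signal)))

    # Short-time Fourier transform with a periodic Hann window.
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * np.arange(N_FFT) / N_FFT)
    n_frames = 1 + (len(signal) - N_FFT) // HOP_LENGTH
    frames = np.stack([
        signal[i * HOP_LENGTH: i * HOP_LENGTH + N_FFT] * window
        for i in range(n_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T  # (freq_bins, frames)

    # Keep only the frequency bins between FMIN and FMAX.
    freqs = np.fft.rfftfreq(N_FFT, d=1.0 / sample_rate)
    band = magnitude[(freqs >= FMIN) & (freqs <= FMAX)]

    # Assumption: decibel scaling, normalised to the range [0, 1].
    db = 20.0 * np.log10(band + 1e-10)
    span = db.max() - db.min()
    scaled = (db - db.min()) / span if span > 0 else np.zeros_like(db)

    # Assumption: nearest-neighbour resampling to 640 x 640.
    rows = np.linspace(0, scaled.shape[0] - 1, IMAGE_SIZE).round().astype(int)
    cols = np.linspace(0, scaled.shape[1] - 1, IMAGE_SIZE).round().astype(int)
    image = scaled[np.ix_(rows, cols)]

    # Flip vertically so that low frequencies sit at the bottom of the image.
    return (image[::-1] * 255).astype(np.uint8)
```

The returned array can be written to PNG with any image library, for example `PIL.Image.fromarray(array).save("your_spectrogram.png")`.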
Citation
If you use CatchtheCatcher in your research, please cite:
Liu, Q. (2025). CatchtheCatcher: Detection and population classification model
for pied flycatcher (Ficedula hypoleuca) vocalisations [Model].
Stockholm University. Hugging Face.
https://huggingface.co/Tempestmars/CatchtheCatcher
License
This model is released under CC-BY-NC-SA 4.0. You are free to share and adapt the material for non-commercial purposes, provided appropriate credit is given and derivatives are shared under the same license.