leharris3/FINCH


leharris3 committed 17 days ago

Commit 75afa95 · verified · 1 Parent(s): 796dfc3

Add model card

Files changed (1)

README.md +52 -4


@@ -4,12 +4,60 @@ pipeline_tag: audio-classification
 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
 license: apache-2.0
 language:
 - en
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Code: [More Information Needed]
-- Paper: [More Information Needed]
-- Docs: [More Information Needed]

 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
+- bioacoustics
+- bird-species
 license: apache-2.0
 language:
 - en
+datasets:
+- birdclef-2021
 ---
+# FINCH: Adaptive Evidence Weighting for Audio-Spatiotemporal Fusion
+This is the Stage B checkpoint for **FINCH**, a bioacoustic species identification framework that fuses audio classification with spatiotemporal priors from [eBird](https://ebird.org/) abundance data. A frozen [NatureLM-audio](https://huggingface.co/EarthSpeciesProject/NatureLM-audio) encoder is paired with a learned gating network that adaptively weights audio evidence against space-time context.
+- **Paper:** [arXiv:2602.03817](https://arxiv.org/abs/2602.03817)
+- **Code:** [github.com/leharris3/birdnoise](https://github.com/leharris3/birdnoise)
+## Model description
+The Stage B model computes fused logits as:
+```
+final_logits = audio_logits / T + w(a, x, t) * log(prior + eps)
+```
+where `w(a, x, t)` is a small gating MLP conditioned on audio confidence, prior confidence, location, and time-of-year. `T` and `eps` are learned scalars.
+Only the trainable parameters (classifier head, gating network, temperature, epsilon) are stored here (~3 MB). The frozen NatureLM-audio encoder (~8B params) is downloaded separately from [EarthSpeciesProject/NatureLM-audio](https://huggingface.co/EarthSpeciesProject/NatureLM-audio) at load time.
+## How to use
+Requires the [FINCH source code](https://github.com/leharris3/birdnoise) and its dependencies.
+```python
+from Models import HFStageBModel
+model = HFStageBModel.from_pretrained("leharris3/FINCH")
+```
+## Training details
+- **Dataset:** BirdCLEF 2021 (184 species)
+- **Encoder:** NatureLM-audio (frozen)
+- **Stage A:** Linear probe + scalar fusion weight (30 epochs)
+- **Stage B:** Gating network `w(a, x, t)` + learned T, eps (warm-started from Stage A)
+- **Best val accuracy:** 82.0%
+## Citation
+```bibtex
+@article{ovanger2026adaptive,
+    title   = {Adaptive Evidence Weighting for Audio-Spatiotemporal Fusion},
+    author  = {Oscar Ovanger and Levi Harris and Timothy H. Keitt},
+    journal = {arXiv preprint arXiv:2602.03817},
+    year    = {2026},
+    url     = {https://arxiv.org/abs/2602.03817}
+}
+```
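
Note on the fusion rule in the added model card: the formula `final_logits = audio_logits / T + w(a, x, t) * log(prior + eps)` can be illustrated with a small NumPy sketch. This is not the FINCH implementation. The function name `fuse_logits`, the example values, and the use of a fixed scalar `w` in place of the gating MLP output are all assumptions made for illustration; in the model, `w`, `T`, and `eps` are learned.

```python
import numpy as np


def fuse_logits(audio_logits, prior, w, T=1.0, eps=1e-6):
    """Illustrative only: audio_logits / T + w * log(prior + eps).

    `w` stands in for the gating network output w(a, x, t); here it is a
    plain scalar supplied by the caller (an assumption for this sketch).
    """
    audio_logits = np.asarray(audio_logits, dtype=float)
    prior = np.asarray(prior, dtype=float)
    return audio_logits / T + w * np.log(prior + eps)


# Hypothetical three-species example: the audio head slightly prefers
# class 0, but the spatiotemporal prior says class 0 is very unlikely here.
audio_logits = np.array([2.0, 1.5, 0.1])
prior = np.array([0.01, 0.90, 0.09])

audio_only = fuse_logits(audio_logits, prior, w=0.0)  # prior ignored
fused = fuse_logits(audio_logits, prior, w=1.0)       # prior fully weighted

print("audio-only prediction:", int(np.argmax(audio_only)))
print("fused prediction:", int(np.argmax(fused)))
```

With `w = 0` the prior term vanishes and the prediction depends on the audio logits alone; as `w` grows, the log-prior increasingly penalizes species that are unlikely at the given place and time. The gating network described in the card chooses this weight per example.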