BioME: A Resource-Efficient Bioacoustic Foundation Model
BioME (Bioacoustic Modulation-aware Encoder) is a resource-efficient audio encoder designed for bioacoustic applications. BioME is trained via layer-to-layer distillation from a high-capacity teacher model (BEATs), enabling strong representational transfer while significantly reducing the parameter count. To further improve ecological generalization, the model is pretrained on multi-domain data spanning speech, environmental sounds, and animal vocalizations. A key contribution is the integration of modulation-aware acoustic features via FiLM conditioning, injecting a DSP-inspired inductive bias that enhances feature disentanglement in low-capacity regimes.
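The FiLM conditioning mentioned above can be illustrated with a minimal sketch. This is not the BioME implementation (see modeling_biome.py for that); it only shows the general mechanism: a conditioning vector, here standing in for the modulation-aware acoustic features, is projected to per-channel scale and shift parameters that modulate the encoder features. All names and dimensions below are illustrative.

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Feature-wise Linear Modulation: scales and shifts encoder features
    using parameters predicted from a conditioning vector."""

    def __init__(self, cond_dim: int, feat_dim: int):
        super().__init__()
        # One linear layer predicts both gamma (scale) and beta (shift)
        self.proj = nn.Linear(cond_dim, 2 * feat_dim)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, feat_dim); cond: (batch, cond_dim)
        gamma, beta = self.proj(cond).chunk(2, dim=-1)
        # Broadcast the per-channel parameters over the time axis
        return gamma.unsqueeze(1) * x + beta.unsqueeze(1)
```

Because the modulation is purely feature-wise, this adds very few parameters, which is consistent with the low-capacity regime the model targets.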
You can read the full preprint here.
Checkpoints
| Model | Parameters | Dim | Layers | Checkpoint |
|---|---|---|---|---|
| BioME Edge | 6M | 192 | 12 | link |
| BioME Small | 26M | 384 | 12 | link |
| BioME Base | 76M | 768 | 12 | link |
🚀 How To Use
Installation
pip install -U transformers
Load Model and Extract Features
import torch
import torchaudio
from transformers import AutoModel
# Load pre-trained model
model = AutoModel.from_pretrained("Hguimaraes/biome_edge_bio", trust_remote_code=True).cuda().eval()
# Load audio and resample to 16kHz
wav, sr = torchaudio.load("path/to/audio") # (num_channels, num_samples)
wav = torchaudio.functional.resample(
wav,
sr,
16000,
lowpass_filter_width=64,
rolloff=0.9475937167399596,
resampling_method="sinc_interp_kaiser",
beta=14.769656459379492,
)
# Extract features (move the waveform to the same device as the model)
with torch.no_grad():
    output = model(wav.cuda())
# output["last_hidden_states"]: final output (batch_size, seq_len, encoder_dim)
# output["hidden_states"]: list of 12 elements with (batch_size, seq_len, encoder_dim) tensors (features for each layer)
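A common way to use these features downstream is a linear probe on top of the frozen encoder: mean-pool the last hidden states over time and feed the pooled vector to a linear classifier. The sketch below is hypothetical and self-contained; it uses a dummy tensor shaped like the model output above (encoder_dim 192 matches BioME Edge, and num_classes is an arbitrary example value).

```python
import torch
import torch.nn as nn

class LinearProbe(nn.Module):
    """Illustrative linear-probe head over frozen BioME features."""

    def __init__(self, encoder_dim: int = 192, num_classes: int = 10):
        super().__init__()
        self.head = nn.Linear(encoder_dim, num_classes)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Mean-pool over the time axis: (batch, seq_len, dim) -> (batch, dim)
        pooled = hidden_states.mean(dim=1)
        return self.head(pooled)

# Dummy features standing in for output["last_hidden_states"]
feats = torch.randn(4, 100, 192)  # (batch_size, seq_len, encoder_dim)
logits = LinearProbe()(feats)     # (batch_size, num_classes)
```

In practice you would replace the dummy tensor with `output["last_hidden_states"]` from the snippet above, or probe an intermediate layer from `output["hidden_states"]` instead.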
For more details on the model architecture, please see modeling_biome.py.
📖 Citation
@article{
}
Acknowledgement
Much of our code base (and even this README.md!) is based on the following repositories:
Thank you so much to the authors!