BioME: A Resource-Efficient Bioacoustic Foundation Model
BioME (Bioacoustic Modulation-aware Encoder) is a resource-efficient audio encoder designed for bioacoustic applications. BioME is trained via layer-to-layer distillation from a high-capacity teacher model (BEATs), enabling strong representational transfer while significantly reducing the parameter count. To further improve ecological generalization, the model is pretrained on multi-domain data spanning speech, environmental sounds, and animal vocalizations. A key contribution is the integration of modulation-aware acoustic features via FiLM conditioning, injecting a DSP-inspired inductive bias that enhances feature disentanglement in low-capacity regimes.
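The FiLM conditioning mentioned above can be illustrated with a minimal sketch. This is not the BioME implementation (see modeling_biome.py for that); it only shows the general mechanism: a conditioning vector, here standing in for the modulation-aware acoustic features, is projected to per-channel scale and shift parameters that modulate the encoder features. All names and dimensions below are illustrative.

```python
import torch
import torch.nn as nn

class FiLM(nn.Module):
    """Feature-wise Linear Modulation: scales and shifts encoder features
    using parameters predicted from a conditioning vector."""

    def __init__(self, cond_dim: int, feat_dim: int):
        super().__init__()
        # One linear layer predicts both gamma (scale) and beta (shift)
        self.proj = nn.Linear(cond_dim, 2 * feat_dim)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, feat_dim); cond: (batch, cond_dim)
        gamma, beta = self.proj(cond).chunk(2, dim=-1)
        # Broadcast the per-channel parameters over the time axis
        return gamma.unsqueeze(1) * x + beta.unsqueeze(1)
```

Because the modulation is purely feature-wise, this adds very few parameters, which is consistent with the low-capacity regime the model targets.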
You can read the full preprint here.
Checkpoints
| Model | Parameters | Dim | Layers | Checkpoint |
|---|---|---|---|---|
| BioME Edge | 6M | 192 | 12 | link |
| BioME Small | 26M | 384 | 12 | link |
| BioME Base | 76M | 768 | 12 | link |
🚀 How To Use
Installation
pip install -U transformers
Load Model and Extract Features
import torch
import torchaudio
from transformers import AutoModel
# Load pre-trained model
model = AutoModel.from_pretrained("Hguimaraes/biome_edge_bio", trust_remote_code=True).cuda().eval()
# Load audio and resample to 16kHz
wav, sr = torchaudio.load("path/to/audio") # (num_channels, num_samples)
wav = torchaudio.functional.resample(
wav,
sr,
16000,
lowpass_filter_width=64,
rolloff=0.9475937167399596,
resampling_method="sinc_interp_kaiser",
beta=14.769656459379492,
)
# Extract features (move the waveform to the same device as the model)
with torch.no_grad():
    output = model(wav.cuda())
# output["last_hidden_states"]: final output (batch_size, seq_len, encoder_dim)
# output["hidden_states"]: list of 12 elements with (batch_size, seq_len, encoder_dim) tensors (features for each layer)
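A common way to use these features downstream is a linear probe on top of the frozen encoder: mean-pool the last hidden states over time and feed the pooled vector to a linear classifier. The sketch below is hypothetical and self-contained; it uses a dummy tensor shaped like the model output above (encoder_dim 192 matches BioME Edge, and num_classes is an arbitrary example value).

```python
import torch
import torch.nn as nn

class LinearProbe(nn.Module):
    """Illustrative linear-probe head over frozen BioME features."""

    def __init__(self, encoder_dim: int = 192, num_classes: int = 10):
        super().__init__()
        self.head = nn.Linear(encoder_dim, num_classes)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Mean-pool over the time axis: (batch, seq_len, dim) -> (batch, dim)
        pooled = hidden_states.mean(dim=1)
        return self.head(pooled)

# Dummy features standing in for output["last_hidden_states"]
feats = torch.randn(4, 100, 192)  # (batch_size, seq_len, encoder_dim)
logits = LinearProbe()(feats)     # (batch_size, num_classes)
```

In practice you would replace the dummy tensor with `output["last_hidden_states"]` from the snippet above, or probe an intermediate layer from `output["hidden_states"]` instead.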
For more details on the model architecture, please see modeling_biome.py.
📖 Citation
@article{
}
Acknowledgement
Much of our code base (and even this README.md!) is based on the following repositories:
Thank you so much to the authors!