EarthSpeciesProject

@@ -1,11 +1,11 @@
 ---
 license: cc-by-nc-sa-4.0
 tags:
-- EarthSpeciesProject
-- AVEX
-- Bioacoustics
-- RepresentationLearning
-- EfficientNet
 ---
 # Model Card for esp-aves2-effnetb0-bio
@@ -25,9 +25,9 @@ esp-aves2-effnetb0-bio is a **supervised bioacoustic encoder** trained to produc
 ### Model Sources
-- **Repository:** `https://github.com/earthspecies/esp-avex`
 - **Paper:** [What Matters for Bioacoustic Encoding](https://arxiv.org/abs/2508.11845)
-- **Hugging Face Model:** `TBA`
 - **Configuration:** [train_config.yaml](train_config.yaml)
 ### Parent Models
@@ -59,30 +59,75 @@ Not a generative model; does not output text.
 ## How to Get Started with the Model
-Loading this model requires the AVEX (Animal Vocalization Encoder) library `esp-avex` to be installed.
 ### Installation
-See [https://github.com/earthspecies/esp-avex](https://github.com/earthspecies/esp-avex) for installation instructions.
 ### Loading the Model
 ```python
 from avex import load_model
-model = load_model("esp-aves2-effnetb0-bio", device="cuda")
 ```
 ### Using the Model
 ```python
 # Case 1: embedding extraction (features only)
-backbone = load_model("esp-aves2-effnetb0-bio", device="cuda", return_features_only=True)
-embeddings = backbone(audio_tensor)
 # Case 2: supervised predictions (logits over label IDs; see label_map.json)
-model = load_model("esp-aves2-effnetb0-bio", device="cuda")
-logits = model(audio_tensor)
 ```
 ### Class Label Mapping
@@ -156,11 +201,11 @@ Aggregate results for linear probing (frozen base model) with esp-aves2-effnetb0
 **BibTeX:**
 ```bibtex
-@article{miron2025whatmattersbioacoustic,
-    title={What Matters for Bioacoustic Encoding},
-    author={Miron, Marius and Robinson, David and Alizadeh, Milad and Gilsenan-McMahon, Ellen and Narula, Gagan and Chemla, Emmanuel and Cusimano, Maddie and Effenberger, Felix and Hagiwara, Masato and Hoffman, Benjamin and Keen, Sara and Kim, Diane and Lawton, Jane K. and Liu, Jen-Yu and Raskin, Aza and Pietquin, Olivier and Geist, Matthieu},
-    journal={arXiv preprint arXiv:2508.11845},
-    year={2025}
 }
 ```

 ---
 license: cc-by-nc-sa-4.0
 tags:
+  - EarthSpeciesProject
+  - AVEX
+  - Bioacoustics
+  - RepresentationLearning
+  - EfficientNet
 ---
 # Model Card for esp-aves2-effnetb0-bio
 ### Model Sources
+- **Repository:** `https://github.com/earthspecies/avex`
 - **Paper:** [What Matters for Bioacoustic Encoding](https://arxiv.org/abs/2508.11845)
+- **Hugging Face Model:** [ESP-AVES2 Collection](https://huggingface.co/collections/EarthSpeciesProject/esp-aves2)
 - **Configuration:** [train_config.yaml](train_config.yaml)
 ### Parent Models
 ## How to Get Started with the Model
+Loading this model requires the AVEX (Animal Vocalization Encoder) library `avex` to be installed.
 ### Installation
+```bash
+pip install avex
+```
+Or with uv:
+```bash
+uv add avex
+```
+For more details, see [https://github.com/earthspecies/avex](https://github.com/earthspecies/avex).
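Before calling `load_model`, it can help to confirm that the package is importable in the current environment. The snippet below is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of `avex`.

```python
from importlib import util


def is_installed(module_name):
    """Return True if a top-level module can be found in this environment."""
    return util.find_spec(module_name) is not None


# Hypothetical pre-flight check; the install commands are the ones listed above.
if not is_installed("avex"):
    print("avex not found; install it with `pip install avex` or `uv add avex`")
```

The check only looks the module up and does not import it, so it is cheap to run at the top of a script.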
 ### Loading the Model
 ```python
 from avex import load_model
+model = load_model("esp_aves2_effnetb0_bio", device="cuda")
 ```
 ### Using the Model
 ```python
+import torch
+from avex import load_model
+
+# audio_tensor: a batch of input audio prepared as the model expects (not shown here)
 # Case 1: embedding extraction (features only)
+backbone = load_model("esp_aves2_effnetb0_bio", device="cuda", return_features_only=True)
+with torch.no_grad():
+    embeddings = backbone(audio_tensor)
+    # Shape: (batch, channels, height, width) for EfficientNet
+# Pool over the spatial dimensions to get a fixed-size embedding
+embedding = embeddings.mean(dim=(2, 3))  # Shape: (batch, channels)
 # Case 2: supervised predictions (logits over label IDs; see label_map.json)
+model = load_model("esp_aves2_effnetb0_bio", device="cuda")
+with torch.no_grad():
+    logits = model(audio_tensor)
+    # One label ID per batch item; call .item() only when the batch holds a single clip
+    predicted_class = logits.argmax(dim=-1)
+```
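The predicted index from Case 2 is a label ID, which still has to be looked up in `label_map.json` to get a readable class name. The sketch below shows one way to do that with the standard library. It assumes the file maps label names to integer IDs; the card does not show the file's layout, so verify this before relying on it. The helper names and the example labels are hypothetical.

```python
import json


def argmax(values):
    """Return the index of the largest value in a non-empty sequence."""
    return max(range(len(values)), key=lambda i: values[i])


def invert_label_map(label_to_id):
    """Turn {label: id} into {id: label}. Assumes a name-to-ID layout (unverified)."""
    return {int(idx): label for label, idx in label_to_id.items()}


def decode_prediction(logits_row, id_to_label):
    """Map one row of logits to (label_id, label name or None if the ID is unknown)."""
    label_id = argmax(logits_row)
    return label_id, id_to_label.get(label_id)


# Illustrative data only; real labels would come from label_map.json.
example_map = json.loads('{"species_a": 0, "species_b": 1, "species_c": 2}')
id_to_label = invert_label_map(example_map)
label_id, label = decode_prediction([0.1, 2.3, -0.7], id_to_label)
print(label_id, label)
```

With a PyTorch tensor, a single row can be passed as `logits[0].tolist()`.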
+### Transfer Learning with Probes
+```python
+from avex import load_model
+from avex.models.probes import build_probe_from_config
+from avex.configs import ProbeConfig
+# Load backbone for feature extraction
+base = load_model("esp_aves2_effnetb0_bio", return_features_only=True, device="cuda")
+# Define a probe head for your task
+probe_config = ProbeConfig(
+    probe_type="linear",
+    target_layers=["last_layer"],
+    aggregation="mean",
+    freeze_backbone=True,
+    online_training=True,
+)
+probe = build_probe_from_config(
+    probe_config=probe_config,
+    base_model=base,
+    num_classes=10,  # Your number of classes
+    device="cuda",
+)
 ```
 ### Class Label Mapping
 **BibTeX:**
 ```bibtex
+@inproceedings{miron2025matters,
+  title={What Matters for Bioacoustic Encoding},
+  author={Miron, Marius and Robinson, David and Alizadeh, Milad and Gilsenan-McMahon, Ellen and Narula, Gagan and Chemla, Emmanuel and Cusimano, Maddie and Effenberger, Felix and Hagiwara, Masato and Hoffman, Benjamin and Keen, Sara and Kim, Diane and Lawton, Jane K. and Liu, Jen-Yu and Raskin, Aza and Pietquin, Olivier and Geist, Matthieu},
+  booktitle={The Fourteenth International Conference on Learning Representations},
+  year={2026}
 }
 ```