---
description: "Pretrained weights for CLIBD, a multimodal model bridging vision and genomics for biodiversity monitoring."
---

# Model Card for CLIBD

Official pretrained models for **CLIBD: Bridging Vision and Genomics for Biodiversity Monitoring at Scale.**

Model usage and code: [github.com/bioscan-ml/clibd](https://github.com/bioscan-ml/clibd).

## Model Details

### Model Description

- **Finetuned from model:**
  - Image: timm model (["vit_base_patch16_224"](https://huggingface.co/timm/vit_base_patch16_224.mae))
  - DNA barcode: BarcodeBERT (["bioscanr/barcodeBERT pre-trained on CANADA-1.5M"](https://huggingface.co/bioscan-ml/bioscan-clibd/tree/main/ckpt/BarcodeBERT/5_mer))
  - Text: Pre-trained BERT model (["prajjwal1/bert-small"](https://huggingface.co/prajjwal1/bert-small))
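
As a rough illustration of how the image and text backbones listed above could be instantiated before CLIBD fine-tuning, here is a minimal sketch. The identifiers are copied from the links in this card; the helper names (`BACKBONES`, `load_image_encoder`, `load_text_encoder`) are hypothetical and not part of the CLIBD codebase, and dropping the classifier head with `num_classes=0` is an assumption about how one might extract features. Loading the BarcodeBERT weights is left out because their file layout is not described here.

```python
# Hub identifiers copied from the links in this model card.
BACKBONES = {
    "image": "vit_base_patch16_224.mae",  # timm model name
    "text": "prajjwal1/bert-small",       # Hugging Face model id
}


def load_image_encoder(pretrained: bool = True):
    """Create the ViT-B/16 backbone as a feature extractor (no classifier head)."""
    import timm  # imported lazily so this module loads even without timm installed

    # num_classes=0 removes the classification head (an assumption for this sketch).
    return timm.create_model(BACKBONES["image"], pretrained=pretrained, num_classes=0)


def load_text_encoder():
    """Load the BERT-small tokenizer and encoder."""
    from transformers import AutoModel, AutoTokenizer  # lazy import, as above

    tokenizer = AutoTokenizer.from_pretrained(BACKBONES["text"])
    model = AutoModel.from_pretrained(BACKBONES["text"])
    return tokenizer, model
```

These calls load only the original pretrained backbones, not the CLIBD-aligned weights; for the latter, see the repository's own loading code.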

### Model Sources

- **Repository:** https://github.com/bioscan-ml/clibd
- **Paper:** https://arxiv.org/abs/2405.17537

### Model Checkpoints

- **`ckpt/bioscan_clip/final_experiments/image_dna_4gpu_50epoch/best.pth`:** Trained on the BIOSCAN-1M dataset by aligning images and DNA barcodes.
- **`ckpt/bioscan_clip/final_experiments/image_dna_text_4gpu_50epoch/best.pth`:** Trained on the BIOSCAN-1M dataset by aligning images, DNA barcodes, and taxonomy labels.
- **`ckpt/bioscan_clip/new_5M_training/image_dna_4gpu_50epoch/best.pth`:** Trained on the BIOSCAN-5M dataset by aligning images and DNA barcodes.
- **`ckpt/bioscan_clip/new_5M_training/image_dna_text_4gpu_50epoch/best.pth`:** Trained on the BIOSCAN-5M dataset by aligning images, DNA barcodes, and taxonomy labels.
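
The checkpoint paths above follow a regular pattern, so selecting and downloading one can be sketched as below. This is an illustrative helper, not part of the official code: the repository id `bioscan-ml/bioscan-clibd` is an assumption inferred from the BarcodeBERT link in this card, and the names `CHECKPOINTS`, `checkpoint_path`, and `download_checkpoint` are hypothetical.

```python
# Assumption: the checkpoints live in the same Hub repo as the BarcodeBERT link above.
REPO_ID = "bioscan-ml/bioscan-clibd"

# (training dataset, aligned modalities) -> path inside the repo, as listed in this card.
CHECKPOINTS = {
    ("1M", "image_dna"): "ckpt/bioscan_clip/final_experiments/image_dna_4gpu_50epoch/best.pth",
    ("1M", "image_dna_text"): "ckpt/bioscan_clip/final_experiments/image_dna_text_4gpu_50epoch/best.pth",
    ("5M", "image_dna"): "ckpt/bioscan_clip/new_5M_training/image_dna_4gpu_50epoch/best.pth",
    ("5M", "image_dna_text"): "ckpt/bioscan_clip/new_5M_training/image_dna_text_4gpu_50epoch/best.pth",
}


def checkpoint_path(dataset: str, modalities: str) -> str:
    """Return the in-repo path for a checkpoint, e.g. ("5M", "image_dna_text")."""
    try:
        return CHECKPOINTS[(dataset, modalities)]
    except KeyError:
        raise ValueError(f"No checkpoint for dataset={dataset!r}, modalities={modalities!r}")


def download_checkpoint(dataset: str, modalities: str) -> str:
    """Download the checkpoint from the Hub and return its local file path."""
    from huggingface_hub import hf_hub_download  # lazy import; needs network access

    return hf_hub_download(repo_id=REPO_ID, filename=checkpoint_path(dataset, modalities))
```

The returned local path can then be opened with `torch.load(path, map_location="cpu")`. The structure of the saved object is not documented in this card, so inspect it before relying on any particular keys.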

## Training Data

- [BIOSCAN-1M](https://huggingface.co/datasets/bioscan-ml/BIOSCAN-1M)
- [BIOSCAN-5M](https://huggingface.co/datasets/bioscan-ml/BIOSCAN-5M)

The processed data is also available at [bioscan-ml/bioscan-clibd](https://huggingface.co/datasets/bioscan-ml/bioscan-clibd).
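
To see what each dataset repository contains before downloading anything large, one option is to list its files through the Hub API. The sketch below assumes only the dataset ids that appear in the links above; the names `DATASET_REPOS`, `dataset_url`, and `list_dataset_files` are hypothetical, and no particular file layout is assumed.

```python
# Dataset repo ids taken from the links in this card.
DATASET_REPOS = {
    "BIOSCAN-1M": "bioscan-ml/BIOSCAN-1M",
    "BIOSCAN-5M": "bioscan-ml/BIOSCAN-5M",
    "processed": "bioscan-ml/bioscan-clibd",
}


def dataset_url(name: str) -> str:
    """Return the Hugging Face dataset page for one of the names above."""
    return f"https://huggingface.co/datasets/{DATASET_REPOS[name]}"


def list_dataset_files(name: str) -> list:
    """List the files stored in the dataset repo (requires network access)."""
    from huggingface_hub import list_repo_files  # lazy import

    return list_repo_files(DATASET_REPOS[name], repo_type="dataset")
```

For example, `list_dataset_files("processed")` returns the file names in the processed-data repository, which can then be fetched individually with `hf_hub_download(..., repo_type="dataset")`.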

**BibTeX:**

```bibtex
@article{gong2024clibd,
    title={{CLIBD}: Bridging Vision and Genomics for Biodiversity Monitoring at Scale},
    author={Gong, ZeMing and Wang, Austin T. and Huo, Xiaoliang and Haurum, Joakim Bruslund and Lowe, Scott C. and Taylor, Graham W. and Chang, Angel X.},
    journal={arXiv preprint arXiv:2405.17537},
    year={2024},
    eprint={2405.17537},
    archivePrefix={arXiv},
    primaryClass={cs.AI},
    doi={10.48550/arxiv.2405.17537},
}
```

## Acknowledgement

We are grateful for the use of the INSECT dataset, which played a pivotal role in our experiments. We also acknowledge the use and modification of code from the [Fine-Grained-ZSL-with-DNA](https://github.com/sbadirli/Fine-Grained-ZSL-with-DNA) repository, which facilitated part of our experimental work. We appreciate the efforts of all developers and researchers involved.

This research was supported by the Government of Canada’s New Frontiers in Research Fund (NFRF) [NFRFT-2020-00073], Canada CIFAR AI Chair grants, and the Pioneer Centre for AI (DNRF grant number P1). This research was also enabled in part by support provided by the Digital Research Alliance of Canada (alliancecan.ca).