metadata
title: BERT Metagenome Embeddings
emoji: 🧬
colorFrom: gray
colorTo: gray
sdk: docker
pinned: false
license: mit
short_description: Extract DNA sequence embeddings from pretrained BERT

bert-embedding

Extract embeddings from DNA sequences using a BERT model pretrained on metagenomic sequences.

Model

architecture: BERT, 24 layers, 768 hidden units, 12 attention heads
parameters: ~430M
input: DNA sequence (min. 1000 bp)
output: 768-dimensional embedding
source: genomenet/bert-metagenome
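
The input and output constraints above can be illustrated with a minimal sketch. It is not the Space's actual code: the helper names `validate_sequence` and `mean_pool`, the accepted alphabet (A, C, G, T, N), and the choice of mean pooling are all assumptions made for illustration. Only the 1000 bp minimum and the 768-dimensional output come from the model description.

```python
from typing import List

MIN_LENGTH_BP = 1000  # minimum input length stated in the model description
EMBEDDING_DIM = 768   # output dimensionality stated in the model description


def validate_sequence(seq: str) -> str:
    """Normalize a DNA string and check that it meets the minimum length."""
    seq = seq.strip().upper()
    if len(seq) < MIN_LENGTH_BP:
        raise ValueError(
            f"sequence has {len(seq)} bp, need at least {MIN_LENGTH_BP}"
        )
    # Assumption: only A, C, G, T and the ambiguity code N are accepted.
    invalid = set(seq) - set("ACGTN")
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    return seq


def mean_pool(token_vectors: List[List[float]]) -> List[float]:
    """Average per-token vectors into one fixed-size embedding.

    Assumption: mean pooling is used for illustration only; the model
    may reduce token states to a single vector in a different way.
    """
    if not token_vectors:
        raise ValueError("no token vectors to pool")
    n = len(token_vectors)
    # zip(*rows) iterates column-wise, so each output entry is a column mean.
    return [sum(column) / n for column in zip(*token_vectors)]


if __name__ == "__main__":
    seq = validate_sequence("acgt" * 250)  # exactly 1000 bp
    # Stand-in for per-token hidden states; a real model would produce these.
    hidden_states = [[0.0] * EMBEDDING_DIM, [2.0] * EMBEDDING_DIM]
    embedding = mean_pool(hidden_states)
    print(len(seq), len(embedding))
```

Checking the sequence before calling the model keeps too-short or malformed inputs from reaching inference, where the failure would be harder to interpret.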

Deployment

Commit and push from the project checkout:

cd /vol/hpcprojects/pmuench/crispr_tool/bert-embedding
git add -A && git commit -m "update" && git push

Acknowledgements

  • BMBF de.NBI / GenomeNet
  • DFG SPP 2141
  • HZI BIFO