---
license: mit
tags:
- biology
- genomics
- dna-sequence
- bacterial-classification
- bert
- transformers
---
# BBERT Pre-trained Models
Pre-trained models for [BBERT](https://github.com/AmirErez/BBERT) (BERT for Bacterial DNA Classification).
## Models Included
### 1. BBERT Transformer (`bbert_checkpoint-32500/`)
- Main BERT-based model trained on bacterial DNA sequences
- Hidden size: 768
- Trained on diverse bacterial genomes
### 2. Bacterial Classifier (`bacterial_classifier/epoch_80.pt`)
- Binary classifier for bacterial vs. non-bacterial sequences
- Input: BBERT embeddings (768-dim)
- Trained for 80 epochs on 3.9M sequences
### 3. Reading Frame Classifier (`frame_classifier/classifier_model_2000K_37e.pth`)
- 6-way classifier for reading frame prediction
- Frames: +1, +2, +3, -1, -2, -3
- Trained for 37 epochs on 2M sequences
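
For readers unfamiliar with the six frame labels, the sketch below enumerates them for a short DNA read. It follows one common convention: frames +1 to +3 start at offsets 0 to 2 on the forward strand, and frames -1 to -3 start at offsets 0 to 2 on the reverse complement. The function names and the example read are hypothetical, and nothing here is taken from the BBERT code base; in particular, the order in which the classifier emits its six classes is not specified in this README.

```python
# Illustrative only: enumerates the six reading frames of a DNA read.
# Function names and the example read are assumptions, not BBERT code.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

FRAME_LABELS = ["+1", "+2", "+3", "-1", "-2", "-3"]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def codons_in_frame(seq: str, frame: str) -> list[str]:
    """Split seq into complete codons for one of the six frames."""
    if frame not in FRAME_LABELS:
        raise ValueError(f"unknown frame: {frame}")
    # Negative frames are read from the reverse-complement strand.
    strand = seq if frame[0] == "+" else reverse_complement(seq)
    offset = int(frame[1]) - 1
    # Stop early enough that only complete 3-base codons are returned.
    return [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]


read = "ATGGCCTAAGT"
for label in FRAME_LABELS:
    print(label, codons_in_frame(read, label))
```

Only one of the six frames matches the true coding frame of a read that overlaps a gene, which is what makes this a 6-way classification problem.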
### 4. Coding Sequence Classifier (`coding_classifier/epoch_46.pt`)
- Binary classifier for coding vs. non-coding sequences
- Trained for 46 epochs on 3.9M sequences
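
After downloading, it can be useful to confirm that all four artifacts listed above are present. The following is a minimal sketch using only the standard library. The relative paths come from this README; the helper `missing_artifacts`, the dictionary keys, and the `models` directory name are hypothetical and not part of BBERT.

```python
from pathlib import Path

# Relative paths as listed in this README. The dictionary keys and the
# helper below are hypothetical conveniences, not part of BBERT.
EXPECTED_ARTIFACTS = {
    "bbert_transformer": "bbert_checkpoint-32500",
    "bacterial_classifier": "bacterial_classifier/epoch_80.pt",
    "frame_classifier": "frame_classifier/classifier_model_2000K_37e.pth",
    "coding_classifier": "coding_classifier/epoch_46.pt",
}


def missing_artifacts(models_dir):
    """Return the names of expected artifacts absent under models_dir."""
    root = Path(models_dir)
    return [name for name, rel in EXPECTED_ARTIFACTS.items()
            if not (root / rel).exists()]


if __name__ == "__main__":
    # "models" is a placeholder; use the directory that holds the download.
    print(missing_artifacts("models"))
```

An empty list means every expected file or directory was found.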
## Usage
BBERT's download script fetches these models during first-time setup; after that, run BBERT as usual:
```bash
# First-time setup
pip install bbert  # or clone from GitHub
python source/download_models.py

# Then use normally
python bbert.py your_sequences.fasta --output_dir results
```
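
Because the command above takes a FASTA file as input, a quick sanity check of that file can catch formatting problems before a long run. The sketch below is a generic FASTA reader, not BBERT's own parser; the function names and the allowed alphabet (`ACGTN`) are assumptions, and BBERT may accept or reject other characters.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def non_dna_records(path, alphabet="ACGTN"):
    """Return headers of records with characters outside `alphabet`."""
    allowed = set(alphabet)
    return [h for h, s in read_fasta(path) if set(s.upper()) - allowed]
```

For example, `non_dna_records("your_sequences.fasta")` returns the headers of any records that contain unexpected characters.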
## Citation
If you use BBERT, please cite:
[Add your citation here]
## License
MIT License. See the LICENSE file for details.