---
license: mit
tags:
- biology
- genomics
- dna-sequence
- bacterial-classification
- bert
- transformers
---

# BBERT Pre-trained Models

Pre-trained models for [BBERT](https://github.com/AmirErez/BBERT) - BERT for Bacterial DNA Classification.

## Models Included

### 1. BBERT Transformer (`bbert_checkpoint-32500/`)
- Main BERT-based model trained on bacterial DNA sequences
- Hidden size: 768
- Trained on diverse bacterial genomes

### 2. Bacterial Classifier (`bacterial_classifier/epoch_80.pt`)
- Binary classifier for bacterial vs. non-bacterial sequences
- Input: BBERT embeddings (768-dim)
- Trained for 80 epochs on 3.9M sequences

### 3. Reading Frame Classifier (`frame_classifier/classifier_model_2000K_37e.pth`)
- 6-way classifier for reading-frame prediction
- Frames: +1, +2, +3, -1, -2, -3
- Trained for 37 epochs on 2M sequences

### 4. Coding Sequence Classifier (`coding_classifier/epoch_46.pt`)
- Binary classifier for coding vs. non-coding sequences
- Trained for 46 epochs on 3.9M sequences

## Usage

These models are downloaded automatically when using BBERT:

```bash
# First-time setup
pip install bbert  # or clone from GitHub
python source/download_models.py

# Then run normally
python bbert.py your_sequences.fasta --output_dir results
```

## Citation

If you use BBERT, please cite: [Add your citation here]

## License

MIT License - see the LICENSE file for details.
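## Background: the six reading frames

The reading frame classifier above predicts one of six frames (+1, +2, +3, -1, -2, -3); the three negative frames correspond to reading the reverse complement of the sequence. As an illustration only (the helper names and offsets here are assumptions for this sketch, not BBERT's actual preprocessing code), the six frames of a DNA read can be enumerated like this:

```python
# Illustrative sketch: enumerate the six reading frames of a DNA read.
# Not taken from the BBERT repository; names and conventions are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def six_frames(seq: str) -> dict:
    """Map frame labels (+1..+3, -1..-3) to the corresponding substrings."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = seq[offset:]   # forward frames
        frames[f"-{offset + 1}"] = rc[offset:]    # reverse-complement frames
    return frames

print(six_frames("ATGAAATAG"))
```

A 6-way classifier over BBERT embeddings would then learn to recover which of these shifts a read was sequenced in.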