---
language:
- dna
tags:
- biology
- genomics
- transposable-elements
- dnabert
- bilstm
- sequence-classification
license: mit
---
# TE-GER — Order Classification
Part of the **TE-GER** (Transposable Elements Genomic Entity Recognition) toolkit.
This model classifies transposable elements (TEs) in genomic sequences by order: DIRS, HELITRON, LINE, LTR, PLE, SINE, or TIR. It combines a DNABERT-2 encoder with a BiLSTM head.
## Model Architecture
- **Base:** [DNABERT-2](https://huggingface.co/zhihan1996/DNABERT-2-117M) (DNA language model)
- **Head:** Bidirectional LSTM + Linear Classifier
- **Input:** 512 bp sliding windows over raw FASTA sequences
- **Task:** Sequence classification (token-level TE annotation)
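The 512 bp windowing listed above can be sketched as follows. This is a minimal illustration, not the TE-GER implementation: the function name `sliding_windows`, the default stride (non-overlapping windows), and the handling of a short final window are all assumptions, since the model card only states the window size.

```python
from typing import Iterator, Tuple

WINDOW_SIZE = 512  # window length in bp, as stated in the model card


def sliding_windows(
    sequence: str,
    window: int = WINDOW_SIZE,
    stride: int = WINDOW_SIZE,  # assumption: the real stride is not documented
) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) pairs that cover `sequence`.

    `start` is the 0-based offset of the window in the original sequence.
    The last window may be shorter than `window` (assumption: no padding).
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if not sequence:
        return
    start = 0
    while True:
        yield start, sequence[start:start + window]
        # Stop once the current window reaches the end of the sequence.
        if start + window >= len(sequence):
            break
        start += stride


# Hypothetical example: a 1200 bp sequence with 50% overlap between windows.
for start, chunk in sliding_windows("ACGT" * 300, stride=256):
    print(start, len(chunk))
```

Keeping the start offset alongside each window makes it possible to map per-window predictions back to genome coordinates when writing annotations.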
## Usage
Run the model through the [TE-GER CLI](https://github.com/johanpina/te-ger), passing the input FASTA file, the output GFF3 path, and `--level order`:
```bash
python Te_annotator.py genome.fasta output.gff3 --level order
```
## Labels
- `0`: Background
- `1`: DIRS
- `2`: HELITRON
- `3`: LINE
- `4`: LTR
- `5`: PLE
- `6`: SINE
- `7`: TIR
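When post-processing raw predictions, the label table above can be expressed as a lookup. This is a small sketch; the names `ID2LABEL`, `LABEL2ID`, and `decode_predictions` are hypothetical and not part of the TE-GER code.

```python
# Class index -> TE order, copied from the label list in this model card.
ID2LABEL = {
    0: "Background",
    1: "DIRS",
    2: "HELITRON",
    3: "LINE",
    4: "LTR",
    5: "PLE",
    6: "SINE",
    7: "TIR",
}

# Reverse mapping, useful when filtering annotations by order name.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def decode_predictions(class_ids):
    """Translate a sequence of predicted class indices into order names."""
    return [ID2LABEL[int(i)] for i in class_ids]


print(decode_predictions([0, 4, 4, 7]))
```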
## Citation
Developed by Johan S. Piña — 2025