metadata
language:
- dna
tags:
- biology
- genomics
- transposable-elements
- dnabert
- bilstm
- sequence-classification
license: mit
TE-GER — Order Classification
Part of the TE-GER (Transposable Elements Genomic Entity Recognition) toolkit.
This model classifies transposable elements (TEs) in genomic sequences by order: DIRS, HELITRON, LINE, LTR, PLE, SINE, or TIR. It uses a hybrid DNABERT-2 + BiLSTM architecture.
Model Architecture
- Base: DNABERT-2 (DNA language model)
- Head: Bidirectional LSTM + Linear Classifier
- Input: 512 bp sliding windows over raw FASTA sequences
- Task: Sequence classification (token-level TE annotation)
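The input step above can be sketched roughly as follows. This is a minimal illustration, not the TE-GER implementation: the helper names `read_fasta` and `sliding_windows` are hypothetical, and the stride (here equal to the window size, so windows do not overlap) is an assumption, since the card only states the 512 bp window length.

```python
WINDOW_SIZE = 512  # window length in bp, as stated in the model card


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def sliding_windows(sequence, size=WINDOW_SIZE, stride=WINDOW_SIZE):
    """Yield (start, window) pairs, where start is a 0-based offset.

    The stride default is an assumption; the last window may be
    shorter than `size` when the sequence length is not a multiple of it.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    for start in range(0, len(sequence), stride):
        yield start, sequence[start:start + size]


if __name__ == "__main__":
    demo = [">chr1", "ACGT" * 200, "ACGT" * 100]
    for name, seq in read_fasta(demo):
        for start, window in sliding_windows(seq):
            print(name, start, len(window))
```

Whether the real pipeline pads or drops a short final window is not stated in this card, so the sketch simply returns it as is.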
Usage
Use this model via the TE-GER CLI:
python Te_annotator.py genome.fasta output.gff3 --level order
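For batch runs, the same command can be driven from Python. The sketch below only wraps the invocation shown above; the helper names `build_command` and `annotate` are hypothetical, and it assumes `Te_annotator.py` sits in the current working directory. Only `--level order` is documented in this card, so other values for `level` are untested assumptions.

```python
import subprocess
import sys


def build_command(fasta_path, gff3_path, level="order", script="Te_annotator.py"):
    """Build the argument list for the TE-GER CLI call shown above."""
    return [sys.executable, script, fasta_path, gff3_path, "--level", level]


def annotate(fasta_path, gff3_path, level="order"):
    """Run the annotator and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_command(fasta_path, gff3_path, level), check=True)


if __name__ == "__main__":
    # Equivalent to: python Te_annotator.py genome.fasta output.gff3 --level order
    annotate("genome.fasta", "output.gff3")
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.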
Labels
- 0: Background
- 1: DIRS
- 2: HELITRON
- 3: LINE
- 4: LTR
- 5: PLE
- 6: SINE
- 7: TIR
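As an illustration of how these label IDs might be consumed downstream, the sketch below maps IDs to names and merges consecutive identical predictions into spans. The mapping comes from the list above; everything else is an assumption: `merge_runs` is a hypothetical helper, and it presumes one predicted label per non-overlapping 512 bp window, which this card does not confirm.

```python
from itertools import groupby

# Label mapping as listed in the model card.
ID2LABEL = {
    0: "Background",
    1: "DIRS",
    2: "HELITRON",
    3: "LINE",
    4: "LTR",
    5: "PLE",
    6: "SINE",
    7: "TIR",
}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def merge_runs(label_ids, window=512):
    """Collapse runs of identical labels into (start, end, order) spans.

    Coordinates are 0-based and half-open; Background (0) is skipped.
    Assumes one label per non-overlapping window (an assumption).
    """
    spans, pos = [], 0
    for label_id, group in groupby(label_ids):
        n = len(list(group))
        if label_id != LABEL2ID["Background"]:
            spans.append((pos * window, (pos + n) * window, ID2LABEL[label_id]))
        pos += n
    return spans


if __name__ == "__main__":
    print(merge_runs([0, 3, 3, 0, 4]))
```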
Citation
Developed by Johan S. Piña — 2025