---
language:
- dna
tags:
- biology
- genomics
- transposable-elements
- dnabert
- bilstm
- sequence-classification
license: mit
---
# TE-GER — Binary Detection
Part of the **TE-GER** (Transposable Elements Genomic Entity Recognition) toolkit.
The TE-GER binary model detects the presence or absence of transposable elements (TE vs. Background) in genomic sequences. It combines DNABERT-2 with a BiLSTM in a hybrid architecture and predicts one of two labels: Background or TE.
## Model Architecture
- **Base:** [DNABERT-2](https://huggingface.co/zhihan1996/DNABERT-2-117M) (DNA language model)
- **Head:** Bidirectional LSTM + Linear Classifier
- **Input:** 512 bp sliding windows over raw FASTA sequences
- **Task:** Sequence classification (token-level TE annotation)
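
To illustrate the 512 bp windowing described above, here is a minimal sketch in plain Python. It is not taken from the TE-GER code base: the helper names `parse_fasta` and `sliding_windows` are hypothetical, and the stride of 256 bp is an assumption, since this card does not say whether windows overlap.

```python
from typing import Dict, Iterator, Tuple


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {record_id: sequence}."""
    records: Dict[str, list] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(chunks) for key, chunks in records.items()}


def sliding_windows(
    seq: str, window: int = 512, stride: int = 256
) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) pairs; the last window may be shorter.

    The stride is an assumed value, not one documented by TE-GER.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    start = 0
    while start < len(seq):
        yield start, seq[start:start + window]
        if start + window >= len(seq):
            break  # the sequence end has been covered
        start += stride


if __name__ == "__main__":
    fasta = ">chr1 demo\n" + "ACGT" * 300 + "\n"
    for record_id, sequence in parse_fasta(fasta).items():
        for start, chunk in sliding_windows(sequence):
            print(record_id, start, len(chunk))
```

Each window would then be tokenized and passed to the model; the offsets returned alongside each chunk make it possible to map predictions back to genomic coordinates.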
## Usage
Use this model via the [TE-GER CLI](https://github.com/johanpina/te-ger):
```bash
python Te_annotator.py genome.fasta output.gff3 --level binary
```
## Labels
- `0`: Background
- `1`: TE
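
As a rough illustration of how these label IDs could be consumed downstream, the sketch below maps IDs to names and merges adjacent TE windows into intervals. This is an assumed post-processing step, not the documented behavior of `Te_annotator.py`; `ID2LABEL` and `merge_te_windows` are hypothetical names, and the predictions are hand-written examples rather than model output.

```python
from typing import List, Tuple

# Label mapping as listed in this model card.
ID2LABEL = {0: "Background", 1: "TE"}


def merge_te_windows(
    predictions: List[Tuple[int, int]], window: int = 512
) -> List[Tuple[int, int]]:
    """Merge touching or overlapping TE windows into intervals.

    `predictions` holds (window_start, label_id) pairs sorted by start.
    Returns 0-based, half-open (start, end) intervals.
    """
    intervals: List[Tuple[int, int]] = []
    for start, label_id in predictions:
        if ID2LABEL[label_id] != "TE":
            continue
        end = start + window
        if intervals and start <= intervals[-1][1]:
            # Extend the previous interval instead of opening a new one.
            prev_start, prev_end = intervals[-1]
            intervals[-1] = (prev_start, max(prev_end, end))
        else:
            intervals.append((start, end))
    return intervals


if __name__ == "__main__":
    # Hand-written example predictions, for illustration only.
    example = [(0, 0), (512, 1), (1024, 1), (1536, 0), (2048, 1)]
    print(merge_te_windows(example))
```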
## Citation
Developed by Johan S. Piña — 2025