---
language:
- dna
tags:
- biology
- genomics
- transposable-elements
- dnabert
- bilstm
- sequence-classification
license: mit
---

# TE-GER: Binary Detection

Part of the **TE-GER** (Transposable Elements Genomic Entity Recognition) toolkit.

This binary model detects the presence or absence of transposable elements (TE vs. Background) in genomic sequences. It combines DNABERT-2 with a BiLSTM head and predicts two labels: Background and TE.

## Model Architecture

- **Base:** [DNABERT-2](https://huggingface.co/zhihan1996/DNABERT-2-117M) (DNA language model)
- **Head:** Bidirectional LSTM + linear classifier
- **Input:** 512 bp sliding windows over raw FASTA sequences
- **Task:** Sequence classification (token-level TE annotation)

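
The 512 bp windowing described above can be sketched in a few lines of Python. This is only an illustration: the function name `sliding_windows` and the `stride` parameter are assumptions, and the card does not state what stride or overlap TE-GER actually uses, so a non-overlapping stride of 512 is used as a placeholder default.

```python
from typing import Iterator, Tuple


def sliding_windows(
    seq: str, window: int = 512, stride: int = 512
) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) pairs covering seq from left to right.

    window=512 matches the input size stated in this card; stride is a
    hypothetical parameter (the real tool's stride is not documented here).
    The final window may be shorter than `window`.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if not seq:
        return
    start = 0
    while True:
        yield start, seq[start:start + window]
        # Stop once the current window reaches the end of the sequence.
        if start + window >= len(seq):
            break
        start += stride
```

Starts are 0-based offsets into the sequence. How the last, shorter window is padded before it reaches the model is left out, since that depends on the tokenizer settings the tool uses.
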
## Usage

Use this model via the [TE-GER CLI](https://github.com/johanpina/te-ger):

```bash
python Te_annotator.py genome.fasta output.gff3 --level binary
```

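
Once the command above has written `output.gff3`, the annotations can be read back for downstream analysis. The sketch below assumes the file follows the standard nine-column, tab-separated GFF3 layout; the `Feature` type and `read_gff3` helper are hypothetical names and not part of TE-GER, and the feature types the tool writes are not documented in this card.

```python
from typing import Iterable, Iterator, NamedTuple


class Feature(NamedTuple):
    seqid: str
    type: str
    start: int  # 1-based, inclusive (GFF3 convention)
    end: int    # 1-based, inclusive (GFF3 convention)


def read_gff3(lines: Iterable[str]) -> Iterator[Feature]:
    """Parse GFF3 text lines into Feature records.

    Blank lines, comments, and directives (lines starting with '#') are
    skipped, as are lines that do not have exactly nine columns.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        yield Feature(cols[0], cols[2], int(cols[3]), int(cols[4]))


def total_length(features: Iterable[Feature]) -> int:
    """Sum feature lengths in bp (overlapping features are counted twice)."""
    return sum(f.end - f.start + 1 for f in features)
```

Passing an open file handle for `output.gff3` to `read_gff3` and the result to `total_length` gives a rough count of annotated base pairs.
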
## Labels

- `0`: Background
- `1`: TE

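
To show how these label ids might be turned into annotated regions, the following sketch maps ids to names and merges adjacent windows labelled `1` into contiguous TE intervals. The merging rule and the `merge_te_windows` helper are assumptions for illustration; the card does not describe how TE-GER post-processes its predictions.

```python
from typing import Iterable, List, Tuple

# Label ids as listed in this card.
ID2LABEL = {0: "Background", 1: "TE"}


def merge_te_windows(
    calls: Iterable[Tuple[int, int, int]], te_label: int = 1
) -> List[Tuple[int, int]]:
    """Merge touching or overlapping TE windows into intervals.

    calls: (start, end, label) triples sorted by start, using 0-based
    half-open coordinates. Windows with any other label are ignored.
    """
    merged: List[List[int]] = []
    for start, end, label in calls:
        if label != te_label:
            continue
        if merged and start <= merged[-1][1]:
            # Window touches or overlaps the previous TE interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]
```

For example, two consecutive 512 bp windows labelled `1` collapse into a single 1024 bp interval, while a Background window between two TE windows keeps them separate.
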
## Citation

Developed by Johan S. Piña, 2025.