metadata
language:
  - dna
tags:
  - biology
  - genomics
  - transposable-elements
  - dnabert
  - bilstm
  - sequence-classification
license: mit

TE-GER — Binary Detection

Part of the TE-GER (Transposable Elements Genomic Entity Recognition) toolkit.

The binary model detects the presence or absence of transposable elements (TEs) in genomic sequences. It is a hybrid architecture that pairs DNABERT-2 with a BiLSTM, and it predicts one of two labels: Background or TE.

Model Architecture

  • Base: DNABERT-2 (DNA language model)
  • Head: Bidirectional LSTM + Linear Classifier
  • Input: 512 bp sliding windows over raw FASTA sequences
  • Task: Sequence classification (token-level TE annotation)
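The input step above can be illustrated with a minimal sketch of how a sequence might be cut into 512 bp windows. The function name `sliding_windows`, the 256 bp stride, and the end-aligned final window are assumptions made for illustration; the model card only states the window size, and the actual TE-GER windowing code may differ.

```python
from typing import Iterator, Tuple


def sliding_windows(
    seq: str, window: int = 512, stride: int = 256
) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) pairs that cover `seq`.

    `window` matches the 512 bp stated in the model card.
    `stride` is an assumed value, not taken from TE-GER.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if not seq:
        return
    if len(seq) <= window:
        # Short sequences are returned as a single, shorter window.
        yield 0, seq
        return
    start = 0
    while start + window < len(seq):
        yield start, seq[start:start + window]
        start += stride
    # Final window is aligned to the sequence end so the tail is covered.
    last = len(seq) - window
    yield last, seq[last:]
```

Each yielded start offset can later be used to map a window-level prediction back to genomic coordinates.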

Usage

Use this model via the TE-GER CLI:

python Te_annotator.py genome.fasta output.gff3 --level binary
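To annotate several genomes, the command above could be assembled from Python. This is only a sketch: `build_command` is a hypothetical helper, the output path is simply assumed to be the input path with a `.gff3` suffix, and only the arguments shown in the command above are used.

```python
import sys
from pathlib import Path
from typing import List


def build_command(
    fasta: str, script: str = "Te_annotator.py", level: str = "binary"
) -> List[str]:
    """Build the argument list for one TE-GER CLI run.

    The output file name (input name with a .gff3 suffix) is an
    assumption for illustration, not a TE-GER convention.
    """
    fasta_path = Path(fasta)
    output_path = fasta_path.with_suffix(".gff3")
    return [
        sys.executable,  # the current Python interpreter
        script,
        str(fasta_path),
        str(output_path),
        "--level",
        level,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains `Te_annotator.py`.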

Labels

  • 0: Background
  • 1: TE
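As a rough illustration of how these label ids could be turned into annotations, the sketch below merges overlapping or touching windows predicted as TE and formats one GFF3 line per merged interval. The helpers `merge_te_windows` and `to_gff3_line`, the `TE-GER` source column, the `transposable_element` feature type, and the `Name` attribute are all assumptions; the real post-processing in `Te_annotator.py` may work differently.

```python
from typing import Iterable, List, Tuple

# Label mapping as listed in the model card.
ID2LABEL = {0: "Background", 1: "TE"}


def merge_te_windows(
    preds: Iterable[Tuple[int, int, int]]
) -> List[Tuple[int, int]]:
    """Merge TE windows into intervals.

    `preds` holds (start, end, label_id) tuples sorted by start,
    with 0-based, half-open coordinates (an assumed convention).
    """
    merged: List[Tuple[int, int]] = []
    for start, end, label_id in preds:
        if ID2LABEL[label_id] != "TE":
            continue
        if merged and start <= merged[-1][1]:
            # Overlapping or adjacent: extend the previous interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def to_gff3_line(seqid: str, start: int, end: int) -> str:
    """Format one interval as a GFF3 line (1-based, inclusive)."""
    fields = [
        seqid, "TE-GER", "transposable_element",
        str(start + 1), str(end), ".", ".", ".",
        f"Name={ID2LABEL[1]}",
    ]
    return "\t".join(fields)
```

For example, two overlapping TE windows at 0-512 and 256-768 would be reported as a single interval spanning positions 1 to 768.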

Citation

Developed by Johan S. Piña — 2025