Commit a206631 by zzq1zh · verified · 1 Parent(s): 3af7a86

Update README.md

Files changed (1)
  1. README.md +31 -3
README.md CHANGED
@@ -1,3 +1,31 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+
+
+
+ # eccDNAMamba
+ **A Pre-Trained Model for Ultra-Long eccDNA Sequence Analysis**
+
+ ---
+
+ ### Model Overview
+ **eccDNAMamba** is a **bidirectional state-space model (SSM)** designed for efficient and topology-aware modeling of **extrachromosomal circular DNA (eccDNA)**.
+ By combining **forward and reverse Mamba-2 encoders**, **motif-level Byte Pair Encoding (BPE)**, and a lightweight **head–tail circular augmentation**, it captures wrap-around dependencies in ultra-long (10–200 kbp) genomic sequences while maintaining linear-time scalability.
+ The model provides strong performance across cancer-associated eccDNA prediction, copy-number level estimation, and real vs. pseudo-eccDNA discrimination tasks.
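Note on the overview above: the README names a head-tail circular augmentation but does not say how it is implemented. The sketch below is only one plausible reading, assuming the augmentation copies the first `overlap` bases of the sequence onto its end so that a motif spanning the circle's junction becomes contiguous in the linear input. The function name `circular_augment` and the `overlap` parameter are hypothetical and are not part of the eccDNAMamba package.

```python
def circular_augment(sequence: str, overlap: int) -> str:
    """Append the first `overlap` bases to the end of a circular sequence.

    Hypothetical helper for illustration only; the actual eccDNAMamba
    preprocessing may differ (for example, it may work on tokens, not bases).
    """
    if overlap < 0:
        raise ValueError("overlap must be non-negative")
    # Cap the overlap at the sequence length so short inputs are at most doubled.
    overlap = min(overlap, len(sequence))
    return sequence + sequence[:overlap]


# A circular sequence whose motif "ATCG" spans the junction:
# it ends with "AT" and starts with "CG".
circular = "CGTTTTTTAT"
augmented = circular_augment(circular, overlap=4)

print("ATCG" in circular)   # motif is split across the junction in the linear form
print("ATCG" in augmented)  # motif is contiguous after augmentation
```

Under this assumption the input grows by only `overlap` characters, so the cost of the augmentation stays small relative to a 10–200 kbp sequence.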
+
+ ---
+
+ ### Quick Start
+ ```python
+ from transformers import AutoTokenizer, AutoModelForMaskedLM
+
+ tokenizer = AutoTokenizer.from_pretrained("eccdna/eccDNAMamba-1M")
+ model = AutoModelForMaskedLM.from_pretrained("eccdna/eccDNAMamba-1M")
+
+ sequence = "ATGCGTACGTTAGCGTACGT"
+ inputs = tokenizer(sequence, return_tensors="pt")
+ outputs = model(**inputs)
+
+ # Access logits or reconstruct masked spans
+ logits = outputs.logits