---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- genomics
- dna
- mamba
- hybrid
- biology
---

# HybriDNA-300M

HybriDNA is a hybrid Mamba-Attention model for DNA sequence modeling. This is the 300M-parameter variant.

## Model Description

HybriDNA combines the efficiency of Mamba state-space models with the expressiveness of attention in a hybrid architecture. The model alternates between Mamba and attention layers to pair computational efficiency on long sequences with strong sequence modeling capability.

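The alternating pattern can be pictured with a short schematic. This is a minimal sketch of the idea, not HybriDNA's actual implementation: the stand-in Mamba block, the one-attention-layer-per-four ratio, and all class names here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MambaStandIn(nn.Module):
    """Placeholder for a Mamba-2 block (the real one comes from mamba_ssm)."""
    def __init__(self, d_model):
        super().__init__()
        self.mix = nn.Linear(d_model, d_model)

    def forward(self, x):
        return x + self.mix(x)  # token-wise mixing as a stand-in

class HybridStack(nn.Module):
    """Alternates Mamba-style and attention layers.
    The 1-attention-per-4-layers ratio is an illustrative assumption."""
    def __init__(self, num_layers=24, d_model=1024, n_heads=32, attn_every=4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            if (i + 1) % attn_every == 0 else MambaStandIn(d_model)
            for i in range(num_layers)
        )

    def forward(self, x):  # x: (batch, seq_len, d_model)
        for layer in self.layers:
            if isinstance(layer, nn.MultiheadAttention):
                x = x + layer(x, x, x)[0]  # global, quadratic-cost mixing
            else:
                x = layer(x)               # linear-time Mamba-style mixing
        return x

out = HybridStack()(torch.randn(1, 16, 1024))
print(out.shape)  # torch.Size([1, 16, 1024])
```
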
### Architecture

- **Parameters**: ~300M
- **Hidden Size**: 1024
- **Layers**: 24 (hybrid Mamba + Attention)
- **Attention Heads**: 32
- **Key-Value Heads**: 8 (Grouped Query Attention)
- **Mamba Version**: Mamba-2
- **Vocabulary**: 12 tokens (A, C, G, T, N + special tokens)
- **Max Sequence Length**: 131,074 bp

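To confirm these numbers against the actual checkpoint, you can load and print its config. Printing the whole object avoids guessing field names, since a `trust_remote_code` model may use a custom config class.

```python
from transformers import AutoConfig

# Dump the checkpoint's config (hidden size, layer count, head counts,
# vocab size, max length). Field names may be custom to this repo, so we
# print the whole object rather than assume specific attributes.
config = AutoConfig.from_pretrained("Mishamq/HybriDNA-300M", trust_remote_code=True)
print(config)
```
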
## Installation

```bash
pip install transformers torch mamba-ssm causal-conv1d flash-attn
```

Note: `mamba-ssm`, `causal-conv1d`, and `flash-attn` ship CUDA kernels, so installing and running them requires an NVIDIA GPU with a matching CUDA toolchain.

## Usage

### Text Generation

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Mishamq/HybriDNA-300M"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Prompt with a nucleotide sequence; the model continues it autoregressively
prompt = "ACGTACGT"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```

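`generate` decodes greedily by default (unless the checkpoint ships its own generation config). For more diverse continuations, the standard `transformers` sampling arguments apply; nothing below is HybriDNA-specific.

```python
# Sampled generation with standard transformers arguments
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,    # sample from the distribution instead of argmax
    temperature=0.8,   # <1.0 sharpens, >1.0 flattens the distribution
    top_k=4,           # keep only the 4 most likely tokens at each step
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
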
### Embeddings

```python
from transformers import AutoTokenizer, AutoModel
import torch

model_name = "Mishamq/HybriDNA-300M"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name, trust_remote_code=True)

sequence = "ACGTACGTACGTACGT"
inputs = tokenizer(sequence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    embeddings = outputs.last_hidden_state  # (batch, seq_len, hidden_size)
```

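For a single fixed-length vector per sequence, mean pooling over the attention mask is a common recipe. This continues the snippet above and assumes the tokenizer returns an `attention_mask`, as Hugging Face tokenizers usually do; it is a convention, not an official HybriDNA recipe.

```python
# Mean-pool token embeddings into one vector per sequence (common
# convention; not prescribed by the model card). Continues the snippet above.
mask = inputs["attention_mask"].unsqueeze(-1).to(embeddings.dtype)  # (batch, seq, 1)
pooled = (embeddings * mask).sum(dim=1) / mask.sum(dim=1)           # (batch, hidden)
print(pooled.shape)  # torch.Size([1, 1024]) for the 300M model
```
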
## Model Variants

| Model | Parameters | Hidden Size | Layers |
|-------|------------|-------------|--------|
| [HybriDNA-300M](https://huggingface.co/Mishamq/HybriDNA-300M) | 300M | 1024 | 24 |
| HybriDNA-3B | 3B | 4096 | 16 |
| HybriDNA-7B | 7B | 4096 | 32 |

## Citation

If you use HybriDNA in your research, please cite:

```bibtex
@article{ma2025hybridna,
  title={HybriDNA: A Hybrid Transformer-Mamba2 Long-Range DNA Language Model},
  author={Ma, Mingqian and Liu, Guoqing and Cao, Chuan and Deng, Pan and Dao, Tri and Gu, Albert and Jin, Peiran and Yang, Zhao and Xia, Yingce and Luo, Renqian and others},
  journal={arXiv preprint arXiv:2502.10807},
  year={2025}
}
```

## License

Apache 2.0