---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- genomics
- dna
- mamba
- hybrid
- biology
---

# HybriDNA-300M

HybriDNA is a hybrid Mamba-Attention model for DNA sequence modeling. This is the 300M-parameter variant.

## Model Description

HybriDNA combines the efficiency of Mamba state-space models with the expressiveness of attention in a hybrid architecture. The model alternates between Mamba and attention layers to pair computational efficiency on long sequences with strong sequence modeling capability.

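The alternating pattern can be pictured with a short schematic. This is a minimal sketch of the idea, not HybriDNA's actual implementation: the stand-in Mamba block, the one-attention-layer-per-four ratio, and all class names here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MambaStandIn(nn.Module):
    """Placeholder for a Mamba-2 block (the real one comes from mamba_ssm)."""
    def __init__(self, d_model):
        super().__init__()
        self.mix = nn.Linear(d_model, d_model)

    def forward(self, x):
        return x + self.mix(x)  # token-wise mixing as a stand-in

class HybridStack(nn.Module):
    """Alternates Mamba-style and attention layers.
    The 1-attention-per-4-layers ratio is an illustrative assumption."""
    def __init__(self, num_layers=24, d_model=1024, n_heads=32, attn_every=4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            if (i + 1) % attn_every == 0 else MambaStandIn(d_model)
            for i in range(num_layers)
        )

    def forward(self, x):  # x: (batch, seq_len, d_model)
        for layer in self.layers:
            if isinstance(layer, nn.MultiheadAttention):
                x = x + layer(x, x, x)[0]  # global, quadratic-cost mixing
            else:
                x = layer(x)               # linear-time Mamba-style mixing
        return x

out = HybridStack()(torch.randn(1, 16, 1024))
print(out.shape)  # torch.Size([1, 16, 1024])
```
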
### Architecture

- **Parameters**: ~300M
- **Hidden Size**: 1024
- **Layers**: 24 (hybrid Mamba + Attention)
- **Attention Heads**: 32
- **Key-Value Heads**: 8 (Grouped Query Attention)
- **Mamba Version**: Mamba-2
- **Vocabulary**: 12 tokens (A, C, G, T, N + special tokens)
- **Max Sequence Length**: 131,074 bp

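To confirm these numbers against the actual checkpoint, you can load and print its config. Printing the whole object avoids guessing field names, since a `trust_remote_code` model may use a custom config class.

```python
from transformers import AutoConfig

# Dump the checkpoint's config (hidden size, layer count, head counts,
# vocab size, max length). Field names may be custom to this repo, so we
# print the whole object rather than assume specific attributes.
config = AutoConfig.from_pretrained("Mishamq/HybriDNA-300M", trust_remote_code=True)
print(config)
```
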
## Installation

```bash
pip install transformers torch mamba-ssm causal-conv1d flash-attn
```

Note: `mamba-ssm`, `causal-conv1d`, and `flash-attn` ship CUDA kernels, so installing and running them requires an NVIDIA GPU with a matching CUDA toolchain.

## Usage

### Text Generation

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Mishamq/HybriDNA-300M"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Prompt with a nucleotide sequence; the model continues it autoregressively
prompt = "ACGTACGT"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```

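`generate` decodes greedily by default (unless the checkpoint ships its own generation config). For more diverse continuations, the standard `transformers` sampling arguments apply; nothing below is HybriDNA-specific.

```python
# Sampled generation with standard transformers arguments
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,    # sample from the distribution instead of argmax
    temperature=0.8,   # <1.0 sharpens, >1.0 flattens the distribution
    top_k=4,           # keep only the 4 most likely tokens at each step
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
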
### Embeddings

```python
from transformers import AutoTokenizer, AutoModel
import torch

model_name = "Mishamq/HybriDNA-300M"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name, trust_remote_code=True)

sequence = "ACGTACGTACGTACGT"
inputs = tokenizer(sequence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    embeddings = outputs.last_hidden_state  # (batch, seq_len, hidden_size)
```

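For a single fixed-length vector per sequence, mean pooling over the attention mask is a common recipe. This continues the snippet above and assumes the tokenizer returns an `attention_mask`, as Hugging Face tokenizers usually do; it is a convention, not an official HybriDNA recipe.

```python
# Mean-pool token embeddings into one vector per sequence (common
# convention; not prescribed by the model card). Continues the snippet above.
mask = inputs["attention_mask"].unsqueeze(-1).to(embeddings.dtype)  # (batch, seq, 1)
pooled = (embeddings * mask).sum(dim=1) / mask.sum(dim=1)           # (batch, hidden)
print(pooled.shape)  # torch.Size([1, 1024]) for the 300M model
```
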
## Model Variants

| Model | Parameters | Hidden Size | Layers |
|-------|------------|-------------|--------|
| [HybriDNA-300M](https://huggingface.co/Mishamq/HybriDNA-300M) | 300M | 1024 | 24 |
| HybriDNA-3B | 3B | 4096 | 16 |
| HybriDNA-7B | 7B | 4096 | 32 |

## Citation

If you use HybriDNA in your research, please cite:

```bibtex
@article{ma2025hybridna,
  title={HybriDNA: A Hybrid Transformer-Mamba2 Long-Range DNA Language Model},
  author={Ma, Mingqian and Liu, Guoqing and Cao, Chuan and Deng, Pan and Dao, Tri and Gu, Albert and Jin, Peiran and Yang, Zhao and Xia, Yingce and Luo, Renqian and others},
  journal={arXiv preprint arXiv:2502.10807},
  year={2025}
}
```

## License

Apache 2.0