	---
	license: mit
	tags:
	- biology
	---
	<div align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65a9e8563b9e1f0f308378b7/H2qI2OOSl-KqOlg01fRGR.png" width="100%" />
	</div>

	# OneGenomeRice (OGR)

	OGR is a generative genomic foundation model for AI-driven precision breeding and functional genomics in rice. It uses a Mixture-of-Experts (MoE) architecture with 1.25B total parameters and is trained to process DNA sequences of up to 1 million base pairs. It was pre-trained on a curated corpus of 422 rice genomes spanning cultivated and wild Oryza diversity.

	For instructions, details, and examples, see the project repository: [OGR GitHub](https://github.com/zhejianglab/OneGenomeRice).

	The table below summarizes the model scale and key architecture hyperparameters.

	| Model Specification | OGR |
	| --- | --- |
	| **Model Scale** | |
	| Total Parameters | 1.25B |
	| Activated Parameters | 0.33B |
	| **Architecture** | |
	| Architecture | MoE |
	| Number of Experts | 8 |
	| Selected Experts per Token | 2 |
	| Number of Layers | 12 |
	| Attention Hidden Dimension | 1024 |
	| Number of Attention Heads | 16 (GQA, 8 KV groups) |
	| MoE Hidden Dimension (per Expert) | 4096 |
	| Vocabulary Size | 128 (padded) |
	| Context Length | up to 1M |
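
As a rough sanity check on how the table's numbers fit together, the sketch below estimates total and activated weight counts from the listed hyperparameters. The exact layer layout is not stated in this card, so the estimate rests on assumptions that are mine rather than the authors': bias-free projections, a gated feed-forward block with three weight matrices per expert, one linear router per layer, untied input and output embeddings, and normalization weights ignored. `OGRSpec` and `estimate_params` are hypothetical helper names, not part of the OGR codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OGRSpec:
    """Values copied from the specification table above."""
    hidden: int = 1024          # attention hidden dimension
    layers: int = 12
    heads: int = 16             # query heads
    kv_groups: int = 8          # GQA key/value groups
    experts: int = 8
    experts_per_token: int = 2
    expert_hidden: int = 4096   # MoE hidden dimension per expert
    vocab: int = 128            # padded vocabulary size


def estimate_params(spec: OGRSpec, ffn_matrices: int = 3, tied_embeddings: bool = False):
    """Return (total, activated) weight counts under the stated assumptions.

    ASSUMPTIONS (not from the model card): bias-free projections, a gated
    feed-forward block with `ffn_matrices` weight matrices per expert, a
    linear router per layer, and normalization weights ignored.
    """
    head_dim = spec.hidden // spec.heads
    kv_dim = spec.kv_groups * head_dim
    # Q and output projections are hidden x hidden; K and V are hidden x kv_dim.
    attention = 2 * spec.hidden * spec.hidden + 2 * spec.hidden * kv_dim
    expert = ffn_matrices * spec.hidden * spec.expert_hidden
    router = spec.hidden * spec.experts
    embeddings = spec.vocab * spec.hidden * (1 if tied_embeddings else 2)

    # Every expert counts toward the total; only the selected ones are activated.
    total = spec.layers * (attention + spec.experts * expert + router) + embeddings
    activated = spec.layers * (attention + spec.experts_per_token * expert + router) + embeddings
    return total, activated


spec = OGRSpec()
total, activated = estimate_params(spec)
print(f"head dim: {spec.hidden // spec.heads}, query heads per KV group: {spec.heads // spec.kv_groups}")
print(f"estimated total: {total / 1e9:.2f}B, activated: {activated / 1e9:.2f}B")
```

Under these assumptions the estimate comes to about 1.25B total and 0.34B activated parameters, close to the 1.25B and 0.33B reported in the table. The small gap on the activated count is unsurprising, since the real layer layout and counting convention may differ from this sketch.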