yakRNA Design: A semantic multimodal RNA composer

yakRNA Design is a 110M-parameter discrete diffusion language model for conditional RNA sequence design. It can generate RNA sequences conditioned on any combination of secondary structure, consensus sequence, and Gene Ontology (GO) terms, or unconditionally from a target length.


Model Details

Architecture            ModernBERT-based discrete diffusion
Parameters              110M
Training data           Full Rfam database
Max sequence length     636 nt
Supported GO terms      280

Quickstart

The easiest way to use yakRNA Design is via the Google Colab notebook; no setup is required.

For local use, see the GitHub repository.


Download the Weights

CLI:

pip install huggingface_hub
huggingface-cli download MasterYster/yakRNA-Design yakRNA_110M.pt --local-dir checkpoints/

Python:

from huggingface_hub import hf_hub_download
hf_hub_download(repo_id="MasterYster/yakRNA-Design", filename="yakRNA_110M.pt", local_dir="checkpoints/")

Generation Modes

Mode                    Description
Unconditional           Generate from a target length
Structure-conditioned   Dot-bracket secondary structure
Consensus-conditioned   Mixed-case IUPAC consensus
GO term-conditioned     Up to 12 GO terms
Sequence infilling      Fixed positions + * masks
Multimodal              Any combination of the above
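
Before invoking the generator, it can help to sanity-check the conditioning inputs against the limits listed on this card (636 nt maximum length, at most 12 GO terms). The sketch below is a minimal, hypothetical pre-flight check and is not part of the yakRNA codebase: the function names are illustrative, whether the tool itself enforces these limits is an assumption, and the bracket check assumes that each bracket family ((), <>, [], {}) must balance independently, which lets crossing pairs such as pseudoknot notation pass.

```python
import re

MAX_LENGTH = 636      # maximum sequence length stated on this card
MAX_GO_TERMS = 12     # maximum number of GO terms stated on this card

# Standard GO identifier format: "GO:" followed by seven digits.
GO_ID = re.compile(r"GO:\d{7}")

# Bracket families are counted independently so crossing pairs are accepted.
PAIRS = {"(": ")", "<": ">", "[": "]", "{": "}"}


def brackets_balanced(structure: str) -> bool:
    """Return True if every bracket family opens and closes consistently."""
    closers = {close: open_ for open_, close in PAIRS.items()}
    depth = {open_: 0 for open_ in PAIRS}
    for ch in structure:
        if ch in PAIRS:
            depth[ch] += 1
        elif ch in closers:
            depth[closers[ch]] -= 1
            if depth[closers[ch]] < 0:   # closed before it was opened
                return False
    return all(d == 0 for d in depth.values())


def check_inputs(structure=None, go_terms=()):
    """Hypothetical helper: return a list of human-readable problems."""
    problems = []
    if structure is not None:
        if len(structure) > MAX_LENGTH:
            problems.append(f"structure is {len(structure)} nt, limit is {MAX_LENGTH}")
        if not brackets_balanced(structure):
            problems.append("structure has unbalanced brackets")
    if len(go_terms) > MAX_GO_TERMS:
        problems.append(f"{len(go_terms)} GO terms given, limit is {MAX_GO_TERMS}")
    problems += [f"malformed GO term: {t}" for t in go_terms
                 if not GO_ID.fullmatch(t)]
    return problems


print(check_inputs("((((....))))", ["GO:0075523"]))  # prints []
```

An empty list means no problems were found; any other result lists what to fix before running generation.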

Example Usage

# Install
git clone https://github.com/YousufAKhan/yakRNA.git
cd yakRNA
pip install -r requirements.txt

# Generate 5 sequences conditioned on a hairpin structure
python inference/rna_sequence_generator.py \
    --config configs/inference.yaml \
    --checkpoint checkpoints/yakRNA_110M.pt \
    --secondary_structure "((((....))))" \
    --num_sequences 5

# Generate with all three modalities
python inference/rna_sequence_generator.py \
    --config configs/inference.yaml \
    --checkpoint checkpoints/yakRNA_110M.pt \
    --secondary_structure ":::::::<<<<<<<<<-:::--[[[[[-->>>>>>>>><<<<<<<<<<_________>>>->>>>>>>::::]]]]]::::" \
    --consensus "GAGUaaGGGGuuCuAGU...gcaGCcCgcCUaGaaCCCUGcgacacuGGuucuaaaaCagAugucgUuuuaAGgGCuUUUG" \
    --go_terms "GO:0075523" \
    --num_sequences 5
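
To script the same calls from Python, the command line can be assembled programmatically. The sketch below uses only the flags and paths shown in the examples above; the helper name `build_command` and its defaults are hypothetical, and it passes `--go_terms` through as a single string because this card does not show how multiple terms are separated.

```python
import shlex
import sys


def build_command(num_sequences, secondary_structure=None, consensus=None,
                  go_terms=None,
                  config="configs/inference.yaml",
                  checkpoint="checkpoints/yakRNA_110M.pt"):
    """Hypothetical helper: build the argv list for rna_sequence_generator.py."""
    cmd = [sys.executable, "inference/rna_sequence_generator.py",
           "--config", config,
           "--checkpoint", checkpoint]
    # Only add a conditioning flag when a value was supplied.
    optional = {
        "--secondary_structure": secondary_structure,
        "--consensus": consensus,
        "--go_terms": go_terms,
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, value]
    cmd += ["--num_sequences", str(num_sequences)]
    return cmd


cmd = build_command(5, secondary_structure="((((....))))")
print(shlex.join(cmd[1:]))  # skip the interpreter path for readability
```

From the repository root, the list can be handed to `subprocess.run(cmd, check=True)`. Passing a list rather than a shell string avoids quoting problems with characters such as `<`, `>`, and `(` in structure strings.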

Links

GitHub repository: https://github.com/YousufAKhan/yakRNA

Citation

@software{yakrna2026,
  author = {Khan, Yousuf},
  title  = {yakRNA Design: A semantic multimodal RNA composer},
  year   = {2026},
  url    = {https://github.com/YousufAKhan/yakRNA}
}