MasterYster committed commit 8bca270 (verified) · 1 Parent(s): f584ce2

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +112 -0
README.md CHANGED
@@ -1,3 +1,115 @@
  ---
  license: mit
+ language:
+ - en
+ tags:
+ - rna
+ - biology
+ - diffusion
+ - generative
+ - sequence-design
  ---
+
+ # yakRNA Design: A semantic multimodal RNA composer
+
+ <p align="center">
+ <img src="https://raw.githubusercontent.com/YousufAKhan/yakRNA/main/mascot.png" width="160"/>
+ </p>
+
+ **yakRNA Design** is a 110M-parameter discrete diffusion language model for conditional RNA sequence design. It generates RNA sequences conditioned on any combination of secondary structure, consensus sequence, and Gene Ontology (GO) terms, or unconditionally from a target length alone.
+
+ ---
+
+ ## Model Details
+
+ | | |
+ |---|---|
+ | **Architecture** | ModernBERT-based discrete diffusion |
+ | **Parameters** | 110M |
+ | **Training data** | Full Rfam database |
+ | **Max sequence length** | 636 nt |
+ | **Supported GO terms** | 280 |
+
+ ---
+
+ ## Quickstart
+
+ The easiest way to use yakRNA Design is via the **[Google Colab notebook](https://colab.research.google.com/github/YousufAKhan/yakRNA/blob/main/yakRNA_colab.ipynb)**, which requires no setup.
+
+ For local use, see the **[GitHub repository](https://github.com/YousufAKhan/yakRNA)**.
+
+ ---
+
+ ## Download the Weights
+
+ **CLI:**
+ ```bash
+ pip install huggingface_hub
+ huggingface-cli download MasterYster/yakRNA-Design yakRNA_110M.pt --local-dir checkpoints/
+ ```
+
+ **Python:**
+ ```python
+ from huggingface_hub import hf_hub_download
+ hf_hub_download(repo_id="MasterYster/yakRNA-Design", filename="yakRNA_110M.pt", local_dir="checkpoints/")
+ ```
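
As a hedged sketch building on the Python download call above, the helper below skips the download when the checkpoint file is already on disk. The function name `ensure_checkpoint` and the module-level constants are illustrative and not part of the yakRNA repository; only `hf_hub_download` and its `repo_id`, `filename`, and `local_dir` arguments come from the README.

```python
from pathlib import Path

# Repository and file names as given in the README.
REPO_ID = "MasterYster/yakRNA-Design"
FILENAME = "yakRNA_110M.pt"


def ensure_checkpoint(local_dir="checkpoints"):
    """Return the local checkpoint path, downloading it only if it is missing.

    Hypothetical helper for illustration; not part of yakRNA itself.
    """
    target = Path(local_dir) / FILENAME
    if target.exists():
        return target
    # Imported lazily so the already-downloaded case does not need the library.
    from huggingface_hub import hf_hub_download

    return Path(
        hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)
    )
```

Calling `ensure_checkpoint()` from a script keeps repeated runs from contacting the Hub once the file exists locally.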
+
+ ---
+
+ ## Generation Modes
+
+ | Mode | Description |
+ |------|-------------|
+ | Unconditional | Generate from a target length |
+ | Structure-conditioned | Dot-bracket secondary structure |
+ | Consensus-conditioned | Mixed-case IUPAC consensus |
+ | GO term-conditioned | Up to 12 GO terms |
+ | Sequence infilling | Fixed positions + `*` masks |
+ | Multimodal | Any combination of the above |
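
Before passing inputs for the structure-conditioned or infilling modes, it can help to sanity-check them. The sketch below is not part of yakRNA; `is_balanced` and `masked_positions` are hypothetical helpers. It assumes that a structure string may mix several bracket families (`()`, `<>`, `[]`, `{}`), as the multimodal example in this README does, and that every other character marks an unpaired position. Each family is counted separately so that pseudoknot-style crossings between families are accepted.

```python
# Closing bracket -> opening bracket, one entry per bracket family.
PAIRS = {")": "(", ">": "<", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def is_balanced(structure: str) -> bool:
    """Check that every bracket family opens and closes consistently.

    Families are tracked independently, so "<<[[>>]]" (crossing pairs)
    passes while ")(" or "((..)" do not.
    """
    depth = {opener: 0 for opener in OPENERS}
    for ch in structure:
        if ch in OPENERS:
            depth[ch] += 1
        elif ch in PAIRS:
            opener = PAIRS[ch]
            if depth[opener] == 0:
                return False  # closing bracket with nothing left to close
            depth[opener] -= 1
        # Any other character is treated as an unpaired position.
    return all(count == 0 for count in depth.values())


def masked_positions(template: str) -> list[int]:
    """Return the 0-based indices of `*` masks in an infilling template."""
    return [i for i, ch in enumerate(template) if ch == "*"]
```

These checks only catch malformed input early; whether the model accepts a given notation is decided by the yakRNA inference code itself.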
+
+ ---
+
+ ## Example Usage
+
+ ```bash
+ # Install
+ git clone https://github.com/YousufAKhan/yakRNA.git
+ cd yakRNA
+ pip install -r requirements.txt
+
+ # Generate 5 sequences conditioned on a hairpin structure
+ python inference/rna_sequence_generator.py \
+ --config configs/inference.yaml \
+ --checkpoint checkpoints/yakRNA_110M.pt \
+ --secondary_structure "((((....))))" \
+ --num_sequences 5
+
+ # Generate with all three modalities
+ python inference/rna_sequence_generator.py \
+ --config configs/inference.yaml \
+ --checkpoint checkpoints/yakRNA_110M.pt \
+ --secondary_structure ":::::::<<<<<<<<<-:::--[[[[[-->>>>>>>>><<<<<<<<<<_________>>>->>>>>>>::::]]]]]::::" \
+ --consensus "GAGUaaGGGGuuCuAGU...gcaGCcCgcCUaGaaCCCUGcgacacuGGuucuaaaaCagAugucgUuuuaAGgGCuUUUG" \
+ --go_terms "GO:0075523" \
+ --num_sequences 5
+ ```
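
For scripted runs, the same commands can be assembled from Python. This is a minimal sketch, assuming only the script path and the flags shown in the commands above; `build_command` is a hypothetical wrapper, not a yakRNA API, and it passes `go_terms` through as a single string because the README does not show how multiple terms are supplied. It does not cover the unconditional mode, whose flag is not shown here.

```python
import sys


def build_command(
    checkpoint="checkpoints/yakRNA_110M.pt",
    config="configs/inference.yaml",
    secondary_structure=None,
    consensus=None,
    go_terms=None,
    num_sequences=5,
):
    """Build the argument list for inference/rna_sequence_generator.py.

    Only conditions that are provided are added, mirroring the two
    README commands (structure only, and all three modalities).
    """
    cmd = [
        sys.executable,
        "inference/rna_sequence_generator.py",
        "--config", config,
        "--checkpoint", checkpoint,
    ]
    if secondary_structure is not None:
        cmd += ["--secondary_structure", secondary_structure]
    if consensus is not None:
        cmd += ["--consensus", consensus]
    if go_terms is not None:
        cmd += ["--go_terms", go_terms]
    cmd += ["--num_sequences", str(num_sequences)]
    return cmd


# Example (run from the cloned yakRNA directory):
#   import subprocess
#   subprocess.run(build_command(secondary_structure="((((....))))"), check=True)
```

Passing the arguments as a list avoids shell quoting problems with characters such as `<`, `>`, and `*` in structure and infilling strings.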
+
+ ---
+
+ ## Links
+
+ - **GitHub**: [YousufAKhan/yakRNA](https://github.com/YousufAKhan/yakRNA)
+ - **Colab**: [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/YousufAKhan/yakRNA/blob/main/yakRNA_colab.ipynb)
+
+ ---
+
+ ## Citation
+
+ ```bibtex
+ @software{yakrna2026,
+ author = {Khan, Yousuf},
+ title = {yakRNA Design: A semantic multimodal RNA composer},
+ year = {2026},
+ url = {https://github.com/YousufAKhan/yakRNA}
+ }
+ ```