Add model card and metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +47 -0
README.md ADDED
---
library_name: transformers
pipeline_tag: text-generation
tags:
- biology
- rna-design
---

# Designing RNAs with Language Models

RNA-Design-LM is a research codebase and model for designing RNA sequences with autoregressive language models. Instead of solving each RNA inverse-folding instance from scratch with combinatorial search, this approach reframes RNA design as conditional sequence generation.
12
+
13
+ ## Description
14
+ The model is instantiated as a decoder-only Transformer (based on the Qwen2 architecture) that maps target secondary structures (represented as dot–bracket strings) directly to RNA sequences. It was trained in a supervised setting on structure–sequence pairs and further optimized using reinforcement learning (RL) to improve thermodynamic folding metrics such as Boltzmann probability, ensemble defect, and MFE uniqueness.
15
+
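As a quick illustration of the input format (this helper is not part of the released code), the base pairing implied by a dot–bracket string can be recovered with a stack: each `(` opens a pair that the matching `)` closes, and `.` marks an unpaired position.

```python
def dot_bracket_pairs(structure: str) -> list[tuple[int, int]]:
    """Return the (i, j) index pairs implied by a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)          # remember where the pair opened
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs

print(dot_bracket_pairs("((..))"))  # [(1, 4), (0, 5)]
```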
- **Paper:** [Designing RNAs with Language Models](https://huggingface.co/papers/2602.12470)
- **Repository:** [KuNyaa/RNA-Design-LM](https://github.com/KuNyaa/RNA-Design-LM)

## Task and Training

The model acts as a reusable neural approximator for RNA inverse folding. Key features include:

- **Amortized Design:** Generates sequences for target structures in a single forward pass.
- **RL Optimization:** End-to-end optimization for biological and thermodynamic metrics.
- **Constrained Decoding:** Supports enforcing Watson–Crick–wobble pairing rules during generation to ensure structural validity.

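The structural-validity condition behind constrained decoding is that every paired position holds one of the Watson–Crick–wobble pairs (AU, UA, GC, CG, GU, UG). A minimal sketch of that check (illustrative only, not the repository's implementation):

```python
# Watson-Crick pairs plus the GU wobble pair, in both orientations.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}

def is_structure_compatible(seq: str, structure: str) -> bool:
    """Check that every paired position in `structure` is filled by an
    allowed base pair in `seq`."""
    stack = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            if (seq[j], seq[i]) not in ALLOWED_PAIRS:
                return False
    return not stack  # leftover '(' means the structure itself is malformed

print(is_structure_compatible("GGAAUC", "((..))"))  # True: G-C and G-U pairs
print(is_structure_compatible("GGAAAC", "((..))"))  # False: G-A is not allowed
```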
## Usage

The model can be used for batched inference. For detailed implementation and evaluation, please refer to the [official GitHub repository](https://github.com/KuNyaa/RNA-Design-LM). Below is an example command provided by the authors for running inference with constrained decoding:

```bash
python ./scripts/constrained_decoding.py \
    --test_path ./test/eterna100.jsonl \
    --model_flavor slrl \
    --n_repeats 100 \
    --batch_size 1024 \
    --do_sample \
    --temp 2 \
    --constrained_decode
```

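Conceptually, constrained decoding restricts the sampling distribution at each step: at a position that closes a pair, only bases compatible with the already-generated opening base remain available. A toy, model-free sketch of that masking idea (hypothetical, for illustration only; the actual script applies the mask to Transformer logits):

```python
import random

BASES = "ACGU"
# For each base, the partners allowed under Watson-Crick-wobble rules.
PARTNERS = {"A": {"U"}, "U": {"A", "G"}, "G": {"C", "U"}, "C": {"G"}}

def constrained_sample(structure: str, rng: random.Random) -> str:
    """Sample a sequence for a dot-bracket structure, masking out bases
    that would violate pairing rules at closing positions."""
    seq, stack = [], []
    for ch in structure:
        if ch == ")":
            opener = seq[stack.pop()]
            allowed = sorted(PARTNERS[opener])  # mask: only valid partners
        else:
            allowed = list(BASES)               # unpaired or opening: any base
        if ch == "(":
            stack.append(len(seq))              # remember the opening index
        seq.append(rng.choice(allowed))
    return "".join(seq)

print(constrained_sample("((...))", random.Random(0)))
```

Every sequence produced this way is structurally valid by construction, which is the guarantee the `--constrained_decode` flag provides at the logits level.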
## Citation

If you use this model in your research, please cite the following paper:

```bibtex
@article{rna_design_lm_2025,
  title={Designing RNAs with Language Models},
  journal={arXiv preprint arXiv:2602.12470},
  year={2025}
}
```