---
library_name: transformers
pipeline_tag: text-generation
tags:
  - biology
  - rna-design
---

# Designing RNAs with Language Models

RNA-Design-LM is a research codebase and model for designing RNA sequences using autoregressive language models. Instead of solving each RNA inverse-folding instance from scratch with combinatorial search, this approach reframes RNA design as conditional sequence generation.

## Description

The model is instantiated as a decoder-only Transformer (based on the Qwen2 architecture) that maps target secondary structures (represented as dot–bracket strings) directly to RNA sequences. It was trained in a supervised setting on structure–sequence pairs and further optimized using reinforcement learning (RL) to improve thermodynamic folding metrics such as Boltzmann probability, ensemble defect, and MFE uniqueness.
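For context, a dot–bracket string marks each position as unpaired (`.`) or as one side of a base pair (`(`/`)`), with brackets matched like parentheses. The stack-based parser below is an illustrative sketch (not part of the released code) of how the base-pair map is recovered from such a string:

```python
def parse_dot_bracket(structure: str) -> dict[int, int]:
    """Map each paired position to its partner (0-indexed)."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unmatched '(' in structure")
    return pairs

# A 12-nt hairpin: stems 0-3 pair with 11-8, the loop 4-7 is unpaired.
print(parse_dot_bracket("((((....))))"))
```

Any sequence the model emits for this target must place pairable nucleotides at each (i, j) in the returned map.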

## Task and Training

The model acts as a reusable neural approximator for RNA inverse folding. Key features include:

- **Amortized design:** generates candidate sequences for a target structure in a single forward pass, with no per-instance combinatorial search.
- **RL optimization:** end-to-end fine-tuning against thermodynamic objectives such as Boltzmann probability, ensemble defect, and MFE uniqueness.
- **Constrained decoding:** optionally enforces Watson–Crick–wobble pairing rules during generation to ensure structural validity.
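The pairing constraint can be thought of as a per-step vocabulary mask. The helper below is a minimal sketch of that rule, assuming positions are generated left to right; the function name and interface are illustrative, not the repository's actual decoder:

```python
# Allowed Watson–Crick and wobble (G·U) partners for each nucleotide.
PAIRS = {"A": {"U"}, "U": {"A", "G"}, "G": {"C", "U"}, "C": {"G"}}
ALPHABET = frozenset("ACGU")

def allowed_nucleotides(partial_seq: str, pairs: dict[int, int]) -> set[str]:
    """Nucleotides valid at the next position under pairing constraints.

    `pairs` maps each paired index to its partner.  If the next position
    closes a pair whose opening base is already generated, only bases that
    can pair with it remain allowed; otherwise all four are allowed.
    """
    i = len(partial_seq)
    j = pairs.get(i)
    if j is None or j >= i:  # unpaired, or partner not yet generated
        return set(ALPHABET)
    return set(PAIRS[partial_seq[j]])

# Structure "(....)": position 5 must close the pair opened by the G at 0,
# so only C (Watson–Crick) or U (wobble) are allowed.
pairs = {0: 5, 5: 0}
print(allowed_nucleotides("GAUAC", pairs))
```

In practice such a mask would be applied to the model's logits before sampling, setting disallowed tokens to negative infinity.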

## Usage

The model can be used for batched inference. For detailed implementation and evaluation, please refer to the official GitHub repository. Below is an example command provided by the authors for running inference with constrained decoding:

```shell
python ./scripts/constrained_decoding.py \
  --test_path ./test/eterna100.jsonl \
  --model_flavor slrl \
  --n_repeats 100 \
  --batch_size 1024 \
  --do_sample \
  --temp 2 \
  --constrained_decode
```

## Citation

If you use this model in your research, please cite the following paper:

```bibtex
@article{rna_design_lm_2025,
  title={Designing RNAs with Language Models},
  journal={arXiv preprint arXiv:2602.12470},
  year={2025}
}
```