# TR2-D2 For Enhancer DNA Design

This part of the codebase finetunes DNA sequence models with TR2-D2 to optimize DNA enhancer activity.

The codebase is partially built upon [PepTune (Tang et al., 2024)](https://arxiv.org/abs/2412.17780), [MDLM (Sahoo et al., 2023)](https://github.com/kuleshov-group/mdlm), [DRAKES (Wang et al., 2024)](https://github.com/ChenyuWang-Monica/DRAKES), and [MDNS (Zhu et al., 2025)](https://arxiv.org/abs/2508.10684).

## Environment Installation
```
conda create -n tr2d2-dna python=3.9.18

conda activate tr2d2-dna

bash env.sh
```

## Model Pretrained Weights Download

All data and model weights can be downloaded from the link below, which is provided by the [DRAKES](https://arxiv.org/abs/2410.13643) authors. Save the downloaded file in `$BASE_PATH`.

https://www.dropbox.com/scl/fi/zi6egfppp0o78gr0tmbb1/DRAKES_data.zip?rlkey=yf7w0pm64tlypwsewqc01wmfq&st=xe8dzn8k&dl=0

To download from a terminal, run:

```
curl -L -o dna.zip "https://www.dropbox.com/scl/fi/zi6egfppp0o78gr0tmbb1/DRAKES_data.zip?rlkey=yf7w0pm64tlypwsewqc01wmfq&st=xe8dzn8k&dl=0"

unzip dna.zip
```
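
If `unzip` is not available, the extraction step can be sketched in Python with the standard library. This is a minimal sketch, not part of the repository: the helper name `extract_archive` is hypothetical, and it assumes the archive was already downloaded (for example as `dna.zip`, as above).

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, base_path):
    """Extract zip_path into base_path and return the sorted top-level entry names.

    Hypothetical helper for illustration; it is not part of the TR2-D2 code.
    """
    base = Path(base_path)
    base.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(base)
        # Zip member names always use "/" as separator, so the first
        # component of each name is a top-level file or directory.
        top_level = {name.split("/", 1)[0] for name in archive.namelist() if name}
    return sorted(top_level)
```

Calling `extract_archive("dna.zip", base_path)` with your own `base_path` returns the top-level names in the archive, which is a quick way to confirm what was unpacked.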

## Finetune with TR2-D2
After downloading the pretrained checkpoints, set `base_path` in `dataloader_gosai.py`, `oracle.py`, and `finetune.sh`. Set `HOME_LOC` and `SAVE_PATH` in `finetune.sh` as well.
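
Before submitting a job, it may help to confirm that the paths you filled in point at real directories. The following is a minimal sketch under that assumption. The helper `missing_dirs` is hypothetical and not part of the repository, and whether `SAVE_PATH` must exist before training depends on the scripts, so include only the paths you expect to exist.

```python
from pathlib import Path


def missing_dirs(paths):
    """Return the names in `paths` whose value is empty or not an existing directory.

    `paths` maps a setting name (e.g. "base_path") to the value you filled in.
    Hypothetical helper for illustration only.
    """
    return [
        name
        for name, value in paths.items()
        # An empty string would resolve to the current directory, so treat
        # it as "not filled in" rather than as a valid path.
        if not value or not Path(value).is_dir()
    ]
```

An empty result means every listed path exists. Any returned name is a setting that still needs attention.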

Reproduce the DNA experiments with $\alpha = 0.1$ using
```
sbatch train.sh
```

## Evaluate saved checkpoints
Checkpoints are saved to `SAVE_PATH`.
Set `RUNS_DIR` in `run_batch_eval.sh` to the same path as `SAVE_PATH`. The checkpoints can then be evaluated with
```
sbatch run_batch_eval.sh
```
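
To check that `RUNS_DIR` actually contains checkpoints before submitting the evaluation job, a short listing script can help. This is a minimal sketch. The helper `find_checkpoints` is hypothetical, and the `.ckpt` suffix is an assumption about how checkpoints are named, so adjust `suffix` to match the files in your `SAVE_PATH`.

```python
from pathlib import Path


def find_checkpoints(runs_dir, suffix=".ckpt"):
    """Return files under runs_dir whose names end with `suffix`, sorted by path.

    The search is recursive. The default suffix is an assumption, not
    something the repository guarantees.
    """
    return sorted(p for p in Path(runs_dir).rglob(f"*{suffix}") if p.is_file())
```

An empty list suggests that either training has not written checkpoints yet or `RUNS_DIR` does not match `SAVE_PATH`.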