---
datasets:
- multimolecule/deltasplice
library_name: multimolecule
license: agpl-3.0
pipeline: splice-site
pipeline_tag: other
tags:
- Biology
- RNA
- Splicing
- rna
widget:
- example_title: microRNA 21
  pipeline_tag: splice-site
  sequence_type: ncRNA
  task: splice-site
  text: UAGCUUAUCAGACUGAUGUUGA
- example_title: microRNA 146a
  pipeline_tag: splice-site
  sequence_type: ncRNA
  task: splice-site
  text: UGAGAACUGAAUUCCAUGGGUU
- example_title: microRNA 155
  pipeline_tag: splice-site
  sequence_type: ncRNA
  task: splice-site
  text: UUAAUGCUAAUCGUGAUAGGGGUU
- example_title: RNA component of mitochondrial RNA processing endoribonuclease
  pipeline_tag: splice-site
  sequence_type: ncRNA
  task: splice-site
  text: GGUUCGUGCUGAAGGCCUGUAUCCUAGGCUACACACUGAGGACUCUGUUCCUCCCCUUUCCGCCUAGGGGAAAGUCCCCGGACCUCGGGCAGAGAGUGCCACGUGCAUACGCACGUAGACAUUCCCCGCUUCCCACUCCAAAGUCCGCCAAGAAGCGUAUCCCGCUGAGCGGCGUGGCGCGGGGGCGUCAUCCGUCAGCUCCCUCUAGUUACGCAGGCAGUGCGUGUCCGCGCACCAACCACACGGGGCUCAUUCUCAGCGCGGCUGUAAAAAAAAA
- example_title: 7SK small nuclear RNA
  pipeline_tag: splice-site
  sequence_type: ncRNA
  task: splice-site
  text: GGAUGUGAGGGCGAUCUGGCUGCGACAUCUGUCACCCCAUUGAUCGCCAGGGUUGAUUCGGCUGAUCUGGCUGGCUAGGCGGGUGUCCCCUUCCUCCCUCACCGCUCCAUGUGCGUCCCUCCCGAAGCUGCGCGCUCGGUCGAAGAGGACGACCAUCCCCGAUAGAGGAGGACCGGUCUUCGGUCAAGGGUAUACGAGUAGCUGCGCUCCCCUGCUAGAACCUCCAAACAAGCUCUCAAGGUCCAUUUGUAGGAGAACGUAGGGUAGUCAAGCUUCCAAGACUCCAGACACAUCCAAAUGAGGCGCUGCAUGUGGCAGUCUGCCUUUCUUUU
- example_title: telomerase RNA component
  pipeline_tag: splice-site
  sequence_type: ncRNA
  task: splice-site
  text: GGGUUGCGGAGGGUGGGCCUGGGAGGGGUGGUGGCCAUUUUUUGUCUAACCCUAACUGAGAAGGGCGUAGGCGCCGUGCUUUUGCUCCCCGCGCGCUGUUUUUCUCGCUGACUUUCAGCGGGCGGAAAAGCCUCGGCCUGCCGCCUUCCACCGUUCAUUCUAGAGCAAACAAAAAAUGUCAGCUGCUGGCCCGUUCGCCCCUCCCGGGGACCUGCGGCGGGUCGCCUGCCCAGCCCCCGAACCCCGCCUGGAGGCCGCGGUCGGCCCGGGGCUUCUCCGGAGGCACCCACUGCCACCGCGAAGAGUUGGGCUCUGUCAGCCGCGGGUCUCUCGGGGGCGAGGGCGAGGUUCAGGCCUUUCAGGCCGCAGGAAGAGGAACGGAGCGAGUCCCCGCGCGCGGCGCGAUUCCCUGAGCUGUGGGACGUGCACCCAGGACUCGGCUCACACAUGC
- example_title: vault RNA 2-1
  pipeline_tag: splice-site
  sequence_type: ncRNA
  task: splice-site
  text: CGGGUCGGAGUUAGCUCAAGCGGUUACCUCCUCAUGCCGGACUUUCUAUCUGUCCAUCUCUGUGCUGGGGUUCGAGACCCGCGGGUGCUUACUGACCCUUUUAUGCAA
- example_title: brain cytoplasmic RNA 1
  pipeline_tag: splice-site
  sequence_type: ncRNA
  task: splice-site
  text: GGCCGGGCGCGGUGGCUCACGCCUGUAAUCCCAGCUCUCAGGGAGGCUAAGAGGCGGGAGGAUAGCUUGAGCCCAGGAGUUCGAGACCUGCCUGGGCAAUAUAGCGAGACCCCGUUCUCCAGAAAAAGGAAAAAAAAAAACAAAAGACAAAAAAAAAAUAAGCGUAACUUCCCUCAAAGCAACAACCCCCCCCCCCCUUU
- example_title: HIV-1 TAR-WT
  pipeline_tag: splice-site
  sequence_type: ncRNA
  task: splice-site
  text: GGUCUCUCUGGUUAGACCAGAUCUGAGCCUGGGAGCUCUCUGGCUAACUAGGGAACC
- example_title: prion protein (Kanno blood group)
  pipeline_tag: splice-site
  sequence_type: mRNA
  task: splice-site
  text: AUGGCGAACCUUGGCUGCUGGAUGCUGGUUCUCUUUGUGGCCACAUGGAGUGACCUGGGCCUCUGC
- example_title: interleukin 10
  pipeline_tag: splice-site
  sequence_type: mRNA
  task: splice-site
  text: AUGCACAGCUCAGCACUGCUCUGUUGCCUGGUCCUCCUGACUGGGGUGAGGGCC
- example_title: Zaire ebolavirus
  pipeline_tag: splice-site
  sequence_type: mRNA
  task: splice-site
  text: AAUGUUCAAACACUUUGUGAAGCUCUGUUAGCUGAUGGUCUUGCUAAAGCAUUUCCUAGCAAUAUGAUGGUAGUCACAGAGCGUGAGCAAAAAGAAAGCUUAUUGCAUCAAGCAUCAUGGCACCACACAAGUGAUGAUUUUGGUGAGCAUGCCACAGUUAGAGGGAGUAGCUUUGUAACUGAUUUAGAGAAAUACAAUCUUGCAUUUAGAUAUGAGUUUACAGCACCUUUUAUAGAAUAUUGUAACCGUUGCUAUGGUGUUAAGAAUGUUUUUAAUUGGAUGCAUUAUACAAUCCCACAGUGUUAU
- example_title: SARS coronavirus
  pipeline_tag: splice-site
  sequence_type: mRNA
  task: splice-site
  text: AUGUUUAUUUUCUUAUUAUUUCUUACUCUCACUAGUGGUAGUGACCUUGACCGGUGCACCACUUUUGAUGAUGUUCAAGCUCCUAAUUACACUCAACAUACUUCAUCUAUGAGGGGGGUUUACUAUCCUGAUGAAAUUUUUAGAUCAGACACUCUUUAUUUAACUCAGGAUUUAUUUCUUCCAUUUUAUUCUAAUGUUACAGGGUUUCAUACUAUUAAUCAUACGUUUGACAACCCUGUCAUACCUUUUAAGGAUGGUAUUUAUUUUGCUGCCACAGAGAAAUCAAAUGUUGUCCGUGGUUGGGUUUUUGGUUCUACCAUGAACAACAAGUCACAGUCGGUGAUUAUUAUUAACAAUUCUACUAAUGUUGUUAUACGAGCAUGUAACUUUGAAUUGUGUGACAACCCUUUCUUUGCUGUUUCUAAACCCAUGGGUACACAGACACAUACUAUGAUAUUCGAUAAUGCAUUUAAAUGCACUUUCGAGUACAUAUCU
- example_title: insulin
  pipeline_tag: splice-site
  sequence_type: mRNA
  task: splice-site
  text: AUGGCCCUGUGGAUGCGCCUCCUGCCCCUGCUGGCGCUGCUGGCCCUCUGGGGACCUGACCCAGCCGCAGCCUUUGUGAACCAACACCUGUGCGGCUCACACCUGGUGGAAGCUCUCUACCUAGUGUGCGGGGAACGAGGCUUCUUCUACACACCCAAGACCCGCCGGGAGGCAGAGGACCUGCAGGUGGGGCAGGUGGAGCUGGGCGGGGGCCCUGGUGCAGGCAGCCUGCAGCCCUUGGCCCUGGAGGGGUCCCUGCAGAAGCGUGGCAUUGUGGAACAAUGCUGUACCAGCAUCUGCUCCCUCUACCAGCUGGAGAACUACUGCAACUAG
- example_title: cyclin dependent kinase inhibitor 2A
  pipeline_tag: splice-site
  sequence_type: mRNA
  task: splice-site
  text: AUGGAGCCGGCGGCGGGGAGCAGCAUGGAGCCUUCGGCUGACUGGCUGGCCACGGCCGCGGCCCGGGGUCGGGUAGAGGAGGUGCGGGCGCUGCUGGAGGCGGGGGCGCUGCCCAACGCACCGAAUAGUUACGGUCGGAGGCCGAUCCAGGUCAUGAUGAUGGGCAGCGCCCGAGUGGCGGAGCUGCUGCUGCUCCACGGCGCGGAGCCCAACUGCGCCGACCCCGCCACUCUCACCCGACCCGUGCACGACGCUGCCCGGGAGGGCUUCCUGGACACGCUGGUGGUGCUGCACCGGGCCGGGGCGCGGCUGGACGUGCGCGAUGCCUGGGGCCGUCUGCCCGUGGACCUGGCUGAGGAGCUGGGCCAUCGCGAUGUCGCACGGUACCUGCGCGCGGCUGCGGGGGGCACCAGAGGCAGUAACCAUGCCCGCAUAGAUGCCGCGGAAGGUCCCUCAGACAUCCCCGAUUGA
- example_title: human papillomavirus type 16 E6
  pipeline_tag: splice-site
  sequence_type: mRNA
  task: splice-site
  text: AUGCACCAAAAGAGAACUGCAAUGUUUCAGGACCCACAGGAGCGACCCAGAAAGUUACCACAGUUAUGCACAGAGCUGCAAACAACUAUACAUGAUAUAAUAUUAGAAUGUGUGUACUGCAAGCAACAGUUACUGCGACGUGAGGUAUAUGACUUUGCUUUUCGGGAUUUAUGCAUAGUAUAUAGAGAUGGGAAUCCAUAUGCUGUAUGUGAUAAAUGUUUAAAGUUUUAUUCUAAAAUUAGUGAGUAUAGACAUUAUUGUUAUAGUUUGUAUGGAACAACAUUAGAACAGCAAUACAACAAACCGUUGUGUGAUUUGUUAAUUAGGUGUAUUAACUGUCAAAAGCCACUGUGUCCUGAAGAAAAGCAAAGACAUCUGGACAAAAAGCAAAGAUUCCAUAAUAUAAGGGGUCGGUGGACCGGUCGAUGUAUGUCUUGUUGCAGAUCAUCAAGAACACGUAGAGAAACCCAGCUGUAA
- example_title: NRAS proto-oncogene
  pipeline_tag: splice-site
  sequence_type: 5' UTR
  task: splice-site
  text: GGGGCCGGAAGUGCCGCUCCUUGGUGGGGGCUGUUCAUGGCGGUUCCGGGGUCUCCAACAUUUUUCCCGGCUGUGGUCCUAAAUCUGUCCAAAGCAGAGGCAGUGGAGCUUGAGGUUCUUGCUGGUGUGAA
- example_title: amyloid beta precursor protein
  pipeline_tag: splice-site
  sequence_type: 5' UTR
  task: splice-site
  text: GUCAGUUUCCUCGGCAGCGGUAGGCGAGAGCACGCGGAGGAGCGUGCGCGGGGGCCCCGGGAGACGGCGGCGGUGGCGGCGCGGGCAGAGCAAGGACGCGGCGGAUCCCACUCGCACAGCAGCGCACUCGGUGCCCCGCGCAGGGUCGCG
- example_title: RUNX family transcription factor 1
  pipeline_tag: splice-site
  sequence_type: 5' UTR
  task: splice-site
  text: ACUUCUUUGGGCCUCAUAAACAACCACAGAACCACAAGUUGGGUAGCCUGGCAGUGUCAGAAGUCUGAACCCAGCAUAGUGGUCAGCAGGCAGGACGAAUCACACUGAAUGCAAACCACAGGGUUUCGCAGCGUGGUAAAAGAAAUCAUUGAGUCCCCCGCCUUCAGAAGAGGGUGCAUUUUCAGGAGGAAGCG
- example_title: fragile X messenger ribonucleoprotein 1
  pipeline_tag: splice-site
  sequence_type: 5' UTR
  task: splice-site
  text: CUCAGUCAGGCGCUCAGCUCCGUUUCGGUUUCACUUCCGGUGGAGGGCCGCCUCUGAGCGGGCGGCGGGCCGACGGCGAGCGCGGGCGGCGGCGGUGACGGAGGCGCCGCUGCCAGGGGGCGUGCGGCAGCGCGGCGGCGGCGGCGGCGGCGGCGGCGGCGGAGGCGGCGGCGGCGGCGGCGGCGGCGGCGGCUGGGCCUCGAGCGCCCGCAGCCCACCUCUCGGGGGCGGGCUCCCGGCGCUAGCAGGGCUGAAGAGAAG
- example_title: MYC proto-oncogene
  pipeline_tag: splice-site
  sequence_type: 5' UTR
  task: splice-site
  text: AACUCGCUGUAGUAAUUCCAGCGAGAGGCAGAGGGAGCGAGCGGGCGGCCGGCUAGGGUGGAAGAGCCGGGCGAGCAGAGCUGCGCUGCGGGCGUCCUGGGAAGGGAGAUCCGGAGCGAAUAGGGGGCUUCGCCUCUGGCCCAGCCCUCCCGCUGAUCCCCCAGCCAGCGGUCCGCAACCCUUGCCGCAUCCACGAAACUUUGCCCAUAGCAGCGGGCGGGCACUUUGCACUGGAACUUACAACACCCGAGCAAGGACGCGACUCUCCCGACGCGGGGAGGCUAUUCUGCCCAUUUGGGGACACUUCCCCGCCGCUGCCAGGACCCGCUUCUCUGAAAGGCUCUCCUUGCAGCUGCUUAGACG
- example_title: activating transcription factor 4
  pipeline_tag: splice-site
  sequence_type: 5' UTR
  task: splice-site
  text: CAUUUCUACUUUGCCCGCCCACAGAUGUAGUUUUCUCUGCGCGUGUGCGUUUUCCCUCCUCCCCGCCCUCAGGGUCCACGGCCACCAUGGCGUAUUAGGGGCAGCAGUGCCUGCGGCAGCAUUGGCCUUUGCAGCGGCGGCAGCAGCACCAGGCUCUGCAGCGGCAACCCCCAGCGGCUUAAGCCAUGGCGCUUCUCACGGCAUUCAGCAGCAGCGUUGCUGUAACCGACAAAGACACCUUCGAAUUAAGCACAUUCCUCGAUUCCAGCAAAGCACCGCAAC
- example_title: Human GPI protein p137
  pipeline_tag: splice-site
  sequence_type: 3' UTR
  task: splice-site
  text: UUUUUAAAAGGAAAAGAUACCAAAUGCCUGCUGCUACCACCCUUUUCAAUUGCUAUGUUUUGAAAGGCACCAGUAUGUGUUUUAGAUUGAUUUAAAUGUUUCAUUUAAAUCACGGACAGUAGUUUCAGUUCUGAUGGUAUAAGCAAAACAAAUAAAACGUUUAUAAAAGUUGUAUCUUGAAACACUGGUGUUCAACAGCUAGCAGCUUAUGUGAUUCACCCCAUGCCACGUUAGUGUCACAAAUUUUAUGGUUUAUCUCCAGCAACAUUUCUCUAGUACUUGCACUUAUUAUCUGAAUUC
- example_title: nucleophosmin 1
  pipeline_tag: splice-site
  sequence_type: 3' UTR
  task: splice-site
  text: GAAAAUAGUUUAAACAAUUUGUUAAAAAAUUUUCCGUCUUAUUUCAUUUCUGUAACAGUUGAUAUCUGGCUGUCCUUUUUAUAAUGCAGAGUGAGAACUUUCCCUACCGUGUUUGAUAAAUGUUGUCCAGGUUCUAUUGCCAAGAAUGUGUUGUCCAAAAUGCCUGUUUAGUUUUUAAAGAUGGAACUCCACCCUUUGCUUGGUUUUAAGUAUGUAUGGAAUGUUAUGAUAGGACAUAGUAGUAGCGGUGGUCAGACAUGGAAAUGGUGGGGAGACAAAAAUAUACAUGUGAAAUAAAACUCAGUAUUUUAAUAAAGUAGCACGGUUUCUAUUGA
- example_title: superoxide dismutase 1
  pipeline_tag: splice-site
  sequence_type: 3' UTR
  task: splice-site
  text: ACAUUCCCUUGGAUGUAGUCUGAGGCCCCUUAACUCAUCUGUUAUCCUGCUAGCUGUAGAAAUGUAUCCUGAUAAACAUUAAACACUGUAAUCUUAAAAGUGUAAUUGUGUGACUUUUUCAGAGUUGCUUUAAAGUACCUGUAGUGAGAAACUGAUUUAUGAUCACUUGGAAGAUUUGUAUAGUUUUAUAAAACUCAGUUAAAAUGUCUGUUUCAAUGACCUGUAUUUUGCCAGACUUAAAUCACAGAUGGGUAUUAAACUUGUCAGAAUUUCUUUGUCAUUCAAGCCUGUGAAUAAAAACCCUGUAUGGCACUUAUUAUGAGGCUAUUAAAAGAAUCCAAAUUCAAACUAAA
- example_title: hemoglobin subunit alpha 2
  pipeline_tag: splice-site
  sequence_type: 3' UTR
  task: splice-site
  text: CUGGAGCCUCGGUAGCCGUUCCUCCUGCCCGCUGGGCCUCCCAACGGGCCCUCCUCCCCUCCUUGCACCGGCCCUUCCUGGUCUUUGAAUAAAGUCUGAGUGGGCAGCA
- example_title: BRAF proto-oncogene
  pipeline_tag: splice-site
  sequence_type: 3' UTR
  task: splice-site
  text: AACAAAUGAGUGAGAGAGUUCAGGAGAGUAGCAACAAAAGGAAAAUAAAUGAACAUAUGUUUGCUUAUAUGUUAAAUUGAAUAAAAUACUCUCUUUUUUUUUAAGGUGAACCAAAGAACACUUGUGUGGUUAAAGACUAGAUAUAAUUUUUCCCCAAACUAAAAUUUAUACUUAACAUUGGAUUUUUAACAUCCAAGGGUUAAAAUACAUAGACAUUGCUAAAAAUUGGCAGAGCCUCUUCUAGAGGCUUUACUUUCUGUUCCGGGUUUGUAUCAUUCACUUGGUUAUUUUAAGUAGUAAACUUCAGUUUCUCAUGCAACUUUUGUUGCCAGCUAUCACAUGUCCACUAGGGACUCCAGAAGAAGACCCUACCUAUGCCUGUGUUUGCAGGUGAGAAGUUGGCAGUCGGUUAGCCUGGG
- example_title: H3 clustered histone 1
  pipeline_tag: splice-site
  sequence_type: 3' UTR
  task: splice-site
  text: UUACUGUGGUCUCUCUGACGGUCCAAGCAAAGGCUCUUUUCAGAGCCACCACCUUUUC
---

# DeltaSplice

Reference-informed prediction of alternative splicing and splicing-altering mutations from sequences.

## Disclaimer

This is an UNOFFICIAL implementation of [Reference-informed prediction of alternative splicing and splicing-altering mutations from sequences](https://doi.org/10.1101/gr.279044.124) by Chencheng Xu, Suying Bao, et al.

The OFFICIAL repository of DeltaSplice is at [chaolinzhanglab/DeltaSplice](https://github.com/chaolinzhanglab/DeltaSplice).

> [!TIP]
> The MultiMolecule team has confirmed that the provided model and checkpoints produce the same intermediate representations as the original implementation.

**The team releasing DeltaSplice did not write a model card for this model, so this model card has been written by the MultiMolecule team.**

## Model Details

DeltaSplice predicts splice-site usage (SSU) and splicing-altering mutation effects from sequence. The model uses a valid-convolution dilated residual encoder and three prediction modules: splice-site usage, reference-informed delta-SSU, and an auxiliary splice-site head. The official package uses the average prediction of five checkpoints for SSU and delta-SSU prediction; MultiMolecule stores the five seed checkpoints of each released data variant as internal ensemble members and returns their average prediction.
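
The ensemble averaging described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the nested-list layout (member, position, channel), the helper name `average_members`, and all numbers are assumptions made up for the example, and the sketch does not say at which stage of the forward pass MultiMolecule averages.

```python
# Hypothetical outputs of 5 ensemble members for 2 positions and 3 channels.
# The values are invented for illustration; they are not model outputs.
member_outputs = [
    [[0.90, 0.05, 0.05], [0.10, 0.80, 0.10]],
    [[0.80, 0.10, 0.10], [0.20, 0.70, 0.10]],
    [[0.70, 0.20, 0.10], [0.10, 0.60, 0.30]],
    [[0.90, 0.05, 0.05], [0.20, 0.70, 0.10]],
    [[0.70, 0.10, 0.20], [0.10, 0.70, 0.20]],
]


def average_members(members):
    """Average values across ensemble members, per position and per channel."""
    num_members = len(members)
    num_positions = len(members[0])
    num_channels = len(members[0][0])
    return [
        [
            sum(member[pos][ch] for member in members) / num_members
            for ch in range(num_channels)
        ]
        for pos in range(num_positions)
    ]


averaged = average_members(member_outputs)
for position, row in enumerate(averaged):
    print(position, [round(value, 3) for value in row])
```

Averaging several seeds is a common way to reduce seed-to-seed variance; the model returns the averaged prediction directly, so users do not need to do this themselves.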

### Variants

- **[multimolecule/deltasplice](https://huggingface.co/multimolecule/deltasplice)**: DeltaSplice trained on the multi-species training set used by the upstream default checkpoints.
- **[multimolecule/deltasplice-human](https://huggingface.co/multimolecule/deltasplice-human)**: DeltaSplice human-only comparison checkpoint set released by the upstream project.

### Model Specification

| Variant           | Num Layers | Hidden Size | Context | Ensemble Members | Num Parameters (M) | FLOPs (M)  | MACs (M)  |
| ----------------- | ---------- | ----------- | ------- | ---------------- | ------------------ | ---------- | --------- |
| DeltaSplice       | 24         | 64          | 30000   | 5                | 40.376             | 1642965.72 | 820284.36 |
| DeltaSplice-Human | 24         | 64          | 30000   | 5                | 40.376             | 1642965.72 | 820284.36 |

(FLOPs and MACs measured on one requested output nucleotide with the default 30 kb padded context.)
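
Because the table reports FLOPs and MACs in millions, a quick unit conversion may help when reading it. The snippet below only restates the table's numbers; the variable names are illustrative.

```python
# Values copied from the Model Specification table (both in millions).
flops_millions = 1642965.72
macs_millions = 820284.36

# Convert millions of FLOPs to TFLOPs (1 TFLOP = 1e12 FLOPs).
tflops = flops_millions * 1e6 / 1e12

# One multiply-accumulate is conventionally counted as about two FLOPs,
# so the ratio is expected to sit close to 2.
ratio = flops_millions / macs_millions

print(f"{tflops:.2f} TFLOPs per requested output nucleotide")  # 1.64
print(f"FLOPs / MACs = {ratio:.3f}")  # 2.003
```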

### Links

- **Code**: [multimolecule.deltasplice](https://github.com/DLS5-Omics/multimolecule/tree/master/multimolecule/models/deltasplice)
- **Paper**: [Reference-informed prediction of alternative splicing and splicing-altering mutations from sequences](https://doi.org/10.1101/gr.279044.124)
- **Developed by**: Chencheng Xu, Suying Bao, Ye Wang, Wenxing Li, Hao Chen, Yufeng Shen, Tao Jiang, Chaolin Zhang
- **Model type**: Dilated residual 1D CNN ensemble for splice-site usage and delta-SSU prediction
- **Original Repository**: [chaolinzhanglab/DeltaSplice](https://github.com/chaolinzhanglab/DeltaSplice)

## Usage

The model file depends on the [`multimolecule`](https://multimolecule.danling.org) library. You can install it using pip:

```bash
pip install multimolecule
```

### Direct Use

#### Splice-Site Usage

```python
>>> from multimolecule import RnaTokenizer
>>> from multimolecule.models.deltasplice import DeltaSpliceModel

>>> tokenizer = RnaTokenizer.from_pretrained("multimolecule/deltasplice")
>>> model = DeltaSpliceModel.from_pretrained("multimolecule/deltasplice")
>>> inputs = tokenizer("AGCAGUCAUUAUGGCGAAUCUGGCAAGUA", return_tensors="pt")
>>> output = model(**inputs)
>>> output["probabilities"].shape
torch.Size([1, 30, 3])
```
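
One way to read the per-position output is to report positions where the `acceptor` or `donor` channel passes a cutoff. The sketch below is hedged: it assumes the last dimension follows the order `no_splice`, `acceptor`, `donor`, the helper `call_sites` and the `0.5` threshold are hypothetical, and the example scores are invented rather than produced by the model.

```python
# Assumed index order of the last output dimension.
CHANNELS = ("no_splice", "acceptor", "donor")


def call_sites(probabilities, threshold=0.5):
    """Return (position, label, score) where a splice channel passes threshold.

    `probabilities` is a list of per-position [no_splice, acceptor, donor] scores.
    """
    calls = []
    for position, scores in enumerate(probabilities):
        # Pick the stronger of the two splice channels (indices 1 and 2).
        best = max(range(1, len(CHANNELS)), key=lambda index: scores[index])
        if scores[best] >= threshold:
            calls.append((position, CHANNELS[best], scores[best]))
    return calls


# Made-up scores for four positions.
example = [
    [0.97, 0.02, 0.01],
    [0.10, 0.85, 0.05],
    [0.95, 0.03, 0.02],
    [0.20, 0.05, 0.75],
]
print(call_sites(example))  # [(1, 'acceptor', 0.85), (3, 'donor', 0.75)]
```

With the real model, the nested list could be obtained from the first batch element, for example with `output["probabilities"][0].tolist()`.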

#### Variant Effect

```python
>>> from multimolecule import RnaTokenizer
>>> from multimolecule.models.deltasplice import DeltaSpliceModel

>>> tokenizer = RnaTokenizer.from_pretrained("multimolecule/deltasplice")
>>> model = DeltaSpliceModel.from_pretrained("multimolecule/deltasplice")
>>> reference = tokenizer("AGCAGUCAUUAUGGCGAAUCUGGCAAGUA", return_tensors="pt")
>>> alternative = tokenizer("AGCAGUCAUUAUGGCUAAUCUGGCAAGUA", return_tensors="pt")
>>> output = model(reference["input_ids"], alternative_input_ids=alternative["input_ids"], use_reference=True)
>>> output["delta"].shape
torch.Size([1, 30, 3])
```
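
To find where a variant changes the prediction most, the `delta` values can be ranked by absolute size. This is a sketch under assumptions: the channel order is taken to be `no_splice`, `acceptor`, `donor`, the helper `largest_changes` is hypothetical, the numbers are invented, and the sign convention of `delta` is not stated here, so the code ranks by magnitude only.

```python
# Assumed index order of the last output dimension.
CHANNELS = ("no_splice", "acceptor", "donor")


def largest_changes(delta, top_k=3):
    """Rank (position, channel, value) entries by absolute delta, largest first."""
    entries = [
        (position, CHANNELS[channel], value)
        for position, row in enumerate(delta)
        for channel, value in enumerate(row)
    ]
    entries.sort(key=lambda entry: abs(entry[2]), reverse=True)
    return entries[:top_k]


# Made-up delta values for three positions.
example_delta = [
    [0.00, 0.01, 0.00],
    [0.3, -0.32, 0.02],
    [-0.05, 0.00, 0.06],
]
print(largest_changes(example_delta, top_k=2))
# [(1, 'acceptor', -0.32), (1, 'no_splice', 0.3)]
```

With the real model, `output["delta"][0].tolist()` would give the nested list for the first batch element.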

### Interface

- **Input**: RNA sequence tokenized with `RnaTokenizer`; `N` is encoded as zero nucleotide channels
- **Output channels**: `no_splice`, `acceptor`, `donor`
- **Reference-only call**: returns per-position splice-site usage probabilities in `probabilities`
- **Reference + alternative call**: pass the reference sequence as `input_ids` and the alternate sequence as `alternative_input_ids`
- **Reference usage**: pass `reference_usage` with shape `(batch_size, sequence_length, 3)` or omit it to use the model's own reference usage as the reference signal
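
The note that `N` is encoded as zero nucleotide channels can be illustrated with a small one-hot sketch. It is not the library's encoder: the channel order `A`, `C`, `G`, `U` and the helper name `one_hot` are assumptions, and the sketch maps every symbol outside that alphabet to an all-zero row.

```python
# Assumed channel order; the library's internal order may differ.
NUCLEOTIDES = "ACGU"


def one_hot(sequence):
    """One-hot encode an RNA string; `N` (or any unknown symbol) becomes all zeros."""
    rows = []
    for base in sequence.upper():
        row = [0.0] * len(NUCLEOTIDES)
        index = NUCLEOTIDES.find(base)
        if index >= 0:
            row[index] = 1.0
        rows.append(row)
    return rows


for base, row in zip("ANU", one_hot("ANU")):
    print(base, row)
# A [1.0, 0.0, 0.0, 0.0]
# N [0.0, 0.0, 0.0, 0.0]
# U [0.0, 0.0, 0.0, 1.0]
```

An all-zero row carries no nucleotide identity, which makes it a natural encoding for unknown bases.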

## Training Details

DeltaSplice was trained to predict splice-site usage from gene sequence and to improve mutation-effect prediction by incorporating reference splice-site usage.

### Training Data

The upstream repository describes training from `gene_dataset.tsu.txt`, which contains splice-site usage in adult brains of eight mammalian species.

### Training Procedure

The official release provides five seed checkpoints with the same architecture and data split. MultiMolecule represents these seed checkpoints as internal ensemble members rather than public model variants.

## Citation

```bibtex
@article{xu2024deltasplice,
  title     = {Reference-informed prediction of alternative splicing and splicing-altering mutations from sequences},
  author    = {Xu, Chencheng and Bao, Suying and Wang, Ye and Li, Wenxing and Chen, Hao and Shen, Yufeng and Jiang, Tao and Zhang, Chaolin},
  journal   = {Genome Research},
  volume    = {34},
  number    = {7},
  pages     = {1052--1065},
  year      = {2024},
  doi       = {10.1101/gr.279044.124}
}
```

> [!NOTE]
> The artifacts distributed in this repository are part of the MultiMolecule project.
> If MultiMolecule supports your research, please cite the MultiMolecule project as follows:

```bibtex
@software{chen_2024_12638419,
  author    = {Chen, Zhiyuan and Zhu, Sophia Y.},
  title     = {MultiMolecule},
  doi       = {10.5281/zenodo.12638419},
  publisher = {Zenodo},
  url       = {https://doi.org/10.5281/zenodo.12638419},
  year      = 2024,
  month     = may,
  day       = 4
}
```

## Contact

Please use the [MultiMolecule GitHub issues](https://github.com/DLS5-Omics/multimolecule/issues) for any questions or comments on the model card.

Please contact the authors of the [DeltaSplice paper](https://doi.org/10.1101/gr.279044.124) for questions or comments on the paper/model.

## License

This model implementation is licensed under the [GNU Affero General Public License](license.md).

For additional terms and clarifications, please refer to our [License FAQ](license-faq.md).

```spdx
SPDX-License-Identifier: AGPL-3.0-or-later
```