| library_name: multimolecule | |
| license: agpl-3.0 | |
| pipeline: mean-ribosome-load | |
| pipeline_tag: other | |
| tags: | |
| - Biology | |
| - RNA | |
| - 5' UTR | |
| - Translation | |
| - rna | |
| widget: | |
| - example_title: microRNA 21 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: ncRNA | |
| task: mean-ribosome-load | |
| text: UAGCUUAUCAGACUGAUGUUGA | |
| - example_title: microRNA 146a | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: ncRNA | |
| task: mean-ribosome-load | |
| text: UGAGAACUGAAUUCCAUGGGUU | |
| - example_title: microRNA 155 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: ncRNA | |
| task: mean-ribosome-load | |
| text: UUAAUGCUAAUCGUGAUAGGGGUU | |
| - example_title: RNA component of mitochondrial RNA processing endoribonuclease | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: ncRNA | |
| task: mean-ribosome-load | |
| text: GGUUCGUGCUGAAGGCCUGUAUCCUAGGCUACACACUGAGGACUCUGUUCCUCCCCUUUCCGCCUAGGGGAAAGUCCCCGGACCUCGGGCAGAGAGUGCCACGUGCAUACGCACGUAGACAUUCCCCGCUUCCCACUCCAAAGUCCGCCAAGAAGCGUAUCCCGCUGAGCGGCGUGGCGCGGGGGCGUCAUCCGUCAGCUCCCUCUAGUUACGCAGGCAGUGCGUGUCCGCGCACCAACCACACGGGGCUCAUUCUCAGCGCGGCUGUAAAAAAAAA | |
| - example_title: 7SK small nuclear RNA | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: ncRNA | |
| task: mean-ribosome-load | |
| text: GGAUGUGAGGGCGAUCUGGCUGCGACAUCUGUCACCCCAUUGAUCGCCAGGGUUGAUUCGGCUGAUCUGGCUGGCUAGGCGGGUGUCCCCUUCCUCCCUCACCGCUCCAUGUGCGUCCCUCCCGAAGCUGCGCGCUCGGUCGAAGAGGACGACCAUCCCCGAUAGAGGAGGACCGGUCUUCGGUCAAGGGUAUACGAGUAGCUGCGCUCCCCUGCUAGAACCUCCAAACAAGCUCUCAAGGUCCAUUUGUAGGAGAACGUAGGGUAGUCAAGCUUCCAAGACUCCAGACACAUCCAAAUGAGGCGCUGCAUGUGGCAGUCUGCCUUUCUUUU | |
| - example_title: telomerase RNA component | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: ncRNA | |
| task: mean-ribosome-load | |
| text: GGGUUGCGGAGGGUGGGCCUGGGAGGGGUGGUGGCCAUUUUUUGUCUAACCCUAACUGAGAAGGGCGUAGGCGCCGUGCUUUUGCUCCCCGCGCGCUGUUUUUCUCGCUGACUUUCAGCGGGCGGAAAAGCCUCGGCCUGCCGCCUUCCACCGUUCAUUCUAGAGCAAACAAAAAAUGUCAGCUGCUGGCCCGUUCGCCCCUCCCGGGGACCUGCGGCGGGUCGCCUGCCCAGCCCCCGAACCCCGCCUGGAGGCCGCGGUCGGCCCGGGGCUUCUCCGGAGGCACCCACUGCCACCGCGAAGAGUUGGGCUCUGUCAGCCGCGGGUCUCUCGGGGGCGAGGGCGAGGUUCAGGCCUUUCAGGCCGCAGGAAGAGGAACGGAGCGAGUCCCCGCGCGCGGCGCGAUUCCCUGAGCUGUGGGACGUGCACCCAGGACUCGGCUCACACAUGC | |
| - example_title: vault RNA 2-1 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: ncRNA | |
| task: mean-ribosome-load | |
| text: CGGGUCGGAGUUAGCUCAAGCGGUUACCUCCUCAUGCCGGACUUUCUAUCUGUCCAUCUCUGUGCUGGGGUUCGAGACCCGCGGGUGCUUACUGACCCUUUUAUGCAA | |
| - example_title: brain cytoplasmic RNA 1 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: ncRNA | |
| task: mean-ribosome-load | |
| text: GGCCGGGCGCGGUGGCUCACGCCUGUAAUCCCAGCUCUCAGGGAGGCUAAGAGGCGGGAGGAUAGCUUGAGCCCAGGAGUUCGAGACCUGCCUGGGCAAUAUAGCGAGACCCCGUUCUCCAGAAAAAGGAAAAAAAAAAACAAAAGACAAAAAAAAAAUAAGCGUAACUUCCCUCAAAGCAACAACCCCCCCCCCCCUUU | |
| - example_title: HIV-1 TAR-WT | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: ncRNA | |
| task: mean-ribosome-load | |
| text: GGUCUCUCUGGUUAGACCAGAUCUGAGCCUGGGAGCUCUCUGGCUAACUAGGGAACC | |
| - example_title: prion protein (Kanno blood group) | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: mRNA | |
| task: mean-ribosome-load | |
| text: AUGGCGAACCUUGGCUGCUGGAUGCUGGUUCUCUUUGUGGCCACAUGGAGUGACCUGGGCCUCUGC | |
| - example_title: interleukin 10 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: mRNA | |
| task: mean-ribosome-load | |
| text: AUGCACAGCUCAGCACUGCUCUGUUGCCUGGUCCUCCUGACUGGGGUGAGGGCC | |
| - example_title: Zaire ebolavirus | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: mRNA | |
| task: mean-ribosome-load | |
| text: AAUGUUCAAACACUUUGUGAAGCUCUGUUAGCUGAUGGUCUUGCUAAAGCAUUUCCUAGCAAUAUGAUGGUAGUCACAGAGCGUGAGCAAAAAGAAAGCUUAUUGCAUCAAGCAUCAUGGCACCACACAAGUGAUGAUUUUGGUGAGCAUGCCACAGUUAGAGGGAGUAGCUUUGUAACUGAUUUAGAGAAAUACAAUCUUGCAUUUAGAUAUGAGUUUACAGCACCUUUUAUAGAAUAUUGUAACCGUUGCUAUGGUGUUAAGAAUGUUUUUAAUUGGAUGCAUUAUACAAUCCCACAGUGUUAU | |
| - example_title: SARS coronavirus | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: mRNA | |
| task: mean-ribosome-load | |
| text: AUGUUUAUUUUCUUAUUAUUUCUUACUCUCACUAGUGGUAGUGACCUUGACCGGUGCACCACUUUUGAUGAUGUUCAAGCUCCUAAUUACACUCAACAUACUUCAUCUAUGAGGGGGGUUUACUAUCCUGAUGAAAUUUUUAGAUCAGACACUCUUUAUUUAACUCAGGAUUUAUUUCUUCCAUUUUAUUCUAAUGUUACAGGGUUUCAUACUAUUAAUCAUACGUUUGACAACCCUGUCAUACCUUUUAAGGAUGGUAUUUAUUUUGCUGCCACAGAGAAAUCAAAUGUUGUCCGUGGUUGGGUUUUUGGUUCUACCAUGAACAACAAGUCACAGUCGGUGAUUAUUAUUAACAAUUCUACUAAUGUUGUUAUACGAGCAUGUAACUUUGAAUUGUGUGACAACCCUUUCUUUGCUGUUUCUAAACCCAUGGGUACACAGACACAUACUAUGAUAUUCGAUAAUGCAUUUAAAUGCACUUUCGAGUACAUAUCU | |
| - example_title: insulin | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: mRNA | |
| task: mean-ribosome-load | |
| text: AUGGCCCUGUGGAUGCGCCUCCUGCCCCUGCUGGCGCUGCUGGCCCUCUGGGGACCUGACCCAGCCGCAGCCUUUGUGAACCAACACCUGUGCGGCUCACACCUGGUGGAAGCUCUCUACCUAGUGUGCGGGGAACGAGGCUUCUUCUACACACCCAAGACCCGCCGGGAGGCAGAGGACCUGCAGGUGGGGCAGGUGGAGCUGGGCGGGGGCCCUGGUGCAGGCAGCCUGCAGCCCUUGGCCCUGGAGGGGUCCCUGCAGAAGCGUGGCAUUGUGGAACAAUGCUGUACCAGCAUCUGCUCCCUCUACCAGCUGGAGAACUACUGCAACUAG | |
| - example_title: cyclin dependent kinase inhibitor 2A | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: mRNA | |
| task: mean-ribosome-load | |
| text: AUGGAGCCGGCGGCGGGGAGCAGCAUGGAGCCUUCGGCUGACUGGCUGGCCACGGCCGCGGCCCGGGGUCGGGUAGAGGAGGUGCGGGCGCUGCUGGAGGCGGGGGCGCUGCCCAACGCACCGAAUAGUUACGGUCGGAGGCCGAUCCAGGUCAUGAUGAUGGGCAGCGCCCGAGUGGCGGAGCUGCUGCUGCUCCACGGCGCGGAGCCCAACUGCGCCGACCCCGCCACUCUCACCCGACCCGUGCACGACGCUGCCCGGGAGGGCUUCCUGGACACGCUGGUGGUGCUGCACCGGGCCGGGGCGCGGCUGGACGUGCGCGAUGCCUGGGGCCGUCUGCCCGUGGACCUGGCUGAGGAGCUGGGCCAUCGCGAUGUCGCACGGUACCUGCGCGCGGCUGCGGGGGGCACCAGAGGCAGUAACCAUGCCCGCAUAGAUGCCGCGGAAGGUCCCUCAGACAUCCCCGAUUGA | |
| - example_title: human papillomavirus type 16 E6 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: mRNA | |
| task: mean-ribosome-load | |
| text: AUGCACCAAAAGAGAACUGCAAUGUUUCAGGACCCACAGGAGCGACCCAGAAAGUUACCACAGUUAUGCACAGAGCUGCAAACAACUAUACAUGAUAUAAUAUUAGAAUGUGUGUACUGCAAGCAACAGUUACUGCGACGUGAGGUAUAUGACUUUGCUUUUCGGGAUUUAUGCAUAGUAUAUAGAGAUGGGAAUCCAUAUGCUGUAUGUGAUAAAUGUUUAAAGUUUUAUUCUAAAAUUAGUGAGUAUAGACAUUAUUGUUAUAGUUUGUAUGGAACAACAUUAGAACAGCAAUACAACAAACCGUUGUGUGAUUUGUUAAUUAGGUGUAUUAACUGUCAAAAGCCACUGUGUCCUGAAGAAAAGCAAAGACAUCUGGACAAAAAGCAAAGAUUCCAUAAUAUAAGGGGUCGGUGGACCGGUCGAUGUAUGUCUUGUUGCAGAUCAUCAAGAACACGUAGAGAAACCCAGCUGUAA | |
| - example_title: NRAS proto-oncogene | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: 5' UTR | |
| task: mean-ribosome-load | |
| text: GGGGCCGGAAGUGCCGCUCCUUGGUGGGGGCUGUUCAUGGCGGUUCCGGGGUCUCCAACAUUUUUCCCGGCUGUGGUCCUAAAUCUGUCCAAAGCAGAGGCAGUGGAGCUUGAGGUUCUUGCUGGUGUGAA | |
| - example_title: amyloid beta precursor protein | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: 5' UTR | |
| task: mean-ribosome-load | |
| text: GUCAGUUUCCUCGGCAGCGGUAGGCGAGAGCACGCGGAGGAGCGUGCGCGGGGGCCCCGGGAGACGGCGGCGGUGGCGGCGCGGGCAGAGCAAGGACGCGGCGGAUCCCACUCGCACAGCAGCGCACUCGGUGCCCCGCGCAGGGUCGCG | |
| - example_title: RUNX family transcription factor 1 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: 5' UTR | |
| task: mean-ribosome-load | |
| text: ACUUCUUUGGGCCUCAUAAACAACCACAGAACCACAAGUUGGGUAGCCUGGCAGUGUCAGAAGUCUGAACCCAGCAUAGUGGUCAGCAGGCAGGACGAAUCACACUGAAUGCAAACCACAGGGUUUCGCAGCGUGGUAAAAGAAAUCAUUGAGUCCCCCGCCUUCAGAAGAGGGUGCAUUUUCAGGAGGAAGCG | |
| - example_title: fragile X messenger ribonucleoprotein 1 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: 5' UTR | |
| task: mean-ribosome-load | |
| text: CUCAGUCAGGCGCUCAGCUCCGUUUCGGUUUCACUUCCGGUGGAGGGCCGCCUCUGAGCGGGCGGCGGGCCGACGGCGAGCGCGGGCGGCGGCGGUGACGGAGGCGCCGCUGCCAGGGGGCGUGCGGCAGCGCGGCGGCGGCGGCGGCGGCGGCGGCGGCGGAGGCGGCGGCGGCGGCGGCGGCGGCGGCGGCUGGGCCUCGAGCGCCCGCAGCCCACCUCUCGGGGGCGGGCUCCCGGCGCUAGCAGGGCUGAAGAGAAG | |
| - example_title: MYC proto-oncogene | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: 5' UTR | |
| task: mean-ribosome-load | |
| text: AACUCGCUGUAGUAAUUCCAGCGAGAGGCAGAGGGAGCGAGCGGGCGGCCGGCUAGGGUGGAAGAGCCGGGCGAGCAGAGCUGCGCUGCGGGCGUCCUGGGAAGGGAGAUCCGGAGCGAAUAGGGGGCUUCGCCUCUGGCCCAGCCCUCCCGCUGAUCCCCCAGCCAGCGGUCCGCAACCCUUGCCGCAUCCACGAAACUUUGCCCAUAGCAGCGGGCGGGCACUUUGCACUGGAACUUACAACACCCGAGCAAGGACGCGACUCUCCCGACGCGGGGAGGCUAUUCUGCCCAUUUGGGGACACUUCCCCGCCGCUGCCAGGACCCGCUUCUCUGAAAGGCUCUCCUUGCAGCUGCUUAGACG | |
| - example_title: activating transcription factor 4 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: 5' UTR | |
| task: mean-ribosome-load | |
| text: CAUUUCUACUUUGCCCGCCCACAGAUGUAGUUUUCUCUGCGCGUGUGCGUUUUCCCUCCUCCCCGCCCUCAGGGUCCACGGCCACCAUGGCGUAUUAGGGGCAGCAGUGCCUGCGGCAGCAUUGGCCUUUGCAGCGGCGGCAGCAGCACCAGGCUCUGCAGCGGCAACCCCCAGCGGCUUAAGCCAUGGCGCUUCUCACGGCAUUCAGCAGCAGCGUUGCUGUAACCGACAAAGACACCUUCGAAUUAAGCACAUUCCUCGAUUCCAGCAAAGCACCGCAAC | |
| - example_title: Human GPI protein p137 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: 3' UTR | |
| task: mean-ribosome-load | |
| text: UUUUUAAAAGGAAAAGAUACCAAAUGCCUGCUGCUACCACCCUUUUCAAUUGCUAUGUUUUGAAAGGCACCAGUAUGUGUUUUAGAUUGAUUUAAAUGUUUCAUUUAAAUCACGGACAGUAGUUUCAGUUCUGAUGGUAUAAGCAAAACAAAUAAAACGUUUAUAAAAGUUGUAUCUUGAAACACUGGUGUUCAACAGCUAGCAGCUUAUGUGAUUCACCCCAUGCCACGUUAGUGUCACAAAUUUUAUGGUUUAUCUCCAGCAACAUUUCUCUAGUACUUGCACUUAUUAUCUGAAUUC | |
| - example_title: nucleophosmin 1 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: 3' UTR | |
| task: mean-ribosome-load | |
| text: GAAAAUAGUUUAAACAAUUUGUUAAAAAAUUUUCCGUCUUAUUUCAUUUCUGUAACAGUUGAUAUCUGGCUGUCCUUUUUAUAAUGCAGAGUGAGAACUUUCCCUACCGUGUUUGAUAAAUGUUGUCCAGGUUCUAUUGCCAAGAAUGUGUUGUCCAAAAUGCCUGUUUAGUUUUUAAAGAUGGAACUCCACCCUUUGCUUGGUUUUAAGUAUGUAUGGAAUGUUAUGAUAGGACAUAGUAGUAGCGGUGGUCAGACAUGGAAAUGGUGGGGAGACAAAAAUAUACAUGUGAAAUAAAACUCAGUAUUUUAAUAAAGUAGCACGGUUUCUAUUGA | |
| - example_title: superoxide dismutase 1 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: 3' UTR | |
| task: mean-ribosome-load | |
| text: ACAUUCCCUUGGAUGUAGUCUGAGGCCCCUUAACUCAUCUGUUAUCCUGCUAGCUGUAGAAAUGUAUCCUGAUAAACAUUAAACACUGUAAUCUUAAAAGUGUAAUUGUGUGACUUUUUCAGAGUUGCUUUAAAGUACCUGUAGUGAGAAACUGAUUUAUGAUCACUUGGAAGAUUUGUAUAGUUUUAUAAAACUCAGUUAAAAUGUCUGUUUCAAUGACCUGUAUUUUGCCAGACUUAAAUCACAGAUGGGUAUUAAACUUGUCAGAAUUUCUUUGUCAUUCAAGCCUGUGAAUAAAAACCCUGUAUGGCACUUAUUAUGAGGCUAUUAAAAGAAUCCAAAUUCAAACUAAA | |
| - example_title: hemoglobin subunit alpha 2 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: 3' UTR | |
| task: mean-ribosome-load | |
| text: CUGGAGCCUCGGUAGCCGUUCCUCCUGCCCGCUGGGCCUCCCAACGGGCCCUCCUCCCCUCCUUGCACCGGCCCUUCCUGGUCUUUGAAUAAAGUCUGAGUGGGCAGCA | |
| - example_title: BRAF proto-oncogene | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: 3' UTR | |
| task: mean-ribosome-load | |
| text: AACAAAUGAGUGAGAGAGUUCAGGAGAGUAGCAACAAAAGGAAAAUAAAUGAACAUAUGUUUGCUUAUAUGUUAAAUUGAAUAAAAUACUCUCUUUUUUUUUAAGGUGAACCAAAGAACACUUGUGUGGUUAAAGACUAGAUAUAAUUUUUCCCCAAACUAAAAUUUAUACUUAACAUUGGAUUUUUAACAUCCAAGGGUUAAAAUACAUAGACAUUGCUAAAAAUUGGCAGAGCCUCUUCUAGAGGCUUUACUUUCUGUUCCGGGUUUGUAUCAUUCACUUGGUUAUUUUAAGUAGUAAACUUCAGUUUCUCAUGCAACUUUUGUUGCCAGCUAUCACAUGUCCACUAGGGACUCCAGAAGAAGACCCUACCUAUGCCUGUGUUUGCAGGUGAGAAGUUGGCAGUCGGUUAGCCUGGG | |
| - example_title: H3 clustered histone 1 | |
| pipeline_tag: mean-ribosome-load | |
| sequence_type: 3' UTR | |
| task: mean-ribosome-load | |
| text: UUACUGUGGUCUCUCUGACGGUCCAAGCAAAGGCUCUUUUCAGAGCCACCACCUUUUC | |
| --- | |
| # OptMRL | |
| Convolutional neural network for predicting the mean ribosome load (MRL) of an mRNA from the 50 nucleotides upstream of the coding sequence. | |
| ## Disclaimer | |
| This is an UNOFFICIAL implementation of [Interpreting Deep Neural Networks for the Prediction of Translation Rates](https://doi.org/10.1101/2023.06.02.543405) by Frederick Korbel et al. | |
| The OFFICIAL repository of OptMRL is at [ohlerlab/mlcis](https://github.com/ohlerlab/mlcis). | |
| > [!TIP] | |
| > The MultiMolecule team has confirmed that the provided model and checkpoints produce the same intermediate representations as the original implementation. | |
| **The team releasing OptMRL did not write a model card for this model, so this model card was written by the MultiMolecule team.** | |
| ## Model Details | |
| OptMRL is a small 1D convolutional neural network trained to predict the mean ribosome load (MRL), a polysome-profiling-derived translation efficiency proxy, from the 50 nucleotides of 5' untranslated region (5'UTR) sequence immediately upstream of the coding sequence. The model was first pre-trained on roughly 260,000 random 5'UTR reporters and then fine-tuned on roughly 20,000 endogenous human 5'UTRs. Please refer to the [Training Details](#training-details) section for more information on the training process. | |
| The architecture is a stack of three `Conv1D` layers (120 filters, kernel size 8, `same` padding, ReLU activation) followed by a `Flatten`, a 40-unit `Dense` bottleneck with ReLU activation and dropout, and a final scalar `Dense` regression head. | |
| ### Model Specification | |
| | Num Layers | Hidden Size | Num Parameters (M) | FLOPs (M) | MACs (M) | Max Num Tokens | | |
| | ---------- | ----------- | ------------------ | --------- | -------- | -------------- | | |
| | 5 | 40 | 0.476 | 24.04 | 12.00 | 50 | | |
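The figures in the table can be roughly cross-checked from the architecture description alone. The sketch below is a back-of-the-envelope count, not the library's own accounting: it assumes a 4-channel one-hot input (A, C, G, U), bias terms on every layer, and one multiply-accumulate per weight per output position. The helper names (`conv1d_params`, `dense_params`, `count`) are hypothetical.

```python
# Rough parameter / MAC count for the layer stack described above.
# ASSUMPTIONS (not confirmed by the model card): 4 input channels,
# biases on every layer, "same" padding counted as full-length output.

SEQ_LEN = 50   # fixed input window (from the model card)
FILTERS = 120  # filters per Conv1D layer
KERNEL = 8     # Conv1D kernel size
HIDDEN = 40    # width of the Dense bottleneck


def conv1d_params(in_ch, out_ch, kernel):
    """Weights plus biases of a 1D convolution."""
    return in_ch * kernel * out_ch + out_ch


def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features


def count(in_channels=4):
    """Return (parameters, MACs) for 3x Conv1D -> Flatten -> Dense -> Dense."""
    params, macs, ch = 0, 0, in_channels
    for _ in range(3):
        params += conv1d_params(ch, FILTERS, KERNEL)
        macs += SEQ_LEN * ch * KERNEL * FILTERS  # output length stays 50
        ch = FILTERS
    flat = SEQ_LEN * FILTERS                     # Flatten: 50 * 120 features
    params += dense_params(flat, HIDDEN)
    macs += flat * HIDDEN
    params += dense_params(HIDDEN, 1)            # scalar regression head
    macs += HIDDEN
    return params, macs


params, macs = count()
print(f"{params / 1e6:.3f} M parameters, {macs / 1e6:.2f} M MACs")
```

Under these assumptions the count comes to 474,681 parameters and about 11.95 M MACs, close to but not identical to the 0.476 M and 12.00 M reported in the table. The small gap presumably reflects implementation details not stated here, such as the exact number of input channels.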
| ### Links | |
| - **Code**: [multimolecule.optmrl](https://github.com/DLS5-Omics/multimolecule/tree/master/multimolecule/models/optmrl) | |
| - **Data**: 260,000 random 5'UTR reporters (pre-training) + 20,000 human 5'UTR reporters (fine-tuning) | |
| - **Paper**: [Interpreting Deep Neural Networks for the Prediction of Translation Rates](https://doi.org/10.1101/2023.06.02.543405) | |
| - **Developed by**: Frederick Korbel, Ekaterina Eroshok, Uwe Ohler | |
| - **Model type**: 1D CNN for mean-ribosome-load regression from 5'UTR sequence | |
| - **Original Repository**: [ohlerlab/mlcis](https://github.com/ohlerlab/mlcis) | |
| ## Usage | |
| The model file depends on the [`multimolecule`](https://multimolecule.danling.org) library. You can install it using pip: | |
| ```bash | |
| pip install multimolecule | |
| ``` | |
| ### Direct Use | |
| #### Mean Ribosome Load Prediction | |
| You can use this model directly to predict the mean ribosome load of a 50-nucleotide 5'UTR window: | |
| ```python | |
| >>> from multimolecule import RnaTokenizer, OptMrlForSequencePrediction | |
| >>> tokenizer = RnaTokenizer.from_pretrained("multimolecule/optmrl") | |
| >>> model = OptMrlForSequencePrediction.from_pretrained("multimolecule/optmrl") | |
| >>> sequence = "ACGU" * 12 + "AC" # 50 nt | |
| >>> inputs = tokenizer(sequence, add_special_tokens=False, return_tensors="pt") | |
| >>> outputs = model(**inputs) | |
| >>> outputs.logits.shape | |
| torch.Size([1, 1]) | |
| ``` | |
| ### Interface | |
| - **Input length**: 50 nt fixed 5'UTR window taken immediately upstream of the coding sequence | |
| - **Padding**: shorter sequences are right-padded with zeros to 50 nt; longer sequences are truncated to the first 50 nt | |
| - **Alphabet**: `ACGU` only; unknown / `N` tokens contribute zero one-hot signal | |
| - **Special tokens**: do not add (`add_special_tokens=False`) | |
| - **Output**: single scalar mean-ribosome-load (MRL) score per window | |
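The padding, truncation, and alphabet rules listed above can be illustrated with a small stand-alone encoder. This is a sketch of the described behaviour only, not the code used by `multimolecule`: the function name `encode_window` is hypothetical, and the `ACGU` channel order is an assumption.

```python
WINDOW = 50
ALPHABET = "ACGU"  # ASSUMPTION: channel order; the real model may order bases differently


def encode_window(sequence, window=WINDOW):
    """One-hot encode an RNA sequence into a `window` x 4 list of floats.

    - "T" is mapped to "U"
    - sequences longer than `window` keep only their first `window` nt
    - shorter sequences are right-padded with all-zero rows
    - any symbol outside ACGU (for example "N") becomes an all-zero row
    """
    seq = sequence.upper().replace("T", "U")[:window]
    rows = [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq]
    rows.extend([0.0] * len(ALPHABET) for _ in range(window - len(rows)))
    return rows


matrix = encode_window("ACGUN")
print(len(matrix), matrix[0], matrix[4])
```

Here the fifth position (`N`) and every padded position carry no signal, so the network sees them as identical all-zero columns.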
| ## Training Details | |
| OptMRL was first pre-trained on a large random-5'UTR reporter library and then fine-tuned on a smaller library of endogenous human 5'UTRs. | |
| ### Training Data | |
| - **Pre-training**: ~260,000 random 5'UTR reporters paired with polysome-profiling MRL measurements. | |
| - **Fine-tuning**: ~20,000 endogenous human 5'UTR reporters paired with polysome-profiling MRL measurements. | |
| Each reporter contributes a 50-nucleotide 5'UTR window immediately upstream of the coding sequence and a scalar MRL label. | |
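When starting from a full transcript rather than a pre-cut reporter, the 50-nucleotide window has to be extracted first. The helper below is a minimal sketch of that step. The name `upstream_window` and its 0-based `cds_start` convention are assumptions made for illustration, not part of the `multimolecule` API.

```python
def upstream_window(transcript, cds_start, width=50):
    """Return up to `width` nucleotides immediately 5' of the coding sequence.

    transcript: full mRNA sequence (5'UTR followed by the CDS)
    cds_start:  0-based index of the first nucleotide of the start codon
                (hypothetical convention chosen for this sketch)
    """
    if not 0 <= cds_start <= len(transcript):
        raise ValueError("cds_start must lie within the transcript")
    start = max(0, cds_start - width)  # UTRs shorter than `width` are returned whole
    return transcript[start:cds_start]


# Toy transcript: a 70 nt 5'UTR followed by the start of a CDS.
utr = "GGAC" * 17 + "GC"
transcript = utr + "AUGGCCAAG"
window = upstream_window(transcript, cds_start=len(utr))
print(len(window), window[-4:])
```

Cutting the window explicitly avoids relying on truncation when the 5'UTR is longer than 50 nt, since the nucleotides of interest are the ones nearest the start codon.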
| Note that [`RnaTokenizer`][multimolecule.RnaTokenizer] converts "T"s to "U"s by default. To disable this behaviour, pass `replace_T_with_U=False`. | |
| ### Training Procedure | |
| #### Pre-training | |
| The model was first trained on a regression task, predicting the measured MRL of each random 5'UTR reporter, and then fine-tuned end-to-end on the human 5'UTR reporters with the same regression objective. The published model is the fine-tuned model. | |
| ## Citation | |
| ```bibtex | |
| @article{korbel2023interpreting, | |
| author = {Korbel, Frederick and Eroshok, Ekaterina and Ohler, Uwe}, | |
| title = {Interpreting Deep Neural Networks for the Prediction of Translation Rates}, | |
| journal = {bioRxiv}, | |
| publisher = {Cold Spring Harbor Laboratory}, | |
| year = {2023}, | |
| doi = {10.1101/2023.06.02.543405} | |
| } | |
| ``` | |
| > [!NOTE] | |
| > The artifacts distributed in this repository are part of the MultiMolecule project. | |
| > If MultiMolecule supports your research, please cite the MultiMolecule project as follows: | |
| ```bibtex | |
| @software{chen_2024_12638419, | |
| author = {Chen, Zhiyuan and Zhu, Sophia Y.}, | |
| title = {MultiMolecule}, | |
| doi = {10.5281/zenodo.12638419}, | |
| publisher = {Zenodo}, | |
| url = {https://doi.org/10.5281/zenodo.12638419}, | |
| year = 2024, | |
| month = may, | |
| day = 4 | |
| } | |
| ``` | |
| ## Contact | |
| Please open a GitHub issue on [MultiMolecule](https://github.com/DLS5-Omics/multimolecule/issues) for any questions or comments on the model card. | |
| Please contact the authors of the [OptMRL paper](https://doi.org/10.1101/2023.06.02.543405) for questions or comments on the paper/model. | |
| ## License | |
| This model implementation is licensed under the [GNU Affero General Public License](license.md). | |
| For additional terms and clarifications, please refer to our [License FAQ](license-faq.md). | |
| ```spdx | |
| SPDX-License-Identifier: AGPL-3.0-or-later | |
| ``` |