Update README.md
README.md CHANGED

@@ -1,21 +1,27 @@
 ---
-tags:
-- generated_from_trainer
 model-index:
 - name: RNAMamba-14M
   results: []
+license: apache-2.0
+datasets:
+- afg1/rnacentral_subset
+pipeline_tag: fill-mask
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 
 # RNAMamba-14M
 
-This model is a
+This model is a small Mamba based model trained from scratch on 1.96 million sequences (1.56 billion bases) extracted from RNAcentral's active sequences FASTA file for release 24 (March 2024).
 
-## Model description
+This is intended to be a sequence embedding model for downstream processing of ncRNA sequences.
+It is trained with a masked language modelling objective, and a context size of 8,192 nucleotides.
+The [dataset](https://huggingface.co/datasets/afg1/rnacentral_subset) has sequences ranging in length from 10 to 8192, so the model should be pretty good at handling sequences in that range.
+This is a deliberately small model with only 14.1 million parameters (8 hidden layers, hidden dim 512, intermediate size 1024) such that it will run fast without a GPU. We may train something bigger if it looks like these embeddings are not good enough.
+
+
+<!--## Model description
+I'll fill this in later...
 
-More information needed
 
 ## Intended uses & limitations
 
@@ -26,6 +32,7 @@ More information needed
 More information needed
 
 ## Training procedure
+-->
 
 ### Training hyperparameters
 
@@ -43,4 +50,4 @@ The following hyperparameters were used during training:
 - Transformers 4.39.3
 - Pytorch 2.2.2+cu118
 - Datasets 2.18.0
-- Tokenizers 0.15.2
+- Tokenizers 0.15.2