+---
+tags:
+- generated_from_trainer
+model-index:
+- name: RNAMamba-14M
+  results: []
+---
+# RNAMamba-14M
+This is a small Mamba-based model trained from scratch on 1.96 million sequences (1.56 billion bases) extracted from RNAcentral's active sequences FASTA file for release 24 (March 2024).
+This is intended to be a sequence embedding model for downstream processing of ncRNA sequences.
+It was trained with a masked language modelling (MLM) objective and a context size of 8,192 nucleotides. This particular checkpoint has the MLM head stripped off, so it should be almost ready to use for embedding.
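One common way to turn per-position hidden states into a single sequence embedding is masked mean pooling. The sketch below is an assumption about downstream use, not something this card prescribes. `mean_pool` is a hypothetical helper, and it takes plain Python lists so that it runs without the model. In practice the hidden states would come from the model and the mask would mark which positions are real nucleotides rather than padding.

```python
def mean_pool(hidden_states, mask):
    """Average the hidden vectors at positions where mask is 1.

    hidden_states: list of per-position vectors (lists of floats).
    mask: list of 0/1 flags, 1 for real nucleotides, 0 for padding.
    """
    if len(hidden_states) != len(mask):
        raise ValueError("hidden_states and mask must have the same length")
    # Keep only the vectors at unmasked positions.
    kept = [vec for vec, keep in zip(hidden_states, mask) if keep]
    if not kept:
        raise ValueError("mask excludes every position")
    dim = len(kept[0])
    # Average each hidden dimension over the kept positions.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy example: three positions, hidden size 2, last position is padding.
embedding = mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])
print(embedding)
```

Other pooling choices (for example, taking the last position's hidden state) may suit a Mamba model equally well; which works best for these embeddings has not been evaluated here.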
+The [dataset](https://huggingface.co/datasets/afg1/rnacentral_subset) contains sequences ranging in length from 10 to 8,192 nucleotides, so the model should handle sequences in that range well.
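Because the training data only covers lengths from 10 to 8,192, it may be worth checking input lengths before embedding. This is a minimal sketch, assuming you want to flag out-of-range inputs rather than silently truncate them. `MIN_LEN`, `MAX_LEN`, and `check_length` are hypothetical names, with the limits taken from the range stated above.

```python
MIN_LEN = 10     # shortest sequence length in the training data
MAX_LEN = 8192   # longest sequence length, equal to the context size


def check_length(sequence):
    """Return 'ok', 'too_short', or 'too_long' for a nucleotide string."""
    n = len(sequence)
    if n < MIN_LEN:
        return "too_short"
    if n > MAX_LEN:
        return "too_long"
    return "ok"


# Three toy inputs: 4 nt, 20 nt, and 9,000 nt.
for seq in ["ACGU", "ACGU" * 5, "A" * 9000]:
    print(len(seq), check_length(seq))
```

How to treat out-of-range sequences (skipping, truncating, or windowing) is left to the downstream application.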
+This is a deliberately small model, with only 14.1 million parameters (8 hidden layers, hidden dimension 512, intermediate size 1024), so that it runs quickly without a GPU. We may train a larger model if these embeddings turn out not to be good enough.
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 32
+- eval_batch_size: 8
+- seed: 42
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 1.0
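For reference, a linear schedule decays the learning rate from its initial value to zero over the run. The sketch below assumes no warmup (none is listed above) and a step count of roughly 61,250, derived from about 1.96 million sequences at batch size 32 on a single device with no gradient accumulation. The real count would differ if some sequences were held out for evaluation or if several devices were used. `linear_lr` is a hypothetical helper, not part of the training code.

```python
BASE_LR = 2e-05
# Assumed step count: ~1.96M sequences, batch size 32, one epoch.
TOTAL_STEPS = 1_960_000 // 32


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Learning rate after `step` optimizer updates under linear decay to zero."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


# Learning rate at the start, midpoint, and end of the assumed run.
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step))
```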
+### Framework versions
+- Transformers 4.39.3
+- PyTorch 2.2.2+cu118
+- Datasets 2.18.0
+- Tokenizers 0.15.2