afg1 committed
Commit c8c74c9 · verified · 1 parent: a1d596e

Model save

Files changed (3):
  1. README.md +8 -16
  2. model.safetensors +1 -1
  3. training_args.bin +1 -1
README.md CHANGED
@@ -1,28 +1,21 @@
 ---
+tags:
+- generated_from_trainer
 model-index:
 - name: RNAMamba-14M
   results: []
-license: apache-2.0
-datasets:
-- afg1/rnacentral_subset
-pipeline_tag: fill-mask
-inference: false
 ---
 
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
 
 # RNAMamba-14M
 
-This model is a small Mamba based model trained from scratch on 1.96 million sequences (1.56 billion bases) extracted from RNAcentral's active sequences FASTA file for release 24 (March 2024).
+This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 
-This is intended to be a sequence embedding model for downstream processing of ncRNA sequences.
-It is trained with a masked language modelling objective, and a context size of 8,192 nucleotides.
-The [dataset](https://huggingface.co/datasets/afg1/rnacentral_subset) has sequences ranging in length from 10 to 8192, so the model should be pretty good at handling sequences in that range.
-This is a deliberately small model with only 14.1 million parameters (8 hidden layers, hidden dim 512, intermediate size 1024) such that it will run fast without a GPU. We may train something bigger if it looks like these embeddings are not good enough.
-
-
-<!--## Model description
-I'll fill this in later...
+## Model description
 
+More information needed
 
 ## Intended uses & limitations
 
@@ -33,7 +26,6 @@ More information needed
 More information needed
 
 ## Training procedure
--->
 
 ### Training hyperparameters
 
@@ -51,4 +43,4 @@ The following hyperparameters were used during training:
 - Transformers 4.39.3
 - Pytorch 2.2.2+cu118
 - Datasets 2.18.0
-- Tokenizers 0.15.2
+- Tokenizers 0.15.2
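The removed README text describes RNAMamba-14M as a sequence embedding model trained with a masked language modelling objective. As an illustration of the downstream embedding step it alludes to, here is a minimal, hypothetical sketch of collapsing per-token hidden states into one fixed-length vector by mean pooling over non-padding positions; the function name, the NumPy arrays, and the mask convention are assumptions for this sketch, not anything defined in the repository.

```python
import numpy as np

def mean_pool_embedding(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average per-token hidden states (seq_len, hidden_dim) over the
    positions where attention_mask == 1, yielding one (hidden_dim,) vector.

    Hypothetical helper: the model card does not prescribe a pooling scheme.
    """
    mask = attention_mask[:, None].astype(hidden_states.dtype)  # (seq_len, 1)
    summed = (hidden_states * mask).sum(axis=0)                 # (hidden_dim,)
    counts = mask.sum()                                         # number of real tokens
    return summed / np.maximum(counts, 1.0)                     # avoid divide-by-zero

# Toy example: 6 token positions (last 2 padding), hidden dim 512 as in the card.
rng = np.random.default_rng(0)
states = rng.standard_normal((6, 512))
mask = np.array([1, 1, 1, 1, 0, 0])
emb = mean_pool_embedding(states, mask)
print(emb.shape)  # prints (512,)
```

Mean pooling is only one reasonable choice; taking the final-position state is a common alternative for state-space models like Mamba.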
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ad6184c990246aa1ec418cdb4be417990a57c548487463855a2c07fee3de32d1
+oid sha256:a238e0813bbfb570b18ff89c92cd63f03935cbb8c9282a250c533df673dca66d
 size 56398500
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:761a44663ae5eeea6c7af0dd949641cb77ee38b0a5cde871ed4a6fa638def7f5
+oid sha256:d62db5204d4bd6d9c7f84466a6ad35588667099a445a3b5d547f96a5373c5955
 size 4920
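The model.safetensors and training_args.bin diffs above touch Git LFS pointer files, not the binaries themselves: only the `oid` (SHA-256 of the content) changes while `size` stays constant, i.e. the weights were overwritten by a file of identical length. As a sketch of how such a pointer can be read, here is a small parser for the three-line `key value` format shown in the diff (the helper name is an assumption; the format itself is the Git LFS v1 pointer spec referenced on the `version` line).

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file's 'key value' lines into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # split on the first space only
        fields[key] = value
    return fields

# The new model.safetensors pointer from this commit, verbatim.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a238e0813bbfb570b18ff89c92cd63f03935cbb8c9282a250c533df673dca66d
size 56398500
"""
info = parse_lfs_pointer(pointer)
print(info["size"])  # prints 56398500
```

Comparing `oid` values between two pointers is enough to tell whether the underlying LFS object changed without downloading it.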