---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- genomics
- rna
- nucleotide
- sequence-modeling
- biology
- bioinformatics
- electra
pipeline_tag: feature-extraction
---
# RNAElectra: Single-Nucleotide ELECTRA-Style Pre-training for RNA Representation Learning

RNAElectra is a nucleotide-resolution RNA language model trained using an ELECTRA-style objective for efficient and discriminative representation learning. The model produces contextualized embeddings for RNA sequences and is designed for downstream transcriptomic and regulatory modeling tasks.
## Model Details

- **Model Type**: Transformer-based discriminator model
- **Training Objective**: ELECTRA-style replaced-token detection
- **Resolution**: Single-nucleotide
- **Domain**: RNA and transcriptomic sequences
- **Architecture**: ModernBERT-style backbone adapted for nucleotide sequences

RNAElectra focuses on efficient pre-training by learning to discriminate corrupted tokens rather than reconstruct them, leading to strong representations with improved training efficiency.
## Key Features

- Single-nucleotide tokenization
- Contextual RNA sequence embeddings
- ELECTRA-style discriminative pre-training
- Suitable for RNA function prediction, RBP binding modeling, stability prediction, regulatory element analysis, and downstream fine-tuning tasks
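
To make "single-nucleotide tokenization" concrete, the sketch below maps each base to exactly one token id. The vocabulary, the special tokens, and the `tokenize_nucleotides` helper are hypothetical and chosen only for illustration; the released tokenizer defines its own ids and special tokens.

```python
# Hypothetical vocabulary for illustration only; the released tokenizer
# defines its own token ids and special tokens.
VOCAB = {"[PAD]": 0, "[UNK]": 1, "A": 2, "C": 3, "G": 4, "U": 5}


def tokenize_nucleotides(sequence: str) -> list[int]:
    """Map each nucleotide to one token id (one token per base)."""
    unk = VOCAB["[UNK]"]
    # Upper-case first so "augc" and "AUGC" tokenize identically;
    # anything outside the vocabulary falls back to [UNK].
    return [VOCAB.get(base, unk) for base in sequence.upper()]


ids = tokenize_nucleotides("AUGCAUGC")
print(ids)  # one id per nucleotide, so len(ids) == len("AUGCAUGC")
```

Because there is one token per base, every position in the model output lines up with a single nucleotide in the input, which is what makes per-position analyses possible.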
## Usage

### Basic Feature Extraction

```python
import torch
from transformers import AutoModel

# NucEL_Tokenizer is imported from a local `tokenizer` module; this assumes
# a tokenizer.py that defines it is available on your Python path.
from tokenizer import NucEL_Tokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the released discriminator weights and switch to inference mode.
model = AutoModel.from_pretrained(
    "FreakingPotato/RNAElectra",
    trust_remote_code=True
).to(device)
model.eval()

tokenizer = NucEL_Tokenizer.from_pretrained(
    "FreakingPotato/RNAElectra",
    trust_remote_code=True
)

sequence = "AUGCAUGCAUGCAUGC"
inputs = tokenizer(sequence, return_tensors="pt")
inputs = {k: v.to(device) for k, v in inputs.items()}

# Run a forward pass without tracking gradients.
with torch.no_grad():
    outputs = model(**inputs)

# One embedding vector per token: (batch, sequence_length, hidden_size).
embeddings = outputs.last_hidden_state
print(f"Sequence embeddings shape: {embeddings.shape}")
```
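
The feature-extraction output holds one vector per token. Sequence-level tasks often need a single vector per sequence, and masked mean pooling is one common way to get it. The sketch below is a minimal illustration, assuming the tokenizer output includes an `attention_mask`. The `masked_mean_pool` helper is hypothetical and not part of RNAElectra or Transformers, and toy tensors stand in for real model outputs.

```python
import torch


def masked_mean_pool(last_hidden_state: torch.Tensor,
                     attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over real (non-padding) positions."""
    # (batch, seq_len) -> (batch, seq_len, 1) so it broadcasts over hidden_size
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)   # (batch, hidden_size)
    counts = mask.sum(dim=1).clamp(min=1.0)          # avoid division by zero
    return summed / counts


# Toy stand-ins for model outputs: 1 sequence, 3 positions, hidden size 2.
# The last position is padding and must not affect the average.
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = torch.tensor([[1, 1, 0]])

pooled = masked_mean_pool(hidden, mask)
print(pooled.shape)  # (batch, hidden_size)
```

With real outputs this would be called as `masked_mean_pool(outputs.last_hidden_state, inputs["attention_mask"])`. Any special tokens the tokenizer adds are included in the average unless they are masked out as well.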
## Installation

```bash
pip install transformers torch
```
## Requirements

- transformers >= 5.0.0
- torch >= 2.10.0
- Python >= 3.12.3

GPU is recommended for large-scale inference.
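
A quick way to compare an environment against the minimums listed above is sketched here using only the standard library. The helper names `parse_version` and `check_environment` are hypothetical. The parser is deliberately simple: it keeps the leading numeric components and ignores suffixes such as `+cu121` or `rc1`, so pre-releases are treated like the matching release.

```python
import re
import sys
from importlib import metadata

# Minimum versions as listed in the Requirements section.
MINIMUMS = {"transformers": "5.0.0", "torch": "2.10.0"}
MIN_PYTHON = (3, 12, 3)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.10.0+cu121' into (2, 10, 0), ignoring non-numeric suffixes."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def check_environment() -> list[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:3] < MIN_PYTHON:
        problems.append(f"Python {sys.version.split()[0]} is older than the listed minimum")
    for name, minimum in MINIMUMS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        if parse_version(installed) < parse_version(minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


for problem in check_environment():
    print(problem)
```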
## Pre-training Overview

RNAElectra was trained with an ELECTRA-style generator-discriminator framework. A generator fills in masked positions with plausible nucleotides, and a discriminator learns to detect which tokens were replaced. Only the discriminator weights are released in this repository. Because the discriminator receives a training signal at every position rather than only at the masked ones, this objective improves training efficiency compared to masked language modeling while preserving strong contextual representations.
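
The replaced-token detection setup can be illustrated with a toy sketch of how discriminator targets are built. This is not the training code: the `corrupt` helper and the `replace_prob` value are hypothetical, and uniform random sampling stands in for a learned generator. As in ELECTRA, a position where the sampled token happens to equal the original is labeled as original.

```python
import random

NUCLEOTIDES = "ACGU"


def corrupt(sequence: str, replace_prob: float,
            rng: random.Random) -> tuple[str, list[int]]:
    """Randomly replace bases and return (corrupted sequence, labels).

    Label 1 marks a position whose token differs from the original,
    label 0 marks an unchanged position. A real ELECTRA generator samples
    replacements from a learned masked-LM distribution; uniform sampling
    here is only a stand-in.
    """
    corrupted, labels = [], []
    for base in sequence:
        if rng.random() < replace_prob:
            new_base = rng.choice(NUCLEOTIDES)
        else:
            new_base = base
        corrupted.append(new_base)
        # A sampled token identical to the original counts as "original".
        labels.append(int(new_base != base))
    return "".join(corrupted), labels


rng = random.Random(0)  # fixed seed so the example is repeatable
original = "AUGCAUGCAUGCAUGC"
corrupted, labels = corrupt(original, replace_prob=0.15, rng=rng)
print(original)
print(corrupted)
print(labels)
```

The discriminator is trained to predict `labels` from `corrupted`, so every position contributes to the loss, not just the replaced ones.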
## Intended Use

RNAElectra is intended for feature extraction, downstream fine-tuning, and representation learning in RNA and transcriptomic modeling tasks. It is not intended for clinical decision-making or medical diagnostics.

## License

This model is released under the Apache 2.0 License.