LncPNdeep

LncPNdeep is a long non-coding RNA (lncRNA) classifier that integrates nucleotide-level and peptide-level sequence embeddings. The model combines RNA embeddings from pretrained BigBird/Longformer masked language models with peptide/protein embeddings from ProtBERT/ProtTrans, then applies a downstream neural classifier to distinguish lncRNA from coding RNA.

Code and usage scripts are available on GitHub:

https://github.com/yatoka233/LncPNdeep

Uploaded Weights

This repository stores the model weights used by the LncPNdeep codebase.

| File | Purpose |
| --- | --- |
| weights/rna_pretrain/save.Longformer.pretrain.epoch20.params | PyTorch Longformer RNA MLM checkpoint used to generate Longformer256 nucleotide embeddings. |
| weights/rna_pretrain/save.bigbird.pretrain.epoch20.params | PyTorch BigBird RNA MLM checkpoint used to generate Bigbird256 nucleotide embeddings. |
| weights/rna_pretrain/save.bigbird_full.pretrain.epoch20.params | PyTorch BigBird RNA MLM checkpoint used to generate Bigbird768 nucleotide embeddings. |
| weights/final_classifier/ProteinTransAllfeature_ResCNN2_07_08.h5 | TensorFlow/Keras final classifier that combines RNA and peptide embeddings. |

Model Inputs

The final classifier expects six precomputed embeddings:

  1. Average peptide embedding
  2. Fake peptide embedding
  3. Max peptide embedding
  4. BigBird256 RNA embedding
  5. BigBird768 RNA embedding
  6. Longformer256 RNA embedding

The RNA embedding checkpoints are used to generate the nucleotide embeddings. Peptide embeddings are generated using ProtBERT/ProtTrans in the accompanying GitHub code.
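
As a rough illustration of how the six inputs might be organized before they reach the classifier, the sketch below collects them in the order listed above and checks that they all cover the same number of sequences. The key names (`peptide_avg`, `bigbird256`, and so on) and the `assemble_inputs` helper are hypothetical and not part of the LncPNdeep code. Whether the Keras model's input layers follow this same order is an assumption that should be verified against the GitHub repository.

```python
# Order taken from the list above; the key names are hypothetical labels.
INPUT_ORDER = [
    "peptide_avg",
    "peptide_fake",
    "peptide_max",
    "bigbird256",
    "bigbird768",
    "longformer256",
]


def assemble_inputs(embeddings):
    """Return the six embeddings as a list in INPUT_ORDER.

    `embeddings` maps each label to a batch of vectors (nested lists or
    arrays whose first dimension is the number of sequences).
    """
    missing = [name for name in INPUT_ORDER if name not in embeddings]
    if missing:
        raise KeyError(f"missing embeddings: {missing}")

    # All six inputs must describe the same set of sequences.
    batch_sizes = {len(embeddings[name]) for name in INPUT_ORDER}
    if len(batch_sizes) != 1:
        raise ValueError(f"inconsistent batch sizes: {sorted(batch_sizes)}")

    return [embeddings[name] for name in INPUT_ORDER]
```

Keeping the order in one named list makes it easy to adjust if the classifier's actual input order turns out to differ.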

Download Example

from huggingface_hub import hf_hub_download

# Each call below returns the local path of the downloaded (and cached) file.
repo_id = "yatoka/LncPNdeep"

longformer = hf_hub_download(
    repo_id=repo_id,
    filename="weights/rna_pretrain/save.Longformer.pretrain.epoch20.params",
)

bigbird256 = hf_hub_download(
    repo_id=repo_id,
    filename="weights/rna_pretrain/save.bigbird.pretrain.epoch20.params",
)

bigbird768 = hf_hub_download(
    repo_id=repo_id,
    filename="weights/rna_pretrain/save.bigbird_full.pretrain.epoch20.params",
)

final_classifier = hf_hub_download(
    repo_id=repo_id,
    filename="weights/final_classifier/ProteinTransAllfeature_ResCNN2_07_08.h5",
)
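
The four calls above can also be written as a loop. The following is a minimal sketch under that assumption; the `WEIGHT_FILES` labels and the `download_all` helper are hypothetical names introduced for illustration, while the file paths and repository ID are the ones given in this document.

```python
# Paths copied from the file list above; the short keys are hypothetical labels.
WEIGHT_FILES = {
    "longformer256": "weights/rna_pretrain/save.Longformer.pretrain.epoch20.params",
    "bigbird256": "weights/rna_pretrain/save.bigbird.pretrain.epoch20.params",
    "bigbird768": "weights/rna_pretrain/save.bigbird_full.pretrain.epoch20.params",
    "final_classifier": "weights/final_classifier/ProteinTransAllfeature_ResCNN2_07_08.h5",
}


def download_all(repo_id="yatoka/LncPNdeep"):
    """Download every weight file and return a {label: local_path} dict."""
    # Imported here so WEIGHT_FILES can be inspected without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return {
        label: hf_hub_download(repo_id=repo_id, filename=filename)
        for label, filename in WEIGHT_FILES.items()
    }
```

Calling `download_all()` requires network access on the first run; later calls reuse the local Hugging Face cache.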

Important Notes

These files are not packaged as standard Hugging Face AutoModel.from_pretrained() checkpoints. They should be loaded with the model definitions and preprocessing code in the GitHub repository.

The PyTorch RNA checkpoints are used for embedding extraction, while the .h5 file is the downstream TensorFlow/Keras classifier.
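
To make the split between the two frameworks concrete, here is a hedged sketch of how a downloaded file might be opened for inspection. The helper names are hypothetical. It assumes the `.params` files are readable by `torch.load` and that the `.h5` file loads with `tf.keras.models.load_model`; neither is confirmed here. The checkpoints may hold only a state dict that needs the model classes from the GitHub repository, recent PyTorch versions may require `weights_only=False` for fully pickled models, and the Keras model may need `custom_objects` if it defines custom layers.

```python
from pathlib import Path


def checkpoint_kind(path):
    """Classify a weight file by extension: '.params' is PyTorch, '.h5' is Keras."""
    suffix = Path(path).suffix.lower()
    if suffix == ".params":
        return "pytorch"
    if suffix == ".h5":
        return "keras"
    raise ValueError(f"unrecognized weight file: {path}")


def load_weights(path):
    """Open a weight file with the framework matching its extension."""
    if checkpoint_kind(path) == "pytorch":
        import torch  # lazy import: only needed for the RNA checkpoints

        # Assumption: the result is a state dict or a pickled model object.
        # A state dict still has to be loaded into the repository's model class.
        return torch.load(path, map_location="cpu")

    import tensorflow as tf  # lazy import: only needed for the final classifier

    # compile=False skips restoring the optimizer, which inference does not need.
    return tf.keras.models.load_model(path, compile=False)
```

For example, `load_weights(final_classifier)` would be expected to return a Keras model whose `inputs` attribute can be compared with the six embeddings described under Model Inputs.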

Citation

If you use LncPNdeep, please cite the corresponding LncPNdeep paper. Citation details will be added here once available.
