bernardo-de-almeida

feat: improve title

88d1cd8 4 months ago

title: NTv3 — Foundation Models for Long-Range Genomics
emoji: 🧬
colorFrom: indigo
colorTo: blue
sdk: static
pinned: false

NTv3 — Foundation Models for Long-Range Genomics

This Space is the companion hub for NTv3 checkpoints on the Hugging Face Hub. It provides PyTorch notebooks and minimal examples for inference, sequence-to-function prediction (functional tracks), genome annotation, fine-tuning, model interpretation and sequence generation.

Notebooks

Notebooks live in ./notebooks/:

00_quickstart_inference.ipynb: load a checkpoint and run inference
01_tracks_prediction.ipynb: sequence → functional tracks (with plotting)
02_genome_annotation_segmentation.ipynb: sequence → genome annotation
03_finetune_head.ipynb: fine-tune a prediction head on a BigWig track
04_model_interpretation.ipynb: interpretation of a post-trained model
05_sequence_generation.ipynb: fine-tune NTv3 to generate enhancer sequences
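
To work through a notebook locally, one option is to fetch it from this Space with `huggingface_hub`. The sketch below assumes the Space's repo id is `InstaDeepAI/ntv3` and that the notebooks sit under `notebooks/` as listed above; `notebook_path` and `fetch_notebook` are hypothetical helper names, not part of any NTv3 API.

```python
from pathlib import PurePosixPath

# Assumption: repo id of this Space on the Hugging Face Hub.
SPACE_ID = "InstaDeepAI/ntv3"

# Notebook file names as listed in this README.
NOTEBOOKS = [
    "00_quickstart_inference.ipynb",
    "01_tracks_prediction.ipynb",
    "02_genome_annotation_segmentation.ipynb",
    "03_finetune_head.ipynb",
    "04_model_interpretation.ipynb",
    "05_sequence_generation.ipynb",
]


def notebook_path(name: str) -> str:
    """Return the path of a listed notebook inside the Space repository."""
    if name not in NOTEBOOKS:
        raise ValueError(f"unknown notebook: {name!r}")
    return str(PurePosixPath("notebooks") / name)


def fetch_notebook(name: str) -> str:
    """Download one notebook and return its local cache path (needs network)."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=SPACE_ID,
        filename=notebook_path(name),
        repo_type="space",
    )
```

Calling `fetch_notebook("00_quickstart_inference.ipynb")` returns a path in the local Hub cache that can be opened with Jupyter.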

Install

pip install torch transformers accelerate safetensors huggingface_hub numpy
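
After installing, a quick way to confirm the environment is complete is to check that each package from the command above can be found by Python. This is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper name.

```python
from importlib.util import find_spec

# Package names taken from the pip command in this README.
REQUIRED = [
    "torch",
    "transformers",
    "accelerate",
    "safetensors",
    "huggingface_hub",
    "numpy",
]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located on the current Python path."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```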

Load a model (TODO)
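
This section has not been written yet. As a placeholder, the sketch below shows one plausible way to load a checkpoint, assuming the NTv3 repositories expose their custom code through the generic `AutoTokenizer` and `AutoModel` classes when `trust_remote_code=True` is set. That is an assumption, not something this README confirms; check the model card of each checkpoint for the recommended class. `load_ntv3` is a hypothetical helper name.

```python
# Checkpoint name taken from the Checkpoints section of this README.
CHECKPOINT = "InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb"


def load_ntv3(checkpoint: str = CHECKPOINT):
    """Return (tokenizer, model) for a checkpoint; downloads weights on first call."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    # Assumption: the repository registers its classes with the Auto* loaders.
    tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
    model = AutoModel.from_pretrained(checkpoint, trust_remote_code=True)
    model.eval()  # inference mode: disables dropout and similar training behaviour
    return tokenizer, model
```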

Pipelines (TODO)

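from transformers import pipeline
import torch

# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

pipe = pipeline(
    task="ntv3-tracks",  # not a built-in transformers task, so remote code is needed
    model="InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb",
    trust_remote_code=True,  # allows the model repository's custom code to run
    device=device,
    torch_dtype=torch.bfloat16,
)

out = pipe("ACGT...")  # replace with a full DNA sequence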
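
Before passing a sequence to the pipeline, it can help to normalise it. The sketch below strips whitespace, upper-cases the string and rejects unexpected characters. It assumes the accepted alphabet is A, C, G, T and N, which this README does not state. The optional `multiple_of` padding is there only in case a checkpoint requires input lengths divisible by some factor; the default of 1 applies no padding. `prepare_sequence` is a hypothetical helper name.

```python
# Assumption: nucleotide alphabet accepted by the model's tokenizer.
VALID_BASES = frozenset("ACGTN")


def prepare_sequence(seq: str, multiple_of: int = 1, pad_char: str = "N") -> str:
    """Clean a DNA string and optionally right-pad it to a multiple of `multiple_of`."""
    if multiple_of < 1:
        raise ValueError("multiple_of must be >= 1")

    # Remove all whitespace (including newlines from FASTA-style wrapping).
    cleaned = "".join(seq.split()).upper()

    invalid = sorted(set(cleaned) - VALID_BASES)
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {invalid}")

    # Right-pad so that the length is divisible by `multiple_of`.
    remainder = len(cleaned) % multiple_of
    if remainder:
        cleaned += pad_char * (multiple_of - remainder)
    return cleaned
```

For example, `prepare_sequence("acgt\nacgt")` yields a single upper-case string that can be handed to `pipe(...)`.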

Checkpoints

Pre-trained: InstaDeepAI/ntv3_8M_7downsample_pretrained_le_1mb, InstaDeepAI/ntv3_106M_7downsample_pretrained_le_1mb, InstaDeepAI/ntv3_650M_7downsample_pretrained_le_1mb

Post-trained: InstaDeepAI/ntv3_650M_7downsample_post_trained_1mb, InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb
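
When switching between checkpoints in scripts, a small lookup table avoids retyping the long repository names. The sketch below encodes only the checkpoints listed above; the `stage` and `size` keys and the helper names are illustrative choices, not an official naming scheme.

```python
# Repository ids copied from the Checkpoints section of this README.
CHECKPOINTS = {
    ("pretrained", "8M"): "InstaDeepAI/ntv3_8M_7downsample_pretrained_le_1mb",
    ("pretrained", "106M"): "InstaDeepAI/ntv3_106M_7downsample_pretrained_le_1mb",
    ("pretrained", "650M"): "InstaDeepAI/ntv3_650M_7downsample_pretrained_le_1mb",
    ("post_trained", "106M"): "InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb",
    ("post_trained", "650M"): "InstaDeepAI/ntv3_650M_7downsample_post_trained_1mb",
}


def checkpoint_id(stage: str, size: str) -> str:
    """Return the Hub repository id for a (stage, size) pair listed above."""
    try:
        return CHECKPOINTS[(stage, size)]
    except KeyError:
        raise ValueError(f"no {stage} checkpoint of size {size} is listed") from None


def available_sizes(stage: str) -> list:
    """Return the model sizes listed for a training stage, in listing order."""
    return [size for (s, size) in CHECKPOINTS if s == stage]
```

Note that the 8M model is listed only as a pre-trained checkpoint, so `checkpoint_id("post_trained", "8M")` raises a `ValueError`.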

Citation

@article{ntv3,
  title   = {A foundational model for joint sequence-function multi-species modeling at scale for long-range genomic prediction},
  author  = {…},
  journal = {…},
  year    = {…}
}

License

Code and notebooks in this Space: license not yet chosen (to be added, e.g., Apache-2.0)

Model weights: see the license specified in each model repository