---
title: NTv3 — Foundation Models for Long-Range Genomics
emoji: 🧬
colorFrom: indigo
colorTo: blue
sdk: static
pinned: false
---
# NTv3 — Foundation Models for Long-Range Genomics

This Space is the companion hub for the NTv3 checkpoints on the Hugging Face Hub. It provides PyTorch notebooks and minimal examples for inference, sequence-to-function prediction (functional tracks), genome annotation, fine-tuning, model interpretation, and sequence generation.
## Notebooks

Notebooks live in `./notebooks/`:
- `00_quickstart_inference.ipynb` — load a checkpoint and run inference
- `01_tracks_prediction.ipynb` — sequence → functional tracks (+ plotting)
- `02_genome_annotation_segmentation.ipynb` — sequence → annotation
- `03_finetune_head.ipynb` — fine-tune on a bigwig track
- `04_model_interpretation.ipynb` — interpretation of a post-trained model
- `05_sequence_generation.ipynb` — fine-tune NTv3 to generate enhancer sequences
## Install

```bash
pip install torch transformers accelerate safetensors huggingface_hub numpy
```
## Load a model (TODO)
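Until the official snippet is added, here is a minimal sketch of what loading might look like, assuming the checkpoints ship custom modeling code loadable through the `transformers` Auto classes with `trust_remote_code=True` (the exact loading API is an assumption; the repo id is one of the checkpoints listed below):

```python
# Hedged sketch (assumed API): load an NTv3 checkpoint via transformers' Auto classes.
from transformers import AutoModel


def load_ntv3(repo_id: str = "InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb"):
    """Download and instantiate an NTv3 checkpoint (requires network access).

    trust_remote_code=True is assumed to be needed because the checkpoint
    likely ships its own modeling code.
    """
    return AutoModel.from_pretrained(repo_id, trust_remote_code=True)


if __name__ == "__main__":
    model = load_ntv3()
    model.eval()
```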
## Pipelines (TODO)
```python
from transformers import pipeline
import torch

pipe = pipeline(
    task="ntv3-tracks",
    model="InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb",
    trust_remote_code=True,
    device="cuda",
    torch_dtype=torch.bfloat16,
)
out = pipe("ACGT...")
```
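For quick sanity checks on the predicted tracks, a small sketch assuming the pipeline output maps each track name to a NumPy array of per-bin predictions (this output schema is an assumption, not the confirmed NTv3 format):

```python
import numpy as np


def summarize_tracks(tracks: dict) -> dict:
    """Reduce each predicted track (name -> per-bin array) to its mean signal."""
    return {name: float(values.mean()) for name, values in tracks.items()}


# Synthetic stand-in for pipeline output: two hypothetical tracks of 8 bins each.
fake_out = {
    "DNase": np.linspace(0.0, 1.0, 8),
    "CAGE": np.zeros(8),
}
summary = summarize_tracks(fake_out)
# summary["DNase"] == 0.5, summary["CAGE"] == 0.0
```

The same per-track arrays can be passed straight to `matplotlib.pyplot.plot` for the kind of track plots shown in `01_tracks_prediction.ipynb`.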
## Checkpoints

- Pre-trained: `InstaDeepAI/ntv3_8M_7downsample_pretrained_le_1mb`, `InstaDeepAI/ntv3_106M_7downsample_pretrained_le_1mb`, `InstaDeepAI/ntv3_650M_7downsample_pretrained_le_1mb`
- Post-trained: `InstaDeepAI/ntv3_650M_7downsample_post_trained_1mb`, `InstaDeepAI/ntv3_106M_7downsample_post_trained_1mb`
## Links

- Paper: (add link)
- JAX research code (GitHub): https://github.com/instadeepai/nucleotide-transformer
Citation
@article{ntv3,
title = {A foundational model for joint sequence-function multi-species modeling at scale for long-range genomic prediction},
author = {…},
journal = {…},
year = {…}
}
## License

- Code & notebooks in this Space: (choose and add, e.g., Apache-2.0)
- Model weights: see the license specified in each model repository