metadata
tags:
- genomics
- gene-expression-prediction
- multimodal
- biology
- arxiv:2602.21550
library_name: pytorch
datasets:
- xingyusu/GeneExp
Prism
Prism provides pretrained checkpoints for models that predict gene expression by integrating genomic sequence with multimodal signals.
This repository is the model release for:
Extending Sequence Length is Not All You Need: Effective Integration of Multimodal Signals for Gene Expression Prediction (ICLR 2026)
Paper
Model Contents
- Pretrained checkpoints for K562 and GM12878
- Five random seeds for each cell type: 2, 22, 222, 2222, 22222
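As a quick orientation, the release contents above amount to ten training runs (two cell types times five seeds). The sketch below only enumerates those combinations. It assumes one checkpoint per (cell type, seed) pair and makes no claim about how the files are named or laid out in the repository; `CELL_TYPES`, `SEEDS`, and `runs` are illustrative names.

```python
from itertools import product

# Values taken from the list above.
CELL_TYPES = ["K562", "GM12878"]
SEEDS = [2, 22, 222, 2222, 22222]

# Assumption: one checkpoint per (cell_type, seed) pair.
# This does not encode any file naming scheme.
runs = list(product(CELL_TYPES, SEEDS))

for cell_type, seed in runs:
    print(f"{cell_type}\tseed={seed}")
```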
Dataset
Prism follows the same dataset setting as Seq2Exp (xingyusu/GeneExp).
Quick Start
Download checkpoints:
pip install huggingface_hub
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='yangyz1230/Prism', repo_type='model', local_dir='./ckpt')"
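The one-liner above can also be written as a small script, which is easier to extend. This is a minimal sketch rather than part of the official code: `download_prism` and `list_files` are hypothetical helper names, and the listing step simply walks whatever was downloaded, since the repository's file layout is not described here.

```python
from pathlib import Path
from typing import List, Union


def download_prism(local_dir: str = "./ckpt") -> Path:
    """Download the Prism model repo into local_dir and return its path."""
    # Imported lazily so list_files() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id="yangyz1230/Prism",
        repo_type="model",
        local_dir=local_dir,
    )
    return Path(path)


def list_files(root: Union[str, Path]) -> List[str]:
    """Return every regular file under root as a sorted relative path."""
    base = Path(root)
    return sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
    )


# Example usage (requires network access):
# ckpt_dir = download_prism("./ckpt")
# print("\n".join(list_files(ckpt_dir)))
```

Listing the files after the download is a simple way to confirm what arrived before pointing the inference script at `./ckpt`.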
Run inference with the official code:
git clone https://github.com/yangzhao1230/Prism
cd Prism
pip install -r requirements.txt
DATA_ROOT=/path/to/data
bash test.sh $DATA_ROOT ./ckpt
Limitations
- Research use only
- Performance may vary across preprocessing settings and seeds
- Not intended for clinical or diagnostic use
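Because performance may vary across seeds, it can help to report a mean and spread over the five runs of a cell type instead of a single number. The sketch below shows one way to do that. `summarize_over_seeds` is a hypothetical helper, and it assumes you have already collected one metric value per seed yourself; the output format of the official `test.sh` is not described here.

```python
from statistics import mean, stdev
from typing import Dict, Tuple


def summarize_over_seeds(metric_by_seed: Dict[int, float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of one metric across seeds."""
    values = list(metric_by_seed.values())
    if len(values) < 2:
        # A sample standard deviation is undefined for a single run.
        raise ValueError("need results from at least two seeds")
    return mean(values), stdev(values)
```

The input maps each seed (for example 2, 22, 222, 2222, 22222) to the metric obtained with that seed's checkpoint.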
Citation
@inproceedings{
yang2026extending,
title={Extending Sequence Length is Not All You Need: Effective Integration of Multimodal Signals for Gene Expression Prediction},
author={Zhao Yang and Yi Duan and Jiwei Zhu and Ying Ba and Chuan Cao and Bing Su},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026}
}