arxiv:2602.02093

Cell-JEPA: Latent Representation Learning for Single-Cell Transcriptomics

Published on Feb 2

Abstract

Cell-JEPA, a joint-embedding predictive architecture, improves single-cell RNA sequencing analysis by learning robust features through latent space prediction rather than direct reconstruction, achieving better zero-shot cell-type clustering performance.

Single-cell foundation models learn by reconstructing masked gene expression, implicitly treating technical noise as signal. With dropout rates exceeding 90%, reconstruction objectives encourage models to encode measurement artifacts rather than stable cellular programs. We introduce Cell-JEPA, a joint-embedding predictive architecture that shifts learning from reconstructing sparse counts to predicting in latent space. The key insight is that cell identity is redundantly encoded across genes. We show that predicting cell-level embeddings from partial observations forces the model to learn dropout-robust features. On cell-type clustering, Cell-JEPA achieves 0.72 AvgBIO in zero-shot transfer versus 0.53 for scGPT, a 36% relative improvement. On perturbation prediction within a single cell line, Cell-JEPA improves absolute-state reconstruction but not effect-size estimation, suggesting that representation learning and perturbation modeling address complementary aspects of cellular prediction.
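To make the contrast with reconstruction concrete, the following is a minimal sketch of a latent-prediction objective, not the authors' implementation. It assumes toy linear encoders, a linear predictor, and an exponential-moving-average target encoder as used in other JEPA-style models; the abstract does not confirm any of these details for Cell-JEPA. All names (`encode`, `latent_prediction_loss`, `ema_update`) are hypothetical.

```python
import random


def encode(weights, expression, visible):
    """Toy linear encoder (an assumption): embed a cell from its visible genes only."""
    dim = len(weights[0])
    embedding = [0.0] * dim
    for gene, value in enumerate(expression):
        if visible[gene]:
            for d in range(dim):
                embedding[d] += weights[gene][d] * value
    return embedding


def latent_prediction_loss(context_w, target_w, predictor_w, expression, visible):
    """Mean squared error between predicted and target cell embeddings.

    The loss compares embeddings, never raw counts, so a dropped-out gene
    is not itself a reconstruction target.
    """
    context = encode(context_w, expression, visible)
    # The target encoder sees the whole cell; in training it would receive no gradient.
    target = encode(target_w, expression, [True] * len(expression))
    dim = len(target)
    predicted = [
        sum(predictor_w[i][j] * context[i] for i in range(dim)) for j in range(dim)
    ]
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / dim


def ema_update(target_w, context_w, momentum=0.99):
    """Move target weights toward context weights (assumed JEPA-style EMA update)."""
    return [
        [momentum * t + (1.0 - momentum) * c for t, c in zip(t_row, c_row)]
        for t_row, c_row in zip(target_w, context_w)
    ]


# Hypothetical toy setup: 8 genes, 3-dimensional embeddings.
rng = random.Random(0)
n_genes, dim = 8, 3
context_w = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_genes)]
target_w = [row[:] for row in context_w]
predictor_w = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
expression = [rng.random() for _ in range(n_genes)]

half_visible = [gene % 2 == 0 for gene in range(n_genes)]
all_visible = [True] * n_genes

masked_loss = latent_prediction_loss(
    context_w, target_w, predictor_w, expression, half_visible
)
full_loss = latent_prediction_loss(
    context_w, target_w, predictor_w, expression, all_visible
)
```

In this sketch `full_loss` is zero because the context and target views coincide, while `masked_loss` measures how far the partial view is from the full-cell embedding. Training would adjust the context encoder and predictor to shrink that gap, which is the sense in which predicting cell-level embeddings from partial observations rewards features that survive missing genes.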
