Commit ·
07737b4
0
Parent(s):
Super-squash branch 'main' using huggingface_hub
- .gitattributes +36 -0
- README.md +98 -0
- x-cell-overview.png +3 -0
.gitattributes
ADDED
@@ -0,0 +1,36 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+x-cell-overview.png filter=lfs diff=lfs merge=lfs -text
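To illustrate what the `.gitattributes` entries above do, the following is a minimal sketch that checks whether a path would be routed through Git LFS. It is an approximation, not Git's own matcher: the helper name `is_lfs_tracked` and the shortened pattern list are assumptions made for illustration, and the directory pattern `saved_model/**/*` is deliberately left out because Python's `fnmatch` does not implement Git's `**` semantics.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the patterns added in .gitattributes. Patterns without a slash
# are compared against the file's base name, which approximates Git's rule
# that such patterns may match at any directory level.
LFS_PATTERNS = [
    "*.bin",
    "*.h5",
    "*.safetensors",
    "*.tar.*",
    "*tfevents*",
    "x-cell-overview.png",
]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Return True if the base name of `path` matches any LFS pattern."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for candidate in ["checkpoints/model.safetensors", "README.md", "x-cell-overview.png"]:
        print(candidate, is_lfs_tracked(candidate))
```

For an authoritative answer on a real checkout, `git check-attr filter -- <path>` reports the attribute Git actually applies.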
README.md
ADDED
@@ -0,0 +1,98 @@
+---
+license: cc-by-nc-sa-4.0
+language:
+- en
+thumbnail: x-cell-overview.png
+tags:
+- biology
+- single-cell
+- perturbation-prediction
+- diffusion-model
+- genomics
+- CRISPRi
+datasets:
+- Xaira-Therapeutics/X-Atlas-Pisces
+pipeline_tag: other
+---
+
+# X-Cell
+
+**A diffusion language model for genome-scale perturbation prediction across diverse cellular contexts.**
+
+> **Status: Model weights and inference code coming soon.**
+> The Python API, model weights, and tutorials are under active development.
+> Watch the [GitHub repository](https://github.com/Xaira-Therapeutics/x-cell) for release updates.
+
+<p align="center">
+<img src="x-cell-overview.png" alt="X-Cell Architecture" width="100%">
+</p>
+
+## Model Description
+
+X-Cell predicts genome-scale transcriptional responses to genetic perturbations across diverse cellular contexts. Trained on **X-Atlas/Pisces** (25.6M perturbed single cells, 7 CRISPRi Perturb-seq screens), X-Cell integrates multi-modal biological priors through cross-attention and generalizes zero-shot to unseen cell types and perturbations.
+
+### Key Results
+
+- **5x higher Pearson delta** than the next-best method on held-out iPSC perturbations
+- **Zero-shot T-cell inactivation**: predicts CD3 complex inactivators and novel regulators (LRBA, APPL2)
+- **LLM-class scaling laws**: train loss scales as L(N) ~ N^-0.32 (R^2 = 0.96)
+- **Zero-shot cell type generalization** to melanocyte progenitors and primary human CD4+ T cells
+
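The scaling-law bullet in Key Results reports a power law, L(N) ~ N^-0.32. As a minimal sketch of how such an exponent is commonly estimated, the code below fits a straight line in log-log space. The data points are synthetic and generated from the stated exponent purely for illustration; they are not X-Cell measurements. The prefactor `3.0`, the model sizes, and the function name `fit_power_law` are assumptions, and the README does not say whether its R^2 was computed in log space.

```python
import numpy as np


def fit_power_law(n_params, losses):
    """Fit L(N) = a * N**b by least squares in log-log space.

    Returns (a, b, r2), where r2 is the coefficient of determination
    of the linear fit on the log-transformed values.
    """
    x = np.log(np.asarray(n_params, dtype=float))
    y = np.log(np.asarray(losses, dtype=float))
    b, log_a = np.polyfit(x, y, 1)  # slope is the exponent, intercept is log(a)
    residuals = y - (b * x + log_a)
    r2 = 1.0 - residuals.var() / y.var()
    return float(np.exp(log_a)), float(b), float(r2)


# Synthetic, noise-free points drawn from L(N) = 3.0 * N**-0.32 (hypothetical values).
sizes = np.array([1e6, 1e7, 5.5e7, 1e8, 1e9])
synthetic_losses = 3.0 * sizes ** -0.32

a, b, r2 = fit_power_law(sizes, synthetic_losses)
print(f"prefactor={a:.3f} exponent={b:.3f} r2={r2:.4f}")
```

With real training curves the points are noisy, so the fitted exponent and R^2 depend on which model sizes are included and on the loss definition.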
+## Model
+
+| Model | Parameters | Description |
+|-------|-----------|-------------|
+| **X-Cell Mini** | 55M | Fast inference; initialized from scGPT |
+
+## Architecture
+
+X-Cell is a **set-level diffusion transformer** that operates on sets of cells (not individual cells) and refines predictions iteratively via a masked diffusion process. Key components:
+
+- **Diffusion-based training** with 4-step coarse-to-fine refinement at inference
+- **Multi-modal biological priors** via Flamingo-style cross-attention (ESM-2, STRING, GenePT, DepMap, JUMP-Cell Painting, scGPT)
+- **Tied output embeddings** with PaLM-style 1/sqrt(d) scaling
+
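The last Architecture bullet mentions tied output embeddings with 1/sqrt(d) scaling. The sketch below shows the general idea in NumPy: the same matrix is used to embed input tokens and to project hidden states back to vocabulary logits, and the logits are divided by the square root of the embedding width. It does not reproduce X-Cell's implementation, which is not released; the toy sizes, the variable names, and the use of the raw embeddings as a stand-in for transformer outputs are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model = 16, 8  # toy sizes, chosen only for illustration

# One shared matrix serves as both the input embedding and the output projection.
embedding = rng.normal(size=(vocab_size, d_model))


def embed(token_ids):
    """Look up input embeddings for integer token ids."""
    return embedding[token_ids]


def output_logits(hidden):
    """Project hidden states to vocabulary logits with the tied matrix.

    Dividing by sqrt(d_model) keeps the logit scale roughly independent
    of the embedding width.
    """
    return hidden @ embedding.T / np.sqrt(d_model)


tokens = np.array([3, 7, 11])
hidden = embed(tokens)  # stand-in for the transformer's final hidden states
logits = output_logits(hidden)
print(logits.shape)
```

Tying the two matrices removes a `vocab_size x d_model` parameter block, which matters most for small models with large vocabularies.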
+## Intended Use
+
+X-Cell is designed for predicting transcriptional responses to CRISPRi gene knockdowns. It is intended for research use in computational biology and genomics.
+
+## Training Data
+
+Trained on X-Atlas/Pisces, the largest CRISPRi Perturb-seq compendium to date:
+
+| Screen | Context | Perturbations | Cells |
+|--------|---------|--------------|-------|
+| HCT116 | Colorectal cancer | 18,924 | 3.4M |
+| HEK293T | Kidney epithelial | 18,312 | 4.5M |
+| HepG2 | Hepatocellular carcinoma | 9,735 | 2.6M |
+| iPSC | Induced pluripotent stem cells | 10,095 | 4.2M |
+| Jurkat Resting | T lymphoblastic leukemia | 10,872 | 2.8M |
+| Jurkat Active | CD3/CD28-stimulated T cells | 10,878 | 2.8M |
+| iPSC Multi-Diff | Multi-lineage differentiation | 12,175 | 5.1M |
+
+Dataset: [Xaira-Therapeutics/X-Atlas-Pisces](https://huggingface.co/datasets/Xaira-Therapeutics/X-Atlas-Pisces)
+
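As a quick consistency check on the Training Data table, the sketch below sums the per-screen figures. The rounded cell counts add up to about 25.4M, which is compatible with the 25.6M quoted in the Model Description: seven values each rounded to the nearest 0.1M can differ from the exact total by up to 0.35M. Cell counts are stored as integers in units of 0.1M to avoid floating-point drift. The summed perturbation count is a per-screen total and should not be read as a number of unique gene targets, since the README does not say how much the screens overlap.

```python
# (perturbations, cells in units of 0.1M) per screen, copied from the table.
SCREENS = {
    "HCT116": (18_924, 34),
    "HEK293T": (18_312, 45),
    "HepG2": (9_735, 26),
    "iPSC": (10_095, 42),
    "Jurkat Resting": (10_872, 28),
    "Jurkat Active": (10_878, 28),
    "iPSC Multi-Diff": (12_175, 51),
}

total_perturbations = sum(p for p, _ in SCREENS.values())
total_cells_millions = sum(c for _, c in SCREENS.values()) / 10

# Worst-case rounding error when each of the 7 entries is rounded to 0.1M.
max_rounding_error = 0.05 * len(SCREENS)

print(f"{len(SCREENS)} screens, {total_perturbations:,} screen-level perturbations")
print(f"~{total_cells_millions}M cells (+/- {max_rounding_error:.2f}M from rounding)")
```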
+## Usage (Coming Soon)
+
+```python
+from xcell import XCell
+
+model = XCell.from_pretrained("Xaira-Therapeutics/X-Cell", variant="mini")
+predictions = model.predict("control_cells.h5ad", perturbation="BRCA1")
+```
+
+Full documentation: [Xaira-Therapeutics.github.io/x-cell](https://xaira-therapeutics.github.io/x-cell)
+
+## Citation
+
+```bibtex
+@article{xcell2026,
+title = {X-Cell: Scaling Causal Perturbation Prediction Across Diverse
+Cellular Contexts via Diffusion Language Models},
+year = {2026},
+}
+```
+
+## License
+
+This model is released under the [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license.
x-cell-overview.png
ADDED
|
Git LFS Details
|