---
license: other
license_name: review-license-uc-regents-2026
license_link: LICENSE
pipeline_tag: feature-extraction
library_name: pytorch
tags:
- biology
- cell-biology
- microscopy
- lattice-light-sheet
- mitochondria
- mitotracker
- self-supervised-learning
- contrastive-learning
- simclr
- 3d-resnet
- bilstm
- 4d
extra_gated_heading: "Pre-publication peer-review materials"
extra_gated_description: "By accessing these weights you agree to use them solely for peer review of the associated manuscript submitted to Cell. No redistribution, modification, or derivative works are permitted prior to publication."
---
# MitoSpace
Self-supervised 3D-ResNet + Bi-LSTM trained with SimCLR on 4D MitoTracker lattice light-sheet volumes. Produces 2048-d spatiotemporal embeddings of mitochondrial morphology.
- **Interactive atlas:** [mitospace.ai](https://mitospace.ai)
- **Code:** [github.com/schoeneberglab/MitoSpace4D](https://github.com/schoeneberglab/MitoSpace4D)
- **Status:** released alongside a manuscript under review at *Cell*
![MitoSpace](mitospace.gif)
## Specs
| | |
| --- | --- |
| Input | `(B, T=20, C=1, Z=60, H=256, W=256)`, float32 in `[0, 1]` |
| Output | `(B, T, 2048)` per-frame features (typically L2-normalized before use) |
| Objective | InfoNCE (SimCLR), `τ = 0.07`, 2 views per sample |
| Checkpoint | `ms4d` — Cal27, 26 perturbation conditions |
| Hardware (training) | 15 × 4 V100 (SDSC), DDP + fp16, 300 epochs, ≈ 3 days |
| Optimizer | Adam, `lr = 3e-4`, `wd = 1e-5`, cosine annealing |
| Framework | PyTorch + PyTorch Lightning |
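To make the objective row concrete, here is a minimal sketch of a SimCLR-style InfoNCE loss for two views at `τ = 0.07`. It is a generic formulation, not the repository's training code: the function name `info_nce` and the choice to use every other sample in the batch as a negative are assumptions.

```python
import torch
import torch.nn.functional as F


def info_nce(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """SimCLR-style InfoNCE over two views; z1 and z2 have shape (N, D)."""
    n = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=-1)  # (2N, D), unit norm
    sim = z @ z.T / tau  # temperature-scaled cosine similarities, (2N, 2N)
    # Mask the diagonal so a sample is never compared with itself.
    self_mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))
    # The positive for row i is the other view of the same sample (i + N or i - N).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```

Called as `info_nce(view_a, view_b)` on two `(N, D)` batches of projected embeddings, it returns a scalar loss that shrinks as the two views of each sample move closer together relative to the rest of the batch.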
## Files
| File | |
| --- | --- |
| `model.safetensors` | Trained weights (flat state-dict, no `model.` Lightning prefix) |
| `config.json` | Input contract + architecture summary |
| `LICENSE` | Review License — see below |
| `mitospace.gif` | Atlas preview |
## Data manifest
The processed-data manifest is a **Parquet** table of sample metadata and relative paths used across the pipeline (autoencoder training, SimCLR training, evaluation). Point `data.manifest_path` in `autoencoder/config.yaml` (and the corresponding field in `simclr/config.yaml`) at the local file after download.
**S3**
```bash
aws s3 cp s3://mitospace4d/processed_data/manifest.parquet manifest.parquet
```
**Hugging Face** (`schoeneberglab/mitospace`; needs `HF_TOKEN` with read access while the repo is private)
```bash
export HF_TOKEN=<read_token>
python utils/hf_checkpoint.py download --filename processed_data/manifest.parquet
```
## Usage
Requires a CUDA GPU (`MitoSpace4D.__init__` unconditionally moves its augmentation pipeline to CUDA). Clone the [code repo](https://github.com/schoeneberglab/MitoSpace4D) and run `pip install -e .` inside it first.
```python
import sys

import torch
import torch.nn.functional as F
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Make the cloned repo importable (adjust to your checkout location).
sys.path.insert(0, "/path/to/MitoSpace4D")
from simclr.model import MitoSpace4D
from utils.utils import load_config

# Fetch the weights and build the model with augmentation disabled.
weights = hf_hub_download("schoeneberglab/mitospace", "model.safetensors")
cfg = load_config("simclr/config.yaml")
model = MitoSpace4D(embedding_size=2048, cfg=cfg, apply_aug=False).cuda().eval()

# strict=False returns the key mismatches; the assert makes any mismatch fatal.
missing, unexpected = model.load_state_dict(load_file(weights), strict=False)
assert not missing and not unexpected, (missing[:5], unexpected[:5])
model._channels = model._channels.cuda()  # required for GPU fancy-indexing

# Dummy input matching the contract: (B, T, C, Z, H, W), float32 in [0, 1].
x = torch.rand(1, 20, 1, 60, 256, 256, device="cuda")
with torch.no_grad():
    features, _ = model(x)  # (1, 20, 2048)
embedding = F.normalize(features, dim=-1)  # L2-normalize each frame's feature vector
```
For batch inference and k-NN evaluation over a dataset, use `evaluate.py` in the code repo.
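For intuition about the k-NN step, here is a minimal sketch of majority-vote k-NN classification under cosine similarity. It is illustrative only and does not necessarily match what `evaluate.py` does. The helper name `knn_predict`, the default `k = 5`, and the mean-pooling over time in the trailing comment are all assumptions.

```python
import torch
import torch.nn.functional as F


def knn_predict(
    train_emb: torch.Tensor,     # (N, D) reference embeddings
    train_labels: torch.Tensor,  # (N,) integer class ids
    query_emb: torch.Tensor,     # (M, D) embeddings to classify
    k: int = 5,
) -> torch.Tensor:
    """Return the majority label among the k most cosine-similar references."""
    train_emb = F.normalize(train_emb, dim=-1)
    query_emb = F.normalize(query_emb, dim=-1)
    sim = query_emb @ train_emb.T             # (M, N) cosine similarities
    nn_idx = sim.topk(k, dim=-1).indices      # (M, k) indices of nearest references
    nn_labels = train_labels[nn_idx]          # (M, k) labels of those references
    return torch.mode(nn_labels, dim=-1).values  # most frequent label per query


# Hypothetical reduction of per-frame features (B, T, 2048) to one vector per sample:
# pooled = features.mean(dim=1)  # (B, 2048); mean-pooling over T is an assumption
```

Cosine similarity is used because the embeddings are typically L2-normalized, in which case it ranks neighbours the same way Euclidean distance does.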
## Intended use and limitations
Research only. Not validated or fit for clinical or diagnostic use. Trained on MitoTracker LLSM volumes from a specific cell-line panel; performance on different microscopes, magnifications, cell types, or reporters should be evaluated empirically. Input geometry is fixed at `T=20`, `Z=60`.
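Because the input geometry is fixed, checking a tensor against the input contract before inference catches shape, dtype, and range errors early. Here is a minimal sketch using the expected values from the Specs table. `check_input` is a hypothetical helper, not part of the repository.

```python
import torch

# (T, C, Z, H, W) as listed in the Specs table.
EXPECTED_TAIL = (20, 1, 60, 256, 256)


def check_input(x: torch.Tensor, tail: tuple = EXPECTED_TAIL) -> None:
    """Raise ValueError if x violates the documented input contract."""
    # Shape: a leading batch dimension followed by the fixed (T, C, Z, H, W) geometry.
    if x.ndim != len(tail) + 1 or tuple(x.shape[1:]) != tuple(tail):
        raise ValueError(f"expected shape (B, {', '.join(map(str, tail))}), got {tuple(x.shape)}")
    # Dtype: the model card specifies float32.
    if x.dtype != torch.float32:
        raise ValueError(f"expected float32, got {x.dtype}")
    # Range: intensities must already be scaled to [0, 1].
    if x.min() < 0 or x.max() > 1:
        raise ValueError("values must lie in [0, 1]")
```

Calling `check_input(x)` just before `model(x)` turns a silent geometry mismatch into an explicit error. The `tail` parameter exists only so the check can be exercised on small tensors.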
## Citation
Manuscript under review at *Cell*; a BibTeX entry for the paper will be added on publication. Until then, cite the software:
```bibtex
@software{mitospace4d,
author = {Schoeneberg Lab},
title = {MitoSpace},
url = {https://github.com/schoeneberglab/MitoSpace4D},
year = {2026},
}
```
## License
Released under the terms in [`LICENSE`](LICENSE) — a Review License granted solely for evaluating the associated manuscript submitted to *Cell*. No redistribution, modification, or derivative works are permitted prior to publication. Copyright © 2026 The Regents of the University of California.