EXAONE-Path-CRCMSI-Predictor / README.md

	---
	license: other
	license_name: exaonepath
	license_link: LICENSE
	tags:
	- lg-ai
	- EXAONEPath-1.5
	- pathology
	---
	<!--# EXAONE Path for CRCMSI – CRCMSI-centric Whole-Slide Image Classifier
	A purpose-built upgrade of EXAONE Path 1.5**-->

	## Introduction
	<!--EXAONE Path for CRCMSI is an enhanced whole-slide image (WSI) classification framework that retains the core architecture of EXAONE Path 1.5 while upgrading its internals for greater efficiency and richer multimodal integration.-->
	EXAONE Path MSI is an enhanced whole-slide image (WSI) classification framework that retains the core architecture of EXAONE Path while upgrading its internals for greater efficiency and richer multimodal integration.

	The pipeline still unfolds in two stages:

	1. Patch-wise feature extraction – Each WSI is tiled into 256 × 256 px patches, which are embedded into 768-dimensional vectors using the frozen [EXAONE Path](https://huggingface.co/LGAI-EXAONE/EXAONEPath) encoder.
	2. Slide-level aggregation – The patch embeddings are aggregated using a Vision Transformer, producing a unified slide-level representation that a lightweight classification head transforms into task-specific probabilities.
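The tiling step in stage 1 can be illustrated with a minimal sketch. It assumes non-overlapping patches and drops partial patches at the slide border; the helper name `tile_coordinates` is hypothetical, and the actual preprocessing (for example tissue filtering or magnification selection) is not described here and may differ.

```python
PATCH_SIZE = 256   # patch edge length in pixels, as stated above
EMBED_DIM = 768    # size of each patch embedding, as stated above


def tile_coordinates(width, height, patch_size=PATCH_SIZE):
    """Return the top-left (x, y) of every full, non-overlapping patch.

    Assumption: partial patches at the right and bottom borders are dropped.
    """
    return [
        (x, y)
        for y in range(0, height - patch_size + 1, patch_size)
        for x in range(0, width - patch_size + 1, patch_size)
    ]


# A toy 1024 x 600 px region yields 4 columns x 2 rows of full patches.
coords = tile_coordinates(1024, 600)

# After encoding, the slide would be represented by one EMBED_DIM vector per patch.
feature_shape = (len(coords), EMBED_DIM)
print(feature_shape)  # (8, 768)
```

The patch coordinates are worth keeping alongside the embeddings, since the aggregator described below uses them to compute its spatial bias.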

	---

	## Key Improvements

	- [FlexAttention](https://pytorch.org/blog/flexattention/) + `torch.compile`
	What changed: Replaced vanilla multi-head self-attention with IO-aware FlexAttention kernels and enabled `torch.compile` to fuse the forward/backward graph at runtime. The new kernel layout improves memory efficiency and raises throughput for both training and inference.

	- Coordinate‑aware Relative Bias
	What changed: Added an ALiBi‑style distance bias that is computed from the (x, y) patch coordinates themselves, allowing the ViT aggregator to reason about spatial proximity.

	- Scalable Mixed‑Omics Encoder (Token‑mixing Transformer)
	What changed: Each omics modality is first tokenised into a fixed-length set of tokens. All modality-specific tokens are concatenated into a single sequence and passed through a shared multi-head self-attention stack, enabling direct information exchange across modalities in a single pass. The aggregated omics representation is subsequently fused with image tokens via cross-attention. This release uses three modalities (RNA, CNV, DNA-methylation), but the design is agnostic to modality count and scales linearly with token count.
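To make the coordinate-aware relative bias concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the released implementation: it assumes Euclidean distance measured in patch units and the geometric head-slope schedule from the original ALiBi formulation, and the function names `alibi_slopes` and `coordinate_bias` are hypothetical.

```python
import math


def alibi_slopes(num_heads):
    """Per-head slopes 2^(-8 * (h + 1) / num_heads), as in the ALiBi schedule."""
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def coordinate_bias(coords, num_heads, patch_size=256):
    """Build a [heads][query][key] additive attention bias from patch coordinates.

    Assumption: the penalty grows linearly with the Euclidean distance between
    patch origins, expressed in units of one patch.
    """
    bias = []
    for slope in alibi_slopes(num_heads):
        head = [
            [-slope * math.dist(q, k) / patch_size for k in coords]
            for q in coords
        ]
        bias.append(head)
    return bias


# Three patches: the origin, its right-hand neighbour, and one two rows down.
coords = [(0, 0), (256, 0), (0, 512)]
bias = coordinate_bias(coords, num_heads=8)
print(bias[0][0][1])  # -0.5: head 0 has slope 0.5, the patches are 1 unit apart
```

Nearby patches receive a bias close to zero, while distant patches are penalised more strongly, with each head using a different decay rate. With FlexAttention, a term like this would typically be added to the raw attention score inside a `score_mod` callable rather than materialised as a full matrix.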

	---


	## Quick Start

	### Requirements
	- NVIDIA GPU (≥ 40 GB memory)
	- CUDA 12.8
	- PyTorch 2.7.0+cu128

	### Installation
	```bash
	git clone https://huggingface.co/LGAI-EXAONE/{MODEL_NAME}.git
	cd {MODEL_NAME}
	pip install -r requirements.txt
	```

	### Quick Inference
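	```python
	from models.exaonepath import EXAONEPathV1p5Downstream

	# Hugging Face access token used to download the model
	hf_token = "YOUR_HUGGING_FACE_ACCESS_TOKEN"

	# Load the pretrained downstream classifier from the Hub
	model = EXAONEPathV1p5Downstream.from_pretrained(
	    "LGAI-EXAONE/{MODEL_NAME}",
	    use_auth_token=hf_token,
	)

	# Run inference on one slide; index 1 is the class reported below
	probs = model("./samples/wsis/1/1.svs")
	print(f"P(CRCMSI mutant) = {probs[1]:.3f}")
	```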

	#### Command‑line
	```bash
	python inference.py --svs_path ./samples/wsis/1/1.svs
	```
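To score several slides, the command above can be repeated once per file. The sketch below only builds the command lines; it assumes the slides are `.svs` files somewhere under a single directory, and the helper name `build_commands` is hypothetical. Only the `--svs_path` flag shown above is used.

```python
import sys
from pathlib import Path


def build_commands(slide_dir, script="inference.py"):
    """Return one `python inference.py --svs_path <slide>` command per .svs file."""
    slides = sorted(Path(slide_dir).rglob("*.svs"))
    return [[sys.executable, script, "--svs_path", str(slide)] for slide in slides]
```

Each returned command can then be executed from the repository root, for example with `subprocess.run(cmd, check=True)` in a loop over `build_commands("./samples/wsis")`.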


	### Model Performance Comparison

	| Task (metric: AUC) | TITAN (CONCH v1.5 + iBOT, image-text) | PRISM (Virchow + Perceiver, image-text) | CHIEF (CTransPath + CLAM, image-text, WSI-contrastive) | Prov-GigaPath (GigaPath + LongNet, image-only, mask-prediction) | UNI2-h + CLAM (image-only) | EXAONE Path 1.5 | EXAONE Path MSI |
	|---|---|---|---|---|---|---|---|
	| CRC-MSI | 0.9370 | 0.9432 | 0.9273 | 0.9541 | <u>0.9808</u> | 0.9537 | 0.9844 |
	<!--\| LUAD-TMB (cutoff 10) \| 0.6901 \| 0.6445 \| 0.6501 \| 0.6744 \| 0.6686 \| 0.6846 \| \|
	\| LUAD-EGFR-mut \| 0.8197 \| 0.8152 \| 0.7691 \| 0.7623 \| 0.8577 \| 0.7607 \| \|
	\| LUAD-KRAS-mut \| 0.5405 \| 0.6299 \| 0.4676 \| 0.5110 \| 0.4690 \| 0.5480 \| \|
	\| BRCA-ER \| 0.9343 \| 0.8998 \| 0.9115 \| 0.9186 \| 0.9454 \| 0.9096 \| \|
	\| BRCA-PR \| 0.8804 \| 0.8613 \| 0.8470 \| 0.8595 \| 0.8770 \| 0.8215 \| \|
	\| BRCA-HER2 \| 0.8046 \| 0.8154 \| 0.7822 \| 0.7891 \| 0.8322 \| 0.7811 \| \|
	\| BRCA-TP53 \| 0.7879 \| 0.8415 \| 0.7879 \| 0.7388 \| 0.8080 \| 0.6607 \| \|
	\| BRCA-PIK3CA \| 0.7577 \| 0.8929 \| 0.7015 \| 0.7347 \| 0.8571 \| 0.7066 \| \|
	\| RCC-PBRM1 \| 0.6383 \| 0.5570 \| 0.5129 \| 0.5270 \| 0.5011 \| 0.4445 \| \|
	\| RCC-BAP1 \| 0.7188 \| 0.7690 \| 0.7310 \| 0.6970 \| 0.7160 \| 0.7337 \| \|
	\| COAD-KRAS \| 0.7642 \| 0.7443 \| 0.6989 \| 0.8153 \| 0.9432 \| 0.6790 \| \|
	\| COAD-TP53 \| 0.8889 \| 0.8160 \| 0.7014 \| 0.7118 \| 0.7830 \| 0.8785 \| \|
	\| <span style="color:red">Average</span> \| 0.7817 \| 0.7869 \| 0.7299 \| 0.7457 \| <u>0.7876</u> \| <span style="color:red">0.7932</span> \|-->