---
license: other
license_name: exaonepath
license_link: LICENSE
tags:
- lg-ai
- EXAONEPath-1.5
- pathology
---

## Introduction

**EXAONE Path MSI** is an **enhanced whole-slide image (WSI) classification framework** that retains the core architecture of EXAONE Path while upgrading its internals for greater efficiency and richer multimodal integration. The pipeline still unfolds in two stages:

1. **Patch-wise feature extraction** – Each WSI is tiled into 256 × 256 px patches, which are embedded into 768-dimensional vectors by the frozen **[EXAONE Path](https://huggingface.co/LGAI-EXAONE/EXAONEPath)** encoder.
2. **Slide-level aggregation** – The patch embeddings are aggregated by a Vision Transformer into a unified slide-level representation, which a lightweight classification head transforms into task-specific probabilities.

---

## Key Improvements

- **[FlexAttention](https://pytorch.org/blog/flexattention/) + `torch.compile`**
  *What changed:* Replaced vanilla multi-head self-attention with IO-aware **FlexAttention** kernels and enabled `torch.compile` to fuse the forward/backward graph at runtime. The new kernel layout substantially improves memory efficiency and both training and inference throughput.
- **Coordinate-aware Relative Bias**
  *What changed:* Added an ALiBi-style distance bias computed from the (x, y) patch coordinates themselves, allowing the ViT aggregator to reason about spatial proximity.
- **Scalable Mixed-Omics Encoder (Token-mixing Transformer)**
  *What changed:* Each omics modality is first tokenised into a fixed-length set. **All modality-specific tokens are concatenated into a single sequence and passed through a shared multi-head self-attention stack**, enabling direct information exchange across modalities in one shot. The aggregated omics representation is subsequently fused with image tokens via cross-attention.
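The coordinate-aware relative bias can be sketched roughly as below. This is a minimal illustration, not the released implementation: the function name, the geometric slope schedule, and the tensor shapes are all assumptions. The resulting `[heads, N, N]` bias is the kind of term that could be folded into attention scores, for instance via a FlexAttention `score_mod`.

```python
import torch

def coordinate_relative_bias(coords: torch.Tensor, num_heads: int) -> torch.Tensor:
    """ALiBi-style bias from patch (x, y) coordinates (illustrative sketch).

    coords: [N, 2] patch-grid coordinates.
    Returns a [num_heads, N, N] additive attention bias that penalises
    attention between spatially distant patches.
    """
    # Pairwise Euclidean distance between patch positions: [N, N]
    dist = torch.cdist(coords.float(), coords.float())
    # One geometric slope per head (as in ALiBi): [num_heads, 1, 1]
    slopes = torch.pow(2.0, -torch.arange(1, num_heads + 1).float()).view(-1, 1, 1)
    # Bias grows more negative with distance, so nearby patches attend more
    return -slopes * dist.unsqueeze(0)

coords = torch.tensor([[0, 0], [0, 1], [3, 4]])
bias = coordinate_relative_bias(coords, num_heads=4)  # shape [4, 3, 3]
```

Because the bias depends only on relative distance, it is invariant to where on the slide a tissue region happens to sit, which is what lets the aggregator reason about spatial proximity rather than absolute position.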
This release uses **three modalities (RNA, CNV, DNA-methylation)**, but the design is agnostic to modality count and scales linearly with token number.

---

## Quick Start

### Requirements

- NVIDIA GPU (≥ 40 GB)
- CUDA 12.8
- PyTorch 2.7.0+cu128

### Installation

```bash
git clone https://huggingface.co/LGAI-EXAONE/{MODEL_NAME}.git
cd {MODEL_NAME}
pip install -r requirements.txt
```

### Quick Inference

```python
from models.exaonepath import EXAONEPathV1p5Downstream

hf_token = "YOUR_HUGGING_FACE_ACCESS_TOKEN"
model = EXAONEPathV1p5Downstream.from_pretrained(
    "LGAI-EXAONE/{MODEL_NAME}", use_auth_token=hf_token
)
probs = model("./samples/wsis/1/1.svs")
print(f"P(CRC-MSI mutant) = {probs[1]:.3f}")
```

#### Command-line

```bash
python inference.py --svs_path ./samples/wsis/1/1.svs
```

### Model Performance Comparison

| Metric (AUC) / Task | Titan (Conch v1.5 + iBOT, image-text) | PRISM (Virchow + Perceiver, image-text) | CHIEF (CTransPath + CLAM, image-text, WSI-contrastive) | Prov-GigaPath (GigaPath + LongNet, image-only, mask-prediction) | UNI2-h + CLAM (image-only) | EXAONE Path 1.5 | EXAONE Path MSI |
|---|---|---|---|---|---|---|---|
| **CRC-MSI** | 0.9370 | 0.9432 | 0.9273 | 0.9541 | 0.9808 | 0.9537 | **0.9844** |
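The token-mixing omics encoder described under Key Improvements can be sketched as follows. All names, dimensions, token counts, and layer counts here are assumptions for illustration, not the released code; the point is only the pattern of tokenising each modality to a fixed-length set, concatenating, and mixing with one shared self-attention stack.

```python
import torch
import torch.nn as nn

class MixedOmicsEncoder(nn.Module):
    """Illustrative token-mixing encoder for multiple omics modalities."""

    def __init__(self, dims, n_tokens=16, d_model=256, n_heads=8, n_layers=2):
        super().__init__()
        # One tokeniser per modality: feature vector -> n_tokens learned tokens
        self.tokenisers = nn.ModuleList(
            nn.Linear(d, n_tokens * d_model) for d in dims
        )
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        # A single shared self-attention stack mixes all modalities at once
        self.mixer = nn.TransformerEncoder(layer, n_layers)
        self.n_tokens, self.d_model = n_tokens, d_model

    def forward(self, modalities):
        # modalities: list of [B, dim_m] vectors (e.g. RNA, CNV, methylation)
        tokens = [
            tok(x).view(x.size(0), self.n_tokens, self.d_model)
            for tok, x in zip(self.tokenisers, modalities)
        ]
        # Concatenate into one sequence so attention spans all modalities
        return self.mixer(torch.cat(tokens, dim=1))  # [B, M * n_tokens, d_model]

# Three modalities with (hypothetical) feature dimensions 100, 60, 80
enc = MixedOmicsEncoder(dims=[100, 60, 80])
out = enc([torch.randn(2, 100), torch.randn(2, 60), torch.randn(2, 80)])
```

Because new modalities only append tokens to the shared sequence, the design is agnostic to modality count, and the output sequence length grows linearly with the number of tokens, matching the scaling claim above.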