---
license: llama3.1
datasets:
- CNX-PathLLM/GTEx-WSI-CloseQA-Balanced
- CNX-PathLLM/GTEx-WSI-OpenQA
- CNX-PathLLM/TCGA-WSI-CloseQA-Balanced
- CNX-PathLLM/TCGA-WSI-OpenQA
- CNX-PathLLM/TCGA-BRCA-Details-CloseQA
- CNX-PathLLM/TCGA-BRCA-Details-OpenQA
- CNX-PathLLM/PathChat_CloseQA_Balanced
- CNX-PathLLM/PathChat_OpenQA
language:
- en
metrics:
- accuracy
- f1
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---
# ALPaCA: Adapting Llama for Pathology Context Analysis
Welcome to ALPaCA, a multimodal training framework tailored for slide-level question answering in computational pathology. ALPaCA integrates Llama3.1-8B-Instruct as the language backbone and CONCH as the vision encoder.
This repository aims to provide a straightforward reproduction of the ALPaCA framework.
The model trained using this framework is named **Llama-slideQA**.
To run ALPaCA, please first download **Llama3.1-8B-Instruct** as the base model.
For data from TCGA and GTEx, you can visit the [GDC Data Portal Homepage](https://portal.gdc.cancer.gov/) and the [GTEx Portal](https://www.gtexportal.org/) to download the slides and extract patch features yourself with [CONCH](https://huggingface.co/MahmoodLab/CONCH). The data processing code is available at https://github.com/ZeyuGaoAi/SMMILe.
Alternatively, you can use the features we have already extracted with CONCH: `CNX-PathLLM/GTEx-TCGA-Embeddings`, `CNX-PathLLM/GTEx-TCGA-KMeans-Embeddings`, and `CNX-PathLLM/GMM_Embeddings`. After downloading, please unzip them into their respective folders, `TCGA-Embedding` and `GMM_Embedding`.
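As one possible way to fetch the pre-extracted features, the `huggingface-cli download` command can pull a dataset repository into a local directory. The local directory names below are placeholders; adjust them to match the embedding paths you configure in the training scripts.

```
# Sketch: fetch the pre-extracted CONCH features (local dirs are placeholders)
huggingface-cli download CNX-PathLLM/GTEx-TCGA-Embeddings --repo-type dataset --local-dir ./TCGA-Embedding
huggingface-cli download CNX-PathLLM/GMM_Embeddings --repo-type dataset --local-dir ./GMM_Embedding
# If the repositories contain zip archives, unzip them in place:
unzip './TCGA-Embedding/*.zip' -d ./TCGA-Embedding
```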
Please ensure you have access to all the datasets.
After completing all the setups mentioned above and setting up the correct Python environment, you can start the training process using the provided shell script, e.g., `run_wsi_stage*.sh`, or follow the instructions in the [Train Step](#train-step-1) section below.
Do not forget to adjust the TCGA and GMM embedding paths to reflect your own file locations.
## Settings
### Different Aggregate Strategies
You can change the aggregation strategy using the `--agg_strategy` flag; supported values include `sample`, `kmeans`, `gmm`, `abmil`, `qformer`, and `longnet`. You can also reproduce the `hybrid` method described in our paper by setting `--agg_strategy gmm,longnet` in the `.sh` script.
### Configurable Settings
```
--vision_adaptor False (vision-query-question interaction)
--vision_adaptor True (vision-query interaction)
--hierarchical_adaptor False (same adaptor for all levels)
--hierarchical_adaptor True (different adaptors for different levels)
```
## Train Step 1 ##
```
accelerate launch --config_file=./accelerate_configs/deepspeed_zero2.yaml run_wsi.py --learning_rate 1e-4 --num_train_epochs 20 --warmup_steps 1000 \
--gpu 2 --train_batch_size 4 --eval_batch_size 2 --max_seq_length 512 \
--agg_strategy gmm,longnet --embed_dim 512 --vision_adaptor False --hierachical_token True --hierachical_adaptor True \
--n_heads 32,16,8 --llm_requires_grad False --resume_from_checkpoint False \
--llm_name /data_local/pxb/LLM_models/llama3/llama3.1-8b-instruct \
--dataset_name_list CNX-PathLLM/TCGA-WSI-Description-4onew,CNX-PathLLM/TCGA-WSI-Description-4omini,CNX-PathLLM/GTEx-WSI-Description \
--data_cache_dir /data_local/pxb/CNX-PathLLM/.cache \
--fea_root /path/to/CNX-PathLLM/GTEx-TCGA-Embeddings \
--gmm_root /path/to/GMM_Embeddings \
--output_dir path/to/output/of/step1
```
## Train Step 2 ##
```
accelerate launch --config_file=./accelerate_configs/deepspeed_zero2.yaml run_wsi.py --num_train_epochs 5 --warmup_steps 1000 \
--gpu 2 --train_batch_size 8 --eval_batch_size 2 --max_seq_length 256 \
--agg_strategy gmm,longnet --embed_dim 512 --vision_adaptor False --hierachical_token True --hierachical_adaptor True \
--n_heads 32,16,8 --llm_requires_grad True --resume_from_checkpoint False \
--llm_name /data_local/pxb/LLM_models/llama3/llama3.1-8b-instruct \
--dataset_name_list CNX-PathLLM/TCGA-WSI-CloseQA-Balanced,CNX-PathLLM/GTEx-WSI-CloseQA-Balanced,CNX-PathLLM/TCGA-WSI-OpenQA,CNX-PathLLM/GTEx-WSI-OpenQA \
--data_cache_dir /data_local/pxb/CNX-PathLLM/.cache \
--fea_root /path/to/CNX-PathLLM/GTEx-TCGA-Embeddings \
--gmm_root /path/to/GMM_Embeddings \
--output_dir path/to/output/of/step2 \
--ckpt_path path/to/ckpt.bin/of/step1
```
## Train Step 3 ##
You can continue training (`--ckpt_path path/to/ckpt.bin/of/step2`) with the detailed TCGA-BRCA datasets (`CNX-PathLLM/TCGA-BRCA-Details-CloseQA,CNX-PathLLM/TCGA-BRCA-Details-OpenQA`).
You can also continue training (`--ckpt_path path/to/ckpt.bin/of/step2`) with the morphological descriptions generated by [PathChat](https://www.nature.com/articles/s41586-024-07618-3) for TCGA-STAD, TCGA-KIRC, and TCGA-OV using `CNX-PathLLM/PathChat_CloseQA_Balanced,CNX-PathLLM/PathChat_OpenQA`.
Make sure you have access to these datasets, and modify the commands above to use the dataset you want.
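Concretely, a Step 3 run can reuse the Step 2 command with only the dataset list, checkpoint path, and output directory swapped. The sketch below uses the TCGA-BRCA datasets; all paths are placeholders to replace with your own.

```
accelerate launch --config_file=./accelerate_configs/deepspeed_zero2.yaml run_wsi.py --num_train_epochs 5 --warmup_steps 1000 \
--gpu 2 --train_batch_size 8 --eval_batch_size 2 --max_seq_length 256 \
--agg_strategy gmm,longnet --embed_dim 512 --vision_adaptor False --hierachical_token True --hierachical_adaptor True \
--n_heads 32,16,8 --llm_requires_grad True --resume_from_checkpoint False \
--llm_name /data_local/pxb/LLM_models/llama3/llama3.1-8b-instruct \
--dataset_name_list CNX-PathLLM/TCGA-BRCA-Details-CloseQA,CNX-PathLLM/TCGA-BRCA-Details-OpenQA \
--data_cache_dir /data_local/pxb/CNX-PathLLM/.cache \
--fea_root /path/to/CNX-PathLLM/GTEx-TCGA-Embeddings \
--gmm_root /path/to/GMM_Embeddings \
--output_dir path/to/output/of/step3 \
--ckpt_path path/to/ckpt.bin/of/step2
```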
## Checkpoints
- `Llama-slideQA.bin`: trained with general QA following [Train Step 2](#train-step-2).
- `Llama-slideQA-morphology.bin`: trained with detailed morphological QA generated by PathChat following [Train Step 3](#train-step-3).
- `Llama-slideQA-BRCA.bin`: trained with the detailed TCGA-BRCA dataset following [Train Step 3](#train-step-3).
## Test of Step2 General QA ##
```
python test_wsi.py --max_seq_length 128 --batch_size 1 --select_data_num -1 --eval_sample_size -1 --n_heads 32,16,8 --llm_name /data_local/pxb/LLM_models/llama3/llama3.1-8b-instruct --vision_adaptor False --hierachical_token True --hierachical_adaptor True \
--shuffle False --data_cache_dir /data_local/pxb/CNX-PathLLM/.cache \
--dataset_name_list CNX-PathLLM/TCGA-WSI-CloseQA-Balanced,CNX-PathLLM/GTEx-WSI-CloseQA-Balanced,CNX-PathLLM/TCGA-WSI-OpenQA,CNX-PathLLM/GTEx-WSI-OpenQA \
--agg_strategy gmm,longnet --embed_dim 512 \
--fea_root /path/to/CNX-PathLLM/GTEx-TCGA-Embeddings \
--gmm_root /path/to/GMM_Embeddings \
--ckpt_path path/to/ckpt.bin/of/step2 \
--results_save_path /path/to/the/output.csv \
--use_peft False
```
## Test of Step3 Specific QA ##
```
# For the PathChat morphology checkpoint, replace the dataset list below with
# CNX-PathLLM/PathChat_CloseQA_Balanced,CNX-PathLLM/PathChat_OpenQA
python test_wsi.py --max_seq_length 128 --batch_size 1 --select_data_num -1 --eval_sample_size -1 --n_heads 32,16,8 --llm_name /data_local/pxb/LLM_models/llama3/llama3.1-8b-instruct --vision_adaptor False --hierachical_token True --hierachical_adaptor True \
--shuffle False --data_cache_dir /data_local/pxb/CNX-PathLLM/.cache \
--dataset_name_list CNX-PathLLM/TCGA-BRCA-Details-CloseQA,CNX-PathLLM/TCGA-BRCA-Details-OpenQA \
--agg_strategy gmm,longnet --embed_dim 512 \
--fea_root /path/to/CNX-PathLLM/GTEx-TCGA-Embeddings \
--gmm_root /path/to/GMM_Embeddings \
--ckpt_path path/to/ckpt.bin/of/step3 \
--results_save_path /path/to/the/output.csv \
--use_peft False
```
## Toy test case
For a quick demo, you can use the toy datasets below; there is no need to download the full TCGA & GTEx embeddings.
Embeddings: `CNX-PathLLM/Toy-GTEx-TCGA-Embeddings`, `CNX-PathLLM/Toy_GMM_Embeddings`
Datasets (Slide-QA): `CNX-PathLLM/CloseQA-Toy`, `CNX-PathLLM/OpenQA-Toy`
Follow the same instructions as in [Test of Step2 General QA](#test-of-step2-general-qa), setting
```
--dataset_name_list CNX-PathLLM/CloseQA-Toy,CNX-PathLLM/OpenQA-Toy \
--fea_root /path/to/CNX-PathLLM/Toy-GTEx-TCGA-Embeddings \
--gmm_root /path/to/Toy_GMM_Embeddings \
```
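Putting it together, a toy run of the Step 2 test command might look like the following; the checkpoint, cache, and output paths are placeholders to replace with your own.

```
python test_wsi.py --max_seq_length 128 --batch_size 1 --select_data_num -1 --eval_sample_size -1 --n_heads 32,16,8 \
--llm_name /path/to/llama3.1-8b-instruct --vision_adaptor False --hierachical_token True --hierachical_adaptor True \
--shuffle False --data_cache_dir /path/to/.cache \
--dataset_name_list CNX-PathLLM/CloseQA-Toy,CNX-PathLLM/OpenQA-Toy \
--agg_strategy gmm,longnet --embed_dim 512 \
--fea_root /path/to/CNX-PathLLM/Toy-GTEx-TCGA-Embeddings \
--gmm_root /path/to/Toy_GMM_Embeddings \
--ckpt_path path/to/ckpt.bin/of/step2 \
--results_save_path /path/to/the/output.csv \
--use_peft False
```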
## Disclaimer
This repository and all associated models are intended solely for academic research and non-commercial use. The model involves medical data (e.g., TCGA, GTEx) and pathology-related tasks, but is not approved for clinical diagnosis or medical decision-making.
The developers are not responsible for any misuse of this code or model in medical or commercial contexts.
## License
This model is built on Meta's Llama 3.1 model as part of its architecture and is released under the Llama 3.1 Community License.