estenhl committed on
Commit 4f9da36 · 1 Parent(s): b42b662

Working on preprocess and predict container
README.md CHANGED
@@ -93,6 +93,7 @@ All the approaches described below rely on having the IXI dataset downloaded. If
 python tutorials/download_ixi.py
 ```
 ## Generate predictions
+
 <details>
 <summary> Preprocess and predict manually </summary>
 
@@ -101,11 +102,7 @@ Preprocessing and predicting manually relies on using the scripts provided in th
 ### Preprocessing
 The images must be preprocessed using FastSurfer. First, FastSurfer must be downloaded. If any of the subsequent steps fail, a comprehensive installation guide can be found in the [FastSurfer GitHub repository](https://github.com/Deep-MI/FastSurfer/blob/dev/doc/overview/INSTALL.md#native-ubuntu-2004-or-ubuntu-2204). The following steps download and install FastSurfer into the folder `~/repos/fastsurfer`. First, some system packages must be installed:
 ```
-sudo apt-get update && apt-get install -y --no-install-recommends \
-wget \
-git \
-ca-certificates \
-file
+sudo apt-get update && apt-get install -y --no-install-recommends wget git ca-certificates file
 ```
 Next, we can clone FastSurfer, and change to the correct branch:
 ```
@@ -141,3 +138,46 @@ mkdir ~/data/ixi/outputs
 python scripts/predict_from_fastsurfer_folder.py ~/data/ixi/preprocessed -d ~/data/ixi/outputs/predictions.csv
 ```
 </details>
+
+<details>
+<summary> Preprocess and predict in two steps via docker </summary>
+Preprocessing and predicting in two steps via docker relies on the two prebuilt docker containers, one for each step.
+
+### Preprocessing
+Running the container for preprocessing requires mounting three volumes:
+- Inputs: A folder containing input data. All NIfTI files detected in this folder or one of its subfolders will be processed
+- Outputs: A folder where the preprocessed images will be written. This must be created prior to running the container
+- Licenses: A folder containing the FreeSurfer license. The file must be named `freesurfer.txt`
+```
+mkdir -p ~/data/ixi/outputs
+docker run --rm \
+--user $(id -u):$(id -g) \
+--volume $HOME/data/ixi/images:/input \
+--volume $HOME/data/ixi/outputs:/output \
+--volume <path_to_licenses>:/licenses \
+--gpus all \
+estenhl/pyment-preprocessing:1.0.0
+```
+
+### Generate predictions
+Running the container for predictions requires two volumes:
+- Fastsurfer: The folder containing FastSurfer-processed images
+- Outputs: The folder where the predictions are written
+```
+docker run --rm -it \
+--user $(id -u):$(id -g) \
+--volume $HOME/data/ixi/outputs/fastsurfer:/fastsurfer \
+--volume $HOME/data/ixi/outputs:/output \
+--gpus all \
+estenhl/pyment-predict:1.0.0
+```
+
+</details>
+
+## Evaluate predictions
+
+Evaluate the IXI predictions with
+```
+python tutorials/evaluate_ixi_predictions.py
+```
+If everything is set up correctly, this should yield an MAE of 3.12. Note that the paths to both the labels and the predictions can be given as keyword arguments to the script if they don't reside in the standard locations.
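The MAE reported above is the mean absolute error between the true ages and the predicted ages. As a minimal sketch of the metric itself (the age values below are made up for illustration, not IXI data):

```python
# Hypothetical true and predicted ages, only to illustrate the metric
true_age = [30.0, 45.0, 60.0]
pred_age = [32.0, 44.0, 57.0]

# Mean absolute error: average of |true - predicted| over all subjects
mae = sum(abs(t - p) for t, p in zip(true_age, pred_age)) / len(true_age)
print(f'MAE: {mae:.2f}')  # MAE: 2.00
```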
docker/README.md CHANGED
@@ -1,24 +1,20 @@
-## Build docker container for preprocessing
-Note that for now, building the container requires a folder called <checkpoints> that contains the FastSurfer segmentation checkpoints
-
+# Building docker containers
+
+## Building docker container for preprocessing
+Note that for now, building the container requires a folder called `checkpoints` that contains the FastSurfer segmentation checkpoints in a subfolder called `fastsurfer`. This folder should contain the files `aparc_vinn_axial_v2.0.0.pkl`, `aparc_vinn_coronal_v2.0.0.pkl`, and `aparc_vinn_sagittal_v2.0.0.pkl`. The command should be run from the root of the repository:
+
 ```
 docker build \
 -f docker/preprocess.Dockerfile \
--t pyment/preprocessing:1.0.0 \
+-t estenhl/pyment-preprocessing:1.0.0 \
 .
 ```
 
-## Run docker container for preprocessing
-Running the container for preprocessing requires three volumes:
-- Inputs: A folder containing input data. All NIfTI files detected in this folder or one of its subfolders will be processed
-- Outputs: A folder where the preprocessed images will be written.
-- Licenses: A folder containing the FreeSurfer license
+## Building docker container for predictions
+Note that for now, building the container requires a folder called `checkpoints` that contains the multi-task model checkpoints in a subfolder called `pyment`. This folder should contain the files `sfcn-multi.data-00000-of-00001` and `sfcn-multi.index`. The command should be run from the root of the repository:
 ```
-docker run --rm \
---user $(id -u):$(id -g) \
---volume <path_to_input>:/input \
---volume <path_to_output>:/output \
---volume <path_to_licenses>:/licenses \
---gpus all \
-pyment/preprocessing:1.0.0
+docker build \
+-f docker/predict.Dockerfile \
+-t estenhl/pyment-predict:1.0.0 \
+.
 ```
docker/predict.Dockerfile CHANGED
@@ -1,11 +1,21 @@
-FROM estenhl/pyment-preprocessing:1.0.0
-
-RUN python -m venv /envs/pyment
-
-RUN mkdir /repos/pyment
-
-COPY . /repos/pyment
-
-RUN cd /repos/pyment && \
-    /envs/pyment/bin/pip install --upgrade pip && \
-    /envs/pyment/bin/pip install .
+FROM python:3.10.4-slim
+
+RUN mkdir -p /repos/pyment
+
+COPY scripts /repos/pyment/scripts
+COPY pyment /repos/pyment/pyment
+COPY pyproject.toml /repos/pyment/
+COPY README.md /repos/pyment/
+COPY LICENSE.md /repos/pyment/
+
+RUN pip install --upgrade pip poetry-core build && \
+    cd /repos/pyment && \
+    pip install --no-cache-dir .
+
+RUN mkdir -p /.pyment/weights && \
+    chmod -R 1777 /.pyment
+COPY checkpoints/pyment /.pyment/weights
+
+CMD ["python", "/repos/pyment/scripts/predict_from_fastsurfer_folder.py", \
+    "/fastsurfer", \
+    "-d", "/output/predictions.csv"]
docker/preprocess.Dockerfile CHANGED
@@ -1,7 +1,5 @@
 FROM python:3.10.2-slim
 
-#ARG CHECKPOINTS_FOLDER
-
 RUN apt-get update && apt-get install -y \
     apt-utils git \
     && rm -rf /var/lib/apt/lists/*
@@ -23,7 +21,7 @@ RUN /envs/fastsurfer/bin/pip install --upgrade pip && \
     /envs/fastsurfer/bin/pip install -r ${FASTSURFER_HOME}/requirements.txt
 
 #COPY ${CHECKPOINTS_FOLDER} ${FASTSURFER_HOME}/FastSurferCNN/checkpoints
-COPY checkpoints ${FASTSURFER_HOME}/FastSurferCNN/checkpoints
+COPY checkpoints/fastsurfer ${FASTSURFER_HOME}/FastSurferCNN/checkpoints
 
 RUN mkdir /scripts
 COPY scripts/preprocess.sh /scripts/preprocess.sh
docker/preprocess_and_predict.Dockerfile ADDED
@@ -0,0 +1,41 @@
+FROM estenhl/pyment-preprocessing:1.0.0
+
+RUN apt-get update && apt-get install -y \
+    make build-essential libssl-dev zlib1g-dev \
+    libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm \
+    libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev \
+    libffi-dev liblzma-dev git \
+    && rm -rf /var/lib/apt/lists/*
+
+ENV PYENV_ROOT=/root/.pyenv
+ENV PATH="$PYENV_ROOT/bin:$PATH"
+RUN curl https://pyenv.run | bash && \
+    echo 'eval "$(pyenv init -)"' >> ~/.bashrc
+
+RUN eval "$(pyenv init -)" && \
+    pyenv install 3.10.4
+
+RUN mkdir -p /envs && \
+    $PYENV_ROOT/versions/3.10.4/bin/python -m venv /envs/pyment
+
+RUN mkdir -p /repos/pyment
+
+COPY scripts /repos/pyment/scripts
+COPY pyment /repos/pyment/pyment
+COPY pyproject.toml /repos/pyment/
+COPY README.md /repos/pyment/
+COPY LICENSE.md /repos/pyment/
+
+RUN /envs/pyment/bin/pip install --upgrade pip poetry-core build && \
+    cd /repos/pyment && \
+    /envs/pyment/bin/pip install --no-cache-dir .
+
+CMD ["/bin/sh", "-c", \
+    "/scripts/preprocess.sh \
+    --license /licenses/freesurfer.txt \
+    --python /envs/fastsurfer/bin/python \
+    /inputs \
+    /outputs/fastsurfer \
+    && /envs/pyment/bin/python /repos/pyment/scripts/predict_from_fastsurfer_folder.py \
+    /outputs/fastsurfer \
+    -d /outputs/predictions.csv"]
pyment/__init__.py CHANGED
@@ -1,15 +1,20 @@
-import os
-import tomli
-
 def _get_version():
-    """Get version from pyproject.toml"""
-    pyproject_path = os.path.join(
-        os.path.dirname(__file__), os.pardir, 'pyproject.toml'
-    )
+    """Get version from package metadata (generated from pyproject.toml during installation)"""
+    try:
+        from importlib.metadata import version, PackageNotFoundError
+        return version('pyment')
+    except PackageNotFoundError:
+        import os
+        import tomli
 
-    with open(pyproject_path, 'rb') as f:
-        data = tomli.load(f)
+        pyproject_path = os.path.join(
+            os.path.dirname(__file__), os.pardir, 'pyproject.toml'
+        )
+        if os.path.exists(pyproject_path):
+            with open(pyproject_path, 'rb') as f:
+                data = tomli.load(f)
 
-    return data['project']['version']
+            return data['project']['version']
 
 __version__ = _get_version()
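The new `_get_version` follows a common packaging pattern: ask the installed package metadata first, and only fall back to reading `pyproject.toml` when running from an uninstalled source checkout. A minimal self-contained sketch of that pattern (the package name below is deliberately bogus so the fallback triggers, and the fallback string is an assumption, not part of pyment):

```python
from importlib.metadata import version, PackageNotFoundError

def get_version(package: str, fallback: str = '0.0.0+unknown') -> str:
    """Return the installed version of `package`, or `fallback` when the
    package metadata is not available (e.g. an uninstalled source tree)."""
    try:
        return version(package)
    except PackageNotFoundError:
        return fallback

print(get_version('surely-not-an-installed-package'))  # 0.0.0+unknown
```

Resolving the version from installed metadata also removes the runtime dependency on `tomli` and on `pyproject.toml` being shipped next to the package, which is exactly what broke inside the docker image.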
pyproject.toml CHANGED
@@ -14,6 +14,9 @@ requires-python = "==3.10.4"
 
 [tool.poetry]
 packages = [{include = "pyment"}]
+include = [
+    {path = "pyproject.toml", format = "sdist"}
+]
 
 [tool.poetry.dependencies]
 python = "3.10.4"
scripts/predict_from_fastsurfer_folder.py CHANGED
@@ -12,7 +12,10 @@ import nibabel as nib
 from pyment.models.sfcn import sfcn_factory
 from pyment.preprocessing.conform import conform
 
-
+logging.basicConfig(
+    format='%(asctime)s - %(levelname)s - %(name)s: %(message)s',
+    level=logging.DEBUG
+)
 logger = logging.getLogger(__name__)
 
 def _parse_folder_name(name: str) -> Tuple[str, str, str]:
@@ -25,7 +28,8 @@ def _parse_folder_name(name: str) -> Tuple[str, str, str]:
 
 def predict_from_fastsurfer_folder(
     source: str,
-    weights: str,
+    folders: List[str] = None,
+    weights: str = None,
     model_name: str = 'sfcn-multi',
     targets: List[str] = [
         'age', 'sex', 'handedness', 'bmi', 'fluid_intelligence', 'neuroticism'
@@ -42,13 +46,26 @@
 
     results = []
 
-    for folder in tqdm(os.listdir(source)):
+    logger.info('Reading fastsurfer folders from %s', source)
+
+    folders = (
+        folders if folders is not None
+        else [
+            folder for folder in os.listdir(source)
+            if os.path.isdir(os.path.join(source, folder))
+        ]
+    )
+
+    for folder in tqdm(folders):
         orig = os.path.join(source, folder, 'mri', 'orig.mgz')
 
         subject, session, run = _parse_folder_name(folder)
 
         if not os.path.isfile(orig):
-            logger.warning('No orig.mgz file for folder %s', folder)
+            logger.warning(
+                'No orig.mgz file for folder %s',
+                os.path.join(source, folder)
+            )
             continue
 
         orig = nib.load(orig)
@@ -56,10 +73,18 @@
         brainmask = os.path.join(source, folder, 'mri', 'mask.mgz')
 
         if not os.path.isfile(brainmask):
-            logger.warning('No mask.mgz file for folder %s', folder)
+            logger.warning(
+                'No mask.mgz file for folder %s',
+                os.path.join(source, folder)
+            )
+            continue
+
+        try:
+            brainmask = nib.load(brainmask)
+        except Exception as e:
+            logger.error('Error loading brainmask for folder %s: %s', folder, e)
             continue
 
-        brainmask = nib.load(brainmask)
         brainmask = brainmask.get_fdata()
 
         image = orig * brainmask
@@ -109,8 +134,9 @@ if __name__ == '__main__':
         default='multi-2025',
         help=(
             'Weights to use. Should either point to a local file path, or a '
-            'known identifier. If a local file path <path> is used, there should '
-            'exist files named <path>.index and <path>.data-00000-of-00001'
+            'known identifier. If a local file path <path> is used, there '
+            'should exist files named <path>.index and '
+            '<path>.data-00000-of-00001'
         )
     )
     parser.add_argument(
@@ -131,6 +157,15 @@
         ],
         help='Name to use for each of the prediction heads in the output CSV'
     )
+    parser.add_argument(
+        '-f', '--folders',
+        default=None,
+        nargs='+',
+        help=(
+            'List of folders to process. If not provided, all folders in '
+            'the source folder will be processed.'
+        )
+    )
     parser.add_argument(
         '-d', '--destination',
         required=False,
@@ -142,9 +177,10 @@
 
     predict_from_fastsurfer_folder(
         source=args.root,
+        folders=args.folders,
         model_name=args.model,
         weights=args.weights,
         targets=args.targets,
-        destination=args.destination
+        destination=args.destination,
     )
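The new `folders` argument to `predict_from_fastsurfer_folder` defaults to every directory directly under `source`. That selection logic can be exercised in isolation; the sketch below mirrors it against a temporary directory with hypothetical folder names (sorting is added here only to make the result deterministic, since `os.listdir` order is not guaranteed):

```python
import os
import tempfile

def select_folders(source, folders=None):
    # Mirrors the default in predict_from_fastsurfer_folder: an explicit
    # list wins, otherwise take every directory directly under `source`
    if folders is not None:
        return folders
    return sorted(
        f for f in os.listdir(source)
        if os.path.isdir(os.path.join(source, f))
    )

with tempfile.TemporaryDirectory() as source:
    os.mkdir(os.path.join(source, 'subject-01'))
    os.mkdir(os.path.join(source, 'subject-02'))
    open(os.path.join(source, 'notes.txt'), 'w').close()  # plain files are skipped

    print(select_folders(source))                   # ['subject-01', 'subject-02']
    print(select_folders(source, ['subject-01']))   # ['subject-01']
```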
scripts/preprocess.sh CHANGED
@@ -52,7 +52,6 @@ if [ -z "$INPUT" ] || [ -z "$OUTPUT" ]; then
     exit 1
 fi
 
-# Validate that license is provided
 if [ -z "$LICENSE" ]; then
     echo "Error: License is required"
     usage
@@ -86,14 +85,17 @@ fi
 
 # Loop through each NIFTI file path
 echo "$NIFTI_FILES" | while IFS= read -r filepath; do
-    # Extract filename without .nii.gz suffix
     IMAGE=$(basename "$filepath" .nii.gz)
-    $FASTSURFER_HOME/run_fastsurfer.sh \
-        --sd $OUTPUT \
-        --sid $IMAGE \
-        --t1 $filepath \
-        --fs_license $LICENSE \
-        --py $PYTHON \
-        --seg_only
+    if [ ! -f "$OUTPUT/$IMAGE/mri/mask.mgz" ]; then
+        $FASTSURFER_HOME/run_fastsurfer.sh \
+            --sd $OUTPUT \
+            --sid $IMAGE \
+            --t1 $filepath \
+            --fs_license $LICENSE \
+            --py $PYTHON \
+            --seg_only
+    else
+        echo "$IMAGE already processed"
+    fi
 done
tutorials/evaluate_ixi_predictions.py CHANGED
@@ -13,9 +13,8 @@ def evaluate_ixi_predictions(
     predictions: str
 ) -> None:
     labels = pd.read_excel(labels)
-    print(labels.head())
     predictions = pd.read_csv(predictions)
-    print(predictions.head())
+
     predictions['IXI_ID'] = predictions['source'].apply(
         lambda path: int(path.split('/')[-1][3:6])
     )
@@ -28,7 +27,7 @@ def evaluate_ixi_predictions(
     )
 
     mae = np.mean(np.abs(predictions['AGE'] - predictions['age_prediction']))
-    print(f'MAE: {mae}')
+    print(f'MAE: {mae:.2f}')
 
     plt.scatter(predictions['AGE'], predictions['age_prediction'])
     plt.xlabel('True age')
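The `IXI_ID` lambda in the script reads the three digits following the `IXI` prefix of the filename. The standalone sketch below shows the same parsing on a hypothetical path (the directory part is made up; only the `IXI<id>-...` filename pattern matters):

```python
# Hypothetical path following the IXI naming scheme (IXI<id>-<site>-...)
path = '/home/user/data/ixi/images/IXI002-Guys-0828-T1.nii.gz'

def ixi_id(path: str) -> int:
    # Same parsing as the script's lambda: take the filename and read
    # the three digits after the 'IXI' prefix
    filename = path.split('/')[-1]
    return int(filename[3:6])

print(ixi_id(path))  # 2
```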