Deep Learning based medical image segmentation — Part 7 — Inference

Diya S
6 min read · Jul 17, 2023


Comparison of Ground Truth and Predicted Glioblastoma tumor segmentation. Image by Author

We have completed the training of our nnUNet deep learning model on the TCIA Glioblastoma MRI dataset. Now, let’s find the best configuration and run inference on our test dataset — that is, we will use the trained model to predict tumor segmentation on unseen images.

  • Open a new Colab notebook — 06_t501_glio_model_inference.ipynb
  • Make sure you have selected the A100 GPU and the High-RAM option in the Runtime menu
  • Mount your Google Drive
from google.colab import drive
drive.mount('/content/drive')
  • Install nnUNet V2 in this instance of your VM
!pip install nnunetv2
  • Import packages and set your environment variables
import os
os.environ['nnUNet_raw'] = "/content/drive/MyDrive/TCIA/nnUNet/nnUNet_raw"
os.environ['nnUNet_preprocessed'] = "/content/drive/MyDrive/TCIA/nnUNet/nnUNet_preprocessed"
os.environ['nnUNet_results'] = "/content/drive/MyDrive/TCIA/nnUNet/nnUNet_results"
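Before proceeding, it can help to verify that the three folders actually exist on the mounted Drive. This is a small optional sanity check (my own sketch, not part of nnUNet) that catches typos in the paths early:

```python
import os

def check_nnunet_dirs(var_names=("nnUNet_raw", "nnUNet_preprocessed", "nnUNet_results")):
    """Return a dict mapping each nnUNet env var to whether its folder exists."""
    status = {}
    for name in var_names:
        path = os.environ.get(name)
        status[name] = path is not None and os.path.isdir(path)
    return status

# Report any variables that are unset or point to a missing folder
for var, ok in check_nnunet_dirs().items():
    if not ok:
        print(f"WARNING: {var} is unset or its folder does not exist")
```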
  • If you used the “--npz” parameter during model training, the softmax outputs were saved during final validation, and nnUNet can now use them to suggest the best configuration for inference. It reports the best Dice score (0.787 in this case) and also tries different ensembles to see whether the Dice score improves. nnUNet prints the exact commands to run for inference (prediction) and for post-processing, as shown in the results below.
# The following is an operating system command, remember to prefix with !
!nnUNetv2_find_best_configuration Dataset501_Glioblastoma -c 3d_fullres
***All results:***
nnUNetTrainer__nnUNetPlans__3d_fullres: 0.7871442290536131

*Best*: nnUNetTrainer__nnUNetPlans__3d_fullres: 0.7871442290536131

***Determining postprocessing for best model/ensemble***
Removing all but the largest foreground region did not improve results!
Removing all but the largest component for 1 did not improve results! Dice before: 0.73489 after: 0.71574
Removing all but the largest component for 2 did not improve results! Dice before: 0.80715 after: 0.79979
Removing all but the largest component for 3 did not improve results! Dice before: 0.8194 after: 0.80888

***Run inference like this:***

nnUNetv2_predict -d Dataset501_Glioblastoma -i INPUT_FOLDER -o OUTPUT_FOLDER -f 0 1 2 3 4 -tr nnUNetTrainer -c 3d_fullres -p nnUNetPlans

***Once inference is completed, run postprocessing like this:***

nnUNetv2_apply_postprocessing -i OUTPUT_FOLDER -o OUTPUT_FOLDER_PP -pp_pkl_file /content/drive/MyDrive/TCIA/nnUNet/nnUNet_results/Dataset501_Glioblastoma/nnUNetTrainer__nnUNetPlans__3d_fullres/crossval_results_folds_0_1_2_3_4/postprocessing.pkl -np 8 -plans_json /content/drive/MyDrive/TCIA/nnUNet/nnUNet_results/Dataset501_Glioblastoma/nnUNetTrainer__nnUNetPlans__3d_fullres/crossval_results_folds_0_1_2_3_4/plans.json
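The Dice score that nnUNet reports above measures the overlap between the predicted and ground-truth masks: twice the intersection divided by the sum of the two mask sizes, ranging from 0 (no overlap) to 1 (perfect overlap). A minimal NumPy sketch of the metric, for intuition only (nnUNet computes it internally):

```python
import numpy as np

def dice_score(pred, truth):
    """Dice coefficient: 2*|A∩B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom

# Two toy 1-D "masks" with 2 overlapping voxels out of 3 + 3 labeled
a = [1, 1, 1, 0, 0]
b = [0, 1, 1, 1, 0]
print(dice_score(a, b))  # 2*2 / (3+3) ≈ 0.667
```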
  • Now we can run inference as shown below. The input folder is the imagesTs folder, which holds the test-cohort images that our trained model has never seen. The output folder is a separate, empty folder called inference that we created earlier to store the predicted segmentation files. You can see from the output that it is running inference on 27 cases, roughly 20% of the overall dataset of 147 cases, and the same set that we split off as the test dataset when we started the project.
# The following is an operating system command, remember to prefix with !
!nnUNetv2_predict -d Dataset501_Glioblastoma -i /content/drive/MyDrive/TCIA/nnUNet/nnUNet_raw/Dataset501_Glioblastoma/imagesTs -o /content/drive/MyDrive/TCIA/nnUNet/nnUNet_results/Dataset501_Glioblastoma/inference -f 0 1 2 3 4 -tr nnUNetTrainer -c 3d_fullres -p nnUNetPlans
#######################################################################
Please cite the following paper when using nnU-Net:
Isensee, F., Jaeger, P. F., Kohl, S. A., Petersen, J., & Maier-Hein, K. H. (2021). nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nature methods, 18(2), 203-211.
#######################################################################

There are 27 cases in the source folder
I am process 0 out of 1 (max process ID is 0, we start counting with 0!)
There are 27 cases that I would like to predict
using pin_memory on device 0

Predicting 101:
perform_everything_on_gpu: True
Prediction done, transferring to CPU if needed
sending off prediction to background worker for resampling and export
done with 101

Predicting 114:
perform_everything_on_gpu: True
Prediction done, transferring to CPU if needed
sending off prediction to background worker for resampling and export
done with 114

Predicting 117:
perform_everything_on_gpu: True
Prediction done, transferring to CPU if needed
sending off prediction to background worker for resampling and export
done with 117

Predicting 124:
perform_everything_on_gpu: True
Prediction done, transferring to CPU if needed
sending off prediction to background worker for resampling and export
done with 124

......
......

Predicting 93:
perform_everything_on_gpu: True
Prediction done, transferring to CPU if needed
sending off prediction to background worker for resampling and export
done with 93
  • You can see that the model has predicted the tumor segmentation on the test dataset and saved the segmentation files in the inference folder.
Tumor segmentation files predicted by our trained model. Image by Author
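To confirm that a prediction was written for every test case, a quick directory listing works. A sketch, assuming the inference folder path passed to the -o argument above:

```python
import os

def list_segmentations(folder):
    """Return sorted .nii.gz segmentation filenames in a folder."""
    return sorted(f for f in os.listdir(folder) if f.endswith(".nii.gz"))

inference_dir = "/content/drive/MyDrive/TCIA/nnUNet/nnUNet_results/Dataset501_Glioblastoma/inference"
if os.path.isdir(inference_dir):
    preds = list_segmentations(inference_dir)
    print(f"{len(preds)} segmentation files found")  # we expect 27 for this test set
```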
  • Now let’s run the post-processing step to generate the final segmentation files using the best model/ensemble as determined by nnUNet based on the Dice score. The input folder is our inference folder, and the output folder is a separate, empty folder called postprocessing that we had created earlier.
# The following is an operating system command, remember to prefix with !
!nnUNetv2_apply_postprocessing -i /content/drive/MyDrive/TCIA/nnUNet/nnUNet_results/Dataset501_Glioblastoma/inference -o /content/drive/MyDrive/TCIA/nnUNet/nnUNet_results/Dataset501_Glioblastoma/postprocessing -pp_pkl_file /content/drive/MyDrive/TCIA/nnUNet/nnUNet_results/Dataset501_Glioblastoma/nnUNetTrainer__nnUNetPlans__3d_fullres/crossval_results_folds_0_1_2_3_4/postprocessing.pkl -np 8 -plans_json /content/drive/MyDrive/TCIA/nnUNet/nnUNet_results/Dataset501_Glioblastoma/nnUNetTrainer__nnUNetPlans__3d_fullres/crossval_results_folds_0_1_2_3_4/plans.json
  • You will see the final predicted segmentation files in the postprocessing folder.
Final tumor segmentation files after post-processing. Image by Author
  • We can do a visual inspection of one of the predicted segmentation files using the ITK-SNAP software. We will also open the ground truth segmentation file, in which an expert radiologist demarcated the tumor regions. Remember, a visual comparison of ground truth and predicted segmentations is neither recommended nor sufficient to determine clinical applicability. Several model performance metrics, such as sensitivity, Hausdorff distance, and volumetric analysis, need to be computed to assess the model’s prediction accuracy.
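As an illustration of what such a quantitative comparison involves, here is a sketch of per-label Dice and sensitivity in plain NumPy. This is my own illustrative code, not an nnUNet utility; the label values (1, 2, 3) match the tumor subregions in this dataset, and the commented file paths are hypothetical (loading .nii.gz files would require a library such as nibabel):

```python
import numpy as np
# import nibabel as nib  # would be needed to load .nii.gz files in practice

def per_label_metrics(pred, truth, labels=(1, 2, 3)):
    """Per-label Dice and sensitivity (recall) for two integer label maps."""
    pred = np.asarray(pred)
    truth = np.asarray(truth)
    out = {}
    for lab in labels:
        p = (pred == lab)
        t = (truth == lab)
        tp = np.logical_and(p, t).sum()      # true-positive voxels
        denom = p.sum() + t.sum()
        dice = 2.0 * tp / denom if denom else 1.0
        sens = tp / t.sum() if t.sum() else 1.0
        out[lab] = {"dice": dice, "sensitivity": sens}
    return out

# In practice (hypothetical paths, following the folder layout in this series):
# pred  = nib.load("postprocessing/20.nii.gz").get_fdata()
# truth = nib.load("images_segm/UPENN-GBM-00020_11_segm.nii.gz").get_fdata()
# print(per_label_metrics(pred, truth))
```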
  • Let’s choose patient #20 from the test cohort. You will need the main MRI scan and the segmentation file for this patient. In ITK-SNAP, open the T1GD scan from imagesTs for patient #20; we had renamed it to 20_0001.nii.gz during pre-processing. The ground truth segmentation file is in the images_segm folder that we downloaded from TCIA and is called UPENN-GBM-00020_11_segm.nii.gz. See my Part 5 — Data Preprocessing post to learn how to open the main MRI image and overlay the segmentation file on top.
Ground truth segmentation overlaid on a T1-GD MRI scan. Image by Author
  • Now download the post-processed, predicted tumor segmentation file for patient 20 from the postprocessing folder and open it in ITKSnap.
Predicted tumor segmentation overlaid on the T1-GD MRI scan. Image by Author
  • Notice that the slice number (78 of 155) is the same for both the ground truth and the predicted segmentation. You can see the tumor subregions clearly segmented by our model. You can scroll through the slices in ITK-SNAP (1–155 in this case) to see how the tumor appears as you move from the top of the head down to the neck.

How accurate is it? Our Dice score is 0.787 on a training dataset of just 117 patients, for under $10 in GPU spend. Not bad at all, but not clinical-grade accuracy. We can tweak the training hyperparameters to see if we can achieve a higher Dice score, but more training data would certainly improve accuracy significantly. You may get different results depending on the patients in your training split and the training parameters you used.

This has been a great series for me to write, and I truly believe AI can help better diagnose and treat this horrendous, malignant tumor. I hope you have been able to follow along and have learned about automated tumor segmentation along the way. You can also try this approach on other datasets, such as the BraTS challenge.

Thank you for reading!


Written by Diya S

I am a high school student on a journey to understand how disruptive technologies can transform and democratize healthcare in the coming decade. D.S
