A gastrointestinal physician (G1)
segmented 550 endoscopy images of rectal tumors into tumor and non-tumor
regions. To quantify the inter-observer variability, a second gastrointestinal physician
(G2) contoured 319 of the images independently.
The 550 images and corresponding annotations
from G1 were divided into training (408 images), validation (82), and test (60) sets. Three
deep learning architectures were trained: a fully convolutional network
(FCN32), a U-Net, and a SegNet. These architectures have been used for robust
medical image segmentation in previous studies.
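For concreteness, the 408/82/60 split of the 550 G1-annotated images could be produced as below. The text does not state whether the assignment was random or sequential, so the shuffled split (and the seed) is an assumption for illustration.

```python
import random

def split_dataset(n_images=550, n_train=408, n_val=82, seed=0):
    # Shuffle image indices reproducibly, then slice into three disjoint subsets.
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

train_ids, val_ids, test_ids = split_dataset()  # 408 / 82 / 60 images
```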
All models were trained on a CPU
supercomputing cluster. Data augmentation in the form of random image
transformations, including scaling, rotation, shearing, Gaussian blurring, and
noise addition, was used to improve the models' robustness.
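A minimal sketch of one such augmentation pass, assuming `scipy.ndimage` is available; the parameter ranges below are illustrative assumptions, not the settings used in training.

```python
import numpy as np
from scipy import ndimage

def augment(image, rng):
    # Random isotropic scaling (affine_transform keeps the input shape).
    s = rng.uniform(0.9, 1.1)
    image = ndimage.affine_transform(image, [[1.0 / s, 0.0], [0.0, 1.0 / s]],
                                     mode="nearest")
    # Random rotation in degrees, without enlarging the array.
    image = ndimage.rotate(image, rng.uniform(-15.0, 15.0),
                           reshape=False, mode="nearest")
    # Random shear along the row axis.
    shear = rng.uniform(-0.1, 0.1)
    image = ndimage.affine_transform(image, [[1.0, shear], [0.0, 1.0]],
                                     mode="nearest")
    # Gaussian blurring with a random sigma.
    image = ndimage.gaussian_filter(image, sigma=rng.uniform(0.0, 1.0))
    # Additive Gaussian noise.
    return image + rng.normal(0.0, 0.01, size=image.shape)

rng = np.random.default_rng(0)
augmented = augment(rng.random((64, 64)), rng)
```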
The neural networks' raw output was post-processed with
noise removal and hole filling before evaluation. Finally, the segmentations
from G2 and the neural networks' predictions were compared against the ground-truth
labels from G1.
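The post-processing and comparison steps might look as follows. The component-size threshold (`min_size`) and the use of the Dice coefficient as the comparison metric are illustrative assumptions; the text does not specify either.

```python
import numpy as np
from scipy import ndimage

def postprocess(mask, min_size=50):
    # Drop connected components smaller than min_size pixels (noise removal).
    labeled, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labeled, index=range(1, n + 1))
    keep_labels = [i + 1 for i, size in enumerate(sizes) if size >= min_size]
    cleaned = np.isin(labeled, keep_labels)
    # Fill interior holes in the surviving regions.
    return ndimage.binary_fill_holes(cleaned)

def dice(pred, ref):
    # Dice similarity coefficient between two binary masks.
    pred, ref = pred.astype(bool), ref.astype(bool)
    denom = pred.sum() + ref.sum()
    return 2.0 * np.logical_and(pred, ref).sum() / denom if denom else 1.0
```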