Vienna, Austria

ESTRO 2023

Poster (Digital)
Interobserver study of deep learning-based segmentation for nodal target volumes in breast cancer
Nienke Hoekstra, The Netherlands


Chrissy Adriaans1,2, Frank Dankers2, Chrysi Papalazarou2, Nienke Hoekstra2

1Delft University of Technology, Technical Medicine, Delft, The Netherlands; 2Leiden University Medical Center, Radiation Oncology, Leiden, The Netherlands

Purpose or Objective

Adjuvant radiotherapy (RT) after breast-conserving surgery for breast cancer reduces the risk of local recurrence and increases overall survival. Accurate segmentation of the target volumes and organs at risk (OARs) is a crucial step in the RT workflow: it allows an adequate therapeutic radiation dose to be delivered to the target volumes while sparing the OARs. Target volumes comprise the tumor bed, whole breast, and the axillary (L1-L4) and interpectoral (IP) lymph nodes. Delineating these target volumes is a labor-intensive task and is known to have high interobserver variability (IOV).
This study aims to compare the IOV of manual segmentations of the regional lymph node levels to the IOV of deep learning (DL)-based segmentations manually corrected by radiation oncologists (ROs). Using DL-based segmentation as the initial segmentation in the RT workflow might improve consistency and reproducibility, and reduce the workload of ROs.

Material and Methods

Five experienced ROs manually delineated the axillary (L1-L4) and IP lymph node levels on the planning CTs of two breast cancer patients requiring locoregional adjuvant RT, using the RayStation treatment planning system (TPS; RayStation 10B, RaySearch Laboratories). Additionally, DL-based segmentations were generated using a 3D U-net convolutional neural network (CNN) available in the TPS, which was pretrained on delineations following the ESTRO delineation guidelines. After a two-week interval, the ROs corrected the DL-based segmentations until they were clinically acceptable, yielding corrected DL (cDL) segmentations. The time to create the manual and cDL segmentations was recorded in minutes for each RO. IOV within the manual segmentations and within the cDL segmentations was quantified using the Dice Similarity Coefficient (DSC), surface DSC (sDSC), and 95% Hausdorff Distance (HD). A Wilcoxon signed-rank test was used to assess whether IOV differed significantly between the manual and cDL segmentations.
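As a rough illustration of how IOV can be summarised with the DSC, the sketch below computes the DSC for every pair of observers and takes the median. It is not the analysis code used in this study: the function names (`dice`, `pairwise_dice`, `square`), the toy 2D masks, and the choice to score two empty masks as 1.0 are all assumptions made for the example. Real inputs would be 3D binary masks exported from the TPS, one per RO and lymph node level.

```python
from itertools import combinations

import numpy as np


def dice(a, b):
    """Dice Similarity Coefficient between two binary masks of equal shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total


def pairwise_dice(masks):
    """DSC for every unordered pair of observers."""
    pairs = combinations(range(len(masks)), 2)
    return [dice(masks[i], masks[j]) for i, j in pairs]


def square(start, stop, size=10):
    """Hypothetical toy 'delineation': a filled square on a size x size grid."""
    mask = np.zeros((size, size), dtype=bool)
    mask[start:stop, start:stop] = True
    return mask


# Five toy observers standing in for five ROs (synthetic data, not study data).
toy_masks = [square(2, 6), square(2, 6), square(3, 7), square(2, 7), square(3, 6)]
scores = pairwise_dice(toy_masks)
median_dsc = float(np.median(scores))
```

With five observers there are ten observer pairs per structure, so `scores` holds ten values. Computing the same list for the manual and the cDL segmentations would give the paired samples that a signed-rank test compares.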


Results

An example of the delineations is shown in Figure 1. The median DSC of all lymph node levels combined was 0.67 (IQR 0.071) for the manual segmentations and 0.90 (IQR 0.040) for the cDL segmentations. The cDL segmentations had a significantly higher DSC for every individual lymph node level (Figure 2). Additionally, all median 95% HD and sDSC values improved (Table 1). The median time investment was 13 minutes (10-17) for the manual segmentations and 9 minutes (5-16) for the cDL segmentations.
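The relative time saving implied by the reported median times can be checked with a short calculation. The helper name `relative_reduction` is hypothetical. The inputs are the two median values reported in this abstract.

```python
def relative_reduction(before, after):
    """Fractional reduction when going from `before` to `after`."""
    return (before - after) / before


manual_minutes = 13  # median time for manual segmentation
cdl_minutes = 9      # median time for corrected DL segmentation

reduction = relative_reduction(manual_minutes, cdl_minutes)
print(f"{reduction:.1%}")  # prints 30.8%
```

A reduction of 4 out of 13 minutes is roughly 31%, or about one third of the median delineation time.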

Table 1. Similarity metrics (median), including 95% HD [mm]. Table values are not reproduced here.

Conclusion

This study shows that using deep learning auto-segmentation as the starting point for delineation results in higher interobserver agreement in the nodal target volumes for breast cancer patients. All similarity metrics improved compared to the manual delineations. Implementation of deep learning auto-segmentation therefore yields more consistent and reproducible segmentations, while reducing the median delineation time from 13 to 9 minutes, a reduction of about one third.