We selected 195 patients with left breast cancer treated with 3D-CRT. We included patients with axillary and supraclavicular lymph nodes but excluded those with an internal mammary nodal (IMN) chain.
For model creation, we trained the CNN after renormalizing all plans to 2 Gy/fraction to account for differences in prescribed dose. We implemented a transfer learning approach, using a pre-trained VGG-16 and replacing its last three layers with a fully connected neural network.
The input was the contour information from the planning CT. The output was a 2D lung and heart DVH for each slice. The per-slice DVHs were then summed to obtain the final whole-OAR DVH.
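As an illustration of the slice-summation step, the following minimal sketch adds per-slice DVHs bin by bin. It assumes, which the text does not state, that each per-slice DVH is a cumulative histogram in absolute volume over a shared set of dose bins; the function names `sum_slice_dvhs` and `to_relative` are hypothetical.

```python
def sum_slice_dvhs(slice_dvhs):
    """Add per-slice DVHs bin by bin to obtain a whole-organ DVH.

    Assumption: every slice DVH is a list of absolute volumes (e.g. cm^3)
    on the same dose bins, so volumes in matching bins can simply be added.
    """
    if not slice_dvhs:
        raise ValueError("at least one slice DVH is required")
    n_bins = len(slice_dvhs[0])
    if any(len(dvh) != n_bins for dvh in slice_dvhs):
        raise ValueError("all slice DVHs must share the same dose bins")
    return [sum(dvh[i] for dvh in slice_dvhs) for i in range(n_bins)]


def to_relative(dvh):
    """Convert a cumulative absolute-volume DVH to fractional volume.

    Assumption: the first bin (dose >= 0 Gy) holds the whole organ volume.
    """
    organ_volume = dvh[0]
    if organ_volume == 0:
        return [0.0 for _ in dvh]
    return [v / organ_volume for v in dvh]


# Two hypothetical slices, three dose bins each.
organ_dvh = sum_slice_dvhs([[4.0, 2.0, 0.0], [6.0, 3.0, 1.0]])
relative_dvh = to_relative(organ_dvh)
```

Summing in absolute volume and normalizing only at the end keeps each slice weighted by its own contribution to the organ volume.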
For outlier detection, we partitioned our dataset into training, validation, and test sets (176, 10, and 10 patients, respectively). First, we trained the CNN with early stopping. Second, we evaluated how well the model fit the data in the test set and searched for potential outliers, using the sum-of-residuals method to measure the discrepancy between the predicted and the clinically approved DVH; we defined an outlier as any prediction whose sum of residuals was more than one standard deviation from the population mean. Finally, we repeated this two-step process with different partitions until every patient in the first training set had been in the test set once. At every iteration, we reinitialized all CNN parameters to avoid information leakage.
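The outlier criterion can be sketched as follows. This is one possible reading, not necessarily the exact implementation used: residuals are taken as signed differences (clinical minus predicted) per dose bin, the standard deviation is the population form, and a plan is flagged only when its sum of residuals exceeds the mean by more than one standard deviation. The function names and patient identifiers are hypothetical.

```python
from statistics import mean, pstdev


def sum_of_residuals(predicted, clinical):
    """Signed sum over dose bins of (clinical - predicted) DVH values.

    Assumption: both DVHs are sampled on the same dose bins.
    """
    if len(predicted) != len(clinical):
        raise ValueError("DVHs must share the same dose bins")
    return sum(c - p for p, c in zip(predicted, clinical))


def flag_outliers(residual_sums):
    """Return patient ids whose sum of residuals exceeds mean + 1 SD.

    Assumption: one-sided criterion with the population standard deviation.
    """
    values = list(residual_sums.values())
    threshold = mean(values) + pstdev(values)
    return [pid for pid, s in residual_sums.items() if s > threshold]


# Hypothetical predicted and clinically approved DVHs (fractional volume).
predicted = [1.0, 0.5, 0.1]
clinical = {
    "p1": [1.0, 0.5, 0.1],
    "p2": [1.0, 0.6, 0.1],
    "p3": [1.0, 0.4, 0.1],
    "p4": [1.0, 0.9, 0.5],
}
residual_sums = {pid: sum_of_residuals(predicted, dvh) for pid, dvh in clinical.items()}
flagged = flag_outliers(residual_sums)
```

With a one-sided threshold, only plans whose clinical DVH lies above the prediction are flagged, which matches the goal of finding plans that might be improved by re-optimization.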
Once we had selected all potential outliers, one researcher (M.L.) re-optimized all the plans. We recalculated the sum of residuals for these plans and built a confusion matrix with the model results.
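A confusion matrix of this kind can be tallied as in the sketch below. The reference label is an assumption made for illustration: here a plan counts as a true outlier if re-optimization improved it, a definition the text does not specify. The function name and the patient identifiers are hypothetical.

```python
def confusion_counts(model_flag, reference):
    """Tally a 2x2 confusion matrix from two dicts keyed by patient id.

    model_flag[pid] is True if the model flagged the plan as an outlier.
    reference[pid] is True if the plan is a true outlier under the chosen
    reference (assumed here: the plan improved after re-optimization).
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for pid, flagged in model_flag.items():
        truth = reference[pid]
        if flagged and truth:
            counts["TP"] += 1
        elif flagged and not truth:
            counts["FP"] += 1
        elif not flagged and truth:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Hypothetical labels for four plans.
model_flag = {"p1": False, "p2": True, "p3": False, "p4": True}
reference = {"p1": False, "p2": False, "p3": True, "p4": True}
counts = confusion_counts(model_flag, reference)
```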