ESTRO 2023

Session Item

Automation

Session Type: Poster (Digital)

Track: Physics

Multicentric evaluation of a machine learning model to streamline the RT patient-specific QA process

Nicola Lambri, Italy

Presentation Number: PO-1617

Abstract

Authors:

Nicola Lambri¹,², Victor Hernandez³, Jordi Sáez⁴, Marco Pelizzoli¹, Sara Parabicoli¹, Andrea Bresolin¹, Damiano Dei¹,², Ciro Franzese¹,², Pasqualina Gallo¹, Francesco La Fauci¹, Francesca Lobefalo¹, Lucia Paganini¹, Giacomo Reggiori¹,², Stefano Tomatis¹, Daniele Loiacono⁵, Marta Scorsetti¹,², Pietro Mancosu¹

¹IRCCS Humanitas Research Hospital, Radiotherapy and Radiosurgery Department, Milan, Italy; ²Humanitas University, Department of Biomedical Sciences, Milan, Italy; ³Hospital Universitari Sant Joan de Reus, Department of Medical Physics, Tarragona, Spain; ⁴Hospital Clínic de Barcelona, Department of Radiation Oncology, Barcelona, Spain; ⁵Politecnico di Milano, Dipartimento di Elettronica, Informazione e Bioingegneria, Milan, Italy

Purpose or Objective

Patient-specific quality assurance (PSQA) is an important step in the verification of intensity-modulated plans, ensuring that treatment plans can be delivered as intended. The time and effort required to perform measurement-based PSQA constitute a substantial workload that can slow down the radiotherapy process and delay the start of clinical treatments. In this study, a tree-based ensemble machine learning (ML) model to predict the gamma passing rate (GPR) was developed, and its applicability in three independent Institutions was evaluated.

Material and Methods

A total of 5622 VMAT plans from multiple treatment sites were selected from the internal database of Institution 1. After a thorough data cleaning procedure, ~2% of candidate plans were discarded. XGBoost, a tree-based ensemble ML model, was trained on 5522 VMAT plans using 19 input features (10 plan complexity metrics and 9 plan parameters). The GPR analyses were performed automatically on acquired images using the 3%/1 mm criteria (global normalization with absolute dose, 10% threshold) and a 95% action limit. To examine the sensitivity of the model to the density of data points above 95% GPR, where more than 80% of the GPRs resided, the training set was randomly undersampled. The ratio of the minority class (i.e., GPR <95%) to the majority class (i.e., GPR ≥95%) was increased from 20% in the complete training set to 40%, 60%, 80%, and 100%. For each undersampling level, a new regression model was trained. Model performance was evaluated on an out-of-sample test set from Institution 1 and on two independent sets of measurements collected at Institution 2 and Institution 3. The mean absolute error (MAE), absolute error statistics, and the models' sensitivity and specificity were computed.
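The undersampling step described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `undersample_majority`, the fixed random seed, and the choice to keep every minority plan while randomly dropping majority plans are assumptions made for illustration. Only the 95% action limit and the target class ratios come from the abstract.

```python
import random

ACTION_LIMIT = 95.0  # GPR action limit (%), as stated in the abstract


def undersample_majority(gprs, ratio, seed=0):
    """Return sorted indices of a subset with minority/majority ~= ratio.

    Minority class: GPR < ACTION_LIMIT; majority class: GPR >= ACTION_LIMIT.
    Assumption: all minority plans are kept and majority plans are
    dropped at random until the requested ratio is reached.
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    minority = [i for i, g in enumerate(gprs) if g < ACTION_LIMIT]
    majority = [i for i, g in enumerate(gprs) if g >= ACTION_LIMIT]
    # Number of majority plans needed to reach the target ratio,
    # capped at the number actually available.
    n_keep = min(len(majority), round(len(minority) / ratio))
    rng = random.Random(seed)
    kept_majority = rng.sample(majority, n_keep)
    return sorted(minority + kept_majority)


# Toy data (not from the study): 20 failing and 100 passing plans,
# i.e. a 20% class ratio before undersampling.
toy_gprs = [90.0] * 20 + [98.0] * 100
for target in (0.4, 0.6, 0.8, 1.0):
    subset = undersample_majority(toy_gprs, target)
    n_min = sum(1 for i in subset if toy_gprs[i] < ACTION_LIMIT)
    print(target, n_min, len(subset) - n_min)
```

A separate regression model would then be trained on each returned subset, while the test sets stay untouched.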

Results

Figure 1 shows the distribution of the residuals (i.e., the difference between measurements and predictions) for each Institution for the model trained on all available training data (20% class balance). Small positive median values were observed (0.95%, 1.66%, and 3.42%). Thus, the model's predictions were, on average, close to the measured values and, in most cases, tended to slightly underestimate the experimental GPR, providing a conservative estimate. Table 1 reports the evaluation metrics of the regression models for each Institution. In general, an increase in class balance was associated with a degradation in MAE and specificity, whereas the models' sensitivity improved. The model trained on all available training data (20% class balance) achieved the lowest MAE (2.33%, 2.54%, and 3.91% for the three Institutions), with a specificity of 0.90, 0.90, and 0.68, and a sensitivity of 0.61, 0.25, and 0.55, respectively.
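As a hedged illustration of how the reported quantities relate to each other, the sketch below computes residuals (measurement minus prediction), their median, the MAE, and sensitivity and specificity from paired GPR values. Two points are assumptions rather than statements from the abstract: a "positive" case is taken to be a plan whose measured GPR falls below the 95% action limit, and a regression prediction is turned into a class by thresholding it at the same limit. The function name `evaluate` and the toy numbers are hypothetical.

```python
from statistics import mean, median

ACTION_LIMIT = 95.0  # GPR action limit (%), as stated in the abstract


def evaluate(measured, predicted, limit=ACTION_LIMIT):
    """Summarise regression and classification performance for GPR values.

    Assumption: "positive" means measured GPR < limit (a failing plan),
    and the predicted class is obtained by thresholding the prediction.
    """
    residuals = [m - p for m, p in zip(measured, predicted)]
    tp = fn = tn = fp = 0
    for m, p in zip(measured, predicted):
        if m < limit:          # plan actually fails
            if p < limit:
                tp += 1        # failure correctly flagged
            else:
                fn += 1        # failure missed
        else:                  # plan actually passes
            if p < limit:
                fp += 1        # false alarm
            else:
                tn += 1        # pass correctly predicted
    return {
        "median_residual": median(residuals),
        "mae": mean(abs(r) for r in residuals),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Toy values (not from the study).
measured = [96.0, 99.0, 93.0, 90.0, 97.0]
predicted = [95.0, 97.0, 94.0, 96.0, 94.0]
print(evaluate(measured, predicted))
```

With this sign convention, a positive median residual means the measured GPR tends to exceed the prediction, which matches the conservative behaviour described above.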

Conclusion

Our results indicate that ML models can be integrated into clinical practice to streamline the radiotherapy workflow, but they should be centre-specific or thoroughly verified within centres before clinical use.