Online

ESTRO 2020

Session Item

Poster highlights 21 PH: Predictive modelling
8300
Poster Highlights
Physics
14:47 - 14:55
Validating a hidden Markov model for lung anatomical change classification using EPID dosimetry
Cecile Wolfs, The Netherlands
PH-0651

Abstract

Authors: Louis Archambault (CHU de Québec/Université Laval, Département de Radio-oncologie/Physics Department, Québec QC, Canada), Richard Canters (GROW-School for Oncology and Developmental Biology - Maastricht University Medical Centre, Department of Radiation Oncology - Maastro, Maastricht, The Netherlands), Sebastiaan Nijsten (GROW-School for Oncology and Developmental Biology - Maastricht University Medical Centre, Department of Radiation Oncology - Maastro, Maastricht, The Netherlands), Nicolas Varfalvy (CHU de Québec, Département de Radio-oncologie, Québec QC, Canada), Frank Verhaegen (GROW-School for Oncology and Developmental Biology - Maastricht University Medical Centre, Department of Radiation Oncology - Maastro, Maastricht, The Netherlands), Cecile Wolfs (GROW-School for Oncology and Developmental Biology - Maastricht University Medical Centre, Department of Radiation Oncology - Maastro, Maastricht, The Netherlands)
Purpose or Objective

A hidden Markov model (HMM) for classifying gamma (γ) analysis results of in vivo electronic portal imaging device (EPID) measurements into different categories of anatomical change for lung cancer patients was externally validated. The relation between model classification and differences in dose-volume histogram (DVH) metrics was also analyzed.

Material and Methods

The HMM was developed at institute A and trained on features extracted from γ analysis maps of 2197 in vivo time-integrated (TI) EPID images from 490 fractions (22 patients, treated with 3D-CRT or IMRT), using 3%/3 mm criteria, a 10% low-dose threshold, and the EPID image of the first treatment fraction as reference. The model inputs were the average γ value, the standard deviation of the γ values, and the average of the top 1% of γ values, each averaged over all beams in a fraction. The HMM classified each fraction into one of three categories: no anatomical change (Cat1), some change (no clinical action needed, Cat2) and severe change (clinical action needed, Cat3). The external validation dataset consisted of 760 TI EPID images from 266 fractions (31 patients) treated at institute B with VMAT or hybrid plans (static beams and VMAT arcs). Features were extracted in the same way for both datasets. For patients in the validation set, a cone beam CT (CBCT) scan was acquired before each fraction. Contours were propagated from the planning CT to the CBCTs using Mirada (Mirada Medical Ltd., Oxford, UK), and the dose was recalculated on each CBCT. DVH metrics for targets and organs-at-risk (OARs) were extracted for each fraction and compared to the planned dose. Mann-Whitney U tests were performed to evaluate the statistical significance of differences in DVH metric deviations between each pair of HMM categories.
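The three model inputs described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, each beam's γ map is assumed to be a flat list of γ values from which pixels below the low-dose threshold have already been removed, the population standard deviation is an assumption (the abstract does not say which form was used), and the top 1% is rounded up to at least one pixel.

```python
import statistics


def beam_features(gamma_values):
    """Return (mean, std, mean of top 1%) for one beam's gamma values.

    Assumes gamma_values is non-empty and already excludes pixels
    below the low-dose threshold.
    """
    values = sorted(gamma_values)
    # Ceiling of 1% of the pixel count, in integer arithmetic (always >= 1).
    n_top = (len(values) + 99) // 100
    top = values[-n_top:]
    return (
        statistics.fmean(values),
        statistics.pstdev(values),  # population std: an assumption
        statistics.fmean(top),
    )


def fraction_features(beams):
    """Average each per-beam feature over all beams in a fraction."""
    per_beam = [beam_features(beam) for beam in beams]
    return tuple(statistics.fmean(column) for column in zip(*per_beam))


# Toy data: two beams of 100 gamma values each (not measured values).
toy_fraction = [[0.0] * 99 + [1.0], [0.5] * 100]
mean_gamma, std_gamma, top1_gamma = fraction_features(toy_fraction)
```

Each fraction then yields one three-element feature vector, and the sequence of these vectors over the treatment course is what a model such as the HMM would take as input.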

Results

The HMM achieved 78.9% accuracy when compared against a threshold classification based on the average γ value alone (a surrogate for clinical classification). The confusion matrix (Fig.1) shows that the HMM overestimates the number of fractions in Cat2 relative to both Cat1 and Cat3. Fig.2 shows that for lungs-GTV, heart and mediastinum, there is a trend towards larger deviations in DVH metrics with classification into higher categories by the HMM.
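As an illustration of how accuracy and per-category over- or underestimation can be read from a confusion matrix, the sketch below uses made-up counts, not the values of Fig.1. The orientation (rows are the threshold classification, columns are the HMM classification) and the function names are assumptions.

```python
def accuracy(cm):
    """Fraction of cases on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def prediction_bias(cm):
    """Predicted count minus reference count for each category.

    Positive values mean the model assigns more fractions to that
    category than the reference does (overestimation).
    """
    reference_totals = [sum(row) for row in cm]        # row sums
    predicted_totals = [sum(col) for col in zip(*cm)]  # column sums
    return [p - r for p, r in zip(predicted_totals, reference_totals)]


# Hypothetical counts for Cat1, Cat2, Cat3 (illustration only).
toy_cm = [
    [5, 3, 0],
    [1, 8, 1],
    [0, 2, 4],
]
toy_accuracy = accuracy(toy_cm)      # 17 of 24 fractions agree
toy_bias = prediction_bias(toy_cm)   # [-2, 3, -1]: Cat2 is overestimated
```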



Fig.1: Confusion matrix comparing HMM classification to threshold classification based on the average γ value.




Fig.2: Boxplots for the deviations in the DVH metrics, excluding outliers. x-axis: HMM classification, y-axis: change in DVH metric, *: p<0.05, **: p<0.01.
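The pairwise Mann-Whitney U tests named in Material and Methods, which give the significance markers in Fig.2, can be sketched as follows. This is a standard-library illustration under stated assumptions, not the authors' analysis: it uses the normal approximation with tie correction and no continuity correction, whereas the abstract does not say which variant was used. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be preferred. The sample values are invented.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U statistic for x, p-value). Tied values receive average
    ranks and the variance is tie-corrected. Both samples must be
    non-empty.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of 1-based ranks i+1 .. j
        t = j - i                   # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    n = n1 + n2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all values identical: no evidence of a difference
        return u_x, 1.0
    z = (u_x - n1 * n2 / 2) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2))


# Invented DVH-metric deviations for two HMM categories (illustration only).
cat1_deviation = [1.0, 2.0, 3.0]
cat3_deviation = [4.0, 5.0, 6.0]
u_stat, p_value = mann_whitney_u(cat1_deviation, cat3_deviation)
```

With three categories, the test would be run once per pair (Cat1 vs Cat2, Cat1 vs Cat3, Cat2 vs Cat3) for each DVH metric. For samples as small as this toy example an exact test is more appropriate than the normal approximation.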


Conclusion

In terms of accuracy, the HMM performs well on an external dataset, showing that it can be transferred between institutes. However, underestimating a category can leave relevant fractions unflagged, so that anatomical changes may be missed (false negatives), while overestimating a category leads to unnecessary flagging (false positives) and thus increases workload. Model fine-tuning may resolve this. HMM classification based on γ features can be related to increasing DVH differences for some OARs, but not for the target volumes.