Copenhagen, Denmark
Onsite/Online

ESTRO 2022

Session Item

Saturday
May 07
10:30 - 11:30
Poster Station 1
03: Functional imaging & modelling
Eliana Maria Vasquez Osorio, United Kingdom
1300
Poster Discussion
Physics
Deep learning and radiomics of PET/CT images for head and neck cancer treatment outcome prediction
Bao Ngoc Huynh, Norway
PD-0159

Abstract

Deep learning and radiomics of PET/CT images for head and neck cancer treatment outcome prediction
Authors:

Bao Ngoc Huynh1, Aurora Rosvoll Groendahl1, Severin Elvatun Rakh Langberg2, Oliver Tomic1, Eirik Malinen3,4, Einar Dale5, Cecilia Marie Futsaether1

1Norwegian University of Life Sciences, Faculty of Science and Technology, Ås, Norway; 2Cancer Registry of Norway, Department of Registry Informatics, Oslo, Norway; 3Oslo University Hospital, Department of Medical Physics, Oslo, Norway; 4University of Oslo, Department of Physics, Oslo, Norway; 5Oslo University Hospital, Department of Oncology, Oslo, Norway

Purpose or Objective

Deep learning models were used to elucidate the roles of clinical factors and radiomics features in predicting disease-free survival (DFS), loco-regional control (LRC) and overall survival (OS) in head and neck cancer (HNC) patients.

Material and Methods

139 HNC patients with an 18F-FDG PET/CT scan acquired before radiotherapy were included. The input data consisted of 11 clinical factors, 3 PET parameters (SUVpeak, MTV, TLG) and 468 IBSI-listed radiomics features extracted from the primary tumor volume in the PET and CT images. All numeric features were preprocessed using z-score normalization. Three groups of input data were used to tune separate fully connected deep learning architectures (Fig. 1a, Table 1 models M1-M3): input data 1 (D1), the 11 clinical factors; input data 2 (D2), the 3 PET parameters and 60 first-order statistical and shape features; input data 3 (D3), 408 textural features. D2 and D3 together were defined as the radiomics features. A dropout layer, which randomly deactivated 25% of the nodes, was added at the end of each architecture to prevent overfitting. The prediction targets DFS, LRC and OS were treated as binary responses: local or regional failure counted as an event for LRC, whereas DFS additionally counted metastatic disease or death as events.
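The preprocessing and network building blocks described above can be sketched in plain numpy. This is an illustrative toy, not the authors' implementation: the hidden-layer width (32) and tanh activation are hypothetical choices, while the z-score normalization, the 11 clinical input features of D1, the 25% dropout rate and the sigmoid output for a binary endpoint follow the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def z_score(X):
    """Column-wise z-score normalization, as applied to all numeric features."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def dense(x, W, b, activation=np.tanh):
    """One fully connected layer."""
    return activation(x @ W + b)

def dropout(x, rate=0.25, training=True):
    """Randomly deactivate `rate` of the nodes (inverted dropout)."""
    if not training:
        return x
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

# Toy stand-in for the clinical-factor input D1: 8 patients, 11 features.
X = z_score(rng.normal(size=(8, 11)))

# Hypothetical weights; hidden width 32 is not from the abstract.
W1, b1 = rng.normal(size=(11, 32)) * 0.1, np.zeros(32)
W2, b2 = rng.normal(size=(32, 1)) * 0.1, np.zeros(1)

h = dropout(dense(X, W1, b1), rate=0.25)                      # hidden layer + 25% dropout
p = 1.0 / (1.0 + np.exp(-dense(h, W2, b2, activation=lambda v: v)))  # sigmoid output
print(p.shape)  # (8, 1): one predicted event probability per patient
```

In a real framework (e.g. Keras or PyTorch) the dropout layer would switch off automatically at inference time; the `training` flag plays that role here.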

Deep learning architectures designed separately for D1, D2 and D3 (Fig. 1a) were then concatenated at the second-to-last layer, creating four additional models (Table 1 M4-M7) with multiple input paths (Fig. 1b), trained on D1 & D2, D1 & D3, D2 & D3, and D1 & D2 & D3. Ensembles (Table 1 M8-M12, Fig. 1c) based on the mean predicted class probability of models M1-M3 and M6 were also evaluated.
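The two combination strategies above, concatenating branch outputs at the second-to-last layer and averaging predicted class probabilities across models, reduce to two simple array operations. A minimal sketch with hypothetical branch outputs and probabilities (the numbers are made up for illustration):

```python
import numpy as np

def concat_branches(*branches):
    """Join the second-to-last layer outputs of separate input branches,
    as in the multiple-input models M4-M7."""
    return np.concatenate(branches, axis=1)

def ensemble_mean(*probs):
    """Ensemble prediction: mean of the predicted class probabilities,
    as in the ensemble models M8-M12."""
    return np.mean(np.stack(probs, axis=0), axis=0)

# Hypothetical second-to-last layer outputs of two branches, 2 patients each
h_d1 = np.array([[0.2, 0.5], [0.1, 0.9]])
h_d2 = np.array([[0.7, 0.3], [0.4, 0.6]])
joint = concat_branches(h_d1, h_d2)
print(joint.shape)  # (2, 4): feature vectors joined, then fed to a final layer

# Hypothetical event probabilities from two trained models
p_a = np.array([0.30, 0.80])
p_b = np.array([0.50, 0.60])
print(ensemble_mean(p_a, p_b))  # [0.4 0.7]
```

The key difference is that concatenation happens inside one jointly trained network, whereas ensembling averages the outputs of independently trained models.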

All models were trained using five-fold cross-validation, where the folds were stratified to preserve the proportion of stage I+II vs. III+IV patients (8th edition AJCC/UICC) in the full dataset. The Area Under the Receiver Operating Characteristic Curve (ROC-AUC) was used to evaluate model performance.
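The evaluation protocol can be sketched with scikit-learn. The data below are synthetic stand-ins, and a logistic regression replaces the deep learning models purely to keep the example self-contained; the stratification on stage group and the ROC-AUC metric follow the abstract.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

# Synthetic stand-in data: 139 patients, 11 features, one binary endpoint
X = rng.normal(size=(139, 11))
y = (X[:, 0] + 0.5 * rng.normal(size=139) > 0).astype(int)  # e.g. DFS event yes/no
stage = rng.integers(0, 2, size=139)  # stage I+II vs. III+IV group

# Five folds stratified on the stage group, as in the abstract
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
aucs = []
for train_idx, test_idx in cv.split(X, stage):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    p = model.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], p))

print(len(aucs))  # 5: one ROC-AUC per held-out fold
```

Stratifying the folds on the stage proportion (rather than on the endpoint itself) keeps the case-mix comparable across folds, which matters with only 139 patients.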

Results

For single-group inputs (Table 1 M1-M3), models based solely on clinical factors gave the highest ROC-AUC for predicting DFS and OS. For LRC prediction, however, models trained on textural features achieved the highest ROC-AUC scores. Concatenated models (Table 1 M4-M7) trained on multiple inputs performed similarly to those trained on a single input, suggesting no added gain from including more than one input group.

The highest performance was obtained using an ensemble of models (Table 1 M8-M12). The ensemble M11, combining the single-input model M1 trained on clinical factors only with the multiple-input model M6 trained on radiomics features, gave the highest ROC-AUC scores for all endpoints.


Conclusion

Textural features extracted from PET/CT images were the best predictors of LRC, while clinical factors were more important for predicting DFS and OS. An ensemble of models trained separately on clinical factors and radiomics features can achieve good overall DFS, LRC and OS predictions.