Independent external validation of predictive models for urinary dysfunction following external beam radiotherapy of the prostate: Issues in model development and reporting

Noorazrul Azmie Yahya, Martin A. Ebert, Max Bulsara, Angel Kennedy, David J. Joseph, James W. Denham

Research output: Contribution to journal › Article

4 Citations (Scopus)

Abstract

Background and purpose: Most predictive models are not sufficiently validated for prospective use. We performed independent external validation of published predictive models for urinary dysfunction following radiotherapy of the prostate.

Materials/methods: Multivariable models developed to predict atomised and generalised urinary symptoms, both acute and late, were considered for validation using a dataset representing 754 participants from the TROG 03.04-RADAR trial. Endpoints and features were harmonised to match the predictive models. The overall performance, calibration and discrimination were assessed.

Results: Fourteen models from four publications were validated. The discrimination of the predictive models in an independent external validation cohort, measured using the area under the receiver operating characteristic (ROC) curve, ranged from 0.473 to 0.695, generally lower than in internal validation. Four models had an area under the ROC curve >0.6. Shrinkage was required for all predictive models' coefficients, ranging from -0.309 (prediction probability was inverse to observed proportion) to 0.823. Predictive models which include baseline symptoms as a feature produced the highest discrimination. Two models produced a predicted probability of 0 and 1 for all patients.

Conclusions: Predictive models vary in performance and transferability, illustrating the need for improvements in model development and reporting. Several models showed reasonable potential, but efforts should be increased to improve performance. Baseline symptoms should always be considered as potential features for predictive models.

Original language: English
Journal: Radiotherapy and Oncology
DOI: 10.1016/j.radonc.2016.05.010
Publication status: Accepted/In press - 11 Mar 2016

Fingerprint

ROC Curve
Prostate
Radiotherapy
Calibration
Publications
Datasets

Keywords

  • Independent external validation
  • Normal tissue complications
  • Predictive model
  • Prostate radiotherapy
  • Urinary symptoms

ASJC Scopus subject areas

  • Oncology
  • Radiology Nuclear Medicine and imaging
  • Hematology

Cite this

Independent external validation of predictive models for urinary dysfunction following external beam radiotherapy of the prostate: Issues in model development and reporting. / Yahya, Noorazrul Azmie; Ebert, Martin A.; Bulsara, Max; Kennedy, Angel; Joseph, David J.; Denham, James W.

In: Radiotherapy and Oncology, 11.03.2016.

@article{dfd4ed96cf204f3f886a2cfd18d002be,
title = "Independent external validation of predictive models for urinary dysfunction following external beam radiotherapy of the prostate: Issues in model development and reporting",
abstract = "Background and purpose: Most predictive models are not sufficiently validated for prospective use. We performed independent external validation of published predictive models for urinary dysfunctions following radiotherapy of the prostate. Materials/methods: Multivariable models developed to predict atomised and generalised urinary symptoms, both acute and late, were considered for validation using a dataset representing 754 participants from the TROG 03.04-RADAR trial. Endpoints and features were harmonised to match the predictive models. The overall performance, calibration and discrimination were assessed. Results: 14 models from four publications were validated. The discrimination of the predictive models in an independent external validation cohort, measured using the area under the receiver operating characteristic (ROC) curve, ranged from 0.473 to 0.695, generally lower than in internal validation. 4 models had ROC >0.6. Shrinkage was required for all predictive models' coefficients ranging from -0.309 (prediction probability was inverse to observed proportion) to 0.823. Predictive models which include baseline symptoms as a feature produced the highest discrimination. Two models produced a predicted probability of 0 and 1 for all patients. Conclusions: Predictive models vary in performance and transferability illustrating the need for improvements in model development and reporting. Several models showed reasonable potential but efforts should be increased to improve performance. Baseline symptoms should always be considered as potential features for predictive models.",
keywords = "Independent external validation, Normal tissue complications, Predictive model, Prostate radiotherapy, Urinary symptoms",
author = "Yahya, {Noorazrul Azmie} and Ebert, {Martin A.} and Bulsara, Max and Kennedy, Angel and Joseph, {David J.} and Denham, {James W.}",
year = "2016",
month = "3",
day = "11",
doi = "10.1016/j.radonc.2016.05.010",
language = "English",
journal = "Radiotherapy and Oncology",
issn = "0167-8140",
publisher = "Elsevier Ireland Ltd",

}

TY - JOUR

T1 - Independent external validation of predictive models for urinary dysfunction following external beam radiotherapy of the prostate

T2 - Issues in model development and reporting

AU - Yahya, Noorazrul Azmie

AU - Ebert, Martin A.

AU - Bulsara, Max

AU - Kennedy, Angel

AU - Joseph, David J.

AU - Denham, James W.

PY - 2016/3/11

Y1 - 2016/3/11

N2 - Background and purpose: Most predictive models are not sufficiently validated for prospective use. We performed independent external validation of published predictive models for urinary dysfunctions following radiotherapy of the prostate. Materials/methods: Multivariable models developed to predict atomised and generalised urinary symptoms, both acute and late, were considered for validation using a dataset representing 754 participants from the TROG 03.04-RADAR trial. Endpoints and features were harmonised to match the predictive models. The overall performance, calibration and discrimination were assessed. Results: 14 models from four publications were validated. The discrimination of the predictive models in an independent external validation cohort, measured using the area under the receiver operating characteristic (ROC) curve, ranged from 0.473 to 0.695, generally lower than in internal validation. 4 models had ROC >0.6. Shrinkage was required for all predictive models' coefficients ranging from -0.309 (prediction probability was inverse to observed proportion) to 0.823. Predictive models which include baseline symptoms as a feature produced the highest discrimination. Two models produced a predicted probability of 0 and 1 for all patients. Conclusions: Predictive models vary in performance and transferability illustrating the need for improvements in model development and reporting. Several models showed reasonable potential but efforts should be increased to improve performance. Baseline symptoms should always be considered as potential features for predictive models.

AB - Background and purpose: Most predictive models are not sufficiently validated for prospective use. We performed independent external validation of published predictive models for urinary dysfunctions following radiotherapy of the prostate. Materials/methods: Multivariable models developed to predict atomised and generalised urinary symptoms, both acute and late, were considered for validation using a dataset representing 754 participants from the TROG 03.04-RADAR trial. Endpoints and features were harmonised to match the predictive models. The overall performance, calibration and discrimination were assessed. Results: 14 models from four publications were validated. The discrimination of the predictive models in an independent external validation cohort, measured using the area under the receiver operating characteristic (ROC) curve, ranged from 0.473 to 0.695, generally lower than in internal validation. 4 models had ROC >0.6. Shrinkage was required for all predictive models' coefficients ranging from -0.309 (prediction probability was inverse to observed proportion) to 0.823. Predictive models which include baseline symptoms as a feature produced the highest discrimination. Two models produced a predicted probability of 0 and 1 for all patients. Conclusions: Predictive models vary in performance and transferability illustrating the need for improvements in model development and reporting. Several models showed reasonable potential but efforts should be increased to improve performance. Baseline symptoms should always be considered as potential features for predictive models.

KW - Independent external validation

KW - Normal tissue complications

KW - Predictive model

KW - Prostate radiotherapy

KW - Urinary symptoms

UR - http://www.scopus.com/inward/record.url?scp=84979697041&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84979697041&partnerID=8YFLogxK

U2 - 10.1016/j.radonc.2016.05.010

DO - 10.1016/j.radonc.2016.05.010

M3 - Article

C2 - 27370204

AN - SCOPUS:84979697041

JO - Radiotherapy and Oncology

JF - Radiotherapy and Oncology

SN - 0167-8140

ER -