Competition scoring

Judging the quality of a plan that involves multiple criteria and clinical trade-offs is quite subjective. However, a challenge such as the Auto-RTP challenge requires a single quality score. We therefore propose to define a measure following the approach of Nelms et al. [1], whereby the criteria to be met are converted into scoring functions, and the plan quality measure (PQM) is the sum of all individual criteria scores.

The following is proposed as the PQM for this challenge, but is subject to change before the competition starts.

Target objectives

For the prescribed dose to the primary treatment volume, a linear scale to the given tolerance will be used, such that a score of 1 will be given for a median dose at the prescribed dose and a score of 0 will be given for being outside of, or at, the tolerance.
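As an illustration only, the median-dose rule can be sketched as a simple linear ramp. The function name and the assumption that the tolerance is a symmetric, absolute dose in the same units as the prescription are ours, not part of the challenge specification.

```python
def median_dose_score(median_dose, prescribed_dose, tolerance):
    """Return 1 at the prescribed dose, falling linearly to 0 at the tolerance.

    Assumption: `tolerance` is a positive, symmetric, absolute dose
    (same units as the prescription). The challenge may define it differently.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    deviation = abs(median_dose - prescribed_dose)
    # At or beyond the tolerance the score is 0.
    return max(0.0, 1.0 - deviation / tolerance)
```

With a hypothetical prescription of 60 Gy and a tolerance of 1 Gy, a median of 60.5 Gy would score 0.5 under this sketch.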


For minimum and maximum doses, a linear scale to the prescribed dose will be applied such that a score of 1 will be given for achieving the prescribed dose, and a score of 0 will be given for doing worse than, or the same as, the min/max objective. This effectively assumes that the dose should be as homogeneous as possible and as close as possible to the prescribed dose. Based on participant feedback, the minimum dose to the PTV will be scored against the D95% rather than the absolute minimum. Furthermore, the absolute voxel minimum for the CTV_Prostate, CTV_ProstateBed, CTV_SeminalVes and CTV_LN Pelvic will be evaluated against the prescribed PTV minimum where they are relevant.
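A minimal sketch of the minimum and maximum dose scores follows. The function names are hypothetical, and clipping the score to 1 once the dose passes the prescribed value is our assumption; the text only fixes the score at the prescribed dose and at the objective.

```python
def _clip01(x):
    """Clamp a value to the interval [0, 1]."""
    return min(1.0, max(0.0, x))

def min_dose_score(dose, min_objective, prescribed_dose):
    """Score a minimum-dose metric (for example D95%); the objective lies below the prescription."""
    if prescribed_dose <= min_objective:
        raise ValueError("min_objective must be below the prescribed dose")
    return _clip01((dose - min_objective) / (prescribed_dose - min_objective))

def max_dose_score(dose, max_objective, prescribed_dose):
    """Score a maximum-dose metric; the objective lies above the prescription."""
    if max_objective <= prescribed_dose:
        raise ValueError("max_objective must be above the prescribed dose")
    return _clip01((max_objective - dose) / (max_objective - prescribed_dose))
```

For example, with a hypothetical 60 Gy prescription and a 57 Gy minimum objective, a D95% of 58.5 Gy would score 0.5.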


For outer treatment volumes (e.g. PTVp_7100-PTVp_7400), a one-sided tolerance of 1% will be used such that a score of 1 is given if the objective is met, and a score of 0 is given if the median dose is more than 1% of the prescribed dose outside of the objective.


This will give a maximum of 9 points for prostate only cases, 12 points for prostate + nodes cases, and 7 points for prostate bed cases. Each case will be rescaled to give a score out of 50.
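The rescaling step could look like the sketch below. The case-type keys are hypothetical labels chosen for illustration; only the maximum point values (9, 12 and 7) come from the text.

```python
# Maximum raw target points per case type (values from the text above;
# the dictionary keys are hypothetical labels).
MAX_TARGET_POINTS = {
    "prostate_only": 9,
    "prostate_nodes": 12,
    "prostate_bed": 7,
}

def rescale_target_score(raw_points, case_type):
    """Rescale the raw target points for a case to a score out of 50."""
    max_points = MAX_TARGET_POINTS[case_type]
    return 50.0 * raw_points / max_points
```

Rescaling in this way lets cases with different numbers of target objectives contribute equally to the final ranking.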

OAR constraints

For each constraint, a linear scale will be used between the mandatory and optimal values, such that any plan better than or equal to the optimal value of a constraint will score 1, and any plan worse than or equal to the mandatory value will score 0.
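As a hedged sketch, a single OAR constraint might be scored as follows. The function name and argument order are ours, and the example values are invented for illustration, not taken from the challenge protocol.

```python
def oar_constraint_score(value, optimal, mandatory):
    """Linear score between the mandatory (0) and optimal (1) constraint values.

    Written so that it works whichever of the two values is numerically
    larger; results outside the range are clipped to 0 or 1.
    """
    if optimal == mandatory:
        raise ValueError("optimal and mandatory values must differ")
    fraction = (mandatory - value) / (mandatory - optimal)
    return min(1.0, max(0.0, fraction))
```

For a hypothetical constraint with an optimal value of 50% and a mandatory value of 60%, a plan achieving 55% would score 0.5.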


For the Rectum, V30 and V40 will be ignored; for the Bladder, the V70 will count double; and for the femoral heads, the score will be multiplied by five, so that each OAR is scored out of 5. The total score available for all OARs will therefore be 25. This total score will be doubled to give a score out of 50.

Final scoring

The sum of the Target objectives and OAR constraint scores will be used to give a PQM out of 100. The mean across all cases will be used for ranking purposes.
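Putting the two halves together, the per-case PQM and the ranking value could be computed as sketched here. The function names are hypothetical; the inputs are assumed to be the target score already rescaled to 50 and the OAR total out of 25 before doubling.

```python
from statistics import mean

def case_pqm(target_score_50, oar_score_25):
    """Combine the target score (out of 50) with the doubled OAR score (out of 25)."""
    return target_score_50 + 2.0 * oar_score_25

def ranking_score(case_pqms):
    """Mean PQM across all cases, as used for ranking."""
    return mean(case_pqms)
```

For instance, a case with a target score of 40 and an OAR total of 20 would receive a PQM of 80 under this sketch.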

Important technical details

The consensus contour of all expert contours will be used when calculating the PQM for ranking. No dose calculation will be performed by the scoring system; instead, the dose volume supplied by the participant will be used to calculate the DVH parameters. The organizers reserve the right to verify that the submitted RTPlan would generate the submitted dose volume. The DVH parameters will be calculated after resampling the dose volume to the CT image resolution using linear interpolation.

Additional Measures

Additional measures may be calculated and reported, but not used for ranking, to give better insight into the challenge. These may include the PQM against each individual expert, to assess the impact of different contours on the measures, and the Added Path Length against the consensus and each individual expert, to assess how similar the contours are to those of the experts, taking interobserver variation into account.

References

  1. Nelms BE, Robinson G, Markham J, et al. Variation in external beam treatment plan quality: an inter-institutional study of planners and planning systems. Practical Radiation Oncology. 2012 Oct 1;2(4):296-305.