Evaluation of robustly optimised intensity modulated proton therapy for nasopharyngeal carcinoma

Background and purpose: To evaluate the dosimetric changes occurring over the treatment course for nasopharyngeal carcinoma (NPC) patients treated with robustly optimised intensity modulated proton therapy (IMPT). Materials and methods: 25 NPC patients were treated to two dose levels (CTV1: 70 Gy, CTV2: 54.25 Gy) with robustly optimised IMPT plans. Robustness evaluation was performed over 28 error scenarios using voxel-wise minimum distributions to assess target coverage and voxel-wise maximum distributions to assess possible hotspots and critical organ doses. Daily CBCT was used for positioning and weekly repeat CTs (rCT) were taken, on which the plan dose was recalculated and robustly evaluated. Deformable image registration was used to warp and accumulate the nominal, voxel-wise minimum and maximum rCT dose distributions. Changes to target coverage, critical organ and normal tissue dose between the accumulated and planned doses were investigated. Results: Two patients required a plan adaptation due to reduced target coverage. The D98% in the accumulated voxel-wise minimum distribution was higher than planned for CTV1 in 24/25 patients and for CTV2 in 20/25 patients. Maximum doses to the critical organs remained acceptable in all patients. Other normal tissue doses showed some variation as a result of soft tissue deformations and weight change. Normal tissue complication probabilities for grade ≥ 2 dysphagia and grade ≥

Head and neck radiotherapy requires irradiation of target volumes to high dose levels in the proximity of many normal tissues. Nasopharyngeal carcinoma (NPC), in particular, frequently extends into areas adjacent to dose-limiting critical organs, so that concessions on target coverage may be needed in order to carry out the treatment safely. When nodal volumes are included, NPC treatments may extend from the clavicles to the skull base and beyond, making them complex to plan and deliver. Treatment-related side effects can include dysphagia, xerostomia, hearing loss, optic neuropathy, neurocognitive impairment or temporal lobe necrosis, and may significantly impair quality of life [1]. Advanced treatment techniques which can improve dose conformity, such as volumetric modulated arc therapy (VMAT), are therefore standard of care for NPC [2].
Intensity modulated proton therapy (IMPT) has previously been shown to further reduce normal tissue doses in head and neck (HN) radiotherapy compared to photons [3][4][5]. By utilising the finite range of protons, healthy tissue can be spared from the beam exit doses that it would otherwise receive with photon-based treatments. However, the exact location of the Bragg peak and the subsequent sharp distal dose fall-off is uncertain [6,7] and is sensitive to geometric variations in the patient, such as positioning errors, tissue deformations, and anatomical changes, which are common in HN patients [8][9][10][11]. In IMPT, deviations from the expected range can not only affect the dose at the periphery of the target but also introduce cold or hot spots within the target itself, making margin-based planning of limited value [12][13]. This is especially important in NPC, where the complexity of sparing critical organs in and around the target volumes is amplified by significant tissue heterogeneities and the potential for anatomical changes during treatment. To mitigate the impact of these uncertainties, robust optimisation techniques have been developed that account for positioning and range errors explicitly through the optimisation of the treatment plan over a range of simulated error scenarios [14,15]. Robustly optimised HN IMPT plans have been shown to provide both superior target coverage robustness and organ at risk (OAR) sparing compared to margin-based plans when evaluated for these considered uncertainties [16][17][18]. This, however, does not guarantee robustness to tissue deformations, non-rigid positioning issues or anatomical changes, all of which may occur over the treatment course. Methods of mitigating these additional sources of uncertainty are currently being investigated, such as the use of multiple-CT robust optimisation [19][20][21] or on-line plan adaptation [19,22,23]; however, these are not yet routine clinical practice.
Aside from small cohort in-silico dose comparison studies, the existing literature reporting on clinical experiences with IMPT in NPC has thus far used margin-based optimisation and has reported on preliminary outcomes and toxicities after treatment [24,25]. In this study, we report on our clinical experiences in planning and delivering robustly optimised IMPT in NPC and we perform a longitudinal evaluation of the plan robustness over the treatment course. We investigated this for a cohort of 25 NPC patients by accumulating delivered doses based on robustly evaluated weekly repeat CTs (rCT) and we report on target coverage and dose to normal tissues.

Patient selection
25 consecutive NPC patients treated between June 2018 and September 2020 were retrospectively included in this study (Table 1). 19/25 patients had node-positive disease; however, all patients received bilateral elective neck irradiation. They were immobilised using a 5-point thermoplastic mask and underwent a planning CT (pCT) scan using iterative metal artefact reduction techniques. The patients went through a model-based selection procedure and qualified for proton therapy based on a significant reduction in normal tissue complication probability (NTCP) compared to the VMAT plan, as outlined by the National Indication Protocols for PT (NIPP) in the Netherlands [26][27][28][29].

IMPT planning and delivery
Target volumes for each patient consisted of 2 dose levels: CTV1 received 70 Gy (RBE) and CTV2 received 54.25 Gy (RBE), delivered in 35 fractions using a simultaneous integrated boost technique. Spot scanning IMPT plans were made using the RayStation treatment planning system (TPS) (Versions 8B and 9A, RaySearch Laboratories, Sweden) with a constant RBE of 1.1 and using the Monte Carlo dose calculation algorithm exclusively. Plans consisted of 4-6 beam directions, with the majority of patients treated with the 6-beam arrangement shown in Fig. 1a-d. After our initial experiences in the first 10 patients, the use of 6 beams was chosen as a standard configuration, as it was seen that certain beam directions were more beneficial to OAR sparing and target robustness at different locations within the whole target region (Table S1). Efforts were also made to reduce the sensitivity of the dose distribution to uncertainties not (fully) addressed by robust optimisation, such as the variable filling of the sinuses, the reproducibility of the shoulder position, mobile soft tissues or skin folds in the neck, and metallic implants or dental fillings. For example, with the 6-beam arrangement the anterior oblique beams were often prevented from shooting through the maxillary sinus, a source of potential variation [18,30], while maintaining dose contributions from the remaining 4 beams. A 4-cm water-equivalent thickness range shifter was used for beams treating shallow targets, with a minimum air gap of 5 cm. During the ramp-up phase of delivering IMPT at our institute, all HN patients were initially treated using 5 mm margins while gaining clinical experience in this complex treatment site. Based on the first cohort of HN patients treated, a plan comparison study was initiated to investigate the impact of various position and range uncertainty settings and reported that positioning margins could be reduced to 2 mm [31]. The study's methodology (probabilistic error sampling and simulation of fractionated treatments) differed from the clinical robustness evaluation approach, and the study included only 10 cases; therefore, 3 mm was applied clinically. Range uncertainty was also investigated for HN patients using in-vivo proton range probing and determined to be within 3% [32]. As such, robust optimisation was used on the CTVs with a 3 mm position uncertainty and ±3.0% range uncertainty, with the exception of the first group of patients (patients 1-3), who were planned with 5 mm and ±3.0%. Robustness evaluation was performed over 28 scenarios, consisting of the combination of 14 position and 2 range errors, using the same error magnitudes as for optimisation. The resulting 28 scenario dose distributions were combined into voxel-wise minimum and voxel-wise maximum evaluation distributions [33]. In terms of target coverage, a plan was clinically acceptable if CTV D98% > 94% of the prescribed dose in the voxel-wise minimum distribution (CTV1: 65.8 Gy, CTV2: 51.0 Gy) and D2% < 110% in the voxel-wise maximum distribution (CTV1: 77.0 Gy). If these values were not initially met, the plan was re-optimised and re-evaluated in an iterative fashion.
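The robustness evaluation described above can be illustrated with a minimal sketch. It is not the clinical TPS implementation: the function names are hypothetical, each scenario dose is assumed to be a flat list of doses (Gy) for the CTV voxels only, all voxels are assumed to have equal volume, and the position errors are represented only by an index because the text does not list the 14 directions.

```python
import math
from itertools import product


def combine_scenarios(n_position_errors=14, range_errors=(+3.0, -3.0)):
    """Pair every position error (by index) with every range error (%)."""
    return list(product(range(n_position_errors), range_errors))


def voxelwise_min_max(scenario_doses):
    """Per-voxel minimum and maximum over all scenario dose lists (Gy)."""
    columns = list(zip(*scenario_doses))  # one tuple of scenario values per voxel
    return [min(c) for c in columns], [max(c) for c in columns]


def dose_at_volume(roi_doses, volume_percent):
    """D_x%: the highest dose received by at least x% of the ROI voxels.

    Assumes equal voxel volumes (an assumption of this sketch).
    """
    ranked = sorted(roi_doses, reverse=True)
    n_needed = max(1, math.ceil(volume_percent * len(ranked) / 100))
    return ranked[n_needed - 1]


def ctv_is_acceptable(scenario_doses, prescription_gy):
    """Apply the coverage criteria quoted in the text to one CTV."""
    vw_min, vw_max = voxelwise_min_max(scenario_doses)
    d98 = dose_at_volume(vw_min, 98.0)  # coverage from the voxel-wise minimum
    d2 = dose_at_volume(vw_max, 2.0)    # hotspots from the voxel-wise maximum
    ok = d98 > 0.94 * prescription_gy and d2 < 1.10 * prescription_gy
    return ok, d98, d2


# Toy data: 28 scenarios, 125 CTV voxels at 70 Gy, one cold and one hot voxel.
scenarios = combine_scenarios()
doses = [[70.0] * 125 for _ in scenarios]
doses[0][0] = 60.0
doses[1][1] = 80.0
ok, d98, d2 = ctv_is_acceptable(doses, prescription_gy=70.0)
```

Because the voxel-wise minimum takes the worst value of each voxel independently, it does not correspond to any single physical scenario; it is a conservative composite, which is why the D98% criterion applied to it is strict.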
For patients with target volumes adjacent to critical OARs, CTV coverage may have been compromised to reach the required constraints. This was done by assessing D0.03cc maximum dose values to the OAR in both the nominal and the voxel-wise maximum distribution, which is analogous to the use of planning risk volumes (PRV) in margin-based planning. All treatment plans were assessed by a multi-disciplinary team, including slice-by-slice inspection of the robustness evaluations. Examples of the voxel-wise minimum and maximum evaluations are shown in Fig. 1e and f. At each fraction, patients were first aligned with orthogonal X-ray imaging to correct for any non-rigid errors, such as positioning of the shoulders and the lower jaw or rotations of the head. Subsequently, daily cone beam CT (CBCT) was acquired and table corrections were applied in all six degrees of freedom using a robotic couch. Treatment room timeslots were 25 min, including position verification. Patients also received weekly rCTs in a near-room CT, which was standard clinical practice for all HN patients treated with PT in our institute at the time. These were used to assess the need for plan adaptations based on target coverage or critical organ doses; plans were not adapted due to changes in NTCP alone.

Dose accumulation and analysis
Each rCT was rigidly registered to the pCT using a region of interest (ROI) consistent with that used for CBCT registration. A deformable image registration (DIR) was made using the ANACONDA algorithm built into the TPS and used to propagate the relevant ROIs to the rCT [34]. CTV1, CTV2, chiasm, brainstem and both optic nerve ROIs were manually checked and corrected, if necessary, by a physician. Additionally, both parotid glands were manually delineated by a dedicated dosimetrist, as previous research has shown that DIR-based parotid delineation led to dose and NTCP reporting errors [35,36].
The treatment plan dose was recalculated on each rCT and a fraction-wise robustness evaluation was performed using a 1 mm residual position uncertainty and ±3.0% range uncertainty, whereby voxel-wise minimum and maximum evaluation distributions were created from the resulting 28 error scenarios using an in-house developed script operating within the TPS. This residual uncertainty includes the coincidence between the treatment and imaging isocentres, the accuracy of the robotic table correction and the potential difference in rigid registration performed on the rCT in the TPS compared to the CBCT at the treatment gantry. Dose warping was performed to map rCT doses to the pCT. Each fraction was assigned to the rCT nearest in time, and the warped doses were then accumulated to form the total delivered dose. This was performed for the nominal dose, the voxel-wise minimum and the voxel-wise maximum evaluation distributions, resulting in three accumulated distributions.
Target coverage was assessed using the D98% of the voxel-wise minimum dose and the D2% of the voxel-wise maximum dose. Critical organ doses were assessed by extracting the D0.03cc values of both the nominal and voxel-wise maximum doses. Mean doses to the oral cavity, pharyngeal constrictor muscles (PCM; split into superior, medial and inferior components) and submandibular glands were taken from the accumulated nominal dose distribution (using the DIR), whereas mean doses to the parotid glands were calculated as the weighted average of the manually delineated ROIs on each rCT. Organ doses were converted to NTCP for grade ≥2 patient-rated xerostomia and grade ≥2 physician-rated dysphagia according to the validated models in version 2.2 of the NIPP for head and neck cancer [29].
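The assignment of fractions to the nearest rCT and the subsequent summation can be sketched as follows. This is an illustration under stated assumptions, not the in-house script: the function names are hypothetical, doses are assumed to be per-fraction values (Gy) already warped onto the pCT grid and stored as flat voxel lists, days are counted from the start of treatment, and a fraction equidistant from two rCTs is assigned to the earlier one because the text does not specify a tie rule.

```python
def assign_to_nearest_rct(fraction_days, rct_days):
    """Return, for each fraction day, the index of the rCT closest in time.

    Ties go to the earlier rCT (an assumption of this sketch); rct_days is
    expected in chronological order.
    """
    return [
        min(range(len(rct_days)), key=lambda i: abs(rct_days[i] - day))
        for day in fraction_days
    ]


def accumulate(warped_fraction_doses, assignment):
    """Sum per-fraction doses (Gy) that were already warped to the pCT.

    warped_fraction_doses[i] is the voxel dose list for one fraction as
    recalculated on rCT i; assignment lists one rCT index per fraction.
    """
    total = [0.0] * len(warped_fraction_doses[0])
    for rct_index in assignment:
        for voxel, dose in enumerate(warped_fraction_doses[rct_index]):
            total[voxel] += dose
    return total


# Toy course: 7 fractions, 2 repeat CTs, 2 voxels.
fraction_days = [0, 1, 2, 3, 4, 7, 8]
rct_days = [1, 8]
assignment = assign_to_nearest_rct(fraction_days, rct_days)
delivered = accumulate([[2.0, 1.0], [1.9, 1.1]], assignment)
```

The same two functions would be applied three times, once each to the nominal, voxel-wise minimum and voxel-wise maximum distributions, to obtain the three accumulated distributions.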
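Two of the metrics above lend themselves to a short sketch. The code is illustrative only: the function names are hypothetical, D0.03cc is taken as the lowest dose within the hottest 0.03 cc of an organ with equal-volume voxels, and the parotid weights are assumed to be the number of fractions assigned to each rCT, which the text does not state explicitly. The NTCP conversion itself is not shown because the model coefficients are defined in the NIPP [29] and are not reproduced here.

```python
import math


def near_max_dose(roi_doses, voxel_volume_cc, volume_cc=0.03):
    """D0.03cc: the lowest dose within the hottest `volume_cc` of the ROI.

    Assumes all voxels have the same volume. The small epsilon guards the
    voxel count against floating-point round-off in the division.
    """
    ranked = sorted(roi_doses, reverse=True)
    n_voxels = max(1, math.ceil(volume_cc / voxel_volume_cc - 1e-9))
    n_voxels = min(n_voxels, len(ranked))  # ROI smaller than 0.03 cc
    return ranked[n_voxels - 1]


def weighted_mean_dose(mean_doses, fraction_counts):
    """Average per-rCT mean doses, weighted by fractions assigned to each rCT.

    The choice of weights is an assumption of this sketch.
    """
    total_fractions = sum(fraction_counts)
    weighted_sum = sum(d * n for d, n in zip(mean_doses, fraction_counts))
    return weighted_sum / total_fractions


# Toy values: five 0.01 cc voxels, and two rCTs covering 5 and 2 fractions.
d_near_max = near_max_dose([10.0, 50.0, 40.0, 30.0, 20.0], voxel_volume_cc=0.01)
parotid_mean = weighted_mean_dose([26.0, 28.0], [5, 2])
```

With a coarse dose grid, 0.03 cc spans only a handful of voxels, so this metric is sensitive to grid resolution and to statistical noise in the dose calculation.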

Results
Two patients required a plan adaptation during treatment due to reduced target coverage (Fig. 2a and b). Patient 3 had difficulty reproducing the positioning from the pCT, leading to a different curvature of the cervical vertebrae and subsequent misalignment of the surrounding anatomy. Patient 21 had mobile soft tissues in the neck, which, despite accurate positioning of the bony anatomy, led to changes to the proton range and a misalignment of the CTV1 nodal target. These changes were confirmed to be systematic using the daily CBCTs and plan adaptations were made on rCT2 and rCT4, respectively.
The robustly evaluated target coverage for all 25 patients is shown in Fig. 3, comparing the D98% values of the planned and accumulated voxel-wise minimum distributions. 24/25 patients had a higher D98% in CTV1 than originally planned, the exception being patient 18 with an insignificant reduction of 0.01 Gy. The accumulated D98% was on average 1.0 Gy higher for all patients (65.7 Gy vs 64.7 Gy, SD = 0.9, Range = −0.01 to 3.8). 20/25 patients had a higher D98% in CTV2 than originally planned, and the accumulated D98% was on average 0.4 Gy higher across all patients (51.8 Gy vs 51.4 Gy, SD = 0.7, Range = −1.5 to 1.3). Of the patients with lower CTV2 coverage, only two showed a reduction of more than 0.5 Gy. Patients 5 and 6 experienced a 4 kg (8%) and 5 kg (13%) weight gain during treatment, respectively, which in both cases increased the soft tissues in the beam paths and deformed the position of CTV2 in the neck (Fig. 2c). Nevertheless, the differences in weekly rCT target coverage were assessed by the physician during the treatment course and were not deemed clinically significant enough to warrant plan adaptations.
Differences between the planned and accumulated maximum doses for selected serial organs are shown in Fig. 4, both for the nominal dose and voxel-wise maximum distribution. In the nominal case, dose deviations to the optical structures and brainstem are expected, considering they are often located in steep dose gradients and the metric reported (D0.03cc) is a very small volume. Nevertheless, the differences were small and acceptable, remaining below 1 Gy in the vast majority of patients and never exceeding 2 Gy. In the voxel-wise maximum case, with the exception of the spinal cord, no OAR increased in maximum dose compared to the original plan.
Differences between the planned and accumulated mean doses for selected OARs are shown in Fig. 5, together with the NTCP values for grade ≥2 dysphagia and grade ≥2 xerostomia. 3/25 patients experienced an increase in grade ≥2 xerostomia NTCP greater than 2.0%, with the largest increase at 3.2%. Patient 19 lost 5 kg (6%) during treatment, resulting in a shrinking of the parotid gland volume by 40% and increasing the relative volume inside the high dose area (Fig. 2d). The NTCP decreased for 11/25 patients, with the largest decrease seen for patient 6 at −3.1%. Across the cohort, submandibular glands, being smaller in volume and in a dose gradient region adjacent to the nodal targets, showed larger dose variations.
In addition to the DIR-based accumulated dose distributions presented in this manuscript, patient-specific dose statistics for each individual rCT are presented in the supplementary materials.

Discussion
This study has demonstrated that robustly optimised IMPT resulted in excellent target coverage and acceptable OAR dose variations in our NPC cohort when assessed over longitudinal data. With the use of robustly evaluated weekly rCTs, comprehensive dose accumulation was performed for the nominal dose, the voxel-wise minimum and voxel-wise maximum distributions. To our knowledge, this is the first study to do so.
The IMPT plans showed sufficient target coverage, particularly to CTV1, despite the strict evaluation criteria of D98% > 94% of the prescribed dose in the accumulated voxel-wise minimum distribution. CTV2 coverage showed more variation and coverage loss was larger in patients with weight gain. Two patients, both with advanced nodal spread (N2), required a plan adaptation during treatment due to CTV1 coverage loss in the nodal neck region. Our clinical experience treating other HN indications with IMPT indicates that plan adaptations are required in around 30% of patients, substantially more than reported here for NPC despite similar planning approaches. This may be due to a lower severity of soft tissue changes and positioning errors occurring in the nasopharynx compared to regions in the neck, where the effects of weight change are more pronounced. As 24% of our NPC cohort had no nodal disease (N0) and 12% were planned with 5 mm 'margins', these plans were likely to be less sensitive to target coverage degradation. Further, a portion of plan adaptations seen in the clinic arise from inter-fraction motion of targets in the oral cavity and larynx, an issue not present in NPC. Our experience is, however, in contrast to the work by Jiří et al., who reported that nearly all 40 NPC patients in their cohort required one or more adaptations due to >5% change in dose to the target or critical organs, using 3-field PTV-based IMPT plans which were robustly optimised manually through an iterative process of robustness evaluation and re-optimisation [24]. Outside of NPC, robustly optimised IMPT has also been investigated in post-operative oropharyngeal cancer by Hague et al. [37], who reported that all 6 patients in their cohort met target coverage criteria after dose accumulation on weekly CBCT-based synthetic CTs (i.e., 0% adaptation). In their work investigating the role of multiple-CT optimisation to mitigate the impact of anatomical changes, Yang et al. report that approximately 40% of all clinical HN patients treated at their institute with IMPT required at least one plan adaptation [20]. Other in-silico, small cohort studies investigating the impact of anatomical changes in robustly optimised HN IMPT have reported inadequate target coverage in 25-60% of cases [21,23]. The differences in the reported rates of plan adaptation highlight that experiences with IMPT in HN are variable and likely to be influenced by institutional protocols, planning approaches, treatment site and patient cohort characteristics.
Maximum doses in critical organs showed only small deviations in the accumulated nominal dose, remaining clinically acceptable. The D0.03cc metric can be quite sensitive to planning factors, such as statistical uncertainty within the Monte Carlo dose calculation algorithm and the resolution of the dose grid, particularly for structures in a dose gradient at the periphery of the CTV. Therefore, small dose deviations are to be expected even assuming perfect patient positioning. The doses reported by the accumulated voxel-wise maximum distributions, however, are substantially reduced, even with a residual position error of 1 mm in the rCT robustness evaluation. Together with the excellent CTV1 robust coverage, this suggests that positioning of the nasopharynx region is being performed more accurately than the 3 mm robustness 'margin' currently used and that this area remains relatively insensitive to anatomical changes with the planning techniques employed here. Margin reduction can be especially beneficial in cases where the target coverage is compromised by critical organ dose constraints, but may also lead to more plan adaptations; more work is required to investigate the optimal balance. In order to implement margin reductions for NPC, an in-vivo range verification process, such as proton radiography [32], would be helpful.
Mean doses to other organs, such as the salivary glands, oral cavity and swallowing muscles, remained quite stable across the cohort, although with some variation seen in individual cases. As these structures are located in regions surrounded by soft tissues, deformations in the beam paths can lead to range errors and dose differences. Significant weight loss appears to have had the largest influence on grade ≥2 xerostomia NTCP, as this combines the effect of range errors with the reduction in volume of the parotid glands, consistent with reports from Hague et al. [37]. Treatment plans were only adapted in cases of unacceptable changes to target coverage and not for changes in NTCP. Despite this, only 3 patients had an increase of NTCP greater than 2.0%.
There are several limitations to this study. DIR-based dose warping and accumulation introduces uncertainty in dose reporting in the presence of anatomical changes and for reporting small volume dose metrics to structures in a steep dose gradient [38][39][40][41]. While soft-tissue changes are minimal intracranially, some uncertainty remains and we therefore also include dose statistics on individual rCTs in the supplementary materials. With the exception of the parotid glands, which were manually contoured on each rCT, the mean doses to the remaining NTCP structures were taken directly from the DIR-based accumulated dose distribution. Earlier research in our institute has investigated the dose errors from this approach and reported that the difference in NTCP between DIR-based and manual contouring is within the range caused by inter-observer variability [36]. Additionally, NTCP has been reported assuming a constant dose per fraction as well as a spatially uniform dose within the organ. In reality, this is unlikely to be the case and the biological effect may be underestimated, although the impact is negligible for variations under 10% [42]. Finally, the rCT procedure in our near-room CT has been performed without the position verification functionality available at the treatment gantry, where great care is taken to accurately align individual components of the patient before each fraction. As such, it is expected that results based on the actual patient geometry during treatment would be even better than those reported here. In a follow-up study, we aim to address this with the conversion of daily CBCT into synthetic CT [43], as well as to investigate whether more transient anatomical changes occurring on a daily basis may influence the results.
This study of 25 patients adds to the growing body of evidence that IMPT is a promising treatment option in nasopharyngeal carcinoma. Robustly optimised IMPT plans, in combination with comprehensive verification imaging and adaptive planning, mitigated the impact of position and range uncertainties and anatomical changes in our cohort. We look forward to reporting on toxicity and treatment outcomes once the appropriate patient follow-up intervals are reached.

Fig. 1. A representative treatment plan using 6 beam directions with CTV1 (cyan) and CTV2 (pink) highlighted, for a T3N2M0 NPC (patient 24). Images (a-d) show the nominal dose distribution in the sagittal, coronal and two axial planes. Image (e) shows the voxel-wise minimum distribution, with the 94% isodoses for CTV1 (65.8 Gy (RBE), orange) and CTV2 (51.0 Gy (RBE), green) which are assessed slice-by-slice for coverage. Image (f) is the voxel-wise maximum distribution with 63 Gy (RBE) and 70 Gy (RBE) isodoses, showing the sparing of the chiasm (yellow) and optic nerves (lime) from the high dose. (For interpretation of the references to colour in this figure, the reader is referred to the web version of this article).

Fig. 2. Patient 3 (a) showed a different flex in the vertebrae on treatment compared to pCT, which was systematic and could not be corrected. Patient 21 (b) exhibited soft tissue deformations at the location of a CTV1 nodal volume (red), despite good bony positioning of the surrounding area. Patient 6 (c) gained 5 kg (13%), leading to soft tissue increases and a deformed CTV2 (blue). Patient 19 (d) lost 5 kg (6%) and the total parotid gland volume (green) shrank by 40% and towards the high dose area (CTV1 in red). Solid lines = Delineation on pCT. Dotted lines = Delineation on rCT. Blue image = pCT. Orange image = rCT. (For interpretation of the references to colour in this figure, the reader is referred to the web version of this article).

Fig. 3. Robustly evaluated target coverage in the treatment plan (blue, diamond) and the accumulated dose (red, cross) for all 25 patients. Top: D98% for CTV1 in the voxel-wise minimum distributions. Middle: D98% for CTV2 in the voxel-wise minimum distributions. Bottom: D2% for CTV1 in the voxel-wise maximum distributions. Note: Several patients had planned values of voxel-wise minimum D98% below the clinical goal (65.8 Gy (RBE) for CTV1) due to maximum dose tolerances of surrounding critical organs, as described in Methods and Materials. (For interpretation of the references to colour in this figure, the reader is referred to the web version of this article).

Fig. 4. Differences in maximum dose (D0.03cc) between the accumulated dose and the treatment plan for selected critical organs. Top: nominal (blue) and Bottom: voxel-wise maximum (red) distributions. Box = interquartile range (25th-75th percentile), solid line = median, x = mean, whiskers = range, dots = outliers. Abbreviations: Cord = spinal cord, LON = left optic nerve, RON = right optic nerve, BS = brainstem. (For interpretation of the references to colour in this figure, the reader is referred to the web version of this article).

Fig. 5. Differences in mean dose and NTCP between the accumulated dose and the treatment plan for selected normal tissues and complications. Abbreviations: OC = oral cavity, PCM = pharyngeal constrictor muscle divided into superior, medial and inferior components, Par = parotid glands right and left, Subm = submandibular glands right and left, Dys = dysphagia, Xero = xerostomia.

Table 1
Patient, tumour and treatment characteristics for all patients in this cohort.