Kidney Int. 2026 Apr;109(4):738-749. doi: 10.1016/j.kint.2025.12.024. Epub 2026 Jan 20.
A secondary analysis of the TESTING trial predicted individual patient response to corticosteroid treatment in IgA nephropathy
Mark Canney 1, Sana Shan 2, Lee Er 3, Laurent Billot 2, Jialin Han 4, Muh Geot Wong 5, Helen Monaghan 2, Michelle Hladunewich 6, Lai Seong Hooi 7, Vivek Jha 8, Jicheng Lv 9, Vlado Perkovic 2, Hong Zhang 9, Daniel C Cattran 10, Sean J Barbour 11; TESTING trial steering committee
PMID: 41571096
Why was this study done?
The TESTING trial (Lv et al, JAMA 2022 | NephJC summary) is the largest RCT of corticosteroids in IgA nephropathy (IgAN), with 503 adults with proteinuria ≥1 g/d on optimized RAS blockade. Oral methylprednisolone roughly halved the risk of the composite kidney endpoint (HR 0.53, 95% CI 0.39–0.72), with an absolute risk reduction of 16.1% at 4 years. These results were quite impressive. However, the study also showed significant toxicity with corticosteroids, which led to a mid-trial protocol revision from full dose (0.6–0.8 mg/kg/d) to a reduced-dose regimen (0.4 mg/kg/d). The clinical conundrum is how to apply an average treatment effect to an IgAN population with heterogeneous risk and variable drug tolerability. The 47% relative risk reduction is an average across many individuals who likely differ substantially in their clinical risk and disease progression patterns.
The principal question of this post hoc analysis of TESTING is: can we predict, for a given patient, what their individual absolute risk reduction from methylprednisolone would actually be?
Now you may ask, why does this matter when we already have so many other treatment options for IgAN, with even more appearing rapidly? First, in many parts of the world steroids are all you have (apart from RASi, flozins, and possibly MRAs). Even if new therapies become available, they will be very expensive and less accessible, at least in the near future. Second, these methods are cool: they explore the heterogeneity of treatment effects to draw some tentative conclusions for shared decision-making. They also apply to other interventions in different settings, so the methodology is worth understanding.
How was the study done, and what did it show?
This was a secondary analysis of 483 of the 503 TESTING participants (20 excluded for missing MEST-C or covariate data). The analytic approach followed the PATH statement in four steps:
Cox proportional hazards model with backward elimination (P < 0.2) to select main-effect treatment effect modifiers
Selected variables entered with treatment exposure and all treatment × variable interaction terms (no selection on interactions)
Ridge regression applied throughout
For each patient, predicted 4-year absolute risk generated under both treatment scenarios; the difference = individual ARR
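The final step can be sketched in a few lines. This is a minimal illustration of how an individual ARR falls out of a Cox model with treatment × covariate interactions, not the authors' fitted model: the function names, the baseline survival, and every coefficient below are hypothetical. Ridge penalization changes how the coefficients are estimated, not the prediction formula shown here.

```python
import math


def predicted_risk(baseline_survival, linear_predictor):
    """Cox-model risk at a fixed horizon: 1 - S0(t) ** exp(lp)."""
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


def individual_arr(covariates, main_coefs, treat_coef, interaction_coefs,
                   baseline_survival):
    """Predicted risk without treatment minus predicted risk with treatment."""
    # Untreated scenario: main effects only.
    lp_control = sum(main_coefs[k] * covariates[k] for k in main_coefs)
    # Treated scenario: add the treatment main effect and every
    # treatment x covariate interaction term.
    lp_treated = lp_control + treat_coef + sum(
        interaction_coefs[k] * covariates[k] for k in interaction_coefs
    )
    risk_control = predicted_risk(baseline_survival, lp_control)
    risk_treated = predicted_risk(baseline_survival, lp_treated)
    return risk_control - risk_treated


# Hypothetical, centered covariates and coefficients (illustration only).
patient = {"egfr": 0.5, "proteinuria": -0.2}
arr = individual_arr(
    covariates=patient,
    main_coefs={"egfr": -0.4, "proteinuria": 0.3},
    treat_coef=math.log(0.53),
    interaction_coefs={"egfr": -0.1, "proteinuria": 0.2},
    baseline_survival=0.70,  # assumed 4-year event-free survival at lp = 0
)
```

Because the same patient is scored under both scenarios, the ARR depends on baseline risk as well as on the interaction terms: two patients with the same hazard ratio can have very different absolute benefit.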
The primary outcome was the same as TESTING: ≥40% eGFR decline, kidney failure, or death from kidney disease.
Study population
After backward elimination, the following variables were selected as main effects and forced into the model with interaction terms: eGFR, age, proteinuria, RAAS blockade dose, ethnicity, time from biopsy, systolic blood pressure, sex, BMI, T-score, and C-score (MEST-C).
Results
This figure shows the point estimates for each interaction term (treatment multiplied by each variable) on the hazard ratio scale. Variables to the right of the dashed line are associated with a better response to methylprednisolone; variables to the left with a worse response.
Better response to methylprednisolone:
Higher eGFR
Male sex
MEST-C score: T1/T2 (vs T0)
MEST-C score: C1/C2 (vs C0)
Worse response to methylprednisolone:
Higher proteinuria
Higher RASi dose
Chinese ethnicity
Minimal impact: Age, SBP, BMI, time from biopsy to enrollment.
Of note, time from biopsy to enrollment had minimal impact on modifying treatment response, likely in part because most patients enrolled relatively soon after biopsy (median 5 months), limiting variability in this predictor.
This is the distribution of individual-level predicted 4-year absolute risk reduction across the entire cohort. The colors represent tertiles - lightest blue is the lowest tertile, darkest blue is the highest. So, what are tertiles? The model generates a predicted absolute risk reduction (ARR) for every patient. Then all 483 patients are ranked from lowest to highest predicted benefit and divided into three equal groups (tertiles) of 161. The lowest tertile contains patients predicted to benefit least, some of whom are predicted to have no benefit or even harm. The highest tertile contains patients predicted to benefit most.
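The ranking-and-splitting step described above can be sketched as follows. The function name is hypothetical and ties are broken by input order, which is an assumption rather than anything stated in the paper.

```python
def assign_tertiles(predicted_arr):
    """Label each patient 1 (lowest), 2, or 3 (highest) by predicted ARR."""
    n = len(predicted_arr)
    # Indices of patients ordered from lowest to highest predicted benefit.
    order = sorted(range(n), key=lambda i: predicted_arr[i])
    labels = [0] * n
    for rank, idx in enumerate(order):
        # Integer arithmetic maps ranks 0..n-1 onto groups 1..3.
        labels[idx] = rank * 3 // n + 1
    return labels
```

With 483 patients this yields three groups of 161, matching the split described in the text.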
The key finding here is the broad spread of the predicted ARR ranging from about −10% to +40%, compared to the average of 16.1% (in the trial). This illustrates exactly why the average treatment effect is insufficient for individual decision-making. Patients at the left tail may experience no benefit or even harm from treatment, while those at the right tail have substantial benefit. Wouldn’t that be nice to know before you prescribe a given treatment?
This is the key clinical finding. The model was used to split patients into two groups: those with predicted ARR of 10% or less, and those with predicted ARR greater than 10%. Among patients predicted to have ARR greater than 10%, the observed ARR was 24% (95% CI 13% to 36%). That's a large, statistically significant benefit. Among patients predicted to have an ARR of 10% or less, the observed ARR was −5% (95% CI −25% to 15%). That's no benefit and the point estimate actually suggests possible harm.
Overall, the model successfully identifies a subgroup with no observed benefit and a subgroup with substantial benefit. The 10% threshold was chosen as a clinically meaningful treatment threshold, but this can be adjusted based on individual patient values and risk tolerance. For example, a risk-averse patient might want a higher threshold, while a risk-accepting patient might accept a lower one.
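The threshold logic can be sketched as below. This is a deliberately crude illustration: it compares event proportions between arms within each predicted-benefit group, whereas a trial analysis like this one would use time-to-event estimates at 4 years. The function name and the data layout are assumptions.

```python
def observed_arr_by_group(predicted_arr, treated, event, threshold=0.10):
    """Crude observed ARR (control risk minus treated risk) per group."""
    groups = {"above": [], "at_or_below": []}
    for p, t, e in zip(predicted_arr, treated, event):
        # Patients exactly at the threshold fall in the "10% or less" group.
        key = "above" if p > threshold else "at_or_below"
        groups[key].append((t, e))
    result = {}
    for key, rows in groups.items():
        control = [e for t, e in rows if not t]
        active = [e for t, e in rows if t]
        if not control or not active:
            result[key] = None  # cannot estimate without both arms
            continue
        result[key] = sum(control) / len(control) - sum(active) / len(active)
    return result
```

Changing `threshold` is how a clinician and patient would encode a different risk tolerance: a higher value treats fewer patients, a lower value treats more.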
Two key performance metrics were used:
Restricted mean survival time (RMST): If only patients with predicted ARR >10% were treated (targeted treatment policy), the RMST was 1,194 event-free days, compared with 1,028 days under random allocation. This gives 166 additional event-free days with targeted treatment.
C-statistic for benefit: the C-statistic for benefit was 0.63 (95% CI 0.56 to 0.70). As discussed earlier, this should not be compared to the 0.7–0.8 threshold for prognostic models. The C-for-benefit evaluates predictions of an inherently unobservable quantity (the individual treatment effect), which introduces fundamental noise. Values of 0.55–0.65 are considered meaningful. For context, the SYNTAX Score II, one of the best-validated benefit prediction models in all of cardiology, achieved a C-for-benefit of about 0.59. So, 0.63 is actually quite good for this type of metric.
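For readers who want to see what these two metrics compute, here is a minimal sketch under stated assumptions. RMST is taken as the area under a Kaplan-Meier curve up to a chosen horizon. The C-for-benefit version below uses binary outcomes and simple rank matching of treated to control patients on predicted benefit, which ignores censoring and is not necessarily how the authors implemented it. All function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            steps.append((t, survival))
        at_risk -= removed
    return steps


def rmst(times, events, horizon):
    """Area under the Kaplan-Meier step curve from 0 to horizon."""
    area = 0.0
    prev_time, prev_surv = 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= horizon:
            break
        area += prev_surv * (t - prev_time)
        prev_time, prev_surv = t, s
    return area + prev_surv * (horizon - prev_time)


def c_for_benefit(pred_treated, event_treated, pred_control, event_control):
    """Concordance between predicted and observed benefit in matched pairs."""
    # Rank matching: sort each arm by predicted benefit and pair by rank.
    treated = sorted(zip(pred_treated, event_treated))
    control = sorted(zip(pred_control, event_control))
    pairs = []
    for (pt, et), (pc, ec) in zip(treated, control):
        observed = ec - et  # +1 benefit, 0 no difference, -1 harm
        pairs.append(((pt + pc) / 2.0, observed))
    concordant = 0.0
    comparable = 0
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            (p_i, o_i), (p_j, o_j) = pairs[i], pairs[j]
            if o_i == o_j:
                continue  # only pairs with unequal observed benefit count
            comparable += 1
            if p_i == p_j:
                concordant += 0.5
            elif (p_i > p_j) == (o_i > o_j):
                concordant += 1.0
    return concordant / comparable if comparable else float("nan")
```

The sketch also shows why the C-for-benefit is noisy: observed benefit is only defined for a matched pair, can take just three values, and depends on who happened to be matched with whom.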
Lastly, calibration showed good agreement between predicted and observed ARR, with mild overestimation at the low end and underestimation at the high end. Sensitivity analyses across reduced and full-dose subgroups were consistent. eGFR slope analyses selected similar predictors. Notably, adverse events did not increase across predicted benefit tertiles, and patients predicted to benefit most did not experience more toxicity.
The PRED-IgAN tool
A web-based calculator is publicly available at www.gnpredict.com. Inputs include eGFR, age, proteinuria, RAAS blockade dose, ethnicity, time from biopsy, systolic BP, sex, BMI, T-score, and C-score. Output is a predicted 4-year ARR from methylprednisolone, which can be used with any clinician-chosen threshold in shared decision-making.
Here are two examples from PRED-IgAN: the first patient is unlikely to benefit (low ARR), and the second patient is likely to benefit (high ARR).
What are the implications?
This is the first individual treatment effect prediction tool for immunosuppression in IgAN, and it addresses a real clinical need. PRED-IgAN could help frame the risk-benefit discussion for patients, particularly relevant given that most patients likely fall in a zone of genuine equipoise for corticosteroids.
That said, important caveats deserve emphasis. The absence of external validation is the most significant limitation and, as the authors acknowledge, a second large corticosteroid RCT in IgAN is unlikely. The cohort is 74% Chinese, limiting generalizability to other populations. Overfitting risk persists despite ridge regression, given 483 patients and 174 events with many interaction terms. MEST-C scoring relied on local pathology reports without central review. Finally, the model predicts benefit, not individual-level harm, so adverse event analysis remains at the group level.
Contextually, KDIGO 2025 now positions targeted-release budesonide (Nefecon) as first-line immunosuppression in IgAN, with systemic corticosteroids reserved for settings where budesonide is unavailable (grade 2B).
Yet corticosteroids remain the only accessible immunosuppressive option across much of the world.
For those contexts, PRED-IgAN offers a practical tool to optimize who receives treatment and who might reasonably be spared. One should remember that claims of Nefecon’s superior efficacy or safety compared with prednisone have been made without any direct comparisons. Nefecon withdrawal can result in adrenal insufficiency (so there are clear, significant systemic effects), as described by Laxamana et al, Clin Kid J 2026. Even ordinary budesonide might be just as good as ‘targeted release’ budesonide for IgAN patients (Jhaveri and Reich, KI Reports 2026). Even in places where it is available, the cost of Nefecon is prohibitively high, and it is always good to have cheaper and effective options.
The question of whether a C-statistic for benefit of 0.63 is sufficient to change practice is fair. Perhaps the right framing, however, is whether it improves on ‘doing nothing’. In a disease where the average ARR is 16% but individual responses are heterogeneous and toxicity is real, even modest discrimination has value. The model is best used not as an absolute gatekeeper, but as a conversation starter for shared decision-making. The treatment threshold, which is the minimum predicted benefit that justifies accepting the risks of corticosteroids, really depends on individual patient values. This is what personalized evidence-based medicine looks like in practice. The tool doesn't replace clinical judgment; it informs it with individualized data.
Summary
ARR = absolute risk reduction; IgAN = IgA nephropathy; PATH = Predictive Approaches to Treatment effect Heterogeneity; RAAS = renin–angiotensin–aldosterone system; RMST = restricted mean survival time; RCT = randomized controlled trial.
Summary by Sumaiya Ahmed
Nephrology Fellow, University of Ottawa
Reviewed by Swapnil Hiremath, Brian Rifkin

