Skip to content

Conversation

@hfrick
Copy link
Member

@hfrick hfrick commented Dec 15, 2025

Follow-up to #564, now adding a new metric of the new linear_pred_surival_metric type

This PR add the metric from Royston and Sauerbrei (2004) and extends it to work with case weights. Two comments on statistics aspects below, please weigh in on whether or not this seems sensible to you.

For the basic implementation without case weights, I compare it to the one from the survival package. In survival::royston(), the predictions get rescaled:
https://github.com/therneau/survival/blob/20dd980acc7b051b1b4a6ef5988b8f130c81474b/R/royston.R#L24-L29 I'm not sure why though, I didn't see that part in the Royston and Sauerbrei paper. Given that we don't even have the model formula at this point, I've gone ahead with the predictions as they are given to the yardstick function.

For case weights: the metric function from survival does not take case weights, so this here is my own take on it. Generally, we pass case weights to the metric calculation from the tuning process if they are frequency weights (important weights are for model fitting only). So I've treated case weights here as frequency weights.
The metric is a two-parter: calculate Blom's approximation to rankits (based on ranks), then fit a Cox model on them and use the coefficient to calculate the metric. Both of those elements should be affected by frequency weights imo, so that's how I've implemented it. While fitting the Cox model could take importance weights, I don't think they'd make much sense for the rankits, so that assumption of frequency weights is built into normal_score_blom().

Comment on lines +35 to +38
lung_surv <- lung_surv |>
bind_cols(
predict(model_fit, lung_data, type = "linear_pred")
)
Copy link
Member Author

@hfrick hfrick Dec 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding this below instead of with the other two predictions as to not break some tests that seem to rely on the order


normal_score_blom <- function(x, case_weights) {
tibble::tibble(
.row = rep(seq_along(x), times = case_weights),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the nod to "weights are frequency weights"

if (is.null(case_weights)) {
case_weights <- rep(1, length(estimate))
} else {
case_weights <- vec_cast(case_weights, to = double())
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't stop importance weights from going through this for the other metrics, so I also allowed it here

#' estimate = .pred_linear_pred
#' )
#' @export
royston_survival <- function(data, ...) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this was the most straightforward name but if you think it should carry a different name, let me know!

@hfrick
Copy link
Member Author

hfrick commented Dec 15, 2025

Emil, using this metric to test metric sets which mix static and linear pred survival metrics is the next PR, as we chatted about.

Copy link
Member

@EmilHvitfeldt EmilHvitfeldt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is all implemented very nicely! following the existing frameworks.

#' D measures prognostic separation of survival curves, and is closely related
#' to the standard deviation of the prognostic index.
#'
#' Larger values of the score are associated with better model performance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have a range of possible values?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the definition, I'd say it's asymptotically between 0 and 1, but I didn't come across a range in the paper. Because of this and because I didn't see it for the other survival metrics, I hadn't added a range.

#' Royston and Sauerbrei proposed $R^2_D$ as a measure of explained variation
#' on the log relative hazard scale based on the authors’ D statistic.
#' D measures prognostic separation of survival curves, and is closely related
#' to the standard deviation of the prognostic index.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how hard would it be to add the formula in the docs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the metric definition involves fitting a Cox model, there isn't a single equation/formula that captures it fully. If we describe the whole process, we should probably talk about how the implementation doesn't involve scaling with $\kappa$ since we are only interested in $R^2_D$ but not $D$. I've added code comments for future us or particularly invested users.


fit <- survival::coxph(truth ~ bns, weights = case_weights)
est <- unname(stats::coef(fit))
est^2 / (est^2 + pi^2 / 6)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the pi^2 / 6 feels a little magic number to me, could you show where we got that from?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally went with the formula from Royston & Altman (2013) that Antoine linked. In Royston & Sauerbrei (2004), the definition of $R^2_D$, including the scaling factor $\sigma$, is in Appendix A2. ✨

hfrick and others added 3 commits December 17, 2025 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants