Support censoring (in latent projection) #528

fweber144 · 2025-07-26T20:09:26Z

This adds support for censored observations (i.e., survival or time-to-event analysis) when using the latent projection. Originally, #2 already mentioned that support for time-to-event models would be a nice feature, and the request came up again recently in https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275. This PR only makes it possible to use the latent projection for time-to-event models. For a more general perspective, see #2.

The solution presented here is not optimal because it would probably be more desirable to allow for an extension of the formula in init_refmodel() (similarly to brms's resp_cens() term), see https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275/10. Via such an extension of the init_refmodel() formula, we could also handle observation weights in a better way.

In contrast to the less elegant solution suggested in https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275/10, the solution presented here requires users to add a right-hand side formula as an attribute called cens_var to the latent_ll_oscale function. The variable named in that formula is then retrieved (internally, whenever calling the latent_ll_oscale function) from the original dataset (possibly subsetted to the observations from a given test set), newdata, or element data from varsel()'s argument d_test, whichever is applicable in the specific situation where the latent_ll_oscale function is called. The content of the retrieved variable is then passed to argument cens of the latent_ll_oscale function. In other words, the variable mentioned in the cens_var formula needs to contain the censoring indicators (e.g., 0 = uncensored, 1 = censored) which can then be used (by the user) within the latent_ll_oscale function. On the Stan Forums, I will update the Weibull (and perhaps also the log-normal) example from https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275 to illustrate this.

The solution presented here is still less elegant than the brms::resp_cens()-like solution, but in my opinion, it is preferable over the less elegant solution I suggested in https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275/10. Via the cens_var attribute, we also catch the censored-observations case (when using the latent projection) in validate_vsel_object_stats() to throw a warning if a performance statistic other than "elpd", "mlpd", or "gmpd" is used.

This is only a draft PR. Still to do:

Update NEWS.md.
Consider updating the vignettes. E.g., mention the support for censored observations in case of the latent projection in the main vignette (at https://mc-stan.org/projpred/articles/projpred.html#modtypes) and add an example to the latent-projection vignette. For the latter, the final example(s) from https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275 could be used.
Consider updating the package description in the DESCRIPTION file to mention survival models.
Add unit tests (for the latent projection with censored observations).

@avehtari: A few weeks ago, we discussed how we should make this feature available and we decided to keep this feature in a separate branch. Thinking over this again, I would say we could merge this into master as soon as the to-do's from above are completed. The reason is that I consider the implementation presented here not as "hacky" as the solution we discussed back then (which was the less elegant solution I suggested in https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275/10). But I'll leave the decision up to you and @aloctavodia what you want to do with this PR. Unfortunately, I was not able to request a review from @aloctavodia for this PR.

fweber144 · 2025-07-26T20:37:14Z

On the Stan Forums, I will update the Weibull (and perhaps also the log-normal) example from https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275 to illustrate this.

Added now at https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275/23 (for the Weibull case).

(add argument `cens` to internal `latent_ll_oscale` functions)

fweber144 · 2025-07-27T19:40:19Z

I forgot to add the new argument cens to internal latent_ll_oscale functions, but this is now done in commit 6b76269. The tests are running through now (i.e., they don't result in errors).

avehtari · 2025-08-12T12:28:15Z

Thinking over this again, I would say we could merge this into master as soon as the to-do's from above are completed. The reason is that I consider the implementation presented here not as "hacky" as the solution we discussed back then

OK for me to merge when it's ready

fweber144 · 2025-09-08T19:23:09Z

@avehtari do you know someone who could finish this PR (the four to-do's listed above)?

fweber144 · 2025-09-09T20:45:49Z

A log-normal example may now be found at https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275/28.

aloctavodia · 2025-09-10T15:27:12Z

I could give it a try. Do you prefer to merge this and do the rest in a separated PR? Or add everything here?

fweber144 · 2025-09-10T19:50:46Z

I think I would prefer to add everything here so this PR is self-contained.

fweber144 · 2025-10-25T09:31:08Z

@aloctavodia I saw you opened PR #537 and there updated the package version to a release version (i.e., a non-development version). Does that mean you are planning a CRAN release soon? If yes, do you think it would be possible to include this PR in that CRAN release?

aloctavodia · 2025-10-25T18:29:52Z

As the main goal of the PR (and release) is to unlock the publication of the package mclogit that our tests are currently blocking, I would prefer to do the release ASAP. I am sorry that I delayed working on this. I will end this PR next week, and then I also want to work on making the test more robust, as I have been having issues running them locally.

fweber144 · 2025-10-25T19:58:17Z

Ok, I understand. The tests are running through on my machine without problems, but we've realized at several occasions in the past that the tests are quite instable as they depend on rstanarm and brms models being fitted at their beginning. That's why I deactivated most of the tests on CRAN (in #537, it looks like you modified tests that are indeed run on CRAN, so I understand that a new version of mclogit might face problems when running our tests). Unfortunately, I haven't had the time to revise our tests to make them more robust during the time when I was working on projpred. Thank you for working on this now.

…(7.3.3)"

<https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275/23>

<https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275/28>

censored-observations examples First, we change the censoring times to avoid problems in the log-normal example (there, it seems we had too many censored observations). Second, we change the true `sdlog` parameter to avoid problems in the log-normal example (there, it seems we had a too right-skewed distribution). Third, we remove the `set.seed()` call because we already have one at the beginning of the vignette.

…dd `silent` and `refresh`)

… `R CMD check`: ``` process chunk of elements [...] is [...] MiB. This exceeds the maximum allowed size 500.00 MiB per by R option "future.globals.maxSize". This limit is set to protect against transfering too large objects to parallel workers by mistake, which may not be intended and could be costly. See help("future.globals.maxSize", package = "future") for further explainations and how to adjust or remove this threshold The three largest globals are 'one_obs' ([...] MiB of class 'function'), 'refmodel' ([...] MiB of class 'list') and 'lw' ([...] MiB of class 'numeric') ``` Hence, switch back to the `doParallel` backend.

(this is only a very basic test for `latent_ll_oscale` because we have more comprehensive examples with simulated data in the latent-projection vignette and they are run when `R CMD check` is run)

…unctions (to check that arguments added after the initial latent-projection release (v2.4.0) have indeed been added (or taken into account via `...`) by users)

…his is done here

the updated latent-projection vignette)

fweber144 · 2025-11-01T05:08:56Z

I just pushed commits completing this PR. From my side, it is ready to be merged, but check it if you like.

I refrained from updating the package description in the DESCRIPTION file because I prefer the general wording that we currently have and because Catalina, Bürkner, and Vehtari (2021, https://doi.org/10.48550/arXiv.2109.04702) already deal with time-to-event models, so the existing reference to that preprint should already be a sufficient reference to time-to-event models.

Note that roxygen2 version 7.3.3 seems to have a bug that replaces cross-links to summary() by cross-links to cmdstanr::summary(). I unstaged these changes after each re-documentation, but this is a bit cumbersome. If you prefer, you can always switch back to roxygen2 v7.3.2 (personally, I have installed v7.3.3 and didn't want to downgrade for this PR).

R CMD check runs through on my machine (with R 4.5.1).

fweber144 · 2025-11-06T14:30:45Z

Thanks for approving, @aloctavodia.

@avehtari, do you want to review this PR, too?

avehtari

I checked the documentation and left a couple questions

avehtari · 2025-11-07T17:35:51Z

NEWS.md

-
 ## Major changes

+* Added support for censored observations when using the latent projection (with response-scale analyses). This makes it possible, e.g., to use the latent projection for time-to-event models (also known as models for survival analysis). Due to this new feature, the function passed to argument `latent_ll_oscale` of `extend_family()` now needs to have an argument `cens`. See `?extend_family` (section "Latent projection") as well as the latent-projection vignette (section ["Censored observations (survival analysis)"](https://mc-stan.org/projpred/articles/latent.html#cens)) for more information and examples. Note that only the performance statistics `"elpd"`, `"mlpd"`, and `"gmpd"` take censoring into account (on response scale). (GitHub: #528)


Due to this new feature, the function passed to argument latent_ll_oscale of extend_family() now needs to have an argument cens.

Does this possibly break existing code?

Have you run reverse-dependency checks?

Now I did. They succeeded. (brms and BayesERtools both do not specify a latent_ll_oscale function, so they are not affected by this PR.)

avehtari · 2025-11-07T17:37:44Z

R/extend_family.R

+#' * `cens` accepts a vector containing censoring indicators for the
+#' observations for which to calculate the response-scale log-likelihood values
+#' (i.e., for the observations from the second dimension of `ilpreds`). This is
+#' only relevant if attribute `cens_var` of `latent_ll_oscale` is not `NULL`


Should cens have some default value? So that there is no need to provide it if it's not relevant? Or am I misreading the code and documentation?

It does not need to have a default value because all calls to latent_ll_oscale specify cens. Users could assign a default value if they want, but it would not be used.

Can you clarify the documentation?

You mean to add something like "Argument cens does not need to have a default value."?

I'm confused because first there is

#' The function supplied to argument latent_ll_oscale needs to have the
#' prototype
#' ```{r, eval = FALSE}
#' latent_ll_oscale(ilpreds, dis, y_oscale, wobs = rep(1, ncol(ilpreds)),
#' cens, cl_ref, wdraws_ref = rep(1, length(cl_ref)))

So I have to have it. But then there is

#' * cens accepts a vector containing censoring indicators for the
#' observations for which to calculate the response-scale log-likelihood values
This is
#' only relevant if attribute cens_var of latent_ll_oscale is not NULL

So I don't need to have it? But if I use latent_ll_oscale without and with cens how does it work without default value? Why not have default as NULL?

As I don't understand the current documentation and the logic of the code, I don't know what to add there, but it does need some clarification.

Ah I see, the part

This is only relevant if attribute cens_var of latent_ll_oscale is not NULL [...]

might be confusing. Argument cens is always needed (otherwise, R would throw an error because we would be calling a function with an argument that the function does not have), but it does not need to have a default value (because we always specify cens when calling the latent_ll_oscale function). So I would propose to simply omit the sentence "This is only relevant if attribute cens_var of latent_ll_oscale is not NULL [...]". Is that ok for you?

allow for censoring when using the latent projection

c614b59

fweber144 requested a review from avehtari July 26, 2025 20:09

fweber144 mentioned this pull request Jul 26, 2025

Implement survival models #2

Open

fixup! allow for censoring when using the latent projection

6b76269

(add argument `cens` to internal `latent_ll_oscale` functions)

Merge remote-tracking branch 'upstream/master' into cens_latent

47139d5

Merge remote-tracking branch 'upstream/master' into cens_latent

a8ce636

fweber144 added 14 commits October 29, 2025 22:14

Merge remote-tracking branch 'upstream/master' into cens_latent

a3040fe

fix typo (in internal docs)

c6a276a

re-document with most recent roxygen2 version (7.3.3)

7ad6373

manual fix for commit "re-document with most recent roxygen2 version …

1bdd768

…(7.3.3)"

add NEWS.md entry

f8f015b

docs: clarify use of censoring by different performance statistics

3face64

update main vignette

6971643

minor wording adaptation in the main vignette

23f4619

latent-projection vignette: add a Weibull example adapted from

d18fdbe

<https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275/23>

latent-projection vignette: add a log-normal example adapted from

01bcdb7

<https://discourse.mc-stan.org/t/using-projpred-latent-projection-with-brms-weibull-family-models/39275/28>

latent-projection vignette: fix progressr use

3321973

latent-projection vignette: fix chunk labels

b820d1d

latent-projection vignette: slightly adapt the brms::brm() calls (a…

6ae1062

…dd `silent` and `refresh`)

fweber144 added 8 commits October 31, 2025 21:58

latent-projection vignette: add introductory sentence

defb5ea

latent-projection vignette: add disclaimer / link

7a1848e

latent-projection vignette: add disclaimers for the ignored warnings

53ef198

add unit test for censoring in case of the latent projection

64a6482

(this is only a very basic test for `latent_ll_oscale` because we have more comprehensive examples with simulated data in the latent-projection vignette and they are run when `R CMD check` is run)

add input checks for the latent_ll_oscale and latent_ppd_oscale f…

1451259

…unctions (to check that arguments added after the initial latent-projection release (v2.4.0) have indeed been added (or taken into account via `...`) by users)

PR stan-dev#535 did not add folder pkgdown to .Rbuildignore, so t…

fe0ee4a

…his is done here

add ggfortify package to "Suggests" list (needed for the PPPCs in

b67f205

the updated latent-projection vignette)

fweber144 marked this pull request as ready for review November 1, 2025 05:09

aloctavodia approved these changes Nov 4, 2025

View reviewed changes

avehtari reviewed Nov 7, 2025

View reviewed changes

Merge remote-tracking branch 'upstream/master' into cens_latent

2251eb1


		## Major changes

		* Added support for censored observations when using the latent projection (with response-scale analyses). This makes it possible, e.g., to use the latent projection for time-to-event models (also known as models for survival analysis). Due to this new feature, the function passed to argument `latent_ll_oscale` of `extend_family()` now needs to have an argument `cens`. See `?extend_family` (section "Latent projection") as well as the latent-projection vignette (section ["Censored observations (survival analysis)"](https://mc-stan.org/projpred/articles/latent.html#cens)) for more information and examples. Note that only the performance statistics `"elpd"`, `"mlpd"`, and `"gmpd"` take censoring into account (on response scale). (GitHub: #528)

Uh oh!

Support censoring (in latent projection) #528

Are you sure you want to change the base?

Support censoring (in latent projection) #528

Uh oh!

Conversation

fweber144 commented Jul 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

fweber144 commented Jul 26, 2025

Uh oh!

fweber144 commented Jul 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

avehtari commented Aug 12, 2025

Uh oh!

fweber144 commented Sep 8, 2025

Uh oh!

fweber144 commented Sep 9, 2025

Uh oh!

aloctavodia commented Sep 10, 2025

Uh oh!

fweber144 commented Sep 10, 2025

Uh oh!

fweber144 commented Oct 25, 2025

Uh oh!

aloctavodia commented Oct 25, 2025

Uh oh!

fweber144 commented Oct 25, 2025

Uh oh!

fweber144 commented Nov 1, 2025

Uh oh!

fweber144 commented Nov 6, 2025

Uh oh!

avehtari left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

fweber144 commented Jul 26, 2025 •

edited

Loading

fweber144 commented Jul 27, 2025 •

edited

Loading