@sgreenbury
Collaborator

@sgreenbury sgreenbury commented Nov 4, 2025

Closes #848.
Contributes towards #589.
(merge after #917)

This PR:

  • Updates the base emulator API to enable the specification of validation data (to be used here as calibration data) (Let user pass train/val/test data split #905)
  • Adds a base class wrapper for emulators to provide conformal prediction (analogous to Ensemble)
  • Adds a ConformalMLP class (analogous to EnsembleMLP). This supports both constant width and quantile regression derived UQ.
  • The quantile regression version uses an MLP with a quantile training loss to fit the upper and lower quantiles and then uses a constant per-target correction based on performance on the calibration training data
  • Adds ConformalMLP to registry and re-exports (not adding as default emulator)
  • Updates the API to support passing n_samples, so that it can be controlled at lower levels of the API from the high-level AutoEmulate.
  • Adds output_to_tensors method on ConversionMixin to avoid repeated mapping of outputs to tensors
  • Fixes indentation of docstrings in MLP subclasses
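
As a rough illustration of the two calibration modes listed above, here is a minimal single-target sketch in plain Python. It is not the code in this PR: the function names are hypothetical, the data are made up, and the quantile predictions are faked by shifting the point predictions rather than coming from an MLP trained with a quantile loss.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return math.inf
    return sorted(scores)[rank - 1]


def split_conformal_width(y_cal, pred_cal, alpha):
    """Constant half-width from absolute residuals on the calibration set."""
    scores = [abs(y - p) for y, p in zip(y_cal, pred_cal)]
    return conformal_quantile(scores, alpha)


def cqr_correction(y_cal, lower_cal, upper_cal, alpha):
    """Constant correction applied to both ends of the predicted quantile band."""
    scores = [max(lo - y, y - hi) for y, lo, hi in zip(y_cal, lower_cal, upper_cal)]
    return conformal_quantile(scores, alpha)


# Toy calibration data for a single target (illustrative values only).
y_cal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
pred_cal = [1.1, 1.8, 3.3, 3.6, 5.5, 5.4, 7.7, 7.2, 9.9]

# Constant width: the interval for a new prediction p is [p - width, p + width].
width = split_conformal_width(y_cal, pred_cal, alpha=0.2)

# Quantile-based: stand-in lower/upper quantile predictions on the calibration set.
lower_cal = [p - 0.5 for p in pred_cal]
upper_cal = [p + 0.5 for p in pred_cal]
# The interval for a new point is [lower - correction, upper + correction].
correction = cqr_correction(y_cal, lower_cal, upper_cal, alpha=0.2)
```

With several targets, the same calculation would be repeated per output column to give a per-target correction.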

Question:

  • Currently the two methods for deriving the intervals are part of the same base class, with the method selected by a keyword argument - this is similar to the way the kernel fn is passed for GPs. Would it be preferable to have two classes? Or perhaps a general emulator subclass factory, extending the current one that only supports GPs?

Next steps:

  • Tests for passing validation data
  • Check UQ performance of ConformalMLP (e.g. on projectile and other simulators)
    • For projectile, "split" constant-width intervals do not provide good UQ, while "quantile" is an improvement but still less good than e.g. a GP.

@sgreenbury sgreenbury force-pushed the 848-conformal-prediction branch from 2d5adb6 to 6a315d2 Compare November 4, 2025 16:08
@sgreenbury sgreenbury force-pushed the 848-conformal-prediction branch from 2cbf790 to 340d1e7 Compare November 4, 2025 18:36
@sgreenbury sgreenbury changed the base branch from main to fix-docs-samples-transformed November 4, 2025 18:36
@sgreenbury sgreenbury marked this pull request as draft November 4, 2025 19:41
@sgreenbury sgreenbury force-pushed the 848-conformal-prediction branch from 08b732c to 3c609fd Compare November 5, 2025 13:54
@sgreenbury sgreenbury marked this pull request as ready for review November 5, 2025 14:24
Base automatically changed from fix-docs-samples-transformed to main November 5, 2025 15:33
@sgreenbury sgreenbury requested a review from EdwinB12 November 7, 2025 13:35
Member

@radka-j radka-j left a comment

This is looking great, I am just writing down some immediate thoughts I had after a quick look before I have a chance to go through it in more detail.

1: Method naming/structure: There are different ways to handle heteroscedasticity with conformal methods. As here, one can use some kind of model that predicts quantiles (this can be quantile regression but could also be any other model like random forest or a NN). Alternatively, people have focused on predicting a scaling factor $\sigma_i$ that is also a function of the data and is then used to scale the intervals (again, there are many ways that $\sigma_i$ can be computed). Therefore it feels that we should explicitly call this ConformalizedQuantileMLP or something like that (see e.g., paper).

2: What are the pros/cons of using an MLP vs just a standard quantile regression here?

3: I think it would be better to write the ConformalEmulator more along the lines of TransformedEmulator, i.e., a wrapper that can take any deterministic model (rather than having a ConformalMLP specifically). It feels that all the conformal components are model agnostic. Is the issue here that we need a name for the emulator to pass to AutoEmulate()?
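
To make point 3 concrete, a model-agnostic wrapper could look roughly like the sketch below. `ConformalWrapper` and its methods are hypothetical names, not the ConformalEmulator or TransformedEmulator API, and scalar inputs and outputs are assumed to keep the example short.

```python
import math
from typing import Callable, Sequence


class ConformalWrapper:
    """Hypothetical wrapper: adds split-conformal intervals to any point predictor."""

    def __init__(self, predict: Callable[[float], float], alpha: float = 0.1):
        self.predict = predict
        self.alpha = alpha
        self.width = math.inf  # unbounded until calibrated

    def calibrate(self, x_cal: Sequence[float], y_cal: Sequence[float]) -> None:
        # Absolute residuals of the wrapped model on held-out calibration data.
        scores = sorted(abs(y - self.predict(x)) for x, y in zip(x_cal, y_cal))
        n = len(scores)
        rank = math.ceil((n + 1) * (1 - self.alpha))
        self.width = scores[rank - 1] if rank <= n else math.inf

    def predict_interval(self, x: float) -> tuple[float, float, float]:
        mean = self.predict(x)
        return mean - self.width, mean, mean + self.width


# Any deterministic model works; a plain function stands in for a fitted emulator.
def linear_model(x: float) -> float:
    return 2.0 * x


wrapped = ConformalWrapper(linear_model, alpha=0.5)
wrapped.calibrate(x_cal=[0.0, 1.0, 2.0], y_cal=[0.5, 1.0, 4.0])
lower, mean, upper = wrapped.predict_interval(3.0)
```

Nothing in the wrapper depends on the wrapped model being an MLP, which is the sense in which the conformal step is model agnostic.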

@radka-j
Member

radka-j commented Nov 12, 2025

Should we also just add standard quantile regression as an optional UQ emulator (for cases where one might not have enough data to hold out for a conformal set)?
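
For reference, plain quantile regression trains each quantile with the pinball loss and skips the calibration step. A minimal sketch of that loss in plain Python follows; the function name is illustrative and this is not the loss implementation used in the PR.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean pinball (quantile) loss for a single quantile level in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by `quantile`, over-prediction by 1 - quantile.
        total += max(quantile * diff, (quantile - 1.0) * diff)
    return total / len(y_true)


y_true = [1.0, 2.0, 3.0, 4.0]

# For the 0.9 quantile, predicting too low costs more than predicting too high.
loss_low = pinball_loss(y_true, [y - 1.0 for y in y_true], quantile=0.9)
loss_high = pinball_loss(y_true, [y + 1.0 for y in y_true], quantile=0.9)
```

Without a held-out calibration set, the resulting intervals carry no finite-sample coverage guarantee, which is the trade-off against the conformal version.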

@radka-j
Member

radka-j commented Nov 12, 2025

I don't think this PR closes #905 since we still don't let the user specify the test data set (and use all x as train data).

Member

@radka-j radka-j left a comment

This looks great! All my comments are really just about naming/docstrings.

@sgreenbury
Collaborator Author

I don't think this PR closes #905 since we still don't let the user specify the test data set (and use all x as train data).

Ah I see - #905 is distinct from updates to the Emulator API required in #589. I'll update the top-level comment as contributes towards #589. Are the API changes in #905 specifically towards the AutoEmulate class?

@sgreenbury
Collaborator Author

sgreenbury commented Nov 12, 2025

From discussion with @radka-j:

  • Add reference from paper above in notes docstring (we can mention that the quantile regression can be done in different ways: the paper uses regression but we're using an MLP here)
  • Extending to methods like the scaling can be in conformal base - add comment to explain and open an issue (different methodologies such as distance to training data and even MLP, to be explored)
  • Change to constant
  • Update to specific subclasses
  • Update API to support distribution type specification and docs explaining assumption

@radka-j
Member

radka-j commented Nov 13, 2025

I was thinking about our discussion yesterday about evaluating conformal intervals using MSLL. I only had a brief look but I don't think we can treat the conformal intervals as a distribution - we don't know if it is normal but also don't have a reason to assume it is uniform. We only have the 2 quantiles we compute (and the mean).

@sgreenbury
Collaborator Author

I was thinking about our discussion yesterday about evaluating conformal intervals using MSLL. I only had a brief look but I don't think we can treat the conformal intervals as a distribution - we don't know if it is normal but also don't have a reason to assume it is uniform. We only have the 2 quantiles we compute (and the mean).

Yes good point - adding a distribution is imposing an additional assumption so that we can represent this as a ProbabilisticEmulator. One option could be to provide API to support specifying / overriding the distributional assumption and some docs to describe the design choice.
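
To show how much the choice matters, here is a small sketch of two ways a conformal interval could be mapped to a distribution. The function names are hypothetical, and both mappings assume the interval is a central (1 - alpha) interval that is symmetric about its midpoint, which need not hold for the quantile-based intervals.

```python
from statistics import NormalDist


def interval_to_normal(lower, upper, alpha):
    """Assume a normal: match its central (1 - alpha) interval to [lower, upper]."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    mean = 0.5 * (lower + upper)
    std = (upper - lower) / (2 * z)
    return mean, std


def interval_to_uniform(lower, upper, alpha):
    """Assume a uniform: widen [lower, upper] so it holds (1 - alpha) of the mass."""
    half_width = 0.5 * (upper - lower) / (1 - alpha)
    mean = 0.5 * (lower + upper)
    return mean - half_width, mean + half_width


# The same 90% interval implies different spreads under the two assumptions.
normal_mean, normal_std = interval_to_normal(-1.0, 1.0, alpha=0.1)
uniform_low, uniform_high = interval_to_uniform(-1.0, 1.0, alpha=0.1)
uniform_std = (uniform_high - uniform_low) / 12 ** 0.5
```

Any likelihood-based metric such as MSLL would then be measuring the assumed shape as much as the interval itself.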

sgreenbury and others added 4 commits November 13, 2025 13:14
Co-authored-by: Radka Jersakova <r.jersakova@gmail.com>
Also revises comments
@radka-j
Member

radka-j commented Nov 13, 2025

Yes good point - adding a distribution is imposing an additional assumption so that we can represent this as a ProbabilisticEmulator. One option could be to provide API to support specifying / overriding the distributional assumption and some docs to describe the design choice.

I think that sounds really good.

This discussion made me think about whether in the case of Ensembles we also shouldn't assume normality. One option would be to introduce something like an EmpiricalDistribution that wraps around the ensemble predictions and can compute all the different empirical quantiles. Sampling in this case could be sampling from the ensemble with replacement. I think assuming normality is common so we can leave it as is but this might be more accurate - maybe we open an issue to discuss?
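
A minimal sketch of what such an EmpiricalDistribution might look like for a single input point is below. The class does not exist in the codebase; the method names and the linear-interpolation quantile rule are assumptions made for illustration.

```python
import random


class EmpiricalDistribution:
    """Hypothetical wrapper around ensemble member predictions at one input."""

    def __init__(self, member_predictions):
        self.values = sorted(member_predictions)

    def mean(self):
        return sum(self.values) / len(self.values)

    def quantile(self, q):
        """Empirical quantile with linear interpolation between order statistics."""
        position = q * (len(self.values) - 1)
        below = int(position)
        above = min(below + 1, len(self.values) - 1)
        weight = position - below
        return (1 - weight) * self.values[below] + weight * self.values[above]

    def sample(self, n_samples, seed=None):
        """Resample the member predictions with replacement."""
        rng = random.Random(seed)
        return rng.choices(self.values, k=n_samples)


# Five ensemble members, one of which is an outlier.
dist = EmpiricalDistribution([2.0, 1.0, 4.0, 3.0, 10.0])
median = dist.quantile(0.5)
lower, upper = dist.quantile(0.025), dist.quantile(0.975)
samples = dist.sample(n_samples=3, seed=0)
```

With a skewed ensemble like this one, the empirical interval is asymmetric about the mean, which a normal approximation would not capture.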

@radka-j
Member

radka-j commented Nov 13, 2025

Related to some of the discussion here, I also think we might want to update our plots to show 95% intervals (rather than 2 sigma). Can create an issue for that if you agree.
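
For context on how far apart the two conventions are under a normal assumption, a quick check using only the standard library:

```python
from statistics import NormalDist

standard_normal = NormalDist()

# Probability mass a normal places within 2 standard deviations of its mean
# (about 0.954).
coverage_2_sigma = standard_normal.cdf(2.0) - standard_normal.cdf(-2.0)

# Multiplier of sigma that gives a central 95% interval under normality
# (about 1.96).
z_95 = standard_normal.inv_cdf(0.975)
```

The two nearly coincide for normals, but for other distributions a 2 sigma band has no fixed coverage, whereas a 95% interval keeps the same meaning.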

@sgreenbury
Collaborator Author

Related to some of the discussion here, I also think we might want to update our plots to show 95% intervals (rather than 2 sigma). Can create an issue for that if you agree.

Yes good point given we have other distributions than normals potentially - let's update this.

@sgreenbury sgreenbury requested a review from radka-j November 18, 2025 12:16
@EdwinB12
Collaborator

Can I check where we are with this PR?

@radka-j
Member

radka-j commented Nov 20, 2025

@EdwinB12 waiting for me to do a second review

Development

Successfully merging this pull request may close these issues.

Add support for UQ with conformal prediction

4 participants