feat: add QMIAAttack quantile regression membership inference attack (#306)#435

Open
ssrhaso wants to merge 15 commits into main from 306-quantile-regression

Conversation

Contributor

@ssrhaso ssrhaso commented Apr 1, 2026

Summary

Implements the quantile regression membership inference attack from [Bertran et al., NeurIPS 2023](https://arxiv.org/abs/2307.03694), as described in #306.

Trains a single HistGradientBoostingRegressor at quantile level (1 − α) on non-member hinge scores to learn per-sample membership thresholds. A record is predicted as a member when its observed score exceeds the predicted threshold.

No shadow models required.

New files

  • sacroml/attacks/qmia_attack.py - QMIAAttack class (18 tests, 100% coverage)
  • sacroml/attacks/utils.py - hinge score, margin conversion, label helpers
  • sacroml/attacks/report.py - create_qmia_report() with QMIA_INTRODUCTION
  • sacroml/attacks/factory.py - registered as "qmia"
  • tests/attacks/test_qmia_attack.py - 18 tests
  • tests/attacks/test_factory.py - factory integration test
  • tests/attacks/test_report.py - JSON sanitisation test
  • examples/sklearn/benchmark_qmia_*.py - benchmark scripts
  • CHANGELOG.md, README.md - updated

Incidental fixes

  • attribute_attack.py: close matplotlib figures after saving (resource leak)
  • target.py: skip serialising data arrays when dataset module is provided
  • test_sklearn.py: fix LabelEncoder deprecation warning
  • test_structural_attack.py: suppress ConvergenceWarning in MLP tests
  • conftest.py: set matplotlib Agg backend, add session cleanup
  • utils.py: refactor check_and_update_dataset to use dict lookup

Test results

  • 157 passed, 0 failures
  • qmia_attack.py: 100% line coverage

@ssrhaso ssrhaso force-pushed the 306-quantile-regression branch from 247033d to 21f3ffa on April 1, 2026 at 11:26

codecov bot commented Apr 1, 2026

Codecov Report

❌ Patch coverage is 97.18310% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 99.39%. Comparing base (a9524af) to head (891cff1).

Files with missing lines Patch % Lines
sacroml/attacks/utils.py 88.57% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #435      +/-   ##
==========================================
- Coverage   99.51%   99.39%   -0.12%     
==========================================
  Files          23       24       +1     
  Lines        2687     2818     +131     
==========================================
+ Hits         2674     2801     +127     
- Misses         13       17       +4     

Contributor

I am not happy with the greatly relaxed bounds on the various attack metrics, for example going from:

assert metrics["TPR"] == pytest.approx(0.91, abs=0.01)
assert metrics["FPR"] == pytest.approx(0.41, abs=0.01)

to

assert 0.5 <= metrics["TPR"] <= 1.0
assert 0.0 <= metrics["FPR"] <= 1.0

no longer checks that the algorithm behaves reproducibly and consistently with previous versions, because the tolerance widens from +/- 1% (which allows for minor differences in floating-point precision across platforms) to +/- 25%.

I know this is partly to do with using randomly created data, but by specifying the random seeds it should still be possible to get reproducible behaviour and keep the acceptable difference in behaviour to within +/- 1% of whatever the new value is.
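As a minimal sketch of the seeding point being made here (the function and metric names are hypothetical stand-ins for the real fixtures and attack code):

```python
import numpy as np

def simulated_attack_metrics(random_state: int) -> dict:
    # Stand-in for an attack run: any pipeline that consumes randomness.
    rng = np.random.default_rng(random_state)
    scores = rng.normal(size=1000)
    labels = rng.integers(0, 2, size=1000)
    preds = scores > 0.0
    tpr = float((preds & (labels == 1)).sum() / (labels == 1).sum())
    fpr = float((preds & (labels == 0)).sum() / (labels == 0).sum())
    return {"TPR": tpr, "FPR": fpr}

# With the seed fixed, repeated runs are identical, so a test can pin
# the expected value with a tight absolute tolerance -- the equivalent
# of pytest.approx(expected, abs=0.01).
first = simulated_attack_metrics(random_state=1)
second = simulated_attack_metrics(random_state=1)
assert first == second
assert abs(first["TPR"] - second["TPR"]) <= 0.01
```

Once the fixture is seeded, the only residual variation is floating-point precision across platforms, which a +/- 1% tolerance comfortably absorbs.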

Contributor


Hello Jim,

Sorry for the oversight. I have tightened the QMIA test assertions to +/- 1% using pytest.approx, following the same pattern as test_factory, and replaced the loose bounds with exact expected values. I also added TPR@FPR threshold assertions, because they test QMIA's ability to detect members at a controlled false positive rate. I think these are worth keeping, but let me know if you think otherwise.

I went more in-depth and just wanted to confirm my understanding, as I do not want to change anything outside of the QMIA scope.

Does test_factory check correctness with tight bounds while the other tests just verify that the code runs and produces valid metrics? And wouldn't it have been better to add random_state to those fixtures and tighten them from the start?

test_lira_attack and test_worst_case_attack use loose assertions like assert 0 <= metrics["TPR"] <= 1 because their fixtures do not set random_state, so the metric values differ between runs.

test_factory, by contrast, runs the full end-to-end pipeline and is the only test with tight assertions, because its get_target fixture sets random_state=1, making the results reproducible.

The QMIA fixtures do have random_state set, but I kept the loose format to stay consistent with the other individual attack tests. Should I tighten those as well, like I did in the factory test?

Please let me know if I have misunderstood anything.

Thanks!
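For reference, a TPR-at-fixed-FPR check of the kind mentioned above can be sketched as follows. The helper name, toy data, and target rate are hypothetical, not the actual test code.

```python
import numpy as np
from sklearn.metrics import roc_curve

def tpr_at_fpr(y_true: np.ndarray, scores: np.ndarray, target_fpr: float) -> float:
    # Largest TPR achievable at or below the target false-positive rate.
    fpr, tpr, _ = roc_curve(y_true, scores)
    return float(tpr[fpr <= target_fpr].max())

rng = np.random.default_rng(1)
# Seeded toy scores: members (label 1) score higher on average.
y = np.concatenate([np.ones(200), np.zeros(200)])
s = np.concatenate([rng.normal(1.0, 1.0, 200), rng.normal(0.0, 1.0, 200)])

value = tpr_at_fpr(y, s, target_fpr=0.1)
assert 0.0 <= value <= 1.0  # a seeded fixture could pin this to +/- 0.01
```

With the fixture seeded, the assertion at the end could be tightened to an exact expected value via pytest.approx, in line with the review comment above.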

@shamykyzer shamykyzer linked an issue Apr 1, 2026 that may be closed by this pull request
@shamykyzer shamykyzer requested a review from jim-smith April 2, 2026 11:24


Development

Successfully merging this pull request may close these issues.

Implement quantile regression and other non-shadow attacks

3 participants