
Big PR about using uT as fit axis, and developments to fit the W boson width (partially based on the other changes)#678

Merged
davidwalter2 merged 63 commits into WMass:main from cippy:main_mergeUpstream_devel_150326
Mar 31, 2026

Conversation

@cippy
Collaborator

@cippy cippy commented Mar 25, 2026

This PR supersedes the previous PR #643

It does many things, but doesn't modify the default analysis behaviour. A short executive summary:

  • extension of the ABCD method to work with the uT axis
  • developments to fit the W width, either standalone or simultaneously with mass
  • customisation of some systematic uncertainties related to the recoil (optional)
  • several updates in plotting scripts

cippy and others added 30 commits October 17, 2025 15:44
… analysis bins of specific axes (e.g. utAngleSign) which have identically zero yield in the A-Ax-B-Bx regions

(cherry picked from commit 773c1d9)
… analysis bins of specific axes (e.g. utAngleSign) which have identically zero yield in the A-Ax-B-Bx regions
…current version of Rabbit, it will be probably reverted when the definitions of poi/noi/nuisanceNotConstrained are fixed
…ion on uT>0 with a hardcoded factor (normalization)
if args.fakeTransferAxis in args.fakerateAxes
else ""
),
fakeTransferCorrFileName=args.fakeTransferCorrFileName,
Collaborator

What does this actually do? It reads another file that used a looser selection, or something?

Collaborator Author

fakeTransferCorrFileName contains the correction needed for the prediction of the fakes with negative uT based on the one from high uT. It is needed because right now the low mT region for negative uT has no events because of kinematics, so the ABCD method cannot predict the background in the high mT region there. The correction is basically the ratio of the data yields between the two uT regions as obtained from the secondary vertices control region, subtracting the small prompt component. Then we also propagate uncertainties based on non closures between control and signal region, and data and QCD MC. It is essentially what Ruben presented during the workshop in Pisa.
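To make the procedure concrete, here is a minimal numerical sketch of the transfer correction described above: the ratio of prompt-subtracted data yields between the two uT regions, used to predict the fakes at negative uT from the positive-uT estimate. All array names and numbers are purely illustrative, not taken from the actual wremnants files.

```python
import numpy as np

# Hypothetical per-bin yields in the secondary-vertices control region
# (illustrative numbers only).
data_ut_pos = np.array([120.0, 95.0, 60.0])   # data yields, uT > 0
data_ut_neg = np.array([30.0, 22.0, 12.0])    # data yields, uT < 0
prompt_ut_pos = np.array([10.0, 8.0, 5.0])    # small prompt component, uT > 0
prompt_ut_neg = np.array([2.0, 1.5, 0.8])     # small prompt component, uT < 0

# Transfer factor: ratio of prompt-subtracted data yields between the
# two uT regions of the control region.
transfer = (data_ut_neg - prompt_ut_neg) / (data_ut_pos - prompt_ut_pos)

# Predict the fakes at negative uT from the positive-uT ABCD estimate.
fakes_ut_pos = np.array([50.0, 40.0, 25.0])
fakes_ut_neg_pred = transfer * fakes_ut_pos
```

Uncertainties from the non-closures (control vs signal region, data vs QCD MC) would then be propagated as additional variations on `transfer`.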

systAxes=["width"],
systNameReplace=[["2p09053GeV", "Down"], ["2p09173GeV", "Up"]],
passToFakes=passSystToFakes,
widthVarTag = ""
Collaborator

Maybe the width related stuff should go in rabbit_helpers since it's gotten so long

Collaborator Author

Yes, I can try to move there as many things as possible

Collaborator Author

Done

labelsByAxis=["varTF", "downUpVar"],
)

## syst for transfer factor difference between data and QCD MC in control region
Collaborator

Delete commented out stuff (here and other places)

Collaborator Author

This particular item is one thing we are still finalising, but I guess I can remove it for now.
Other places with commented code were in some cases not added by this PR, so I am not sure what to do with them.

Collaborator Author

Done for now, it will be added back in the near future

actionArgs=dict(
altHistName="fakeCorr_closQCDsignal",
varIdxs=[],
correctionFile=f"{common.data_dir}/fakesWmass/{args.fakeTransferCorrFileName}.pkl.lz4",
Collaborator

Can this not be computed on the fly? Where does it come from?

Collaborator Author

Right now it cannot, because it makes use of the control region, so it is supposed to be produced in advance (the files will need to be stored in wremnants-data).
However, we are considering modifying the procedure so as to derive the correction directly from the signal region using the QCD MC, rather than from data in the control region, to avoid depending too much on the latter. The systematic uncertainties would still need to be produced using the control region, where everything is validated, so they should be available as external inputs anyway.

@bendavid
Collaborator

What actually needed to be changed to fit the W width? This was in principle already working.

@cippy
Collaborator Author

cippy commented Mar 25, 2026

What actually needed to be changed to fit the W width? This was in principle already working.

I made the choice of the prefit variation more flexible through options. I also added new options to perform the same studies as we did for mW, such as decorrelation by variable or fitting the width difference between charges.
There were also some minor things related to calling addSystematic: some arguments were propagated as if one was measuring the mass rather than the width.

@cippy
Collaborator Author

cippy commented Mar 26, 2026

Last commit moved the definition of the mass and width systs from setupRabbit.py to rabbit_helpers.py, also fixing a couple of bugs/inconsistencies in their definition.

Collaborator

There is a lot of stuff added: new columns and histograms, even if one runs without any argument. Can those things that are only needed for tests be put into a block disabled by default, e.g. using "--auxiliaryHistograms" or a new flag like "--addHistsForMETStudies" or so?

Collaborator Author

Yes, good point, I'll do that.

@@ -36,14 +133,14 @@ def add_mass_diff_variations(
)
# mass difference by swapping the +50MeV with the -50MeV variations for half of the bins
args = ["massShift", f"massShift{label}50MeVUp", f"massShift{label}50MeVDown"]
if mass_diff_var == "charge":
if any(mass_diff_var == var for var in ["charge", "utAngleSign"]):
Collaborator

if mass_diff_var in ["charge", "utAngleSign"]:

Collaborator Author

I don't know why I did it like that, silly me :)

@@ -40,10 +40,13 @@ TVector2 get_z_mom(const float pt1, const float phi1, const float pt2,
TVector2 l1 = TVector2();
l1.SetMagPhi(pt1, phi1);

TVector2 l2 = TVector2();
l2.SetMagPhi(pt2, phi2);
// TVector2 l2 = TVector2();
Collaborator

remove comments

@@ -166,6 +176,8 @@ def translate_html_to_latex(n):
"MET_pt": {"label": r"$\mathit{p}_{\mathrm{T}}^{miss}$", "unit": "GeV"},
"MET": {"label": r"$\mathit{p}_{\mathrm{T}}^{miss}$", "unit": "GeV"},
"met": {"label": r"$\mathit{p}_{\mathrm{T}}^{miss}$", "unit": "GeV"},
# "mt": {"label": r"$\mathit{m}_{T}^{\mu,MET}$", "unit": "GeV"},
Collaborator

remove?

Collaborator

@davidwalter2 davidwalter2 left a comment

Review of the non-notebook, non-analysisTools code changes. The uT-axis ABCD extension and W width fitting additions are clearly useful. A few comments below.

trTensorPath = (
f"{data_dir}/fakesWmass/{self.fakeTransferCorrFileName}.pkl.lz4"
)
logger.warning(f"Loaded transfer tensor for fakes: {trTensorPath}")
Collaborator

This should be logger.info rather than logger.warning — loading a file successfully is not a warning situation. Same for line 479 (self.fakerate_integration_axes = ...), which is purely informational.

Collaborator Author

ok


else:
logger.debug(
f"All ABCD values are zeros! Returning Fake estimate based on other bin, with an HARDCODED norm. factor"
Collaborator

The comment says "HARDCODED norm. factor" but no explicit normalization factor is actually applied here — the fallback uses the complementary bin's smoothed spectrum multiplied by self.fakeTransferTensor. Could you clarify whether a normalization is intentionally missing, or update the comment to accurately describe what is being done?

Collaborator Author

I think it was an old message when we were only scaling the normalization of the prediction taken from the other bin, while now we propagate an actual shape (also with a different normalization) with some uncertainties. I'll change the message
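For reference, a minimal sketch of what the fallback discussed above computes, assuming the transfer tensor is applied element-wise to the complementary bin's smoothed spectrum (function and variable names here are illustrative, not the actual class attributes):

```python
import numpy as np

def fallback_fake_estimate(smoothed_other_bin, transfer_tensor):
    """Fake estimate for a bin where all ABCD yields are zero.

    Element-wise product with the transfer tensor: the tensor carries both
    the shape and the normalization difference between the two uT regions,
    so no separate hardcoded normalization factor is applied.
    """
    return smoothed_other_bin * transfer_tensor

# Illustrative inputs: smoothed spectrum from the complementary uT bin
# and a per-bin transfer tensor.
smoothed = np.array([10.0, 8.0, 5.0])
tensor = np.array([0.3, 0.25, 0.2])
pred = fallback_fake_estimate(smoothed, tensor)
```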


// use scale=1.1 and smear=0.2 for 10% larger mean value and 20% resolution
// smearing
std::seed_seq seq{std::size_t(run), std::size_t(lumi), std::size_t(event)};
Collaborator

Each call to get_scaled_smeared_variable within the same event uses identical seeds (run, lumi, event), so smearMET_pt and smearMET_phi will draw from the same RNG state and be 100% correlated. Consider adding a call-site tag (e.g. a string hash or integer constant) to the seed_seq to ensure independence between the three smear/scale variations.

Collaborator Author

Thank you, I hadn't realised that.
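The effect can be illustrated in Python, where numpy's SeedSequence machinery plays the role of std::seed_seq: seeding two generators with the identical (run, lumi, event) tuple reproduces the same draw, while appending a distinct call-site tag decorrelates them (values below are arbitrary):

```python
import numpy as np

run, lumi, event = 301234, 56, 789012

# Identical seeds -> identical RNG state -> 100% correlated draws.
same_a = np.random.default_rng([run, lumi, event]).normal()
same_b = np.random.default_rng([run, lumi, event]).normal()

# Appending a distinct call-site tag gives independent streams,
# e.g. tag 1 for the MET pt smearing and tag 2 for the MET phi smearing.
tagged_pt = np.random.default_rng([run, lumi, event, 1]).normal()
tagged_phi = np.random.default_rng([run, lumi, event, 2]).normal()
```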

@@ -1460,7 +1463,8 @@ def add_recoil_unc_W(
return df

def setup_recoil_Z_unc(self):
if not self.dataset.name in self.datasets_to_apply or not self.storeHists:
# if not self.dataset.name in self.datasets_to_apply or not self.storeHists:
Collaborator

Left-over commented code. If the self.storeHists guard is no longer needed, please remove the commented-out line rather than leaving it.

# to define dedicated systematics for scale/smearing of met_pt
df = df.Define(
"scaleMET_pt",
"wrem::get_scaled_smeared_variable(run, luminosityBlock, event, MET_corr_rec_pt, 1.01, 0.0)",
Collaborator

The scale/smear values (1% scale, 5% smear for pt and phi) look like temporary test values, especially since in setupRabbit.py the phi smear is further rescaled by 0.4 with a # for test comment. What are the physics-motivated values for these? A TODO referencing the derivation would help.

Collaborator Author

I am still doing some studies to estimate reasonable values to use. I ended up with these based on the shape variations, but I actually believe they are still overestimated, because the uncertainty applied to MC should reflect the residual data/MC difference after calibration, and this is probably of order 1% or less (otherwise extracting mW from mT should be hopeless).

"eta-sign",
"eta-range",
],
help="For use with --noi widthDiffW, select the variable to define the different mass differences",
Collaborator

Copy-paste typo in the help string: "define the different mass differences" should be "define the different width differences".

Collaborator Author

thank you

"expNoLumi",
"expNoCalib",
],
scale=0.4, # from 5% -> scale * 5% for test (see histmaker)
Collaborator

The scale=0.1 and scale=0.5 are commented out for scaleMET_pt and smearMET_pt, while smearMET_phi has an active scale=0.4 with a # for test comment. This inconsistency suggests the custom recoil syst block is not yet finalized. Are these ready to merge, or should this remain clearly marked as experimental?

Collaborator Author

I can probably keep the values I tested so far as the baseline, even though they might be optimised further.

variation_size=0.1,
keepConstantAxisBin={},
fakeselector=None,
*args,
Collaborator

Having *args between keyword arguments with defaults (keepConstantAxisBin, fakeselector) and **kwargs is fragile — any extra positional argument would be silently captured by *args and ignored. Unless forwarding positional args to fakeselector.get_hist is intentional (which it appears to be from line 990), please add a comment clarifying this, or restructure to avoid the ambiguity.

Collaborator Author

I'll try
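A minimal sketch of the signature concern, with illustrative function names (not the actual signature in the PR): with *args placed after defaulted parameters, an extra positional argument is silently captured, whereas making the defaults keyword-only removes the ambiguity.

```python
def fragile(variation_size=0.1, keepConstantAxisBin=None, fakeselector=None,
            *args, **kwargs):
    # Any positional beyond the three defaulted parameters lands in args
    # without raising an error, so a misplaced argument is silently ignored.
    return args

def safer(*args, variation_size=0.1, keepConstantAxisBin=None,
          fakeselector=None, **kwargs):
    # Positionals are now unambiguously the ones meant for forwarding
    # (e.g. to a hypothetical fakeselector.get_hist); the defaulted
    # parameters can only be set by keyword.
    return args

captured = fragile(0.1, None, None, "oops")   # "oops" swallowed silently
forwarded = safer("hist", variation_size=0.2)  # explicit and unambiguous
```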

@cippy
Collaborator Author

cippy commented Mar 28, 2026

Ok, this should be ready, assuming I didn't forget to implement any of the comments I received.
CI looks good; it's the usual small difference in the BSM workflow.

For the seeding of the smearing, I could have implemented a more general solution that automatically increments a counter after each function call, but it was simpler to pass an integer directly: the function is currently called only twice (three times, actually, but the first call doesn't use the random number generator), and the manual setting lets the user reuse the same number if ever needed.

@davidwalter2 davidwalter2 merged commit fe7c6a9 into WMass:main Mar 31, 2026
26 checks passed