-
Notifications
You must be signed in to change notification settings - Fork 20
DM-44889: Investigate feasibility of resurrecting matchBackgrounds.py background fitting algorithms #956
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
d99bc60 to
8dfbe5e
Compare
a95c234 to
b0463a5
Compare
7c1554f to
fea12d4
Compare
fea12d4 to
30ebfa0
Compare
52df871 to
3210ec9
Compare
a969a20 to
f379682
Compare
ac42437 to
c652ce0
Compare
6a6037f to
10cf7cc
Compare
10cf7cc to
072ece2
Compare
TallJimbo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a decent cleanup of the existing code. Only serious problem is the use of a single BackgroundList to represent the backgrounds for all visits, but that should be easy to fix (and I think using more visit-ID-keyed dicts instead of lists will help in other parts of the code, too).
| refWarpVisit = Field[int]( | ||
| doc="Visit ID of the reference warp. If None, the best warp is chosen from the list of warps.", | ||
| optional=True, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field would always have to be None in production. I assume it's here just for testing?
| default=0.2, | ||
| min=0.0, | ||
| max=1.0, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might make sense to put reference image selection into a subtask, so if you wanted to use a different algorithm with totally different config parameters, you could retarget.
| default=256 | ||
| binSize = Field[int]( | ||
| doc="Bin size for gridding the difference image and fitting a spatial model.", | ||
| default=1024, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new default seems enormous for background matching. I think a lot of the point of the algorithm is to be able to fit the background on small scales, because you know what you're fitting is not astrophysical.
| self.statsFlag = stringToStatisticsProperty(self.config.gridStatistic) | ||
| self.statsCtrl = StatisticsControl() | ||
| # TODO: Check that setting the mask planes here work - these planes | ||
| # can vary from exposure to exposure, I think? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is safe. The mask planes do not vary from exposure to exposure (that's how getPlaneBitMask can be a static method). Enforcing that adds complexity in other places, but it's code like this that benefits.
| All fields except isReference will be None if isReference True or the fit failed. | ||
| result : `~lsst.afw.math.BackgroundList` | ||
| Differential background model | ||
| Add this to the science exposure to match the reference exposure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What "science exposure"? This method takes N warps, so it seems like it needs to return N-1 BackgroundList objects.
| return pipeBase.Struct( | ||
| backgroundInfoList=backgroundInfoList) | ||
| # TODO: more elegant solution than inserting blank model at ref ind? | ||
| backgroundInfoList.insert(refInd, blank) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this explains my confusion about the return type in the docs. A BackgroundList represents the background for a single image (in this case, it'd all be single-element lists, or zero for the reference).
I think you want this to be a dict mapping visit ID to BackgroundList - and that solves the problem of what to do for the reference image, too: it just doesn't get an entry in that dict.
| warpNPointsGlobal = [] # Global coverage | ||
| warpNPointsEdge = [] # Edge coverage | ||
| for warpDDH in warps: | ||
| warp = warpDDH.get() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the long term, it'd be better to compute these quantities earlier, when we make the warps, and save a compact representation so we don't have read each full warp twice in this task.
| costFunctionArr += self.config.bestRefWeightCoverage * coverageArr | ||
| return numpy.nanargmin(costFunctionArr) | ||
| nx = warp.getWidth() // self.config.binSize | ||
| ny = warp.getHeight() // self.config.binSize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a problem that you're rounding down here but rounding up around line 291?
| """ | ||
| maskedImage = exposure.getMaskedImage() | ||
| fluxZp = exposure.getPhotoCalib().instFluxToNanojansky(1) | ||
| exposure.image.array *= fluxZp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure the PhotoCalib method has a method that calibrates an image directly. That should also take care of scaling the variance appropriately, which is not happening (but should) here.
| Process creates a difference image of the reference exposure minus the | ||
| science exposure, and then generates an afw.math.Background object. It | ||
| assumes (but does not require/check) that the mask plane already has | ||
| detections set. If detections have not been set/masked, sources will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole point of background matching is that the astrophysical sources should subtract away, and I think that means we don't need to mask detections. Even if the subtractions are't super clean, they should average out if we've done the right photometric scaling.
072ece2 to
d957a3c
Compare
Initial version of this task is written using old architecture, and so needs updating. In MatchBackgroundsTask, and its method selectRefExposure, required parameters were equally outdated: DataId, DatasetType, and ImageScaler. All of these now seem consolidated under lsst.afw.image.Exposure, so separate calls to DataId and DatasetType are now single calls for Exposure objects. ImageScaler calls were replaced in-line with Exposure.getPhotoCalib() calls, to scale all image flux to the same zeropoint (nJy). Also, we want to process visit-level images using this, so a MatchBackgroundsConnections class was created, MatchBackgroundsConfig was updated to inherit from PipelineTaskConfig (and those connections), and a rudimentary runQuantum method was added to MatchBackgroundsTask.
Code now runs without complaint through self.matchBackgrounds. Also added a self._fluxScale method to replace repeat code blocks. Will decide later if scaling to nJy is the best way to do this.
Code is now functional, in that it accepts images and returns difference image background models as "psfMatchedWarpBackground_diff" (name likely to be altered later). Uses a fit to a blank image for that corresponding to the reference image.
Difference background models are now formatted properly, to allow for image creation from the spline parameters. Also did some adjustments to documentation for Flake8 formatting.
_defineWarps() now rejects any image with all NaNs along any image edge, and creates the cost function using a sky-subtracted image. This sky-subtraction fits a 1st order Chebyshev polynomial to the masked image background. Also fixed a bug from LSK refactor by inserting a blank sky model into the background model list at the chosen reference image index.
Otherwise, changes are clean-up from previous refactoring to restore functionality, plus a bug fix. Bug fix was the restoration of two lines of code in MatchBackgroundsTask.matchBackgrounds() which produced a difference image to work from.
All images and background models now returned in counts, not nJy.
`matchBackgrounds` in its original form matches by warps, i.e. single detectors. A more sensible thing to do is to match backgrounds across the whole focal plane. This functionality needed to be added to do this using warped exposures, in tract coordinates, so this is now added to `backgrounds.py`. This commit also includes a partial revision to `matchBackgrounds.py` using this new functionality, choosing a reference visit ID based on these background models instead of on individual warps. Full functionality has yet to be restored in light of these changes.
All methods, including run, updated to function properly with tract backgrounds rather than warps. Task completes without error when run on three full visits, and output appears roughly correct.
d957a3c to
b7318ec
Compare
| "SkyMeasurementTask", | ||
| "SkyStatsConfig", | ||
| "TractBackground", | ||
| "TractBackgroundConfig", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please move these tasks to a new module, and then revert any changes to the rest of this module that were just formatting.
Reformatting an existing file (in a package where we don't have GitHub Actions set up to keep it that way) is only merited in cases where you're doing significant modifications to that code anyway (so I think matchBackgrounds.py qualifies but background.py does not), and in that case the reformatting really should be on a separate commit.
| class TractBackground: | ||
| """ | ||
| As FocalPlaneBackground, but works in warped tract coordinates | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstring should start on the first line, and I think it needs to be a bit more than just this sentence.
| background : `TractBackground` | ||
| Something guaranteed to be a `TractBackground`. | ||
| """ | ||
| return cls(other.config, other.tract, other.dims, other.transform, other._values, other._numbers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method doesn't seem to be used anywhere.
| """Constructor | ||
| Developers should note that changes to the signature of this method | ||
| require coordinated changes to the `__reduce__` and `clone` methods. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put this in a code comment.
| values : `lsst.afw.image.ImageF` | ||
| Measured background values. | ||
| numbers : `lsst.afw.image.ImageF` | ||
| Number of pixels in each background measurement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
__init__ parameter docs should go in the class docstring. I don't think a docstring on an __init__ actually goes anywhere useful. Or at least it doesn't end up in all of the useful places.
| # TODO: as stated above, fitting a pre-binned image results in a | ||
| # null variance image. But we want to add variance into the cost | ||
| # function. How best to do that? Below is a bad temporary | ||
| # solution, just assuming variance = mean |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend computing an estimate of the variance earlier and attaching it to the tract backgrounds (i.e. as a scalar). variance = mean is definitely not going to hold in nJy.
| Data for ref 1. | ||
| fluxZp = exposure.getPhotoCalib().instFluxToNanojansky(1) | ||
| exposure.image *= fluxZp | ||
| return fluxZp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since warps are now nJy, I don't think you need this method anymore, and deleting it is the easiest way to find the places that call it.
| binned FFP reference image, then generates TractBackground | ||
| objects. It assumes (but does not require/check) that the mask planes | ||
| already have detections set. If detections have not been set/masked, | ||
| sources will bias the difference image background estimation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the sources subtract out, and hence not bias the difference image background estimation?
If it looks like they are biasing the backgrounds, then that sort of throws the whole premise into doubt (unless it's something simple, like needing to use PSF-matched warps instead of direct warps).
| maskIm.image += bkgdIm | ||
| # Then convert everything back to counts | ||
| maskIm.image /= instFluxToNanojansky | ||
| bkgdIm /= instFluxToNanojansky |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More now-unnecessary pixel calibrations.
| backgrounds are then used to generate 'offset' images for each warp | ||
| comprising the full science exposure visit, which are then added to | ||
| each warp to match the background to that of the reference visit at the | ||
| warp's location within the tract. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when one of the patches doesn't fully overlap the reference visit?
The matchBackgrounds.py script was an initial attempt at matching multiple warped visit-level images to a given reference image. Taking the difference images between the reference and each subsequent image allows for each of these images to be modified to match the background of the reference image. These images may then be stacked (coadded), and a single background estimation made on the combined (and, importantly, deeper) coadd image.
The existing matchBackgroundsTask has not been utilized in some time. This ticket aims to explore the feasibility of resurrecting this code base, with a mind towards testing on HSC in the near-term.