Skip to content

Draft: Add Phase-Resolved Analysis Tutorial and Tools.#497

Draft
KoothodilAbhijithAugustine wants to merge 1 commit intocositools:developfrom
KoothodilAbhijithAugustine:phase_resolved_analysis_version_one
Draft

Draft: Add Phase-Resolved Analysis Tutorial and Tools.#497
KoothodilAbhijithAugustine wants to merge 1 commit intocositools:developfrom
KoothodilAbhijithAugustine:phase_resolved_analysis_version_one

Conversation

@KoothodilAbhijithAugustine
Copy link
Contributor

@KoothodilAbhijithAugustine KoothodilAbhijithAugustine commented Feb 27, 2026

Description

Resolves #395 and #413

This draft PR introduces a complete tutorial and a set of supporting Python tools for performing phase-resolved pulsar analysis within the cosipy framework.

Currently, the workflow demonstrates how to process Data Challenge 2 (DC2) FITS data for the Crab pulsar and its corresponding albedo background. The end goal of this pipeline is to prepare isolated on-pulse and off-pulse FITS files that are ready for downstream COSI Spectral Fitting.

Key Additions

  • Phase-Resolved Analysis Tutorial (example_notebook.ipynb): A step-by-step Jupyter Notebook guiding users through combining unbinned data, slicing by Mission Elapsed Time (MET), assigning phases, plotting the pulse profile, and selecting phase intervals.
  • phase_assigner.py: A new module that reads a pulsar ephemeris file (e.g., crab.par) and calculates/appends the exact rotational phase to individual photon events in a FITS file.
  • plot_pulse_profile.py: A plotting utility to generate and visualize the folded light curve/pulse profile from the phase-assigned data.
  • phase_selector.py: A filtering tool to extract specific phase windows (e.g., on-pulse and off-pulse regions) and save them as new FITS files.

@KoothodilAbhijithAugustine KoothodilAbhijithAugustine added the phase_analysis Modulations, pulses and phase resolved analyses label Feb 27, 2026
@KoothodilAbhijithAugustine KoothodilAbhijithAugustine added this to the v4.0 - DC4 milestone Feb 27, 2026
@codecov
Copy link

codecov bot commented Feb 27, 2026

Codecov Report

❌ Patch coverage is 0% with 302 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.35%. Comparing base (fb0f9da) to head (d3584a3).
⚠️ Report is 1653 commits behind head on develop.

Files with missing lines Patch % Lines
cosipy/phase_resolved_analysis/phase_selector.py 0.00% 97 Missing ⚠️
cosipy/phase_resolved_analysis/time_selector.py 0.00% 92 Missing ⚠️
...sipy/phase_resolved_analysis/plot_pulse_profile.py 0.00% 65 Missing ⚠️
cosipy/phase_resolved_analysis/phase_assigner.py 0.00% 45 Missing ⚠️
cosipy/phase_resolved_analysis/__init__.py 0.00% 3 Missing ⚠️
Files with missing lines Coverage Δ
cosipy/phase_resolved_analysis/__init__.py 0.00% <0.00%> (ø)
cosipy/phase_resolved_analysis/phase_assigner.py 0.00% <0.00%> (ø)
...sipy/phase_resolved_analysis/plot_pulse_profile.py 0.00% <0.00%> (ø)
cosipy/phase_resolved_analysis/time_selector.py 0.00% <0.00%> (ø)
cosipy/phase_resolved_analysis/phase_selector.py 0.00% <0.00%> (ø)

... and 72 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@hiyoneda hiyoneda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @KoothodilAbhijithAugustine, I am checking your codes. Let me share some comments.

Selects events based on pulsar phase.
Highly optimized for FITS files via NumPy vectorization.
"""
def __init__(self, ephemeris_file, pstart, pstop, batch_size=10000):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to allow users to put a list of phase interval ranges.


def _get_vectorized_mask(self, phases: np.ndarray) -> np.ndarray:
"""Core logic applied across an entire NumPy array instantly."""
pstop_norm = self.pstop % 1.0 if self.pstop > 1.0 else self.pstop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the multiple ranges are allowered, we can constrain the pstop and pstat to be from 0 to 1.

self.params = self._parse_par_file(par_file)

# Parse F0 (Frequency) or P0 (Period)
if 'F0' in self.params:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future work: let's put "F1" and "F2" in the future, not this time.

logger = logging.getLogger(__name__)

# --- ROBUST IMPORTS (Safety Switch) ---
try:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new interface has been merged, so this part can be updated.

for sel in mask:
yield bool(sel)

def filter_events(self, events, output_fits=None, template_fits=None):
Copy link
Contributor

@hiyoneda hiyoneda Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filter_event, _save_fits_fast, and save_fits can be removed because these functionalities should be implemented as a part of DataIO and are duplicated if they are added here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we decided to make this class independent of the event selection interface, we can leave these methods.

Generates a 3-panel figure: Pulse Profile, Phaseogram, and Significance Test.
Optimized for direct FITS table input (vectorized).
"""
def __init__(self, data, n_bins=50, n_time_bins=50):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be happy if you could add a docstring here to explain what kind of "data" is assumed as an input of this class.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what kinds of plots will be generated

if not batch: return
yield batch

class TimeSelector(EventSelectorInterface):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can delete this class.

@israelmcmc israelmcmc moved this to Under review in cosipy Mar 24, 2026
@hiyoneda
Copy link
Contributor

hiyoneda commented Mar 25, 2026

I checked the notebooks and found some issues. This is a draft pull request, so you may already have noticed these points and plan to resolve them in the next pull request. Let me share them anyway.

  • The first cell raises errors due to wrong importing. from time_selector import TimeSelector should be from cosipy.phase_resolved_analysis import TimeSelector, PhaseSelector, PlotPulseProfile
  • fetch_wasabi_file is not imported but called. The same for analysis.combine_unbinned_data.
  • For fetch_wasabi_file, you can add checksum values to avoid downloading the same data multiple times. The checksum value can be checked with md5 command on mac.
    • MD5 (Crab_DC2_3months_unbinned_data.fits.gz) = dc261c4adf86a44e0349d47c72185a99
    • MD5 (albedo_photons_3months_unbinned_data.fits.gz) = 4c125410d8f127d7e12682f008d5651d
    • MD5 (20280301_3_month_with_orbital_info.ori) = 206122c81670bbd8765b43c9d4872bd4
  • On the 5th cell, the filename extension is changed to '.fits' from '.fits.gz', which causes an error. Overall, it would be great to ensure the notebook runs without any modifications.
  • To assigning pulse phase, PhaseAssigner should be added in __init_.py, and it can be imported as from cosipy.phase_resolved_analysis import PhaseAssigner.
  • FileNotFoundError: [Errno 2] No such file or directory: 'crab_with_albedo_1d.fits'. It should be 'crab_with_albedo_1d.fits.gz'.

With modifications, the notebook worked well in my environment.

@hiyoneda
Copy link
Contributor

Actually, I got somewhat different results for the 2D phaseogram. Do you have any idea on this?

Screenshot 2026-03-25 at 12 08 21

@hiyoneda
Copy link
Contributor

@israelmcmc
Is there an example of how to perform event selection using the new interface with COSI data? We currently use the filter_events method in the PhaseSelector class for pulse-phase-based event selection, and would like to migrate to the new interface instead.
I am also trying to understand the overall data-handling structure under the new interface. The DataIO class provides the event list as a dictionary (cosi_dataset) as before, and the event selection class appears to support TimeTagEmCDSEventDataInSCFrameFromDC3Fits — but it does not seem to be used in the DataIO unbinned class. Are there plans to integrate DataIO with the new interface more?

@israelmcmc
Copy link
Collaborator

@hirokiyoneda Here's how the event selection works, but please see my point at the end that it might now be the best way to go about this for the phase.

Currently the only example of an EventSelectorInterface is TimeSelector, but it's enough to understand how it works.

The EventSelectorInterface is pretty simple: it receives a list of events of a certain type, and for each one it returns a boolean flag that signals whether it was selected or not. The event data type determines which parameters a given selector can use for the passing criteria. For example, the EventSelectorInterface uses TimeTagEventDataInterface, which is guaranteed to have time. You can pass a particular event data that has even more information, through inheritance, but EventSelectorInterface will only consider time. For example, TimeSelector supports the particular implementation that you mentioned, TimeTagEmCDSEventDataInSCFrameFromDC3Fits, because it inherits from TimeTagEmCDSEventDataInSCFrameFromArrays > TimeTagEmCDSEventDataInSCFrameInterface > TimeTagEventDataInterface.

Note that EventSelectorInterface doesn't know where the data comes from. The DataIO.UnbinnedData class is the one responsible for handling the files on disk. Internally, TimeTagEmCDSEventDataInSCFrameFromDC3Fits uses it to load the events. Thats currently the extent of the integration between the interfaces and DataIO, but it can be improved! However, I think we'll need to work on #344 first (or do both at the same time).

Ideally, UnbinnedData should be able to wraps it's data using instances of EventDataInterface. It can have it's own selection methods for convenience, but it should internally delegate the calculations to a particular EventSelectorInterface.

For the case at hand, phase selection, you would need:

  1. Define the EventWithPhaseInterface/EventWithPhaseInterface (akin to TimeTagEventInterface/TimeTagEventDataInterface).
  2. Define specific implementation of the above classes, which can actually be instatiated e.g. EventWithPhase/EventDataWithPhase (akin to TimeTagEmCDSEventInSCFrame/TimeTagEmCDSEventDataInSCFrameFromArrays).
  3. Write a PhaseSelector(EventSelectorInterface) class, that can take a EventWithPhaseInterface an returns the list of bool flags.

However, using a PhaseSelector might not be the most efficient way to do phase analysis, since you'd have to loop through the same event list multiple time for each phase selection. It would be better to use the phase information to split the data directly, one set of data for each phase range. Or, if you want it binned, you can have a "phase" axis in your histogram and bin it directly.

@israelmcmc
Copy link
Collaborator

On a separate topic, note that we'll also need a method to update the exposure in SpacecraftHistory based on the phase selection. This is how we can get the right normalization without changing the rest of the analysis.

@hiyoneda
Copy link
Contributor

Thanks @israelmcmc for the thorough explanation.

I understand that the DataIO refactoring is still ongoing. For now, I thought that it would be better to decouple PhaseSelector from EventSelectorInterface and make it a standalone class that directly filters a DataIO.UnbinnedData instance. This keeps things simple and lets us move forward with DC4 using Abhi's current implementation as a base. It might be possible to use EventSelectorInterface but at this moment we need to think about how to save the filtered data (something like opening TimeTagEmCDSEventDataInSCFrameFromDC3Fits with filtering -> converting TimeTagEmCDSEventDataInSCFrameFromDC3Fits to UnbinnedData -> saving data as a fits could be possible but complicating). Once the DataIO refactoring is complete, we can revisit and properly integrate PhaseSelector into the interface hierarchy.

For SpacecraftHistory and the response, I think the same approach would be good; for now, we write a utility function that manually adjusts the exposure (and response if needed) based on the phase selection, and migrate to passing an EventSelectorInterface once the infrastructure is ready.

Does this two-step approach sound reasonable to you @israelmcmc @KoothodilAbhijithAugustine ?

@israelmcmc
Copy link
Collaborator

@hiyoneda Yes, I'm OK with the plan for PhaseSelector/UnbinnedData. It is indeed hard to achieve a nice integration with the interfaces without refactoring DataIO first, so we can wait until then.

For SpacecraftHistory, that's independent of the event interfaces, since they don't have a concept of uptime (it's just a list of events). I'm thinking there should be an new interface e.g.

class PhaseEphemeris(Protocol):

    def get_phase(time:Time) -> np.ndarray:
        """"
        Assign a phase between 0 an 1 to each timestamp.
        """"
        
    def get_duty_cycle(time_start:Time, time_stop:Time, phase_start:float, phase_stop:float) -> u.Quantity:
        """"
        Compute the time spend between the phase range for each time range.
        """"

PhaseEphemeris will then be inherited by classes that can open specific formats or can be initialized from specific user parameters (e.g. t0 + period).

Then, we can add a method to SpacecraftHistory that applies this duty cycle e.g.

class SpacecraftHistory:

    def update_ephemeris(ephemeris: PhaseEphemeris, phase_start:float, phase_stop:float):
        """
        Updates live time based on a phase cut.
        """
        # Here you can call ephemeris.get_duty_cycle()

The advantages:

  • The specific PhaseEphemeris will know the fastest way to compute the duty cycle
  • SpacecraftHistory delegates this calculation, but it know what's the best way to update it's own live time.
  • It'll be easy to add new ephemeris formats or models if needed.

@hiyoneda
Copy link
Contributor

@israelmcmc Thanks. Adding the new protocol and the method to SpacecraftHistory in this PR sounds good.
@KoothodilAbhijithAugustine Can you work on this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

phase_analysis Modulations, pulses and phase resolved analyses

Projects

Status: Under review

Development

Successfully merging this pull request may close these issues.

Tool to obtain a phasogram

3 participants