ENH: New Annotation API #149
base: develop
Conversation
Pull Request Overview
Introduces enhancements to profiling and annotation APIs, including improved kernel-event association, a new breakdown function for annotation timings, and extended flat profiling with custom grouping and parallelism controls.
- Expanded the `flat_profile` method with `mapper`, `parallelism_level`, `ascending`, and `idle_time` options
- Added `time_breakdown` to compute CPU and launched-kernel time per annotation
- Updated the SQLite reader to disable unused CUPTI events and switch from `_children` to `_kernel_launch`
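The `mapper` and `parallelism_level` options above can be sketched with plain pandas. This is an illustrative stand-in for what a mapper-based flat profile computes, not pipit's implementation; the event data and group names are assumptions.

```python
import pandas as pd

# Toy trace events: exclusive time per event name, per process.
events = pd.DataFrame({
    "Name": ["kernelA", "kernelB", "memcpy", "kernelA"],
    "Process": [0, 0, 1, 1],
    "time.exc": [2.0, 3.0, 1.0, 4.0],
})

# mapper: label -> group, the shape pandas' groupby expects for a mapping
mapper = {"kernelA": "compute", "kernelB": "compute", "memcpy": "transfer"}

# Sum exclusive time per user-defined group, per parallelism level (Process)
profile = (
    events.set_index("Name")
    .groupby([mapper, "Process"])[["time.exc"]]
    .sum()
)
print(profile)
```

Grouping by a list that mixes a mapping (applied to the index) and a column label is standard pandas behavior, which is presumably why the PR asks callers to supply a label-to-group dict.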
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `pipit/trace.py` | Enhanced `flat_profile`, added `time_breakdown`, removed `dropna=False` and a debug print |
| `pipit/readers/nsight_sqlite_reader.py` | Commented out MEMCPY/MEMSET/SYNCH SQL and replaced `_children` assignments with `_kernel_launch` |
Comments suppressed due to low confidence (4)
`pipit/readers/nsight_sqlite_reader.py:293`

- Swapping `_children` for `_kernel_launch` breaks consumers (e.g., `time_breakdown` and `filter_by_label`) that expect `_children`. Ensure both fields are populated or update all references.

```python
trace_df.loc[calls_that_launch["index_x"].to_numpy(), "_kernel_launch"] = (
    calls_that_launch["index_y"].to_numpy()
)
```
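One way to address this comment is to populate the legacy field alongside the new one until every reference is migrated. A minimal sketch, with made-up data; only the `_children`/`_kernel_launch` column names come from the PR:

```python
import pandas as pd

# Toy trace: rows 0-1 are launch calls pointing at kernel rows 2-3.
trace_df = pd.DataFrame({
    "Name": ["launchA", "launchB", "kernA", "kernB"],
    "_kernel_launch": [2, 3, None, None],
})

# Low-risk bridge: keep the legacy _children field in sync with the new
# _kernel_launch field so consumers that still read _children keep working.
trace_df["_children"] = trace_df["_kernel_launch"]
```

The alternative the comment mentions, updating all references, avoids the duplicated column but requires touching every consumer in the same PR.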
`pipit/trace.py:1040`

- [nitpick] Using `DataFrame.apply` with Python loops for each kernel may be slow on large traces. Consider vectorized operations or grouping strategies to improve performance.

```python
kernels.apply(_calc_kernel_time, axis=1)
```
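The vectorization suggestion can be sketched as follows. The column names and the body of `_calc_kernel_time` are stand-ins, not the PR's actual code; the point is only the apply-versus-column-arithmetic contrast.

```python
import pandas as pd

kernels = pd.DataFrame({
    "start_ns": [100, 400, 900],
    "end_ns": [250, 700, 950],
})

def _calc_kernel_time(row):
    # One Python-level call per row: slow on large traces
    return row["end_ns"] - row["start_ns"]

per_row = kernels.apply(_calc_kernel_time, axis=1)

# Vectorized equivalent: a single column subtraction, no per-row loop
vectorized = kernels["end_ns"] - kernels["start_ns"]

assert per_row.equals(vectorized)
```

On traces with millions of kernel rows, the vectorized form typically runs orders of magnitude faster because the arithmetic happens in NumPy rather than the Python interpreter.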
`pipit/readers/nsight_sqlite_reader.py:71`

- [nitpick] Large commented-out SQL queries add clutter. If these events are unused long-term, remove the dead code or move it to documentation.

```python
# """ ... large commented SQL block ... """
```
`pipit/trace.py:528`

- New parameters and logic in `flat_profile` and `time_breakdown` lack dedicated unit tests. Add tests covering custom `mapper`, parallelism levels, `idle_time`, sorting, and the new breakdown method.

```python
def flat_profile(..., mapper=None, parallelism_level=None, ascending=None, idle_time=False):
```
```python
pd_grouper[label] = group
res = (
    res.set_index("Name")
    .groupby([pd_grouper] + parallelism_level)[["time.exc"]]
```
Copilot AI commented on May 28, 2025
The mapper branch hard-codes `time.exc` instead of using the `metrics` parameter; this will ignore any other metrics list provided. Use the `metrics` variable here.
```diff
-    .groupby([pd_grouper] + parallelism_level)[["time.exc"]]
+    .groupby([pd_grouper] + parallelism_level)[metrics]
```
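The suggested fix can be sketched in isolation: index the groupby result by the caller's `metrics` list rather than a hard-coded `[["time.exc"]]`. The data and the `flat_profile_sketch` helper are illustrative, not pipit's code.

```python
import pandas as pd

res = pd.DataFrame({
    "Name": ["foo", "bar", "foo"],
    "time.exc": [1.0, 2.0, 3.0],
    "time.inc": [2.0, 2.5, 4.0],
})
pd_grouper = {"foo": "groupA", "bar": "groupB"}

def flat_profile_sketch(metrics):
    # Selecting with the metrics list keeps every requested column,
    # instead of silently dropping everything but time.exc.
    return res.set_index("Name").groupby(pd_grouper)[metrics].sum()

out = flat_profile_sketch(["time.exc", "time.inc"])
print(out)
```

With the hard-coded column, a caller requesting `["time.inc"]` would get `time.exc` back anyway, which is exactly the bug the review flags.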
I'll agree with this; it's worth making sure this function works when metrics other than `time.exc` are passed in.
jhdavis8 left a comment
Features and documentation look good overall. I added a few comments.
One high-level question: are there new unit tests checking the added features? It would be good to add these here or in a later PR.
```python
# Use explode to expand every child in children list to a row
# This can include duplicates (e.g. for nested annotations) that we should drop
kernels = events.loc[host_events["_children"].dropna().explode().to_numpy()]
```
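The explode-then-deduplicate step this comment points at can be sketched on toy data. Only the `_children` column name comes from the PR; the indices are made up.

```python
import pandas as pd

# Nested annotations can list the same kernel index under more than one
# parent, so the exploded result may contain duplicates.
host_events = pd.DataFrame({
    "_children": [[10, 11], [11, 12], None],
})

child_idx = host_events["_children"].dropna().explode()
# Drop the duplicate produced by nesting before using it as a .loc lookup
unique_idx = child_idx.drop_duplicates().to_numpy()
print(unique_idx)
```

Deduplicating before the `.loc` lookup matters because a duplicated index would double-count the same kernel's time in any downstream aggregation.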
`_children` is referred to here.
```python
# Postprocessing using mapper
if mapper is not None:
    # pandas expects label->group
    labels = res["Name"]
```
Should this be `groupby_column`? I'm not sure, but I wanted to double-check. I think we should decide whether we want the hardcoded `"Name"` column or `groupby_column` in every line where we use `"Name"`.
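The reviewer's suggestion can be sketched as passing the grouping column through instead of hard-coding `"Name"`. All names here (`profile_sketch`, the `Rank` column, the mapper values) are illustrative assumptions, not pipit's API.

```python
import pandas as pd

res = pd.DataFrame({
    "Name": ["a", "b", "a"],
    "Rank": [0, 1, 1],
    "time.exc": [1.0, 2.0, 3.0],
})
mapper = {0: "even", 1: "odd"}

def profile_sketch(groupby_column):
    # Using the caller-supplied column lets the same mapper logic work
    # for groupings other than the hard-coded "Name" column.
    return res.set_index(groupby_column).groupby(mapper)[["time.exc"]].sum()

out = profile_sketch("Rank")
print(out)
```

With the hard-coded `"Name"`, a mapper keyed on ranks (or any other column's values) would silently match nothing.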