-
Notifications
You must be signed in to change notification settings - Fork 53
Description
Line 113 in d3b8447
| ref_count = np.apply_along_axis(np.sum, 1, haps == 0) |
Suggested Code Replacement
Current implementation:
nonref_count = np.apply_along_axis(np.sum, 1, haps == 1)
ref_count = np.apply_along_axis(np.sum, 1, haps == 0)
Recommended replacement:
nonref_count = (haps == 1).sum(axis=1)
ref_count = (haps == 0).sum(axis=1)
The use of np.apply_along_axis here is unnecessary and introduces significant performance overhead. Internally, apply_along_axis works by slicing the array row by row and applying a Python function (np.sum) to each slice, which is implemented as a Python-level loop. This results in slower execution and additional memory handling, especially on large arrays.
In contrast, the direct use of sum(axis=1) is a vectorized operation fully implemented in C under NumPy's core. It avoids Python overhead, works directly on contiguous memory blocks, and executes substantially faster. In benchmark tests, replacing apply_along_axis(np.sum, 1, ...) with .sum(axis=1) results in 10–100× speed improvements depending on array size.
The two implementations are functionally equivalent in this context, so the optimized version is a safe drop-in replacement that improves both performance and readability.