1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -557,6 +557,7 @@ set(SIMPLNX_HDRS
${SIMPLNX_SOURCE_DIR}/Utilities/ParallelTaskAlgorithm.hpp
${SIMPLNX_SOURCE_DIR}/Utilities/SamplingUtils.hpp
${SIMPLNX_SOURCE_DIR}/Utilities/SegmentFeatures.hpp
${SIMPLNX_SOURCE_DIR}/Utilities/UnionFind.hpp
${SIMPLNX_SOURCE_DIR}/Utilities/TimeUtilities.hpp
${SIMPLNX_SOURCE_DIR}/Utilities/TooltipGenerator.hpp
${SIMPLNX_SOURCE_DIR}/Utilities/TooltipRowItem.hpp
138 changes: 138 additions & 0 deletions Code_Review/Filter_Performance_Updates.md
@@ -0,0 +1,138 @@
# Filter Performance Updates

## Milestone 1: Segment Features & FillBadData - CCL Optimization

### Overview

Replaced the DFS (depth-first search) flood-fill algorithm in the `SegmentFeatures` base class with a two-phase, chunk-sequential Connected Component Labeling (CCL) algorithm, and applied the same data-structure optimizations to the existing `FillBadData` CCL. Both algorithms now use a vector-based Union-Find with path compression, in-memory provisional-label buffers, and direct vector lookups instead of hash maps. This eliminates the random-access patterns that cause severe chunk thrashing with out-of-core (OOC) ZarrStore storage.

### Algorithm Change (SegmentFeatures)

**Before:** Stack-based DFS flood fill. Each voxel popped from the stack triggers neighbor lookups at random spatial locations, causing cache misses and chunk evictions in OOC mode.

**After:** Two-phase scanline CCL:
1. **Phase 1 (Forward CCL):** Linear Z-Y-X iteration over all voxels. For each valid unlabeled voxel, check backward neighbors only. Assign provisional labels via Union-Find into an in-memory buffer.
2. **Phase 2 (Resolution + Relabeling):** Flatten Union-Find, build direct lookup table, then chunk-sequential pass to write final contiguous feature IDs.
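
The two phases above can be sketched on a toy grid as follows. This is a minimal illustration under stated assumptions, not the `executeCCL()` implementation: `labelFaceConnected`, the `Labels` helper, and the rule "equal values are similar, value 0 is invalid" are hypothetical stand-ins, and only face connectivity is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: minimal union-find over provisional labels (slot 0 = background).
struct Labels
{
  std::vector<int32_t> parent{0};

  int32_t makeLabel()
  {
    parent.push_back(static_cast<int32_t>(parent.size()));
    return parent.back();
  }
  int32_t find(int32_t x)
  {
    while(parent[x] != x)
    {
      parent[x] = parent[parent[x]]; // path halving
      x = parent[x];
    }
    return x;
  }
  void unite(int32_t a, int32_t b)
  {
    a = find(a);
    b = find(b);
    if(a < b) { parent[b] = a; }
    else if(b < a) { parent[a] = b; }
  }
};

// Assumption for this sketch: a voxel is valid when its value is non-zero,
// and two voxels are "similar" when their values are equal.
std::vector<int32_t> labelFaceConnected(const std::vector<int>& values, size_t nx, size_t ny, size_t nz, int32_t& featureCount)
{
  assert(values.size() == nx * ny * nz);
  std::vector<int32_t> provisional(values.size(), 0); // in-memory label buffer
  Labels uf;

  // Phase 1: linear Z-Y-X scan that only looks at the -X, -Y, -Z neighbors.
  for(size_t z = 0; z < nz; z++)
  {
    for(size_t y = 0; y < ny; y++)
    {
      for(size_t x = 0; x < nx; x++)
      {
        const size_t idx = (z * ny + y) * nx + x;
        if(values[idx] == 0) { continue; }
        const size_t back[3] = {idx - 1, idx - nx, idx - nx * ny};
        const bool inBounds[3] = {x > 0, y > 0, z > 0};
        int32_t label = 0;
        for(int n = 0; n < 3; n++)
        {
          if(!inBounds[n] || provisional[back[n]] == 0 || values[back[n]] != values[idx]) { continue; }
          if(label == 0) { label = provisional[back[n]]; }
          else { uf.unite(label, provisional[back[n]]); }
        }
        provisional[idx] = (label != 0) ? label : uf.makeLabel();
      }
    }
  }

  // Phase 2: build a direct label -> final id table, then relabel in one linear pass.
  std::vector<int32_t> labelToFinal(uf.parent.size(), 0);
  featureCount = 0;
  for(int32_t l = 1; l < static_cast<int32_t>(uf.parent.size()); l++)
  {
    const int32_t root = uf.find(l);
    if(labelToFinal[root] == 0) { labelToFinal[root] = ++featureCount; }
    labelToFinal[l] = labelToFinal[root];
  }
  for(int32_t& v : provisional) { v = labelToFinal[v]; }
  return provisional;
}
```

On a U-shaped 3x2x1 input, the two arms first receive different provisional labels and are merged when the scan reaches the voxel that joins them, so the relabeling pass reports a single feature.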

### Performance Results (SegmentFeatures)

**Before (from slowdown_analysis.csv, DFS algorithm):**

| Test | In-Core (s) | OOC (s) | Slowdown |
|------|------------|---------|----------|
| ScalarSegmentFeatures | 0.22 | 105.57 | 479.86x |
| ScalarSegmentFeatures: Neighbor Scheme | 0.11 | 0.21 | 1.91x |
| EBSDSegmentFeatures:Face | 0.06 | 3.32 | 55.33x |
| EBSDSegmentFeatures:All | 0.06 | 3.52 | 58.67x |
| CAxisSegmentFeatures:Face | 0.06 | 2.65 | 44.17x |
| CAxisSegmentFeatures:All | 0.06 | 3.01 | 50.17x |

**After (optimized CCL algorithm):**

| Test | In-Core (s) | OOC (s) | Slowdown |
|------|------------|---------|----------|
| ScalarSegmentFeatures* | 0.44 | 3.61 | 8.2x |
| ScalarSegmentFeatures: Neighbor Scheme* | 0.08 | 0.69 | 8.6x |
| EBSDSegmentFeatures:Face | 0.13 | 0.11 | 0.85x |
| EBSDSegmentFeatures:All | 0.15 | 0.16 | 1.07x |
| CAxisSegmentFeatures:Face | 0.09 | 0.09 | 1.00x |
| CAxisSegmentFeatures:All | 0.10 | 0.12 | 1.20x |

*ScalarSegmentFeatures uses `PreferencesSentinel("Zarr", 189*201, true)` to force small OOC chunks (one XY slice per chunk). Neighbor Scheme uses `PreferencesSentinel("Zarr", 50, true)` with very small chunks. These aggressively exercise the OOC path beyond typical usage.

**OOC Speedup Summary:**

| Test | Before OOC (s) | After OOC (s) | Speedup |
|------|---------------|---------------|---------|
| ScalarSegmentFeatures | 105.57 | 3.61 | **29x** |
| EBSDSegmentFeatures:Face | 3.32 | 0.11 | **30x** |
| EBSDSegmentFeatures:All | 3.52 | 0.16 | **22x** |
| CAxisSegmentFeatures:Face | 2.65 | 0.09 | **29x** |
| CAxisSegmentFeatures:All | 3.01 | 0.12 | **25x** |

**In-Core Overhead:** The CCL algorithm adds ~2x overhead for the main ScalarSegmentFeatures test (0.22s -> 0.44s), because it makes two passes and allocates the in-memory provisional labels buffer where DFS made a single pass. The EBSD/CAxis tests show ~1.5-2.5x overhead (0.06s -> 0.09-0.15s). The tradeoff is justified by the 22-30x OOC speedup.

### Optimizations Applied (Shared by SegmentFeatures and FillBadData)

Three performance optimizations were applied to both algorithms:

1. **Vector-based UnionFind:** Replaced `std::unordered_map<int64, int64>` with contiguous `std::vector<int64>` for parent, rank, and size storage. Eliminates hash map overhead and provides cache-friendly O(1) indexed access.

2. **Path-halving compression:** `find()` uses path halving (`parent[x] = parent[parent[x]]`) while walking to the root. Combined with union-by-rank, this gives near-O(1) amortized lookups (inverse-Ackermann in the number of labels).

3. **In-memory provisional labels buffer:** Both algorithms use a dense `std::vector<int32>(totalVoxels, 0)` instead of `std::unordered_map<usize, int64>` for provisional labels. Backward neighbor lookups read from this buffer instead of the OOC featureIdsStore, eliminating cross-chunk reads entirely.

Additional SegmentFeatures-specific optimization:

4. **Merged Phase 2+3:** The resolution and relabeling phases are combined. A direct `std::vector<int32> labelToFinal` lookup table replaces per-voxel `find()` + `unordered_map` lookup during relabeling. This reduces the algorithm from 3 passes to 2 passes.

Additional FillBadData-specific optimization:

5. **Direct vector classification in Phase 3:** Replaced `unordered_map<int64, uint64>` for root sizes and `unordered_set<int64>` for small region tracking with a direct `std::vector<int8> isSmallRoot` lookup table indexed by provisional label. Each voxel's region classification is now a single O(1) array access.
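
The root-to-label propagation in item 5 could look roughly like the sketch below. The function name, the parameter layout (`rootOf` as a flattened parent table, `rootSize` indexed by root label), and the strict `<` threshold are assumptions made for illustration; the actual `FillBadData` code may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// rootOf[l]   : root label for provisional label l after flattening (slot 0 = background)
// rootSize[r] : voxel count accumulated on root r (unused for non-root labels)
// minSize     : hypothetical threshold; regions smaller than this count as "small"
std::vector<int8_t> classifySmallRegions(const std::vector<int32_t>& rootOf, const std::vector<uint64_t>& rootSize, uint64_t minSize)
{
  assert(rootOf.size() == rootSize.size());
  std::vector<int8_t> isSmall(rootOf.size(), 0);

  // Pass 1: classify the roots only.
  for(size_t l = 1; l < rootOf.size(); l++)
  {
    if(rootOf[l] == static_cast<int32_t>(l))
    {
      isSmall[l] = (rootSize[l] < minSize) ? 1 : 0;
    }
  }
  // Pass 2: copy each root's classification to every label that maps to it,
  // so the per-voxel check becomes isSmall[provisionalLabel].
  for(size_t l = 1; l < rootOf.size(); l++)
  {
    isSmall[l] = isSmall[rootOf[l]];
  }
  return isSmall;
}
```

After this step, the per-voxel loop needs neither `find()` nor a hash lookup: a voxel with provisional label `p` belongs to a small region exactly when `isSmall[p]` is non-zero.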

### Filters Updated

| Filter | Plugin | OOC Slowdown (Before) | OOC Slowdown (After) |
|--------|--------|----------------------|---------------------|
| ScalarSegmentFeatures | SimplnxCore | 479.86x | 8.2x* |
| EBSDSegmentFeatures | OrientationAnalysis | 55-59x | 0.85-1.07x |
| CAxisSegmentFeatures | OrientationAnalysis | 44-50x | 1.0-1.2x |
| FillBadData | SimplnxCore | 49.29x | ~1.0x** |

\*ScalarSegmentFeatures OOC slowdown is higher because the test forces aggressively small chunk sizes via `PreferencesSentinel`.

\*\*FillBadData was already optimized with CCL in a prior commit; this update applies the same vector-based data structure optimizations.

### Implementation Details

**SegmentFeatures:**
- New `executeCCL()` method in `SegmentFeatures` base class handles the algorithm
- Subclasses implement two virtual methods: `isValidVoxel()` and `areNeighborsSimilar()`
- Old DFS `execute()` preserved for backward compatibility
- Supports both Face (6-neighbor) and FaceEdgeVertex (26-neighbor) connectivity schemes
- Phase 1 uses an in-memory `provisionalLabels` buffer instead of reading backward neighbor labels from the OOC featureIds store
- Phase 2 builds a direct lookup table and writes final labels in chunk-sequential order
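
To make "backward neighbors" concrete for the two connectivity schemes: in a Z-Y-X raster scan, a neighbor has already been visited when its offset is negative in Z, or equal in Z and negative in Y, or equal in both and negative in X. The sketch below enumerates those offsets; `backwardNeighbors` and `Offset` are hypothetical names, and the real code may organize the offsets differently.

```cpp
#include <array>
#include <cstdlib>
#include <vector>

using Offset = std::array<int, 3>; // {dx, dy, dz}

// Returns the neighbor offsets that a Z-Y-X scan has already visited.
// faceOnly = true  -> Face connectivity (6-neighbor scheme)
// faceOnly = false -> FaceEdgeVertex connectivity (26-neighbor scheme)
std::vector<Offset> backwardNeighbors(bool faceOnly)
{
  std::vector<Offset> result;
  for(int dz = -1; dz <= 1; dz++)
  {
    for(int dy = -1; dy <= 1; dy++)
    {
      for(int dx = -1; dx <= 1; dx++)
      {
        const bool visitedEarlier = dz < 0 || (dz == 0 && dy < 0) || (dz == 0 && dy == 0 && dx < 0);
        if(!visitedEarlier) { continue; }
        // Face neighbors differ from the center along exactly one axis.
        const int axesChanged = std::abs(dx) + std::abs(dy) + std::abs(dz);
        if(faceOnly && axesChanged != 1) { continue; }
        result.push_back(Offset{dx, dy, dz});
      }
    }
  }
  return result;
}
```

Exactly half of each neighborhood qualifies: 3 of the 6 face neighbors and 13 of the 26 face/edge/vertex neighbors. These are the only voxels whose provisional labels can exist at the time the current voxel is processed.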

**FillBadData:**
- Phase 1: In-memory `std::vector<int32>` provisional labels buffer replaces `std::unordered_map<usize, int64>`. Backward neighbor checks read `provisionalLabels[neighIdx] > 0` instead of `featureIdsStore[neighborIdx] == 0`, eliminating all cross-chunk OOC reads.
- Phase 3: Direct `std::vector<int8> isSmallRoot` classification replaces `unordered_map`/`unordered_set` hash lookups. Classification is propagated from roots to all labels for O(1) per-voxel lookup.
- Removed `#include <unordered_map>` and `#include <unordered_set>` from both .hpp and .cpp.

**Shared UnionFind utility:** `src/simplnx/Utilities/UnionFind.hpp`
- Vector-based storage with path-halving compression and union-by-rank
- Labels are contiguous positive integers starting from 1
- `flatten()` provides full path compression and size accumulation
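
A vector-based Union-Find with the properties listed above might be sketched as follows. This is an illustration only and not the contents of `UnionFind.hpp`: the class name `UnionFindSketch` and the members `makeSet`, `unite`, `addSize`, and `rootSize` are hypothetical, and the signatures of `find()` and `flatten()` are guesses.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

class UnionFindSketch
{
public:
  // Slot 0 is reserved so that real labels start at 1.
  UnionFindSketch() : m_Parent(1, 0), m_Rank(1, 0), m_Size(1, 0) {}

  // Creates a new singleton set and returns its label (1, 2, 3, ...).
  int64_t makeSet()
  {
    const int64_t label = static_cast<int64_t>(m_Parent.size());
    m_Parent.push_back(label);
    m_Rank.push_back(0);
    m_Size.push_back(0);
    return label;
  }

  // Path halving: every visited node is re-pointed at its grandparent.
  int64_t find(int64_t x)
  {
    while(m_Parent[x] != x)
    {
      m_Parent[x] = m_Parent[m_Parent[x]];
      x = m_Parent[x];
    }
    return x;
  }

  // Union by rank: the shallower tree is attached under the deeper one.
  void unite(int64_t a, int64_t b)
  {
    a = find(a);
    b = find(b);
    if(a == b) { return; }
    if(m_Rank[a] < m_Rank[b]) { std::swap(a, b); }
    m_Parent[b] = a;
    if(m_Rank[a] == m_Rank[b]) { m_Rank[a]++; }
  }

  // Records voxels counted against a provisional label.
  void addSize(int64_t label, uint64_t count) { m_Size[label] += count; }

  // Points every label directly at its root and folds per-label counts into the root.
  void flatten()
  {
    const int64_t count = static_cast<int64_t>(m_Parent.size());
    for(int64_t l = 1; l < count; l++)
    {
      const int64_t root = find(l);
      m_Parent[l] = root;
      if(root != l)
      {
        m_Size[root] += m_Size[l];
        m_Size[l] = 0;
      }
    }
  }

  // Only meaningful after flatten().
  uint64_t rootSize(int64_t label) const { return m_Size[m_Parent[label]]; }

private:
  std::vector<int64_t> m_Parent;
  std::vector<uint8_t> m_Rank;
  std::vector<uint64_t> m_Size;
};
```

Because labels are handed out contiguously, all three arrays can be indexed directly by label, which is what removes the hashing cost of the earlier `std::unordered_map` storage.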

**Test infrastructure:**
- `ScalarSegmentFeaturesTest` uses `PreferencesSentinel` to force small OOC chunk sizes, exercising the OOC code path

### Test Results

**In-Core (simplnx-Rel):**
- ScalarSegmentFeatures: 2/2 tests passed
- EBSDSegmentFeatures: 4/4 tests passed
- CAxisSegmentFeatures: 4/4 tests passed
- FillBadData: 10/10 tests passed

**Out-of-Core (simplnx-ooc-Rel):**
- ScalarSegmentFeatures: 2/2 tests passed
- EBSDSegmentFeatures: 4/4 tests passed
- CAxisSegmentFeatures: 4/4 tests passed
- FillBadData: 10/10 tests passed

### Files Changed

| File | Change |
|------|--------|
| `src/simplnx/Utilities/UnionFind.hpp` | NEW - Vector-based Union-Find with path compression |
| `CMakeLists.txt` | Added UnionFind.hpp to header list |
| `src/simplnx/Utilities/SegmentFeatures.hpp` | Added `executeCCL()`, `isValidVoxel()`, `areNeighborsSimilar()` |
| `src/simplnx/Utilities/SegmentFeatures.cpp` | Implemented two-phase CCL algorithm |
| `src/Plugins/SimplnxCore/.../FillBadData.hpp` | Replaced hash maps with vectors, uses shared UnionFind |
| `src/Plugins/SimplnxCore/.../FillBadData.cpp` | In-memory provisional labels, vector classification, no cross-chunk reads |
| `src/Plugins/SimplnxCore/.../ScalarSegmentFeatures.hpp` | Added CCL overrides |
| `src/Plugins/SimplnxCore/.../ScalarSegmentFeatures.cpp` | Implemented CCL overrides, `compare()` on functors |
| `src/Plugins/OrientationAnalysis/.../EBSDSegmentFeatures.hpp` | Added CCL overrides |
| `src/Plugins/OrientationAnalysis/.../EBSDSegmentFeatures.cpp` | Implemented CCL overrides |
| `src/Plugins/OrientationAnalysis/.../CAxisSegmentFeatures.hpp` | Added CCL overrides |
| `src/Plugins/OrientationAnalysis/.../CAxisSegmentFeatures.cpp` | Implemented CCL overrides, removed per-seed AM resize |
| `src/Plugins/SimplnxCore/test/ScalarSegmentFeaturesTest.cpp` | Added PreferencesSentinel for OOC chunk size testing |
Original file line number Diff line number Diff line change
@@ -68,8 +68,10 @@ Result<> CAxisSegmentFeatures::operator()()
auto* active = m_DataStructure.getDataAs<UInt8Array>(m_InputValues->ActiveArrayPath);
active->fill(1);

- // Run the segmentation algorithm
- execute(imageGeometry);
// Run the CCL-based segmentation algorithm
auto& featureIdsStore = m_FeatureIdsArray->getDataStoreRef();
executeCCL(imageGeometry, featureIdsStore);

// Sanity check the result.
if(this->m_FoundFeatures < 1)
{
@@ -127,10 +129,7 @@ int64 CAxisSegmentFeatures::getSeed(int32 gnum, int64 nextSeed) const
}
if(seed >= 0)
{
- auto& cellFeatureAM = m_DataStructure.getDataRefAs<AttributeMatrix>(m_InputValues->CellFeatureAttributeMatrixPath);
featureIds[static_cast<usize>(seed)] = gnum;
- const ShapeType tDims = {static_cast<usize>(gnum) + 1};
- cellFeatureAM.resizeTuples(tDims); // This will resize the active array
}
return seed;
}
@@ -182,3 +181,59 @@ bool CAxisSegmentFeatures::determineGrouping(int64 referencepoint, int64 neighbo
}
return group;
}

// -----------------------------------------------------------------------------
bool CAxisSegmentFeatures::isValidVoxel(int64 point) const
{
// Check mask
if(m_InputValues->UseMask && !m_GoodVoxelsArray->isTrue(point))
{
return false;
}
// Check that the voxel has a valid phase (> 0)
Int32Array& cellPhases = *m_CellPhases;
if(cellPhases[point] <= 0)
{
return false;
}
return true;
}

// -----------------------------------------------------------------------------
bool CAxisSegmentFeatures::areNeighborsSimilar(int64 point1, int64 point2) const
{
// The neighbor must also be valid
if(!isValidVoxel(point2))
{
return false;
}

Int32Array& cellPhases = *m_CellPhases;

// Must be same phase
if(cellPhases[point1] != cellPhases[point2])
{
return false;
}

// Calculate c-axis misalignment
const Eigen::Vector3f cAxis{0.0f, 0.0f, 1.0f};
Float32Array& quats = *m_QuatsArray;

const ebsdlib::QuatF q1(quats[point1 * 4], quats[point1 * 4 + 1], quats[point1 * 4 + 2], quats[point1 * 4 + 3]);
const ebsdlib::QuatF q2(quats[point2 * 4], quats[point2 * 4 + 1], quats[point2 * 4 + 2], quats[point2 * 4 + 3]);

const ebsdlib::OrientationMatrixFType oMatrix1 = q1.toOrientationMatrix();
const ebsdlib::OrientationMatrixFType oMatrix2 = q2.toOrientationMatrix();

Eigen::Vector3f c1 = oMatrix1.transpose() * cAxis;
Eigen::Vector3f c2 = oMatrix2.transpose() * cAxis;

c1.normalize();
c2.normalize();

float32 w = std::clamp(((c1[0] * c2[0]) + (c1[1] * c2[1]) + (c1[2] * c2[2])), -1.0F, 1.0F);
w = std::acos(w);

return w <= m_InputValues->MisorientationTolerance || (Constants::k_PiD - w) <= m_InputValues->MisorientationTolerance;
}
Original file line number Diff line number Diff line change
@@ -49,6 +49,10 @@ class ORIENTATIONANALYSIS_EXPORT CAxisSegmentFeatures : public SegmentFeatures
int64 getSeed(int32 gnum, int64 nextSeed) const override;
bool determineGrouping(int64 referencePoint, int64 neighborPoint, int32 gnum) const override;

// CCL virtual method overrides
bool isValidVoxel(int64 point) const override;
bool areNeighborsSimilar(int64 point1, int64 point2) const override;

private:
const CAxisSegmentFeaturesInputValues* m_InputValues = nullptr;

Original file line number Diff line number Diff line change
@@ -43,8 +43,10 @@ Result<> EBSDSegmentFeatures::operator()()
m_FeatureIdsArray = m_DataStructure.getDataAs<Int32Array>(m_InputValues->FeatureIdsArrayPath);
m_FeatureIdsArray->fill(0); // initialize the output array with zeros

- // Run the segmentation algorithm
- execute(gridGeom);
// Run the CCL-based segmentation algorithm
auto& featureIdsStore = m_FeatureIdsArray->getDataStoreRef();
executeCCL(gridGeom, featureIdsStore);

// Sanity check the result.
if(this->m_FoundFeatures < 1)
{
@@ -152,3 +154,55 @@ bool EBSDSegmentFeatures::determineGrouping(int64 referencePoint, int64 neighbor

return group;
}

// -----------------------------------------------------------------------------
bool EBSDSegmentFeatures::isValidVoxel(int64 point) const
{
// Check mask
if(m_InputValues->UseMask && !m_GoodVoxelsArray->isTrue(point))
{
return false;
}
// Check that the voxel has a valid phase (> 0)
AbstractDataStore<int32>& cellPhases = m_CellPhases->getDataStoreRef();
if(cellPhases[point] <= 0)
{
return false;
}
return true;
}

// -----------------------------------------------------------------------------
bool EBSDSegmentFeatures::areNeighborsSimilar(int64 point1, int64 point2) const
{
// The neighbor must also be valid
if(!isValidVoxel(point2))
{
return false;
}

AbstractDataStore<int32>& cellPhases = m_CellPhases->getDataStoreRef();

// Must be same phase
if(cellPhases[point1] != cellPhases[point2])
{
return false;
}

// Check crystal structure validity
int32 laueClass = (*m_CrystalStructures)[cellPhases[point1]];
if(static_cast<usize>(laueClass) >= m_OrientationOps.size())
{
return false;
}

// Calculate misorientation
Float32Array& quats = *m_QuatsArray;
const ebsdlib::QuatD q1(quats[point1 * 4], quats[point1 * 4 + 1], quats[point1 * 4 + 2], quats[point1 * 4 + 3]);
const ebsdlib::QuatD q2(quats[point2 * 4], quats[point2 * 4 + 1], quats[point2 * 4 + 2], quats[point2 * 4 + 3]);

ebsdlib::AxisAngleDType axisAngle = m_OrientationOps[laueClass]->calculateMisorientation(q1, q2);
float w = static_cast<float>(axisAngle[3]);

return w < m_InputValues->MisorientationTolerance;
}
Original file line number Diff line number Diff line change
@@ -56,27 +56,13 @@ class ORIENTATIONANALYSIS_EXPORT EBSDSegmentFeatures : public SegmentFeatures
Result<> operator()();

protected:
- /**
- * @brief
- * @param data
- * @param args
- * @param gnum
- * @param nextSeed
- * @return int64
- */
int64_t getSeed(int32 gnum, int64 nextSeed) const override;

- /**
- * @brief
- * @param data
- * @param args
- * @param referencepoint
- * @param neighborpoint
- * @param gnum
- * @return bool
- */
bool determineGrouping(int64 referencePoint, int64 neighborPoint, int32 gnum) const override;

// CCL virtual method overrides
bool isValidVoxel(int64 point) const override;
bool areNeighborsSimilar(int64 point1, int64 point2) const override;

private:
const EBSDSegmentFeaturesInputValues* m_InputValues = nullptr;
Float32Array* m_QuatsArray = nullptr;