Skip to content

Conversation

@rabii-chaarani
Copy link
Member

Description

Added the AlongSection thickness calculator that estimates thickness along cross-sections defined manually by the user.

Fixes #(issue)

Type of change

  • Documentation update
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Test improvement

How Has This Been Tested?

Please describe any tests that you ran to verify your changes.
Provide branch name so we can reproduce.

Checklist:

  • This branch is up-to-date with master
  • All gh-action checks are passing
  • I have performed a self-review of my own code
  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • My tests run with pytest from the map2loop folder
  • New and existing tests pass locally with my changes

Checklist continued (if PR includes changes to documentation)

  • I have built the documentation locally with make.bat
  • I have built this documentation in docker, following the docker configuration in map2loop/docs
  • I have checked my spelling and grammar

@lachlangrose lachlangrose removed their assignment Nov 13, 2025
Copilot finished reviewing on behalf of lachlangrose November 13, 2025 01:24
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a new AlongSection thickness calculator that estimates true stratigraphic thicknesses along user-defined cross-section lines. The calculator intersects section lines with geology polygons, identifies unit boundaries, and computes thickness using nearby dip measurements.

Key Changes:

  • Added four new utility functions in utils.py for geometry processing and spatial queries
  • Implemented the AlongSection class that extends ThicknessCalculator with section-based thickness estimation
  • Integrated spatial indexing (KDTree) for efficient nearest-neighbor dip queries

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 9 comments.

File Description
map2loop/utils.py Added utility functions (clean_line_geometry, iter_line_segments, nearest_orientation_to_line, segment_measure_range) to support geometry cleaning, segment extraction, and spatial queries for the new thickness calculator
map2loop/thickness_calculator.py Implemented AlongSection class with comprehensive compute method that intersects section lines with geology polygons, filters segments at unit boundaries, queries nearest dip measurements, and aggregates thickness statistics

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 592 to 797
def clean_line_geometry(geometry: shapely.geometry.base.BaseGeometry):
"""Clean and normalize a Shapely geometry to a single LineString or None.
Args:
geometry (shapely.geometry.base.BaseGeometry | None): Input geometry to be
cleaned. Accepted inputs include ``LineString``, ``MultiLineString``,
``GeometryCollection``, and other Shapely geometries. If ``None`` or
an empty geometry is provided, the function returns ``None``.
Returns:
shapely.geometry.LineString or None: A single ``LineString`` if the
geometry is already a ``LineString`` or can be merged/converted into one.
If the input is ``None``, empty, or cannot be converted into a valid
``LineString``, the function returns ``None``.
Raises:
None: Exceptions raised by ``shapely.ops.linemerge`` are caught and result
in a ``None`` return rather than being propagated.
Notes:
- The function uses ``shapely.ops.linemerge`` to attempt to merge multipart
geometries into a single ``LineString``.
- If ``linemerge`` returns a ``MultiLineString``, the longest constituent
``LineString`` (by ``length``) is returned.
- Geometries that are already ``LineString`` are returned unchanged.
Examples:
>>> from shapely.geometry import LineString
>>> clean_line_geometry(None) is None
True
>>> ls = LineString([(0, 0), (1, 1)])
>>> clean_line_geometry(ls) is ls
True
"""
if geometry is None or geometry.is_empty:
return None
if geometry.geom_type == "LineString":
return geometry
try:
merged = shapely.ops.linemerge(geometry)
except Exception:
return None
if merged.geom_type == "LineString":
return merged
if merged.geom_type == "MultiLineString":
try:
return max(merged.geoms, key=lambda geom: geom.length)
except ValueError:
return None
return None

def iter_line_segments(geometry):
"""Produce all LineString segments contained in a Shapely geometry.
Args:
geometry (shapely.geometry.base.BaseGeometry | None): Input geometry. Accepted
types include ``LineString``, ``MultiLineString``, ``GeometryCollection``
and other Shapely geometries that may contain line parts. If ``None`` or
an empty geometry is provided, an empty list is returned.
Returns:
list[shapely.geometry.LineString]: A list of ``LineString`` objects extracted
from the input geometry. Behavior by input type:
- ``LineString``: returned as a single-element list.
- ``MultiLineString``: returns the non-zero-length constituent parts.
- ``GeometryCollection``: recursively extracts contained line segments.
- Other geometry types: returns an empty list if no line segments found.
Notes:
Zero-length segments are filtered out. The function is defensive and
will return an empty list for ``None`` or empty geometries rather than
raising an exception.
Examples:
>>> from shapely.geometry import LineString
>>> iter_line_segments(LineString([(0, 0), (1, 1)]))
[LineString([(0, 0), (1, 1)])]
"""
if geometry is None or geometry.is_empty:
return []
if isinstance(geometry, shapely.geometry.LineString):
return [geometry]
if isinstance(geometry, shapely.geometry.MultiLineString):
return [geom for geom in geometry.geoms if geom.length > 0]
if isinstance(geometry, shapely.geometry.GeometryCollection):
segments = []
for geom in geometry.geoms:
segments.extend(iter_line_segments(geom))
return segments
return []

def nearest_orientation_to_line(orientation_tree, orientation_dips, orientation_coords, line_geom: shapely.geometry.LineString):
"""Find the nearest orientation measurement to the midpoint of a line.
This function queries the provided spatial index and orientation arrays and
returns the dip value, the index of the matched orientation, and the
distance from the line to that orientation point.
Args:
orientation_tree: Spatial index object (e.g., KDTree) with a ``query``
method accepting an (x, y) tuple and optional ``k``. Typically
provided by the caller so the utility remains stateless.
orientation_dips (Sequence[float]): Sequence of dip values aligned with
``orientation_coords``.
orientation_coords (Sequence[tuple]): Sequence of (x, y) coordinate
tuples for each orientation measurement.
line_geom (shapely.geometry.LineString): Line geometry for which the
nearest orientation should be found. The midpoint (interpolated at
50% of the line length) is used as the query point; if the
midpoint is empty, the line centroid is used as a fallback.
Returns:
tuple: A tuple ``(dip, index, distance)`` where:
- ``dip`` (float or numpy.nan): the dip value from
``orientation_dips`` at the nearest orientation. Returns
``numpy.nan`` if no valid orientation can be found or on error.
- ``index`` (int or None): the integer index into the orientation
arrays for the best match, or ``None`` if not found.
- ``distance`` (float or None): the shortest geometric distance
between ``line_geom`` and the matched orientation point, or
``None`` if not available.
Notes:
- The function queries up to 5 nearest neighbors (or fewer if fewer
orientations exist) and then computes the actual geometry distance
between the ``line_geom`` and each candidate point to select the
closest match. Any exceptions from the spatial query are caught and
result in ``(numpy.nan, None, None)`` being returned instead of
propagating the exception.
- The function is defensive: invalid or empty geometries return
``(numpy.nan, None, None)`` rather than raising.
Examples:
>>> dip, idx, dist = nearest_orientation_to_line(tree, dips, coords, my_line)
>>> if idx is not None:
... print(f"Nearest dip={dip} at index={idx} (distance={dist})")
"""
if orientation_tree is None or orientation_dips is None or orientation_coords is None:
return numpy.nan, None, None
midpoint = line_geom.interpolate(0.5, normalized=True)
if midpoint.is_empty:
midpoint = line_geom.centroid
if midpoint.is_empty:
return numpy.nan, None, None
query_xy = (float(midpoint.x), float(midpoint.y))
k = min(len(orientation_dips), 5)
try:
distances, indices = orientation_tree.query(query_xy, k=k)
except Exception:
return numpy.nan, None, None
distances = numpy.atleast_1d(distances)
indices = numpy.atleast_1d(indices)
best_idx = None
best_dist = numpy.inf
for _approx_dist, idx in zip(distances, indices):
if idx is None:
continue
candidate_point = shapely.geometry.Point(orientation_coords[int(idx)])
actual_dist = line_geom.distance(candidate_point)
if actual_dist < best_dist:
best_dist = actual_dist
best_idx = int(idx)
if best_idx is None:
return numpy.nan, None, None
return float(orientation_dips[best_idx]), best_idx, best_dist

def segment_measure_range(parent_line: shapely.geometry.LineString, segment: shapely.geometry.LineString):
"""Compute projected measures of a segment's end points along a parent line.
This function projects the start and end points of ``segment`` onto
``parent_line`` using Shapely's ``project`` method and returns the two
measures in ascending order (start <= end). Measures are distances along
the parent line's linear reference (units of the geometry's CRS).
Args:
parent_line (shapely.geometry.LineString): The reference line onto which
the segment end points will be projected.
segment (shapely.geometry.LineString): The segment whose first and last
vertices will be projected onto ``parent_line``. The function uses
``segment.coords[0]`` and ``segment.coords[-1]`` as the start and
end points respectively.
Returns:
tuple(float, float): A tuple ``(start_measure, end_measure)`` containing
the projected distances along ``parent_line`` for the segment's start
and end points. The values are ordered so that ``start_measure <= end_measure``.
Notes:
- If ``segment`` is degenerate (e.g., a single point), both measures may
be equal.
- The returned measures are in the same linear units as the input
geometries (e.g., metres for projected CRS).
- No validation is performed on the inputs; callers should ensure both
geometries are valid and share a common CRS.
Examples:
>>> start_m, end_m = segment_measure_range(parent_line, segment)
>>> assert start_m <= end_m
"""
start_point = shapely.geometry.Point(segment.coords[0])
end_point = shapely.geometry.Point(segment.coords[-1])
start_measure = parent_line.project(start_point)
end_measure = parent_line.project(end_point)
if end_measure < start_measure:
start_measure, end_measure = end_measure, start_measure
return start_measure, end_measure
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new utility functions are missing the @beartype.beartype decorator that is consistently applied to other functions in this file (e.g., lines 16, 89, 124, 140, etc.). For consistency with the project's style and to enable runtime type checking, consider adding the @beartype.beartype decorator to these functions: clean_line_geometry, iter_line_segments, nearest_orientation_to_line, and segment_measure_range.

Copilot uses AI. Check for mistakes.
Comment on lines +979 to +984
if prev_unit is None or next_unit is None:
continue
if prev_unit == segment["unit"] or next_unit == segment["unit"]:
continue
if prev_unit == next_unit:
continue
Copy link

Copilot AI Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The segment filtering logic may not correctly identify all segments that represent a unit boundary. The current conditions (lines 979-984) skip segments where:

  1. Either neighbor is None (edge segments)
  2. The segment's unit matches either neighbor
  3. Both neighbors are the same unit

However, this logic assumes that a valid thickness measurement requires the segment to be sandwiched between two different neighboring units, neither of which is the segment's own unit. This means segments where the unit itself is flanked by two different units (segment unit != prev_unit and segment unit != next_unit and prev_unit != next_unit) are kept. Consider documenting this assumption more clearly in the code or the docstring, as it represents a specific geological interpretation.

Copilot uses AI. Check for mistakes.
lachlangrose and others added 2 commits November 13, 2025 12:35
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Member

@lachlangrose lachlangrose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rabii looks good a couple of comments but nothing major

)
return units

unit_column_candidates = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not already consistent in map2loop? I think the thickness calculators/samplers should all assume the data is appropriately formatted.

except Exception as e:
logger.error("Failed to create spatial index for geology data: %s", e, exc_info=True)
geology_sindex = None

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth having a check to make sure the segments are in the same place as the polygons? Maybe just check that the extents overlap?

orientation_tree = None
default_dip_warning_emitted = False

units_lookup = dict(zip(units["name"], units.index))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are allowing different colum names this need to also change

split_segments_by_section: dict = defaultdict(list)

for section_idx, section_row in sections.iterrows():
line = clean_line_geometry(section_row.geometry)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to merge multi-linestrings in this case or treat them as separate lines? I am not sure

if prev_unit == next_unit:
continue

dip_value, orientation_idx, orientation_distance = nearest_orientation_to_line(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be buffered? in case there is nothing nearby?

)
default_dip_warning_emitted = True

thickness = segment["length"] * abs(math.sin(math.radians(dip_value)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to calculate apparent dip?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants