Skip to content

ST_Distance on Projected Geometries gives false results #693

@feyzicaneser

Description

@feyzicaneser

Hello, I'm finding that ST_Distance on geometries projected in utm_zones (eg:EPSG:32618) or conus projection (EPSG:5070) is giving inconsistent result : Namely , ST_Distance returns 0 even when ST_Intersects on the projected geometries return false, and the geometries indeed do not intersect

Also I had raised a somewhat similar issue thread months ago here which was resolved

Steps to reproduce:
Below are datasets of US counties (from www2.census.gov) and EPA US Superfund sites data (publicly available here)
All geometries are in EPSG:4326
counties.parquet.zip

test_features.parquet.zip - updated file here

I've made two different tests to highlight the issue with both EPSG:5070 and EPSG:32618. The first should be enough to highlight the issue but wanted to be thorough.

Test 1: EPSG:5070
We load all our features in Continental US (a handful are in HI and AK) and county data for Maryland (MD)
load spatial;

create or replace table counties as select  state, fips, geometry as county_geom
from read_parquet("~/Downloads/counties.parquet") where state not in ('HI', 'AK');
create or replace table features as select * from read_parquet("~/Downloads/test_features.parquet")
WHERE ST_Intersects(geometry, (select ST_Union_Agg(county_geom) from counties));
create or replace table md_counties as select  state, fips, county_geom
from counties where state = 'MD';

We cross join all the features on MD counties and calculate the projections

create or replace table features_all_counties
as select
f.OBJECTID as feature_id,
f.geometry as feature_geom,
ST_Transform(feature_geom, 'EPSG:4326', 'EPSG:5070', TRUE) as feature_geom_conus,
c.fips,
c.county_geom,
ST_Transform(c.county_geom, 'EPSG:4326', 'EPSG:5070', TRUE) as county_geom_conus
from features f CROSS JOIN md_counties c;

For every feature/county combination we calculate the results of ST_Intersects and ST_Distance with conus geometries and for EPSG:4326 geometries (using ST_DistanceSpheroid)

create or replace table diffs_conus
as 
SELECT feature_id, fips, intersects, intersects_conus,
distance_to_fips_m_spheroid, 
distance_to_fips_m_conus
,diff 
FROM
(
select
  *,
  ST_Intersects(feature_geom, county_geom) as intersects,
  ST_Intersects(feature_geom_conus, county_geom_conus) as intersects_conus,
  ST_FlipCoordinates (ST_ShortestLine (feature_geom, county_geom)) AS shortest_line,
  ST_PointN (shortest_line, 1) AS p1,
  ST_PointN (shortest_line, 2) AS p2,
  ROUND(ST_Distance_Spheroid (p1, p2)) AS distance_to_fips_m_spheroid,
  ROUND(ST_Distance(feature_geom_conus, county_geom_conus)) AS distance_to_fips_m_conus,
  ROUND((distance_to_fips_m_conus - distance_to_fips_m_spheroid)/distance_to_fips_m_conus *100,2) AS diff
FROM features_all_counties
)

At this point you can inspect the resulting table to see the issue, for example

select * from diffs_conus
where intersects_conus = False and distance_to_fips_m_conus =0 ;

gives you

Image

Also note that there is no case where intersects and intersects_conus mismatch, or no case where distance_conus =0 and intersects_conus is False.

Test 2: EPSG 32618
We can also reproduce the error in this utm zone, which is valid in most of MD and PA. Note: The UTM zones were calculated using the standard methodology here
We get all the counties MD and PA in this zone

CREATE OR REPLACE TABLE utm_counties AS select 
state,
fips,
utm_crs,
geometry as county_geom,
ST_Transform(county_geom, 'EPSG:4326', utm_crs, TRUE) as county_geom_utm,
from read_parquet("~/Downloads/counties.parquet")
where 
utm_crs = 'EPSG:32618' and state in ('MD', 'PA');

We get all the features within these counties

create or replace table features_in_zone as
  SELECT
    OBJECTID AS feature_id,
    geometry AS feature_geom
  FROM features 
  WHERE ST_Intersects(geometry, ( select st_union_AGG(county_geom) from utm_counties));

We again cross join

CREATE OR REPLACE TABLE features_cross_counties AS
SELECT
  fz.feature_id,
  fz.feature_geom,
  ST_Transform(fz.feature_geom, 'EPSG:4326', c.utm_crs, TRUE)     AS feature_geom_utm,
  c.fips,
  c.county_geom,
  c.county_geom_utm,
FROM features_in_zone fz
CROSS JOIN utm_counties c;

We recreate the diff table

create or replace table features_counties_distances_utm
as 
SELECT feature_id, fips, intersects, intersects_utm,
distance_to_fips_m_spheroid, 
distance_to_fips_m_utm
,diff, 
FROM
(
select
  *,
  ST_Intersects(feature_geom, county_geom) as intersects,
  ST_Intersects(feature_geom_utm, county_geom_utm) as intersects_utm,
  ST_FlipCoordinates (ST_ShortestLine (feature_geom, county_geom)) AS shortest_line,
  ST_PointN (shortest_line, 1) AS p1,
  ST_PointN (shortest_line, 2) AS p2,
  ROUND(ST_Distance_Spheroid (p1, p2)) AS distance_to_fips_m_spheroid,
  ROUND(ST_Distance(feature_geom_utm, county_geom_utm)) AS distance_to_fips_m_utm,
  ROUND((distance_to_fips_m_utm - distance_to_fips_m_spheroid)/distance_to_fips_m_utm*100, 2) AS diff,
FROM features_cross_counties
);

We can see the same issue

select * from features_counties_distances_utm 
where intersects_utm = False and distance_to_fips_m_utm = 0
Image

__

I am using duckdb 1.4.1 with the latest spatial extension from core.

Image

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions