-
Notifications
You must be signed in to change notification settings - Fork 69
Description
Hello, I'm finding that ST_Distance on geometries projected in utm_zones (eg:EPSG:32618) or conus projection (EPSG:5070) is giving inconsistent result : Namely , ST_Distance returns 0 even when ST_Intersects on the projected geometries return false, and the geometries indeed do not intersect
Also I had raised a somewhat similar issue thread months ago here which was resolved
Steps to reproduce:
Below are datasets of US counties (from www2.census.gov) and EPA US Superfund sites data (publicly available here)
All geometries are in EPSG:4326
counties.parquet.zip
test_features.parquet.zip - updated file here
I've made two different tests to highlight the issue with both EPSG:5070 and EPSG:32618. The first should be enough to highlight the issue but wanted to be thorough.
Test 1: EPSG:5070
We load all our features in Continental US (a handful are in HI and AK) and county data for Maryland (MD)
load spatial;
create or replace table counties as select state, fips, geometry as county_geom
from read_parquet("~/Downloads/counties.parquet") where state not in ('HI', 'AK');
create or replace table features as select * from read_parquet("~/Downloads/test_features.parquet")
WHERE ST_Intersects(geometry, (select ST_Union_Agg(county_geom) from counties));
create or replace table md_counties as select state, fips, county_geom
from counties where state = 'MD';
We cross join all the features on MD counties and calculate the projections
create or replace table features_all_counties
as select
f.OBJECTID as feature_id,
f.geometry as feature_geom,
ST_Transform(feature_geom, 'EPSG:4326', 'EPSG:5070', TRUE) as feature_geom_conus,
c.fips,
c.county_geom,
ST_Transform(c.county_geom, 'EPSG:4326', 'EPSG:5070', TRUE) as county_geom_conus
from features f CROSS JOIN md_counties c;
For every feature/county combination we calculate the results of ST_Intersects and ST_Distance with conus geometries and for EPSG:4326 geometries (using ST_DistanceSpheroid)
create or replace table diffs_conus
as
SELECT feature_id, fips, intersects, intersects_conus,
distance_to_fips_m_spheroid,
distance_to_fips_m_conus
,diff
FROM
(
select
*,
ST_Intersects(feature_geom, county_geom) as intersects,
ST_Intersects(feature_geom_conus, county_geom_conus) as intersects_conus,
ST_FlipCoordinates (ST_ShortestLine (feature_geom, county_geom)) AS shortest_line,
ST_PointN (shortest_line, 1) AS p1,
ST_PointN (shortest_line, 2) AS p2,
ROUND(ST_Distance_Spheroid (p1, p2)) AS distance_to_fips_m_spheroid,
ROUND(ST_Distance(feature_geom_conus, county_geom_conus)) AS distance_to_fips_m_conus,
ROUND((distance_to_fips_m_conus - distance_to_fips_m_spheroid)/distance_to_fips_m_conus *100,2) AS diff
FROM features_all_counties
)
At this point you can inspect the resulting table to see the issue, for example
select * from diffs_conus
where intersects_conus = False and distance_to_fips_m_conus =0 ;
gives you
Also note that there is no case where intersects and intersects_conus mismatch, or no case where distance_conus =0 and intersects_conus is False.
Test 2: EPSG 32618
We can also reproduce the error in this utm zone, which is valid in most of MD and PA. Note: The UTM zones were calculated using the standard methodology here
We get all the counties MD and PA in this zone
CREATE OR REPLACE TABLE utm_counties AS select
state,
fips,
utm_crs,
geometry as county_geom,
ST_Transform(county_geom, 'EPSG:4326', utm_crs, TRUE) as county_geom_utm,
from read_parquet("~/Downloads/counties.parquet")
where
utm_crs = 'EPSG:32618' and state in ('MD', 'PA');
We get all the features within these counties
create or replace table features_in_zone as
SELECT
OBJECTID AS feature_id,
geometry AS feature_geom
FROM features
WHERE ST_Intersects(geometry, ( select st_union_AGG(county_geom) from utm_counties));
We again cross join
CREATE OR REPLACE TABLE features_cross_counties AS
SELECT
fz.feature_id,
fz.feature_geom,
ST_Transform(fz.feature_geom, 'EPSG:4326', c.utm_crs, TRUE) AS feature_geom_utm,
c.fips,
c.county_geom,
c.county_geom_utm,
FROM features_in_zone fz
CROSS JOIN utm_counties c;
We recreate the diff table
create or replace table features_counties_distances_utm
as
SELECT feature_id, fips, intersects, intersects_utm,
distance_to_fips_m_spheroid,
distance_to_fips_m_utm
,diff,
FROM
(
select
*,
ST_Intersects(feature_geom, county_geom) as intersects,
ST_Intersects(feature_geom_utm, county_geom_utm) as intersects_utm,
ST_FlipCoordinates (ST_ShortestLine (feature_geom, county_geom)) AS shortest_line,
ST_PointN (shortest_line, 1) AS p1,
ST_PointN (shortest_line, 2) AS p2,
ROUND(ST_Distance_Spheroid (p1, p2)) AS distance_to_fips_m_spheroid,
ROUND(ST_Distance(feature_geom_utm, county_geom_utm)) AS distance_to_fips_m_utm,
ROUND((distance_to_fips_m_utm - distance_to_fips_m_spheroid)/distance_to_fips_m_utm*100, 2) AS diff,
FROM features_cross_counties
);
We can see the same issue
select * from features_counties_distances_utm
where intersects_utm = False and distance_to_fips_m_utm = 0
__
I am using duckdb 1.4.1 with the latest spatial extension from core.
