-
Notifications
You must be signed in to change notification settings - Fork 295
Description
I am following the tutorial Tasty Bytes - Zero to Snowflake - Geospatial with my own data and I am not getting the right answer in Step 4. I am missing one comparison. I have 5 places and I am only getting 9 distances. I should get 10.
Qualify window function filter doesn't work for the first element as Lag is null (something <> null is not true)
WITH _top_10_locations AS
(
SELECT TOP 10
location_id,
ST_MAKEPOINT(longitude, latitude) AS geo_point,
SUM(price) AS total_sales_usd
FROM analytics.orders_v
WHERE primary_city = 'Paris'
GROUP BY location_id, latitude, longitude
ORDER BY total_sales_usd DESC
)
SELECT
a.location_id,
b.location_id,
ROUND(ST_DISTANCE(a.geo_point, b.geo_point)/1609,2) AS geography_distance_miles,
ROUND(ST_DISTANCE(a.geo_point, b.geo_point)/1000,2) AS geography_distance_kilometers
FROM _top_10_locations a
JOIN _top_10_locations b
ON a.location_id <> b.location_id -- avoid calculating the distance between the point itself
QUALIFY a.location_id <> LAG(b.location_id) OVER (ORDER BY geography_distance_miles) -- avoid duplicate: a to b, b to a distances
ORDER BY geography_distance_miles;
My suggestion:
WITH _top_10_locations AS
(
SELECT TOP 10
location_id,
ST_MAKEPOINT(longitude, latitude) AS geo_point,
SUM(price) AS total_sales_usd
FROM analytics.orders_v
WHERE primary_city = 'Paris'
GROUP BY location_id, latitude, longitude
ORDER BY total_sales_usd DESC
)
SELECT
a.location_id,
b.location_id,
ROUND(ST_DISTANCE(a.geo_point, b.geo_point)/1609,2) AS geography_distance_miles,
ROUND(ST_DISTANCE(a.geo_point, b.geo_point)/1000,2) AS geography_distance_kilometers
FROM _top_10_locations a
JOIN _top_10_locations b
ON a.location_id < b.location_id -- avoid calculating the distance between the point itself. Also avoid duplicate: a to b, b to a
ORDER BY geography_distance_miles;