Skip to content

Conversation

@joto
Copy link
Collaborator

@joto joto commented Oct 29, 2025

We have had some code in osm2pgsql basically forever that is supposed to handle the case where a (multi)polygon crosses the 180° line (antimeridian).

The idea is as follows: Lets say we have a polygon (like a small island) straddling the 180° line:

+---------------------+
|                     |
|                     |
|>                   <|
|                     |
+---------------------+

If we calculate the bounding box by taking the smallest and the largest longitude value for that polygon, we'll get this:

+---------------------+
|                     |
| ___________________ |
|[___________________]|
|                     |
+---------------------+

To detect this, we check the width of the bounding box. If it is larger than half the circumference of the earth, we assume that we have a case like this.

Then we create two bounding boxes, one going from -180° to the left edge of the original bounding box, one from the right edge of the original bounding box to +180° like this:

+---------------------+
|                     |
|                     |
|]                   [|
|                     |
+---------------------+

The same code is run for multipolygon relations.

So far, so good. The problem is that this is not how we map things. We don't have polygons crossing the 180° line in OSM, because they lead to all sorts of problems when using the data.

Instead mapping is done using multipolygon relations. The polygon is split into two, one to the left and one to the right of the 180° line.

The effect is that we have two polygons, both touching the 180° line. When we use this algorithm, it calculates the bounding box as going around the whole earth from -180° to +180°, then decides that this is more than half the circumference of the earth, so it creates two new bounding boxes with zero width, one at -180°, one at +180°. That doesn't make any sense.

So this code handles cases we don't have in our current planet file, but does the wrong thing for some cases we do have. It comes from way back, probably before we had relations, we should retire it.

We might still want some checks in the future to catch overlarge bounding boxes to defend against bad edits that will lead to a large number of tiles to be expired, but that is a different issue.

We have had some code in osm2pgsql basically forever that is supposed to
handle the case where a (multi)polygon crosses the 180° line
(antimeridian).

The idea is as follows: Lets say we have a polygon (like a small island)
straddling the 180° line:

+---------------------+
|                     |
|                     |
|>                   <|
|                     |
+---------------------+

If we calculate the bounding box by taking the smallest and the largest
longitude value for that polygon, we'll get this:

+---------------------+
|                     |
| ___________________ |
|[___________________]|
|                     |
+---------------------+

To detect this, we check the width of the bounding box. If it is larger
than half the circumference of the earth, we assume that we have a case
like this.

Then we create two bounding boxes, one going from -180° to the left edge
of the original bounding box, one from the right edge of the original
bounding box to +180° like this:

+---------------------+
|                     |
|                     |
|]                   [|
|                     |
+---------------------+

The same code is run for multipolygon relations.

So far, so good. The problem is that this is not how we map things. We
don't have polygons crossing the 180° line in OSM, because they lead to
all sorts of problems when using the data.

Instead mapping is done using multipolygon relations. The polygon is
split into two, one to the left and one to the right of the 180° line.

The effect is that we have two polygons, both touching the 180° line.
When we use this algorithm, it calculates the bounding box as going
around the whole earth from -180° to +180°, then decides that this is
more than half the circumference of the earth, so it creates two new
bounding boxes with zero width, one at -180°, one at +180°. That doesn't
make any sense.

So this code handles cases we don't have in our current planet file, but
does the wrong thing for some cases we do have. It comes from way back,
probably before we had relations, we should retire it.
@lonvia
Copy link
Collaborator

lonvia commented Oct 31, 2025

If I understand you explanation right, then removing this code means that instead of a zero-size bounding box, the code will work with a 360-degree-size bounding box.

Isn't the actual issue here that we don't split multipolygons into separate outer before doing the expiry?

@joto
Copy link
Collaborator Author

joto commented Oct 31, 2025

We should split the multipolygons first for the creation of expire tiles. But the current code is somewhat complex and doesn't allow that easily and we need to deal with the backwards compatibility issue that currently we are using the whole bounding box to determine the cutoff in the hybrid case. We also should avoid calculating the bounding box twice (once on the whole mp, once for all the polygons). All of this is easier to reason about if we clean up the code first.

Splitting into the mps would fix most of the cases where the current code doesn't work, but not all of them. We do have at least one mp that goes 360° here that this code will break. I couldn't find a single case where this code acts as it was intended to do, so I believe we should just remove it.

There is the separate issue of how to handle too large expire areas as discussed here but that's a separate issue. The current code saves us from expiring the whole planet but still allows half the planet which is still way too big to be sensible.

@lonvia
Copy link
Collaborator

lonvia commented Oct 31, 2025

Removing this code means we go from wrongly expiring nothing to expiring all tiles along the affected latitude. The first might be slightly annoying but obviously not enough that anybody has noticed in a decade. The latter means that expiry work might increase significantly. That's not acceptable in my opinion. Not as a standalone PR. There must be another way to go about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants