Prefer country over locality when deduping #1713
Merged
orangejulius merged 1 commit into master on Dec 12, 2025
This is a simple change to prefer country records over localities when we are deduplicating two matching records.

Note that this doesn't change _when_ deduplication will occur, only which record we keep once two records have been identified as duplicates.

It mostly affects city-states and small countries like Singapore, Hong Kong, etc. Some not-quite-city-states like Luxembourg are also affected.

It's a bit of a stylistic change: in these cases either the country or the locality is _technically_ correct.

Some things this fixes:

- Previously, Mexico City and Mexico were both deduplicated to Mexico City, so there was nothing you could type to get Mexico the country. Now you'll get the country until you type 'Mexico C'. This is still not ideal (we really should not deduplicate those two records), but for now it is at least better.
- We've found there are some junk locality records out there that match country names, and this conveniently removes all of them.
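To illustrate the preference described above, here is a minimal sketch rather than the code from this PR. The `GeoRecord` shape, the `LAYER_RANK` table, and the `preferredRecord` function are all hypothetical names chosen for the example. It assumes the two records have already been identified as duplicates and only decides which one to keep.

```typescript
// Hypothetical record shape; the field names are assumptions for illustration.
interface GeoRecord {
  name: string;
  layer: string; // e.g. "country" or "locality"
}

// Lower rank wins. Only the country-before-locality order comes from the
// description above; the table itself is an assumption.
const LAYER_RANK: Record<string, number> = { country: 0, locality: 1 };

// Given two records already identified as duplicates, pick the one to keep.
function preferredRecord(a: GeoRecord, b: GeoRecord): GeoRecord {
  const rankA = LAYER_RANK[a.layer] ?? Number.MAX_SAFE_INTEGER;
  const rankB = LAYER_RANK[b.layer] ?? Number.MAX_SAFE_INTEGER;
  // On a tie the first record is kept, preserving the incoming order.
  return rankB < rankA ? b : a;
}

const city: GeoRecord = { name: "Singapore", layer: "locality" };
const country: GeoRecord = { name: "Singapore", layer: "country" };

console.log(preferredRecord(city, country).layer); // prints "country"
```

Because the function only chooses between two records that are already considered duplicates, it leaves the question of when deduplication happens untouched, which matches the scope described above.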