Description
Right now, when a user selects multiple counties, the app averages their percentiles to get the combined value. Mathematically, this is incorrect: a percentile is a rank, not a quantity, so it cannot be averaged. We need to recalculate it from the underlying estimates.
Per the SVI documentation, CDC uses the Excel function PERCENTRANK.INC on the corresponding EP field with 4 significant digits. Unfortunately, most of the EP fields are themselves percentages.
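For reference, here is a rough Python sketch of how PERCENTRANK.INC behaves, as far as I understand it: the rank is the count of values below `x` divided by `n - 1`, values that fall between two data points are linearly interpolated, and the result is truncated (not rounded) to the requested number of digits. The function name `percent_rank_inc` is made up for illustration, and the handling of out-of-range input (returning `None` where Excel gives `#N/A`) is my assumption.

```python
import math
from bisect import bisect_left


def percent_rank_inc(values, x, digits=4):
    """Approximate Excel's PERCENTRANK.INC(values, x, digits).

    Hypothetical helper, not CDC's or Excel's actual code.
    Returns None where Excel would return #N/A.
    """
    s = sorted(values)
    n = len(s)
    if n < 2 or x < s[0] or x > s[-1]:
        return None

    i = bisect_left(s, x)  # number of values strictly less than x
    if s[i] == x:
        # x is in the data: ties share the rank of their first occurrence
        rank = i / (n - 1)
    else:
        # x falls between two data points: interpolate between their ranks
        lo, hi = s[i - 1], s[i]
        rank_lo = bisect_left(s, lo) / (n - 1)
        rank_hi = i / (n - 1)
        rank = rank_lo + (x - lo) / (hi - lo) * (rank_hi - rank_lo)

    # Truncate to the requested number of digits
    scale = 10 ** digits
    return math.floor(rank * scale) / scale


data = [13, 12, 11, 8, 4, 3, 2, 1, 1, 1]
print(percent_rank_inc(data, 4, digits=3))
print(percent_rank_inc(data, 5, digits=3))
```

With `digits=4` this should line up with the 4 significant digits mentioned above, but it is worth checking against a few published SVI values before relying on it.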
To be 100% accurate, we will need to go back to the estimate field, recalculate the percentage for the multi-county or multi-tract selection, then calculate the percentile by comparing that new percentage against the rest of the counties or tracts. The EP field calculation is also in the documentation above.
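To illustrate why the recalculation matters, here is a minimal sketch of the aggregation step with made-up numbers. The function name and the `(estimate, denominator)` tuple layout are hypothetical, not our actual data model; the real numerator and denominator for each EP field come from the SVI documentation.

```python
def aggregate_percentage(selection):
    """Recompute a percentage for a multi-area selection.

    `selection` is a list of (estimate, denominator) pairs, one per
    selected county or tract (hypothetical layout for illustration).
    Returns None if the combined denominator is zero.
    """
    total_estimate = sum(est for est, _ in selection)
    total_denominator = sum(den for _, den in selection)
    if total_denominator == 0:
        return None
    return 100.0 * total_estimate / total_denominator


# Made-up example: a small county at 10% and a large county at 50%
selection = [(100, 1000), (5000, 10000)]

naive = sum(100.0 * est / den for est, den in selection) / len(selection)
combined = aggregate_percentage(selection)

print(naive)     # simple average of the two percentages
print(combined)  # percentage recomputed from the summed estimates
```

The simple average treats both counties as equally large, while the recomputed value is dominated by the larger county. The recomputed percentage is what would then be ranked against the other counties or tracts.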
Including the estimate fields will not be possible at the tract level, as our mbtiles file is already at the maximum file size. We could do this at the county level though.
At the tract level, we will still be able to do this for some fields. EP_PCI is already the estimate, not a percentage, so we can aggregate it accurately. For some other fields, we just need to multiply the percentage by the population estimate to recover the estimate. But there will be some fields for which we cannot calculate an accurate percentile (e.g. EP_CROWD, which requires the estimated household units as the denominator).
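For the fields whose denominator is the population estimate, the recovery could look roughly like the sketch below. The function names and the `(ep_percent, population)` layout are hypothetical. Since the published EP values are rounded, the recovered estimates will only be approximate.

```python
def recover_estimate(ep_percent, population):
    """Approximate the original estimate from a percentage and its
    population denominator (subject to rounding in the published EP value)."""
    return ep_percent / 100.0 * population


def aggregate_from_percentages(rows):
    """Combine (ep_percent, population) pairs into one percentage.

    Equivalent to a population-weighted average of the percentages.
    Returns None if the combined population is zero.
    """
    total_population = sum(pop for _, pop in rows)
    if total_population == 0:
        return None
    total_estimate = sum(recover_estimate(ep, pop) for ep, pop in rows)
    return 100.0 * total_estimate / total_population


# Made-up tracts: (percentage, population estimate)
tracts = [(10.0, 1000), (50.0, 10000)]
print(aggregate_from_percentages(tracts))
```

This only works when the field's denominator really is the population estimate. Fields such as EP_CROWD use a different denominator that we do not currently ship, so the same trick does not apply there.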
I'll work on identifying which fields we can calculate with the data we already have (and how to do so), and which fields we cannot calculate accurately with our current data.