jayohday commented Jan 7, 2023

This Python script runs the data-cleaning steps needed to include the airmen data in The Accountability Project (TAP). The output file is similar to, but not identical to, the airmen files previously produced for TAP. It adds a few fields derived from the raw data fields:

  1. uszip5 - the 5-digit US ZIP created out of the 9-digit ZIP
  2. nonuszip - a field for all non-US ZIP codes
  3. Separate fields for the month, day and year (where applicable) from each date field.
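
For reviewers who want a feel for these derived fields, here is a minimal standard-library sketch. It is not the code in this PR. The helper names `split_zip` and `split_date`, the rule that a US ZIP is exactly 5 or 9 digits, and the `MMDDYYYY` date format are all assumptions made for illustration.

```python
import re
from datetime import datetime

# Assumption: a US ZIP is 5 digits, optionally followed by a 4-digit suffix
# (with or without a hyphen). Any other non-empty value is treated as non-US.
US_ZIP_RE = re.compile(r"^(\d{5})(?:-?\d{4})?$")


def split_zip(raw_zip):
    """Return (uszip5, nonuszip) for one raw ZIP value (hypothetical helper)."""
    value = (raw_zip or "").strip()
    match = US_ZIP_RE.match(value)
    if match:
        return match.group(1), ""
    return "", value


def split_date(raw_date, fmt="%m%d%Y"):
    """Return (month, day, year) as ints, or three Nones if the value does not parse.

    Assumption: dates are stored as MMDDYYYY; pass a different fmt otherwise.
    """
    value = (raw_date or "").strip()
    try:
        parsed = datetime.strptime(value, fmt)
    except ValueError:
        return None, None, None
    return parsed.month, parsed.day, parsed.year
```

Under these assumptions, `split_zip("123456789")` gives `("12345", "")` and `split_zip("K1A 0B1")` gives `("", "K1A 0B1")`. Date fields that carry only a month and year would need their own handling, since this sketch always returns a day.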

This script also combines the PILOT_BASIC.csv and NONPILOT_BASIC.csv files.
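
The combining step could look roughly like the sketch below. It is only an illustration with the standard `csv` module, not the script's actual implementation, and it assumes the two files may not share exactly the same columns, so it takes the union of the headers. The function name `combine_basic_rows` is hypothetical.

```python
import csv


def combine_basic_rows(*sources):
    """Concatenate CSV sources into one list of rows with a shared header.

    Each source is any iterable of CSV lines (an open file works).
    Columns missing from a source are filled with an empty string.
    """
    fieldnames = []
    rows = []
    for source in sources:
        reader = csv.DictReader(source)
        # Keep first-seen column order while building the union of headers.
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        rows.extend(reader)
    combined = [{name: row.get(name) or "" for name in fieldnames} for row in rows]
    return fieldnames, combined


# Usage sketch (file names taken from the PR description):
# with open("PILOT_BASIC.csv", newline="") as pilots, \
#         open("NONPILOT_BASIC.csv", newline="") as nonpilots:
#     fields, rows = combine_basic_rows(pilots, nonpilots)
```

Taking the union of columns keeps any field that appears in only one of the two files, at the cost of blank cells for rows from the other file.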

The .zip file downloaded from the FAA includes two other airmen files, NONPILOT_CERT.csv and PILOT_CERT.csv. This script does not process them yet, but O'Dea can add them next week. The file currently on TAP appears to include data only from the _BASIC files, but the UNIQUEID field should allow all four files to be joined, if we want that.
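
If we do decide to join the _CERT files, one possible shape for the join is sketched below. It assumes, without having checked the data, that one airman can have several certificate rows, so each _BASIC row is repeated once per matching _CERT row and kept as-is when there is no match (a left join). The function name `join_on_unique_id` is hypothetical; only the `UNIQUEID` key comes from the files described above.

```python
from collections import defaultdict


def join_on_unique_id(basic_rows, cert_rows, key="UNIQUEID"):
    """Left-join certificate rows onto basic rows using a shared key column."""
    # Index certificate rows by key so each lookup is constant time.
    certs_by_id = defaultdict(list)
    for cert in cert_rows:
        certs_by_id[cert[key]].append(cert)

    joined = []
    for basic in basic_rows:
        matches = certs_by_id.get(basic[key])
        if not matches:
            # No certificate rows for this airman: keep the basic row alone.
            joined.append(dict(basic))
            continue
        for cert in matches:
            # Certificate columns overwrite basic columns of the same name.
            joined.append({**basic, **cert})
    return joined
```

Rows here are plain dictionaries, such as those produced by `csv.DictReader`. Whether a left join or an inner join is the right choice depends on how TAP wants to treat airmen without certificate records.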
