This repository contains infrastructure for downloading and pre-processing source data and for seeding a PostGIS database. The database serves a backend that provides county-level climate data for the United States. This work is part of the American Resiliency community and is associated with NCACounties, the frontend component of this project.
The download_sources.py script fetches all required data sources associated with this project. This step is not included in the Dockerfile itself, so that the build remains accessible to those without AWS access. You will need to be a user on the American Resiliency AWS account to download the NCA Atlas geojson files directly. Otherwise, you will need to download them manually from the following website: NCA Atlas Climate Data. If you have generated an access key for your user on AWS, you can set up the environment for the script as follows:
# Follow the instructions to enter your access key and secret key, region is us-east-2, format can be skipped
aws configure
python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt
scripts/download_sources.py
All source data files should now be located in the data/sources directory (relative to the top-level directory of this repository). If you downloaded the files manually, place them in this directory before the next step.
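Before moving on, it can help to confirm that the source files actually landed in data/sources. The following is a minimal sketch and is not part of this repository: the helper name list_source_files and the assumption that source files end in .geojson or .csv are illustrative only.

```python
from pathlib import Path


def list_source_files(source_dir="data/sources", suffixes=(".geojson", ".csv")):
    """Return a sorted list of files under source_dir with a matching suffix.

    The suffix list is an assumption for illustration; adjust it to the
    file types you actually downloaded.
    """
    root = Path(source_dir)
    if not root.is_dir():
        # A missing directory is treated the same as an empty one.
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in suffixes
    )


if __name__ == "__main__":
    files = list_source_files()
    if not files:
        print("No source files found in data/sources; download them before building.")
    for path in files:
        print(f"{path} ({path.stat().st_size} bytes)")
```

Run it from the top-level directory of the repository so that the relative data/sources path resolves correctly.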
If you are on a Linux or macOS system, simply run the build script after completing the previous section:
./build.sh
If something goes wrong and you need to start over, run the clean script:
./clean.sh
A Windows-compatible version of these steps will be available later.
County-level normals are not included in the NCA dataset, so they will be generated by scripts in this repository in a future update. Some pre-processing scripts are included to aid the analysis. After downloading the CSV data from NOAA NCEI with wget or download_sources.py, run the following commands:
scripts/merge_to_single_csv.py data/sources/ all_stations.csv [--recursive]
scripts/analyze_wide_csv.py all_stations.csv --output-plot heatmap.png
The first script, merge_to_single_csv.py, combines the individual CSV files from the weather stations in the dataset into a single wide CSV containing every column (i.e., climatic measurement) found across all of the input files. You may wish to use --recursive if you unpacked the data into separate directories. If a particular weather station does not record a certain measurement, that column is left blank for that station in the wide CSV. The second script, analyze_wide_csv.py, creates a heatmap plot describing which measurements are most common across all weather stations.
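To illustrate the idea of a wide CSV, here is a minimal sketch using only the Python standard library. It is not the implementation used by merge_to_single_csv.py or analyze_wide_csv.py; the function names merge_csvs and column_coverage are hypothetical, and the sketch assumes each input file has a header row.

```python
import csv


def merge_csvs(paths, out_path):
    """Write one CSV whose header is the union of all input columns.

    Columns keep the order in which they are first seen. Cells for
    measurements a station does not report are left blank.
    """
    rows, columns = [], []
    for path in paths:
        with open(path, newline="") as handle:
            reader = csv.DictReader(handle)
            for name in reader.fieldnames or []:
                if name not in columns:
                    columns.append(name)
            rows.extend(reader)
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=columns, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        writer.writerows(rows)
    return columns


def column_coverage(wide_csv):
    """Return, per column, the fraction of rows with a non-blank value."""
    with open(wide_csv, newline="") as handle:
        reader = csv.DictReader(handle)
        counts = {name: 0 for name in reader.fieldnames or []}
        total = 0
        for row in reader:
            total += 1
            for name, value in row.items():
                if name in counts and value not in (None, ""):
                    counts[name] += 1
    return {name: (n / total if total else 0.0) for name, n in counts.items()}
```

The coverage fractions are the kind of summary a heatmap of measurement availability could be built from; columns with low coverage are measurements that few stations report.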
The script create_gridded_raster.py is used to create gridded raster GeoTIFF images of climate variables from the wide CSV produced in the previous step. Note that only the first script from the previous section is required to create gridded rasters. The example commands for generating the rasters are:
scripts/create_gridded_raster.py all_stations.csv annual_temp_grid_10km.tif --resolution 10000
scripts/create_gridded_raster.py all_stations.csv tmean_jja_grid_10km.tif -m JJA-TAVG-NORMAL --resolution 10000
scripts/create_gridded_raster.py all_stations.csv annual_prcp_grid_10km.tif -m ANN-PRCP-NORMAL --resolution 10000
scripts/create_gridded_raster.py all_stations.csv jja_tmin_grid_10km.tif -m JJA-TMIN-NORMAL --resolution 10000
Jacqueline Ryan