Skip to content

csvs-to-sqlite 0.3

Choose a tag to compare

@simonw simonw released this 17 Nov 05:33
· 83 commits to main since this release
e75fc68
  • Mechanism for converting columns into separate tables

    Let's say you have a CSV file that looks like this:

    county,precinct,office,district,party,candidate,votes
    Clark,1,President,,REP,John R. Kasich,5
    Clark,2,President,,REP,John R. Kasich,0
    Clark,3,President,,REP,John R. Kasich,7
    

    (Real example from https://github.com/openelections/openelections-data-sd/blob/ master/2016/20160607__sd__primary__clark__precinct.csv )

    You can now convert selected columns into separate lookup tables using the new
    --extract-column option (shortname: -c) - for example:

    csvs-to-sqlite openelections-data-*/*.csv \
        -c county:County:name \
        -c precinct:Precinct:name \
        -c office -c district -c party -c candidate \
        openelections.db
    

    The format is as follows:

    column_name:optional_table_name:optional_table_value_column_name
    

    If you just specify the column name e.g. -c office, the following table will
    be created:

    CREATE TABLE "party" (
        "id" INTEGER PRIMARY KEY,
        "value" TEXT
    );
    

    If you specify all three options, e.g. -c precinct:Precinct:name the table
    will look like this:

    CREATE TABLE "Precinct" (
        "id" INTEGER PRIMARY KEY,
        "name" TEXT
    );
    

    The original tables will be created like this:

    CREATE TABLE "ca__primary__san_francisco__precinct" (
        "county" INTEGER,
        "precinct" INTEGER,
        "office" INTEGER,
        "district" INTEGER,
        "party" INTEGER,
        "candidate" INTEGER,
        "votes" INTEGER,
        FOREIGN KEY (county) REFERENCES County(id),
        FOREIGN KEY (party) REFERENCES party(id),
        FOREIGN KEY (precinct) REFERENCES Precinct(id),
        FOREIGN KEY (office) REFERENCES office(id),
        FOREIGN KEY (candidate) REFERENCES candidate(id)
    );
    

    They will be populated with IDs that reference the new derived tables.

    Closes #2