Data format #2

@nealf

Here's what I'm proposing for our crime data format:
{
"_id": String (autogenerated),
"AgencyID": String,
"AgencyName": String,
"CaseNumber": String,
"CriminalOffense": String,
"DateReported": DateTime (YYYY-MM-DDTHH:MM:SS),
"Description": String,
"Location": String,
"OccurenceDate": DateTime (YYYY-MM-DDTHH:MM:SS),
"Disposition": String,
"Lat": float,
"Lng": float
}

  • _id can be autogenerated by CouchDB if needed, or we could use a unique ID such as CrimeCodeID
  • Coordinates will need to be converted or geocoded to lat/lng
  • Location should have the city/state added to the street address
  • CriminalOffense (CrimeCode) should probably be standardized to some extent
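
To make the notes above concrete, here is a minimal Python sketch of how one scraped record might be mapped into the proposed format. The input keys (`agency_id`, `street`, `reported`, and so on), the helper names, and the upper-casing rule for offenses are all assumptions for illustration, not anything we have agreed on. Geocoding is left out because no geocoder has been chosen, so `Lat`/`Lng` stay empty.

```python
from datetime import datetime

# Timestamp layout from the proposal: YYYY-MM-DDTHH:MM:SS
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def format_timestamp(value: datetime) -> str:
    """Render a datetime in the proposed layout."""
    return value.strftime(TIMESTAMP_FORMAT)


def full_location(street: str, city: str, state: str) -> str:
    """Append city/state to a bare street address."""
    return f"{street.strip()}, {city}, {state}"


def standardize_offense(raw_offense: str) -> str:
    """Hypothetical light standardization: collapse whitespace, upper-case."""
    return " ".join(raw_offense.split()).upper()


def normalize_record(scraped: dict, city: str, state: str) -> dict:
    """Map a scraped record (hypothetical keys) onto the proposed fields."""
    return {
        # "_id" is omitted so CouchDB can autogenerate it on insert.
        "AgencyID": scraped["agency_id"],
        "AgencyName": scraped["agency_name"],
        "CaseNumber": scraped["case_number"],
        "CriminalOffense": standardize_offense(scraped["offense"]),
        "DateReported": format_timestamp(scraped["reported"]),
        "Description": scraped["description"],
        "Location": full_location(scraped["street"], city, state),
        # Field name spelled as in the proposal.
        "OccurenceDate": format_timestamp(scraped["occurred"]),
        "Disposition": scraped["disposition"],
        "Lat": None,  # to be filled in once the address is geocoded
        "Lng": None,
    }
```

Keeping each rule in its own small function should make it easy to swap in a real offense mapping or a geocoding step later.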

Does anybody have any thoughts on whether we should essentially keep all of the original data we scrape and then add the standardized fields separately?
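
One possible shape for that, sketched in Python under assumptions: the standardized fields sit at the top level of the document and an untouched copy of the scraped record is nested under a separate key. The key name `Original` and the function name `build_document` are hypothetical, as are the sample values.

```python
def build_document(original: dict, standardized: dict) -> dict:
    """Combine standardized fields with an untouched copy of the scrape."""
    doc = dict(standardized)          # top-level keys follow the proposed format
    doc["Original"] = dict(original)  # hypothetical key holding the raw record
    return doc


# Example with made-up values:
raw = {"CrimeCode": "thft veh", "Address": "100 Main St"}
clean = {"CriminalOffense": "THEFT FROM VEHICLE",
         "Location": "100 Main St, Springfield, IL"}
document = build_document(raw, clean)
```

This keeps queries simple, since they only touch the standardized keys, while the original values remain available if the standardization rules change later.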
