38 changes: 38 additions & 0 deletions DataProcessing/CleanSightingsData.py
@@ -0,0 +1,38 @@
import pandas as pd
import re

# Load the CSV file
file_path = "Species_Sightings_1km_Block_16Oct2024.csv"
df = pd.read_csv(file_path)

# Function to extract species, count, and date
def parse_species_data(entry):
    observations = entry.split(", ")
    parsed_entries = []
    for obs in observations:
        match = re.match(r"(.+) \((\d+)\) (\d{4}-\d{2}-\d{2})", obs)
        if match:
            species, count, date = match.groups()
            parsed_entries.append((species, int(count), date))
    return parsed_entries

# Expand rows for multiple species observations
expanded_rows = []
for _, row in df.iterrows():
    parsed_entries = parse_species_data(row['Aggregated_Species_Data'])
    for species, count, date in parsed_entries:
        expanded_rows.append({
            'Latitude': row['Latitude'],
            'Longitude': row['Longitude'],
            'Species': species,
            'Count': count,
            'Date': date
        })

# Create a new DataFrame with structured data
structured_df = pd.DataFrame(expanded_rows)

# Save the structured data
structured_df.to_csv("Structured_Bird_Data.csv", index=False)

print("Structured data saved as 'Structured_Bird_Data.csv'")
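
As a quick illustration of the parsing step, the sketch below runs the same regular expression on a made-up entry. The sample string and its species names are hypothetical. The format of `Aggregated_Species_Data` is assumed to be `Species (count) YYYY-MM-DD` items separated by `, `, which is what the pattern in the script implies.

```python
import re

# Same pattern as in CleanSightingsData.py: "Species (count) YYYY-MM-DD"
PATTERN = re.compile(r"(.+) \((\d+)\) (\d{4}-\d{2}-\d{2})")

def parse_species_data(entry):
    """Split an aggregated entry into (species, count, date) tuples."""
    parsed_entries = []
    for obs in entry.split(", "):
        match = PATTERN.match(obs)
        if match:
            species, count, date = match.groups()
            parsed_entries.append((species, int(count), date))
    return parsed_entries

# Hypothetical sample value, not taken from the real CSV
sample = "Mallard (3) 2024-05-01, Great Blue Heron (1) 2024-05-02"
result = parse_species_data(sample)
print(result)
```

Items that do not match the pattern are skipped silently, so comparing the number of parsed tuples against the number of comma-separated items is one way to spot rows that were dropped.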
15 changes: 15 additions & 0 deletions DataProcessing/README.md
@@ -0,0 +1,15 @@
# Data Processing

1. Run `CleanSightingsData.py` to generate a new CSV called `Structured_Bird_Data.csv`
    1. It has one row per species observation (Latitude, Longitude, Species, Count, Date) instead of the original aggregated column
2. Open QGIS
    1. If QGIS is not installed, download it from:
    2. https://qgis.org/download/
3. Click New Project
4. Click Layer > Add Layer > Add Delimited Text Layer
5. Select the cleaned CSV (`Structured_Bird_Data.csv`) as the file to use
6. Update the projection and make sure the X field is Longitude and the Y field is Latitude
7. Select OK
8. Right-click the newly generated layer and select Export > Save Features As
9. Select GeoJSON as the format
10. Save to the DataProcessing folder
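
The QGIS steps above could also be scripted. The following is a minimal sketch, not part of this change, that builds a GeoJSON `FeatureCollection` from rows shaped like `Structured_Bird_Data.csv` using only the standard library. The function names `rows_to_geojson` and `convert` are hypothetical, and the coordinates are assumed to be plain longitude/latitude in decimal degrees. GeoJSON stores each point as `[longitude, latitude]`, the same X/Y order as step 6.

```python
import csv
import json

def rows_to_geojson(rows):
    """Build a GeoJSON FeatureCollection from dict-like rows."""
    features = []
    for row in rows:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON order is [longitude, latitude]
                "coordinates": [float(row["Longitude"]), float(row["Latitude"])],
            },
            "properties": {
                "Species": row["Species"],
                "Count": int(row["Count"]),
                "Date": row["Date"],
            },
        })
    return {"type": "FeatureCollection", "features": features}

def convert(csv_path, geojson_path):
    """Read the structured CSV and write it out as GeoJSON."""
    with open(csv_path, newline="", encoding="utf-8") as src:
        collection = rows_to_geojson(csv.DictReader(src))
    with open(geojson_path, "w", encoding="utf-8") as dst:
        json.dump(collection, dst, indent=2)

# Hypothetical in-memory row, shaped like one line of Structured_Bird_Data.csv
demo = rows_to_geojson([
    {"Latitude": "43.5", "Longitude": "-80.25",
     "Species": "Mallard", "Count": "3", "Date": "2024-05-01"},
])
print(json.dumps(demo["features"][0]["geometry"]))
```

Calling `convert("Structured_Bird_Data.csv", "Structured_Bird_Data.geojson")` would then write the file; the output name here is only an example.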