A comprehensive end-to-end data analytics platform that processes NYC taxi trip data from Google BigQuery and presents actionable insights through an interactive React dashboard. This project combines big data processing with modern web visualization to explore transportation patterns, revenue metrics, and geographic distributions across New York City's 263 taxi zones.
View Live Dashboard
- Dynamic Heatmaps - Color-coded zones with real-time metric switching
- Temporal Analysis - Trip patterns across 30-minute intervals
- Flow Visualization - Sankey diagrams showing pickup-dropoff relationships
- Revenue Insights - Financial metrics and tipping patterns
- 263 NYC Taxi Zones - Complete coverage across all five boroughs
- Interactive Maps - Powered by React Leaflet with custom styling
- Zone-based Metrics - Neighborhood-level granular analysis
- Coordinate Precision - Proper WGS84 projection handling
- Material-UI v7 - Professional design system
- Dark/Light Themes - Automatic mode switching
- Responsive Design - Mobile-first approach
- Performance Optimized - Lazy loading and efficient rendering
graph LR
A[Google BigQuery] --> B[Python Processing]
B --> C[CSV/GeoJSON Files]
C --> D[React Dashboard]
D --> E[Interactive Visualizations]
subgraph "Data Sources"
A1[NYC TLC Yellow Trips]
A2[NYC TLC Green Trips]
A3[Taxi Zone Shapefiles]
end
subgraph "Processing Pipeline"
B1[Data Cleaning]
B2[Metric Calculation]
B3[Geographic Merging]
end
subgraph "Frontend Components"
D1[Revenue Analytics]
D2[Demand Analysis]
D3[Trip Characteristics]
D4[Interactive Maps]
end
- Primary: Google BigQuery public dataset (`bigquery-public-data.new_york_taxi_trips`)
- Geographic: NYC TLC official taxi zone shapefiles (263 zones)
- Temporal Coverage: Configurable date ranges (default: Jan-Feb 2025)
- Trip Types: Yellow and Green taxi services
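The configurable date range can be illustrated with a parameterized query. This is a minimal sketch, not the query used in the project's notebooks: the table suffix `tlc_yellow_trips_2022`, the column names, and the helper names `build_trip_query` and `fetch_trips` are assumptions for illustration, and the table must be swapped for one that covers the dates of interest.

```python
from datetime import date

# Dataset named in this README; the table suffix used below is an assumption.
DATASET = "bigquery-public-data.new_york_taxi_trips"


def build_trip_query(table: str = "tlc_yellow_trips_2022") -> str:
    """Return SQL selecting trips whose pickup date falls in [@start, @end)."""
    # Column names are assumed from the public TLC schema; verify before use.
    return f"""
        SELECT pickup_location_id, dropoff_location_id,
               pickup_datetime, dropoff_datetime,
               tip_amount, total_amount
        FROM `{DATASET}.{table}`
        WHERE DATE(pickup_datetime) >= @start
          AND DATE(pickup_datetime) < @end
    """


def fetch_trips(start: date, end: date, table: str = "tlc_yellow_trips_2022"):
    """Run the query; needs google-cloud-bigquery, pandas, and valid credentials."""
    from google.cloud import bigquery  # imported lazily so the builder works offline

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("start", "DATE", start),
            bigquery.ScalarQueryParameter("end", "DATE", end),
        ]
    )
    return client.query(build_trip_query(table), job_config=job_config).to_dataframe()
```

Passing the dates as query parameters rather than formatting them into the SQL string keeps the date range configurable without rebuilding the query text.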
| Metric | Description | Use Case |
|---|---|---|
| `avg_tip_amount.csv` | Average tip amounts by zone | Revenue optimization |
| `avg_total_amount.csv` | Average trip costs | Price analysis |
| `revenue_per_pickup.csv` | Total revenue by zone | Business intelligence |
| `trips_by_time_of_day.csv` | Temporal trip patterns | Demand forecasting |
| `duration_by_time_of_day.csv` | Trip duration analysis | Traffic insights |
| `dropoff_by_pickup_*.csv` | Origin-destination flows | Route optimization |
| `merged.geojson` | Geographic boundaries + metrics | Map visualization |
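
To make the zone-level metrics concrete, here is a minimal sketch of how per-zone trip counts, total revenue, and average tip could be derived from individual trip records. It uses only the standard library; the field names (`pickup_location_id`, `total_amount`, `tip_amount`) and the function name `zone_metrics` are assumptions, and the sample rows are made up rather than taken from the dataset.

```python
from collections import defaultdict


def zone_metrics(trips):
    """Aggregate per pickup zone: trip count, total revenue, average tip."""
    totals = defaultdict(lambda: {"trips": 0, "revenue": 0.0, "tips": 0.0})
    for trip in trips:
        zone = totals[trip["pickup_location_id"]]
        zone["trips"] += 1
        zone["revenue"] += trip["total_amount"]
        zone["tips"] += trip["tip_amount"]
    return {
        zone_id: {
            "trips": t["trips"],
            "revenue": round(t["revenue"], 2),
            "avg_tip": round(t["tips"] / t["trips"], 2),
        }
        for zone_id, t in totals.items()
    }


# Made-up sample rows for illustration, not real trip data.
sample = [
    {"pickup_location_id": 132, "total_amount": 70.0, "tip_amount": 12.0},
    {"pickup_location_id": 132, "total_amount": 50.0, "tip_amount": 8.0},
    {"pickup_location_id": 161, "total_amount": 18.5, "tip_amount": 3.0},
]
metrics = zone_metrics(sample)
```

Each entry of `metrics` corresponds to one row of a per-zone CSV, keyed by the zone ID that is later joined to the zone boundaries.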
- BigQuery Integration - Serverless SQL processing at scale
- Automated Data Cleaning - Missing value handling and validation
- Coordinate Transformation - EPSG:2263 to WGS84 conversion
- Performance Optimization - Partitioned queries and efficient aggregation
{
"react": "^19.1.1",
"typescript": "~5.8.3",
"vite": "^7.1.2"
}

{
"@mui/material": "^7.3.2",
"plotly.js": "^3.1.0",
"react-leaflet": "^5.0.0",
"chroma-js": "^3.1.2"
}

{
"eslint": "^9.33.0",
"typescript-eslint": "^8.39.1",
"@vitejs/plugin-react-swc": "^4.0.0"
}

- Node.js >= 18.0.0
- npm >= 9.0.0
- Python >= 3.8 (for data processing)
- Google Cloud Account (for BigQuery access)
# Clone the repository
git clone https://github.com/amirrezaskh/nyc-taxi-dashboard.git
cd nyc-taxi-dashboard
# Install dependencies
npm install

If you want to regenerate the data from BigQuery:
# Install Python dependencies
pip install pandas numpy matplotlib folium geopandas google-cloud-bigquery
# Set up Google Cloud authentication
export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account-key.json"
# Run data processing
jupyter notebook processing/main.ipynb

# Start development server
npm run dev
# Open browser at http://localhost:5173

nyc-taxi-dashboard/
├── public/ # Static assets and processed data
│   ├── data/ # CSV files and GeoJSON
│   │   ├── avg_tip_amount.csv
│   │   ├── trips_by_time_of_day.csv
│   │   ├── merged.geojson
│   │   └── ...
│   └── favicon.ico
├── src/
│   ├── app/ # Application core
│   │   ├── layout/ # Layout components
│   │   │   ├── App.tsx # Main app wrapper
│   │   │   ├── Navbar.tsx # Navigation
│   │   │   └── styles.css # Global styles
│   │   └── router/ # Routing configuration
│   │       └── routes.tsx # Route definitions
│   ├── features/ # Feature-based architecture
│   │   ├── about/ # About page
│   │   ├── revenue/ # Revenue analytics
│   │   │   └── RevenueTips.tsx
│   │   ├── demand/ # Trip demand analysis
│   │   │   ├── Demand.tsx
│   │   │   ├── TripsByTime.tsx
│   │   │   └── TaxiSankey.tsx
│   │   ├── characteristic/ # Trip characteristics
│   │   │   ├── TripCharacteristic.tsx
│   │   │   └── DurationByTime.tsx
│   │   └── map/ # Interactive mapping
│   │       └── NYCMap.tsx
│   ├── theme/ # Design system
│   │   ├── AppTheme.tsx # Theme provider
│   │   ├── palettes.ts # Color schemes
│   │   ├── themePrimitives.ts # Design tokens
│   │   └── customizations/ # Component overrides
│   ├── lib/ # Utilities and types
│   │   ├── types/
│   │   └── util/
│   └── main.tsx # Application entry point
├── processing/ # Data analysis pipeline
│   ├── main.ipynb # Primary analysis notebook
│   ├── preprocess.ipynb # Data preprocessing
│   ├── process.py # Zone processing script
│   └── merge.py # Data merging script
├── package.json # Dependencies and scripts
├── tsconfig.json # TypeScript configuration
├── vite.config.ts # Vite build configuration
└── README.md # This file
Explore the financial landscape of NYC's taxi ecosystem:
- Manhattan Premium - Higher tips in business districts
- Airport Revenue - JFK/LaGuardia pickup patterns
- Tip Percentage Trends - Business vs entertainment zones
- Economic Geography - Revenue concentration analysis
Key Metrics: Average tip amount, total revenue, tip percentages
Understand dynamic patterns of taxi utilization:
- Rush Hour Peaks - Morning (7-9 AM) and evening (5-7 PM) patterns
- Weekend Shifts - Late-night entertainment district activity
- Commuter Flows - Directional pickup/dropoff relationships
- Seasonal Variations - Weather and event impact analysis
Visualizations: Time series charts, Sankey flow diagrams, demand heatmaps
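
The pickup-dropoff relationships behind a Sankey diagram reduce to counting origin-destination pairs. The sketch below is one possible way to do that, assuming trip records with `pickup_location_id` and `dropoff_location_id` fields; the function name `sankey_links` and the `top_n` cutoff are hypothetical. The output uses the label/source/target/value arrays that a Plotly Sankey trace takes for its nodes and links.

```python
from collections import Counter


def sankey_links(trips, top_n=10):
    """Count pickup->dropoff pairs and return the top_n as Sankey link arrays."""
    flows = Counter(
        (t["pickup_location_id"], t["dropoff_location_id"]) for t in trips
    )
    top = flows.most_common(top_n)
    # Pickup and dropoff nodes are kept distinct so A->B and B->A never form a cycle.
    labels = sorted({f"from {p}" for (p, _), _ in top}) + sorted(
        {f"to {d}" for (_, d), _ in top}
    )
    index = {label: i for i, label in enumerate(labels)}
    return {
        "labels": labels,
        "source": [index[f"from {p}"] for (p, _), _ in top],
        "target": [index[f"to {d}"] for (_, d), _ in top],
        "value": [count for _, count in top],
    }


# Made-up zone IDs for illustration only.
sample = [
    {"pickup_location_id": 1, "dropoff_location_id": 2},
    {"pickup_location_id": 1, "dropoff_location_id": 2},
    {"pickup_location_id": 2, "dropoff_location_id": 1},
    {"pickup_location_id": 1, "dropoff_location_id": 3},
]
links = sankey_links(sample)
```

Limiting the diagram to the busiest pairs keeps it readable; with 263 zones the full origin-destination matrix would have tens of thousands of links.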
Analyze fundamental trip attributes:
- Passenger Patterns - Business (1.2-1.4) vs entertainment (1.6-2.0) occupancy
- Distance Variations - Airport vs intra-Manhattan comparisons
- Duration Dynamics - Traffic impact (40-60% longer in rush hour)
- Geographic Influence - Bridge/tunnel access effects
Components: Duration analysis, distance distributions, passenger metrics
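
As a rough sketch of the duration-by-time-of-day analysis, the snippet below floors each pickup timestamp to its 30-minute slot and averages trip duration per slot. The function names `slot_label` and `duration_by_slot`, the rule that drops non-positive durations, and the sample timestamps are all assumptions for illustration; the notebooks may bin and clean the data differently.

```python
from collections import defaultdict
from datetime import datetime


def slot_label(ts: datetime) -> str:
    """Floor a timestamp to its 30-minute slot, e.g. 08:47 -> '08:30'."""
    return f"{ts.hour:02d}:{30 * (ts.minute // 30):02d}"


def duration_by_slot(trips):
    """Return {slot: (trip_count, mean_duration_minutes)} keyed by pickup slot."""
    buckets = defaultdict(list)
    for pickup, dropoff in trips:
        minutes = (dropoff - pickup).total_seconds() / 60
        if minutes > 0:  # assumed cleaning rule: skip zero or negative durations
            buckets[slot_label(pickup)].append(minutes)
    return {
        slot: (len(vals), round(sum(vals) / len(vals), 1))
        for slot, vals in sorted(buckets.items())
    }


# Made-up timestamps for illustration only.
sample = [
    (datetime(2025, 1, 6, 8, 5), datetime(2025, 1, 6, 8, 35)),
    (datetime(2025, 1, 6, 8, 20), datetime(2025, 1, 6, 8, 40)),
    (datetime(2025, 1, 6, 14, 45), datetime(2025, 1, 6, 15, 0)),
]
result = duration_by_slot(sample)
```

Comparing the mean for a rush-hour slot against an off-peak slot gives the kind of ratio quoted in the Duration Dynamics bullet.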
This project is licensed under the MIT License - see the LICENSE file for details.