xkuang/financial-data-assistant

Financial Data Assistant

Author: Xiaoting Kuang

Development Tools:

  • Code Development: Cursor - The AI-first code editor
  • AI Assistance: Claude (Anthropic) - For rapid development and problem-solving

A comprehensive financial data platform that aggregates and analyzes data from FDIC (Federal Deposit Insurance Corporation) and NCUA (National Credit Union Administration). The platform includes an automated data pipeline using Apache Airflow and a web interface with SQL querying capabilities.

Features

  • Automated data collection from FDIC and NCUA sources
  • Daily data freshness checks and updates
  • Interactive SQL query interface
  • Pre-built analytics queries
  • AI-powered chat interface for data exploration
  • Comprehensive documentation and API reference

System Requirements

  • Python 3.11+
  • PostgreSQL 13+
  • Redis (for Airflow)
  • Virtual environment management tool (venv recommended)

Directory Structure

.
├── airflow/
│   ├── dags/
│   │   ├── refresh_financial_data_dag.py
│   │   ├── fdic_ingestion.py
│   │   └── ncua_ingestion.py
├── docs/
│   ├── architecture.md
│   ├── data_mechanics.md
│   └── technical_overview.md
├── prospects/
│   ├── models.py
│   ├── views.py
│   └── urls.py
├── templates/
│   ├── base.html
│   ├── chat/
│   └── documentation/
└── manage.py

Installation

  1. Clone the repository:
git clone https://github.com/xkuang/financial-data-assistant.git
cd financial-data-assistant
  2. Create and activate a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install required packages:
pip install -r requirements.txt
  4. Set up environment variables (create a .env file in the project root):
# Django settings
DEBUG=True
SECRET_KEY=your_secret_key_here
ALLOWED_HOSTS=localhost,127.0.0.1

# Database settings
DATABASE_URL=postgresql://user:password@localhost:5432/financial_db

# Airflow settings
AIRFLOW_HOME=/path/to/your/airflow
AIRFLOW__CORE__SQL_ALCHEMY_CONN=sqlite:////path/to/your/airflow/airflow.db
AIRFLOW__CORE__LOAD_EXAMPLES=False
  5. Initialize the database:
python manage.py migrate
python manage.py createsuperuser
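
The DATABASE_URL value in the .env file packs the user, password, host, port, and database name into a single string. As a rough illustration of how such a URL maps onto Django's DATABASES setting, here is a minimal sketch using only the standard library. The helper name database_settings_from_url is hypothetical and not part of this repository; the project may well use a dedicated package for this instead.

```python
from urllib.parse import unquote, urlparse


def database_settings_from_url(url: str) -> dict:
    """Translate a postgresql:// URL into a Django-style DATABASES entry.

    Hypothetical helper for illustration only; not taken from this project.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("postgres", "postgresql"):
        raise ValueError(f"unsupported database scheme: {parsed.scheme!r}")
    return {
        "ENGINE": "django.db.backends.postgresql",
        # The path component is "/<database name>".
        "NAME": parsed.path.lstrip("/"),
        # Credentials may be percent-encoded inside a URL.
        "USER": unquote(parsed.username or ""),
        "PASSWORD": unquote(parsed.password or ""),
        "HOST": parsed.hostname or "",
        "PORT": str(parsed.port or ""),
    }


if __name__ == "__main__":
    example = "postgresql://user:password@localhost:5432/financial_db"
    print(database_settings_from_url(example))
```

In a settings module, the result would typically be assigned as DATABASES = {"default": database_settings_from_url(os.environ["DATABASE_URL"])}, assuming the .env file has already been loaded into the environment.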

Running the Application

1. Start Airflow Services

Initialize Airflow database (first time only):

airflow db init

Create Airflow user (first time only):

airflow users create \
    --username admin \
    --firstname Admin \
    --lastname User \
    --role Admin \
    --email admin@example.com \
    --password admin

Start Airflow services:

# Start the web server (in a separate terminal)
airflow webserver -p 8081

# Start the scheduler (in another separate terminal)
airflow scheduler

Airflow UI will be available at: http://localhost:8081

2. Start Django Development Server

python manage.py runserver 8000

The web application will be available at: http://localhost:8000

Available URLs

Data Pipeline

The Airflow DAG (refresh_financial_data) runs daily at 6 AM and performs the following tasks:

  1. Checks FDIC website for new data
  2. Checks NCUA website for new data
  3. Generates a freshness report
  4. Updates database if new data is available
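
The decision logic behind these four steps can be sketched in plain Python. This is an illustration under assumptions, not the code in refresh_financial_data_dag.py: the FreshnessResult class, the function names, and the idea of comparing a "latest available" date with a "last loaded" date are all hypothetical. In the real DAG each step would presumably be an Airflow task, and a daily 6 AM schedule is commonly written as the cron expression shown below.

```python
from dataclasses import dataclass
from datetime import date

# Cron expression for "every day at 06:00" (assumed; the actual DAG may differ).
DAILY_AT_6AM = "0 6 * * *"


@dataclass
class FreshnessResult:
    """Outcome of checking one source (hypothetical structure)."""

    source: str             # e.g. "FDIC" or "NCUA"
    latest_available: date  # newest data date published by the source
    last_loaded: date       # newest data date already in the database

    @property
    def is_stale(self) -> bool:
        # The database is stale when the source has published newer data.
        return self.latest_available > self.last_loaded


def build_freshness_report(results: list[FreshnessResult]) -> str:
    """Step 3: summarize each source on one line."""
    lines = []
    for r in results:
        status = "UPDATE NEEDED" if r.is_stale else "up to date"
        lines.append(
            f"{r.source}: available {r.latest_available}, "
            f"loaded {r.last_loaded} -> {status}"
        )
    return "\n".join(lines)


def sources_to_update(results: list[FreshnessResult]) -> list[str]:
    """Step 4: only sources with newer data trigger a database update."""
    return [r.source for r in results if r.is_stale]


if __name__ == "__main__":
    # Steps 1 and 2 would fetch these dates from the FDIC and NCUA websites;
    # the values here are made up for the example.
    checks = [
        FreshnessResult("FDIC", date(2024, 12, 31), date(2024, 9, 30)),
        FreshnessResult("NCUA", date(2024, 9, 30), date(2024, 9, 30)),
    ]
    print(build_freshness_report(checks))
    print("Sources to update:", sources_to_update(checks))
```

Keeping the staleness check separate from the update step means the freshness report is produced on every run, even when nothing needs to be reloaded.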

Development

Running Tests

python manage.py test

Code Style

# Install development dependencies
pip install -r requirements-dev.txt

# Run linting
flake8 .

# Run type checking
mypy .

API Documentation

The application provides several API endpoints for data access:

  • /api/institutions/: List of financial institutions
  • /api/stats/: Quarterly statistics
  • /chat/query/: Natural language query endpoint
  • /chat/sql/: SQL query endpoint
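
A small client for these endpoints might look like the sketch below. It assumes the development server from the "Running the Application" section is listening on http://localhost:8000 and that the read-only endpoints answer GET requests with JSON; neither the response format nor the request format of the chat endpoints is documented here, so the fetch_json helper is a guess and the names endpoint_url and fetch_json are hypothetical.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumed base URL of the Django development server.
BASE_URL = "http://localhost:8000"

# Paths as listed in this README.
ENDPOINTS = {
    "institutions": "/api/institutions/",
    "stats": "/api/stats/",
    "chat_query": "/chat/query/",
    "chat_sql": "/chat/sql/",
}


def endpoint_url(name: str, base_url: str = BASE_URL) -> str:
    """Build the absolute URL for a named endpoint."""
    return urljoin(base_url, ENDPOINTS[name])


def fetch_json(name: str, base_url: str = BASE_URL, timeout: float = 10.0):
    """GET an endpoint and decode the body as JSON.

    Assumes the endpoint supports GET and returns JSON, which this README
    does not confirm.
    """
    with urlopen(endpoint_url(name, base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    for endpoint_name in ENDPOINTS:
        print(endpoint_name, "->", endpoint_url(endpoint_name))
```

With the server running, fetch_json("institutions") would be the natural first call to try; the chat endpoints most likely expect a POST body whose shape should be checked in prospects/views.py.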

Acknowledgments

  • FDIC for providing financial institution data
  • NCUA for providing credit union data
  • Apache Airflow team for the amazing workflow management platform
  • Django team for the robust web framework

Learning Resources

Essential Reading

  • Designing Data-Intensive Applications by Martin Kleppmann - The definitive guide to building reliable, scalable, and maintainable systems. A must-read for understanding the principles behind modern data systems.
