This repository contains a collection of data science and machine learning projects exploring topics in
finance, analytics, clustering, classification, and real-world pattern discovery.
Each project is organized in its own folder with notebooks, datasets, and documentation.
Explores short-term trading strategies using:
- Rule-based intraday inertia signals
- Weekly machine learning labeling (k-NN, SVM, Random Forest, etc.)
- K-means clustering and Hamming distance analysis across Dow Jones stocks
Includes trading simulations, model comparisons, and visual insights.
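
As a rough illustration of the Hamming distance analysis listed above, the sketch below compares weekly label sequences for a few stocks and finds the most similar pair. The tickers, the label encoding (1 for a "good" week, 0 for a "bad" week), and the function names are assumptions made for this example and are not taken from the project notebooks.

```python
from itertools import combinations


def hamming_distance(labels_a, labels_b):
    """Count the positions where two equal-length label sequences differ."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    return sum(a != b for a, b in zip(labels_a, labels_b))


def closest_pair(labels_by_ticker):
    """Return (ticker_a, ticker_b, distance) for the pair with the smallest distance."""
    best = None
    for a, b in combinations(sorted(labels_by_ticker), 2):
        d = hamming_distance(labels_by_ticker[a], labels_by_ticker[b])
        if best is None or d < best[2]:
            best = (a, b, d)
    return best


# Hypothetical weekly labels: 1 = "good" week, 0 = "bad" week
weekly_labels = {
    "AAA": [1, 0, 1, 1, 0, 1],
    "BBB": [1, 0, 0, 1, 0, 1],
    "CCC": [0, 1, 0, 0, 1, 0],
}

print(closest_pair(weekly_labels))
```

A small distance means two stocks tended to have good and bad weeks at the same time over the period compared.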
Using the UCI Banknote Authentication Dataset:
- Exploratory pairplots for real vs. fake notes
- Simple rule-based classifier
- Optimized k-NN model (k=3, ~99.7% accuracy)
- Logistic Regression with coefficient interpretation
- Feature-importance analysis through exclusion tests
Focuses on understanding classifier behavior and feature influence.
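
To make the k-NN step concrete, here is a minimal pure-Python sketch of majority-vote classification with k=3. The two-feature toy points and the 0/1 label encoding are invented for illustration and are not rows from the UCI dataset; the notebook's actual implementation may differ (for example, scikit-learn's `KNeighborsClassifier` provides the same behavior).

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training points."""
    # Sort training points by Euclidean distance to the query and keep the k closest.
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two features per note, label 0 or 1 (encoding assumed)
train_X = [(3.0, 8.0), (2.5, 6.5), (4.0, 7.0), (-3.0, -2.0), (-2.0, -4.0), (-4.0, -1.0)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (2.8, 7.0)))
print(knn_predict(train_X, train_y, (-2.5, -3.0)))
```

An odd k such as 3 avoids tied votes in a two-class problem, which is one reason small odd values are common starting points when tuning k.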
Analysis of the Kaggle Food Nutrition dataset:
- Nutrient distributions across food categories
- Rule-based nutrient classifier
- Correlation analysis of vitamins/minerals
- Visual summaries and descriptive reporting
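
The correlation analysis above can be sketched with a plain Pearson coefficient. The nutrient names and values below are made up for illustration and do not come from the Kaggle dataset; with Pandas, `DataFrame.corr()` computes the same statistic for every pair of columns.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical nutrient amounts for four foods (values invented for the example)
iron = [1.0, 2.0, 3.0, 4.0]
zinc = [2.0, 4.0, 6.0, 8.0]
vitamin_c = [8.0, 6.0, 4.0, 2.0]

print(pearson(iron, zinc))
print(pearson(iron, vitamin_c))
```

Values near 1 or -1 indicate nutrients that rise or fall together across foods, while values near 0 suggest no linear relationship.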
Tools and libraries used across the projects:
- Python, Pandas, NumPy
- scikit-learn, SciPy
- Matplotlib, Seaborn, Plotly
- Jupyter Notebook, Quarto
- Version control via GitHub
Each project folder contains:
- Jupyter notebooks (.ipynb)
- Supporting datasets (CSV)
- Visualizations
- README files with explanations and findings