Clustering Algorithm Implementation and Visualization from Scratch with Python

Overview

This project implements four popular clustering algorithms from scratch in Python. The implementations are designed to work on datasets with d >= 2 dimensions and k >= 2 clusters. They are tested on 2D datasets and compared visually with scikit-learn's equivalents to evaluate their correctness and performance.

Implemented Clustering Algorithms

K-Means Clustering
Gaussian Mixture Model (GMM) using Expectation-Maximization (EM)
Mean-Shift Clustering
Agglomerative Clustering
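
To illustrate what "from scratch" means for the first algorithm in this list, here is a minimal sketch of K-Means (Lloyd's algorithm) using only NumPy. It is not the code in KMeans.py; the function name `kmeans`, its parameters, and the random-point initialisation are assumptions made for this example.

```python
import numpy as np


def kmeans(X, k, n_iters=100, tol=1e-6, seed=0):
    """Lloyd's algorithm sketch: returns (centroids, labels)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed initialisation: k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]

    def assign(points, centers):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iters):
        labels = assign(X, centroids)
        # Update step: move each centroid to the mean of its points;
        # an empty cluster keeps its previous centroid.
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break

    # Final assignment against the final centroids.
    return centroids, assign(X, centroids)
```

Nothing in the sketch depends on the data being 2D: `X` can have any number of columns, which matches the d >= 2 and k >= 2 design goal stated in the overview.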

Python Implementations

KMeans.py: K-Means clustering.
KMeans_Ver0.py: K-Means clustering (2nd version).
GaussianMM.py: EM-GMM.
GaussianMM_Ver0.py: EM-GMM with AIC, BIC, and predict functions (2nd version).
MeanShift.py: Mean-Shift clustering.
Agglomerative.py: Agglomerative clustering.
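
The AIC and BIC functions mentioned for GaussianMM_Ver0.py are model-selection scores computed from a fitted model's log-likelihood. The sketch below shows the standard definitions. It assumes full covariance matrices when counting free parameters, which may differ from what GaussianMM_Ver0.py does, and the function names are hypothetical.

```python
import math


def gmm_n_parameters(k, d):
    """Free parameters of a k-component GMM in d dimensions,
    assuming a full covariance matrix per component."""
    weights = k - 1                       # mixing weights sum to 1
    means = k * d                         # one mean vector per component
    covariances = k * d * (d + 1) // 2    # symmetric d x d matrices
    return weights + means + covariances


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2p - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: p ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_samples) - 2 * log_likelihood
```

Both scores penalise the parameter count, so fitting the model for several values of k and keeping the k with the lowest score is a common way to choose the number of components.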

Evaluations and Tests

test_2d_visualization.py: Tests each implementation on 2D datasets with visualization, comparing the results to scikit-learn's equivalent algorithms.
data_2d_test/: Contains the datasets used for testing.
test_2d_visualization_results/: Stores the output images of the clustering results.
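
The comparison in this project is visual. As a possible numeric complement, the sketch below measures how closely two labelings agree. Cluster ids are arbitrary (cluster 0 in one implementation may be cluster 1 in another), so the score is maximised over relabelings. The helper `label_agreement` is hypothetical and is not part of test_2d_visualization.py.

```python
from itertools import permutations

import numpy as np


def label_agreement(labels_a, labels_b):
    """Fraction of points on which two labelings agree, maximised over
    relabelings of labels_b. Brute force over permutations, so only
    suitable for a small number of clusters."""
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    ids_a, inv_a = np.unique(labels_a, return_inverse=True)
    ids_b, inv_b = np.unique(labels_b, return_inverse=True)
    size = max(len(ids_a), len(ids_b))
    # contingency[i, j] = number of points with label i in a and label j in b.
    contingency = np.zeros((size, size), dtype=int)
    np.add.at(contingency, (inv_a, inv_b), 1)
    # Try every one-to-one matching of cluster ids and keep the best.
    best = max(
        sum(contingency[i, perm[i]] for i in range(size))
        for perm in permutations(range(size))
    )
    return best / len(labels_a)
```

For example, `labels_a` could come from one of the from-scratch implementations and `labels_b` from `sklearn.cluster.KMeans(n_clusters=k).fit_predict(X)`. A score of 1.0 means the two partitions are identical up to renaming of the clusters.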

Visualization Results

Blobs Dataset

[Images: Agglomerative, EM-GMM, K-Means, and Mean-Shift results on the Blobs dataset, My Implementation vs. Scikit-learn]

Moons and Stars Dataset

[Images: Agglomerative, EM-GMM, K-Means, and Mean-Shift results on the Moons and Stars dataset, My Implementation vs. Scikit-learn]

Sticks Dataset

[Images: Agglomerative, EM-GMM, K-Means, and Mean-Shift results on the Sticks dataset, My Implementation vs. Scikit-learn]

DolbyUUU/clustering_algorithm_implementation_python
