diff --git a/.DS_Store b/.DS_Store
new file mode 100644
index 000000000..9502f83a8
Binary files /dev/null and b/.DS_Store differ
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 000000000..9facff227
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,5 @@
+data/
+runs/
+ISIC2018_Task1-2_Test_Input/
+yolov8n-seg.pt
+yolov8n.pt
diff --git a/README.md b/README.md
deleted file mode 100644
index 3a10f6515..000000000
--- a/README.md
+++ /dev/null
@@ -1,19 +0,0 @@
-# Pattern Analysis
-Pattern Analysis of various datasets by COMP3710 students in 2024 at the University of Queensland.
-
-We create pattern recognition and image processing library for Tensorflow (TF), PyTorch or JAX.
-
-This library is created and maintained by The University of Queensland [COMP3710](https://my.uq.edu.au/programs-courses/course.html?course_code=comp3710) students.
-
-The library includes the following implemented in Tensorflow:
-* fractals
-* recognition problems
-
-In the recognition folder, you will find many recognition problems solved including:
-* segmentation
-* classification
-* graph neural networks
-* StyleGAN
-* Stable diffusion
-* transformers
-etc.
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/README.md b/recognition/46717481-ThanhTrungPham-ProjectRecognition/README.md
new file mode 100644
index 000000000..016c21cd0
--- /dev/null
+++ b/recognition/46717481-ThanhTrungPham-ProjectRecognition/README.md
@@ -0,0 +1,104 @@
+# YOLOv8 for Skin Lesion Detection: ISIC 2018 Challenge - Thanh Trung Pham (46717481)
+
+## Problem and algorithm
+
+- This repository applies YOLOv8, a fast and highly accurate object detection model (modified here to perform a segmentation task), to the ISIC 2018 Skin Lesion Challenge, which focuses on the diagnosis of dangerous skin conditions such as melanoma. This work leverages the strengths of the YOLO algorithm to segment skin lesions.
+
+![Alt text](figures/figure1.png?raw=true "YOLOv8 architecture")
+
+- The YOLO (You Only Look Once) algorithm for object detection first divides the input image into an S x S grid of cells; each cell is then responsible for detecting objects whose centers fall within it. Because YOLO processes the entire image in a single pass (it 'only looks once'), it is faster than multi-stage detectors such as R-CNN, making it ideal for real-time applications.
+
+- Each grid cell predicts several bounding boxes together with a confidence score for each box. A bounding box is defined by 'x' and 'y', the coordinates of the box center relative to the cell, and 'w' and 'h', the width and height of the box relative to the whole image. The confidence score is the product of the probability that an object is present and the IoU (Intersection over Union) between the predicted box and the ground truth. Since each cell can produce many bounding boxes, Non-Maximum Suppression (NMS) is applied to avoid detecting the same object multiple times: the boxes are sorted by confidence, the box with the highest score is selected, boxes with a high IoU against the selected box are removed, and the process repeats until no heavily overlapping (redundant) boxes remain, as shown in the sketch below.
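+
+For illustration only, here is a minimal, framework-agnostic sketch of the NMS idea described above (this is not Ultralytics' internal implementation; the `iou` helper and the 0.5 threshold are assumptions for the example):
+
+```python
+def iou(a, b):
+    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
+    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
+    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
+    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
+    area_a = (a[2] - a[0]) * (a[3] - a[1])
+    area_b = (b[2] - b[0]) * (b[3] - b[1])
+    return inter / (area_a + area_b - inter)
+
+def nms(boxes, scores, iou_threshold=0.5):
+    """Greedy NMS: keep the highest-scoring box, drop heavy overlaps, repeat."""
+    # Sort box indices by confidence score, highest first
+    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
+    keep = []
+    while order:
+        best = order.pop(0)  # highest-confidence remaining box
+        keep.append(best)
+        # Discard remaining boxes that overlap the chosen box too much
+        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
+    return keep
+```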
+
+- YOLO uses a CNN backbone for feature extraction and leverages 'feature fusion', in which features extracted at different scales are combined to 'see the bigger picture'. The YOLO loss is a combination of three terms: a bounding box loss that pushes predicted boxes toward the ground truth boxes; an 'objectness' (object presence) loss that penalizes false positives where YOLO detects an object (of any class) when there is only background (it measures the model's confidence that an object is present in a bounding box); and a classification loss that aligns the predicted class probabilities with the ground truth.
+
+- YOLOv8-segmentation goes a step further: instead of only predicting bounding boxes, it segments each individual object at the pixel level (assigning pixel-wise labels to objects in images). To do so, its architecture includes an additional branch that predicts segmentation masks for every instance detected by the bounding boxes.
+
+```
+YOLOv8 segmentation:
+├── Backbone: CSPDarknet
+├── Neck: PANet
+└── Head: Decoupled Detection Head
+    ├── Classification Branch
+    ├── Regression Branch
+    └── Segmentation (Mask) Branch
+```
+
+## Requirements
+
+```
+python>=3.8
+ultralytics==8.0.58
+opencv-python>=4.1.2
+albumentations==1.4
+scikit-learn
+scikit-build
+```
+
+## Preprocessing
+
+The dataset's ground truth labels are binary masks; however, YOLOv8-segmentation from Ultralytics expects a different label format, namely a polygon. OpenCV is therefore used to extract contours from each mask and convert them into polygons suitable for training (a condensed sketch of this conversion follows).
+
+The train-validation/test split is 80-20.
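+
+This sketch condenses the conversion implemented in `dataset.py` (it assumes OpenCV is installed and a single-class dataset; the `min_area` threshold of 200 matches the filtering used there):
+
+```python
+import cv2
+
+def mask_to_polygons(mask_path, min_area=200):
+    """Convert a binary mask into YOLO-style normalized polygons."""
+    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
+    _, mask = cv2.threshold(mask, 1, 255, cv2.THRESH_BINARY)
+    height, width = mask.shape
+    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
+    polygons = []
+    for contour in contours:
+        if cv2.contourArea(contour) > min_area:  # skip tiny speckles
+            # Flatten the contour points and normalize x by width, y by height
+            polygons.append([coord / dim
+                             for point in contour
+                             for coord, dim in zip(point[0], (width, height))])
+    return polygons
+```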
+
+## Results
+![Alt text](figures/results.png?raw=true "Training results")
+
+### Model Training Observations
+The training process of the fine-tuned YOLOv8 model was monitored over 30 epochs, during which a steady decline in the loss function was observed. This indicates that the model effectively minimized the error and adapted well to the underlying patterns in the training data. The decreasing trend of the loss function suggests a successful learning process, as the model adjusted its weights to better fit the provided examples.
+
+### Evaluation Metric: mAP50-95 (Mean Average Precision)
+The key evaluation metric used to assess the model's performance is the mean Average Precision (mAP50-95). This metric measures the model's ability to correctly identify and segment objects across a range of Intersection over Union (IoU) thresholds, from 0.5 to 0.95 with a step size of 0.05. The IoU threshold determines the extent of overlap required between the predicted mask and the ground truth mask for a prediction to be considered a true positive:
+
+- At an IoU threshold of 0.5, predictions need to overlap the ground truth by at least 50%.
+- At an IoU threshold of 0.95, the requirement is much stricter: a 95% overlap is needed for a correct prediction.
+
+By evaluating the model across multiple IoU thresholds, mAP50-95 provides a comprehensive measure of the model's performance, capturing both precision (correctness of the predictions) and recall (coverage of all relevant instances). A small example of the IoU quantity is sketched below.
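+
+As a hedged illustration of the IoU quantity this metric sweeps over (this is not the Ultralytics evaluation code), mask IoU can be computed directly from two binary masks with NumPy:
+
+```python
+import numpy as np
+
+def mask_iou(pred, truth):
+    """IoU of two binary masks given as boolean NumPy arrays of equal shape."""
+    intersection = np.logical_and(pred, truth).sum()
+    union = np.logical_or(pred, truth).sum()
+    return intersection / union if union > 0 else 0.0
+
+# A prediction counts as a true positive at threshold t when mask_iou(...) >= t;
+# mAP50-95 averages the resulting AP over t = 0.50, 0.55, ..., 0.95.
+```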
+
+### Performance Results
+The fine-tuned YOLOv8 model achieved a mean Average Precision (mAP50-95) of 0.7318 on the test set after 30 epochs. This score reflects the model's strong capability in accurately detecting and segmenting the target objects. The relatively high mAP score across a wide range of IoU thresholds indicates that the model is not only making accurate predictions but is also robust to variations in the overlap requirement, showcasing its generalization capability across different levels of object localization precision.
+
+These are some detections predicted by the fine-tuned YOLOv8-segmentation, visualized below.
+![Alt text](figures/prediction_test.jpg?raw=true "Sample prediction 1")
+![Alt text](figures/prediction_test2.jpg?raw=true "Sample prediction 2")
+
+## Reproducibility
+
+In order to reproduce the results (see the sketch after this list):
+- Use a train-test split of 80%-20%, random state 40.
+- Use the YOLOv8n-seg model from Ultralytics.
+- Train for 30 epochs.
+- Use the library versions specified in the 'Requirements' section.
+- Use a batch size of 4.
+- Use a learning rate of 0.01 (the default).
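+
+These settings correspond to a training call like the following minimal sketch (the project's actual entry point is `train.py`; the device and run name are omitted here for brevity):
+
+```python
+from ultralytics import YOLO
+
+model = YOLO('yolov8n-seg.pt')  # pretrained YOLOv8-nano segmentation weights
+model.train(
+    data='dataset.yaml',  # dataset config (paths + class names)
+    epochs=30,            # as reported in the Results section
+    imgsz=640,            # input image size
+    batch=4,              # batch size
+    lr0=0.01,             # initial learning rate (Ultralytics default)
+)
+```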
+
+## To run the algorithm
+Run `./test_script.sh` to run the algorithm. Note that the scripts assume the dataset sits where it is located on Rangpur and that the training working directory is '/home/Student/s4671748/comp3710-project/'. If either location differs, 'dataset.py' (data loading) and 'train.py' must be changed accordingly for the pipeline to run properly.
+
+## Acknowledgments
+
+- ISIC 2018 Challenge organizers
+- Ultralytics for YOLOv8
+- Contributors to the ISIC archive
+
+## Conclusion
+
+This work showed the application of YOLOv8-segmentation to skin lesion detection using the ISIC 2018 Challenge dataset. The model achieved a mAP50-95 score of 0.7318 on the test set, suggesting strong performance in accurately segmenting skin lesions across various IoU thresholds. The implementation leverages YOLOv8's architectural advantages, including its efficient single-pass detection approach, feature fusion capabilities, and dedicated segmentation branch.
+
+## Future Improvements
+
+Several potential enhancements could further improve the algorithm's performance and utility:
+
+1. **Data Augmentation Enhancement**
+   - Implement more augmentation techniques to generate additional data
+   - Include domain-specific transformations that reflect real variations in skin lesion appearance
+   - Introduce synthetic data generation to address class imbalance issues
+
+2. **Performance Optimization**
+   - Fine-tune hyperparameters using advanced search techniques
+
+3. **Validation and Testing**
+   - Expand testing to multiple external datasets
+   - Add metrics specific to medical imaging evaluation
+
+These improvements would enhance both the technical performance and practical utility of the system, making it a more valuable tool for dermatological diagnosis support.
\ No newline at end of file
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/dataset.py b/recognition/46717481-ThanhTrungPham-ProjectRecognition/dataset.py
new file mode 100644
index 000000000..98fd73241
--- /dev/null
+++ b/recognition/46717481-ThanhTrungPham-ProjectRecognition/dataset.py
@@ -0,0 +1,139 @@
+import os
+import shutil
+import cv2  # OpenCV, needed by generate_labels() to extract contours from the masks
+from sklearn.model_selection import train_test_split
+from pathlib import Path
+
+def move_dataset():
+    # Source directories of data
+    input_dir = '/home/groups/comp3710/ISIC2018/ISIC2018_Task1-2_Training_Input_x2'
+    mask_dir = '/home/groups/comp3710/ISIC2018/ISIC2018_Task1_Training_GroundTruth_x2'
+
+    # Create directories to move data to
+    for dir_path in ['data/images/train', 'data/images/val', 'data/masks/train', 'data/masks/val']:
+        Path(dir_path).mkdir(parents=True, exist_ok=True)
+
+    # Get list of image files (.jpg)
+    image_files = []
+    for filename in os.listdir(input_dir):
+        if filename.endswith('.jpg'):
+            image_files.append(filename)
+
+    # Split the dataset into train and validation sets (80-20 split)
+    train_files, val_files = train_test_split(
+        image_files,
+        test_size=0.2,
+        random_state=40
+    )
+
+    # Helper used to get the mask's file name from the image's name
+    def get_mask_filename(image_filename):
+        return image_filename.replace('.jpg', '_segmentation.png')
+
+    # Copy training files
+    for filename in train_files:
+        # Copy input image
+        shutil.copy2(
+            os.path.join(input_dir, filename),
+            os.path.join('data/images/train', filename)
+        )
+        # Copy mask
+        mask_filename = get_mask_filename(filename)
+        shutil.copy2(
+            os.path.join(mask_dir, mask_filename),
+            os.path.join('data/masks/train', mask_filename)
+        )
+
+    # Copy validation files
+    for filename in val_files:
+        # Copy input image
+        shutil.copy2(
+            os.path.join(input_dir, filename),
+            os.path.join('data/images/val', filename)
+        )
+        # Copy mask
+        mask_filename = get_mask_filename(filename)
+        shutil.copy2(
+            os.path.join(mask_dir, mask_filename),
+            os.path.join('data/masks/val', mask_filename)
+        )
+
+def generate_labels():
+    # Create the directories to store the ground truth label txt files
+    os.makedirs('data/labels/train', exist_ok=True)
+    os.makedirs('data/labels/val', exist_ok=True)
+
+    # The directories that store the binary mask images and the directories to store the ground truth label txt files
+    input_dir_train = './data/masks/train'
+    output_dir_train = './data/labels/train'
+    input_dir_val = './data/masks/val'
+    output_dir_val = './data/labels/val'
+    dirs_pairs = [[input_dir_train, output_dir_train], [input_dir_val, output_dir_val]]
+
+    # Create the ground truth label txt files for the training and validation sets to suit Ultralytics' YOLO model
+    for input_dir, output_dir in dirs_pairs:
+        for file_name in os.listdir(input_dir):
+            image_path = os.path.join(input_dir, file_name)
+
+            # Read the binary mask image in order to retrieve the contours;
+            # Ultralytics' YOLO model requires the mask to be in the format of a polygon
+            mask = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
+            _, mask = cv2.threshold(mask, 1, 255, cv2.THRESH_BINARY)
+
+            # Find the contours
+            height, width = mask.shape
+            contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
+
+            # Convert the contours to polygons
+            polygons_list = []
+            for contour in contours:
+                if cv2.contourArea(contour) > 200:
+                    polygon = []
+                    for point in contour:
+                        x, y = point[0]
+                        # Normalize the points
+                        polygon.append(x / width)
+                        polygon.append(y / height)
+                    polygons_list.append(polygon)
+
+            # Write the polygons into the txt file
+            polygon_file_name = f"{os.path.splitext(os.path.join(output_dir, file_name))[0]}.txt"
+
+            with open(polygon_file_name, 'w') as file:
+                for polygon in polygons_list:
+                    for index, p in enumerate(polygon):
+                        if index == len(polygon) - 1:
+                            file.write('{}\n'.format(p))
+                        elif index == 0:
+                            # Each label line starts with the class id (0 = lesion)
+                            file.write('0 {} '.format(p))
+                        else:
+                            file.write('{} '.format(p))
+
+def rename_groundtruth():
+    # List of directories to process
+    directories = ['data/labels/train', 'data/labels/val']
+
+    # Process each directory
+    for directory in directories:
+        # Get all files in directory
+        files = os.listdir(directory)
+
+        # Go through each file
+        for filename in files:
+            if '_segmentation.txt' in filename:
+                # Create the new filename by dropping the '_segmentation' suffix
+                new_filename = filename.replace('_segmentation.txt', '.txt')
+
+                # Generate full paths to do the rename
+                old_path = os.path.join(directory, filename)
+                new_path = os.path.join(directory, new_filename)
+
+                # Rename file
+                os.rename(old_path, new_path)
+
+if __name__ == "__main__":
+    move_dataset()
+    generate_labels()
+    rename_groundtruth()
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/dataset.yaml b/recognition/46717481-ThanhTrungPham-ProjectRecognition/dataset.yaml
new file mode 100644
index 000000000..ef17519f6
--- /dev/null
+++ b/recognition/46717481-ThanhTrungPham-ProjectRecognition/dataset.yaml
@@ -0,0 +1,6 @@
+names:
+- lesion
+nc: 1
+path: /home/Student/s4671748/comp3710-project/data
+train: images/train
+val: images/val
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/figures/figure1.png b/recognition/46717481-ThanhTrungPham-ProjectRecognition/figures/figure1.png
new file mode 100644
index 000000000..acecd627f
Binary files /dev/null and b/recognition/46717481-ThanhTrungPham-ProjectRecognition/figures/figure1.png differ
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/figures/prediction_test.jpg b/recognition/46717481-ThanhTrungPham-ProjectRecognition/figures/prediction_test.jpg
new file mode 100644
index 000000000..62a90bc13
Binary files /dev/null and b/recognition/46717481-ThanhTrungPham-ProjectRecognition/figures/prediction_test.jpg differ
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/figures/prediction_test2.jpg b/recognition/46717481-ThanhTrungPham-ProjectRecognition/figures/prediction_test2.jpg
new file mode 100644
index 000000000..2e9fa6a88
Binary files /dev/null and b/recognition/46717481-ThanhTrungPham-ProjectRecognition/figures/prediction_test2.jpg differ
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/figures/results.png b/recognition/46717481-ThanhTrungPham-ProjectRecognition/figures/results.png
new file mode 100644
index 000000000..006720c7f
Binary files /dev/null and b/recognition/46717481-ThanhTrungPham-ProjectRecognition/figures/results.png differ
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/modules.py b/recognition/46717481-ThanhTrungPham-ProjectRecognition/modules.py
new file mode 100644
index 000000000..8fc60a905
--- /dev/null
+++ b/recognition/46717481-ThanhTrungPham-ProjectRecognition/modules.py
@@ -0,0 +1,77 @@
+# Import YOLO from the ultralytics package
+# Ultralytics YOLO is an updated, Python-native implementation
+from ultralytics import YOLO
+
+class YOLOSegmentation:
+    """
+    A wrapper class for the YOLO segmentation model that provides a simplified
+    interface for training, evaluation, and prediction tasks.
+
+    This class encapsulates common YOLO operations and provides a clean API
+    for the main tasks in computer vision: training, evaluation, and inference.
+    """
+
+    def __init__(self, weights_path):
+        """
+        Initialize the YOLO model with the specified weights.
+
+        Args:
+            weights_path (str): Path to the model weights file.
+                Can be either pretrained weights (e.g., 'yolov8n-seg.pt')
+                or custom trained weights.
+        """
+        # Create a YOLO model instance using the provided weights
+        self.model = YOLO(weights_path)
+
+    def train(self, params):
+        """
+        Train the YOLO model with the given parameters.
+
+        Args:
+            params (dict): Dictionary containing training parameters such as:
+                - epochs: number of training epochs
+                - batch: batch size for training
+                - data: path to the data configuration file
+                - imgsz: input image size
+                and other training configurations.
+
+        Returns:
+            results: Training results and metrics
+        """
+        # Unpack the parameters dictionary and train the model
+        results = self.model.train(**params)
+        return results
+
+    def evaluate(self):
+        """
+        Evaluate the model on the validation dataset.
+
+        This method runs validation on the dataset specified
+        in the data configuration file used during training.
+
+        Returns:
+            results: Validation metrics including mAP, precision, recall
+        """
+        # Run validation and return the metrics
+        results = self.model.val()
+        return results
+
+    def predict(self, img, conf):
+        """
+        Perform segmentation prediction on an input image.
+
+        Args:
+            img: Input image (can be a path or a numpy array)
+            conf (float): Confidence threshold for predictions.
+                Only predictions above this threshold are returned.
+
+        Returns:
+            results: Model predictions including:
+                - Segmentation masks
+                - Bounding boxes
+                - Confidence scores
+                - Class predictions
+        """
+        # Run prediction with the specified confidence threshold
+        results = self.model.predict(img, conf=conf)
+        return results
\ No newline at end of file
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/predict.py b/recognition/46717481-ThanhTrungPham-ProjectRecognition/predict.py
new file mode 100644
index 000000000..e9f13ae47
--- /dev/null
+++ b/recognition/46717481-ThanhTrungPham-ProjectRecognition/predict.py
@@ -0,0 +1,43 @@
+# Import required modules
+from modules import YOLOSegmentation  # Custom YOLO segmentation module
+import random  # For generating random colors
+import cv2  # OpenCV for image processing
+import numpy as np  # For numerical operations
+
+# Initialize the YOLO model with trained weights
+# 'best.pt' contains the weights from the best performing epoch during training
+model = YOLOSegmentation("runs/segment/train2/weights/best.pt")
+
+# Load the input image for prediction
+# This reads a specific image from the training dataset
+img = cv2.imread("data/images/train/ISIC_0015071.jpg")
+
+# Set the confidence threshold for predictions
+# The model will only return predictions with confidence > 0.4
+conf = 0.4
+
+# Perform prediction on the image
+# results will contain the detected objects, their bounding boxes, and segmentation masks
+results = model.predict(img, conf)
+
+# Generate a random RGB color for visualization
+# Creates a list of 3 random integers between 0-255 for the RGB values
+color = random.choices(range(256), k=3)
+
+# Process each prediction result
+for result in results:
+    # Iterate through the corresponding masks and bounding boxes
+    for mask, box in zip(result.masks.segments, result.boxes):
+        # Convert mask coordinates to image coordinates:
+        # 1. mask contains coordinates normalized to [0, 1] relative to the image
+        # 2. scale x by the image width and y by the image height to get pixel coordinates
+        # 3. convert to integer coordinates for drawing
+        h, w = img.shape[:2]
+        points = np.int32([np.float64(mask) * [w, h]])
+
+        # Draw the segmentation mask on the image
+        # fillPoly fills the area defined by points with the random color
+        cv2.fillPoly(img, points, color)
+
+# Save the annotated image
+# The output shows the original image with colored segmentation masks
+cv2.imwrite("prediction_test.jpg", img)
\ No newline at end of file
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/requirements.txt b/recognition/46717481-ThanhTrungPham-ProjectRecognition/requirements.txt
new file mode 100644
index 000000000..d627b35da
--- /dev/null
+++ b/recognition/46717481-ThanhTrungPham-ProjectRecognition/requirements.txt
@@ -0,0 +1,6 @@
+python>=3.8
+ultralytics==8.0.58
+opencv-python>=4.1.2
+albumentations==1.4
+scikit-learn
+scikit-build
\ No newline at end of file
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/slurm_script.sh b/recognition/46717481-ThanhTrungPham-ProjectRecognition/slurm_script.sh
new file mode 100644
index 000000000..73a5183fe
--- /dev/null
+++ b/recognition/46717481-ThanhTrungPham-ProjectRecognition/slurm_script.sh
@@ -0,0 +1,26 @@
+#!/bin/bash
+#SBATCH --time=0-03:30:00                    # Time limit: 3 hours 30 minutes
+#SBATCH --nodes=1                            # Run on a single node
+#SBATCH --ntasks-per-node=1                  # Single task per node
+#SBATCH --gres=gpu:1
+
+#SBATCH --partition=comp3710                 # GPU partition
+#SBATCH --account=comp3710
+#SBATCH --job-name="yolov8-victor"           # Job name
+#SBATCH --mail-user=phamtrung0633@email.com  # Email address for notifications
+#SBATCH --mail-type=BEGIN                    # Email at start
+#SBATCH --mail-type=END                      # Email at completion
+#SBATCH --mail-type=FAIL                     # Email on failure
+#SBATCH --output=yolo_train_%j.out           # Output file
+
+# Print job information
+echo "Job ID: $SLURM_JOB_ID"
+echo "Node: $SLURM_JOB_NODELIST"
+echo "Start Time: $(date)"
+
+# Set up the environment
+source /home/Student/s4671748/.bashrc
+# Run the training script
+python train.py
+
+echo "End Time: $(date)"
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/test_script.sh b/recognition/46717481-ThanhTrungPham-ProjectRecognition/test_script.sh
new file mode 100644
index 000000000..cb2afec96
--- /dev/null
+++ b/recognition/46717481-ThanhTrungPham-ProjectRecognition/test_script.sh
@@ -0,0 +1,13 @@
+#!/bin/bash
+
+# Install requirements
+pip install -r requirements.txt
+
+# Preprocess the dataset
+python3 dataset.py
+
+# Run the training
+python3 train.py
+
+# Run the test to see a sample of the prediction
+python3 predict.py
\ No newline at end of file
diff --git a/recognition/46717481-ThanhTrungPham-ProjectRecognition/train.py b/recognition/46717481-ThanhTrungPham-ProjectRecognition/train.py
new file mode 100644
index 000000000..37367bbdc
--- /dev/null
+++ b/recognition/46717481-ThanhTrungPham-ProjectRecognition/train.py
@@ -0,0 +1,64 @@
+# Import required modules
+from modules import YOLOSegmentation  # Custom module for YOLO segmentation tasks
+import yaml  # For reading/writing YAML configuration files
+
+def create_dataset_yaml():
+    """
+    Creates a YAML configuration file for the dataset structure.
+    This file tells YOLOv8 where to find the training and validation data.
+    """
+    yaml_content = {
+        'path': '/home/Student/s4671748/comp3710-project/data',  # Root directory of the dataset
+        'train': 'images/train',  # Directory containing training images (relative to path)
+        'val': 'images/val',  # Directory containing validation images (relative to path)
+        'names': ['lesion'],  # List of class names (in this case, only 'lesion')
+        'nc': 1  # Number of classes to detect/segment
+    }
+
+    # Write the configuration to the dataset.yaml file
+    with open('dataset.yaml', 'w') as f:
+        yaml.dump(yaml_content, f)
+
+def main():
+    """
+    Main function that orchestrates the training process:
+    1. Creates the dataset configuration
+    2. Initializes the model
+    3. Sets the training parameters
+    4. Executes training
+    5. Evaluates the model
+    """
+
+    # Create the dataset configuration file
+    create_dataset_yaml()
+
+    # Initialize the YOLOv8 model for segmentation
+    # 'yolov8n-seg.pt' is the nano (smallest) version of the YOLOv8 segmentation model
+    model = YOLOSegmentation('yolov8n-seg.pt')
+
+    # Define the training arguments/hyperparameters
+    training_args = {
+        'data': 'dataset.yaml',  # Path to the dataset configuration file
+        'epochs': 30,  # Number of training epochs (30, matching the README's reproducibility settings)
+        'imgsz': 640,  # Input image size
+        'batch': 4,  # Batch size for training
+        'device': 0,  # GPU device index (0 = first GPU)
+        'name': 'isic2018_run_victor',  # Name of the training run
+        'save': True,  # Save checkpoints so predict.py can load the best weights
+        'cache': False,  # Whether to cache images in memory
+    }
+
+    # Start the training process using the defined parameters
+    results = model.train(training_args)
+
+    # Evaluate the trained model on the validation set
+    model.evaluate()
+
+# Standard Python idiom to ensure that main() only runs
+# when the script is executed directly (not imported as a module)
+if __name__ == "__main__":
+    main()
\ No newline at end of file
diff --git a/recognition/README.md b/recognition/README.md
new file mode 100644
index 000000000..32c99e899
--- /dev/null
+++ b/recognition/README.md
@@ -0,0 +1,10 @@
+# Recognition Tasks
+Various recognition tasks solved in deep learning frameworks.
+
+Tasks may include:
+* Image segmentation
+* Object detection
+* Graph node classification
+* Image super resolution
+* Disease classification
+* Generative modelling with StyleGAN and Stable Diffusion
\ No newline at end of file