Binary file added .DS_Store
Binary file not shown.
5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
data/
runs/
ISIC2018_Task1-2_Test_Input/
yolov8n-seg.pt
yolov8n.pt
19 changes: 0 additions & 19 deletions README.md

This file was deleted.

104 changes: 104 additions & 0 deletions recognition/46717481-ThanhTrungPham-ProjectRecognition/README.md
@@ -0,0 +1,104 @@
# YOLOv8 for Skin Lesion Detection: ISIC 2018 Challenge - Thanh Trung Pham (46717481)

## Problem and algorithm

- This repository applies YOLOv8, a highly accurate and fast object detection model (modified here to perform segmentation), to the ISIC 2018 Skin Lesion Challenge, which focuses on the diagnosis of dangerous skin conditions such as melanoma. The work leverages the strengths of the YOLO algorithm to segment skin lesions.

![Alt text](figures/figure1.png?raw=true "YOLOv8 architecture")

- YOLO (You Only Look Once) performs object detection by first dividing the input image into an S x S grid of cells; each cell is then responsible for detecting objects whose centers fall within it. Because it processes the entire image in a single pass ('only looks once'), YOLO is faster than region-proposal methods such as R-CNN, which makes it well suited to real-time applications.

- Each grid cell predicts several bounding boxes along with a confidence score for each. A bounding box is defined by 'x' and 'y', the coordinates of its center relative to the cell, and 'w' and 'h', its width and height relative to the whole image. The confidence score is the product of the probability that an object is present and the IoU (Intersection over Union) between the predicted box and the ground truth. Because each cell can produce many overlapping boxes, Non-Maximum Suppression (NMS) is applied to remove redundant detections of the same object: the boxes are sorted by confidence, the highest-scoring box is kept, boxes with a high IoU against it are discarded, and the process repeats until no heavily overlapping boxes remain (a minimal NMS sketch is shown after the architecture outline below).

- YOLO uses a CNN backbone for feature extraction and employs 'feature fusion', combining features extracted at different scales to 'see the bigger picture'. Its loss is a combination of three terms: a bounding-box loss that pushes predicted boxes towards the ground-truth boxes; an 'object presence/objectness' loss that penalizes false positives where the model detects an object (of any class) over pure background, measuring its confidence that an object is present in a box; and a classification loss that aligns the predicted class probabilities with the ground truth.

- YOLOv8-segmentation goes a step further: instead of only predicting bounding boxes, it segments each detected object at the pixel level (assigning pixel-wise labels to objects in images). To do this, its architecture includes an additional branch that produces a segmentation mask for every instance detected by the bounding boxes.

```
YOLOv8 segmentation:
├── Backbone: CSPDarknet
├── Neck: PANet
└── Head: Decoupled Detection Head
    ├── Classification Branch
    ├── Regression Branch
    └── Segmentation (Mask) Branch
```
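
To make the NMS procedure described above concrete, here is a minimal sketch (illustrative only; Ultralytics ships its own optimized implementation, and the helper names here are assumptions):

```
def iou(a, b):
    # Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2).
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, iou_threshold=0.5):
    # Sort by confidence, keep the best box, drop boxes overlapping it heavily, repeat.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```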

## Requirements

```
python>=3.8
ultralytics==8.0.58
opencv-python>=4.1.2
albumentations==1.4
scikit-learn
scikit-build
```
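
One way to install these dependencies (assuming a Python >= 3.8 environment is already set up):

```
pip install ultralytics==8.0.58 "opencv-python>=4.1.2" albumentations==1.4 scikit-learn scikit-build
```
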
## Preprocessing

The dataset's ground-truth labels are provided as binary masks; however, YOLOv8-segmentation from Ultralytics expects labels in a different format, namely polygons. OpenCV is therefore used to extract contours from each mask and convert them into polygons that can be used for training.

The train/validation split is 80/20, with the validation set also serving as the test set.
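
For reference, each line of a generated label file follows the Ultralytics segmentation format: a class index followed by the normalized polygon coordinates. A hypothetical line for the single 'lesion' class (index 0; the coordinate values here are invented for illustration):

```
# <class> x1 y1 x2 y2 x3 y3 ...  (all coordinates normalized to [0, 1])
0 0.41 0.30 0.46 0.31 0.47 0.35 0.42 0.36
```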

## Results
![Alt text](figures/results.png?raw=true "Training results")

### Model Training Observations

The training process of the fine-tuned YOLOv8 model was monitored over 30 epochs, during which a steady decline in the loss function was observed. This indicates that the model effectively minimized the error and adapted well to the underlying patterns in the training data. The decreasing trend of the loss function suggests a successful learning process, as the model adjusted its weights to better fit the provided examples.

### Evaluation Metric: mAP50-95 (Mean Average Precision)

The key evaluation metric used to assess the model's performance is the mean Average Precision (mAP50-95). This metric measures the model's ability to correctly identify and segment objects across a range of Intersection over Union (IoU) thresholds, from 0.5 to 0.95 with a step size of 0.05. The IoU threshold determines the extent of overlap required between the predicted mask and the ground truth mask for a prediction to be considered a true positive:

- At an IoU threshold of 0.5, predictions need to overlap the ground truth by at least 50%.
- At an IoU threshold of 0.95, the requirement is much stricter, demanding a 95% overlap for a correct prediction.

By evaluating the model across multiple IoU thresholds, mAP50-95 provides a comprehensive measure of the model's performance, capturing both precision (correctness of the predictions) and recall (coverage of all relevant instances).
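
As an illustration, the IoU between a predicted and a ground-truth binary mask can be computed as follows (a simple NumPy sketch, not the metric code Ultralytics uses internally):

```
import numpy as np

def mask_iou(pred_mask, gt_mask):
    # Both masks are boolean arrays of the same shape.
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return intersection / union if union > 0 else 0.0

# A prediction counts as a true positive at threshold t when mask_iou(...) >= t;
# mAP50-95 averages the AP over t = 0.50, 0.55, ..., 0.95.
```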

### Performance Results

The fine-tuned YOLOv8 model achieved a mean Average Precision (mAP50-95) of 0.7318 on the test set after 30 epochs. This score reflects the model's strong capability in accurately detecting and segmenting the target objects. The relatively high mAP score across a wide range of IoU thresholds indicates that the model is not only making accurate predictions but is also robust to variations in the overlap requirement, showcasing its generalization capability across different levels of object localization precision.


Below are some sample predictions of the fine-tuned YOLOv8-segmentation model, visualized:
![Alt text](figures/prediction_test.jpg?raw=true "Sample prediction 1")
![Alt text](figures/prediction_test2.jpg?raw=true "Sample prediction 2")

## Reproducibility

In order to reproduce the results:
- Use a train/validation split of 80/20 with random state 40.
- Use the YOLOv8n-seg model from Ultralytics.
- Train for 30 epochs.
- Use the library versions specified in the 'Requirements' section.
- Use a batch size of 4.
- Use a learning rate of 0.01 (the default).
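
Putting these settings together, training can be launched roughly as follows (a sketch using the Ultralytics API; the actual entry point is train.py, and the data config filename 'dataset.yaml' is an assumption):

```
from ultralytics import YOLO

model = YOLO('yolov8n-seg.pt')  # pretrained YOLOv8n segmentation weights
model.train(
    data='dataset.yaml',        # path to the data config shown in this repository (name assumed)
    epochs=30,
    batch=4,
    lr0=0.01,                   # Ultralytics default initial learning rate
)
```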


## To run the algorithm

Run `./test_script.sh`. Note that the script assumes the dataset is located where it resides on Rangpur and that the working directory for training is '/home/Student/s4671748/comp3710-project/'. If these locations differ, the paths inside 'dataset.py' (for data loading) and 'train.py' must be updated for the pipeline to run properly.

## Acknowledgments

- ISIC 2018 Challenge organizers
- Ultralytics for YOLOv8
- Contributors to the ISIC archive

## Conclusion

This work demonstrated the application of YOLOv8-segmentation to skin lesion detection using the ISIC 2018 Challenge dataset. The model achieved a mAP50-95 score of 0.7318 on the test set, indicating strong performance in accurately segmenting skin lesions across various IoU thresholds. The implementation leverages YOLOv8's architectural advantages, including its efficient single-pass detection approach, feature fusion capabilities, and dedicated segmentation branch.

## Future Improvements

Several potential enhancements could further improve the algorithm's performance and utility:

1. **Data Augmentation Enhancement**
- Implement more augmentation techniques to generate additional data
- Include domain-specific transformations that reflect real variations in skin lesion appearances
- Introduce synthetic data generation to address class imbalance issues


2. **Performance Optimization**
- Fine-tune hyperparameters using advanced search techniques

3. **Validation and Testing**
- Expand testing to multiple external datasets
- Add metrics specific to medical imaging evaluation

These improvements would enhance both the technical performance and practical utility of the system, making it a more valuable tool for dermatological diagnosis support.
139 changes: 139 additions & 0 deletions recognition/46717481-ThanhTrungPham-ProjectRecognition/dataset.py
@@ -0,0 +1,139 @@
import os
import shutil
import cv2
from sklearn.model_selection import train_test_split
from pathlib import Path

def move_dataset():
    # Source directories of data
    input_dir = '/home/groups/comp3710/ISIC2018/ISIC2018_Task1-2_Training_Input_x2'
    mask_dir = '/home/groups/comp3710/ISIC2018/ISIC2018_Task1_Training_GroundTruth_x2'

    # Create directories to move data to
    for dir_path in ['data/images/train', 'data/images/val', 'data/masks/train', 'data/masks/val']:
        Path(dir_path).mkdir(parents=True, exist_ok=True)

    # Get list of image files (.jpg)
    image_files = []
    for filename in os.listdir(input_dir):
        if filename.endswith('.jpg'):
            image_files.append(filename)

    # Split the dataset into train and validation sets (80-20 split)
    train_files, val_files = train_test_split(
        image_files,
        test_size=0.2,
        random_state=40
    )

    # Helper used to derive the mask's file name from the image's name
    def get_mask_filename(image_filename):
        return image_filename.replace('.jpg', '_segmentation.png')

    # Copy training files
    for filename in train_files:
        # Copy input image
        shutil.copy2(
            os.path.join(input_dir, filename),
            os.path.join('data/images/train', filename)
        )
        # Copy mask
        mask_filename = get_mask_filename(filename)
        shutil.copy2(
            os.path.join(mask_dir, mask_filename),
            os.path.join('data/masks/train', mask_filename)
        )

    # Copy validation files
    for filename in val_files:
        # Copy input image
        shutil.copy2(
            os.path.join(input_dir, filename),
            os.path.join('data/images/val', filename)
        )
        # Copy mask
        mask_filename = get_mask_filename(filename)
        shutil.copy2(
            os.path.join(mask_dir, mask_filename),
            os.path.join('data/masks/val', mask_filename)
        )

def generate_labels():
    # Create the directories to store the ground-truth label txt files
    os.makedirs('data/labels/train', exist_ok=True)
    os.makedirs('data/labels/val', exist_ok=True)

    # The directories that store the binary mask images and the directories to store the ground-truth label txt files
    input_dir_train = './data/masks/train'
    output_dir_train = './data/labels/train'
    input_dir_val = './data/masks/val'
    output_dir_val = './data/labels/val'
    dirs_pairs = [[input_dir_train, output_dir_train], [input_dir_val, output_dir_val]]

    # Create the ground-truth label txt files for the training and validation sets to suit Ultralytics's YOLO model
    for input_dir, output_dir in dirs_pairs:
        for file_name in os.listdir(input_dir):
            image_path = os.path.join(input_dir, file_name)

            # Read the binary mask image in order to retrieve the contours;
            # Ultralytics's YOLO model requires the mask in the format of a polygon
            mask = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
            _, mask = cv2.threshold(mask, 1, 255, cv2.THRESH_BINARY)

            # Find the contours
            height, width = mask.shape
            contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

            # Convert the contours to polygons
            polygons_list = []
            for contour in contours:
                if cv2.contourArea(contour) > 200:
                    polygon = []
                    for point in contour:
                        x, y = point[0]
                        # Normalize the points
                        polygon.append(x / width)
                        polygon.append(y / height)
                    polygons_list.append(polygon)

            # Write the polygons into the txt file
            polygon_file_name = f"{os.path.splitext(os.path.join(output_dir, file_name))[0]}.txt"

            with open(polygon_file_name, 'w') as file:
                for polygon in polygons_list:
                    for index, p in enumerate(polygon):
                        if index == len(polygon) - 1:
                            file.write('{}\n'.format(p))
                        elif index == 0:
                            file.write('0 {} '.format(p))
                        else:
                            file.write('{} '.format(p))

def rename_groundtruth():
    # List of directories to process
    directories = ['data/labels/train', 'data/labels/val']

    # Process each directory
    for directory in directories:
        # Get all files in directory
        files = os.listdir(directory)

        # Go through each file
        for filename in files:
            if '_segmentation.txt' in filename:
                # Create the new filename by dropping '_segmentation'
                new_filename = filename.replace('_segmentation.txt', '.txt')

                # Generate full paths to do the rename
                old_path = os.path.join(directory, filename)
                new_path = os.path.join(directory, new_filename)

                # Rename file
                os.rename(old_path, new_path)

move_dataset()
generate_labels()
rename_groundtruth()
@@ -0,0 +1,6 @@
names:
- lesion
nc: 1
path: /home/Student/s4671748/comp3710-project/data
train: images/train
val: images/val
77 changes: 77 additions & 0 deletions recognition/46717481-ThanhTrungPham-ProjectRecognition/modules.py
@@ -0,0 +1,77 @@
# Import YOLO from the ultralytics package
# Ultralytics YOLO is an updated, Python-native implementation
from ultralytics import YOLO

class YOLOSegmentation:
    """
    A wrapper class for the YOLO segmentation model that provides a simplified
    interface for training, evaluation, and prediction tasks.

    This class encapsulates common YOLO operations and provides a clean API
    for the main tasks in computer vision: training, evaluation, and inference.
    """

    def __init__(self, weights_path):
        """
        Initialize the YOLO model with the specified weights.

        Args:
            weights_path (str): Path to the model weights file.
                Can be either pretrained weights (e.g., 'yolov8n-seg.pt')
                or custom trained weights.
        """
        # Create YOLO model instance using provided weights
        self.model = YOLO(weights_path)

    def train(self, params):
        """
        Train the YOLO model with the given parameters.

        Args:
            params (dict): Dictionary containing training parameters such as:
                - epochs: number of training epochs
                - batch: batch size for training
                - data: path to data configuration file
                - imgsz: input image size
                and other training configurations.

        Returns:
            results: Training results and metrics
        """
        # Unpack parameters dictionary and train the model
        results = self.model.train(**params)
        return results

    def evaluate(self):
        """
        Evaluate the model on the validation dataset.

        This method runs validation on the dataset specified
        in the data configuration file used during training.

        Returns:
            results: Validation metrics including mAP, precision, recall
        """
        # Run validation and return metrics
        results = self.model.val()
        return results

    def predict(self, img, conf):
        """
        Perform segmentation prediction on an input image.

        Args:
            img: Input image (can be a path or a numpy array)
            conf (float): Confidence threshold for predictions.
                Only predictions above this threshold are returned.

        Returns:
            results: Model predictions including:
                - Segmentation masks
                - Bounding boxes
                - Confidence scores
                - Class predictions
        """
        # Run prediction with the specified confidence threshold
        results = self.model.predict(img, conf=conf)
        return results
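
# --- Hypothetical usage sketch (illustrative only; the weight path, data config
# --- name, and image filename below are assumptions, not part of this module) ---
if __name__ == "__main__":
    seg = YOLOSegmentation('yolov8n-seg.pt')
    seg.train({'data': 'dataset.yaml', 'epochs': 30, 'batch': 4})
    metrics = seg.evaluate()
    predictions = seg.predict('sample_lesion.jpg', conf=0.5)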