Binary file added .DS_Store
Binary file not shown.
5 changes: 5 additions & 0 deletions .gitignore
@@ -0,0 +1,5 @@
data/
runs/
ISIC2018_Task1-2_Test_Input/
yolov8n-seg.pt
yolov8n.pt
19 changes: 0 additions & 19 deletions README.md

This file was deleted.

104 changes: 104 additions & 0 deletions recognition/46717481-ThanhTrungPham-ProjectRecognition/README.md
@@ -0,0 +1,104 @@
# YOLOv8 for Skin Lesion Detection: ISIC 2018 Challenge - Thanh Trung Pham (46717481)

## Problem and algorithm

- This repository applies YOLOv8, a highly accurate and fast object detection model (modified here to perform segmentation), to the ISIC 2018 Skin Lesion Challenge, which focuses on the diagnosis of dangerous skin conditions such as melanoma. The work leverages the strengths of the YOLO algorithm to segment skin lesions.

![Alt text](figures/figure1.png?raw=true "YOLOv8 architecture")

- YOLO (You Only Look Once) performs object detection by first dividing the input image into an S x S grid of cells; each cell is then responsible for detecting objects whose centers fall within it. Because it processes the entire image in a single pass ('only looks once'), YOLO is faster than region-proposal methods such as R-CNN, which makes it well suited to real-time applications.

- Each grid cell predicts several bounding boxes along with a confidence score for each. A bounding box is defined by 'x' and 'y', the coordinates of its center relative to the cell, and 'w' and 'h', its width and height relative to the whole image. The confidence score is the product of the probability that an object is present and the IoU (Intersection over Union) between the predicted box and the ground truth. Because each cell can produce many overlapping boxes, Non-Maximum Suppression (NMS) is applied to remove redundant detections of the same object: the boxes are sorted by confidence, the highest-scoring box is kept, boxes with a high IoU against it are discarded, and the process repeats until no heavily overlapping boxes remain (a minimal NMS sketch is shown after the architecture outline below).

- YOLO uses a CNN backbone for feature extraction and employs 'feature fusion', combining features extracted at different scales to 'see the bigger picture'. Its loss is a combination of three terms: a bounding-box loss that pushes predicted boxes towards the ground-truth boxes; an 'object presence/objectness' loss that penalizes false positives where the model detects an object (of any class) over pure background, measuring its confidence that an object is present in a box; and a classification loss that aligns the predicted class probabilities with the ground truth.

- YOLOv8-segmentation goes a step further: instead of only predicting bounding boxes, it segments each detected object at the pixel level (assigning pixel-wise labels to objects in images). To do this, its architecture includes an additional branch that produces a segmentation mask for every instance detected by the bounding boxes.

```
YOLOv8 segmentation:
├── Backbone: CSPDarknet
├── Neck: PANet
└── Head: Decoupled Detection Head
    ├── Classification Branch
    ├── Regression Branch
    └── Segmentation (Mask) Branch
```
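
To make the NMS procedure described above concrete, here is a minimal sketch (illustrative only; Ultralytics ships its own optimized implementation, and the helper names here are assumptions):

```
def iou(a, b):
    # Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2).
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, iou_threshold=0.5):
    # Sort by confidence, keep the best box, drop boxes overlapping it heavily, repeat.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep
```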

## Requirements

```
python>=3.8
ultralytics==8.0.58
opencv-python>=4.1.2
albumentations==1.4
scikit-learn
scikit-build
```
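
One way to install these dependencies (assuming a Python >= 3.8 environment is already set up):

```
pip install ultralytics==8.0.58 "opencv-python>=4.1.2" albumentations==1.4 scikit-learn scikit-build
```
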
## Preprocessing

The dataset's ground-truth labels are provided as binary masks; however, YOLOv8-segmentation from Ultralytics expects labels in a different format, namely polygons. OpenCV is therefore used to extract contours from each mask and convert them into polygons that can be used for training.

The train/validation split is 80/20, with the validation set also serving as the test set.
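
For reference, each line of a generated label file follows the Ultralytics segmentation format: a class index followed by the normalized polygon coordinates. A hypothetical line for the single 'lesion' class (index 0; the coordinate values here are invented for illustration):

```
# <class> x1 y1 x2 y2 x3 y3 ...  (all coordinates normalized to [0, 1])
0 0.41 0.30 0.46 0.31 0.47 0.35 0.42 0.36
```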

## Results
![Alt text](figures/results.png?raw=true "Training results")

### Model Training Observations

The training process of the fine-tuned YOLOv8 model was monitored over 30 epochs, during which a steady decline in the loss function was observed. This indicates that the model effectively minimized the error and adapted well to the underlying patterns in the training data. The decreasing trend of the loss function suggests a successful learning process, as the model adjusted its weights to better fit the provided examples.

### Evaluation Metric: mAP50-95 (Mean Average Precision)

The key evaluation metric used to assess the model's performance is the mean Average Precision (mAP50-95). This metric measures the model's ability to correctly identify and segment objects across a range of Intersection over Union (IoU) thresholds, from 0.5 to 0.95 with a step size of 0.05. The IoU threshold determines the extent of overlap required between the predicted mask and the ground truth mask for a prediction to be considered a true positive:

- At an IoU threshold of 0.5, predictions need to overlap the ground truth by at least 50%.
- At an IoU threshold of 0.95, the requirement is much stricter, demanding a 95% overlap for a correct prediction.

By evaluating the model across multiple IoU thresholds, mAP50-95 provides a comprehensive measure of the model's performance, capturing both precision (correctness of the predictions) and recall (coverage of all relevant instances).
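
As an illustration, the IoU between a predicted and a ground-truth binary mask can be computed as follows (a simple NumPy sketch, not the metric code Ultralytics uses internally):

```
import numpy as np

def mask_iou(pred_mask, gt_mask):
    # Both masks are boolean arrays of the same shape.
    intersection = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return intersection / union if union > 0 else 0.0

# A prediction counts as a true positive at threshold t when mask_iou(...) >= t;
# mAP50-95 averages the AP over t = 0.50, 0.55, ..., 0.95.
```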

### Performance Results

The fine-tuned YOLOv8 model achieved a mean Average Precision (mAP50-95) of 0.7318 on the test set after 30 epochs. This score reflects the model's strong capability in accurately detecting and segmenting the target objects. The relatively high mAP score across a wide range of IoU thresholds indicates that the model is not only making accurate predictions but is also robust to variations in the overlap requirement, showcasing its generalization capability across different levels of object localization precision.


Below are some sample predictions of the fine-tuned YOLOv8-segmentation model, visualized:
![Alt text](figures/prediction_test.jpg?raw=true "Sample prediction 1")
![Alt text](figures/prediction_test2.jpg?raw=true "Sample prediction 2")

## Reproducibility

In order to reproduce the results:
- Use a train/validation split of 80/20 with random state 40.
- Use the YOLOv8n-seg model from Ultralytics.
- Train for 30 epochs.
- Use the library versions specified in the 'Requirements' section.
- Use a batch size of 4.
- Use a learning rate of 0.01 (the default).
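
Putting these settings together, training can be launched roughly as follows (a sketch using the Ultralytics API; the actual entry point is train.py, and the data config filename 'dataset.yaml' is an assumption):

```
from ultralytics import YOLO

model = YOLO('yolov8n-seg.pt')  # pretrained YOLOv8n segmentation weights
model.train(
    data='dataset.yaml',        # path to the data config shown in this repository (name assumed)
    epochs=30,
    batch=4,
    lr0=0.01,                   # Ultralytics default initial learning rate
)
```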


## To run the algorithm

Run `./test_script.sh`. Note that the script assumes the dataset is located where it resides on Rangpur and that the working directory for training is '/home/Student/s4671748/comp3710-project/'. If these locations differ, the paths inside 'dataset.py' (for data loading) and 'train.py' must be updated for the pipeline to run properly.

## Acknowledgments

- ISIC 2018 Challenge organizers
- Ultralytics for YOLOv8
- Contributors to the ISIC archive

## Conclusion

This work demonstrated the application of YOLOv8-segmentation to skin lesion detection using the ISIC 2018 Challenge dataset. The model achieved a mAP50-95 score of 0.7318 on the test set, indicating strong performance in accurately segmenting skin lesions across various IoU thresholds. The implementation leverages YOLOv8's architectural advantages, including its efficient single-pass detection approach, feature fusion capabilities, and dedicated segmentation branch.

## Future Improvements

Several potential enhancements could further improve the algorithm's performance and utility:

1. **Data Augmentation Enhancement**
- Implement more augmentation techniques to generate additional data
- Include domain-specific transformations that reflect real variations in skin lesion appearances
- Introduce synthetic data generation to address class imbalance issues


2. **Performance Optimization**
- Fine-tune hyperparameters using advanced search techniques

3. **Validation and Testing**
- Expand testing to multiple external datasets
- Add metrics specific to medical imaging evaluation

These improvements would enhance both the technical performance and practical utility of the system, making it a more valuable tool for dermatological diagnosis support.
139 changes: 139 additions & 0 deletions recognition/46717481-ThanhTrungPham-ProjectRecognition/dataset.py
@@ -0,0 +1,139 @@
import os
import shutil
import cv2
from sklearn.model_selection import train_test_split
from pathlib import Path

def move_dataset():
    # Source directories of data
    input_dir = '/home/groups/comp3710/ISIC2018/ISIC2018_Task1-2_Training_Input_x2'
    mask_dir = '/home/groups/comp3710/ISIC2018/ISIC2018_Task1_Training_GroundTruth_x2'

    # Create directories to move data to
    for dir_path in ['data/images/train', 'data/images/val', 'data/masks/train', 'data/masks/val']:
        Path(dir_path).mkdir(parents=True, exist_ok=True)

    # Get list of image files (.jpg)
    image_files = []
    for filename in os.listdir(input_dir):
        if filename.endswith('.jpg'):
            image_files.append(filename)

    # Split the dataset into train and validation sets (80-20 split)
    train_files, val_files = train_test_split(
        image_files,
        test_size=0.2,
        random_state=40
    )

    # Helper used to derive the mask's file name from the image's name
    def get_mask_filename(image_filename):
        return image_filename.replace('.jpg', '_segmentation.png')

    # Copy training files
    for filename in train_files:
        # Copy input image
        shutil.copy2(
            os.path.join(input_dir, filename),
            os.path.join('data/images/train', filename)
        )
        # Copy mask
        mask_filename = get_mask_filename(filename)
        shutil.copy2(
            os.path.join(mask_dir, mask_filename),
            os.path.join('data/masks/train', mask_filename)
        )

    # Copy validation files
    for filename in val_files:
        # Copy input image
        shutil.copy2(
            os.path.join(input_dir, filename),
            os.path.join('data/images/val', filename)
        )
        # Copy mask
        mask_filename = get_mask_filename(filename)
        shutil.copy2(
            os.path.join(mask_dir, mask_filename),
            os.path.join('data/masks/val', mask_filename)
        )

def generate_labels():
    # Create the directories to store the ground-truth label txt files
    os.makedirs('data/labels/train', exist_ok=True)
    os.makedirs('data/labels/val', exist_ok=True)

    # The directories that store the binary mask images and the directories to store the ground-truth label txt files
    input_dir_train = './data/masks/train'
    output_dir_train = './data/labels/train'
    input_dir_val = './data/masks/val'
    output_dir_val = './data/labels/val'
    dirs_pairs = [[input_dir_train, output_dir_train], [input_dir_val, output_dir_val]]

    # Create the ground-truth label txt files for the training and validation sets to suit Ultralytics's YOLO model
    for input_dir, output_dir in dirs_pairs:
        for file_name in os.listdir(input_dir):
            image_path = os.path.join(input_dir, file_name)

            # Read the binary mask image in order to retrieve the contours;
            # Ultralytics's YOLO model requires the mask in the format of a polygon
            mask = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
            _, mask = cv2.threshold(mask, 1, 255, cv2.THRESH_BINARY)

            # Find the contours
            height, width = mask.shape
            contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

            # Convert the contours to polygons
            polygons_list = []
            for contour in contours:
                if cv2.contourArea(contour) > 200:
                    polygon = []
                    for point in contour:
                        x, y = point[0]
                        # Normalize the points
                        polygon.append(x / width)
                        polygon.append(y / height)
                    polygons_list.append(polygon)

            # Write the polygons into the txt file
            polygon_file_name = f"{os.path.splitext(os.path.join(output_dir, file_name))[0]}.txt"

            with open(polygon_file_name, 'w') as file:
                for polygon in polygons_list:
                    for index, p in enumerate(polygon):
                        if index == len(polygon) - 1:
                            file.write('{}\n'.format(p))
                        elif index == 0:
                            file.write('0 {} '.format(p))
                        else:
                            file.write('{} '.format(p))

def rename_groundtruth():
    # List of directories to process
    directories = ['data/labels/train', 'data/labels/val']

    # Process each directory
    for directory in directories:
        # Get all files in directory
        files = os.listdir(directory)

        # Go through each file
        for filename in files:
            if '_segmentation.txt' in filename:
                # Create the new filename by dropping '_segmentation'
                new_filename = filename.replace('_segmentation.txt', '.txt')

                # Generate full paths to do the rename
                old_path = os.path.join(directory, filename)
                new_path = os.path.join(directory, new_filename)

                # Rename file
                os.rename(old_path, new_path)

move_dataset()
generate_labels()
rename_groundtruth()
@@ -0,0 +1,6 @@
names:
- lesion
nc: 1
path: /home/Student/s4671748/comp3710-project/data
train: images/train
val: images/val
77 changes: 77 additions & 0 deletions recognition/46717481-ThanhTrungPham-ProjectRecognition/modules.py
@@ -0,0 +1,77 @@
# Import YOLO from the ultralytics package
# Ultralytics YOLO is an updated, Python-native implementation
from ultralytics import YOLO

class YOLOSegmentation:
    """
    A wrapper class for the YOLO segmentation model that provides a simplified
    interface for training, evaluation, and prediction tasks.

    This class encapsulates common YOLO operations and provides a clean API
    for the main tasks in computer vision: training, evaluation, and inference.
    """

    def __init__(self, weights_path):
        """
        Initialize the YOLO model with the specified weights.

        Args:
            weights_path (str): Path to the model weights file.
                Can be either pretrained weights (e.g., 'yolov8n-seg.pt')
                or custom trained weights.
        """
        # Create YOLO model instance using provided weights
        self.model = YOLO(weights_path)

    def train(self, params):
        """
        Train the YOLO model with the given parameters.

        Args:
            params (dict): Dictionary containing training parameters such as:
                - epochs: number of training epochs
                - batch: batch size for training
                - data: path to data configuration file
                - imgsz: input image size
                and other training configurations.

        Returns:
            results: Training results and metrics
        """
        # Unpack parameters dictionary and train the model
        results = self.model.train(**params)
        return results

    def evaluate(self):
        """
        Evaluate the model on the validation dataset.

        This method runs validation on the dataset specified
        in the data configuration file used during training.

        Returns:
            results: Validation metrics including mAP, precision, recall
        """
        # Run validation and return metrics
        results = self.model.val()
        return results

    def predict(self, img, conf):
        """
        Perform segmentation prediction on an input image.

        Args:
            img: Input image (can be a path or a numpy array)
            conf (float): Confidence threshold for predictions.
                Only predictions above this threshold are returned.

        Returns:
            results: Model predictions including:
                - Segmentation masks
                - Bounding boxes
                - Confidence scores
                - Class predictions
        """
        # Run prediction with the specified confidence threshold
        results = self.model.predict(img, conf=conf)
        return results
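
# --- Hypothetical usage sketch (illustrative only; the weight path, data config
# --- name, and image filename below are assumptions, not part of this module) ---
if __name__ == "__main__":
    seg = YOLOSegmentation('yolov8n-seg.pt')
    seg.train({'data': 'dataset.yaml', 'epochs': 30, 'batch': 4})
    metrics = seg.evaluate()
    predictions = seg.predict('sample_lesion.jpg', conf=0.5)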