Multi-Objective Machine Learning: An Optimized Lexicographic Approach

This repository explores an optimized lexicographic approach for multi-objective machine learning, focusing on how to determine the tolerance threshold. The primary goal is to balance multiple competing objectives, such as model accuracy and model complexity (e.g., measured by the L2 norm of the weights).

Project Context and Origin

This project is the result of an internship at the Universidade Federal de Ouro Preto (UFOP) focusing on machine learning and algorithms. It served as the final project for a Bachelor's degree in Engineering Physics and Technology from Instituto Superior Técnico and contributed to academic requirements at UFOP.

Publication

The research conducted in this project led to a paper published at the Encontro Nacional de Inteligência Artificial e Computacional (ENIAC). A local copy of the paper is included in the repository as publication.pdf.


Project Overview

In many machine learning scenarios, we aim to optimize more than one objective simultaneously. For instance, we might want a model that not only achieves high accuracy but is also simple and robust. This project investigates a lexicographic method to handle such multi-objective problems, where objectives are optimized sequentially based on their priority. A key aspect of this research is the determination of an appropriate tolerance threshold, which dictates how much degradation in a higher-priority objective is acceptable when optimizing a lower-priority one.
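
One common way to formalize this step (a sketch of the standard tolerance-constrained formulation; the notebooks may define the tolerance absolutely or relatively) is:

$$
f_1^{*} = \min_{\theta} f_1(\theta), \qquad \text{then} \qquad \min_{\theta} f_2(\theta) \quad \text{s.t.} \quad f_1(\theta) \le f_1^{*} + \varepsilon,
$$

where $f_1$ is the higher-priority objective (e.g., classification error), $f_2$ is the lower-priority objective (e.g., the L2 norm of the weights), and $\varepsilon$ is the tolerance threshold.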

The core of this work is demonstrated in the original.ipynb notebook, which includes:

  • Implementation of different multi-objective optimization techniques (illustrative loss sketches follow this list):
    • Weighted Sum Method
    • Chebyshev Scalarization
    • Lexicographic Optimization
  • Analysis of Pareto frontiers generated by these methods.
  • Investigation into the impact of varying tolerance levels in the lexicographic approach.
  • Experiments primarily using the MNIST dataset for image classification.
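
To make the first two scalarizations concrete, here is a minimal Keras-style sketch (illustrative only, not the notebooks' exact code; the weights w1, w2 and the ideal point z1_star, z2_star are placeholder hyperparameters):

    import tensorflow as tf

    def weighted_sum_loss(model, w1=1.0, w2=0.01):
        """Scalarize both objectives: w1 * cross-entropy + w2 * L2 norm of the weights."""
        ce = tf.keras.losses.SparseCategoricalCrossentropy()
        def loss(y_true, y_pred):
            error_term = ce(y_true, y_pred)
            complexity_term = tf.add_n([tf.nn.l2_loss(w) for w in model.trainable_weights])
            return w1 * error_term + w2 * complexity_term
        return loss

    def chebyshev_loss(model, w1=1.0, w2=0.01, z1_star=0.0, z2_star=0.0):
        """Minimize the maximum weighted deviation from an ideal point (z1_star, z2_star)."""
        ce = tf.keras.losses.SparseCategoricalCrossentropy()
        def loss(y_true, y_pred):
            error_term = ce(y_true, y_pred)
            complexity_term = tf.add_n([tf.nn.l2_loss(w) for w in model.trainable_weights])
            return tf.maximum(w1 * (error_term - z1_star), w2 * (complexity_term - z2_star))
        return loss

Either closure can be passed to model.compile(); sweeping the weights over a grid and retraining is one common way to trace out different trade-offs with these scalarizations.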

Key Concepts

  • Multi-Objective Optimization (MOO): Deals with optimization problems involving more than one objective function to be optimized simultaneously.
  • Lexicographic Method: A hierarchical approach where objectives are ranked by importance and optimized sequentially. Once a higher-priority objective is optimized, a certain tolerance is allowed for its degradation while optimizing subsequent, lower-priority objectives.
  • Pareto Frontier: A set of non-dominated solutions in a multi-objective optimization problem. A solution is Pareto optimal if no objective can be improved without worsening at least one other objective (a small helper for extracting such a front is sketched after this list).
  • Tolerance Threshold: In the lexicographic approach, this is a crucial parameter that defines the acceptable trade-off when moving from optimizing one objective to the next in the hierarchy.
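
As a reference for the Pareto Frontier concept, a generic NumPy helper of this kind extracts the non-dominated set from a collection of (error, complexity) pairs (a sketch assuming both objectives are minimized; it is not taken from the notebooks):

    import numpy as np

    def pareto_front(points):
        """Return the non-dominated rows of `points`; each row is an objective vector, all minimized."""
        points = np.asarray(points, dtype=float)
        keep = np.ones(len(points), dtype=bool)
        for i, p in enumerate(points):
            # p is dominated if some other point is <= p in every objective and < in at least one.
            dominates_p = np.all(points <= p, axis=1) & np.any(points < p, axis=1)
            keep[i] = not dominates_p.any()
        return points[keep]

    # Example: (classification error, L2 norm) pairs from several trained models.
    solutions = [(0.08, 12.0), (0.10, 9.5), (0.08, 15.0), (0.12, 9.0)]
    print(pareto_front(solutions))  # drops (0.08, 15.0), which is dominated by (0.08, 12.0)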

Repository Structure

├── LICENSE
├── original.ipynb
├── publication.pdf
├── README.md
├── Assets/
│   └── Logo_BR.png
└── Finais/
    ├── Copy of Loss_Lexicográfica_PneumoniaMNIST.ipynb
    ├── Loss_Lexicográfica_BreastMNIST.ipynb
    ├── Loss_Lexicográfica_FashionMNIST.ipynb
    ├── Loss_Lexicográfica_MNIST.ipynb
    └── Loss_Lexicográfica_PneumoniaMNIST.ipynb
  • original.ipynb: The main Jupyter Notebook containing the foundational research, implementation of various multi-objective techniques (Weighted Sum, Chebyshev, Lexicographic), and analysis of tolerance thresholds, primarily using the MNIST dataset.
  • Finais/: This directory contains Jupyter Notebooks with experiments applying the lexicographic approach to different datasets:
    • Loss_Lexicográfica_BreastMNIST.ipynb: Experiments on the BreastMNIST dataset.
    • Loss_Lexicográfica_FashionMNIST.ipynb: Experiments on the FashionMNIST dataset.
    • Loss_Lexicográfica_MNIST.ipynb: Further experiments or refined analysis on the MNIST dataset.
    • Loss_Lexicográfica_PneumoniaMNIST.ipynb: Experiments on the PneumoniaMNIST dataset.
  • LICENSE: The MIT License file for this project.
  • README.md: This file.
  • publication.pdf: A local copy of the published paper.

Tech Stack

This project leverages the following core technologies:

Python, TensorFlow, Keras, NumPy, Matplotlib, Scikit-learn, and Jupyter Notebook.

Methodology

The primary approach involves:

  1. Defining Objectives: Typically, minimizing classification error (or maximizing accuracy) and minimizing model complexity (e.g., L2 norm of weights).

  2. Implementing Loss Functions:

    • Custom Weighted Sum Loss: Combines multiple objectives into a single scalar function using weights.
    • Chebyshev Loss: Aims to minimize the maximum weighted deviation from an ideal objective vector.
    • Lexicographic Approach:
      • Prioritize objectives (e.g., error rate first, then model complexity).
      • Optimize the first objective.
      • Optimize the second objective, allowing the first objective's value to degrade only within a defined tolerance (a rough sketch of this two-stage procedure follows this list).
  3. Training Neural Networks: Using TensorFlow and Keras to build and train models.

  4. Analyzing Results:

    • Plotting Pareto frontiers to visualize the trade-offs between objectives.
    • Investigating how different tolerance values in the lexicographic method affect the solution quality and the shape of the Pareto front.
    • Comparing the performance and characteristics of solutions obtained through different multi-objective strategies.
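
The two-stage procedure from step 2 can be sketched as follows (an illustrative outline only, assuming classification error is the first-priority objective, the L2 norm of the weights the second, and a simple penalty term to keep the first objective within its tolerance bound; the notebooks' exact training loops may differ):

    import tensorflow as tf

    def l2_norm(model):
        """L2 complexity measure: sum of squared weights (via tf.nn.l2_loss)."""
        return tf.add_n([tf.nn.l2_loss(w) for w in model.trainable_weights])

    def lexicographic_train(model, train_ds, tolerance=0.05, epochs_per_stage=5, penalty=10.0):
        """Two-stage lexicographic training: error first, then complexity within a tolerance."""
        ce = tf.keras.losses.SparseCategoricalCrossentropy()

        # Stage 1: optimize the higher-priority objective (cross-entropy) alone.
        model.compile(optimizer=tf.keras.optimizers.Adam(), loss=ce, metrics=["accuracy"])
        history = model.fit(train_ds, epochs=epochs_per_stage, verbose=0)
        best_ce = history.history["loss"][-1]
        bound = best_ce * (1.0 + tolerance)  # allowed degradation of objective 1

        # Stage 2: optimize the lower-priority objective (L2 norm), penalizing
        # any violation of the tolerance bound on the first objective.
        opt2 = tf.keras.optimizers.Adam()
        for _ in range(epochs_per_stage):
            for x, y in train_ds:  # train_ds yields (inputs, labels) batches
                with tf.GradientTape() as tape:
                    ce_value = ce(y, model(x, training=True))
                    violation = tf.nn.relu(ce_value - bound)
                    loss = l2_norm(model) + penalty * violation
                grads = tape.gradient(loss, model.trainable_weights)
                opt2.apply_gradients(zip(grads, model.trainable_weights))
        return model

Varying tolerance and retraining is one way to study how the threshold shapes the resulting trade-off, which is the central question of this project.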

Datasets

The experiments in this repository utilize several standard image classification datasets:

  • MNIST
  • FashionMNIST
  • BreastMNIST
  • PneumoniaMNIST

Requirements

The primary libraries used in this project are:

  • TensorFlow
  • Keras
  • NumPy
  • Matplotlib
  • Scikit-learn

To run the notebooks, ensure you have a Python environment with these packages installed. You can typically install them using pip:

pip install tensorflow keras numpy matplotlib scikit-learn

How to Use

  1. Clone the repository:

    git clone https://github.com/ada-ggf25/Multi-Objective-Machine-Learning-an-Optimized-Lexicographic-Approach.git
    cd Multi-Objective-Machine-Learning-an-Optimized-Lexicographic-Approach
  2. Install the required dependencies (see Requirements section).

  3. Open the Jupyter Notebooks (original.ipynb or notebooks in the Finais/ directory) using Jupyter Lab or Jupyter Notebook to explore the code and results.

Future Work

  • Exploration of other multi-objective optimization algorithms.
  • Application to different types of machine learning models and datasets.
  • Development of more sophisticated methods for determining the optimal tolerance threshold automatically.
  • Investigating the impact of the order of objectives in the lexicographic approach.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Guilherme Grancho


This README provides a comprehensive overview of the project. Feel free to explore the notebooks for detailed implementations and experimental results.
