This repository explores an optimized lexicographic approach for multi-objective machine learning, focusing on determining the tolerance threshold. The primary goal is to balance multiple competing objectives, such as model accuracy and complexity (e.g., L2 regularization).
This project is the result of an internship focused on machine learning and algorithms at the Universidade Federal de Ouro Preto (UFOP). It served as the final project for a Bachelor's degree in Engineering Physics and Technology at Instituto Superior Técnico and contributed to academic requirements at UFOP.
The research conducted in this project led to a publication in the Encontro Nacional de Inteligência Artificial e Computacional (ENIAC):
- Title: Enhancing Multi-Objective Machine Learning with an Optimized Lexicographic Approach: Determining the Tolerance Threshold
- Conference: Encontro Nacional de Inteligência Artificial e Computacional (ENIAC)
- Link: https://sol.sbc.org.br/index.php/eniac/article/view/33854
 
In many machine learning scenarios, we aim to optimize more than one objective simultaneously. For instance, we might want a model that not only achieves high accuracy but is also simple and robust. This project investigates a lexicographic method to handle such multi-objective problems, where objectives are optimized sequentially based on their priority. A key aspect of this research is the determination of an appropriate tolerance threshold, which dictates how much degradation in a higher-priority objective is acceptable when optimizing a lower-priority one.
The core of this work is demonstrated in the `original.ipynb` notebook, which includes:

- Implementation of different multi-objective optimization techniques:
  - Weighted Sum Method
  - Chebyshev Scalarization
  - Lexicographic Optimization
- Analysis of Pareto frontiers generated by these methods.
- Investigation into the impact of varying tolerance levels in the lexicographic approach.
- Experiments primarily using the MNIST dataset for image classification.
Key concepts in this work include:

- Multi-Objective Optimization (MOO): Deals with optimization problems involving more than one objective function to be optimized simultaneously.
- Lexicographic Method: A hierarchical approach where objectives are ranked by importance and optimized sequentially. Once a higher-priority objective has been optimized, a certain tolerance is allowed for its degradation while optimizing subsequent, lower-priority objectives.
- Pareto Frontier: The set of non-dominated solutions in a multi-objective optimization problem. A solution is Pareto optimal if no objective can be improved without worsening at least one other objective.
- Tolerance Threshold: In the lexicographic approach, the crucial parameter that defines the acceptable trade-off when moving from optimizing one objective to the next in the hierarchy (see the formulation below).
 
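One common way to state the two-objective lexicographic step with a tolerance (a standard formulation; the notation here is ours, not the paper's) is: first solve the high-priority problem, $f_1^* = \min_\theta f_1(\theta)$, then solve

$$
\min_{\theta} \; f_2(\theta) \quad \text{subject to} \quad f_1(\theta) \le f_1^* + \varepsilon,
$$

where $\varepsilon \ge 0$ is the tolerance threshold: $\varepsilon = 0$ recovers the strict lexicographic ordering, while larger values accept some degradation of $f_1$ in exchange for improvements in $f_2$.
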
The repository is organized as follows:

    ├── LICENSE
    ├── original.ipynb
    ├── publication.pdf
    ├── README.md
    ├── Assets/
    │   └── Logo_BR.png
    └── Finais/
        ├── Copy of Loss_Lexicográfica_PneumoniaMNIST.ipynb
        ├── Loss_Lexicográfica_BreastMNIST.ipynb
        ├── Loss_Lexicográfica_FashionMNIST.ipynb
        ├── Loss_Lexicográfica_MNIST.ipynb
        └── Loss_Lexicográfica_PneumoniaMNIST.ipynb
- `original.ipynb`: The main Jupyter Notebook containing the foundational research, the implementation of the various multi-objective techniques (Weighted Sum, Chebyshev, Lexicographic), and the analysis of tolerance thresholds, primarily using the MNIST dataset.
- `Finais/`: Jupyter Notebooks with experiments applying the lexicographic approach to different datasets:
  - `Loss_Lexicográfica_BreastMNIST.ipynb`: Experiments on the BreastMNIST dataset.
  - `Loss_Lexicográfica_FashionMNIST.ipynb`: Experiments on the FashionMNIST dataset.
  - `Loss_Lexicográfica_MNIST.ipynb`: Further experiments or refined analysis on the MNIST dataset.
  - `Loss_Lexicográfica_PneumoniaMNIST.ipynb`: Experiments on the PneumoniaMNIST dataset.
- `LICENSE`: The MIT License file for this project.
- `README.md`: This file.
- `publication.pdf`: A local copy of the published paper (if included in the repository).
The primary approach involves:

- Defining Objectives: Typically, minimizing classification error (or maximizing accuracy) and minimizing model complexity (e.g., the L2 norm of the weights).
- Implementing Loss Functions (see the sketch after this list):
  - Custom Weighted Sum Loss: Combines multiple objectives into a single scalar function using weights.
  - Chebyshev Loss: Aims to minimize the maximum weighted deviation from an ideal objective vector.
  - Lexicographic Approach:
    - Prioritize objectives (e.g., error rate first, then model complexity).
    - Optimize the first objective.
    - Optimize the second objective, allowing the first objective's value to deviate within a defined tolerance.
- Training Neural Networks: Using TensorFlow and Keras to build and train models.
- Analyzing Results (a plotting sketch follows below):
  - Plotting Pareto frontiers to visualize the trade-offs between objectives.
  - Investigating how different tolerance values in the lexicographic method affect the solution quality and the shape of the Pareto front.
  - Comparing the performance and characteristics of solutions obtained through different multi-objective strategies.

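The snippet below is a minimal sketch of these three strategies in TensorFlow/Keras, not the exact code from the notebooks: `build_model`, the weights `w1`/`w2`, and the penalty-based handling of the tolerance constraint are illustrative assumptions.

```python
import tensorflow as tf

def build_model():
    # Illustrative classifier for 28x28 grayscale images (e.g., MNIST).
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10),
    ])

ce = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

def l2_complexity(model):
    # Second objective: L2 norm of all trainable weights (model complexity).
    return tf.add_n([tf.nn.l2_loss(w) for w in model.trainable_weights])

def weighted_sum_loss(model, x, y, w1=1.0, w2=1e-4):
    # Weighted sum: fold both objectives into one scalar via fixed weights.
    return w1 * ce(y, model(x)) + w2 * l2_complexity(model)

def chebyshev_loss(model, x, y, w1=1.0, w2=1e-4, ideal=(0.0, 0.0)):
    # Chebyshev scalarization: minimize the maximum weighted deviation
    # from an (assumed) ideal objective vector.
    f1, f2 = ce(y, model(x)), l2_complexity(model)
    return tf.maximum(w1 * (f1 - ideal[0]), w2 * (f2 - ideal[1]))

def lexicographic_train(model, dataset, tolerance, epochs=(3, 3), penalty=10.0):
    # `dataset` is assumed to yield (float32 image batch, integer label batch).
    opt = tf.keras.optimizers.Adam()

    # Phase 1: optimize the high-priority objective f1 (cross-entropy) alone.
    for _ in range(epochs[0]):
        for x, y in dataset:
            with tf.GradientTape() as tape:
                loss = ce(y, model(x))
            grads = tape.gradient(loss, model.trainable_weights)
            opt.apply_gradients(zip(grads, model.trainable_weights))

    # Record the achieved f1 (mean over one pass of the data).
    f1_star = tf.reduce_mean([ce(y, model(x)) for x, y in dataset])

    # Phase 2: optimize f2 (complexity) while a hinge penalty keeps f1
    # within `tolerance` of f1_star -- one simple way to enforce the
    # constraint f1 <= f1* + tolerance.
    for _ in range(epochs[1]):
        for x, y in dataset:
            with tf.GradientTape() as tape:
                f1, f2 = ce(y, model(x)), l2_complexity(model)
                loss = f2 + penalty * tf.nn.relu(f1 - (f1_star + tolerance))
            grads = tape.gradient(loss, model.trainable_weights)
            opt.apply_gradients(zip(grads, model.trainable_weights))
    return model
```

Sweeping `tolerance` over a range of values and recording the final objective pairs from each run is one way to trace out the fronts discussed below.
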
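For the analysis step, a small non-dominated filter plus a Matplotlib plot is enough to visualize a front. This is a generic sketch, with `results` a hypothetical array of (error, complexity) pairs collected from runs like the ones above:

```python
import numpy as np
import matplotlib.pyplot as plt

def pareto_mask(points):
    # points: (n, 2) array of objective values, both minimized.
    # A point is kept if no other point is <= in both objectives
    # and strictly < in at least one.
    keep = np.ones(len(points), dtype=bool)
    for i in range(len(points)):
        dominators = np.all(points <= points[i], axis=1) & np.any(points < points[i], axis=1)
        keep[i] = not dominators.any()
    return keep

# Hypothetical results: one (classification error, L2 norm) pair per run.
results = np.array([[0.08, 12.0], [0.05, 30.0], [0.05, 25.0], [0.12, 8.0]])
front = results[pareto_mask(results)]
front = front[np.argsort(front[:, 0])]  # sort by error for a clean line

plt.scatter(results[:, 0], results[:, 1], label="all runs")
plt.plot(front[:, 0], front[:, 1], "r-o", label="Pareto front")
plt.xlabel("classification error")
plt.ylabel("L2 norm of weights")
plt.legend()
plt.show()
```
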
The experiments in this repository utilize several standard image classification datasets (a minimal loading sketch follows):

- MNIST
- FashionMNIST
- BreastMNIST
- PneumoniaMNIST
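MNIST and FashionMNIST ship with Keras; BreastMNIST and PneumoniaMNIST come from the MedMNIST collection. A loading sketch (assuming the `medmnist` package, which is not in the requirements list below):

```python
import tensorflow as tf

# MNIST and FashionMNIST are bundled with Keras.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
(xf_train, yf_train), _ = tf.keras.datasets.fashion_mnist.load_data()

# The medical variants are distributed via the medmnist package
# (an assumption here; install with `pip install medmnist`).
from medmnist import BreastMNIST, PneumoniaMNIST
breast = BreastMNIST(split="train", download=True)
pneumonia = PneumoniaMNIST(split="train", download=True)
print(breast.imgs.shape, pneumonia.imgs.shape)
```
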
The primary libraries used in this project are:

- TensorFlow
- Keras
- NumPy
- Matplotlib
- Scikit-learn
To run the notebooks, ensure you have a Python environment with these packages installed. You can typically install them using pip:

    pip install tensorflow keras numpy matplotlib scikit-learn

Then:

1. Clone the repository:

       git clone https://github.com/your-username/Multi-Objective-Machine-Learning-an-Optimized-Lexicographic-Approach.git
       cd Multi-Objective-Machine-Learning-an-Optimized-Lexicographic-Approach

2. Install the required dependencies (see the requirements above).
3. Open the Jupyter Notebooks (`original.ipynb` or the notebooks in the `Finais/` directory) using Jupyter Lab or Jupyter Notebook to explore the code and results.
Possible directions for future work include:

- Exploration of other multi-objective optimization algorithms.
- Application to different types of machine learning models and datasets.
- Development of more sophisticated methods for automatically determining the optimal tolerance threshold.
- Investigating the impact of the order of objectives in the lexicographic approach.
This project is licensed under the MIT License - see the LICENSE file for details.
Author: Guilherme Grancho
This README provides a comprehensive overview of the project. Feel free to explore the notebooks for detailed implementations and experimental results.
