Model Conversion, Optimization and Benchmarking: MobileNetV3 on Cats vs. Dogs

Overview

This project explores the process of converting a pre-trained MobileNetV3 model (trained on a cats vs. dogs dataset) into different formats and optimizing it through quantization, pruning, and distillation. It benchmarks each format and optimization in terms of model size, inference speed, and classification quality.


Objectives

  1. Use a pre-trained model
  2. Evaluate its size, quality, and speed on test data
  3. Convert the model to the following formats:
    • Keras (.keras, .h5)
    • ONNX (.onnx)
    • TensorFlow SavedModel (.pb)
    • TensorFlow Lite (.tflite)
    • TensorFlow.js (model.json + binary weight shards)
  4. Benchmark every format (size, speed, accuracy, precision, recall, F1)
  5. Quantize to INT8 and FLOAT16 (tflite)
  6. Prune (weight thinning) the model
  7. Distill the model to a smaller student network
  8. Tabulate and compare results
  9. Draw conclusions on practical applications

Workflow

1. Dataset Preparation

  • Dataset: Kaggle Cats vs. Dogs (custom split into train/val/test)
  • Preprocessing: Resized images to 224x224, preprocessed as per MobileNetV3 requirements.
  • Batching: Implemented tf.data pipeline with augmentation (outside model graph for TFJS compatibility).
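The pipeline above can be sketched as follows. This is illustrative, not the notebook's exact code: the `make_dataset` helper and the `IMG_SIZE`/`BATCH` constants are assumed names, and the CSV loading that produces `paths` and `labels` happens elsewhere.

```python
import tensorflow as tf

IMG_SIZE = (224, 224)
BATCH = 32

def load_and_preprocess(path, label):
    # Decode, resize to 224x224, and apply MobileNetV3's own preprocessing.
    img = tf.io.read_file(path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, IMG_SIZE)
    img = tf.keras.applications.mobilenet_v3.preprocess_input(img)
    return img, label

def augment(img, label):
    # Augmentation lives in the tf.data pipeline, not the model graph,
    # so the exported model stays TFJS-compatible.
    img = tf.image.random_flip_left_right(img)
    img = tf.image.random_brightness(img, 0.1)
    return img, label

def make_dataset(paths, labels, training=False):
    ds = tf.data.Dataset.from_tensor_slices((paths, labels))
    ds = ds.map(load_and_preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    if training:
        ds = ds.shuffle(1000).map(augment, num_parallel_calls=tf.data.AUTOTUNE)
    return ds.batch(BATCH).prefetch(tf.data.AUTOTUNE)
```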

2. Base Model

  • Architecture: MobileNetV3 (pre-trained, fine-tuned on dataset)
  • Baseline Metrics: Model size, accuracy, precision, recall, F1-score, inference time measured on test set.

3. Conversion to Multiple Formats

Saved the original model as:

  • Keras (.keras, .h5)
  • TensorFlow SavedModel
  • ONNX
  • TensorFlow Lite (.tflite)
  • TensorFlowJS
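A minimal sketch of the export step, assuming TensorFlow 2.x. The `export_all` helper is an illustrative name, not the repository's API; the ONNX and TensorFlow.js conversions are shown as the CLI invocations of the separately installed `tf2onnx` and `tensorflowjs` packages.

```python
import os

import tensorflow as tf

def export_all(model, out_dir="exports"):
    """Save one trained Keras model in the formats benchmarked here."""
    os.makedirs(out_dir, exist_ok=True)
    model.save(os.path.join(out_dir, "model.keras"))  # native Keras format
    model.save(os.path.join(out_dir, "model.h5"))     # legacy HDF5
    # TensorFlow Lite (float32, no quantization yet)
    tflite_bytes = tf.lite.TFLiteConverter.from_keras_model(model).convert()
    with open(os.path.join(out_dir, "model.tflite"), "wb") as f:
        f.write(tflite_bytes)
    return out_dir

# SavedModel: model.export("exports/saved_model") on Keras 3,
# or model.save("exports/saved_model") on Keras 2.
# ONNX (requires the tf2onnx package):
#   python -m tf2onnx.convert --saved-model exports/saved_model --output exports/model.onnx
# TensorFlow.js (requires the tensorflowjs package):
#   tensorflowjs_converter --input_format=keras exports/model.h5 exports/tfjs_model
```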

4. Benchmarking per Format

For each format:

  • Measured file size
  • Evaluated classification quality on test data
  • Calculated average inference time per image
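The per-format measurement can be factored into one backend-agnostic helper. `benchmark` and `predict_fn` are illustrative names: `predict_fn` wraps whichever runtime is under test (Keras `predict`, a TFLite interpreter, an ONNX Runtime session) behind the same one-image-in, one-label-out interface.

```python
import time

def benchmark(predict_fn, images, labels):
    """Average per-image latency plus accuracy for any backend's predict_fn."""
    correct = 0
    start = time.perf_counter()
    for img, label in zip(images, labels):
        if predict_fn(img) == label:
            correct += 1
    elapsed = time.perf_counter() - start
    n = len(images)
    return {"accuracy": correct / n, "sec_per_image": elapsed / n}
```

Precision, recall, and F1 can be computed the same way by collecting the raw predictions instead of only counting matches.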

5. Quantization

  • Converted the TFLite model with:
    • INT8 quantization
    • FLOAT16 quantization
  • Benchmarked metrics post-quantization
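A sketch of both TFLite quantization paths (helper names are illustrative). One practical caveat: the representative dataset used for INT8 calibration must go through exactly the same preprocessing as the training data, or quantized accuracy collapses.

```python
import tensorflow as tf

def quantize_float16(model):
    conv = tf.lite.TFLiteConverter.from_keras_model(model)
    conv.optimizations = [tf.lite.Optimize.DEFAULT]
    conv.target_spec.supported_types = [tf.float16]  # weights stored as float16
    return conv.convert()

def quantize_int8(model, representative_images):
    conv = tf.lite.TFLiteConverter.from_keras_model(model)
    conv.optimizations = [tf.lite.Optimize.DEFAULT]

    def rep_data():
        # Calibration samples: same preprocessing as training, batch dim added.
        for img in representative_images:
            yield [img[tf.newaxis, ...]]

    conv.representative_dataset = rep_data
    conv.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    return conv.convert()
```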

6. Pruning

  • Applied magnitude-based pruning using TensorFlow Model Optimization Toolkit
  • Retrained and stripped pruning wrappers
  • Measured size/speed/accuracy post-pruning

7. Distillation

  • Distilled a compact “student” model from the original “teacher” MobileNetV3
  • Trained the student on soft targets from the teacher network at different temperatures
  • Compared speed/size/performance
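The soft-target loss at temperature T can be written out in plain Python. This is a sketch of the standard Hinton-style formulation, not necessarily the exact loss used in the notebook; in practice it is combined with the ordinary hard-label loss.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between softened teacher and student distributions."""
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)
    # Scale by T^2 so gradient magnitudes stay comparable to the hard loss.
    return temperature ** 2 * sum(
        pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

Higher temperatures flatten the teacher's distribution, exposing more of its "dark knowledge" about class similarities.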

8. Tabulation

All metrics for each model/format/optimization are summarized in a comparison table.


Key Findings

  1. Format Suitability: Conversion to Keras, ONNX, TF SavedModel, TFLite, and TFJS enables deployment across platforms: Keras, SavedModel, and ONNX for cloud and server, TFLite for mobile and edge devices, and TFJS for the web.
  2. Quantization: Reduces size and increases speed (especially INT8), but with some accuracy/F1 loss.
  3. Pruning: Further compresses models with minimal loss in accuracy.
  4. Distillation: Enables a much smaller and faster student network, maintaining reasonable performance compared to the original model.
  5. ONNX & TFLite: Offer great interoperability and inference acceleration outside TensorFlow environments.
  6. Combination: Maximum optimization is achieved by combining pruning, quantization, and distillation for edge/IoT deployment.
  7. Trade-offs: The final choice depends on hardware constraints, inference speed requirements, and the acceptable drop in model quality.

Usage

Dataset Preparation

Split images and prepare CSVs for train/val/test sets as per the notebook scripts.

Training

Train or fine-tune MobileNetV3 using provided code.

Model Conversion

Use provided scripts to export models to required formats.

Evaluation

Run benchmarking scripts for each format and fill in results table.


Result Table

| Model/Format     | Precision | Recall   | Accuracy | F1-Score | Size (MB) | Inference Time (s/img) |
|------------------|-----------|----------|----------|----------|-----------|------------------------|
| Keras_original   | 0.996667  | 0.996667 | 0.996667 | 0.996667 | 12.103346 | 0.008293               |
| Keras            | 0.996667  | 0.996667 | 0.996667 | 0.996667 | 12.088532 | 0.007742               |
| TF Lite          | 0.996667  | 0.996667 | 0.996667 | 0.996667 | 12.088532 | 0.007742               |
| ONNX             | 0.996667  | 0.996667 | 0.996667 | 0.996667 | 12.345000 | 0.007307               |
| Save_pb          | 0.996667  | 0.993333 | 0.995000 | 0.996667 | 11.399948 | 0.001448               |
| TensorFlowJS     | n/a       | n/a      | n/a      | n/a      | 11.610000 | n/a                    |
| TFLite (int8)    | 0.500000  | 0.500000 | 0.500000 | 0.500000 | 3.329346  | 0.009200               |
| TFLite (float16) | 0.500000  | 0.500000 | 0.500000 | 0.500000 | 5.714710  | 0.008900               |
| Distilled        | 0.911667  | 0.911660 | 0.911660 | 0.911660 | 12.096642 | 0.051334               |
| Pruned           | 0.999400  | 0.988200 | 0.990800 | 0.990800 | 12.096902 | 0.006943               |



License

MIT License


Acknowledgements

Thanks to TensorFlow, ONNX, and TensorFlow Model Optimization open-source communities for frameworks/libraries.


For any questions, please contact (perinadaria19@gmail.com).