Deep Learning Portfolio

Professional implementation of 5 advanced deep learning architectures, demonstrating mastery of CNN, RNN, Autoencoders, VAE, and GAN techniques for image classification, generation, and representation learning. All projects feature production-ready code, comprehensive documentation, and validated results.


📚 Table of Contents

  • 🎯 Overview
  • 🛠️ Skills & Technologies
  • 📊 Projects
  • 🔬 Key Technical Innovations
  • 🌍 Real-World Applications
  • ⚙️ Environment Setup


🎯 Overview

This portfolio showcases 5 comprehensive deep learning projects covering the major neural network architectures used in modern computer vision and generative AI:

| Architecture | Use Case | Key Metric | Complexity |
|---|---|---|---|
| CNN/RNN | Image Classification | 88% test accuracy | Intermediate |
| Autoencoder | Feature Learning + Transfer | 72.2% with 1,800 samples | Advanced |
| VAE | Generative Modeling | 2D latent space visualization | Advanced |
| GAN | Image Synthesis | Realistic 28×28 generation | Expert |
| Ensemble | Multi-model Integration | Group collaboration | Advanced |

Total Implementation: 5,000+ lines of Python code, 11 custom neural architectures, comprehensive evaluation frameworks


🛠️ Skills & Technologies

Core Frameworks & Libraries

```
tensorflow==2.10.0               # Deep learning framework
tensorflow-probability==0.18.0   # Probabilistic modeling (VAE)
keras                            # High-level neural network API
numpy==1.23.5                    # Numerical computing
matplotlib==3.7.2                # Visualization
scikit-learn                     # Machine learning utilities
```

Advanced Techniques Implemented

Architecture Design:

  • Convolutional Neural Networks (CNNs) for spatial feature extraction
  • Recurrent Neural Networks (RNNs/LSTM) for sequential processing
  • Encoder-Decoder architectures for dimensionality reduction
  • Generative Adversarial Networks (GANs) for image synthesis
  • Transfer Learning for low-data scenarios

Training Strategies:

  • Adversarial training with Nash equilibrium monitoring
  • Unsupervised pre-training for semi-supervised learning
  • β-weighting for loss function balancing
  • Custom early stopping with multi-metric heuristics
  • Gradient flow optimization via reparameterization trick

Model Evaluation:

  • Multi-class classification metrics (accuracy, confusion matrices)
  • Generative model quality assessment (visual inspection, latent space analysis)
  • Transfer learning validation (baseline vs pre-trained comparison)
  • Convergence analysis (loss oscillation, discriminator accuracy)
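For illustration, the confusion-matrix bookkeeping behind these metrics can be sketched in plain NumPy (a minimal stand-in for the projects' evaluation code; the toy labels below are made up):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] = number of samples with true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # increment one cell per sample
    return cm

y_true = np.array([0, 0, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 2, 2, 0])
cm = confusion_matrix(y_true, y_pred, 3)
accuracy = np.trace(cm) / cm.sum()  # diagonal = correct predictions
print(accuracy)  # 4/6 ≈ 0.667
```

Per-class accuracy (used below to flag hard classes like Category 6) is the diagonal divided by each row sum.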

Cumulative Skills Demonstrated

  1. Architectural Design - Implemented 11 distinct neural network architectures (CNNs, RNNs, Autoencoders, VAE, GAN)
  2. Training Strategies - Mastered adversarial training, transfer learning, unsupervised pre-training, β-weighting
  3. Custom Implementations - Built SampleLayer (reparameterization), GANEarlyStopping, multi-metric scoring
  4. Critical Analysis - Identified architectural errors, validated through experiments, quantified improvements
  5. Model Evaluation - Comprehensive metrics (accuracy, confusion matrices, latent space analysis, convergence detection)
  6. Technical Communication - Professional documentation with theory, code, results, and real-world applications

📊 Projects

1. CNN vs RNN Image Classification

Location: 00_CNN_RNN_ImageClassification/

Objective: Systematic comparison of CNN and RNN architectures for Fashion MNIST classification, demonstrating when to use each approach.

Key Results:

  • CNN achieved 88% test accuracy vs RNN's 86.2% - validated CNN superiority for spatial tasks
  • Identified architectural mismatch - LSTM treats image rows as sequence (suboptimal for 2D spatial data)
  • Category 6 (Shirts) most challenging - 65.9% accuracy due to visual similarity with T-shirts

Technical Highlights:

  • 2 complete architectures: CNN (~140K params) and LSTM (~84K params)
  • Comprehensive evaluation: training curves, confusion matrices, probability distributions
  • Proper train-validation-test methodology with no overfitting
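The split itself can be sketched as follows (an illustration, not the project's actual code; the fractions and seed are assumptions):

```python
import numpy as np

def train_val_test_split(X, y, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle once, then carve out disjoint validation and test partitions."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(len(X) * test_frac)
    n_val = int(len(X) * val_frac)
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]
    return ((X[train_idx], y[train_idx]),
            (X[val_idx], y[val_idx]),
            (X[test_idx], y[test_idx]))

# Dummy arrays standing in for 28x28 grayscale images
X = np.zeros((1000, 28, 28))
y = np.zeros(1000, dtype=int)
train, val, test = train_val_test_split(X, y)
print(len(train[0]), len(val[0]), len(test[0]))  # 800 100 100
```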

Real-World Applications:

  • Automated clothing categorization for e-commerce
  • Visual search systems ("find similar items")
  • Transfer learning foundations for custom datasets

View Full Documentation →


2. Transfer Learning with Denoising Autoencoder

Location: 01_AutoencoderTransferLearning/

Objective: Demonstrate how unsupervised pre-training on 57,000 unlabeled images improves CNN performance when only 1,800 labeled samples are available.

Key Results:

  • Semi-supervised learning - Pre-trained model achieved 72.17% vs baseline 70.33% validation accuracy
  • Nearly 4x improvement on hardest class - Category 6 accuracy: 32.4% (pre-trained) vs 8.8% (baseline)
  • Denoising forces robust features - Noise factor 0.2 prevents memorization, learns true patterns
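The corruption step can be sketched as follows (a minimal illustration using the noise factor of 0.2 quoted above; the batch is random stand-in data):

```python
import numpy as np

def corrupt(images, noise_factor=0.2, seed=0):
    """Add Gaussian noise and clip back to the valid [0, 1] pixel range.
    The denoising autoencoder is trained to map corrupt(x) back to x."""
    rng = np.random.default_rng(seed)
    noisy = images + noise_factor * rng.standard_normal(images.shape)
    return np.clip(noisy, 0.0, 1.0)

clean = np.random.default_rng(1).random((4, 28, 28))  # stand-in image batch
noisy = corrupt(clean, noise_factor=0.2)
print(noisy.shape == clean.shape, float(noisy.min()) >= 0.0, float(noisy.max()) <= 1.0)
```

Because the network never sees the clean input directly, it cannot simply memorize pixels; it must learn features robust to the corruption.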

Technical Highlights:

  • 3 distinct architectures: Baseline CNN, Denoising Autoencoder, Transfer Learning CNN
  • Frozen encoder layers (feature extraction) + trainable classifier head
  • A/B testing methodology proving value of unsupervised pre-training

Real-World Applications:

  • Medical imaging (few labeled scans, many unlabeled)
  • Rare object detection (limited annotated examples)
  • Domain adaptation (pre-train on similar dataset, fine-tune on target)

View Full Documentation →


3. Variational Autoencoder (VAE)

Location: 02_VAE_FashionMNIST/

Objective: Learn a 2D probabilistic latent space representation of Fashion MNIST, enabling both reconstruction and controlled generation.

Key Results:

  • β-VAE with β=0.001 - Optimized weighting balances reconstruction quality and latent regularization
  • Well-separated class clusters - 10 fashion categories occupy distinct 2D regions (z[0]: -3.66 to 2.76, z[1]: -3.46 to 3.25)
  • Smooth interpolation - Walking between latent points produces realistic intermediate images
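Latent-space interpolation amounts to a linear walk between two codes; a minimal sketch (the two anchor points are hypothetical, and `decoder` is assumed to be the trained VAE decoder):

```python
import numpy as np

def interpolate(z_start, z_end, steps=10):
    """Linear walk between two latent points; each row is one decoder input."""
    t = np.linspace(0.0, 1.0, steps)[:, None]
    return (1 - t) * z_start + t * z_end

z_a = np.array([-3.0, 1.0])   # hypothetical point in one class cluster
z_b = np.array([2.5, -2.0])   # hypothetical point in another cluster
path = interpolate(z_a, z_b, steps=5)
# decoder.predict(path) would then yield 5 images morphing from a to b
print(path.shape)  # (5, 2)
```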

Technical Highlights:

  • Custom SampleLayer implementing reparameterization trick (enables backprop through sampling)
  • Principled β derivation via loss magnitude analysis (not trial-and-error)
  • 2D latent space for human-interpretable visualization

Real-World Applications:

  • Fashion design prototyping (interpolate between existing designs)
  • Anomaly detection (high reconstruction error = defective product)
  • Data augmentation (synthesize training samples for limited datasets)

View Full Documentation →


4. Deep Convolutional GAN (DCGAN)

Location: 03_ConvolutionalGAN/

Objective: Build a DCGAN that generates realistic 28×28 grayscale pants images from random noise, demonstrating adversarial training convergence.

Key Results:

  • Architectural error correction - Removed discriminator upsampling (50% parameter reduction: 424K → 213K)
  • Balanced Nash equilibrium - Gen Loss: 0.6-0.9, Disc Loss: 1.2-1.6, Disc Accuracy: 55-60%
  • Custom early stopping - GAN-specific scoring heuristic (accuracy deviation + loss ratio + range penalties)

Technical Highlights:

  • Critical analysis identified contradiction with DCGAN standard (Radford et al., 2015)
  • Generator: Progressive upsampling (7×7 → 14×14 → 28×28) via Conv2DTranspose
  • Discriminator: Progressive downsampling (28×28 → 14×14 → 7×7) via strided Conv2D
  • Training time: ~35 seconds on Colab T4 GPU vs hours/days on CPU
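With Keras `padding='same'`, a stride-2 `Conv2DTranspose` doubles the side length and a stride-2 `Conv2D` halves it (rounding up), which is what produces the 7×7 → 14×14 → 28×28 progression and its mirror image. A quick sanity check:

```python
def transpose_out(size, stride):
    """Side length after Conv2DTranspose with padding='same': size * stride."""
    return size * stride

def strided_out(size, stride):
    """Side length after strided Conv2D with padding='same': ceil(size / stride)."""
    return -(-size // stride)

# Generator path: 7 -> 14 -> 28
gen_sizes = [7]
for _ in range(2):
    gen_sizes.append(transpose_out(gen_sizes[-1], 2))

# Discriminator path: 28 -> 14 -> 7
disc_sizes = [28]
for _ in range(2):
    disc_sizes.append(strided_out(disc_sizes[-1], 2))

print(gen_sizes, disc_sizes)  # [7, 14, 28] [28, 14, 7]
```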

Real-World Applications:

  • Fashion design rapid ideation workflows
  • Data augmentation for computer vision models
  • Synthetic training data generation for object detection

View Full Documentation →


🔬 Key Technical Innovations

1. Custom Keras Layers for Probabilistic Modeling

Challenge: VAE requires sampling from N(μ, σ²) during forward pass, but sampling blocks gradient flow

Solution: Implemented SampleLayer with reparameterization trick:

```python
import tensorflow as tf

class SampleLayer(tf.keras.layers.Layer):
    def call(self, inputs):
        z_mean, z_log_var = inputs
        # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1)
        epsilon = tf.random.normal(shape=tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon
```

Mathematical Foundation: z = μ + σ ⊙ ε, where ε ~ N(0,1)

  • Randomness separated from learned parameters (μ, σ)
  • Gradients flow through μ and σ, not ε
  • Enables end-to-end training of probabilistic models
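The same trick can be checked numerically in plain NumPy (an illustration, not the project's code): sampling via z = μ + σ ⊙ ε reproduces the target distribution while keeping the randomness isolated in ε.

```python
import numpy as np

def sample(z_mean, z_log_var, n, seed=0):
    """z = mu + sigma * eps with eps ~ N(0, 1): the randomness lives in eps,
    so gradients can flow through mu and sigma during training."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n)
    return z_mean + np.exp(0.5 * z_log_var) * eps

mu, log_var = 1.5, np.log(0.25)      # target N(1.5, 0.25), i.e. sigma = 0.5
z = sample(mu, log_var, 100_000)
print(float(z.mean()), float(z.std()))  # ≈ 1.5 and ≈ 0.5
```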

2. Multi-Metric GAN Convergence Detection

Challenge: Standard early stopping fails for GANs (oscillating losses are healthy)

Solution: Custom scoring heuristic balancing multiple signals:

```python
score = 1.0 / (1.0 + acc_penalty + ratio_penalty + gen_penalty + disc_penalty)

# Where:
# - acc_penalty: |disc_acc - 0.55| / 0.55
# - ratio_penalty: 2.0 if (gen_loss / disc_loss) falls outside [0.4, 0.8]
# - gen_penalty / disc_penalty: range penalties for pathological loss values
```
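A runnable sketch of this heuristic; the exact range-penalty thresholds are assumptions for illustration, not the project's tuned values:

```python
def gan_score(gen_loss, disc_loss, disc_acc,
              target_acc=0.55, ratio_band=(0.4, 0.8)):
    """Higher is better; 1.0 would be a perfectly balanced equilibrium."""
    acc_penalty = abs(disc_acc - target_acc) / target_acc
    ratio = gen_loss / disc_loss
    ratio_penalty = 0.0 if ratio_band[0] <= ratio <= ratio_band[1] else 2.0
    # Illustrative range penalties for pathological loss values
    gen_penalty = 0.0 if 0.1 <= gen_loss <= 3.0 else 1.0
    disc_penalty = 0.0 if 0.1 <= disc_loss <= 3.0 else 1.0
    return 1.0 / (1.0 + acc_penalty + ratio_penalty + gen_penalty + disc_penalty)

healthy = gan_score(gen_loss=0.75, disc_loss=1.4, disc_acc=0.57)    # in-band run
collapsed = gan_score(gen_loss=3.5, disc_loss=0.3, disc_acc=0.98)   # discriminator winning
print(round(healthy, 3), round(collapsed, 3))
```

A near-perfect discriminator scores poorly here by design: it signals the generator is no longer learning.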

Stopping Criteria:

  1. Patience exhausted (no improvement for 15 epochs, after a minimum of 30 epochs)
  2. Convergence detected (loss variance < 0.001 over 20 epochs)

Result: Automatically detected best checkpoint (Epoch 51, score 0.8567)


3. β-Weighting Derivation for VAE Loss

Challenge: KL divergence loss magnitude often 100-1000x larger than reconstruction loss

Solution: Principled β calculation targeting 10-20% KL contribution:

```python
# Step 1: Measure typical loss ranges
typical_mse = 0.06   # Reconstruction loss
typical_kl = 20      # KL loss (unweighted)

# Step 2: Solve for beta
target_proportion = 0.15
beta = (target_proportion * typical_mse) / typical_kl
# = (0.15 * 0.06) / 20 = 0.00045 ≈ 0.001 (rounded for convenience)
```

Validation: Post-training analysis confirmed 15% KL contribution with sharp reconstructions


🌍 Real-World Applications

Fashion & E-commerce

  • Automated Product Categorization (CNN) - Scale to millions of products
  • Visual Search (CNN + Transfer Learning) - "Find similar items" functionality
  • Design Prototyping (VAE + GAN) - Generate design variations, rapid ideation
  • Synthetic Data Generation (GAN) - Augment limited fashion datasets

Computer Vision Systems

  • Transfer Learning Pipelines (Autoencoder) - Pre-train on unlabeled data, fine-tune on target task
  • Anomaly Detection (VAE) - Identify defects via reconstruction error
  • Data Augmentation (GAN) - Generate training samples for rare classes
  • Feature Extraction (CNN) - Frozen encoders for downstream tasks

Research & Development

  • Architecture Comparison Studies (CNN vs RNN) - Validate architectural choices empirically
  • Semi-Supervised Learning (Denoising Autoencoder) - Leverage unlabeled data effectively
  • Latent Space Analysis (VAE) - Understand learned representations via 2D visualization
  • Adversarial Training (GAN) - Apply game-theoretic concepts to neural networks

⚙️ Environment Setup

Recommended: Conda Environment

All projects use Python 3.10 with TensorFlow 2.10.0 for consistency. Each project folder contains:

  • environment.yml - Conda environment specification
  • requirements.txt - pip dependencies (alternative)
```bash
# Navigate to any project folder
cd 00_CNN_RNN_ImageClassification/

# Create environment from YAML
conda env create -f environment.yml

# Activate environment (name varies by project)
conda activate dl-cnn-rnn  # or dl-autoencoder, vae-project, gan-project

# Run main script
python matheus_CNN_RNN.py  # Filename varies by project
```

Hardware Requirements

| Project | CPU | GPU | Recommended |
|---|---|---|---|
| CNN/RNN | 5-10 min | 2-3 min | CPU acceptable |
| Autoencoder | 10-15 min | 3-5 min | CPU acceptable |
| VAE | 30-35 min | 3-4 min | GPU recommended |
| GAN | Hours/Days | 35 seconds | GPU required |
| Group Project | Varies | Varies | Depends on models |

Free GPU: Google Colab provides free T4 GPU access (sufficient for all projects)

Alternative: pip Installation

```bash
# Create virtual environment
python -m venv dl-env
source dl-env/bin/activate  # Linux/Mac
# or
dl-env\Scripts\activate  # Windows

# Install dependencies
pip install -r requirements.txt

# Run project
python main_script.py
```

Common Dependencies

```
# Core (all projects)
tensorflow==2.10.0
numpy==1.23.5
matplotlib==3.7.2

# Additional (specific projects)
tensorflow-probability==0.18.0   # VAE only
scikit-learn                     # CNN/RNN evaluation
seaborn                          # Enhanced visualizations
```

Author: Matheus Ferreira Teixeira
GitHub: github.com/domvito55
LinkedIn: linkedin.com/in/mathteixeira


Note: Each project folder contains its own detailed README with complete implementation details, architecture diagrams, training procedures, and results analysis.