Professional implementations of five advanced deep learning architectures, demonstrating mastery of CNN, RNN, Autoencoder, VAE, and GAN techniques for image classification, generation, and representation learning. All projects feature production-ready code, comprehensive documentation, and validated results.
- 🎯 Overview
- 🛠️ Skills & Technologies
- 📊 Projects
- 🔬 Key Technical Innovations
- 🌍 Real-World Applications
- ⚙️ Environment Setup
This portfolio showcases 5 comprehensive deep learning projects covering the major neural network architectures used in modern computer vision and generative AI:
| Architecture | Use Case | Key Metric | Complexity |
|---|---|---|---|
| CNN/RNN | Image Classification | 88% test accuracy | Intermediate |
| Autoencoder | Feature Learning + Transfer | 72.2% with 1,800 samples | Advanced |
| VAE | Generative Modeling | 2D latent space visualization | Advanced |
| GAN | Image Synthesis | Realistic 28×28 generation | Expert |
| Ensemble | Multi-model Integration | Group collaboration | Advanced |
Total Implementation: 5,000+ lines of Python code, 11 custom neural architectures, comprehensive evaluation frameworks
tensorflow==2.10.0 # Deep learning framework
tensorflow-probability==0.18.0 # Probabilistic modeling (VAE)
keras # High-level neural network API
numpy==1.23.5 # Numerical computing
matplotlib==3.7.2 # Visualization
scikit-learn # Machine learning utilities
Architecture Design:
- Convolutional Neural Networks (CNNs) for spatial feature extraction
- Recurrent Neural Networks (RNNs/LSTM) for sequential processing
- Encoder-Decoder architectures for dimensionality reduction
- Generative Adversarial Networks (GANs) for image synthesis
- Transfer Learning for low-data scenarios
Training Strategies:
- Adversarial training with Nash equilibrium monitoring
- Unsupervised pre-training for semi-supervised learning
- β-weighting for loss function balancing
- Custom early stopping with multi-metric heuristics
- Gradient flow optimization via reparameterization trick
Model Evaluation:
- Multi-class classification metrics (accuracy, confusion matrices)
- Generative model quality assessment (visual inspection, latent space analysis)
- Transfer learning validation (baseline vs pre-trained comparison)
- Convergence analysis (loss oscillation, discriminator accuracy)
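The classification metrics above can be computed in a few lines; here is a minimal numpy sketch of accuracy, a confusion matrix, and per-class recall (equivalent in spirit to the scikit-learn utilities the projects use; the toy labels are hypothetical):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count (true, predicted) label pairs into a num_classes x num_classes grid."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

y_true = np.array([0, 1, 2, 2, 1])
y_pred = np.array([0, 2, 2, 2, 1])
cm = confusion_matrix(y_true, y_pred, 3)
accuracy = np.trace(cm) / cm.sum()           # correct predictions / all predictions
per_class = cm.diagonal() / cm.sum(axis=1)   # recall for each class
```

Rows of `cm` are true labels and columns are predictions, so off-diagonal cells reveal which classes get confused with which (e.g. Shirts vs T-shirts in Project 1).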
- Architectural Design - Implemented 11 distinct neural network architectures (CNNs, RNNs, Autoencoders, VAE, GAN)
- Training Strategies - Mastered adversarial training, transfer learning, unsupervised pre-training, β-weighting
- Custom Implementations - Built SampleLayer (reparameterization), GANEarlyStopping, multi-metric scoring
- Critical Analysis - Identified architectural errors, validated through experiments, quantified improvements
- Model Evaluation - Comprehensive metrics (accuracy, confusion matrices, latent space analysis, convergence detection)
- Technical Communication - Professional documentation with theory, code, results, and real-world applications
Location: 00_CNN_RNN_ImageClassification/
Objective: Systematic comparison of CNN and RNN architectures for Fashion MNIST classification, demonstrating when to use each approach.
Key Results:
- ✅ CNN achieved 88% test accuracy vs RNN's 86.2% - validated CNN superiority for spatial tasks
- ✅ Identified architectural mismatch - LSTM treats image rows as sequence (suboptimal for 2D spatial data)
- ✅ Category 6 (Shirts) most challenging - 65.9% accuracy due to visual similarity with T-shirts
Technical Highlights:
- 2 complete architectures: CNN (~140K params) and LSTM (~84K params)
- Comprehensive evaluation: training curves, confusion matrices, probability distributions
- Proper train-validation-test methodology with no overfitting
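The architectural mismatch noted above comes down to how each model views the same 28×28 tensor; a minimal numpy sketch of the two input conventions (shapes only, no model code):

```python
import numpy as np

image = np.random.rand(28, 28)       # one grayscale Fashion MNIST image
# CNN view: a 2D feature map with an explicit channel axis
cnn_input = image[..., np.newaxis]   # shape (28, 28, 1)
# LSTM view: a sequence of 28 timesteps, each timestep one row of 28 pixels
rnn_input = image                    # timestep t is image row t
row_at_t5 = rnn_input[5]             # what the LSTM sees at timestep 5
```

The LSTM therefore only sees vertical context one row at a time, which is why it underperforms the CNN on spatial tasks.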
Real-World Applications:
- Automated clothing categorization for e-commerce
- Visual search systems ("find similar items")
- Transfer learning foundations for custom datasets
Location: 01_AutoencoderTransferLearning/
Objective: Demonstrate how unsupervised pre-training on 57,000 unlabeled images improves CNN performance when only 1,800 labeled samples are available.
Key Results:
- ✅ Semi-supervised learning - Pre-trained model achieved 72.17% vs baseline 70.33% validation accuracy
- ✅ Nearly 4× improvement on hardest class - Category 6 accuracy: 32.4% (pre-trained) vs 8.8% (baseline)
- ✅ Denoising forces robust features - Noise factor 0.2 prevents memorization, learns true patterns
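The noise injection behind the denoising result can be sketched in numpy; this is the common recipe (additive Gaussian noise, then clipping back to valid pixel range), and the project's exact preprocessing may differ in detail:

```python
import numpy as np

rng = np.random.default_rng(0)
clean = rng.random((4, 28, 28))       # batch of images scaled to [0, 1]
noise_factor = 0.2
noisy = clean + noise_factor * rng.standard_normal(clean.shape)
noisy = np.clip(noisy, 0.0, 1.0)      # keep pixels in the valid range
# training pairs: input = noisy, target = clean
```

Because the autoencoder must map `noisy` back to `clean`, it cannot simply memorize pixels and is pushed toward robust features.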
Technical Highlights:
- 3 distinct architectures: Baseline CNN, Denoising Autoencoder, Transfer Learning CNN
- Frozen encoder layers (feature extraction) + trainable classifier head
- A/B testing methodology proving value of unsupervised pre-training
Real-World Applications:
- Medical imaging (few labeled scans, many unlabeled)
- Rare object detection (limited annotated examples)
- Domain adaptation (pre-train on similar dataset, fine-tune on target)
Location: 02_VAE_FashionMNIST/
Objective: Learn a 2D probabilistic latent space representation of Fashion MNIST, enabling both reconstruction and controlled generation.
Key Results:
- ✅ β-VAE with β=0.001 - Optimized weighting balances reconstruction quality and latent regularization
- ✅ Well-separated class clusters - 10 fashion categories occupy distinct 2D regions (z[0]: -3.66 to 2.76, z[1]: -3.46 to 3.25)
- ✅ Smooth interpolation - Walking between latent points produces realistic intermediate images
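Latent interpolation is just a straight line between two codes in the 2D space; a sketch with hypothetical latent codes inside the reported z-range (the `decoder` call in the comment is the assumed Keras model from the project):

```python
import numpy as np

z_a = np.array([-2.0, 1.0])    # hypothetical latent code of image A
z_b = np.array([1.5, -0.5])    # hypothetical latent code of image B
alphas = np.linspace(0.0, 1.0, 5)
path = np.stack([(1 - a) * z_a + a * z_b for a in alphas])
# decoder.predict(path) would render the five intermediate images
```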
Technical Highlights:
- Custom SampleLayer implementing reparameterization trick (enables backprop through sampling)
- Principled β derivation via loss magnitude analysis (not trial-and-error)
- 2D latent space for human-interpretable visualization
Real-World Applications:
- Fashion design prototyping (interpolate between existing designs)
- Anomaly detection (high reconstruction error = defective product)
- Data augmentation (synthesize training samples for limited datasets)
Location: 03_ConvolutionalGAN/
Objective: Build a DCGAN that generates realistic 28×28 grayscale pants images from random noise, demonstrating adversarial training convergence.
Key Results:
- ✅ Architectural error correction - Removed discriminator upsampling (50% parameter reduction: 424K → 213K)
- ✅ Balanced Nash equilibrium - Gen Loss: 0.6-0.9, Disc Loss: 1.2-1.6, Disc Accuracy: 55-60%
- ✅ Custom early stopping - GAN-specific scoring heuristic (accuracy deviation + loss ratio + range penalties)
Technical Highlights:
- Critical analysis identified contradiction with DCGAN standard (Radford et al., 2015)
- Generator: Progressive upsampling (7×7 → 14×14 → 28×28) via Conv2DTranspose
- Discriminator: Progressive downsampling (28×28 → 14×14 → 7×7) via strided Conv2D
- Training time: ~35 seconds on Colab T4 GPU vs hours/days on CPU
Real-World Applications:
- Fashion design rapid ideation workflows
- Data augmentation for computer vision models
- Synthetic training data generation for object detection
Challenge: VAE requires sampling from N(μ, σ²) during forward pass, but sampling blocks gradient flow
Solution: Implemented SampleLayer with reparameterization trick:
class SampleLayer(tf.keras.layers.Layer):
    def call(self, inputs):
        z_mean, z_log_var = inputs
        epsilon = tf.random.normal(shape=tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon
Mathematical Foundation: z = μ + σ ⊙ ε, where ε ~ N(0,1)
- Randomness separated from learned parameters (μ, σ)
- Gradients flow through μ and σ, not ε
- Enables end-to-end training of probabilistic models
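A numpy sanity check of the same identity: sampling z = μ + σ·ε with fixed μ and σ recovers the target distribution, while all randomness lives in ε (values here are illustrative, not from the project):

```python
import numpy as np

rng = np.random.default_rng(42)
mu, log_var = 1.5, np.log(0.25)        # target N(1.5, 0.25), i.e. sigma = 0.5
eps = rng.standard_normal(100_000)     # randomness lives only in epsilon
z = mu + np.exp(0.5 * log_var) * eps   # deterministic function of mu and sigma
```

Since `z` is a deterministic function of μ and log σ², gradients can flow through both during backpropagation.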
Challenge: Standard early stopping fails for GANs (oscillating losses are healthy)
Solution: Custom scoring heuristic balancing multiple signals:
score = 1.0 / (1.0 + acc_penalty + ratio_penalty + gen_penalty + disc_penalty)
# Where:
# - acc_penalty: |disc_acc - 0.55| / 0.55
# - ratio_penalty: 2.0 if (gen_loss/disc_loss) outside [0.4, 0.8]
# - Loss range penalties for pathological values
Stopping Criteria:
- Patience exhausted (no improvement for 15 epochs, evaluated only after a minimum of 30 epochs)
- Convergence detected (loss variance < 0.001 over 20 epochs)
Result: Automatically detected best checkpoint (Epoch 51, score 0.8567)
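A runnable sketch of the scoring heuristic above, keeping the accuracy and loss-ratio penalties from the pseudocode and dropping the loss-range penalties for brevity (thresholds are the ones listed; the example loss values are illustrative, not logged runs):

```python
def gan_score(gen_loss, disc_loss, disc_acc,
              target_acc=0.55, ratio_band=(0.4, 0.8)):
    """Composite health score for GAN training; higher is better, max 1.0."""
    acc_penalty = abs(disc_acc - target_acc) / target_acc
    ratio = gen_loss / disc_loss
    in_band = ratio_band[0] <= ratio <= ratio_band[1]
    ratio_penalty = 0.0 if in_band else 2.0
    return 1.0 / (1.0 + acc_penalty + ratio_penalty)

# values inside the reported equilibrium bands
good = gan_score(gen_loss=0.75, disc_loss=1.4, disc_acc=0.57)
# discriminator winning outright (a common failure mode)
bad = gan_score(gen_loss=0.75, disc_loss=1.4, disc_acc=0.98)
```

A discriminator near 55% accuracy scores close to 1.0, while a collapsed game (discriminator near 100%) is penalized heavily, which is what lets the callback tolerate healthy oscillation.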
Challenge: KL divergence loss magnitude often 100-1000x larger than reconstruction loss
Solution: Principled β calculation targeting 10-20% KL contribution:
# Step 1: Measure typical loss ranges
typical_mse = 0.06 # Reconstruction loss
typical_kl = 20 # KL loss (unweighted)
# Step 2: Solve for β
target_proportion = 0.15
β = (target_proportion * typical_mse) / typical_kl
= (0.15 * 0.06) / 20
= 0.00045
  ≈ 0.001 # Round for convenience
Validation: Post-training analysis confirmed 15% KL contribution with sharp reconstructions
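The derivation above as runnable arithmetic, confirming that the computed β recovers the target KL contribution before rounding:

```python
typical_mse = 0.06        # measured reconstruction (MSE) loss
typical_kl = 20.0         # measured unweighted KL loss
target_proportion = 0.15  # aim: weighted KL at ~15% of reconstruction magnitude
beta = target_proportion * typical_mse / typical_kl
kl_share = beta * typical_kl / typical_mse  # recovers the target proportion
```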
- Automated Product Categorization (CNN) - Scale to millions of products
- Visual Search (CNN + Transfer Learning) - "Find similar items" functionality
- Design Prototyping (VAE + GAN) - Generate design variations, rapid ideation
- Synthetic Data Generation (GAN) - Augment limited fashion datasets
- Transfer Learning Pipelines (Autoencoder) - Pre-train on unlabeled data, fine-tune on target task
- Anomaly Detection (VAE) - Identify defects via reconstruction error
- Data Augmentation (GAN) - Generate training samples for rare classes
- Feature Extraction (CNN) - Frozen encoders for downstream tasks
- Architecture Comparison Studies (CNN vs RNN) - Validate architectural choices empirically
- Semi-Supervised Learning (Denoising Autoencoder) - Leverage unlabeled data effectively
- Latent Space Analysis (VAE) - Understand learned representations via 2D visualization
- Adversarial Training (GAN) - Apply game-theoretic concepts to neural networks
All projects use Python 3.10 with TensorFlow 2.10.0 for consistency. Each project folder contains:
- environment.yml - Conda environment specification
- requirements.txt - pip dependencies (alternative)
# Navigate to any project folder
cd 00_CNN_RNN_ImageClassification/
# Create environment from YAML
conda env create -f environment.yml
# Activate environment (name varies by project)
conda activate dl-cnn-rnn # or dl-autoencoder, vae-project, gan-project
# Run main script
python matheus_CNN_RNN.py # Filename varies by project

| Project | CPU | GPU | Recommended |
|---|---|---|---|
| CNN/RNN | 5-10 min | 2-3 min | CPU acceptable |
| Autoencoder | 10-15 min | 3-5 min | CPU acceptable |
| VAE | 30-35 min | 3-4 min | GPU recommended |
| GAN | Hours/Days | 35 seconds | GPU required |
| Group Project | Varies | Varies | Depends on models |
Free GPU: Google Colab provides free T4 GPU access (sufficient for all projects)
# Create virtual environment
python -m venv dl-env
source dl-env/bin/activate # Linux/Mac
# or
dl-env\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
# Run project
python main_script.py

# Core (all projects)
tensorflow==2.10.0
numpy==1.23.5
matplotlib==3.7.2
# Additional (specific projects)
tensorflow-probability==0.18.0 # VAE only
scikit-learn # CNN/RNN evaluation
seaborn # Enhanced visualizations

Author: Matheus Ferreira Teixeira
GitHub: github.com/domvito55
LinkedIn: linkedin.com/in/mathteixeira
Note: Each project folder contains its own detailed README with complete implementation details, architecture diagrams, training procedures, and results analysis.