Instructor: Sergio A. Mora Pardo
- email: sergioa.mora@javeriana.edu.co
- github: sergiomorapardo
The course builds knowledge of the challenges and solutions found in organizations that require advanced, specialized handling of information, such as text mining, process mining, stream data mining, and social network analysis. The Natural Language Processing module explains how to build systems that learn and adapt, using real-world applications. Topics include text preprocessing, text representation, modeling of common NLP problems such as sentiment analysis and text similarity, recurrent models, word embeddings, and an introduction to generative language models. The course is project-oriented, with emphasis on writing software implementations of learning algorithms applied to real-world problems, in particular language processing and sentiment detection.
- Python version >= 3.7
- Numpy, the core numerical extensions for linear algebra and multidimensional arrays
- Scipy, additional libraries for scientific programming
- Matplotlib, excellent plotting and graphing libraries
- IPython, with the additional libraries required for the notebook interface
- Pandas, the Python counterpart of R's data frame
- Seaborn, used mainly for plot styling
- scikit-learn, machine learning library
A good, easy-to-install option that supports Mac, Windows, and Linux, and that includes all of these packages (and much more), is Anaconda.
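Whichever installation route is used, it can help to confirm that the packages listed above import correctly before the first session. The following is a minimal sketch, not part of the course material: the function name `check_environment` and the `REQUIRED` list are hypothetical, and the import names are assumed to be the usual ones for each package (scikit-learn is imported as `sklearn`).

```python
import importlib
import sys

# Import names assumed for the packages listed above
# (scikit-learn is imported as "sklearn").
REQUIRED = ["numpy", "scipy", "matplotlib", "IPython", "pandas", "seaborn", "sklearn"]


def check_environment(modules=REQUIRED, min_python=(3, 7)):
    """Return {module name: version string, or None if the import fails}."""
    if sys.version_info < min_python:
        raise RuntimeError("Python >= %d.%d is required" % min_python)
    report = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Broad catch on purpose: a broken install can fail with
            # errors other than ImportError.
            report[name] = None
        else:
            # Not every module exposes __version__, so fall back to "unknown".
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in check_environment().items():
        print("{:<12} {}".format(name, version or "MISSING"))
```

Run as a script, it prints one line per package with either its version or `MISSING`; any missing package can then be installed through Anaconda or another package manager.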
Git: unfortunately out of the scope of this class, but please take a look at these tutorials
- 50% Project
- 40% Exercises
- 10% Class participation
| Session | Activity | Deadline | Comments |
|---|---|---|---|
| Deep Learning | Exercises / Project | March 21st | Expo March 22nd |
| NLP | Exercises / Project | April 25th / April 11th | Expo April 12th |
| Graph Learning | Exercises / Project | May 24th | |
| Final grade | Project | May 31st | |
| Date | Session | Notebooks/Presentations | Exercises |
|---|---|---|---|
| March 1st | Machine Learning Operations (MLOps) | Intro MLOps | |
| March 1st | ML monitoring & Data Drift | Intro Data Drift; L2 - Intro Data Drift; L3 - Intro Model Monitoring | E1 - Data Drift in Used Vehicle Price Prediction |
| March 1st | Machine Learning as a Service (AIaaS) | 1 - Intro to APIs; L1 - Model Deployment | E2 - Model Deployment in Used Vehicle Price Prediction |
| Date | Session | Notebooks/Presentations | Exercises |
|---|---|---|---|
| March 8th | First steps in deep learning | 3 - Intro Deep Learning; L3 - Introduction Deep Learning MLP; L4 - Simple Neural Network (handcraft); L5 - Simple Neural Network (Images); L6 - Deep Learning with Keras; L7 - Deep Learning with PyTorch | E3 - Neural Networks in Keras and PyTorch |
| March 15th | Deep Computer Vision | 4 - Convolutional Neural Networks; L5 - CNN with TensorFlow; L6 - CNN with PyTorch 🔥; L7 - Transfer Learning with TensorFlow | E4 - Transfer Learning with PyTorch |
| March 22nd | Computer Vision Project | Exercises Deadline | P1 - Frailejon Detection (a.k.a "Big Monks Detection") |
| Date | Session | Notebooks/Presentations | Exercises |
|---|---|---|---|
| March 22nd | Introduction to NLP | 1 - Introduction to NLP; 2 - NLP Pipeline | E1 - Tokenization |
| Date | Session | Notebooks/Presentations | Exercises |
|---|---|---|---|
| March 22nd | Vector Space Models | 1 - Basic Vectorization Approaches; L2 - OneHot Encoding; L3 - Bag of Words; L4 - N-grams; L5 - TF-IDF; L6 - Basic Vectorization Approaches | E2 - Sentiment Analysis |
| March 29th | Distributed Representations | 2 - Word Embeddings; L7 - Text Similarity; L8 - Exploring Word Embeddings; L9 - Song Embeddings; L10 - Visualizing Embeddings | E2 - Homework Analysis (Bonus); E3 - Song Embedding Visualization; E4 - Spam Classification |
| Date | Session | Notebooks/Presentations | Exercises |
|---|---|---|---|
| April 5th | Deep Learning in NLP (RNN, LSTM, GRU) | 4 - RNN, LSTM, GRU; L12 - NLP with Keras; L11 - NLP with Keras; L13 - Recurrent Neural Network and LSTM; L14 - Headline Generator | E5 - Neural Networks in Keras for NLP; E6 - Neural Networks in PyTorch for NLP; E7 - RNN, LSTM, GRU |
| April 12th | NLP Project | P1 - Movie Genre Prediction | |
| April 12th | Attention, Transformers and BERT | 5 - Encoder-Decoder; 6 - Attention Mechanisms and Transformers; 7 - BERT and Family; L16 - Positional Encoding; L17 - BERT for Sentiment Classification; L18 - Transformers Introduction | E8 - Text Summary; E9 - Question Answering |
| April 19th | Holy Week | Holy Week | Holy Week |
| April 25th | Exercises Deadline | | |
| Date | Session | Notebooks/Presentations | Exercises |
|---|---|---|---|
| April 26th | Intro to Graphs | Intro to Graphs; L19 - Intro to Graphs | |
| April 26th | Graph Metrics | L20 - Graph Metrics; L21 - Graphs Benchmarks; L22 - Facebook Analysis | E10 - Twitter Analysis |
| Date | Session | Notebooks/Presentations | Exercises |
|---|---|---|---|
| May 3rd | Graph Representation | Graph Representations; L23 - Graph Embedding; L24 - Deep Walk; L25 - Node2Vec; L26 - Recommendation System with Node2Vec | E11 - Patent Citation Network (Node2Vec with RecSys) |
| Date | Session | Notebooks/Presentations | Exercises |
|---|---|---|---|
| May 10th | Graph Neural Network | L27 - Graph Neural Networks - Node Features; L28 - Graph Neural Networks - Node2Vec; L29 - Graph Neural Networks - Adjacency Matrix; L31 - Graph Neural Networks - Graph Convolutional Networks (GCN); L33 - Graph Neural Networks - Graph Attention Networks (GAT); L34 - Graph Convolutional Networks - Node Regression | L30 - Graph Neural Networks - Facebook Page-Page dataset; L32 - Graph Convolutional Networks - Facebook Page-Page dataset; L34 - Graph Attention Networks - CiteSeer |
| May 17th | Graph Machine Learning Task [Optional] | L35 - Graph AutoEncoder - Link Prediction; L36 - Graph Variational AutoEncoder - Link Prediction [extra]; L37 - Node2Vec - Link Classification; L38 - Graph Isomorphism Network - Graph Classification | |
| May 24th | Geometric Deep Learning Project | Exercises Deadline | P3 - Graph Machine Learning / P3 - Graph Machine Learning [old < 2022] |
| Date | Session | Notebooks/Presentations | Exercises |
|---|---|---|---|
| May 31st | Final Grades | | |
| Module | Topic | Material |
|---|---|---|
| NLP | Word Embedding Projector | Tensorflow Embeddings Projector |
| NLP | Time Series with LSTM | ARIMA-SARIMA-Prophet-LSTM |
| NLP | Stanford | Natural Language Processing with Deep Learning |
| GML | Stanford | CS224W: Machine Learning with Graphs |
| Module | Topic | Material |
|---|---|---|
| NLP | Polarity | Sentiment Analysis - Polarity |
| NLP | Image & Text | Image Captions |
| ML | Hyperparameter Tuning [WIP] | Exhaustive Grid Search; Randomized Parameter Optimization; Automate Hyperparameter Search |
| NLP | Neural Style Transfer | Style Transfer |