Welcome to my machine learning repository! Here you'll find a collection of notebooks that I've created while exploring the world of machine learning. I've used a variety of libraries, including PyTorch, transformers, and xformers, to build models and tackle tasks from scratch. Many of the notebooks are well commented in English, so feel free to learn along with me. Please note that there may be some mistakes or unfinished notebooks; issues and pull requests are welcome!
Warning: Some notebooks may be unfinished; please use them with caution.
To get started, you can either install the required environment with conda or build a Docker image.
First, clone the repository and navigate to the MachineLearning directory:
git clone https://github.com/JenkinsGage/MachineLearning.git
cd MachineLearning
To install the environment using conda, run the following commands:
conda env create --file environment.yml
conda activate ml-torch
Alternatively, you can build a Docker image and run a container:
docker build -t ml-torch-cuda .
docker run -dp 8888:8888 ml-torch-cuda
Once the container is up and running, a Jupyter Lab server will be available on port 8888.
This repository contains a variety of notebooks covering different areas of machine learning. Here's an overview of what you'll find; minimal code sketches illustrating several of the notebooks follow the list.
- Build a Translation Model Using the Transformer Module of PyTorch
In this notebook, I use PyTorch's transformer module to build a model that translates Chinese to English. The Chinese text is tokenized with jieba and the English text with torchtext's basic English tokenizer. The model is trained on the wmt19 dataset from Hugging Face.
- Build Tokenizer Using Tokenizers Library
Here I use the tokenizers library to build tokenizers for both English and Chinese. Both use the WordPiece model, so the same approach can be applied to other languages as well.
- Build a Translation Model Using the XFormers Library and Tokenizers
In this notebook, I use the xformers library to build the transformer model quickly. Its memory-efficient attention reduces memory usage, which means a more complex model can be trained within a limited memory budget. I also use the tokenizers trained in [Build Tokenizer Using Tokenizers Library], so please run that notebook first to get the tokenizers before training this model.
- Build a Translation Model With Pretrained BERT as Encoder
In this notebook, I replace the encoder from the previous notebook with a pretrained BERT model and freeze all of the encoder's parameters. I also apply learning rate warmup to make training more stable.
- Using Pretrained Model from Huggingface for Paraphrasing
In this notebook, I use a pretrained model from Hugging Face (humarin/chatgpt_paraphraser_on_T5_base) to paraphrase sentences.
- Paraphrase with Gradio WebUI App
Here I use Gradio to build a web-based user interface for interacting with the humarin/chatgpt_paraphraser_on_T5_base model.
...
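To give a feel for the tokenization used in the first translation notebook, here is a minimal sketch of tokenizing a Chinese/English sentence pair with jieba and torchtext's basic English tokenizer. The sample sentences are illustrative only, not taken from the notebook.

```python
import jieba
from torchtext.data.utils import get_tokenizer

# Chinese side: jieba segments a sentence into a list of word tokens.
zh_tokenize = jieba.lcut
# English side: torchtext's rule-based "basic_english" tokenizer lower-cases and splits.
en_tokenize = get_tokenizer("basic_english")

zh_tokens = zh_tokenize("我喜欢机器学习")            # e.g. ['我', '喜欢', ...]
en_tokens = en_tokenize("I like machine learning")  # ['i', 'like', 'machine', 'learning']
print(zh_tokens, en_tokens)
```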
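The tokenizer notebook trains WordPiece tokenizers with the Hugging Face tokenizers library. Below is a rough sketch of that workflow; the corpus file name, vocabulary size, and special tokens are assumptions for illustration, not the notebook's actual settings.

```python
from tokenizers import Tokenizer
from tokenizers.models import WordPiece
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import WordPieceTrainer

# A WordPiece model with an unknown-token placeholder.
tokenizer = Tokenizer(WordPiece(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# Vocabulary size and special tokens here are illustrative assumptions.
trainer = WordPieceTrainer(
    vocab_size=30000,
    special_tokens=["[UNK]", "[PAD]", "[BOS]", "[EOS]"],
)
tokenizer.train(files=["corpus_en.txt"], trainer=trainer)  # hypothetical corpus file
tokenizer.save("tokenizer_en.json")

print(tokenizer.encode("machine learning is fun").tokens)
```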
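The xformers notebook relies on memory-efficient attention to keep the transformer's memory footprint down. The sketch below shows only the core attention call; the tensor shapes are a toy example, and the actual notebook builds a full encoder-decoder model around it.

```python
import torch
import xformers.ops as xops

# Toy shapes: batch=2, sequence length=128, heads=8, head dim=64.
q = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 128, 8, 64, device="cuda", dtype=torch.float16)

# Computes softmax(QK^T / sqrt(d)) V without materializing the full attention matrix,
# which is where the memory savings come from.
out = xops.memory_efficient_attention(q, k, v)
print(out.shape)  # (2, 128, 8, 64)
```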
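For the pretrained-BERT-encoder notebook, the two key ideas are freezing the encoder and warming up the learning rate. Here is a minimal sketch of both, assuming bert-base-uncased, AdamW, and transformers' linear warmup schedule; the checkpoint name, step counts, and the stand-in decoder are illustrative placeholders, not the notebook's actual configuration.

```python
import torch
from transformers import AutoModel, get_linear_schedule_with_warmup

# Pretrained BERT used as a frozen encoder (checkpoint name is an assumption).
encoder = AutoModel.from_pretrained("bert-base-uncased")
for param in encoder.parameters():
    param.requires_grad = False  # encoder weights are not updated

# 'decoder' stands in for the trainable translation decoder from the notebook.
decoder = torch.nn.TransformerDecoder(
    torch.nn.TransformerDecoderLayer(d_model=768, nhead=8, batch_first=True),
    num_layers=6,
)

optimizer = torch.optim.AdamW(decoder.parameters(), lr=1e-4)
# Linearly increase the LR over the first steps, then decay it - the warmup trick.
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=1_000, num_training_steps=100_000
)
```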
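Loading and prompting the paraphrasing model from the Hugging Face Hub looks roughly like the sketch below. The "paraphrase:" prefix and the generation settings are assumptions based on typical T5-style usage, not necessarily what the notebook uses.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "humarin/chatgpt_paraphraser_on_T5_base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Machine learning lets computers learn patterns from data."
# T5-style task prefix; whether the notebook uses exactly this prompt is an assumption.
inputs = tokenizer("paraphrase: " + text, return_tensors="pt")

outputs = model.generate(
    **inputs,
    num_beams=5,
    num_return_sequences=3,  # ask for three candidate paraphrases
    max_length=128,
)
for candidate in tokenizer.batch_decode(outputs, skip_special_tokens=True):
    print(candidate)
```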
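Finally, the Gradio app wraps the same paraphraser in a small web UI. A minimal sketch follows, with a placeholder `paraphrase` function standing in for the model call; the layout and launch options in the actual GradioApp.py may differ.

```python
import gradio as gr

def paraphrase(text: str) -> str:
    # Placeholder: in the real app this calls the humarin/chatgpt_paraphraser_on_T5_base model.
    return text

demo = gr.Interface(
    fn=paraphrase,
    inputs=gr.Textbox(lines=3, label="Original sentence"),
    outputs=gr.Textbox(label="Paraphrase"),
    title="Paraphraser",
)
demo.launch()  # serves a local web UI; add server_name="0.0.0.0" to expose it from a container
```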
The repository is organized as follows:
├── MachineLearning
│   ├── Area(NLP, Machine Vision, ...)
│   │   ├── Task(Translation, Paraphrasing, ...)
│   │   │   ├── Model
│   │   │   │   ├── SavedModels
│   │   │   ├── Data
│   │   │   │   ├── Datasets
│   │   │   ├── Notebook1.ipynb
│   │   │   ├── Notebook2.ipynb
│   │   │   ├── ...
│   │   │   ├── GradioApp.py