This repository compares the performance of two deep-learning architectures, the DistilBERT transformer and an LSTM, for text classification on the Food-101 dataset. The Food-101 dataset comprises 250 images, with a corresponding text caption for each image, for each of the 101 food dishes. The experiments here do not use the images, only the text captions. The caption files, train_titles.csv and test_titles.csv, are provided in the repository.
There are two Python code files in this repository:
1. distilBertClassifier.ipynb: Python notebook that imports the pre-trained DistilBERT transformer from Hugging Face, fine-tunes the weights, and tests the fine-tuned model.
2. lstmClassifier.ipynb: Python notebook that trains and tests the LSTM + feedforward network using PyTorch.
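Before tokenization, both notebooks need the captions and their class labels out of the CSV files. A minimal, self-contained sketch of that loading step is below; the column names `title` and `label` are assumptions (adjust them to match the actual header of train_titles.csv), and the in-memory sample stands in for the real file:

```python
import csv
import io

def load_titles(csv_file):
    """Read (caption, label) pairs from a Food-101 titles CSV.

    Assumes a header row with 'title' and 'label' columns; adjust the
    field names to match the actual layout of train_titles.csv.
    """
    reader = csv.DictReader(csv_file)
    captions, labels = [], []
    for row in reader:
        captions.append(row["title"])
        labels.append(row["label"])
    return captions, labels

# In the repository you would open train_titles.csv; here an in-memory
# sample keeps the sketch runnable on its own.
sample = io.StringIO(
    "title,label\n"
    "double apple pie with cornmeal crust recipe | myrecipes.com,apple_pie\n"
    "crispy baby back ribs,baby_back_ribs\n"
)
captions, labels = load_titles(sample)
print(captions[0], labels[0])
```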
Please email any questions to aabbasi1@iastae.edu
- The DistilBERT classifier achieves an expected classification accuracy of about 86% on the text Food-101 dataset.
A good sanity check is to inspect a single data point from the data loaders: decode the output token IDs and read the decoded sentence to see if it makes sense. For example:
```
Encoded text = tensor([  101,  3313,  6207, 11345,  2007,  9781,  4168,  2389,
                       19116, 17974,  1064,  2026,  2890,  6895, 10374,  1012,
                        4012,   102,     0,     0,  ...,     0])
# torch.LongTensor, torch.Size([128]); the remaining 110 entries are all 0

Decoded tokens from encoded IDs:
'[CLS] double apple pie with cornmeal crust recipe | myrecipes. com [SEP]'
# followed by 110 '[PAD]' tokens
```

Notice how the decoded sentence contains special tokens: [CLS] (the start-of-sentence/classification token), [SEP] (the separator/end-of-sentence token), and [PAD] (the padding token).
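The structure above can be reproduced with a small stand-alone sketch. The special-token IDs (101 for [CLS], 102 for [SEP], 0 for [PAD]) follow the BERT/DistilBERT convention; the helper name and the toy word IDs are illustrative, not taken from the notebooks:

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # DistilBERT special-token IDs

def encode_fixed_length(word_ids, max_len=128):
    """Wrap word IDs with [CLS]/[SEP] and pad with [PAD] up to max_len."""
    ids = [CLS_ID] + word_ids[: max_len - 2] + [SEP_ID]
    return ids + [PAD_ID] * (max_len - len(ids))

# Toy word IDs standing in for a tokenized caption.
encoded = encode_fixed_length([3313, 6207, 11345, 2007])
print(len(encoded))   # 128
print(encoded[:7])    # [101, 3313, 6207, 11345, 2007, 102, 0]
```

This mirrors why every encoded example in the sanity check has length 128, starts with 101, and trails off into zeros after the 102 separator.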
