All code except `object_detection.py` was taken from Max deGroot and Ellis Brown's ssd.pytorch repository. Some modifications were made so that the project runs on Windows 10 and Python 3.6 with PyTorch 0.4.1 for CUDA 9.2.
| Local Machine Specs |
|---|
| Windows 10 |
| NVIDIA GeForce GTX 850M |
| CUDA 10.0 (Download) |
| cuDNN 10.0 (Download) |
Even though the PyTorch 0.4.1 build for CUDA 9.2 was installed, it also works with CUDA 10.0.
Before you start, please refer to the original repository for how to use this code properly.
Basically, what you will need to do is:
- Download the datasets and the pretrained VGG-16 base network (both described in the original repository).
- Install PyTorch by visiting the website and choosing your specifications.
- Install OpenCV, NumPy & imageio by executing the following line in your Terminal/Command Prompt:

      pip install -r requirements.txt

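To confirm that the installation steps above worked, a small check like the following can help. It is only a sketch: the list of import names (`torch`, `cv2`, `numpy`, `imageio`) is an assumption based on the packages mentioned above, and `missing_modules` is a hypothetical helper, not part of this repository.

```python
import importlib.util

# Import names of the packages mentioned above (assumption: cv2 is the
# module name under which OpenCV is installed).
REQUIRED_MODULES = ["torch", "cv2", "numpy", "imageio"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

The check only looks the modules up without importing them, so it stays fast even when PyTorch is installed.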
The following modifications have been made to successfully execute train.py:
In train.py, line 203 (line 165 in the original repo) was changed from:

    images, targets = next(batch_iterator)

to:

    try:
        images, targets = next(batch_iterator)
    except StopIteration:
        batch_iterator = iter(data_loader)
        images, targets = next(batch_iterator)

The fix was copied from this comment.
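The restart-on-exhaustion pattern can be tried in isolation. The sketch below uses a plain Python list as a stand-in for the real `data_loader` (an assumption, so that it runs without PyTorch), and `next_batch` is a hypothetical helper that wraps the same `try`/`except` logic.

```python
def next_batch(batch_iterator, data_loader):
    """Return (batch, iterator), restarting the iterator once it is exhausted."""
    try:
        batch = next(batch_iterator)
    except StopIteration:
        # One full pass over the data is done: start a new pass.
        batch_iterator = iter(data_loader)
        batch = next(batch_iterator)
    return batch, batch_iterator


# Stand-in for the real data loader: two "batches" of (images, targets).
data_loader = [("images_0", "targets_0"), ("images_1", "targets_1")]
batch_iterator = iter(data_loader)

seen = []
for iteration in range(5):  # more iterations than there are batches
    (images, targets), batch_iterator = next_batch(batch_iterator, data_loader)
    seen.append(images)

print(seen)
```

Without the `except` branch, the third call to `next` would raise `StopIteration`, because the loop runs for more iterations than one pass over the data provides.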
The naming of the saved model was fixed in train.py on lines 239 and 244 (lines 196 and 198 in the original repo).
In layers/modules/multibox_loss.py, add `loss_c = loss_c.view(pos.size()[0], pos.size()[1])` on line 97, like so:

    # Hard Negative Mining
    loss_c = loss_c.view(pos.size()[0], pos.size()[1])
    loss_c[pos] = 0  # filter out pos boxes for now
    loss_c = loss_c.view(num, -1)

and then change `N = num_pos.data.sum()` to `N = num_pos.data.sum().float()` on line 115.
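The reason for the extra `view` is easier to see on toy data. The sketch below uses NumPy arrays as stand-ins for the PyTorch tensors and assumes that `loss_c` arrives as a flat column of shape `(num * num_priors, 1)` while `pos` is a boolean mask of shape `(num, num_priors)`. The shapes and values are illustrative, not taken from the repository.

```python
import numpy as np

num, num_priors = 2, 4  # toy batch size and number of priors per image

# Per-prior confidence loss as a flat column: shape (num * num_priors, 1).
loss_c = np.arange(1, num * num_priors + 1, dtype=np.float32).reshape(-1, 1)

# Boolean mask of positive (matched) priors: shape (num, num_priors).
pos = np.array([[True, False, False, False],
                [False, False, True, False]])

# Reshape first so that the mask and the loss have the same shape.
loss_c = loss_c.reshape(pos.shape[0], pos.shape[1])
loss_c[pos] = 0  # filter out positive boxes for now
loss_c = loss_c.reshape(num, -1)

# Count the positives and convert to float before using the count as a divisor.
num_pos = pos.sum(axis=1, keepdims=True)
N = float(num_pos.sum())

print(loss_c)
print(N)
```

Converting `N` to a float presumably keeps the later normalization of the loss in floating point rather than mixing integer and float types.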
In layers/functions/detection.py, line 62 was changed from:

    if scores.dim() == 0:
        continue

to:

    if scores.size(0) == 0:
        continue

If you are training on a Windows machine, make sure to set the `--num_workers` flag to 0, or you will get a `BrokenPipeError: [Errno 32] Broken pipe` error. On my machine, I also need to close all other programs (except the Command Prompt, of course), set the batch size to 2 and lower the learning rate to 0.000001 (1e-6) in order to train the model; otherwise I get a `RuntimeError: CUDA error: out of memory` error.

    python train.py --num_workers 0 --batch_size 2 --lr 1e-6

Since training on my local machine with the settings/flags above would take days (or even weeks) to get reasonable results, I decided to train the SSD on an AWS spot instance.
To set up an AWS spot instance, follow these steps:
- Log in to your Amazon AWS account
- Navigate to EC2 > Instances > Spot Requests > Request Spot Instances
- Under `AMI`, click on `Search for AMI`, type `AWS Deep Learning AMI` in the search field, choose `Community AMIs` from the drop-down and select the `Deep Learning AMI (Ubuntu) Version 14.0`
- Delete the default instance type, click on Select and select the p2.xlarge instance
- Uncheck the `Delete` checkbox under EBS Volumes so your progress is not deleted when the instance gets terminated
- Set Security Groups to default
- Select your key pair under Key pair name (if you don't have one, create a new key pair)
- At the very bottom, set `Request valid until` to about 10 - 12 hours and check `Terminate instances at expiration` (You don't have to do this, but keep in mind that you may receive a very large bill from AWS if you forget to terminate your spot instance, because the default value for termination is set to 1 year.)
- Click `Launch`, wait until the instance is created and then connect to your instance via ssh
There's also a detailed explanation from AWS about AWS Deep Learning AMIs. You might give it a shot as well.
Once your spot instance is up and running and you have connected to it, activate the PyTorch environment like so:

    source activate pytorch_p36

Lastly, clone this repository, proceed with the installation process (except for PyTorch) and start training by executing:

    # you probably don't need to add any arguments here
    python train.py

If you don't want to train an SSD model and only want to try the detection, you can download my trained SSD model. I trained the model with all default values/parameters from the original repository but stopped the training after 1500 iterations because the loss stagnated.
To detect objects in a video, you first need to install ffmpeg by executing the following line:

    conda install ffmpeg -c conda-forge

Note: This command only works if you have the Anaconda Distribution installed on your computer.
After you have trained the SSD model, you can detect objects in a video by executing the following line in your Terminal/Command Prompt:

    python object_detection.py path_to/your_ssd_model.pth path_to/your_video.mp4 -o name_of_your_output_video.mp4

If the `-o` flag is not specified, the output video will simply be named output.mp4.
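For readers curious how a command line of this shape is typically parsed, here is a minimal `argparse` sketch. It is not the actual code of `object_detection.py`: the argument names `model` and `video`, the long option `--output` and the help texts are assumptions chosen to mirror the usage shown above.

```python
import argparse


def build_parser():
    """Build a parser with two positional paths and an optional output name."""
    parser = argparse.ArgumentParser(
        description="Detect objects in a video with a trained SSD model (illustrative sketch)."
    )
    parser.add_argument("model", help="path to the trained SSD weights (.pth)")
    parser.add_argument("video", help="path to the input video")
    # Falls back to output.mp4 when -o is not given, as described above.
    parser.add_argument("-o", "--output", default="output.mp4",
                        help="name of the output video")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["path_to/your_ssd_model.pth", "path_to/your_video.mp4"]
)
print(args.model, args.video, args.output)
```

Passing `-o name_of_your_output_video.mp4` in the argument list would replace the default output name.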
You can watch sample outputs from here:
