This repo contains code and results for a project in which we train and evaluate image classification networks that are rotation- and scale-invariant out of the box.
At the moment, there is a small PowerPoint that describes the general architecture of the deep log-polar network (I can't say whether that might be more or less helpful than this README).
The building block of most of the networks is an invariant log-polar space convolution layer (an "LPConv" layer). It takes an H x W x C_in tensor, performs a bunch of log-polar transforms, one centered at every pixel of the image, and produces an H x W x C_in x N_tau x N_theta tensor (the order of the dims here may be off). A 2D convolution and pooling are then performed over the last two dims. This extracts invariant features and outputs an H x W x C_in tensor. For example, a high activation at position (2, 1) in the first output channel might mean there's an edge at some angle and some scale. A hedged sketch of this layer is given below.
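To make the tensor shapes concrete, here is a minimal, untested sketch of such a layer. Everything in it is an assumption rather than the repo's actual code: the class name `LPConv`, the sampling pattern, the use of `grid_sample` for the bilinear log-polar transforms, and the choice of max-pooling are all illustrative.

```python
# Sketch only -- names, sampling pattern, and pooling are assumptions.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class LPConv(nn.Module):
    """Per-pixel log-polar transform, then conv + pool over (tau, theta)."""

    def __init__(self, channels, n_tau=20, n_theta=16, r_max=0.5):
        super().__init__()
        self.n_tau, self.n_theta = n_tau, n_theta
        # Conv over the (tau, theta) axes of every pixel's LP patch.
        # (Circular padding over theta would be more faithful for rotation
        # invariance; plain zero padding keeps the sketch short.)
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # Log-spaced radii and uniform angles, in grid_sample's [-1, 1] units.
        tau = torch.logspace(-2, math.log10(r_max), n_tau)
        theta = torch.linspace(0, 2 * math.pi, n_theta + 1)[:-1]
        dx = tau[:, None] * torch.cos(theta)[None, :]        # (n_tau, n_theta)
        dy = tau[:, None] * torch.sin(theta)[None, :]
        self.register_buffer("offsets", torch.stack([dx, dy], dim=-1))

    def forward(self, x):                                    # x: (B, C, H, W)
        B, C, H, W = x.shape
        ys = torch.linspace(-1, 1, H, device=x.device)
        xs = torch.linspace(-1, 1, W, device=x.device)
        base = torch.stack(torch.meshgrid(xs, ys, indexing="xy"), dim=-1)  # (H, W, 2)
        # One log-polar sampling grid centered at every pixel.
        grid = base[:, None, :, None, :] + self.offsets[None, :, None, :, :]
        grid = grid.reshape(H * self.n_tau, W * self.n_theta, 2)
        patches = F.grid_sample(x, grid.expand(B, -1, -1, -1), align_corners=True)
        # (B, C, H*n_tau, W*n_theta) -> one (tau, theta) patch per pixel.
        patches = patches.reshape(B, C, H, self.n_tau, W, self.n_theta)
        patches = patches.permute(0, 2, 4, 1, 3, 5).reshape(
            B * H * W, C, self.n_tau, self.n_theta)
        # Conv over (tau, theta), then pool both axes away: max over theta
        # gives rotation invariance, max over tau scale invariance.
        feats = self.conv(patches).amax(dim=(2, 3))          # (B*H*W, C)
        return feats.reshape(B, H, W, C).permute(0, 3, 1, 2)  # (B, C, H, W)
```

Note that this naive version materializes an H x W grid of patches at once, so it is memory-hungry; it is only meant to show the data flow, not how the real code does it.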
The deep log-polar network is a series of invariant log-polar space convolution layers, with depthwise linear layers (i.e., 1x1 convolutions) in between them.
The LP-ResNet network is just a ResNet, except every convolutional layer is replaced by an invariant log-polar space convolution layer.
There's also the single log-polar transform network, where instead of performing just one conv+pool operation per log-polar transform, the whole network operates in log-polar space.
In short (see the sketch after this list):

- DeepLP: `[LP Transform -> Conv+pool] -> [LP Transform -> Conv+pool] -> ...`
- SingleLP: `LP Transform -> Deep Network -> Pool`
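As a rough illustration of how these two variants compose, here is a hedged sketch. It assumes the `LPConv` sketch above, and every layer width, layer count, and helper name is made up rather than taken from the repo:

```python
# Sketch only -- layer sizes/counts and helper names are illustrative.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def deep_lp(channels=(16, 32, 64), in_channels=3, n_classes=10):
    # DeepLP: invariant LPConv blocks, with 1x1 convs (the "depthwise linear"
    # layers) mixing channels between them, then a classifier head.
    layers, c_prev = [], in_channels
    for c in channels:
        layers += [nn.Conv2d(c_prev, c, kernel_size=1),   # 1x1 channel mixing
                   LPConv(c), nn.ReLU()]
        c_prev = c
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(c_prev, n_classes)]
    return nn.Sequential(*layers)

def log_polar(x, n_tau=64, n_theta=64):
    # One global log-polar resampling of the image about its center.
    tau = torch.logspace(-2, 0, n_tau, device=x.device)   # radii in (0, 1]
    theta = torch.linspace(0, 2 * math.pi, n_theta + 1, device=x.device)[:-1]
    grid = torch.stack([tau[:, None] * torch.cos(theta)[None, :],
                        tau[:, None] * torch.sin(theta)[None, :]], dim=-1)
    return F.grid_sample(x, grid.expand(x.shape[0], -1, -1, -1), align_corners=True)

class SingleLP(nn.Module):
    # SingleLP: one LP transform up front, an ordinary CNN operating entirely
    # in log-polar space, then global pooling. Rotations/scalings of the input
    # become (roughly) shifts in LP space, which the pooling discards.
    def __init__(self, in_channels=3, n_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveMaxPool2d(1), nn.Flatten(), nn.Linear(64, n_classes))

    def forward(self, x):
        return self.net(log_polar(x))
```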
- `models/`: Python model code, containing lots of weird custom layers that might not be useful.
- `out/`: script-generated logs, checkpoints, config files, and model weights. `running_jobs` contains the stderr and stdout files for the currently running jobs, which are useful for monitoring training progress and the model summary (note: these aren't auto-deleted, unfortunately).
- `param-files/`: YAML files that fully specify a training run. After creating one here, you can run it, or a few at a time, by editing `train_batch.sh` and then running `sbatch train_batch.sh`.
Both `out/` and `param-files/` are organized as `model_name/dataset_name`.
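For example, the layout might look like this (the `Deep_LP_train/mnist_r` names are taken from the evaluation example below; the exact YAML file name is a guess):

```
param-files/
└── Deep_LP_train/
    └── mnist_r/
        └── bilinear_ntau20_new.yaml
out/
└── Deep_LP_train/
    └── mnist_r/
        └── bilinear_ntau20_new_427144/
```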
A training run is specified by a YAML file, for example this one for MNIST.
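As a purely hypothetical illustration of what such a file might contain (the real schema is defined by the training code, and none of these keys are confirmed; `ntau20` and `bilinear` in the experiment names below hint at options along these lines):

```yaml
# Hypothetical param file -- every key name here is a guess, not the repo's schema.
model: deep_lp
dataset: mnist_r
epochs: 100
batch_size: 128
lr: 0.001
lp_conv:
  n_tau: 20          # number of log-radius samples
  n_theta: 16        # number of angle samples
  interpolation: bilinear
```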
These YAML files are loaded by `train_batch.sh`, which creates the models and trains them (an edited excerpt is sketched after this list):

- Put the filenames (without `.yaml`) in `$ParamFiles` in `train_batch.sh`, like this: `ParamFiles=(resnet20 resnet32 resnet44 resnet56)`.
- Change `#SBATCH --array=0-3` to select from that list (for example, if you have 3 param files you only need `--array=0-2`).
- To specify the parent directory of those YAML files, edit the second `ParamFile` prefix.
- Edit `ExperimentRelativePath` to specify the output directory under `out/$SLURM_JOB_NAME`.
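For instance, after those edits the relevant lines of `train_batch.sh` might look like the following (a hedged sketch: only the lines the steps above mention are shown, and the `ExperimentRelativePath` value is invented):

```bash
# Illustrative excerpt of an edited train_batch.sh -- the rest of the
# script is omitted, and the ExperimentRelativePath value is made up.
#SBATCH --array=0-2                      # 3 param files -> indices 0..2

ParamFiles=(resnet20 resnet32 resnet44)  # filenames without .yaml
ExperimentRelativePath="resnet/cifar10"  # output goes under out/$SLURM_JOB_NAME
```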
- After training, you can evaluate a trained model by running `python3 evaluate_2d.py` with the model's experiment directory, for example `out/Deep_LP_train/mnist_r/bilinear_ntau20_new_427144/30deg_rotations_tk12_lg_1`. Or you can use `evaluate_batch.sh` to run a few at a time.
- You'll have to edit `evaluate_2d.py` to generate whatever datasets you want to test the models on (it's a bit of a mess, sorry).
- After running that script, an `evaluate_results.yaml` file will appear in the experiment's directory. If you're using `evaluate_batch.sh`, you can also monitor the progress by looking in the `out/.../running_jobs` directory.
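Putting that together, a single evaluation run would look something like this (assuming the experiment directory is passed as a positional argument, which is a guess; check `evaluate_2d.py`'s argument handling):

```bash
# Assumed invocation style -- verify against evaluate_2d.py's CLI.
python3 evaluate_2d.py out/Deep_LP_train/mnist_r/bilinear_ntau20_new_427144/30deg_rotations_tk12_lg_1
# Results land in <experiment_dir>/evaluate_results.yaml
```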