Low accuracy from provided ONNX models #59
Description
I downloaded the ONNX models listed at:
https://github.com/snap-research/EfficientFormer/blob/2c0e950dc269b7f0229a3917fd54661b964554e0/README.md#models
and the v2 ONNX models do not achieve the expected accuracy on ImageNet. It appears that they were exported without the trained weights.
I evaluated the ONNX models, run with ONNXRuntime, using the eval_onnx.py script at:
https://gist.github.com/mcollinswisc/5652651fcb59e574fa51571e09507764
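The evaluation in that script amounts to standard ImageNet preprocessing followed by an ONNXRuntime forward pass. As a rough sketch of the preprocessing half (assuming the usual 224×224 center crop and ImageNet mean/std; the exact resize/crop parameters in the gist may differ):

```python
import numpy as np

# Standard ImageNet channel statistics (assumed; check the gist for the actual values used).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(img_hwc_uint8):
    """Center-crop an HWC uint8 image to 224x224 and normalize to the NCHW
    float32 batch-of-1 tensor the ONNX model expects."""
    h, w, _ = img_hwc_uint8.shape
    top, left = (h - 224) // 2, (w - 224) // 2
    img = img_hwc_uint8[top:top + 224, left:left + 224].astype(np.float32) / 255.0
    img = (img - IMAGENET_MEAN) / IMAGENET_STD
    return img.transpose(2, 0, 1)[None]  # HWC -> NCHW, add batch dim
```

The resulting array is what gets fed to `onnxruntime.InferenceSession.run` for each validation image.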
Using this script, for the efficientformerv2_s0.onnx that I downloaded from the Google Drive link in the README.md, I see:
$ python3 eval_onnx.py --model_path=efficientformerv2_s0.onnx --imagenet_dir="${HOME}/data/image-classification/imagenet"
100%|██████████████████████████████████████████████████████████████████████████████████████████| 50000/50000 [22:38<00:00, 36.80it/s]
Top-1 Accuracy: 0.10200000000000001
Top-5 Accuracy: 0.52
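For reference, the top-k numbers above follow the standard definition: a sample counts as correct if the true label is among the k highest-scoring classes. A minimal sketch (function name is mine, not from the gist):

```python
import numpy as np

def topk_correct(logits, label, k):
    """True if `label` is among the k highest-scoring classes in `logits`."""
    topk = np.argsort(logits)[-k:]  # indices of the k largest scores
    return label in topk

# Top-1/top-5 accuracy is then the mean of topk_correct over the
# 50000 validation samples, with k=1 and k=5 respectively.
```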
However, if I do the following:
- Download the trained weights in PyTorch format
- Re-run the export to ONNX with toolbox.py
- Evaluate the new ONNX model that I exported, instead of the downloaded ONNX model.
Then I see approximately the expected accuracy:
$ python toolbox.py --model efficientformerv2_s0 --ckpt weights/eformer_s0_450.pth --onnx
Torch version 2.0.1+cu117 has not been tested with coremltools. You may run into unexpected errors. Torch 2.0.0 is the most recent version that has been tested.
load success, model is initialized with pretrained checkpoint
number of tracked layers (conv, fc, gelu, ...): 126
0.396088544 GMACs
3.534553 M parameters
============= Diagnostic Run torch.onnx.export version 2.0.1+cu117 =============
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================
successfully export onnx
$ python3 eval_onnx.py --model_path=efficientformerv2_s0.onnx --imagenet_dir="${HOME}/data/image-classification/imagenet"
100%|█████████████████████████████████████████████████████████████████████████████| 50000/50000 [13:56<00:00, 59.76it/s]
Top-1 Accuracy: 75.896
Top-5 Accuracy: 92.778
For the other model variants, I observed the following with eval_onnx.py:
| Downloaded ONNX File | Top-1 Accuracy |
|---|---|
| efficientformerv2_s0.onnx | 0.10% |
| efficientformerv2_s1.onnx | 0.08% |
| efficientformerv2_s2.onnx | 0.1% |
| efficientformerv2_l.onnx | 0.09% |
| efficientformer_l1.onnx | 79.22% |
| efficientformer_l3.onnx | 82.42% |
| efficientformer_l7.onnx | 83.35% |
The weights in the EfficientFormerV2 ONNX files from Google Drive also look, as far as I can tell, more like randomly initialized weights. Maybe the checkpoint loading in toolbox.py failed silently for some of these models when they were exported, as allowed by the try/except block here:
toolbox.py, line 98 in 2c0e950:
    print('model initialized without pretrained checkpoint')
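If the checkpoint load raised (e.g. a bad path or a state-dict key mismatch) during the original export, that except branch would swallow the error and the export would proceed with the freshly initialized model, baking untrained weights into the ONNX file. A toy illustration of the failure mode (not the actual toolbox.py code):

```python
def init_model(ckpt_path, loader):
    """Mimics the toolbox.py pattern: fall back silently to random
    initialization if loading the checkpoint raises for any reason."""
    weights = "random-init"
    try:
        weights = loader(ckpt_path)  # may raise (missing file, key mismatch, ...)
        print("load success, model is initialized with pretrained checkpoint")
    except Exception:
        print("model initialized without pretrained checkpoint")
    return weights

def bad_loader(path):
    # Hypothetical stand-in for a torch.load call that fails.
    raise FileNotFoundError(path)

# With a failing loader, the subsequent ONNX export would
# serialize the untrained weights without any hard error.
print(init_model("weights/eformer_s0_450.pth", bad_loader))  # -> random-init
```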
Were the Google Drive links intended to point to ONNX models with the same trained weights as the PyTorch checkpoint?
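One cheap way to test the random-initialization hypothesis without a full ImageNet run: freshly initialized BatchNorm layers have running_mean of all zeros and running_var of all ones, so if the initializers feeding the BatchNormalization nodes in the downloaded ONNX files still look like that, the checkpoint was likely never loaded. A numpy sketch of the check (extracting the initializer arrays from the ONNX file, e.g. with the onnx package, is left out):

```python
import numpy as np

def bn_looks_untrained(running_mean, running_var, atol=1e-6):
    """Heuristic: untouched BatchNorm statistics are exactly zeros and ones;
    trained statistics essentially never are."""
    return bool(np.allclose(running_mean, 0.0, atol=atol)
                and np.allclose(running_var, 1.0, atol=atol))

print(bn_looks_untrained(np.zeros(64), np.ones(64)))              # fresh init
print(bn_looks_untrained(np.array([0.12, -0.3]),
                         np.array([0.9, 1.7])))                   # trained-looking
```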