Training of different architectures (in PyTorch) on the CIFAR10 dataset, without any training tricks, i.e., no auto-augmentation, Cutout, DropPath, or dropout.
- Python 3.6+
- PyTorch 1.5+
| Model | Acc. | FLOPs | Params | Training time (hours) |
|---|---|---|---|---|
| LeNet | 77.56% | 0.65M | 0.06M | 0.63 |
| GoogLeNet | 95.26% | 1529M | 6.16M | 6.16 |
| MobileNet | 92.18% | 47M | 3.21M | 0.85 |
| MobileNetV2 | 93.81% | 94M | 2.296M | 1.95 |
| MobileNetV3Large | 92.89% | 79.4M | 2.688M | 1.76 |
| MobileNetV3Small | 91.37% | 18.5M | 1.241M | 1.08 |
| ResNet18 | 95.59% | 556M | 11.173M | 1.61 |
| ResNet34 | 95.32% | 1161M | 21.282M | 1.99 |
| ResNet50 | 95.74% | 1304M | 23.52M | 4.36 |
| ResNet101 | 95.43% | 2520M | 42.51M | 7.07 |
| ResNet152 | 95.91% | 3736M | 58.15M | 9.99 |
| PreActResNet18 | 95.37% | 556M | 11.17M | 1.22 |
| PreActResNet34 | 95.12% | 1161M | 21.27M | 1.96 |
| PreActResNet50 | 95.95% | 1303M | 23.50M | 4.28 |
| PreActResNet101 | 95.44% | 2519M | 42.50M | 6.98 |
| PreActResNet152 | 95.76% | 3735M | 58.14M | 9.92 |
| SENet18 | 95.46% | 556M | 11.26M | 1.87 |
| RegNetX_200MF | 95.19% | 226M | 2.32M | 2.83 |
| RegNetX_400MF | 94.12% | 471M | 4.77M | 4.77 |
| RegNetY_400MF | 95.51% | 472M | 5.71M | 4.91 |
| ResNeXt29(32x4d) | 95.49% | 779M | 4.77M | 4.18 |
| ResNeXt29(2x64d) | 95.41% | 1416M | 9.12M | 4.39 |
| ResNeXt29(4x64d) | 95.76% | 4242M | 27.1M | 11.0 |
| DenseNet121_Cifar | 95.28% | 128M | 1.0M | 2.46 |
| DPN26 | 95.64% | 670M | 11.5M | 5.69 |
| DPN92 | 95.66% | 2053M | 34.2M | 15.43 |
| EfficientNetB0 | 93.24% | 112M | 3.69M | 2.92 |
| NASNet | 95.18% | 615M | 3.83M | 14.7 |
| AmoebaNet | 95.38% | 499M | 3.14M | 11.99 |
| DARTS_V1 | 95.05% | 511M | 3.16M | 11.69 |
| DARTS_V2 | 94.97% | 539M | 3.34M | 12.32 |
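The FLOPs and parameter counts in the table can be reproduced in the usual way: summing `numel()` over trainable parameters and profiling one forward pass. Below is a minimal sketch, not the repo's own measurement script; it assumes the `thop` package is installed (`pip install thop`) and uses torchvision's ResNet-18 as a stand-in, so the exact numbers will differ slightly from the CIFAR-specific architectures above.

```python
# Sketch for reproducing FLOPs / parameter counts; torchvision's ResNet-18
# is only a placeholder for the repo's CIFAR-specific models.
import torch
import torchvision
from thop import profile

model = torchvision.models.resnet18(num_classes=10)
dummy_input = torch.randn(1, 3, 32, 32)  # CIFAR10 input size

# Parameter count: total number of trainable parameter elements.
params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"params: {params / 1e6:.2f}M")

# FLOPs (strictly, multiply-accumulate operations) from a single forward pass.
macs, _ = profile(model, inputs=(dummy_input,))
print(f"FLOPs: {macs / 1e6:.0f}M")
```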
The learning rate is adjusted by a cosine annealing learning-rate scheduler.
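A minimal sketch of that schedule with `torch.optim.lr_scheduler.CosineAnnealingLR` is shown below; the 200-epoch horizon and the SGD momentum/weight-decay values are assumptions, not values stated in this README.

```python
# Cosine learning-rate schedule sketch (assumed: 200 epochs, SGD with
# momentum=0.9 and weight_decay=5e-4; the initial LR of 0.1 matches the
# command below).
import torch

model = torch.nn.Linear(3 * 32 * 32, 10)  # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)

for epoch in range(200):
    # ... run the training batches for this epoch ...
    optimizer.step()   # placeholder for the per-batch parameter updates
    scheduler.step()   # decay the LR along a cosine curve once per epoch
```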
Start training with `python main.py --lr=0.1 --model_name resnet18`.
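The internals of `main.py` are not shown here; the sketch below is only a hypothetical illustration of how the `--lr` and `--model_name` flags might be wired up, and `build_model` plus its registry are illustrative placeholders rather than actual repo functions.

```python
# Hypothetical wiring of the CLI flags; the real main.py may differ.
import argparse
import torch
import torchvision

def build_model(name: str) -> torch.nn.Module:
    # Placeholder registry mapping --model_name values to constructors.
    registry = {
        "resnet18": lambda: torchvision.models.resnet18(num_classes=10),
        "resnet34": lambda: torchvision.models.resnet34(num_classes=10),
    }
    return registry[name.lower()]()

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="CIFAR10 training")
    parser.add_argument("--lr", type=float, default=0.1, help="initial learning rate")
    parser.add_argument("--model_name", type=str, default="resnet18")
    args = parser.parse_args()

    model = build_model(args.model_name)
    optimizer = torch.optim.SGD(model.parameters(), lr=args.lr, momentum=0.9)
    # ... training loop with the cosine schedule sketched above ...
```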