- Batch Normalization.
- Group Normalization.
- Layer Normalization.
All three models, one per approach, follow the structure below:
C1 C2 c3 P1 C4 C5 C6 c7 P2 C8 C9 C10 GAP c11
where c3, c7, and c11 are 1x1 convolutions.
The only difference between the three models is the normalization technique, as sketched below.
The models classify images from the CIFAR10 dataset, with the goal of reaching 70% accuracy using 50K or fewer parameters in under 20 epochs.
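The notebooks are not reproduced here, but a minimal PyTorch sketch of this shared layout could look as follows. The channel sizes, the group count for GroupNorm, and the `make_norm`/`conv_block`/`build_model` helper names are illustrative assumptions, not values taken from the notebooks; only the normalization layer changes between the three variants.

```python
import torch.nn as nn

def make_norm(kind, channels, spatial):
    """Return the normalization layer that distinguishes the three models."""
    if kind == "bn":
        return nn.BatchNorm2d(channels)                     # per-channel, over N, H, W
    if kind == "gn":
        return nn.GroupNorm(4, channels)                     # 4 groups is an assumed choice
    if kind == "ln":
        return nn.LayerNorm([channels, spatial, spatial])    # per-sample, over C, H, W
    raise ValueError(kind)

def conv_block(kind, in_c, out_c, k, spatial):
    return nn.Sequential(
        nn.Conv2d(in_c, out_c, k, padding=k // 2, bias=False),
        make_norm(kind, out_c, spatial),
        nn.ReLU(),
    )

def build_model(kind):
    # C1 C2 c3 P1 C4 C5 C6 c7 P2 C8 C9 C10 GAP c11 (channel sizes are illustrative)
    return nn.Sequential(
        conv_block(kind, 3, 16, 3, 32),   # C1
        conv_block(kind, 16, 16, 3, 32),  # C2
        conv_block(kind, 16, 8, 1, 32),   # c3 (1x1)
        nn.MaxPool2d(2),                  # P1
        conv_block(kind, 8, 16, 3, 16),   # C4
        conv_block(kind, 16, 16, 3, 16),  # C5
        conv_block(kind, 16, 16, 3, 16),  # C6
        conv_block(kind, 16, 8, 1, 16),   # c7 (1x1)
        nn.MaxPool2d(2),                  # P2
        conv_block(kind, 8, 16, 3, 8),    # C8
        conv_block(kind, 16, 16, 3, 8),   # C9
        conv_block(kind, 16, 16, 3, 8),   # C10
        nn.AdaptiveAvgPool2d(1),          # GAP
        nn.Conv2d(16, 10, 1),             # c11 (1x1) -> 10 CIFAR10 classes
        nn.Flatten(),
    )

model = build_model("bn")  # or "gn" / "ln"
```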
The model in S8CIFAR_BN.ipynb is a convolutional network that uses Batch Normalization.
Total Params: 48,096
Training accuracy: 70.82%
Test accuracy: 74.5%


The model in S8CIFAR_GN.ipynb is a convolutional network that uses Group Normalization.
Total Params: 48,096
Training accuracy: 71.01%
Test accuracy: 73.08%


The model in S8CIFAR_LN.ipynb is a convolutional network that uses Layer Normalization.
Total Params: 111,568
Training accuracy: 52.05%
Test accuracy: 59.44%


- Based on the implementation here, Batch Normalization and Group Normalization add a similar number of parameters.
- Layer Normalization adds a much larger number of parameters (see the illustration below).
- Accuracy with BN and GN is better than with LN.
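A quick way to see why Layer Normalization inflates the parameter count is to compare the learnable parameters each layer type carries for a single feature map. The size used here (32 channels at 16x16) is an assumed example, not a shape from the notebooks: BN and GN learn one scale and one shift per channel, while LayerNorm over (C, H, W) learns one of each per element.

```python
import torch.nn as nn

def n_params(m):
    # Count learnable parameters (affine weight and bias) in a layer.
    return sum(p.numel() for p in m.parameters())

print(n_params(nn.BatchNorm2d(32)))          # 64     -> 2 * C
print(n_params(nn.GroupNorm(4, 32)))         # 64     -> 2 * C
print(n_params(nn.LayerNorm([32, 16, 16])))  # 16384  -> 2 * C * H * W
```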