ResNet
Contents
- Note
- Example: ResNet-34
- Comparison of ResNet networks
- Analysis
- [[#Analysis#Bag of Tricks|Bag of Tricks]]
- Resources
Note
- family of network architectures with successor designs (ResNeXt and others) and varying depths (ResNet-18, ResNet-34, ResNet-50, ResNet-101, and ResNet-152)
- dominated computer vision before the vision transformer appeared; still the go-to choice when solving a standard, well-known problem or prototyping
- the original paper proposed a number of groundbreaking architectural solutions, most notably the skip (residual) connection
Example: ResNet-34
- 4 stages of residual blocks; each basic block is two 3x3 convolution layers bypassed by a skip connection
- The first skip connection of each stage creates a dimension mismatch, because each stage doubles the number of filters and halves the spatial resolution: the identity path carries e.g. a 56x56x64 feature map while the residual branch outputs 28x28x128. Such mismatches are denoted with dashed skip-connection lines in the paper.
- One widespread solution is a 1x1 convolution with stride 2 on the shortcut (a projection shortcut), which matches both the channel count and the resolution; see the sketch after this list
- Full pre-activation: the order batch normalization → activation (ReLU) → convolution (weights), which yielded the best experimental results in the follow-up paper Identity Mappings in Deep Residual Networks
- global average pooling before the last classification head reduces the number of parameters by averaging each output feature map to a single number: 7x7x512 becomes just 1x512, eliminating most of the connections to the fully connected layer. This was another difference from the VGG architectures of the past (which had ~90% of their trainable parameters in fully connected layers), affordable in terms of quality because the much deeper ResNet networks produce higher-quality output features.
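The pieces above fit together in a few lines of code. Below is a minimal PyTorch sketch (PyTorch and the names `PreActBlock`/`head` are my assumptions, not from the paper): a full pre-activation basic block, a 1x1 stride-2 projection on the dashed shortcuts, and a global-average-pooling classification head.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PreActBlock(nn.Module):
    """Basic residual block in the full pre-activation order: BN -> ReLU -> conv."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False)
        # "Dashed" shortcut: a 1x1 convolution with stride 2 matches both the
        # doubled channel count and the halved spatial resolution.
        self.proj = None
        if stride != 1 or in_ch != out_ch:
            self.proj = nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False)

    def forward(self, x):
        out = F.relu(self.bn1(x))                      # pre-activation
        identity = self.proj(out) if self.proj else x
        out = self.conv1(out)
        out = self.conv2(F.relu(self.bn2(out)))
        return out + identity                          # skip connection: elementwise sum

# First block of stage 2: 56x56x64 in, 28x28x128 out, projection shortcut active.
block = PreActBlock(64, 128, stride=2)
print(block(torch.randn(1, 64, 56, 56)).shape)         # torch.Size([1, 128, 28, 28])

# Classification head: global average pooling turns 7x7x512 into 1x512, so the
# fully connected layer needs 512*1000 weights instead of 7*7*512*1000.
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(512, 1000))
print(head(torch.randn(1, 512, 7, 7)).shape)           # torch.Size([1, 1000])
```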
Comparison of ResNet networks
Analysis
Bag of Tricks
- notes from Bag of Tricks for Image Classification with Convolutional Neural Networks (He et al., 2018); the training tricks are sketched below
- Model tweaks (ResNet-B/C/D variants of the stem and downsampling blocks)
- Linear scaling learning rate (base lr of 0.1 scaled by batch_size/256)
- Learning rate warmup (increase the lr linearly from ~0 over the first few epochs)
- Cosine learning rate decay
- Zero gamma (initialize the scale of the last batch norm in each residual block to zero, so blocks start as identity mappings)
- No bias decay (apply weight decay only to weights, not to biases or batch norm parameters)
- MixUp training (train on convex combinations of pairs of examples and their labels)
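A compact sketch of the learning-rate tricks, as an illustration rather than the paper's code: the function name `learning_rate` and the epoch counts are my assumptions; 0.1 and 256 are the paper's ImageNet reference values.

```python
import math

def learning_rate(epoch, batch_size=256, warmup_epochs=5, total_epochs=120):
    base_lr = 0.1 * batch_size / 256                 # linear scaling rule
    if epoch < warmup_epochs:                        # linear warmup from ~0
        return base_lr * (epoch + 1) / warmup_epochs
    t, T = epoch - warmup_epochs, total_epochs - warmup_epochs
    return 0.5 * base_lr * (1 + math.cos(math.pi * t / T))   # cosine decay to 0

for e in (0, 4, 5, 60, 119):
    print(e, round(learning_rate(e, batch_size=1024), 4))
```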
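And a sketch of zero gamma, no bias decay, and MixUp, again under assumptions: `bn2` is the last batch norm of each block (true of the `PreActBlock` sketch above and of torchvision's `BasicBlock`, but not of every ResNet), and labels are one-hot for MixUp.

```python
import torch
import torch.nn as nn

def zero_gamma(model):
    # Zero gamma: the last BN of each residual block starts with scale 0,
    # so every block initially computes the identity mapping.
    for m in model.modules():
        if hasattr(m, "bn2"):                        # assumed last BN of a basic block
            nn.init.zeros_(m.bn2.weight)

def param_groups(model, weight_decay=1e-4):
    # No bias decay: weight decay on conv/linear weights only; biases and
    # batch norm parameters (all 1-D tensors) are excluded.
    decay = [p for p in model.parameters() if p.ndim > 1]
    no_decay = [p for p in model.parameters() if p.ndim <= 1]
    return [{"params": decay, "weight_decay": weight_decay},
            {"params": no_decay, "weight_decay": 0.0}]

def mixup(x, y, alpha=0.2):
    # MixUp: blend random pairs of examples and their (one-hot) labels
    # with a Beta(alpha, alpha)-distributed coefficient.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    idx = torch.randperm(x.size(0))
    return lam * x + (1 - lam) * x[idx], lam * y + (1 - lam) * y[idx]
```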
Resources
- Deep Residual Learning for Image Recognition (He et al., 2015)
- Residual Networks Behave Like Ensembles of Relatively Shallow Networks (Veit et al., 2016)