
ResNet V2

AdamW for a ResNet56v2 – II – Adam with weight decay vs. AdamW, linear LR-schedule and L2-regularization

This series is about a ResNet56v2 tested on the CIFAR10 dataset. In the last post AdamW for a ResNet56v2 – I – a detailed look at results based on the Adam optimizer we investigated a piecewise constant reduction schedule for the Learning Rate [LR] over 200 epochs. We found that we could reproduce the validation accuracy values which R. Atienza had claimed for the Adam optimizer. We also saw a dependency on the batch size [BS] – and concluded that BS=64 was a good… Read More »AdamW for a ResNet56v2 – II – Adam with weight decay vs. AdamW, linear LR-schedule and L2-regularization
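For readers who want to try this themselves: below is a minimal sketch of what a piecewise constant LR reduction schedule over 200 epochs can look like in Keras. The epoch boundaries and LR values are illustrative assumptions only, not the exact settings evaluated in the post.

# Minimal sketch of a piecewise constant LR schedule in Keras.
# Epoch boundaries and LR values are illustrative assumptions.
from tensorflow import keras

def lr_schedule(epoch):
    """Return a learning rate that drops at fixed epoch boundaries."""
    base_lr = 1e-3
    if epoch > 180:
        return base_lr * 0.5e-3
    if epoch > 160:
        return base_lr * 1e-3
    if epoch > 120:
        return base_lr * 1e-2
    if epoch > 80:
        return base_lr * 1e-1
    return base_lr

lr_callback = keras.callbacks.LearningRateScheduler(lr_schedule, verbose=1)
# model.fit(x_train, y_train, epochs=200, batch_size=64, callbacks=[lr_callback])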

ResNet basics – II – ResNet V2 architecture

In the 1st post of this series ResNet basics – I – problems of standard CNNs I gave an overview of the building blocks of standard Convolutional Neural Networks [CNN]. I also briefly discussed some problems that come up when we try to build really deep networks with multiple stacks of convolutional layers [e.g. Conv1D- or Conv2D-layers of the Keras framework]. In this 2nd post I discuss the core elements of so-called deep Residual Networks [ResNets]. ResNets have been published in multiple versions. The versions… Read More »ResNet basics – II – ResNet V2 architecture
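As a quick illustration of the core element discussed in that post: a minimal sketch of a pre-activation (ResNet V2 style) residual unit in Keras. The BN → ReLU → Conv ordering is the defining feature; filter counts, kernel sizes and the projection shortcut below are illustrative assumptions rather than the exact architecture of the ResNet56v2.

# Minimal sketch of a pre-activation (ResNet V2) residual unit in Keras.
# Filter counts and kernel sizes are illustrative assumptions.
from tensorflow import keras
from tensorflow.keras import layers

def residual_unit_v2(x, filters=16, stride=1):
    """BN -> ReLU -> Conv2D twice, then add the shortcut to the result."""
    shortcut = x
    y = layers.BatchNormalization()(x)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, 3, strides=stride, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, 3, strides=1, padding="same")(y)
    if stride != 1 or shortcut.shape[-1] != filters:
        # Projection shortcut when spatial size or channel count changes
        shortcut = layers.Conv2D(filters, 1, strides=stride, padding="same")(x)
    return layers.Add()([y, shortcut])

inputs = keras.Input(shape=(32, 32, 3))   # CIFAR10-sized input
x = layers.Conv2D(16, 3, padding="same")(inputs)
x = residual_unit_v2(x, filters=16)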