ResNets

Posts on Residual Networks

AdamW for a ResNet56v2 – II – Adam with weight decay vs. AdamW, linear LR-schedule and L2-regularization

This series is about a ResNet56v2 tested on the CIFAR10 dataset. In the last post, AdamW for a ResNet56v2 – I – a detailed look at results based on the Adam optimizer, we investigated a piecewise constant reduction schedule for the Learning Rate [LR] over 200 epochs. We found that we could reproduce results of R. Atienza, who had claimed validation accuracy values of with the Adam optimizer. We saw a dependency on the batch size [BS] – and concluded that BS=64 was a good…

AdamW for a ResNet56v2 – I – a detailed look at results based on the Adam optimizer

In the last few days I started to work on ResNets again. The first thing I did was to use ResNet code that Rowel Atienza published in his very instructive book “Advanced Deep Learning with TensorFlow 2 and Keras” [1]. I applied the code to the CIFAR10 dataset. Atienza’s approach for this test example is to use image augmentation in addition to L2-regularization with the good old Adam optimizer and a piecewise constant Learning Rate schedule. For a ResNet56v2…

ResNet basics – II – ResNet V2 architecture

In the 1st post of this series, ResNet basics – I – problems of standard CNNs, I gave an overview of the building blocks of standard Convolutional Neural Networks [CNNs]. I also briefly discussed some problems that come up when we try to build really deep networks with multiple stacks of convolutional layers [e.g. Conv1D- or Conv2D-layers of the Keras framework]. In this 2nd post I discuss the core elements of so-called deep Residual Networks [ResNets]. ResNets have been published in multiple versions. The versions…

ResNet basics – I – problems of standard CNNs

Convolutional Neural Networks [CNNs] do a good job of analyzing image or video data. They extract correlation patterns hidden in the numeric data of our media objects. Thereby, they gain indirect access to (visual) properties of the displayed physical objects, e.g. industry tools, vehicles, human faces, …. But there are also problems with standard CNNs. They have a tendency to eliminate some small-scale patterns. Visually, this leads to smoothing or smear-out effects. Due to an interference…