
epoch reduction

AdamW for a ResNet56v2 – II – Adam with weight decay vs. AdamW, linear LR-schedule and L2-regularization

This series is about a ResNet56v2 tested on the CIFAR10 dataset. In the last post AdamW for a ResNet56v2 – I – a detailed look at results based on the Adam optimizer we investigated a piecewise constant reduction schedule for the Learning Rate [LR] over 200 epochs. We found that we could reproduce the results of R. Atienza, who had reported his validation accuracy values with the Adam optimizer. We saw a dependency on the batch size [BS] – and concluded that BS=64 was a good… Read More » AdamW for a ResNet56v2 – II – Adam with weight decay vs. AdamW, linear LR-schedule and L2-regularization
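As a rough illustration of the two regularization variants compared in that post, here is a minimal Keras sketch contrasting Adam with layer-wise L2-regularization against AdamW with decoupled weight decay. The learning rate and decay values are illustrative placeholders, not the settings used in the experiments, and tf.keras.optimizers.AdamW assumes a recent TensorFlow release (2.11 or later).

```python
# Sketch only: illustrative values, not the experiment's actual settings.
import tensorflow as tf
from tensorflow.keras import layers, regularizers

# Variant 1: Adam, with L2-regularization attached to the layer weights.
# The L2 penalty is added to the loss and therefore interacts with Adam's
# adaptive scaling of the gradients.
l2_conv = layers.Conv2D(
    16, kernel_size=3, padding="same",
    kernel_regularizer=regularizers.l2(1e-4),
)
adam = tf.keras.optimizers.Adam(learning_rate=1e-3)

# Variant 2: AdamW, i.e. decoupled weight decay applied by the optimizer
# itself, independent of the loss gradients (requires TF >= 2.11).
adamw = tf.keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=1e-4)
```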

AdamW for a ResNet56v2 – I – a detailed look at results based on the Adam optimizer

This post requires Javascript to display formulas! Over the last few days I started to work on ResNets again. The first thing I did was to use the ResNet code which Rowel Atienza has published in his very instructive book “Advanced Deep Learning with TensorFlow 2 and Keras” [1]. I used the code on the CIFAR10 dataset. Atienza’s approach for this test example is to use image augmentation in addition to L2-regularization with the good old Adam optimizer and a piecewise constant Learning Rate schedule. For a ResNet56v2… Read More » AdamW for a ResNet56v2 – I – a detailed look at results based on the Adam optimizer
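For readers who want to see what such a piecewise constant LR schedule looks like in Keras, the following is a minimal sketch using a LearningRateScheduler callback over 200 epochs. The epoch thresholds and decay factors follow the general pattern of this kind of step schedule and are illustrative, not necessarily the exact values used by Atienza.

```python
# Sketch of a piecewise constant LR schedule for a 200-epoch training run.
# Thresholds and factors below are illustrative placeholders.
import tensorflow as tf

def lr_schedule(epoch):
    """Return the learning rate for the given epoch (step-wise reduction)."""
    lr = 1e-3
    if epoch > 180:
        lr *= 0.5e-3
    elif epoch > 160:
        lr *= 1e-3
    elif epoch > 120:
        lr *= 1e-2
    elif epoch > 80:
        lr *= 1e-1
    return lr

lr_callback = tf.keras.callbacks.LearningRateScheduler(lr_schedule)
# Typical usage with an already compiled Keras model (hypothetical names):
# model.fit(x_train, y_train, epochs=200, batch_size=64,
#           validation_data=(x_test, y_test), callbacks=[lr_callback])
```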