Machine Learning

Concentric surfaces of ellipsoids

Multivariate Normal Distributions – III – Variance-Covariance Matrix and a distance measure for vectors of non-degenerate distributions

In previous posts of this series I motivated the functional form of the probability density of a so-called “non-degenerate Multivariate Normal Distribution”. In this post we take a closer look at the matrix Σ that controls the probability density function [pdf] of such a distribution. We will show that it actually is the covariance matrix of the…

contour ellipsoids of a projected MND

Multivariate Normal Distributions – I – Basics and a random vector of independent Gaussians

This post series is about mathematical aspects of so-called “Multivariate Normal Distributions”. In the literature two abbreviations are common: MNDs and MVNs. I will use both synonymously. For easy access, I want to introduce MNDs as the result of a linear transformation applied to random vectors whose components can be described by independent 1-dimensional normal distributions. Afterward…

Short ResNet training on CIFAR10 over 21 epochs

AdamW for a ResNet56v2 – V – weight decay and cosine shaped schedule of the learning rate

In this post series we try to find methods to reduce the number of epochs for the training of ResNets on image datasets. Our test case is CIFAR10. In this post we will test a modified cosine-shaped schedule for a systematic and fast reduction of the learning rate (LR). This supplements the approaches described in previous posts of this…

AdamW for a ResNet56v2 – III – excursion: weight decay vs. L2 regularization in Adam and AdamW

A major topic of this post series is the investigation of methods to reduce the number of training epochs required for ResNets, in particular with respect to image analysis. Our test case is a ResNet56v2 neural network trained on the CIFAR10 dataset. For intermediate results of numerical experiments see the first two posts. During the last week I…