

Installation of CUDA 12.3 and CuDNN 8.9 on Opensuse Leap 15.5 for Machine Learning

Machine Learning on a Linux system is no fun without a GPU and its parallel processing capabilities. On a system with an Nvidia card you need the basic Nvidia drivers plus additional libraries for optimal support of Deep Neural Networks and Linear Algebra operations on the GPU sub-processors. Keras and Tensorflow 2 [TF2], for example, use the CUDA and cuDNN libraries on your Nvidia GPU. Basic information can be found here:

This means that you must not only perform an installation of (proprietary) Nvidia drivers, but also of CUDA and cuDNN on your Linux system. As I have started to work with ResNet-110v2 and ResNet-164v2 variants lately I was interested whether I could get a combination of

  • TF 2.15 with Keras
  • the latest of Nvidia GPU drivers 545.29.06 – but see the Addendum at the end of the post and the warnings therein.
  • the latest CUDA-toolkit version 12.3
  • and cuDNN version 8.9.7

to work on an Opensuse Leap 15.5 system. This experiment ended successfully, although the present compatibility matrices on the Nvidia web pages do not yet include the named combination. While system wide installations of the CUDA-toolkit and cuDNN are no major problems, some additional settings of environment variables are required to make the libraries available for Python notebooks in Jupyterlab or classic Jupyter Notebooks (i.e. IPython based environments). These settings are not self-evident.
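
One way to supply such environment settings is via the "env" block of a Jupyter kernel spec (kernel.json), because the dynamic loader evaluates LD_LIBRARY_PATH when the kernel process starts – exporting variables from inside an already running notebook comes too late. The following is only a sketch; the paths are assumptions for a default system-wide toolkit installation and must be adapted to your system:

```python
import json
import os

# Hypothetical paths for a default system-wide CUDA toolkit install - adjust!
cuda_dir = "/usr/local/cuda-12.3"

# Environment block for a Jupyterlab kernel spec (kernel.json). These
# variables must be in place *before* the IPython kernel launches.
env = {
    "LD_LIBRARY_PATH": os.pathsep.join(
        [f"{cuda_dir}/lib64", "/usr/lib64", os.environ.get("LD_LIBRARY_PATH", "")]
    ).rstrip(os.pathsep),
    "XLA_FLAGS": f"--xla_gpu_cuda_data_dir={cuda_dir}",
}
print(json.dumps({"env": env}, indent=2))
```

The printed block can be merged into the kernel.json of the kernel you start your notebooks with.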

This post summarizes the most important steps of a standard system-wide installation of CUDA and cuDNN on an Opensuse Leap 15.5 system. I do not install TensorRT in this post. As long as you do not work with (pre-trained) LLMs you do not really need TensorRT.

Level of this post: Active ML user – advanced. You should know how RPM- and tar-based installations work on a Leap system. You should also have a working Python3 installation (in a virtual environment) and a Jupyter Notebook or (better) a Jupyterlab installation on your system to be able to perform ML-tests based on Keras. I do not discuss a Jupyter and Python installation in this post.

Limitations and requirements

GPU capabilities: You need a fairly new Nvidia graphics card to make optimal use of the latest CUDA features. In my case I tested with a Nvidia 4060 TI. Normally the drivers and libraries should detect the capabilities of older cards and adapt to them. But I have not tested with older graphics cards.

Disk space: CUDA and cuDNN require a substantial amount of disk space (almost 7 GiB) when you install the full CUDA-toolkit as it is recommended by NVIDIA.

Remark regarding warnings: Installing CUDA 12.3 and using it with Tensorflow 2.15 will presently (Jan. 2024) lead to warnings in your Python 3 notebooks. However, in my experience these warnings have no impact on the performance. My 4060 TI did its job in test calculations with convolutional Autoencoders and ResNets as expected – for ResNets even about 5% faster than with CUDA 11.2.
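
Independently of TF2, you can quickly check from Python whether the dynamic loader finds the CUDA runtime at all. The snippet below is a sketch; the soname is an assumption for a CUDA 12.x toolkit and your installation may differ:

```python
import ctypes

def cuda_runtime_available(libname="libcudart.so.12"):
    """Return True if the given CUDA runtime library can be loaded
    via the dynamic loader (i.e. is found on LD_LIBRARY_PATH or in
    the ldconfig cache)."""
    try:
        ctypes.CDLL(libname)
        return True
    except OSError:
        return False

# True on a correctly configured CUDA 12.x system
print(cuda_runtime_available())
```

If this returns False while the toolkit is installed, the library paths discussed above are not visible to the process.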

Alternative installation methods: You may find information about a pure Python-based installation of CUDA via pip. See e.g. here: https://blog.tensorflow.org/2023/11/whats-new-in-tensorflow-2-15.html. While this potentially makes local user-specific installations easier, the disadvantage for multiple virtual Python environments is the resulting consumption of disk space. So, I still prefer a system-wide installation. It also seems that one should not mix both ways of installation – system-wide and virtual-environment specific. I have e.g. tried to install TensorRT via pip after a system-wide standard CUDA installation. The latter itself had worked. But after the additional TensorRT installation with pip my GPU could no longer be used by Keras/TF2-based ML code started from Jupyterlab notebooks.

Installation of basic Nvidia drivers

The Nvidia graphics card must already be supported for regular X or Wayland services on a Linux system. CUDA and cuDNN come on top.

Note: You need a fairly new Nvidia driver for CUDA 12.3! To get the latest drivers for an Opensuse system I install the proprietary Nvidia drivers from Opensuse’s Nvidia repository:

Nvidia Repository address for Leap 15.5: https://download.nvidia.com/opensuse/leap/15.5

Note that presently YaST2 has a bug (see here). You may need to use zypper on the command-line to add this repository to your package manager. See the man pages for zypper for the right syntax. In the end you should see the Nvidia repository in YaST2:


Using PyQt with QtAgg in Jupyterlab – IV – simple PyQt and MPL application with background worker and receiver threads

As you read this post you are probably interested in Machine Learning [ML] and hopefully in Linux systems as a ML-platform as well. This post series wants to guide you over a bridge between the standard tool-set of Python3 notebooks in Jupyterlab for the control of ML-algorithms and graphical Qt-applications on your Linux desktop. The objective is to become more independent of some limitations of the browser based Jupyterlab notebooks.

One aspect is the use of graphical Qt-based control elements (as e.g. buttons, etc.) in desktop windows. On the other hand we want to use background threads to produce (ML) data which we later, e.g. during training runs, display in Qt windows. Background threads will also enable us to run smaller code in other cells of our notebook during long ML-runs. We are also confident that we can keep up the interactivity of both our Qt windows and Jupyterlab during such runs.

We will later use the callback machinery of Keras based ML-runs to produce ML-data and other information about a running ML-algorithm in the background of Jupyterlab. These data will be sent to Matplotlib- and Qt callback-functions in Jupyterlab which then update Qt windows.

Knowledge gained so far …

During the previous posts we have gathered enough information to now build an example PyQt application, which utilizes two background threads.

We have seen that QtAgg, a backend bridge for producing Matplotlib [MPL] plots in Qt windows, can be used for full-fledged PyQt applications, too. In the first post we became familiar with some useful Qt-widgets and the general structure of Qt-Apps.

In the 2nd and 3rd posts we have learned that both Matplotlib figures and Qt-widgets must be controlled by the main thread associated with our Jupyterlab notebook. A Qt event loop is started in this thread by QtAgg for us. We have also noted that background threads controlled by QThread-objects can send signals which end up serialized in the Qt event queue of the main thread. From there they can be handled asynchronously, but in timely order by callbacks, which in turn update Qt-widgets for MPL-plots and other information. The 3rd post discussed a general pattern to employ both a raw data producing worker thread and a receiver thread to prepare the data for eventual foreground handling.

Objective of this post

In this post I will discuss a simple application that produces data with the help of two background threads according to the pattern discussed in the previous post. All data and information will periodically be sent from the background to callbacks in the main thread. Although we only use one main Qt window the structure of the application includes all key elements to serve as a blueprint for more complex situations. We will in particular discuss how to stop the background jobs and their threads in a regular way. An interesting side topic will be how one captures print output to stdout from background jobs.

Level of this post: Advanced. Some experience with Jupyterlab, QtAgg, Matplotlib and (asynchronous) PyQt is required. The first three posts of this series provide (in my opinion) a quick, though steep learning curve for PyQt newbies.

Application elements

Our PyQt application will contain three major elements in a vertical layout:

  • Two buttons to start and stop two background threads. These threads provide data for a sine-curve with steadily growing frequency and some related information text.
  • A Qt-widget for a Matplotlib figure to display the changing sine curve.
  • A series of QTextEdit widgets to display messages from the background and from callbacks in the foreground.

Our pattern requires the following threads: A “worker thread” periodically creates raw data and puts them into Python queues. A “receiver thread” reads out the queues and refines the data.

In our case the receiver thread will add additional information and data. Then signals are used to communicate with callbacks in the main thread. We send all data for widget and figure updates directly with the signals. This is done for demonstration purposes. We could also have used supplemental data queues for the purpose of inter-thread data exchange. For plotting we use Matplotlib and the related Figure.canvas-widget provided by QtAgg.

So, we have a main thread with a Qt event loop (and of course a loop for Jupyterlab REPL interaction) and two background threads which perform some (simple) asynchronous data production for us.

Our challenge: Qt and Matplotlib control with Python code in a Jupyterlab notebook

The application looks pretty simple. And its structure will indeed be simple. However, as always, the devil is in the details. In our particular situation with Jupyterlab we need to get control over the following tasks:

  • setup and start of two background threads – a worker thread and a receiver thread,
  • association of worker and receiver objects to the named threads with a respective affinity,
  • asynchronous inter-thread communication and data exchange via signals,
  • updates of Qt-widgets and integrated Matplotlib figures,
  • spinning the Qt-event-loop in the main thread to ensure quick widget updates,
  • a regular stop of thread activities and a removal of thread-related objects,
  • checking interactivity of both the Jupyterlab and the Qt-interface,
  • stability of the plot-production against potentially conflicting commands from the main thread.

All via code executed in cells of a Python notebook. An additional topic is:

  • capturing print-commands in the background and transmission of the text to the foreground.
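
The last point can be sketched with plain Python: a file-like object whose write() forwards text to a callback. In the real application the callback would emit a Qt signal carrying the text to the foreground; here, as a framework-free sketch, it simply collects the strings. Note that replacing sys.stdout is process-wide, i.e. it affects print() calls from all threads:

```python
import io
from contextlib import redirect_stdout

class EmittingStream(io.TextIOBase):
    """Minimal file-like object that forwards print output to a callback.
    In the Qt application the callback would emit a signal to the main
    thread; here it just collects the text (a sketch, not the post's code)."""
    def __init__(self, callback):
        self.callback = callback
    def write(self, text):
        self.callback(text)
        return len(text)

captured = []
with redirect_stdout(EmittingStream(captured.append)):
    print("message from a background job")

# Back on the real stdout: show what was captured
print("".join(captured), end="")
```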

Using PyQt with QtAgg in Jupyterlab – III – a simple pattern for background threads

We can use PyQt to organize output of Machine Learning applications in Qt-windows outside of Jupyterlab notebooks on a Linux desktop. PyQt also provides us with an option to put long running Python code as ML training and evaluation runs into the background of Jupyterlab and redirect graphical and text output to elements of Qt windows. Moving long lasting Python jobs and ML algorithms to the background of Jupyterlab would have the advantages

  • that we could run short code segments in other notebook cells in the meantime
  • and keep up the responsiveness of PyQt and Qt-based Matplotlib windows on the desktop.

In the first two posts of this series

we saw that PyQt and its GUI-widgets work perfectly together with Matplotlib’s backend QtAgg. Matplotlib figures are actually handled as special Qt widgets by QtAgg. We also gathered some information on threads in relation to Python and (Py)Qt. We understood that all (Py)Qt-GUI-classes and widgets must be run in the main thread of Jupyterlab and that neither Qt-widgets nor Matplotlib functions are thread-safe.

As a consequence we need some thread-safe, serializing communication method between background threads and the main thread. Qt-signals are well suited for this purpose as they end up in the event queue of target threads with fitting slots and respective functions. The event queue and the related event loop in the main thread of a Qt application enforce the required serialization for our widgets and Matplotlib figures.

In this post I want to discuss a simple pattern of how to put workload for data production and refinement into the background and how to trigger the updates of graphical PyQt windows from there. The pattern is based on elements discussed in the 2nd post of this series.

Pattern for the interaction of background threads with Qt objects and widgets in the foreground

You may have read about various thread-related patterns as the producer/consumer pattern or the sender/receiver pattern.

It might appear that the main thread of a Jupyter notebook with an integrated main Qt event loop would be a natural direct consumer or receiver of data produced in the background for graphical updates. One could therefore be tempted to think of a private queue as an instrument of serialization which is read out periodically from an object in the main thread.

However, what we cannot do is to run a loop with a time.sleep(interval)-function in a notebook cell in the main thread for periodic queue handling. The reason is that we do not want to block other code cells or the main event loop in our Python notebook. While it is true that time.sleep() suspends a thread, so another thread can run (under the control of the GIL), the problem remains that within the original thread other code execution is blocked. (Actually, we could circumvent this problem by utilizing asyncio in a Jupyterlab notebook. But this is yet another pattern for parallelization. We will look at it in another post series.)
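
As a brief illustration of the asyncio alternative mentioned in the parenthesis (a minimal sketch only; the full pattern is a topic for another post series): awaiting asyncio.sleep() instead of calling time.sleep() leaves the event loop free between polls.

```python
import asyncio

async def poll_queue(q_get, handle, interval=0.05, n_polls=3):
    # Await instead of time.sleep(): between polls the event loop stays
    # free for other coroutines (or, in Jupyterlab, other cell activity).
    for _ in range(n_polls):
        item = q_get()
        if item is not None:
            handle(item)
        await asyncio.sleep(interval)

# Plain-script usage; inside a notebook one would rather schedule the
# coroutine on the already running loop with asyncio.create_task().
items = iter([1, None, 2])
seen = []
asyncio.run(poll_queue(lambda: next(items), seen.append))
print(seen)  # [1, 2]
```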

Now we have two options:

  1. We may instead use the particular queue which is already handled asynchronously in Jupyterlab – namely the event queue started by QtAgg. We know already that signals from secondary (QThread-based) threads are transformed into Qt-events. We can send relevant data together with such signals (events) from the background. They are placed in the main Qt event queue and dispatched by the main event loop to callbacks.
  2. If we instead like to use a private queue for data exchange between a background and the main thread we would still use signals and respective slot functions in the main thread. We access our queue via a slot’s callback and read-out only one or a few new entries from there and work with them.

I will use the second option for the exchange of larger data objects in another post in this series. The pattern discussed in this post will be built upon the first option. We will nevertheless employ our own queue for data exchange – but this time between two threads in the background.

Short running callbacks in the main thread

According to what we learned in the last post, we must take care of the following:

The code of a callback (as well as of event handlers) in the main thread should be very limited in time and execute as fast as possible to create GUI updates.

Otherwise we would block the execution of the main event loop with our callback! And that would render other graphical objects on the desktop or in the notebook unresponsive. In addition it would also block running code in other cells.

This is really an important point: The integration of Qt with Jupyterlab via a hook for handling the Qt main event loop seemingly in parallel to the IPython kernel’s prompt loop is an important feature which guarantees responsiveness and which we do not want to spoil by our background-foreground-interaction.

This means that we should follow some rules to keep up responsiveness of Jupyterlab and QT-windows in the foreground, i.e. in the main thread of Jupyterlab:

  • All data which we want to display graphically in QT windows should already have been optimally prepared for plotting before the slot function uses them for QT widget or Matplotlib figure updates.
  • Slot functions (event handlers) should call QApplication.processEvents() to intermittently spin the event loop for the update of widgets.
  • The updates of PyQt widgets should only periodically be triggered via signals from the background. The signals can carry the prepared data with them. (If we nevertheless use a private queue then the callback in the main thread should only perform one queue-access via get() per received signal.)
  • The period by which signals are emitted should be relatively big compared to the event-loop timing and the typical processing of other events.
  • We should separate raw data production in the background from periodic signal creation and the related data transfer.
  • Data production in the background should be organized along relatively small batches if huge amounts of data are to be processed.
  • We should try to circumvent parallelization limitations due to the GIL whenever possible by using C/C++-based modules.
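
The queue-related rule above can be illustrated with a minimal, framework-free sketch: queue.Queue stands in for the private queue, and the Qt signal/slot machinery is omitted. The callback fetches at most one item per received signal and never blocks.

```python
import queue

data_q = queue.Queue()

def on_new_data(q):
    """Sketch of a slot callback: fetch at most one item per received
    signal and never block the (Qt) event loop waiting for the producer."""
    try:
        item = q.get_nowait()
    except queue.Empty:
        return None
    # ... here we would update a Qt widget / Matplotlib figure with `item` ...
    return item

data_q.put({"epoch": 1, "loss": 0.42})
print(on_new_data(data_q))  # {'epoch': 1, 'loss': 0.42}
print(on_new_data(data_q))  # None - queue already drained
```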

In the end it is all about getting data and timing right. Fortunately, the amount of data which we produce during ML training runs, and which we want to display on some foreground window, is relatively small (per training epoch).

A simple pattern for background jobs and intermediate PyQt application updates

An object or function in a “worker thread” calculates and provides raw data with a certain production rate. These data are put in a queue. An object or function in a “receiver thread” periodically reads out the next entries in the queue. The receiver knows what to do with these data for plotting and presentation. It handles them, modifies them if necessary and creates signals (including some update data for PyQt widgets). It forwards these signals to a (graphical) application in the main foreground thread. There they end up as events in the Qt event queue. Qt handles respective (signal-) events by so called “slots”, i.e. by callbacks for the original signals. The PyQt-application there has a graphical Qt-window that visualizes (some of) the data.
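
The data flow of this pattern can be sketched with Python’s standard threading and queue modules. A plain callback stands in for the Qt signal emission, and the QThread/slot machinery is deliberately left out – this is an analogy, not the PyQt code of the coming post:

```python
import threading
import queue
import time

raw_q = queue.Queue()

def worker(n_items):
    """Worker thread: produce raw data at some rate, put it into the queue."""
    for i in range(n_items):
        raw_q.put(i)
        time.sleep(0.01)
    raw_q.put(None)  # sentinel: no more data

results = []
def emit_to_main(payload):
    """Stand-in for a Qt signal: in the real app this would be emitted
    and end up in the main thread's Qt event queue."""
    results.append(payload)

def receiver():
    """Receiver thread: read raw data, refine it, forward it 'foreground'."""
    while True:
        item = raw_q.get()
        if item is None:
            break
        emit_to_main({"value": item, "squared": item * item})

w = threading.Thread(target=worker, args=(5,))
r = threading.Thread(target=receiver)
w.start(); r.start()
w.join(); r.join()
print(results[-1])  # {'value': 4, 'squared': 16}
```

The sentinel value shows one regular way to stop the receiver; in the PyQt version the stop will additionally involve quitting the QThreads.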


ResNet basics – I – problems of standard CNNs

Convolutional Neural Networks [CNNs] do a good job regarding the analysis of image or video data. They extract correlation patterns hidden in the numeric data of our media objects, e.g. images or videos. Thereby, they get an indirect access to (visual) properties of displayed physical objects – like e.g. industry tools, vehicles, human faces, ….

But there are also problems with standard CNNs. They have a tendency to eliminate some small scale patterns. Visually this leads to smoothing or smear-out effects. Due to an interference of applied filters artificial patterns can appear when CNN-based Autoencoders shall recreate or generate images after training. In addition: The number of sequential layers we can use within a CNN on a certain level of resolution is limited. This is on the one hand due to the number of parameters which rises quickly with the number of layers. On the other hand and equally important vanishing gradients can occur during error-back-propagation and cause convergence problems for the usually applied gradient descent method during training.

A significant improvement came with so called Deep Residual Neural Networks [ResNets]. In this post series I discuss the most important differences in comparison to standard CNNs. I start the series with a short presentation of some important elements of CNNs. In a second post I will directly turn to the structure of the so called ResNet V2 architecture [2].

To get a more consistent overview over the historical development I recommend to read the series of original papers [1], [2], [3] and a chapter in the book of R. Atienza [4] in addition. This post series only summarizes and comments the ideas in the named resources in a rather personal way. For me it serves as a preparation and overall documentation for Python programs. But I hope the posts will help some other readers to start working with ResNets, too. In a third post I will also look at the math discussed in some of the named papers on ResNets.

Level of this post: Advanced. You should be familiar with the concepts of CNNs and have some practical experience with this type of Artificial Neural Network. You should also be familiar with the Keras framework and standard layer-classes provided by this framework.

Elements of simple CNNs

A short repetition of a CNN’s core elements will later help us to better understand some important properties of ResNets. ResNets are in my opinion a natural extension of CNNs and will allow us to build really deep networks based on convolutional layers. The discussion in this post focuses on simple standard CNNs for image analysis. Note, however, that the application spectrum is much broader: 1D-CNNs can for example be used to detect patterns in sequential data flows such as texts.

We can use CNNs to detect patterns in images that depict objects belonging to certain object-classes. Objects of a class have some common properties. For real world tasks the spectrum of classes must of course be limited.

The idea is that there are detectable patterns which are characteristic of the object classes. Some people speak of characteristic features. The identification of such patterns or features then helps to classify objects in new images a trained CNN gets confronted with. Or a combination of patterns could help to recreate realistic object images. A CNN must therefore provide not only some mechanism to detect patterns, but also a mechanism for an internal pattern representation which can e.g. be used as basic information for a classification task.

We can safely assume that the patterns for objects of a certain class will show specific structures on different length scales. To cover a reasonable set of length scales we need to look at images at different levels of resolution. This is one task which a CNN must solve; certain elements of its layer architecture must ensure a systematic change in resolution and of the 2-dimensional length scales we look at.

Pattern detection itself is done by applying filters on sub-scales of the spatial dimensions covered by a certain level of resolution. The filtering is done by so called “Convolutional Layers“. A filter tests the overlap of a given object’s 2-dimensional structures with some filter-related periodic pattern on smaller scales. Relevant filter-parameters for optimal patterns are determined during the training of a CNN. The word “optimal” refers to the task the CNN shall eventually solve.

The basic structure of a CNN (e.g. to analyze the MNIST dataset) looks like this:

The sketched simple CNN consists of only three “Convolutional Layers”. Technically, the Keras framework provides a convolutional layer suited for 2-dimensional tensor data by a Python class “Conv2D“. I use this term below to indicate convolutional layers.

Each of our CNN’s Conv2D-layers comprises a series of rectangular arrays of artificial neurons. These arrays are called “maps” or sometimes also “feature maps“.

All maps of a Conv2D-layer have the same dimensions. The output signals of the neurons in a map together represent a filtered view on the original image data. The deeper a Conv2D-layer resides inside the CNN’s network the more filters had an impact on the input and output signals of the layer’s maps. (More precisely: of the neurons of the layer’s maps).

Resolution reduction (i.e. a shift to larger length scales) is in the depicted CNN explicitly done by intermittent pooling-layers. (An alternative would be that the Conv2D-layers themselves work with a stride parameter s = 2; see below.) The output of the innermost convolutional layer is flattened into a 1-dimensional array, which is then analyzed by some suitable sub-network (e.g. a tiny MLP).

Filters and kernels

Convolution in general corresponds to applying a (sub-scale) filter-function to another function. Mathematically we describe this by so called convolution integrals of the functions’ product (with a certain way of linking their arguments). A convolution integral measures the degree of overlap of a (multidimensional) function with a (multidimensional) filter-function. See here for an illustration.

As we are working with signals of distinct artificial neurons our filters must be applied to arrays of discrete signal values. The relevant arrays our filters work upon are the neural maps of a (previous) Conv2D-layer. A sub-scale filter operates sequentially on coherent and fitting sub-arrays of neurons of such a map. It defines an averaged signal of such a sub-array which is fed into a neuron of a map located in the following Conv2D-layer. By sub-scale filtering I mean that the dimensions of the filter-array are significantly smaller than the dimensions of the tested map. See the illustration of these points below.

The sub-scale filter of a Conv2D-layer is technically realized by an array of fixed parameter-values, a so called kernel. A filter’s kernel parameters determine how the signals of the neurons located in a covered sub-array of a map are to be modified before adding them up and feeding them into a target neuron. The parameters of a kernel are also called a filter’s weights.

Geometrically you can imagine the kernel as a (k x k)-array systematically moved across an array of [n x n] neurons of a map (with n > k). The kernel’s convolution operation consists of multiplying each filter-parameter with the signal of the underlying neuron and afterwards adding the results up. See the illustration below.
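
This operation is easy to spell out in code. The following is a didactic sketch in plain Python (a real framework of course implements this with optimized tensor operations): slide the (k x k) kernel over the [n x n] map with “valid” padding, multiply elementwise, sum up, move on by the stride.

```python
def apply_kernel(map_, kernel, stride=1):
    """Slide a (k x k) kernel over an [n x n] map ('valid' padding):
    elementwise multiply, sum, move by `stride` - the discrete convolution
    a Conv2D-layer performs per map/kernel pair."""
    n, k = len(map_), len(kernel)
    out_dim = (n - k) // stride + 1
    out = []
    for r in range(out_dim):
        row = []
        for c in range(out_dim):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * map_[r * stride + i][c * stride + j]
            row.append(acc)
        out.append(row)
    return out

# A (2x2) kernel applied to a [3x3] map -> a [2x2] output array
m = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]
k = [[1, -1],
     [-1, 1]]
print(apply_kernel(m, k))  # [[2.0, -1.0], [-1.0, 2.0]]
```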

For each combination of a map M[N, i] of a layer LN with a map M[N+1, m] of the next layer L(N+1) there exists a specific kernel, which sequentially tests fitting sub-arrays of map M[N, i] . The filter is moved across map M[N, i] with a constant shift-distance called stride [s]. When the end of a row is reached the filter-array is moved vertically down to another row at distance s.

Note on the difference of kernel and map dimensions: The illustration shows that we have to distinguish between the dimensions of the kernel and the dimensions of the resulting maps. Throughout this post series we will denote kernel dimensions in round brackets, e.g. (5×5), while we refer to map dimensions with numbers in box brackets, e.g. [11×11].

In the image above map M[N, i] has a dimension of [6×6]. The filter is based on a (3×3) kernel-array. The target maps M[N+1, m] all have a dimension of [4×4], corresponding to a stride s=1 (and padding=”valid” as the kernel-array fits 4 times into the map). For details of strides and paddings please see [5] and [6].
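
The map dimensions of this example follow the standard output-size formulas for convolutions. A small sketch (the formulas match the behavior of Keras’ “valid” and “same” padding modes):

```python
def conv_output_dim(n, k, s=1, padding="valid"):
    """Spatial output dimension of a convolution over an [n x n] map
    with a (k x k) kernel and stride s."""
    if padding == "valid":
        return (n - k) // s + 1
    elif padding == "same":
        return -(-n // s)  # ceil(n / s)
    raise ValueError(padding)

print(conv_output_dim(6, 3, s=1))  # 4 -> the [4x4] maps of the example above
print(conv_output_dim(6, 3, s=2))  # 2
```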

Whilst moving with its assigned stride across a map M[N, i] the filter’s “kernel” mathematically invokes a (discrete) convolutional operation at each step. The resulting number is added to the results of other filters working on other maps M[N, j]. The sum is fed into a specific neuron of a target map M[N+1, m] (see the illustration above).

Thus, the output of a Conv2D-layer’s map is the result of filtered input coming from previous maps. The strength of the remaining average signal of a map indicates whether the input is consistent with a distinct pattern in the original input data. After having passed all previous filters up to the length scale of the innermost Conv2D-layer each map reacts selectively and strongly to a specific pattern, which can be rather complex (see pattern examples below).

Note that a filter is not something fixed a priori. Instead the weights of the filters (convolution kernels) are determined during a CNN’s training and weight optimization. Loss optimization dictates which filter weights are established during training and later used at inference, i.e. for the analysis of new images.

Note also that a filter (or its mathematical kernel) represents a repetitive sub-scale pattern. This leads to the fact that patterns detected on a specific length scale very often show a certain translation and a limited rotation invariance. This in turn is a basic requirement for a good generalization of a CNN-based algorithm.

A filter feeds neurons located in a map of a following Conv2D-layer. If a layer N has p maps and the following layer has q maps, then a neuron of a map M[N+1, m] receives the superposition of the outcome of (p*q) different filters (and respective kernels).
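
These (p*q) kernels also determine the number of trainable parameters of a Conv2D-layer. The arithmetic is simple, as this sketch shows (the formula matches the counts Keras reports via model.summary() for a standard Conv2D-layer with biases):

```python
def conv2d_param_count(p_in_maps, q_out_maps, k):
    """Number of trainable weights of a Conv2D-layer with p input maps,
    q output maps and (k x k) kernels: one kernel per (input, output)
    map pair plus one bias per output map."""
    return p_in_maps * q_out_maps * k * k + q_out_maps

# Example: 32 input maps, 64 output maps, (3x3) kernels
print(conv2d_param_count(32, 64, 3))  # 18496
```

Note that the count is independent of the map dimensions – a key advantage of convolutional over dense layers.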

Patterns and features

Patterns which fit some filters, of course appear on different length scales and thus at all Conv2D-layers. We first filter for small scale patterns, then for (overlayed) patterns on larger scales. A certain combination of patterns on all length scales investigated so far is represented by the output of the innermost maps.

All in all the neural activation of the maps at the innermost layers result from (surviving) signals which have passed a sequence of non-linearly interacting filters. (Non-linear due to the non-linearity of the neurons’ activation function.) A strong overall activation of an innermost map corresponds to a unique and characteristic pattern in the input image which “survived” the chain of filters on all investigated scales.

Therefore a map is sometimes also called a “feature map”. A feature refers to a distinct (and maybe not directly visible) pattern in the input image data to which an inner map reacts strongly.

Increasing number of maps with lower resolution

When reducing the length scales we typically open up space for more significant pattern combinations; the number of maps increases with each Conv-layer (with a stride s=2 or after a pooling layer). This is a very natural procedure due to filter combinatorics.

Examples of patterns detected for MNIST images

A CNN in the end detects and establishes patterns (inherent in the image data) which are relevant for solving a certain problem (e.g. classification or generative reconstruction). A funny thing is that these “feature patterns” can be visualized.

The next image shows the activation pattern (signal strengths) of the neurons of the 128 (3×3) innermost maps of a CNN that had been trained for the MNIST data and was confronted with an image displaying the digit “6”.

The other image shows some “featured” patterns to which six selected innermost maps react very sensitively and with a large averaged output after having been a trained on MNIST digit images.

 

These patterns obviously result from translations and some rotation of more elementary patterns. The third pattern seems to be useful for detecting “9”s at different positions on an image, the fourth pattern for the detection of images of “2”s. It is somewhat amusing what kind of patterns a CNN considers interesting to distinguish between digits!

If you are interested in how to create images of patterns to which the maps of the innermost Conv2D-layer react, see the book of F. Chollet on “Deep Learning with Python” [5]. See also a post of the physicist F. Graetz, “How to visualize convolutional features in 40 lines of code”, at “towardsdatascience.com”. For MNIST see my own posts on the visualization of filter-specific patterns in my linux-blog. I intend to describe and apply the required methods for layers of ResNets elsewhere in the present ML-blog.

Deep CNNs

The CNN depicted above is very elementary and not a really deep network. Anyone who has experimented with CNNs will probably have tried to use groups of Conv2D-layers on the same level of resolution and map-dimensions. And he/she probably will also have tried to stack such groups to get deeper networks. VGG-Net (see the literature, e.g. [2, 5, 6]) is a typical example of a deeper architecture. In a VGG-Net we have a group of sequential Conv2D-layers on each level of resolution – each with the same number of maps.

BUT: Such simple deep networks do not always give good results regarding error rates, convergence and computational time. The number of parameters rises quickly without a really satisfactory reward.

Problems of deep CNNs

In a standard CNN each inner convolutional layer works with data that were already filtered and modified by previous layers.

An inner filter cannot adapt to original data, but only to filtered information. And any filter eliminates some originally present information … This also occurs at transitions to layers working on larger dimensions, i.e. with maps of reduced resolution: The first filter working on a larger length scale (lower resolution) eliminates information which originally came from a pooling layer (or a Conv2D-layer with stride s=2). The original averaged data are not available to further layers working on the same new length scale.

Therefore, a standard CNN deals with a problem of rather fast information reduction. Furthermore, the maps in a group have no common point of reference – except overall loss optimization. Each filter can eventually only become an optimal one if the previous filtering layer has already found a reasonable solution. An individual layer cannot learn something by itself. This in turn means: During training the net must adapt as a whole, i.e. as a unit. This strong coupling of all layers can increase the number of required training epochs and also create convergence problems.

How could we work against these trends? Can we somehow support an adaptation of each layer to unfiltered data – i.e. support some learning which does not completely depend on previous layers? This is the topic of the next post in this series.

Conclusion

CNNs were building blocks in the history of Machine Learning for image analysis. Their basic elements as Conv2D-layers, filters and respective neural maps on different length scales (or resolutions) work well in networks whose depth is limited, i.e. when the total number of Conv2D-layers is small (3 to 10). The number of parameters rises quickly with a network’s depth and one encounters convergence problems. Experience shows that building really deep networks with Conv2D-layers requires additional architectural elements and layer-combinations. Such elements are cornerstones of Residual Networks. I will discuss them in the next post of this series. See

ResNet basics – II – ResNet V2 architecture

Literature

[1] K. He, X. Zhang, S. Ren , J. Sun, “Deep Residual Learning for Image Recognition”, 2015, arXiv:1512.03385v1
[2] K. He, X. Zhang, S. Ren , J. Sun, “Identity Mappings in Deep Residual Networks”, 2016, version 2 arXiv:1603.05027v2
[3] K. He, X. Zhang, S. Ren , J. Sun, “Identity Mappings in Deep Residual Networks”, 2016, version 3, arXiv:1603.05027v3
[4] R. Atienza, “Advanced Deep Learning with Tensorflow 2 and Keras”, 2nd edition, 2020, Packt Publishing Ltd., Birmingham, UK (see chapter 2)
[5] F. Chollet, “Deep Learning with Python”, 2017, Manning Publications, USA
[6] A. Geron, “Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow”, 3rd ed., 2023, O’Reilly Media Inc., Sebastopol, CA, USA