Runtime vs. number of dataloader workers and batch size

PyTorch / datasets / dataloader / data transfer to GPU – II – dataloader too slow on CPU?

Editorial note: Parts of this post were revised on March 18, 19 and 22, 2025, after some new tests and insights. The changes do not affect the result data of the experiments, only their interpretation. In the last post of this mini-series we saw that some Torchvision datasets have a directly accessible property “data”. It contains image data in…

Example images from the Fashion-MNIST dataset

PyTorch / datasets / dataloader / data transfer to GPU – I – properties of some torchvision datasets

For an old fan of TensorFlow 2 it is somewhat satisfying to notice that some problems also exist in analogous form in a PyTorch environment. Anyone who has worked with visual data knows that one needs to modify, augment and transform the image data and then load them from some storage under CPU control to the GPU’s VRAM before or during…

Two CUDA/cudnn versions with Pytorch and Tensorflow in one virtual Python environment

One of the problems I recently ran into was the coexistence of TensorFlow 2 [TF2] and PyTorch in the very same virtual Python environment. I just wanted to run experiments comparing the performance of some Keras-based models with the TF2 backend on the one hand and with the PyTorch backend on the other. My trouble resulted from a mismatch of two CUDA/cudnn…

ResNet56V2 Convergence within epoch 20

AdamW for a ResNet56v2 – VI – Super-Convergence after improving the ResNetV2

In previous posts of this series I have shown that a ResNet56V2 with AdamW can converge to acceptable values of the validation accuracy for the CIFAR10 dataset in fewer than 26 epochs. An optimal schedule of the learning rate [LR] and optimal values for the weight decay parameter [WD] were required. My network – a variation of the ResNetV2-structure…
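
For readers who want to see where the two tuning knobs mentioned in the excerpt enter the optimizer, here is a minimal sketch of a single AdamW update for one scalar parameter in plain Python. It follows the generic AdamW rule (Adam with decoupled weight decay) and is not taken from the post. The function name `adamw_step` and all default values are illustrative assumptions, not the settings used in the series.

```python
import math


def adamw_step(param, grad, m, v, t,
               lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One AdamW update for a single scalar parameter (illustrative sketch).

    All default values are assumptions chosen for the example only.
    t is the 1-based step counter used for bias correction.
    """
    # Decoupled weight decay [WD]: shrink the parameter directly,
    # independent of the gradient-based Adam update.
    param = param * (1.0 - lr * weight_decay)

    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad

    # Bias correction for the zero-initialized moving averages.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)

    # Adam step, scaled by the learning rate [LR].
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# Usage: with a zero gradient only the weight decay acts on the parameter.
p, m, v = adamw_step(1.0, 0.0, m=0.0, v=0.0, t=1)
```

A learning rate schedule would simply pass a different `lr` at each step. Because the decay term is multiplied by `lr`, the schedule also modulates the effective weight decay in this formulation.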