diff --git a/README.md b/README.md
index b5d3371a94bea1c6525d0f182dd16b0996b1e996..771e6366c22e7e211610308520bc8706d69a51e7 100644
--- a/README.md
+++ b/README.md
@@ -133,6 +133,34 @@ Using the example project from this repository i will show some metrics and comp
 <!-- System memory efficiency -->
 <!-- Training time TODO -->
 
+## Caveats
+#### Data Types
+To save GPU memory while training, we often use 16-bit instead of 32-bit floats. The lower precision rarely affects model performance, but we can fit almost twice as many parameters on the GPU.
+PyTorch has a collection of different float types (most commonly `torch.bfloat16` is used, as it offers the best balance between precision and value range).
+When loading with FFCV, the data is initially represented as a NumPy `ndarray`.
+This is of course incompatible with PyTorch tensors as well as Torchvision transforms.
+To avoid type issues, a pipeline (just a list of transforms) should look as follows:
+1. FFCV decoder (such as `RandomResizedCropRGBImageDecoder`)
+2. FFCV transforms (such as `RandomHorizontalFlip` or `Cutout`)
+3. Necessary operations: `ToTensor`, `ToDevice` and `ToTorchImage`
+4. Now that the data is a PyTorch tensor, we have the following options to set the datatype:
+   1. `Convert(dtype)`: dtype can be a NumPy _or_ torch type
+   2. `NormalizeImage(mean, std, type)`: type _must_ be a NumPy type, however most NumPy types have an equivalent that can be automatically converted to a PyTorch type <!-- FIXME: is the automatic type conversion really happening? -->
+5. Torchvision transforms
+
+
+## Other Notes on Performance
+### AMP
+Use `bfloat16` via `torch.autocast`.
+TODO
+
+### Memory Format
+Use `channels_last` together with AMP.
+TODO
+
+
+
+
 ## Notes
 Most information about this guide comes from the [original paper](https://arxiv.org/abs/2306.12517), the [official website](https://ffcv.io/) and personal experience.
 At the time of writing the documentation was incomplete so information was retrieved by reading the source code.
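
A quick check of the memory claim in the Data Types caveat: a `float16` array occupies exactly half the bytes of the same array in `float32`. A minimal NumPy sketch (the batch shape is made up for illustration):

```python
import numpy as np

# A made-up batch of 32 RGB images at 224x224, stored as 32-bit floats.
batch_fp32 = np.zeros((32, 3, 224, 224), dtype=np.float32)
batch_fp16 = batch_fp32.astype(np.float16)

# Half the bytes per element -> half the memory for the same shape.
print(batch_fp32.nbytes // batch_fp16.nbytes)  # -> 2
```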
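
The pipeline ordering from the Data Types caveat could be sketched as follows. This is a configuration sketch, not a definitive implementation: the dataset path, batch size, worker count, crop size and the ImageNet statistics are placeholder assumptions, and it requires a `.beton` file plus a CUDA device to actually run.

```python
import numpy as np
import torch
from ffcv.loader import Loader, OrderOption
from ffcv.fields.decoders import RandomResizedCropRGBImageDecoder
from ffcv.transforms import (
    RandomHorizontalFlip, Cutout, ToTensor, ToDevice, ToTorchImage, NormalizeImage,
)

# ImageNet channel statistics scaled to the 0-255 range (placeholder values).
MEAN = np.array([0.485, 0.456, 0.406]) * 255
STD = np.array([0.229, 0.224, 0.225]) * 255

image_pipeline = [
    # 1. FFCV decoder
    RandomResizedCropRGBImageDecoder((224, 224)),
    # 2. FFCV transforms
    RandomHorizontalFlip(),
    Cutout(8),
    # 3. Necessary operations: NumPy -> torch tensor, move to GPU, NCHW layout
    ToTensor(),
    ToDevice(torch.device('cuda:0'), non_blocking=True),
    ToTorchImage(),
    # 4. Set the datatype; NormalizeImage takes a NumPy dtype
    NormalizeImage(MEAN, STD, np.float16),
    # 5. Torchvision transforms could follow here
]

loader = Loader('/path/to/dataset.beton', batch_size=256, num_workers=8,
                order=OrderOption.RANDOM, pipelines={'image': image_pipeline})
```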
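
The AMP and Memory Format notes could be combined into one sketch. The tiny `Conv2d` stands in for a real model; under `torch.autocast` with `dtype=torch.bfloat16`, eligible ops such as convolutions run in bfloat16, and `channels_last` lets them use faster memory-layout-aware kernels. The snippet falls back to CPU when no GPU is available, purely so it stays runnable:

```python
import torch

# A tiny stand-in model; any convolutional network is handled the same way.
model = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = model.to(device)
model = model.to(memory_format=torch.channels_last)

x = torch.randn(4, 3, 32, 32, device=device)
x = x.to(memory_format=torch.channels_last)

# Autocast runs eligible ops (convolutions, matmuls) in bfloat16.
with torch.autocast(device_type=device, dtype=torch.bfloat16):
    out = model(x)

print(out.dtype)  # torch.bfloat16
```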