CUDNN Versions

Performance Across CUDNN Versions

The generated code is compatible with many CUDNN versions. Here, we installed different CUDNN versions (keeping the CUDA version fixed at 10.1 update 2) on a Tesla_V100-SXM2-16GB system. All plots below show a slice of the heuristics within the 7.x CUDNN release cycle.

Impact on End-to-End Latency

We examined the effect of the heuristic choice on end-to-end latency and found that it can be significant. This is especially true for older architectures, where the heuristic has not been finely tuned.
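
One way to put a number on this impact is to compare the total latency of a model when every layer uses the heuristic's pick against the total when every layer uses its fastest available algorithm. The sketch below is only an illustration of that arithmetic: the function name `end_to_end_slowdown` is hypothetical, the per-layer values are placeholders rather than measurements, and the assumption that latency is the plain sum of per-layer times is ours, not something stated in the text above.

```python
from typing import List


def end_to_end_slowdown(heuristic_ms: List[float], best_ms: List[float]) -> float:
    """Return total latency with heuristic picks divided by total latency
    with the fastest algorithm per layer (assumes latencies simply add up)."""
    if len(heuristic_ms) != len(best_ms):
        raise ValueError("both lists must have one entry per layer")
    total_best = sum(best_ms)
    if total_best <= 0:
        raise ValueError("total best-case latency must be positive")
    return sum(heuristic_ms) / total_best


# Placeholder per-layer latencies in milliseconds (NOT real measurements).
heuristic_ms = [1.2, 0.8, 2.0]  # algorithm chosen by the heuristic
best_ms = [1.0, 0.8, 1.0]       # fastest algorithm found for each layer

slowdown = end_to_end_slowdown(heuristic_ms, best_ms)
print(f"end-to-end slowdown from heuristic choice: {slowdown:.2f}x")
```

A ratio of 1.0 would mean the heuristic always picked the fastest algorithm; larger values indicate how much end-to-end latency is left on the table.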